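I have been working with Spark (built for Hadoop 2.7) from Python, and I am trying to run the "word count" example. Here is my code:
# Imports
# Take care about unused imports (and also unused variables),
# please comment them all, otherwise you will get errors at execution.
# Note that neither the directive "@PydevCodeAnalysisIgnore" nor "@UnusedImport"
# will be able to solve that issue.
#from pyspark.mllib.clustering import KMeans
from pyspark import SparkConf, SparkContext
import os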
# Configure the Spark environment
sparkConf = SparkConf().setAppName("WordCounts").setMaster("local")
sc = SparkContext(conf = sparkConf)
# The WordCounts Spark program
textFile = sc.textFile(os.environ["SPARK_HOME"] + "/README.md")
wordCounts = textFile.flatMap(lambda line: line.split()).map(lambda word: (word, 1)).reduceByKey(lambda a, b: a+b)
for wc in wordCounts.collect(): print wc
Then I get the following error:
17/08/07 12:28:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/08/07 12:28:16 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Traceback (most recent call last):
File "/home/hduser/eclipse-workspace/PythonSpark/src/WordCounts.py", line 12, in <module>
sc = SparkContext(conf = sparkConf)
File "/usr/local/spark/python/pyspark/context.py", line 118, in __init__
conf, jsc, profiler_cls)
File "/usr/local/spark/python/pyspark/context.py", line 186, in _do_init
self._accumulatorServer = accumulators._start_update_server()
File "/usr/local/spark/python/pyspark/accumulators.py", line 259, in _start_update_server
server = AccumulatorServer(("localhost", 0), _UpdateRequestHandler)
File "/usr/lib/python2.7/SocketServer.py", line 417, in __init__
self.server_bind()
File "/usr/lib/python2.7/SocketServer.py", line 431, in server_bind
self.socket.bind(self.server_address)
File "/usr/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
socket.gaierror: [Errno -3] Temporary failure in name resolution
Any help? I can run any Scala project with spark-shell, and any (non-Spark) Python program in Eclipse, without errors. I think my problem has something to do with pyspark?
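The traceback above fails at the point where pyspark's `AccumulatorServer` binds a socket to `("localhost", 0)`, so one way to check whether the problem lies outside pyspark is to repeat that bind with the standard library alone. The following is only a minimal sketch, assuming Python's `socket` module; the helper name `can_bind` is hypothetical and not part of pyspark.

```python
import socket


def can_bind(host):
    """Hypothetical helper: return True if a TCP socket can be bound to (host, 0)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port; binding forces `host` to be resolved,
        # which is the step that raised socket.gaierror in the traceback.
        sock.bind((host, 0))
        return True
    except socket.error as exc:  # socket.gaierror is a subclass of socket.error
        print("bind to %r failed: %s" % (host, exc))
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    # Compare the literal loopback address with the name pyspark uses.
    for host in ("127.0.0.1", "localhost"):
        print("%s -> %s" % (host, can_bind(host)))
```

If `127.0.0.1` binds but `localhost` does not, the name `localhost` is probably not resolving on this machine, and the host name configuration (for example the `/etc/hosts` entries) may be worth checking before looking further into pyspark.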
4 Answers
You could try this: just creating the SparkContext is enough, and it works.
Try it this way...
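After you start Spark, the command prompt shows that sc is available as the SparkContext.
If not, you can use the following approach...
This is enough to run your program, because sc is available in your shell.
First try it in shell mode...
Line by line...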
As far as I understand, if Spark is installed correctly, the code below should work fine.