I am setting up Apache Spark on Windows 10 to get started with it.

The pre-built Spark release is spark-2.2.1-bin-hadoop2.7.

python --version
Python 2.7.14

java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) Client VM (build 25.121-b13, mixed mode)

I downloaded winutils.exe into the folder C:\Hadoop\bin.

I set the environment variables:

HADOOP_HOME=C:\Hadoop
SPARK_HOME=C:\spark-2.2.1-bin-hadoop2.7
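A quick way to sanity-check those variables before launching PySpark is a small script like this (a minimal sketch; `check_spark_env` is a hypothetical helper, not part of Spark):

```python
import os

def check_spark_env(env=None):
    """Return the names of required variables that are missing or blank.

    HADOOP_HOME should point at the folder containing bin\\winutils.exe,
    and SPARK_HOME at the unpacked Spark distribution; stray spaces or
    trailing backslashes in the values are a common source of errors
    on Windows.
    """
    env = os.environ if env is None else env
    required = ["HADOOP_HOME", "SPARK_HOME"]
    return [name for name in required if not env.get(name, "").strip()]

# Hypothetical environment with only HADOOP_HOME set:
print(check_spark_env({"HADOOP_HOME": r"C:\Hadoop"}))  # ['SPARK_HOME']
```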

Then I tried to set permissions on the folder C:\tmp\hive using winutils:

winutils.exe chmod 777 -R \tmp\hive
winutils.exe ls \tmp\hive\
drwxrwxrwx 1 BUILTIN\Administradores DESKTOP-V0O6RLF\Nenhum 0 Dec 10 2017 \tmp\hive
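Hadoop's Windows shims look for winutils.exe under %HADOOP_HOME%\bin, so it can also help to confirm that the configured value resolves to the file that was downloaded (a minimal sketch using `ntpath` so the Windows-style joining is explicit; `winutils_path` is a hypothetical helper):

```python
import ntpath  # Windows path semantics, regardless of the host OS

def winutils_path(hadoop_home):
    """Path where the Hadoop libraries expect winutils.exe for a given HADOOP_HOME."""
    return ntpath.join(hadoop_home, "bin", "winutils.exe")

print(winutils_path(r"C:\Hadoop"))  # C:\Hadoop\bin\winutils.exe
```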

But when initializing PySpark I get the following error:

C:\WINDOWS\system32>%SPARK_HOME%\bin\pyspark
Python 2.7.14 (v2.7.14:84471935ed, Sep 16 2017, 20:19:30) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
17/12/10 00:39:47 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Traceback (most recent call last):
  File "C:\spark-2.2.1-bin-hadoop2.7\python\pyspark\shell.py", line 45, in <module>
    spark = SparkSession.builder\
  File "C:\spark-2.2.1-bin-hadoop2.7\python\pyspark\sql\session.py", line 183, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File "C:\spark-2.2.1-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\java_gateway.py", line 1133, in __call__
  File "C:\spark-2.2.1-bin-hadoop2.7\python\pyspark\sql\utils.py", line 79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':"
>>>

The only solution I have found is setting the permissions on \tmp\hive, but that does not work for me!

Thanks in advance!