
How to connect Cassandra with Spark using Scala on Windows


I am trying to connect Spark with Cassandra using Scala, as described here: http://www.planetcassandra.org/blog/kindling-an-introduction-to-spark-with-cassandra/ . I am getting an error at the step under the heading:

"Loading the connector into the Spark Shell:"

val test_spark_rdd = sc.cassandraTable("test_spark", "test")

test_spark_rdd.first

When I run the commands above (shown in bold in the post), it fails with:

Exception in task 0.0 in stage 0.0 (TID 0) java.lang.NullPointerException
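For context, the full sequence I run in the spark-shell is roughly the following (the import line is what the connector documentation suggests; the test_spark keyspace and test table come from the tutorial):

import com.datastax.spark.connector._                          // adds cassandraTable to the SparkContext

val test_spark_rdd = sc.cassandraTable("test_spark", "test")   // RDD over the test_spark.test table
test_spark_rdd.first                                           // first action, which is where it fails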

I have uploaded the full stack trace here:

https://docs.google.com/document/d/1UjGXKifD6chq7-WrHd3GT3LoNcw8GawxAPeOtiEjKvM/edit?usp=sharing

Some of the RPC settings in my cassandra.yaml file are:

rpc_address: localhost 
# rpc_interface: eth1 
# rpc_interface_prefer_ipv6: false 
# port for Thrift to listen for clients on 
rpc_port: 9160

My spark-defaults.conf file:

# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.

# Example:
# spark.master                     spark://master:7077
# spark.eventLog.enabled           true
# spark.eventLog.dir               hdfs://namenode:8021/directory
#spark.serializer                 org.apache.spark.serializer.KryoSerializer
#spark.driver.memory              5g
#spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.cassandra.connection.host localhost
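As a sanity check, the same property can also be set programmatically on the SparkConf when building a context in a standalone application (inside the spark-shell, sc already exists, so spark-defaults.conf is the right place for it). This is only a sketch, not something from the blog post:

import org.apache.spark.{SparkConf, SparkContext}

// Equivalent to the spark-defaults.conf entry above; "localhost" assumes Cassandra
// runs on the same machine as the driver.
val conf = new SparkConf()
  .setAppName("cassandra-connection-test")
  .set("spark.cassandra.connection.host", "localhost")
val sc = new SparkContext(conf)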

1 Answer

    15/08/04 21:24:50 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
        java.lang.NullPointerException
                at java.lang.ProcessBuilder.start(Unknown Source)
                at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
                at org.apache.hadoop.util.Shell.run(Shell.java:418)
                at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
                at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
                at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
    

    It looks like the problem is that the underlying forked executor process cannot start, or cannot perform some operation against the local file system. Make sure the executor process has access to the default Spark directories.
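    One quick check (a minimal sketch, assuming the executors run as the same user as the shell and use the default java.io.tmpdir scratch location; adjust the path if spark.local.dir is set) is to verify that the scratch directory is writable and that file permissions can be changed in it:

    import java.io.File

    // Probe the default scratch directory that Spark falls back to.
    val tmpDir = new File(System.getProperty("java.io.tmpdir"))
    val probe  = File.createTempFile("spark-write-test", ".tmp", tmpDir)

    println(s"Writable: ${probe.exists} in ${tmpDir.getAbsolutePath}")
    // Rough analog of the permission change (FileUtil.chmod) that fails in the stack trace.
    println(s"Permissions changeable: ${probe.setExecutable(true)}")

    probe.delete()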
