Is Spark 2.0 compatible with (DataStax) Cassandra 2.1.13? I installed Spark 2.1.0 on my local Mac, along with Scala 2.11.x. I am trying to read a Cassandra table from a server running DataStax Enterprise 4.8.6 (Spark 1.4 and Cassandra 2.1.13).
I ran the following code in the Spark shell:
spark-shell
import org.apache.spark.sql.SparkSession
import spark.implicits._
import org.apache.spark.sql.cassandra._
import com.datastax.spark.connector.cql._
import org.apache.spark.sql
import org.apache.spark.SparkContext._
import com.datastax.spark.connector.cql.CassandraConnector._
spark.stop
val sparkSession = SparkSession.builder
  .appName("Spark app")
  .config("spark.cassandra.connection.host", CassandraNodeList)
  .config("spark.cassandra.auth.username", CassandraUser)
  .config("spark.cassandra.auth.password", CassandraPassword)
  .config("spark.cassandra.connection.port", "9042")
  .getOrCreate()
sparkSession.sql("""CREATE TEMPORARY view hdfsfile
|USING org.apache.spark.sql.cassandra
|OPTIONS (
| table "hdfs_file",
| keyspace "keyspaceName")""".stripMargin)
I received the following error:
17/02/28 10:33:02 ERROR Executor: Exception in task 8.0 in stage 3.0 (TID 20)
java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
    at com.datastax.spark.connector.util.CountingIterator.<init>(CountingIterator.scala:4)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.compute(CassandraTableScanRDD.scala:336)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
1 Answer
This is a Scala version mismatch error: you are using a library built for Scala 2.10 with Scala 2.11 (or vice versa). It is explained in the Spark Cassandra Connector (SCC) FAQ:
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/FAQ.md#what-does-this-mean-noclassdeffounderror-scalacollectiongentraversableonceclass
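In practice, a sketch of the fix (assuming the Spark Packages coordinates for connector 2.0.0) is to relaunch spark-shell with a connector artifact whose Scala suffix matches the Scala 2.11 build that Spark 2.1.0 ships with:

```shell
# Launch-command sketch: the "_2.11" suffix of the connector artifact must
# match the Scala version of the Spark build (Spark 2.1.0 uses Scala 2.11).
# Package coordinates are an assumption; check Spark Packages for your version.
spark-shell --packages datastax:spark-cassandra-connector:2.0.0-s_2.11
```

If you build your application with sbt instead, the same rule applies: the connector dependency must carry the same Scala suffix as your `scalaVersion`.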
Quoting the FAQ:
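Beyond the FAQ text, a minimal standalone check (plain Scala, no Spark required) prints the Scala version the runtime is actually using, which you can compare against the connector artifact's suffix:

```scala
// Print the Scala version of the current runtime. The connector artifact's
// "_2.10" / "_2.11" suffix must match the major.minor version shown here.
object ScalaVersionCheck {
  def main(args: Array[String]): Unit = {
    println(scala.util.Properties.versionNumberString)
  }
}
```

Inside spark-shell you can run `scala.util.Properties.versionNumberString` directly; the shell banner also reports the Scala version at startup.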