
Spark-HBase error: java.lang.IllegalStateException: unread block data


I am trying to fetch records from an HBase table from a Java Spark program invoked through a Jersey REST API, and I get the error below. However, when I access the HBase table by submitting the same code directly as a Spark jar, it executes without errors.

I have 2 worker nodes for HBase and 2 worker nodes for Spark, maintained by the same master.

WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 172.31.16.140): java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
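
For context, here is a minimal sketch of the kind of HBase read such a job performs, using the standard TableInputFormat approach (the class name, table name, and configuration are placeholders, not the asker's actual code). The executors must deserialize the input format and configuration, which is exactly where the exception above surfaces when the HBase jars are missing from their classpath:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class HBaseReadSketch {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setAppName("HBaseReadSketch"));

            // Point the Hadoop InputFormat at the HBase table to scan.
            Configuration hbaseConf = HBaseConfiguration.create();
            hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table"); // placeholder table name

            // Each executor deserializes TableInputFormat and this configuration;
            // without the HBase jars on the executor classpath, that
            // deserialization fails with "unread block data".
            JavaPairRDD<ImmutableBytesWritable, Result> rows = sc.newAPIHadoopRDD(
                hbaseConf, TableInputFormat.class,
                ImmutableBytesWritable.class, Result.class);

            System.out.println("row count: " + rows.count());
            sc.stop();
        }
    }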

2 Answers

  • 0

    Well, I think I know what your problem is, because I just went through it myself.

    The cause is most likely some missing HBase jars: while the Spark job is running, Spark needs the HBase jars to read the data, and if they are not present on the executor classpath it throws an exception like the one above. What should you do? It's easy.

    Before submitting the job, you need to add the --jars parameter with the following jars:

    --jars /ROOT/server/hive/lib/hive-hbase-handler-1.2.1.jar,
    /ROOT/server/hbase/lib/hbase-client-0.98.12-hadoop2.jar,
    /ROOT/server/hbase/lib/hbase-common-0.98.12-hadoop2.jar,
    /ROOT/server/hbase/lib/hbase-server-0.98.12-hadoop2.jar,
    /ROOT/server/hbase/lib/hbase-hadoop2-compat-0.98.12-hadoop2.jar,
    /ROOT/server/hbase/lib/guava-12.0.1.jar,
    /ROOT/server/hbase/lib/hbase-protocol-0.98.12-hadoop2.jar,
    /ROOT/server/hbase/lib/htrace-core-2.04.jar
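
    Note that spark-submit expects the whole --jars value as a single comma-separated list with no spaces between entries. Put together, a complete invocation might look like this sketch (the master URL, application class, and application jar are placeholders):

    spark-submit \
      --master spark://master:7077 \
      --class com.example.HBaseRead \
      --jars /ROOT/server/hive/lib/hive-hbase-handler-1.2.1.jar,/ROOT/server/hbase/lib/hbase-client-0.98.12-hadoop2.jar,/ROOT/server/hbase/lib/hbase-common-0.98.12-hadoop2.jar,/ROOT/server/hbase/lib/hbase-server-0.98.12-hadoop2.jar,/ROOT/server/hbase/lib/hbase-hadoop2-compat-0.98.12-hadoop2.jar,/ROOT/server/hbase/lib/guava-12.0.1.jar,/ROOT/server/hbase/lib/hbase-protocol-0.98.12-hadoop2.jar,/ROOT/server/hbase/lib/htrace-core-2.04.jar \
      my-app.jar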

    If that works for you, enjoy!

  • 3

    I ran into the same problem on CDH 5.4.0 when submitting a Spark job implemented with the Java API. Here are my solutions:
    Solution 1: Use spark-submit

    --jars zookeeper-3.4.5-cdh5.4.0.jar,
    hbase-client-1.0.0-cdh5.4.0.jar,
    hbase-common-1.0.0-cdh5.4.0.jar,
    hbase-server-1.0.0-cdh5.4.0.jar,
    hbase-protocol-1.0.0-cdh5.4.0.jar,
    htrace-core-3.1.0-incubating.jar,
    // custom jars which are needed in the spark executors

    Solution 2: Use SparkConf in code

    sparkConf.setJars(new String[]{
        "zookeeper-3.4.5-cdh5.4.0.jar",
        "hbase-client-1.0.0-cdh5.4.0.jar",
        "hbase-common-1.0.0-cdh5.4.0.jar",
        "hbase-server-1.0.0-cdh5.4.0.jar",
        "hbase-protocol-1.0.0-cdh5.4.0.jar",
        "htrace-core-3.1.0-incubating.jar"
        // custom jars which are needed in the spark executors
    });
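
    Note that setJars is an instance method, so it has to be called on the SparkConf before the context is created. A minimal sketch of how it fits into job setup (the app name and jar paths are placeholders):

    SparkConf sparkConf = new SparkConf().setAppName("HBaseRead"); // placeholder app name
    sparkConf.setJars(new String[]{
        "hbase-client-1.0.0-cdh5.4.0.jar" // ...plus the other jars listed above
    });
    // The listed jars are shipped to every executor when the context starts.
    JavaSparkContext sc = new JavaSparkContext(sparkConf);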

    To summarize:
    The problem is caused by jars missing from the Spark project. You need to add them to the project classpath for compilation, and in addition use one of the two solutions above to distribute them to your Spark cluster at runtime.
