java.lang.OutOfMemoryError when running the DeepLearning4J MNIST example

This is what I am doing on a single-node, local Spark cluster:

git clone https://github.com/deeplearning4j/dl4j-spark-cdh5-examples.git
cd dl4j-spark-cdh5-examples
mvn package
export SPARK_WORKER_MEMORY=13g
spark-submit --class org.deeplearning4j.examples.cnn.MnistExample ./target/dl4j-spark-cdh5-examples-1.0-SNAPSHOT.jar

This is what I get:

Caused by: java.lang.OutOfMemoryError: Java heap space

Here is the full stack trace:

spark-submit --class org.deeplearning4j.examples.cnn.MnistExample ./target/dl4j-spark-cdh5-examples-1.0-SNAPSHOT.jar
21:21:13,414 INFO  ~ Load data....
Warning: could not load native system BLAS. ND4J performance will be reduced.
Please install a native BLAS library such as OpenBLAS or Intel MKL. See http://nd4j.org/getstarted.html#open for further details.
21:21:20,571 INFO  ~ Build model....
21:21:20,776 WARN  ~ Objective function automatically set to minimize. Set stepFunction in neural net configuration to change default settings.
21:21:20,886 INFO  ~ --- Starting network training ---
[Stage 0:>                                                          (0 + 6) / 6]
Exception in thread "dispatcher-event-loop-3" java.lang.OutOfMemoryError: Java heap space
21:24:12,358 ERROR ~ Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
21:24:12,358 ERROR ~ Exception in task 5.0 in stage 0.0 (TID 5)
java.lang.OutOfMemoryError: Java heap space
    at java.lang.reflect.Array.newInstance(Array.java:70)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1670)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
    at org.nd4j.linalg.api.buffer.BaseDataBuffer.doReadObject(BaseDataBuffer.java:880)
    at org.nd4j.linalg.api.buffer.BaseDataBuffer.readObject(BaseDataBuffer.java:868)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
    at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply$mcV$sp(ParallelCollectionRDD.scala:74)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1204)
21:24:12,358 ERROR ~ Exception in task 3.0 in stage 0.0 (TID 3)
java.lang.OutOfMemoryError: Java heap space
    at java.lang.reflect.Array.newInstance(Array.java:70)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1670)
    ... (same deserialization stack as TID 5 above)
21:24:12,358 ERROR ~ Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
    ... (same stack as TID 0 above)
21:24:12,358 ERROR ~ Exception in task 2.0 in stage 0.0 (TID 2)
java.lang.OutOfMemoryError: Java heap space
    ... (same deserialization stack as TID 5 above)
21:24:12,375 ERROR ~ Uncaught exception in thread Thread[Executor task launch worker-5,5,main]
java.lang.OutOfMemoryError: Java heap space
    ... (same deserialization stack as TID 5 above)
21:24:12,375 ERROR ~ Uncaught exception in thread Thread[Executor task launch worker-3,5,main]
java.lang.OutOfMemoryError: Java heap space
    ... (same deserialization stack as TID 5 above)
21:24:12,375 ERROR ~ Uncaught exception in thread Thread[Executor task launch worker-2,5,main]
java.lang.OutOfMemoryError: Java heap space
    ... (same deserialization stack as TID 5 above)
21:24:12,383 ERROR ~ Task 5 in stage 0.0 failed 1 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 0.0 failed 1 times, most recent failure: Lost task 5.0 in stage 0.0 (TID 5, localhost): java.lang.OutOfMemoryError: Java heap space
    at java.lang.reflect.Array.newInstance(Array.java:70)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1670)
    ... (same deserialization stack as TID 5 above)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1157)
    at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:440)
    at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:46)
    at org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer.fitDataSet(SparkDl4jMultiLayer.java:239)
    at org.deeplearning4j.examples.cnn.MnistExample.main(MnistExample.java:132)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.lang.reflect.Array.newInstance(Array.java:70)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1670)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
    at org.nd4j.linalg.api.buffer.BaseDataBuffer.doReadObject(BaseDataBuffer.java:880)
    at org.nd4j.linalg.api.buffer.BaseDataBuffer.readObject(BaseDataBuffer.java:868)
    ... (same deserialization stack as TID 5 above)
    at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply$mcV$sp(ParallelCollectionRDD.scala:74)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1204)
21:24:12,769 ERROR ~ Exception in task 4.0 in stage 0.0 (TID 4)
org.apache.spark.TaskKilledException
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:204)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
21:30:18,649 ERROR ~ Uncaught exception in thread
org.apache.spark.SparkException: Error sending message [message = StopBlockManagerMaster]
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:118)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
    at org.apache.spark.storage.BlockManagerMaster.tell(BlockManagerMaster.scala:225)
    at org.apache.spark.storage.BlockManagerMaster.stop(BlockManagerMaster.scala:217)
    at org.apache.spark.SparkEnv.stop(SparkEnv.scala:97)
    at org.apache.spark.SparkContext$$anonfun$stop$12.apply$mcV$sp(SparkContext.scala:1756)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1229)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1755)
    at org.apache.spark.SparkContext$$anonfun$3.apply$mcV$sp(SparkContext.scala:596)
    at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:267)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:239)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:239)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)

Any ideas?

1 Answer

As with any other Spark job, consider bumping the Xmx on the slaves as well as on the master.
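
For instance, a minimal sketch of bumping both heaps when submitting this example; the --driver-memory and --executor-memory flags are standard spark-submit options, and the 8g/13g sizes below are placeholder values, not recommendations:

# --driver-memory raises the heap of the driver (the submitting/master side),
# --executor-memory raises the heap of each executor (the slave side); sizes are placeholders
spark-submit --class org.deeplearning4j.examples.cnn.MnistExample \
  --driver-memory 8g \
  --executor-memory 13g \
  ./target/dl4j-spark-cdh5-examples-1.0-SNAPSHOT.jar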

There are two kinds of memory to be aware of with Spark: the executor memory and, with standalone Spark, the driver memory as well.

See: How to set Apache Spark Executor memory
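
If you prefer configuration over command-line flags, the same two settings can go into Spark's properties file so every submission picks them up; this is a sketch assuming the default conf/spark-defaults.conf location, with placeholder values:

# conf/spark-defaults.conf -- read by spark-submit before it launches the driver JVM
spark.driver.memory    8g     # heap for the driver process (placeholder value)
spark.executor.memory  13g    # heap for each executor (placeholder value)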
