Spark Mesos cluster mode: who uploads the jar?

I am trying to run a Spark application in Mesos cluster mode. (I have client mode working, but I still want to try cluster mode.)

I started spark-mesos-dispatcher on the Mesos master node.

When I submit the assembly at the local path /tmp/assembly.jar with the following command,

bin/spark-submit --master mesos://dispatcher:7077 --deploy-mode cluster --class com.example.Example /tmp/assembly.jar

it fails because the file /tmp/assembly.jar does not exist on the Mesos slave node.

I1129 10:47:43.839771  5884 fetcher.cpp:414] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/9d725348-931a-48fb-96f7-d29a4b09f3e8-S9\/deploy","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/tmp\/assembly.jar"}}],"sandbox_directory":"\/var\/lib\/mesos\/slaves\/9d725348-931a-48fb-96f7-d29a4b09f3e8-S9\/frameworks\/9d725348-931a-48fb-96f7-d29a4b09f3e8-0291\/executors\/driver-20151129104742-0008\/runs\/31bf5840-226e-4b87-ae76-d14bd2f17950","user":"user"}
I1129 10:47:43.840710  5884 fetcher.cpp:369] Fetching URI '/tmp/assembly.jar'
I1129 10:47:43.840721  5884 fetcher.cpp:243] Fetching directly into the sandbox directory
I1129 10:47:43.840731  5884 fetcher.cpp:180] Fetching URI '/tmp/assembly.jar'
I1129 10:47:43.840737  5884 fetcher.cpp:160] Copying resource with command:cp '/tmp/assembly.jar' '/var/lib/mesos/slaves/9d725348-931a-48fb-96f7-d29a4b09f3e8-S9/frameworks/9d725348-931a-48fb-96f7-d29a4b09f3e8-0291/executors/driver-20151129104742-0008/runs/31bf5840-226e-4b87-ae76-d14bd2f17950/assembly.jar'
cp: cannot stat `/tmp/assembly.jar': No such file or directory
Failed to fetch '/tmp/assembly.jar': Failed to copy with command 'cp '/tmp/assembly.jar' '/var/lib/mesos/slaves/9d725348-931a-48fb-96f7-d29a4b09f3e8-S9/frameworks/9d725348-931a-48fb-96f7-d29a4b09f3e8-0291/executors/driver-20151129104742-0008/runs/31bf5840-226e-4b87-ae76-d14bd2f17950/assembly.jar'', exit status: 256
Failed to synchronize with slave (it's probably exited)

In YARN cluster mode, Spark's YARN client implementation will upload the application jar to HDFS so that the driver and all executors have access to the jar, but I could not find any such code in RestSubmissionClient, which is what the Mesos and Standalone cluster modes use.

Who uploads the jar in this case? Or do I need to manually place the application assembly somewhere that is accessible via an HTTP URI?

Answers (1)

3 years ago

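As I understand it, you can use the SparkContext addJar() method to add the path of a JAR file that is local to the driver application; the JAR is then distributed to the executor nodes (in client mode).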

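Since you state that you want to use cluster mode, I'd suggest having a look at the Spark Jobserver project, which should make running Spark applications on Mesos easier than with the built-in tools.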
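
On the second option raised in the question: as far as I can tell from the Spark documentation for Mesos cluster mode, local jars are not uploaded automatically, so the jar passed to spark-submit should be a URI that the Mesos slaves can reach. Below is a minimal sketch of the same submit command, assuming the assembly has first been copied to a web server. The host fileserver.example.com and the path under it are hypothetical placeholders, not something from the question.

```shell
# Hypothetical URL: assumes /tmp/assembly.jar was copied to a web server
# that every Mesos slave can reach. Replace with your own location.
JAR_URI="http://fileserver.example.com/jars/assembly.jar"

# Same submit command as in the question, with the local path swapped for the URI.
SUBMIT="bin/spark-submit --master mesos://dispatcher:7077 --deploy-mode cluster --class com.example.Example $JAR_URI"

# Print the command for review; to submit for real, run $SUBMIT (unquoted) instead.
echo "$SUBMIT"
```

With a URI like this, the Mesos fetcher on the slave would download the jar into the sandbox rather than trying to cp a local path that only exists on the submitting machine.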