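Following this example, I was able to submit a Spark application to YARN through the REST API, but the job produces no output when it finishes. It is just a WordCount example written in Scala: it should print the counts to System.out and write a text file with the same counts to HDFS. Neither happens, and the job runs for only 4 seconds. I am using CDH 5.8.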

This is the application.json I send to submit the application:

{
  "application-id": "application_1480574893850_0007",
  "application-name": "WordCount",
  "am-container-spec":
  {
    "local-resources":
    {
      "entry":
      [
        {
          "key": "WordCount.jar",
          "value":
          {
            "resource": "hdfs://localhost:8020/user/cloudera/WordCount.jar",
            "type": "FILE",
            "visibility": "APPLICATION",
            "size": "7049743",
            "timestamp": "1480577893660"
          }
        }
      ]
    },
    "commands":
    {
      "command": "{{JAVA_HOME}}/bin/java -Xmx10m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 10 --container_vcores 1 --num_containers 1 --priority 0 1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr"
    },
    "unmanaged-AM": false,
    "max-app-attempts": 2,
    "resource":
    {
      "memory": 512,
      "vCores": 1
    },
    "environment":
    {
      "entry":
      [
        {
          "key": "DISTRIBUTEDSHELLSCRIPTTIMESTAMP",
          "value": "1480515923741"
        },
        {
          "key": "DISTRIBUTEDSHELLSCRIPTLEN",
          "value": "13"
        },
        {
          "key": "DISTRIBUTEDSHELLSCRIPTLOCATION",
          "value": "hdfs://localhost:8020/user/cloudera/shell_vacio.sh"
        },
        {
          "key": "HADOOP_HOME",
          "value": "/usr/lib/hadoop"
        },
        {
          "key": "HADOOP_CONFIG_DIR",
          "value": "/etc/hadoop/conf"
        },
        {
          "key": "CLASSPATH",
          "value": "{{CLASSPATH}}<CPS>./*<CPS>{{HADOOP_CONF_DIR}}<CPS>{{HADOOP_COMMON_HOME}}/share/hadoop/common/*<CPS>{{HADOOP_COMMON_HOME}}/share/hadoop/common/lib/*<CPS>{{HADOOP_HDFS_HOME}}/share/hadoop/hdfs/*<CPS>{{HADOOP_HDFS_HOME}}/share/hadoop/hdfs/lib/*<CPS>{{HADOOP_YARN_HOME}}/share/hadoop/yarn/*<CPS>{{HADOOP_YARN_HOME}}/share/hadoop/yarn/lib/*<CPS>./log4j.properties<CPS>/usr/lib/hadoop-yarn/*<CPS>/usr/lib/hadoop-yarn/lib/*<CPS>/usr/lib/hadoop/*<CPS>/usr/lib/hadoop-mapreduce/*<CPS>/usr/lib/hadoop/lib/*<CPS>/usr/lib/hadoop-hdfs/*<CPS>/**"
        }
      ]
    },
    "application-type": "SPARK",
    "keep-containers-across-application-attempts": false
  }
}
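
For reference, the submission step itself can be sketched roughly as below. This is a minimal sketch, not the exact client I use: it assumes the ResourceManager web service listens on localhost:8088 (the usual default, but an assumption here), and the helper names build_submit_request and submit are made up for illustration. The endpoint is the Cluster Applications submit call of the ResourceManager REST API, which takes the JSON above as the POST body.

```python
import json
import urllib.request

# Assumption: ResourceManager web address; adjust to the cluster's real host and port.
RM_URL = "http://localhost:8088"


def build_submit_request(app_spec: dict, rm_url: str = RM_URL) -> urllib.request.Request:
    """Build (but do not send) the POST that submits an application to YARN."""
    body = json.dumps(app_spec).encode("utf-8")
    return urllib.request.Request(
        url=f"{rm_url}/ws/v1/cluster/apps",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def submit(app_spec: dict, rm_url: str = RM_URL) -> int:
    """Send the request and return the HTTP status (202 Accepted on success)."""
    with urllib.request.urlopen(build_submit_request(app_spec, rm_url)) as resp:
        return resp.status
```

Loading application.json with json.load and passing the result to submit would send the request. A 202 response only means the ResourceManager accepted the submission; it says nothing about whether the JAR was actually run.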

This is the last part of the log from the container:

16/12/01 08:41:20 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@771dcec7
16/12/01 08:41:20 DEBUG ipc.Client: The ping interval is 60000 ms.
16/12/01 08:41:20 DEBUG ipc.Client: Connecting to quickstart.cloudera/127.0.0.1:8041
16/12/01 08:41:20 DEBUG security.UserGroupInformation: PrivilegedAction as:appattempt_1480574893850_0007_000001 (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:725)
16/12/01 08:41:20 DEBUG security.SaslRpcClient: Sending sasl message state: NEGOTIATE

16/12/01 08:41:20 DEBUG security.SaslRpcClient: Received SASL message state: NEGOTIATE
auths {
  method: "TOKEN"
  mechanism: "DIGEST-MD5"
  protocol: ""
  serverId: "default"
  challenge: "realm=\"default\",nonce=\"ZNiuPwJdndJ00K4CCramt6/QJ6mTWpYug5ljfedK\",qop=\"auth\",charset=utf-8,algorithm=md5-sess"
}
16/12/01 08:41:20 DEBUG security.SaslRpcClient: Get token info proto:interface org.apache.hadoop.yarn.api.ContainerManagementProtocolPB info:org.apache.hadoop.yarn.security.ContainerManagerSecurityInfo$1@54517d84
16/12/01 08:41:20 INFO security.NMTokenSelector: Looking for service: 127.0.0.1:8041. Current token is Kind: NMToken, Service: 127.0.0.1:8041, Ident: (org.apache.hadoop.yarn.security.NMTokenIdentifier@3207905b)
16/12/01 08:41:20 DEBUG security.SaslRpcClient: Creating SASL DIGEST-MD5(TOKEN)  client to authenticate to service at default
16/12/01 08:41:20 DEBUG security.SaslRpcClient: Use TOKEN authentication for protocol ContainerManagementProtocolPB
16/12/01 08:41:20 DEBUG security.SaslRpcClient: SASL client callback: setting username: AAABWLkj/xoAAAAHAAAAAQAYcXVpY2tzdGFydC5jbG91ZGVyYTo4MDQxAAhjbG91ZGVyYfAgGIk=
16/12/01 08:41:20 DEBUG security.SaslRpcClient: SASL client callback: setting userPassword
16/12/01 08:41:20 DEBUG security.SaslRpcClient: SASL client callback: setting realm: default
16/12/01 08:41:20 DEBUG security.SaslRpcClient: Sending sasl message state: INITIATE
token: "charset=utf-8,username=\"AAABWLkj/xoAAAAHAAAAAQAYcXVpY2tzdGFydC5jbG91ZGVyYTo4MDQxAAhjbG91ZGVyYfAgGIk=\",realm=\"default\",nonce=\"ZNiuPwJdndJ00K4CCramt6/QJ6mTWpYug5ljfedK\",nc=00000001,cnonce=\"3dEbPN/6mDk/AUEQX6oMczO7F+xEkskX46op0FkN\",digest-uri=\"/default\",maxbuf=65536,response=536e71f2181a8cb1f0342bb2fb4f1d59,qop=auth"
auths {
  method: "TOKEN"
  mechanism: "DIGEST-MD5"
  protocol: ""
  serverId: "default"
}

16/12/01 08:41:20 DEBUG security.SaslRpcClient: Received SASL message state: SUCCESS
token: "rspauth=badbd23c1a5f0df809088bf983aecd3c"

16/12/01 08:41:20 DEBUG ipc.Client: Negotiated QOP is :auth
16/12/01 08:41:20 DEBUG ipc.Client: IPC Client (2113973089) connection to quickstart.cloudera/127.0.0.1:8041 from appattempt_1480574893850_0007_000001 sending #8
16/12/01 08:41:20 DEBUG ipc.Client: IPC Client (2113973089) connection to quickstart.cloudera/127.0.0.1:8041 from appattempt_1480574893850_0007_000001: starting, having connections 2
16/12/01 08:41:20 DEBUG ipc.Client: IPC Client (2113973089) connection to quickstart.cloudera/127.0.0.1:8041 from appattempt_1480574893850_0007_000001 got value #8
16/12/01 08:41:20 DEBUG ipc.ProtobufRpcEngine: Call: stopContainers took 38ms
16/12/01 08:41:20 DEBUG ipc.Client: IPC Client (2113973089) connection to quickstart.cloudera/127.0.0.1:8041 from appattempt_1480574893850_0007_000001: closed
16/12/01 08:41:20 DEBUG ipc.Client: IPC Client (2113973089) connection to quickstart.cloudera/127.0.0.1:8041 from appattempt_1480574893850_0007_000001: stopped, remaining connections 1
16/12/01 08:41:20 INFO distributedshell.ApplicationMaster: Application completed. Signalling finish to RM
16/12/01 08:41:20 DEBUG ipc.Client: IPC Client (2113973089) connection to quickstart.cloudera/127.0.0.1:8030 from yarn sending #9
16/12/01 08:41:20 DEBUG ipc.Client: IPC Client (2113973089) connection to quickstart.cloudera/127.0.0.1:8030 from yarn got value #9
16/12/01 08:41:20 DEBUG ipc.ProtobufRpcEngine: Call: finishApplicationMaster took 2ms
16/12/01 08:41:20 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
16/12/01 08:41:20 DEBUG ipc.Client: IPC Client (2113973089) connection to quickstart.cloudera/127.0.0.1:8030 from yarn sending #10
16/12/01 08:41:20 DEBUG ipc.Client: IPC Client (2113973089) connection to quickstart.cloudera/127.0.0.1:8030 from yarn got value #10
16/12/01 08:41:20 DEBUG ipc.ProtobufRpcEngine: Call: finishApplicationMaster took 2ms
16/12/01 08:41:20 DEBUG service.AbstractService: Service: org.apache.hadoop.yarn.client.api.async.AMRMClientAsync entered state STOPPED
16/12/01 08:41:20 DEBUG impl.AMRMClientAsyncImpl: Heartbeater interrupted
java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:248)
16/12/01 08:41:20 DEBUG service.AbstractService: Service: org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl entered state STOPPED
16/12/01 08:41:20 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@771dcec7
16/12/01 08:41:20 INFO distributedshell.ApplicationMaster: Application Master completed successfully. exiting
16/12/01 08:41:20 INFO impl.AMRMClientAsyncImpl: Interrupted while waiting for queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
16/12/01 08:41:20 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@771dcec7

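It seems to me that although the application is submitted to YARN and YARN starts allocating resources, the JAR is never executed. I think I am missing something in the command (in the .json), perhaps the JAR's main class? I also tried some runs with these two options (--jar WordCount.jar --class SparkWordCount), but I get an error that says: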

FATAL distributedshell.ApplicationMaster: Error running ApplicationMaster
org.apache.commons.cli.UnrecognizedOptionException: Unrecognized option: --jar
    at org.apache.commons.cli.Parser.processOption(Parser.java:363)
    at org.apache.commons.cli.Parser.parse(Parser.java:199)
    at org.apache.commons.cli.Parser.parse(Parser.java:85)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.init(ApplicationMaster.java:377)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:298)
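
To make the suspicion concrete, here is a small sketch that takes the "command" string from application.json apart and lists the main class and the options it passes. The function name describe_am_command is hypothetical, and this only inspects the string; it does not prove what the cluster does with it.

```python
# The "command" value copied from application.json.
AM_COMMAND = (
    "{{JAVA_HOME}}/bin/java -Xmx10m "
    "org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster "
    "--container_memory 10 --container_vcores 1 --num_containers 1 --priority 0 "
    "1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr"
)


def describe_am_command(command: str) -> dict:
    """Return the main class and the long options found in an AM launch command."""
    # Drop the stdout/stderr redirections, then split on whitespace.
    tokens = [t for t in command.split() if not t.startswith(("1>", "2>"))]
    args = tokens[1:]  # skip the java binary
    # The main class is the first argument that is not a JVM flag such as -Xmx10m.
    class_index = next(i for i, t in enumerate(args) if not t.startswith("-"))
    options = [t for t in args[class_index + 1:] if t.startswith("--")]
    return {"main_class": args[class_index], "options": options}


print(describe_am_command(AM_COMMAND))
```

The main class reported is the distributed shell ApplicationMaster, the same class that rejects --jar in the stack trace, and none of the listed options refers to WordCount.jar or SparkWordCount. That matches what I see, but I am not sure what the correct command should look like.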