
Hadoop distcp job succeeded, but attempt_xxx was killed by the ApplicationMaster

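While running a distcp job I ran into the following problem: almost all map tasks are marked as succeeded, but a note says the container was killed.

In the web UI, the log for the map tasks shows: Progress 100.00, State SUCCEEDED

But the note for almost every attempt (~200) says: Container killed by the ApplicationMaster. Container killed on request. Exit code is 143

In the log file associated with the attempt, I can see a line saying that task 'attempt_xxxxxxxxx_0' is done.

The stderr output is empty for all jobs/attempts.

Looking at the ApplicationMaster log and following one of the succeeded (but killed) attempts, I find these entries: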

2017-01-05 10:27:22,772 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1483370705805_4012_m_000000_0
2017-01-05 10:27:22,773 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1483370705805_4012_m_000000 Task Transitioned from RUNNING to SUCCEEDED
2017-01-05 10:27:22,775 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2017-01-05 10:27:22,775 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1483370705805_4012Job Transitioned from RUNNING to COMMITTING
2017-01-05 10:27:22,776 INFO [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_COMMIT
2017-01-05 10:27:23,118 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2017-01-05 10:27:24,125 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_e116_1483370705805_4012_01_000002
2017-01-05 10:27:24,126 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2017-01-05 10:27:24,126 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1483370705805_4012_m_000000_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
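
The exit code in the diagnostics above can be decoded with a little arithmetic. By common shell convention, an exit status above 128 means the process ended on signal number (status - 128), so 143 corresponds to signal 15 (SIGTERM), which fits "Container killed on request". A minimal sketch; the helper name `decode_exit_code` is hypothetical and not part of Hadoop:

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an exit status to the signal behind it, if any.
decode_exit_code() {
  local status="$1"
  if [ "$status" -gt 128 ]; then
    local signum=$((status - 128))
    # `kill -l N` prints the name of signal number N (TERM for 15).
    echo "signal ${signum} (SIG$(kill -l "$signum"))"
  else
    echo "exit status ${status} (no signal)"
  fi
}

decode_exit_code 143
```

This only explains how the container stopped, not why the ApplicationMaster asked for it to be stopped.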

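I have set "mapreduce.map.speculative = false"!

All map tasks are SUCCEEDED (a distcp job has no reduce phase), but the MapReduce job keeps running for a long time (several hours) before it finally succeeds and the distcp job is done.

I am running Hadoop 2.5.0-cdh5.3.1 (as reported by 'yarn version').

Should I be worried? What is causing the containers to be killed? Any suggestions would be greatly appreciated!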

1 Answer


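    Those killed attempts are probably due to speculative execution. If so, there is nothing to worry about.

    To confirm that this is the case, try running your distcp like this: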

    hadoop distcp  -Dmapreduce.map.speculative=false ...
    

    You should stop seeing those killed attempts.
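
One way to check whether the killed attempts had already finished their work is to compare, in the ApplicationMaster log, the attempts reported as succeeded with the attempts that carry the kill diagnostic. A rough sketch, assuming the AM log has been saved to a local file (for example with `yarn logs -applicationId <application id>`, which needs log aggregation to be enabled) and that its lines look like the ones quoted in the question; the function name `killed_after_success` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: given a saved ApplicationMaster log file, print the
# attempts that were reported as succeeded AND carry the AM kill diagnostic.
killed_after_success() {
  local log="$1"
  local succeeded killed
  succeeded=$(mktemp)
  killed=$(mktemp)
  # Attempts the AM recorded as successful (last field is the attempt id).
  grep -o 'Task succeeded with attempt attempt_[0-9_a-z]*' "$log" \
    | awk '{print $NF}' | sort -u > "$succeeded"
  # Attempts whose diagnostics line mentions the kill by the AM.
  grep 'Container killed by the ApplicationMaster' "$log" \
    | grep -o 'attempt_[0-9_a-z]*' | sort -u > "$killed"
  # Lines common to both sorted lists.
  comm -12 "$succeeded" "$killed"
  rm -f "$succeeded" "$killed"
}
```

Calling `killed_after_success am.log` (with `am.log` being whatever file the log was saved to) lists the attempts that appear in both sets. If every killed attempt is in that list, the kills happened after the tasks had already succeeded.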
