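I am trying out Hadoop. I installed it and it works in standalone mode, but when I run it in pseudo-distributed mode, the following messages appear and the job never proceeds.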
17/10/24 02:04:15 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
17/10/24 02:04:16 INFO input.FileInputFormat: Total input files to process : 10
17/10/24 02:04:16 INFO mapreduce.JobSubmitter: number of splits:10
17/10/24 02:04:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1508778206216_0001
17/10/24 02:04:17 INFO impl.YarnClientImpl: Submitted application application_1508778206216_0001
17/10/24 02:04:17 INFO mapreduce.Job: The url to track the job: http://MacBook.local:8088/proxy/application_1508778206216_0001/
17/10/24 02:04:17 INFO mapreduce.Job: Running job: job_1508778206216_0001
I checked localhost:50070 and there is one live DataNode. Here is my setup procedure.
① Install Hadoop
brew install hadoop
② Configure Hadoop
○ libexec/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
○ libexec/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
○ libexec/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
</configuration>
○ libexec/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
</configuration>
③ Start Hadoop
sbin/start-all.sh
With jps I confirmed that the following are running:
- ResourceManager
- NodeManager
- SecondaryNameNode
- NameNode
- DataNode
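The jps check above can be scripted. This is only a minimal sketch: the function name missing_daemons is hypothetical, and it assumes jps prints one "<pid> <name>" line per running JVM, with the five daemon names listed in the question.

```shell
# Print every expected Hadoop daemon that is absent from the jps output on stdin.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare the whole second field so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Usage: jps | missing_daemons
```

If the function prints nothing, all five daemons were found in the jps output.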
④ Run Hadoop
hadoop jar libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar pi 10 100000
Then:
Number of Maps = 10
Samples per Map = 100000
17/10/24 02:04:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
17/10/24 02:04:15 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
17/10/24 02:04:16 INFO input.FileInputFormat: Total input files to process : 10
17/10/24 02:04:16 INFO mapreduce.JobSubmitter: number of splits:10
17/10/24 02:04:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1508778206216_0001
17/10/24 02:04:17 INFO impl.YarnClientImpl: Submitted application application_1508778206216_0001
17/10/24 02:04:17 INFO mapreduce.Job: The url to track the job: http://MacBook.local:8088/proxy/application_1508778206216_0001/
17/10/24 02:04:17 INFO mapreduce.Job: Running job: job_1508778206216_0001
The job never progresses. Please tell me why it does not work. I opened "http://macbook.local:8088/proxy/application_1508759907777_0001/" to check the JobTracker, but got the error code "ERR_EMPTY_RESPONSE".
1 Answer
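First, note that Hadoop 2.x uses the YARN ResourceManager and NodeManagers in place of JobTrackers and TaskTrackers.
Instead of a JobTracker setting, you could try adding the yarn.resourcemanager.hostname property to yarn-site.xml. I'm not sure what the jobtracker property (mapred.job.tracker) does in Hadoop 2.x, but it may be interfering. Removing it and setting resourcemanager.hostname explicitly may resolve the problem. You can test whether the ResourceManager is reachable by opening localhost:8032 in a browser, though 8032 is its RPC port, so the web UI on port 8088 is the more useful address to check. For more on setting up a pseudo-distributed cluster, see the Apache Hadoop docs.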
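The two checks suggested in this answer could be scripted roughly as follows. This is a sketch under assumptions, not a definitive procedure: has_property and port_open are hypothetical helper names, the file paths assume the Homebrew layout from the question, the ports 8032 and 8088 are taken from the question's log output, and nc is assumed to be installed.

```shell
# Succeeds if the config file $1 declares a property named $2.
# -F treats the name as a fixed string, so dots are not regex wildcards.
has_property() {
    grep -qF "<name>$2</name>" "$1"
}

# Succeeds if something accepts TCP connections on host $1, port $2.
# -z probes without sending data; -w 2 limits the wait to two seconds.
port_open() {
    nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# Usage (paths and ports are assumptions based on the question):
#   has_property libexec/etc/hadoop/mapred-site.xml mapred.job.tracker \
#       && echo "legacy JobTracker property is still set"
#   port_open localhost 8032 || echo "ResourceManager RPC port not reachable"
#   port_open localhost 8088 || echo "ResourceManager web UI not reachable"
```

A reachable port only shows that a process is listening there; it does not by itself explain why a submitted job stays at "Running job".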