
Hadoop: File ... could only be replicated to 0 nodes, instead of 1


I am trying to deploy Hadoop-RDMA on an 8-node IB (OFED-1.5.3-4.0.42) cluster and have run into the following problem (a.k.a. "File ... could only be replicated to 0 nodes, instead of 1"):

frolo@A11:~/hadoop-rdma-0.9.8> ./bin/hadoop dfs -copyFromLocal ../pg132.txt /user/frolo/input/pg132.txt
Warning: $HADOOP_HOME is deprecated.

14/02/05 19:06:30 WARN hdfs.DFSClient: DataStreamer Exception: java.lang.reflect.UndeclaredThrowableException
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Unknown Source)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Unknown Source)
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.From.Code(Unknown Source)
    at org.apache.hadoop.hdfs.From.F(Unknown Source)
    at org.apache.hadoop.hdfs.From.F(Unknown Source)
    at org.apache.hadoop.hdfs.The.run(Unknown Source)
Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/frolo/input/pg132.txt could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(Unknown Source)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.ipc.RPC$Server.call(Unknown Source)
    at org.apache.hadoop.ipc.rdma.madness.Code(Unknown Source)
    at org.apache.hadoop.ipc.rdma.madness.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(Unknown Source)
    at org.apache.hadoop.ipc.rdma.be.run(Unknown Source)
    at org.apache.hadoop.ipc.rdma.RDMAClient.Code(Unknown Source)
    at org.apache.hadoop.ipc.rdma.RDMAClient.call(Unknown Source)
    at org.apache.hadoop.ipc.Tempest.invoke(Unknown Source)
    ... 12 more

14/02/05 19:06:30 WARN hdfs.DFSClient: Error Recovery for null bad datanode[0] nodes == null
14/02/05 19:06:30 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/frolo/input/pg132.txt" - Aborting...
14/02/05 19:06:30 INFO hdfs.DFSClient: exception in isClosed

It looks like no data is transferred to the DataNodes when I start copying from the local file system to HDFS. I checked the availability of the DataNodes:

frolo@A11:~/hadoop-rdma-0.9.8> ./bin/hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.

Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: �%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (4 total, 4 dead)

Name: 10.10.1.13:50010
Decommission Status : Normal
Configured Capacity: 0 (0 KB)
DFS Used: 0 (0 KB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 0(0 KB)
DFS Used%: 100%
DFS Remaining%: 0%
Last contact: Wed Feb 05 19:02:54 MSK 2014


Name: 10.10.1.14:50010
Decommission Status : Normal
Configured Capacity: 0 (0 KB)
DFS Used: 0 (0 KB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 0(0 KB)
DFS Used%: 100%
DFS Remaining%: 0%
Last contact: Wed Feb 05 19:02:54 MSK 2014


Name: 10.10.1.16:50010
Decommission Status : Normal
Configured Capacity: 0 (0 KB)
DFS Used: 0 (0 KB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 0(0 KB)
DFS Used%: 100%
DFS Remaining%: 0%
Last contact: Wed Feb 05 19:02:54 MSK 2014


Name: 10.10.1.11:50010
Decommission Status : Normal
Configured Capacity: 0 (0 KB)
DFS Used: 0 (0 KB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 0(0 KB)
DFS Used%: 100%
DFS Remaining%: 0%
Last contact: Wed Feb 05 19:02:55 MSK 2014

I also tried mkdir in the HDFS file system, which succeeded (see the commands below). Restarting the Hadoop daemons had no positive effect.
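For reference, the metadata-only operations that did succeed were along these lines (same target path as the copyFromLocal above). mkdir only talks to the NameNode, which is why it can work while block writes, which need live DataNodes, fail:

./bin/hadoop dfs -mkdir /user/frolo/input    # namespace change only, handled by the NameNode
./bin/hadoop dfs -ls /user/frolo             # listing also works; only writes that allocate blocks fail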

Could you help me resolve this problem? Thanks.

Best, Alex

1 Answer


I found my problem. It was related to the configuration of hadoop.tmp.dir, which had been set to an NFS partition. By default it is configured under /tmp, on the local fs. After removing hadoop.tmp.dir from core-site.xml the problem was solved.
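For illustration, the offending entry in conf/core-site.xml looked roughly like this; the NFS mount path shown is a made-up example, only the property name hadoop.tmp.dir is from my setup:

<property>
  <name>hadoop.tmp.dir</name>
  <!-- hypothetical NFS mount; pointing the tmp dir (and with it the default
       DataNode storage) at shared NFS storage left the DataNodes without
       usable local storage, so they all showed up as dead -->
  <value>/mnt/nfs/hadoop-tmp</value>
</property>

After deleting this property, hadoop.tmp.dir falls back to its default under /tmp on each node's local disk, the DataNodes came back, and the copy succeeded.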
