Summary of Common HDFS Cluster Errors
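Author: Yin Zhengjie (尹正傑)
Copyright notice: this is an original work. Reproduction is not permitted, and violations will be pursued legally.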
1. DataXceiver error processing WRITE_BLOCK operation
The error message is as follows:
calculation112.aggrx:50010:DataXceiver error processing WRITE_BLOCK operation src: /10.1.1.116:36274 dst: /10.1.1.112:50010
java.io.IOException: Premature EOF from inputStream
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:203)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:901)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:808)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
    at java.lang.Thread.run(Thread.java:748)
......
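Cause:
The file operation outlived its lease. In practice, this means the file was deleted while the data stream operation was still in progress.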
Solution:
Step 1: raise the maximum number of open files per process
[root@calculation101 ~]# cat /etc/security/limits.conf | grep -v ^#
* soft nofile 1000000
* hard nofile 1048576
* soft nproc 65536
* hard nproc unlimited
* soft memlock unlimited
* hard memlock unlimited
* - nofile 1000000
* - nproc 1000000
[root@calculation101 ~]#
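Entries in limits.conf are applied when a new login session starts, so a DataNode that is already running keeps its old limits until it is restarted from a fresh session. The following is a minimal sketch for checking which limit is actually in effect. It assumes a Linux host with /proc mounted, and it assumes the DataNode JVM can be found by its main class name via pgrep; the helper name nofile_soft_limit is made up for this example.

```shell
#!/usr/bin/env bash
# Print the soft "Max open files" limit of a process, read from /proc/<pid>/limits.
# Falls back to the current shell when no PID is given.
# (nofile_soft_limit is a hypothetical helper name used only in this sketch.)
nofile_soft_limit() {
    local pid="${1:-$$}"
    awk '/^Max open files/ {print $4}' "/proc/${pid}/limits"
}

# Limit of the current shell session; this should match `ulimit -Sn`.
echo "current shell: $(nofile_soft_limit)"

# Assumption: the DataNode is identifiable by its main class on the command line.
# The [.] form keeps the pattern from matching the command that contains it.
dn_pid="$(pgrep -f 'hdfs[.]server[.]datanode[.]DataNode' | head -n 1)"
if [ -n "${dn_pid}" ]; then
    echo "DataNode (pid ${dn_pid}): $(nofile_soft_limit "${dn_pid}")"
else
    echo "no DataNode process found on this host"
fi
```

If the DataNode line still shows the old value after limits.conf was changed, the process was most likely started before the change and needs a restart to pick it up.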
Step 2: change the number of data transfer threads