Steps to resolve a Hadoop 2 DataNode startup failure


1. The DataNode fails to start
2016-11-25 09:46:43,685 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid dfs.datanode.data.dir /home/hadoop3/hadoop_data/data :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /home/hadoop3/hadoop_data/data
        at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
        at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
        at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
        at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2272)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2314)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2296)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2188)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2235)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2411)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2435)

java.io.IOException: BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hdmaster/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
        at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:210)
        at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:242)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:391)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:472)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1322)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1292)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:321)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:862)
        at java.lang.Thread.run(Thread.java:745)
2016-11-25 09:27:18,795 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage for block pool: BP-994368505-192.168.30.223-1441944900262 : BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hdmaster/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
2016-11-25 09:27:18,795 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory [DISK]file:/home/hadoop1/hadoop_data/data/ has already been used.
2016-11-25 09:27:18,818 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-994368505-192.168.30.223-1441944900262
2016-11-25 09:27:18,818 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to analyze storage directories for block pool BP-994368505-192.168.30.223-1441944900262
java.io.IOException: BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hadoop1/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
        at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:210)
        at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:242)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:391)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:472)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1322)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1292)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:321)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:862)
        at java.lang.Thread.run(Thread.java:745)
2016-11-25 09:27:18,818 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage for block pool: BP-994368505-192.168.30.223-1441944900262 : BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hadoop1/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
2016-11-25 09:27:18,818 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory [DISK]file:/home/hadoop2/hadoop_data/data/ has already been used.
2016-11-25 09:27:18,839 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-994368505-192.168.30.223-1441944900262
2016-11-25 09:27:18,839 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to analyze storage directories for block pool BP-994368505-192.168.30.223-1441944900262
java.io.IOException: BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hadoop2/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
        at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:210)
        at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:242)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:391)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:472)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1322)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1292)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:321)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:862)
        at java.lang.Thread.run(Thread.java:745)
2016-11-25 09:27:18,840 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage for block pool: BP-994368505-192.168.30.223-1441944900262 : BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /home/hadoop2/hadoop_data/data/current/BP-994368505-192.168.30.223-1441944900262
2016-11-25 09:27:18,840 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to namenode01/192.168.30.223:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:473)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1322)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1292)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:321)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:862)
        at java.lang.Thread.run(Thread.java:745)
2016-11-25 09:27:18,840 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to namenode02/192.168.32.124:9000. Exiting.
org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 3, volumes configured: 4, volumes failed: 1, volume failures tolerated: 0
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:247)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1335)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1292)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:321)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:862)
        at java.lang.Thread.run(Thread.java:745)

--- Cause: the third disk has failed
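
To confirm which configured volume the DataNode rejected, the path can be pulled out of the "Invalid dfs.datanode.data.dir" warnings. A minimal sketch: the function name rejected_data_dirs is made up for illustration, and the log file location depends on your installation.

```shell
# Print each directory that the DataNode reported as an invalid
# dfs.datanode.data.dir, one per line, without duplicates.
# $1: path to a DataNode log file (location is an assumption of this sketch).
rejected_data_dirs() {
    sed -n 's/.*Invalid dfs\.datanode\.data\.dir \([^ ]*\) *:.*/\1/p' "$1" | sort -u
}
```

Called on a log containing the first warning shown above, it prints /home/hadoop3/hadoop_data/data.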
Step 1: Repair the disk
[root@hdslave04 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_server03-lv_root
                       50G   27G   20G  58% /
tmpfs                  16G   68K   16G   1% /dev/shm
/dev/sda1             477M   59M  393M  13% /boot
/dev/mapper/vg_server03-lv_home
                      1.8T  1.5T  207G  88% /home
/dev/sdb1             1.8T  1.5T  303G  83% /home/hadoop1
/dev/sdc1             1.8T  1.5T  286G  84% /home/hadoop2
/dev/sdd1             1.8T  1.5T  250G  86% /home/hadoop3
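
df still lists /dev/sdd1 as mounted, so it does not reveal the problem the DataNode log complains about. A direct write test per data directory is more telling. A sketch, assuming it is run as the user that owns the DataNode process; check_writable is a hypothetical helper, not a Hadoop tool.

```shell
# Try to create and remove a small probe file in each directory.
# Prints "OK   <dir>" or "FAIL <dir>" and returns non-zero if any directory failed.
check_writable() {
    rc=0
    for dir in "$@"; do
        probe="$dir/.write_probe.$$"
        if touch "$probe" 2>/dev/null; then
            rm -f "$probe"
            echo "OK   $dir"
        else
            echo "FAIL $dir"
            rc=1
        fi
    done
    return $rc
}
```

For this node the call would be: check_writable /home/hadoop1/hadoop_data/data /home/hadoop2/hadoop_data/data /home/hadoop3/hadoop_data/data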

umount /dev/sdd1
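If umount fails with "umount: /dev/sdd1: device is busy", find the processes that are using the mount point, kill them, then unmount again:
fuser -m /home/hadoop3
kill <pid>    # repeat for each PID that fuser printed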
umount /dev/sdd1
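
Before running fsck it is worth confirming that the unmount really took effect. A minimal sketch that looks the mount point up in /proc/mounts; the function name is_mounted is illustrative only.

```shell
# Succeed if the given mount point appears in a mounts table.
# $1: mount point, e.g. /home/hadoop3
# $2: mounts table to read (defaults to /proc/mounts on Linux)
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

Example: is_mounted /home/hadoop3 || echo "/home/hadoop3 is unmounted"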

[root@hdslave04 ~]# fsck -y /dev/sdd1
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
fsck.ext4: No such device or address while trying to open /dev/sdd1
Possibly non-existent or swap device?

-- The output above shows that the disk itself has failed.
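
One way to cross-check that conclusion is to see whether the kernel still lists the partition in /proc/partitions. A minimal sketch: kernel_lists_device is a made-up helper name, and a missing entry is only a hint, not proof of a hardware fault.

```shell
# Succeed if the kernel currently lists the named device or partition.
# $1: kernel name without the /dev/ prefix, e.g. sdd1
# $2: partition list to read (defaults to /proc/partitions on Linux)
kernel_lists_device() {
    awk -v name="$1" '$4 == name { found = 1 } END { exit !found }' "${2:-/proc/partitions}"
}
```

Example: kernel_lists_device sdd1 || echo "kernel does not see sdd1"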

Step 2: Replace the disk, then partition the new disk as GPT with parted
  Reference: http://soft.chinabyte.com/os/447/12439447.shtml
  parted /dev/sdd
  (parted) mklabel ---- create the disk label
  New disk label type? gpt
  (parted) p ---- show the partition status
  Model: VMware, VMware Virtual S (scsi)
  Disk /dev/sde: 2000GB
  Sector size (logical/physical): 512B/512B
  Partition Table: gpt
  Number Start End Size File system Name Flags

  (parted) mkpart
  Partition name? []? sdd1 ---- partition name
  File system type? [ext2]? ext4 ---- file system type
  Start? 1 ---- start position
  End? 2000GB ---- end position
  (parted) p ---- show the partition information
  Model: VMware, VMware Virtual S (scsi)
  Disk /dev/sde: 2000GB
  Sector size (logical/physical): 512B/512B
  Partition Table: gpt
  Number Start End Size File system Name Flags
  1 17.4kB 2000GB 2000GB sdd1

  (parted) q ---- quit
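
The same partitioning can be done without the interactive prompts by using parted's script mode (-s). The sketch below only prints the command so it can be reviewed first, because mklabel destroys the existing partition table. The function name is hypothetical, and the 1MiB start and 100% end are assumptions that stand in for the "1" and "2000GB" typed in the session.

```shell
# Print (do not run) a non-interactive parted command that creates a GPT
# label and one ext4-typed partition spanning the disk.
# $1: whole-disk device, e.g. /dev/sdd
# $2: GPT partition name, e.g. sdd1
# Assumption: start 1MiB and end 100% replace the values typed interactively.
gpt_single_partition_cmd() {
    printf 'parted -s %s mklabel gpt mkpart %s ext4 1MiB 100%%\n' "$1" "$2"
}
```

Example: gpt_single_partition_cmd /dev/sdd sdd1, then run the printed line as root once the device name has been double-checked.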

Step 3: Format the new disk
mkfs.ext4 /dev/sdd1
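Edit /etc/fstab so that the new disk is mounted on /home/hadoop3. For example (the mount options shown are common defaults and an assumption; match the entries of the other data disks):
/dev/sdd1             /home/hadoop3           ext4    defaults        0 0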
Reboot:
shutdown -r now

Step 4: Restart the DataNode and NodeManager
sh hadoop-daemon.sh start datanode
sh yarn-daemon.sh start nodemanager
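
After starting the daemons, jps (shipped with the JDK) shows whether both JVMs are actually up. A small sketch that reads jps output on stdin; daemons_running is an illustrative name, and a listed process does not by itself prove that the DataNode has registered with the NameNode.

```shell
# Read `jps` output ("<pid> <name>" per line) on stdin and succeed only if
# every daemon named on the command line is listed.
daemons_running() {
    listing=$(cat)
    for name in "$@"; do
        printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' || {
            echo "not running: $name"
            return 1
        }
    done
}
```

Example: jps | daemons_running DataNode NodeManager && echo "both daemons are up"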


-- Addendum to step 1: as a last resort, reboot the server and see whether a reboot helps
Reboot the server: shutdown -r now

 


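After the reboot the machine hung.
1. Boot into Linux single-user mode
Remount the root file system read-write and edit fstab:
root# mount -o remount,rw /
root# vim /etc/fstab
Comment out the entry for the failed disk (the /dev/sdd1 line that mounts /home/hadoop3) by putting # in front of it.
2. Reboot the server; the system now boots normally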
shutdown -r now
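
The fstab edit can also be scripted, which helps when several nodes need the same change. A minimal sketch under the assumption that the entry is identified by its mount point; comment_out_mount is a hypothetical helper, and it writes the result to stdout instead of touching the file.

```shell
# Print the given fstab with every active entry for one mount point
# commented out. The input file itself is not modified.
# $1: fstab file to read
# $2: mount point, e.g. /home/hadoop3
comment_out_mount() {
    awk -v mp="$2" '$1 !~ /^#/ && $2 == mp { print "#" $0; next } { print }' "$1"
}
```

Example: comment_out_mount /etc/fstab /home/hadoop3 > /tmp/fstab.new, compare it with the original using diff, then copy it into place. The /tmp/fstab.new path is only an example.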

