Error: Hadoop "There appears to be a gap in the edit log. We expected txid 927, but got txid 1265."


Error Background

Hadoop fails to start; the NameNode exits with the error below.

Error Symptoms

Number of suppressed write-lock reports: 0
    Longest write-lock held interval: 10734
2020-12-15 10:48:07,720 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: There appears to be a gap in the edit log.  We expected txid 927, but got txid 1265.
    at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:215)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:321)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:978)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:685)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:585)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:645)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:819)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:803)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1500)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1566)
2020-12-15 10:48:07,722 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@bigdata01:50070
2020-12-15 10:48:07,823 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2020-12-15 10:48:07,823 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2020-12-15 10:48:07,823 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2020-12-15 10:48:07,823 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.io.IOException: There appears to be a gap in the edit log.  We expected txid 927, but got txid 1265.
    at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:215)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:321)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:978)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:685)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:585)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:645)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:819)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:803)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1500)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1566)
2020-12-15 10:48:07,825 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2020-12-15 10:48:07,825 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: 

 

Error Cause

The log says "We expected txid 927, but got txid 1265": the NameNode needs the edits segment starting at txid 927, but the earliest segment it can find starts at txid 1265. Inspecting the JournalNode (jn) directory confirms that only edits_0000000000000001265-0000000000000001265 is present, so transactions 927 through 1264 are missing.
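You can confirm a gap like this directly from the segment file names, since each finalized segment is named edits_&lt;startTxid&gt;-&lt;endTxid&gt; and consecutive segments should be contiguous. A minimal sketch (the directory path is the one used later in this article; `find_edit_gap` is a helper written for illustration, not a Hadoop command):

```shell
# Sketch: scan a JournalNode edits directory and report any txid gap between
# consecutive finalized segments.
find_edit_gap() {
    dir=$1
    prev_end=""
    for f in $(ls "$dir" 2>/dev/null | grep -E '^edits_[0-9]+-[0-9]+$' | sort); do
        range=${f#edits_}                            # e.g. 0...0927-0...1264
        start=$(echo "${range%-*}" | sed 's/^0*//')  # strip leading zeros
        end=$(echo "${range#*-}" | sed 's/^0*//')
        if [ -n "$prev_end" ] && [ "$start" -ne $((prev_end + 1)) ]; then
            echo "gap: expected txid $((prev_end + 1)), but got txid $start"
        fi
        prev_end=$end
    done
}

find_edit_gap /data/hadoop/hdfs/journal/cluster/current
```

In this incident the scan would report exactly the gap from the log: expected 927, got 1265.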

Error Resolution

(1) Method 1: repair the metadata with recovery mode

# hadoop namenode -recover

When prompted with (Y/N), answer Y.

When prompted with (c/s/q/a), answer c to continue past the gap.

Then restart Hadoop.
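Note that recovery mode resolves the gap by skipping the missing transactions, so it can lose metadata; before running it, it is prudent to snapshot the metadata directories. A minimal sketch (`backup_dir` is a helper invented here, and the example path is the JournalNode directory from this article; do the same for your dfs.namenode.name.dir):

```shell
# Hedged sketch: snapshot a metadata directory before attempting recovery.
# backup_dir is a helper written for this article, not a Hadoop command.
backup_dir() {
    src=$1
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    cp -a "$src" "$dest" && echo "$dest"   # print the backup location
}

# Example (JournalNode edits dir from this article):
#   backup_dir /data/hadoop/hdfs/journal/cluster/current
```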

(2) Method 2: copy the missing files from another JournalNode

An HDFS HA cluster runs more than one JournalNode (two in this setup), so any edits files that exist on the other node but are missing on this one can be copied over with scp.

The directory is site-specific (set by dfs.journalnode.edits.dir), for example /data/hadoop/hdfs/journal/cluster/current:

# scp edits_0000000000000000041-0000000000000000043 root@bigdata01:/data/hadoop/hdfs/journal/cluster/current
# hdfs zkfc -formatZK

Then restart Hadoop.
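The copy step above can be done systematically by diffing the two nodes' directory listings. A sketch (`missing_segments` is a helper invented here; the hostname and path follow the article, and the listing of the broken node is assumed to be fetched over ssh):

```shell
# Sketch: given `ls` listings from the healthy and the broken JournalNode,
# print the finalized edits segments the broken node is missing.
missing_segments() {
    healthy_list=$1   # file with `ls` output from the healthy JournalNode
    broken_list=$2    # file with `ls` output from the broken JournalNode
    grep -E '^edits_[0-9]+-[0-9]+$' "$healthy_list" \
        | grep -v -x -F -f "$broken_list"   # keep only lines absent from broken
}

# Usage (run from the healthy node, paths per the article):
#   ls /data/hadoop/hdfs/journal/cluster/current > healthy.txt
#   ssh root@bigdata01 ls /data/hadoop/hdfs/journal/cluster/current > broken.txt
#   for f in $(missing_segments healthy.txt broken.txt); do
#       scp "/data/hadoop/hdfs/journal/cluster/current/$f" \
#           root@bigdata01:/data/hadoop/hdfs/journal/cluster/current/
#   done
```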

 

