Handling a zookeeper cluster crash


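Today I ran into the following problem in a private-deployment (on-premises) project:

1. The customer reported that user login was returning 303.

2. After logging in to the server, I found that a large volume of logs had used up almost all of the disk space. All service processes were still alive, but they failed to listen on their ports, so the services were unavailable.

3. I cleaned up the log files.

4. Once the log files were cleaned up, I restarted the services. Restarting the zookeeper service produced the following error: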

2017-07-12 10:52:39,171 [myid:] - INFO [main:QuorumPeerConfig@103] - Reading configuration from: /data/apps/config/zookeeper/zoo.cfg
2017-07-12 10:52:39,176 [myid:] - INFO [main:QuorumPeerConfig@340] - Defaulting to majority quorums
2017-07-12 10:52:39,180 [myid:2] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 5
2017-07-12 10:52:39,180 [myid:2] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 24
2017-07-12 10:52:39,183 [myid:2] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
2017-07-12 10:52:39,194 [myid:2] - INFO [main:QuorumPeerMain@127] - Starting quorum peer
2017-07-12 10:52:39,196 [myid:2] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
2017-07-12 10:52:39,206 [myid:2] - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
2017-07-12 10:52:39,218 [myid:2] - INFO [main:QuorumPeer@959] - tickTime set to 2000
2017-07-12 10:52:39,218 [myid:2] - INFO [main:QuorumPeer@979] - minSessionTimeout set to -1
2017-07-12 10:52:39,218 [myid:2] - INFO [main:QuorumPeer@990] - maxSessionTimeout set to -1
2017-07-12 10:52:39,218 [myid:2] - INFO [main:QuorumPeer@1005] - initLimit set to 10
2017-07-12 10:52:39,230 [myid:2] - INFO [main:FileSnap@83] - Reading snapshot /data/apps/data/zookeeper/version-2/snapshot.60000888d
2017-07-12 10:52:39,341 [myid:2] - ERROR [main:Util@239] - Last transaction was partial.
2017-07-12 10:52:39,342 [myid:2] - ERROR [main:QuorumPeer@497] - Unable to load database on disk
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:576)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:595)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:561)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:643)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:547)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:522)
at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:354)
at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:450)
at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:440)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
2017-07-12 10:52:39,345 [myid:2] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server
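
For steps 2 and 3 above, here is a minimal sketch of how the full filesystem and the largest files on it might be located. The function names are made up for illustration, and the /data/apps path in the usage comment is only an assumption borrowed from the paths that appear in the log; the article does not say where the oversized logs actually lived.

```shell
# Print the mount point and use% of the filesystem that holds a path.
disk_usage() {
    df -P "$1" | awk 'NR == 2 { print $6, $5 }'
}

# List the N largest regular files under a directory (size in KB, then path).
# $1: directory to scan, $2: how many files to show (default 10).
largest_files() {
    find "$1" -type f -exec du -k {} + 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Example (assumed path, adjust to the affected server):
#   disk_usage /data/apps
#   largest_files /data/apps 10
```

Looking at the largest files first makes it easier to decide what is safe to delete, rather than removing files blindly.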

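After looking through some references, I found the following explanation for the zookeeper crash:

zookeeper presents a consistent view of state to all client processes that use a given piece of state. When a client gets a response from zookeeper, it can be confident that the response is consistent with every other response that it or any other client receives. Sometimes the zookeeper client library loses its connection to the zookeeper service and can no longer provide information that it can guarantee to be consistent. When the client library finds itself in this situation, it reports that state back to the caller.

In this incident the log above points to something more specific. Right after "Last transaction was partial", the server hits a java.io.EOFException while reading the header of a transaction log file (FileHeader.deserialize) and then reports "Unable to load database on disk". This is consistent with a transaction log file that was left truncated when the disk filled up, although the log alone does not prove it.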
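
To check whether a truncated transaction log is present, something like the following sketch could be used. It is an illustration, not part of the original procedure. It assumes the transaction log header is 16 bytes (an int magic, an int version, and a long dbid) and that transaction logs are named log.* inside version-2. The function name is hypothetical.

```shell
# Assumption: a transaction log header is 16 bytes (int magic, int version, long dbid).
HEADER_BYTES=16

# List transaction log files that are too short to contain a complete header.
# $1: the version-2 directory, for example /data/apps/data/zookeeper/version-2
find_truncated_txn_logs() {
    find "$1" -maxdepth 1 -type f -name 'log.*' -size -"${HEADER_BYTES}"c
}

# Example:
#   find_truncated_txn_logs /data/apps/data/zookeeper/version-2
```

If this prints a file, that file is a likely candidate for the one the server failed to read. If it prints nothing, the cause may lie elsewhere.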

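Solution:

1. Look at the zookeeper configuration file and find the data directory (the dataDir setting). The paths below are examples. In the log above, the configuration was read from /data/apps/config/zookeeper/zoo.cfg and the data lived under /data/apps/data/zookeeper.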

cat /etc/zookeeper/conf/zoo.cfg

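2. Delete or rename the version-2 data directory under that data directory. Renaming is the safer choice, because it keeps a backup.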

cd /var/lib/zookeeper

mv ./version-2 ./version-2.bak

3. Restart zookeeper, then check that the process is running and that its port is being listened on.
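
The first two steps above can be sketched as a pair of small shell functions. This is only a sketch: the function names are hypothetical, the timestamped backup name is my own choice rather than the version-2.bak name used above, and it assumes zookeeper is stopped while the directory is renamed. If dataLogDir is set in zoo.cfg, transaction logs are stored there instead of under dataDir, which this sketch does not handle.

```shell
# Read the dataDir value from a zoo.cfg file (commented-out lines are ignored).
# $1: path to zoo.cfg
zk_data_dir() {
    sed -n 's/^[[:space:]]*dataDir[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Rename version-2 to a timestamped backup instead of deleting it,
# and print the backup path on success.
# $1: the data directory returned by zk_data_dir
backup_version_dir() {
    data_dir=$1
    backup="$data_dir/version-2.bak.$(date +%Y%m%d%H%M%S)"
    mv "$data_dir/version-2" "$backup" && printf '%s\n' "$backup"
}

# Example (config path as used in the steps above):
#   data_dir=$(zk_data_dir /etc/zookeeper/conf/zoo.cfg)
#   backup_version_dir "$data_dir"
```

Reading the directory from the configuration file avoids renaming the wrong path when the installation does not use the default location.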

