ZooKeeper timeout issues


Problem 1:

2020-03-01 16:04:06,085 [myid:1] - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:ZooKeeperServer@694] - Established session 0x10635fe2a6368f1 with negotiated timeout 120000 for client /10.62.3.14:55222
2020-03-01 16:06:12,006 [myid:1] - WARN  [SyncThread:1:FileTxnLog@338] - fsync-ing the write ahead log in SyncThread:1 took 5073ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2020-03-01 16:06:13,906 [myid:1] - WARN  [SyncThread:1:FileTxnLog@338] - fsync-ing the write ahead log in SyncThread:1 took 1123ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
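
To get a feel for how often and how badly the sync stalls, the durations can be pulled out of these warnings with a short script. This is only a sketch: the regular expression is written against the two warning lines above, and the function name `fsync_durations_ms` is made up for illustration.

```python
import re

# Matches the duration in the slow-fsync warning shown above, e.g. "took 5073ms".
FSYNC_RE = re.compile(r"fsync-ing the write ahead log in \S+ took (\d+)ms")

def fsync_durations_ms(log_lines):
    """Return the fsync durations (in ms) found in the given log lines."""
    durations = []
    for line in log_lines:
        match = FSYNC_RE.search(line)
        if match:
            durations.append(int(match.group(1)))
    return durations

sample = [
    "2020-03-01 16:06:12,006 [myid:1] - WARN  [SyncThread:1:FileTxnLog@338] - "
    "fsync-ing the write ahead log in SyncThread:1 took 5073ms which will "
    "adversely effect operation latency. See the ZooKeeper troubleshooting guide",
    "2020-03-01 16:06:13,906 [myid:1] - WARN  [SyncThread:1:FileTxnLog@338] - "
    "fsync-ing the write ahead log in SyncThread:1 took 1123ms which will "
    "adversely effect operation latency. See the ZooKeeper troubleshooting guide",
]

print(fsync_durations_ms(sample))  # [5073, 1123]
```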

Analysis: the ZooKeeper server is taking too long to fsync its write-ahead log (the transaction log) to disk.
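
What the warning measures is the time spent forcing buffered log data onto the physical device. The following sketch is not ZooKeeper code; it is a small Python illustration of the difference between handing data to the operating system's cache (`flush()`) and forcing it to disk (`os.fsync()`). The file name `txn.log` and the helper `timed_append` are made up for the example, and the timings depend entirely on the disk it runs on.

```python
import os
import tempfile
import time

def timed_append(path, payload, force_sync):
    """Append payload to path and return the elapsed time in seconds.

    flush() hands the bytes to the operating system's cache;
    os.fsync() additionally asks the OS to push them to the device.
    """
    start = time.perf_counter()
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()
        if force_sync:
            os.fsync(f.fileno())
    return time.perf_counter() - start

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "txn.log")  # hypothetical file name
    cached = timed_append(log_path, b"x" * 4096, force_sync=False)
    synced = timed_append(log_path, b"x" * 4096, force_sync=True)
    size = os.path.getsize(log_path)

print(f"flush only: {cached * 1000:.3f} ms, flush + fsync: {synced * 1000:.3f} ms")
```

On a busy or slow disk the second number can grow to seconds, which is the situation the warning above reports.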

Solution:

1. Add the following to zoo.cfg:

forceSync=no

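forceSync is enabled by default. With it on, ZooKeeper syncs each update to the transaction log on disk as soon as it receives it, and only replies to the client after the sync has completed, so a slow disk delays every response. With it turned off, the server no longer waits for the sync and clients get a faster response.

(Figure: ZooKeeper log-flushing source code; image not available.)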

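Turning off forceSync carries a potential risk. The log is still flushed (log.flush() is still executed first), but the operating system buffers writes in its cache to improve disk throughput. If the machine fails, some ZooKeeper state may never have reached the disk, leaving ZooKeeper's data inconsistent before and after the failure.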
2. Store ZooKeeper's transaction log files and data files separately, so that they are not on the same disk.
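
For item 2, ZooKeeper's `dataDir` and `dataLogDir` settings control where the data files and the transaction log are stored. The sketch below parses a zoo.cfg-style text and offers a rough check of whether two existing directories sit on the same filesystem device. The paths are hypothetical, and the helper names are invented for this example. Note the limitation: an identical `st_dev` means the same filesystem, while different values can still be two partitions of one physical disk, so the check is a hint rather than proof.

```python
import os

def parse_zoo_cfg(text):
    """Parse key=value lines of a zoo.cfg-style file, skipping comments."""
    cfg = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip()
    return cfg

def shares_device(path_a, path_b):
    """Return True if both existing paths live on the same filesystem device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev

# Hypothetical paths, for illustration only.
example_cfg = """
# data files (snapshots)
dataDir=/data/zookeeper
# transaction log, ideally on a dedicated disk
dataLogDir=/datalog/zookeeper
"""

cfg = parse_zoo_cfg(example_cfg)
print(cfg["dataDir"], cfg["dataLogDir"])
```

On a real server, `shares_device(cfg["dataDir"], cfg["dataLogDir"])` returning True would suggest the two directories still compete for the same device.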

 

Problem 2:

2020-03-01 16:27:16,786 [myid:1] - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:ZooKeeperServer@694] - Established session 0x3075e0e93860151 with negotiated timeout 120000 for client /10.62.3.2:60124
2020-03-01 16:34:52,706 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x3075e0e93860135, likely client has closed socket
2020-03-01 16:34:52,706 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1056] - Closed socket connection for client /10.62.3.2:50244 which had sessionid 0x3075e0e93860135
2020-03-01 16:35:48,351 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x10635fe2a636912, likely client has closed socket
2020-03-01 16:35:48,351 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1056] - Closed socket connection for client /10.62.3.14:60822 which had sessionid 0x10635fe2a636912
2020-03-01 16:35:58,226 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x10635fe2a636914, likely client has closed socket
2020-03-01 16:35:58,226 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1056] - Closed socket connection for client /10.62.3.14:60856 which had sessionid 0x10635fe2a636914
2020-03-01 16:36:04,902 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x10635fe2a636910, likely client has closed socket

Analysis: the session timeout configured by the client when connecting to ZooKeeper is too short.

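As the log above shows, the session timeout has already been negotiated at 120 s, which should be adequate for an HBase cluster. Even so, some RegionServer machines still failed while flushing the memstore. For now, the plan is to change the tickTime parameter in zoo.cfg and keep observing.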
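
tickTime matters here because, by default, the server bounds the session timeout a client asks for: minSessionTimeout defaults to 2 × tickTime and maxSessionTimeout to 20 × tickTime. The function below is a simplified model of that clamping, not the server's actual code, and its name and arguments are chosen for illustration only.

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms, min_ms=None, max_ms=None):
    """Clamp the requested session timeout to the server's allowed range.

    If min_ms / max_ms are not given, fall back to the defaults of
    2 * tickTime and 20 * tickTime.
    """
    lower = min_ms if min_ms is not None else 2 * tick_time_ms
    upper = max_ms if max_ms is not None else 20 * tick_time_ms
    return max(lower, min(requested_ms, upper))

# A client asking for 120 s against tickTime=2000 is capped at 40 s.
print(negotiated_timeout_ms(120000, 2000))  # 40000
# With tickTime=6000 the full 120 s is granted.
print(negotiated_timeout_ms(120000, 6000))  # 120000
```

Under this model, the "negotiated timeout 120000" in the log implies either a tickTime of at least 6000 ms or an explicitly raised maxSessionTimeout; which of the two applies cannot be told from the log alone.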

