zookeeper 超时问题


问题1:

2020-03-01 16:04:06,085 [myid:1] - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:ZooKeeperServer@694] - Established session 0x10635fe2a6368f1 with negotiated timeout 120000 for client /10.62.3.14:55222
2020-03-01 16:06:12,006 [myid:1] - WARN  [SyncThread:1:FileTxnLog@338] - fsync-ing the write ahead log in SyncThread:1 took 5073ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2020-03-01 16:06:13,906 [myid:1] - WARN  [SyncThread:1:FileTxnLog@338] - fsync-ing the write ahead log in SyncThread:1 took 1123ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide

分析: ZK服务端在fsync-ing the write ahead log日志时超长引起。

解决办法:

1、在zoo.cfg添加:

forceSync=no

默认是开启的,为避免同步延迟问题,ZK接收到数据后会立刻去讲当前状态信息同步到磁盘日志文件中,同步完成后才会应答。将此项关闭后,客户端连接可以得到快速响应。Zk涮日志源码如下图:

 

关闭forceSync选项后,会存在潜在风险,虽然依旧会刷磁盘(log.flush()首先被执行),但因为操作系统为提高写磁盘效率,会先写缓存,当机器异常后,可能导致一些zk状态信息没有同步到磁盘,从而带来ZK前后信息不一样问题。
2、把zookeeper的日志文件和数据文件分开存储,不存在在一块磁盘

 

问题2:

2020-03-01 16:27:16,786 [myid:1] - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:ZooKeeperServer@694] - Established session 0x3075e0e93860151 with negotiated timeout 120000 for client /10.62.3.2:60124
2020-03-01 16:34:52,706 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x3075e0e93860135, likely client has closed socket
2020-03-01 16:34:52,706 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1056] - Closed socket connection for client /10.62.3.2:50244 which had sessionid 0x3075e0e93860135
2020-03-01 16:35:48,351 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x10635fe2a636912, likely client has closed socket
2020-03-01 16:35:48,351 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1056] - Closed socket connection for client /10.62.3.14:60822 which had sessionid 0x10635fe2a636912
2020-03-01 16:35:58,226 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x10635fe2a636914, likely client has closed socket
2020-03-01 16:35:58,226 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1056] - Closed socket connection for client /10.62.3.14:60856 which had sessionid 0x10635fe2a636914
2020-03-01 16:36:04,902 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x10635fe2a636910, likely client has closed socket

分析: 客户端连接Zookeeper时,配置的超时时长过短。

从上述的信息可以看出来,,会话超时时间已经设置了120s,对于hbase集群来说,,这个超时时间应该是没问题的,但是还是有的regionserver机器由于在flush memstor时失败了,,这里暂且在zoo.cfg文件,修改tickTime参数在观察看看。


免责声明!

本站转载的文章为个人学习借鉴使用,本站对版权不负任何法律责任。如果侵犯了您的隐私权益,请联系本站邮箱yoyou2525@163.com删除。



 
粤ICP备18138465号  © 2018-2025 CODEPRJ.COM