ZooKeeper Lesson 1: Installation and Configuration


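Introduction:

ZooKeeper is an open-source implementation of Google's Chubby and the distributed coordination service used by Hadoop. It provides a small set of primitives for building synchronization, configuration maintenance, group (cluster) membership, and naming services.

A ZooKeeper cluster is made up of multiple servers: one leader and several followers. Every server holds a consistent copy of the data. Reads and writes are distributed across the servers, and update requests are forwarded to the leader, which carries them out.

Update requests are applied in order: updates from the same client are executed in the order in which they were sent. Updates are atomic: each one either succeeds or fails. There is a single global view of the data: whichever server a client connects to, it sees the same view.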

 

After downloading the ZooKeeper package, extract it to a suitable directory.

Download URL: http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.6/

 

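A ZooKeeper cluster is an independent distributed coordination service. "Independent" here means that any distributed application that wants simpler coordination and management can use it. This is possible thanks to ZooKeeper's data model and hierarchical namespace; for details see http://zookeeper.apache.org/doc/trunk/zookeeperOver.html. When designing the coordination layer of your distributed application, the first thing to consider is how to organize the hierarchical namespace.

A ZooKeeper cluster has two key roles: Leader and Follower. All nodes in the cluster serve distributed applications as a single whole, and every node is connected to every other node. When configuring a ZooKeeper cluster, the hostname-to-IP mapping on each node must therefore include entries for all the other nodes in the cluster.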
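
The hierarchical namespace mentioned above can be pictured as a tree of nodes addressed by slash-separated paths, much like a file system. The following is only a minimal in-memory sketch of that idea; the ZNode and Namespace classes and the /app paths are hypothetical and are not part of the ZooKeeper API.

```python
# Hypothetical in-memory model of a hierarchical namespace.
# This is an illustration only, not the ZooKeeper client API.
class ZNode:
    def __init__(self, data=b""):
        self.data = data        # payload stored at this node
        self.children = {}      # child name -> ZNode


class Namespace:
    def __init__(self):
        self.root = ZNode()

    def _walk(self, path):
        """Follow a slash-separated path from the root."""
        node = self.root
        for part in [p for p in path.split("/") if p]:
            node = node.children[part]   # KeyError if a parent is missing
        return node

    def create(self, path, data=b""):
        parent_path, _, name = path.rpartition("/")
        if not name:
            raise ValueError("path must name a node below the root")
        parent = self._walk(parent_path)
        if name in parent.children:
            raise ValueError("node exists: " + path)
        parent.children[name] = ZNode(data)

    def ls(self, path):
        """List the names of the children of a node."""
        return sorted(self._walk(path).children)


ns = Namespace()
ns.create("/app")
ns.create("/app/config", b"v1")
ns.create("/app/workers")
print(ns.ls("/"))       # ['app']
print(ns.ls("/app"))    # ['config', 'workers']
```

Deciding up front which paths hold configuration, membership, and so on is the "organizing the namespace" step the paragraph refers to.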

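ZooKeeper uses an algorithm known as leader election. While the cluster is running there is exactly one Leader, and all other nodes are Followers. If the Leader fails, the same algorithm is used to elect a new one. The nodes must therefore be able to reach one another, which is why the mapping described above is required.
When a ZooKeeper cluster starts, it first elects a Leader; during leader election, a node that meets the election algorithm's criteria becomes the Leader. For the overall architecture, see http://zookeeper.apache.org/doc/trunk/zookeeperOver.html#sc_designGoals.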

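ZooKeeper can serve requests from a single machine, and it can also run as a cluster of several machines. It supports a third, pseudo-cluster mode as well, in which several ZooKeeper instances run on one physical machine. ZooKeeper achieves high availability through replication: as long as more than half of the machines in the ensemble are available, the service keeps running. In terms of fault tolerance, a 3-machine cluster can elect a leader and serve clients as long as 2 machines are available (a cluster of 2n+1 machines tolerates n failures).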
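
The majority rule above is simple arithmetic. A small sketch follows; the function names are made up for illustration.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that is strictly more than half."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers may fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 1 1 0
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

A 2-server ensemble tolerates no failures, and 4 servers tolerate no more failures than 3, which is why odd ensemble sizes are the usual choice.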

 

Standalone mode

1. Go to the conf subdirectory of the ZooKeeper directory and create zoo.cfg (you can also take the default zoo_sample.cfg and simply rename it):

  

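  Parameters:

  • tickTime: the basic time unit used by ZooKeeper, in milliseconds.
  • dataDir: the data directory. Any directory may be used.
  • dataLogDir: the log directory; again, any directory may be used. If this parameter is not set, the dataDir setting is used.
  • clientPort: the port on which the server listens for client connections.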
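
As an illustration of the four parameters above, here is a sketch that renders a minimal standalone zoo.cfg as key=value lines. The tickTime and clientPort values are the ones used elsewhere in this article; the two directory paths are assumptions chosen for the example, and render_zoo_cfg is a hypothetical helper.

```python
def render_zoo_cfg(settings):
    """Render a dict as key=value lines, the format used by zoo.cfg."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


standalone = {
    "tickTime": 2000,                             # basic time unit, in milliseconds
    "dataDir": "F:/ZOOKEEPER/zookeeper/data",     # assumed path; any directory works
    "dataLogDir": "F:/ZOOKEEPER/zookeeper/logs",  # assumed path; optional
    "clientPort": 2181,                           # port clients connect to
}

print(render_zoo_cfg(standalone), end="")
```

The printed text can be saved as conf/zoo.cfg.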

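2. Start ZooKeeper: run zkServer.cmd (in the bin directory).

3. Start the client: double-click zkCli.cmd (when the client is on the same machine as the ZooKeeper server), or run zkCli.cmd -server host:2181 to name the server explicitly (localhost:2181 when it is local; use the server's hostname or IP address when it is on another machine).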

  

 

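Pseudo-cluster mode

A pseudo-cluster is a cluster formed by several ZooKeeper processes running on a single machine. The example below starts 3 ZooKeeper processes.

Make two more copies of the ZooKeeper directory so that there are three in total:

  |--zookeeper0
  |--zookeeper1
  |--zookeeper2

 Change zookeeper0/conf/zoo.cfg to: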

  tickTime=2000
  initLimit=5
  syncLimit=2
  dataDir=F:/ZOOKEEPER/zookeeper0/data
  dataLogDir=F:/ZOOKEEPER/zookeeper0/logs
  clientPort=4180
  server.0=127.0.0.1:8880:7770
  server.1=127.0.0.1:8881:7771
  server.2=127.0.0.1:8882:7772

 

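Several new parameters appear here. Their meanings are as follows:

  • initLimit: a ZooKeeper cluster contains multiple servers, one of which is the leader while the rest are followers. initLimit sets the maximum time a follower may take to connect to and synchronize with the leader during initialization. It is set to 5 here, so the limit is 5 times tickTime, that is, 5*2000 = 10000 ms = 10 s.
  • syncLimit: the maximum time allowed for a request and its reply when the leader and a follower exchange messages. It is set to 2 here, so the limit is 2 times tickTime, that is, 4000 ms.
  • server.X=A:B:C: X is a number identifying the server. A is the IP address of the machine the server runs on. B is the port the server uses to exchange messages with the cluster's leader. C is the port used for leader election. Because this is a pseudo-cluster and all servers share one machine, each server must use different B and C values.
  • Using zookeeper0/conf/zoo.cfg as a template, configure zookeeper1/conf/zoo.cfg and zookeeper2/conf/zoo.cfg. Only dataDir, dataLogDir, and clientPort need to change.
  • In each dataDir set above, create a file named myid containing a single number that identifies the server. The number must match the X of that server's server.X entry in zoo.cfg: write 0 to F:/ZOOKEEPER/zookeeper0/data/myid, 1 to F:/ZOOKEEPER/zookeeper1/data/myid, and 2 to F:/ZOOKEEPER/zookeeper2/data/myid.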
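
The per-server files described above follow a regular pattern, so they can be sketched programmatically. The snippet below builds the text of the three zoo.cfg files and the three myid files without writing anything to disk. The client ports 4181 and 4182 for the second and third server are assumptions (the article only gives 4180 for zookeeper0), and the function names are hypothetical.

```python
BASE = "F:/ZOOKEEPER"    # base directory used in the example above
TICK_MS = 2000           # tickTime from the example above


def limit_ms(ticks, tick_ms=TICK_MS):
    """Convert a limit expressed in ticks (initLimit, syncLimit) to milliseconds."""
    return ticks * tick_ms


def pseudo_cluster_files(count=3, first_client_port=4180):
    """Return {path: content} for each server's zoo.cfg and myid."""
    # The server.X lines are identical in every zoo.cfg.
    servers = [f"server.{i}=127.0.0.1:{8880 + i}:{7770 + i}" for i in range(count)]
    files = {}
    for i in range(count):
        home = f"{BASE}/zookeeper{i}"
        lines = [
            f"tickTime={TICK_MS}",
            "initLimit=5",
            "syncLimit=2",
            f"dataDir={home}/data",
            f"dataLogDir={home}/logs",
            f"clientPort={first_client_port + i}",   # 4181 and 4182 are assumed values
            *servers,
        ]
        files[f"{home}/conf/zoo.cfg"] = "\n".join(lines) + "\n"
        files[f"{home}/data/myid"] = f"{i}\n"        # must equal X in server.X
    return files


print(limit_ms(5), limit_ms(2))    # 10000 4000
for path, content in pseudo_cluster_files().items():
    print(path)
    print(content)
```

Only dataDir, dataLogDir, clientPort, and myid differ between the three servers; everything else is shared.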

 

Start all 3 servers (run bin/zkServer.cmd in each of the three directories).
From any one of the server directories, start the client: bin/zkCli.cmd -server localhost:4180

 

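Cluster mode

Configuring a real cluster is essentially the same as configuring a pseudo-cluster.
Because the servers in cluster mode are deployed on different machines, every server's conf/zoo.cfg file can be exactly the same; only the myid file differs from machine to machine.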
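
To illustrate how one shared zoo.cfg can serve every machine, the sketch below parses the server.X=A:B:C lines and uses the content of a machine's myid file to pick out that machine's own entry. It is a simplified reading of the file format, not ZooKeeper's actual parser; it handles only the three-field A:B:C form, and the host names are the ones used in the Linux deployment later in this article.

```python
def parse_servers(cfg_text):
    """Return {id: (host, quorum_port, election_port)} from server.X=A:B:C lines."""
    servers = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line.startswith("server."):
            continue
        key, _, value = line.partition("=")
        host, quorum_port, election_port = value.split(":")
        server_id = int(key[len("server."):])
        servers[server_id] = (host, int(quorum_port), int(election_port))
    return servers


def own_entry(cfg_text, myid_text):
    """Look up the server.X entry whose X matches the content of myid."""
    return parse_servers(cfg_text)[int(myid_text.strip())]


SHARED_CFG = """\
tickTime=2000
initLimit=5
syncLimit=2
clientPort=2181
server.1=cloud001:2888:3888
server.2=cloud002:2888:3888
"""

print(own_entry(SHARED_CFG, "1\n"))   # ('cloud001', 2888, 3888)
print(own_entry(SHARED_CFG, "2\n"))   # ('cloud002', 2888, 3888)
```

A myid value with no matching server.X line raises KeyError here, which mirrors the requirement that the two must correspond.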

 

Deployment on Linux:

1. Edit the ZooKeeper configuration file conf/zoo.cfg:
  tickTime=2000
  dataDir=/home/xuhui/hadoop-2.2.0/tmp/zookeeper
  clientPort=2181
  initLimit=5
  syncLimit=2
  server.1=cloud001:2888:3888
  server.2=cloud002:2888:3888

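2. Copy the installation to the other nodes
  ZooKeeper has now been configured on one machine (cloud001). The configured installation can be copied to the corresponding directory on each of the other nodes in the cluster: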
  cd /home/xuhui/hadoop-2.2.0/
  scp -r zookeeper-3.4.6/ xuhui@cloud002:/home/xuhui/hadoop-2.2.0/
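3. Set myid
  In the directory given by dataDir, create a file named myid containing a single number that identifies the current host. It must match the X of that host's server.X entry in conf/zoo.cfg. For example: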
  xuhui@cloud001:~/hadoop-2.2.0/tmp$ mkdir zookeeper
  xuhui@cloud001:~/hadoop-2.2.0$ echo "1" > /home/xuhui/hadoop-2.2.0/tmp/zookeeper/myid
  xuhui@cloud002:~/hadoop-2.2.0/tmp$ mkdir zookeeper
  xuhui@cloud002:~/hadoop-2.2.0$ echo "2" > /home/xuhui/hadoop-2.2.0/tmp/zookeeper/myid
4. Start the ZooKeeper cluster
On every node of the ZooKeeper cluster, run the script that starts the ZooKeeper service:
  xuhui@cloud001:~/hadoop-2.2.0/zookeeper-3.4.6$ bin/zkServer.sh start
  xuhui@cloud002:~/hadoop-2.2.0/zookeeper-3.4.6$ bin/zkServer.sh start
Taking node cloud001 as an example, the log looks like this:
xuhui@cloud001:~/hadoop-2.2.0/zookeeper-3.4.6$ tail -500f zookeeper.out 
2014-05-21 11:26:42,603 [myid:] - INFO [main:QuorumPeerConfig@103] - Reading configuration from: /home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../conf/zoo.cfg
2014-05-21 11:26:42,611 [myid:] - WARN [main:QuorumPeerConfig@293] - No server failure will be tolerated. You need at least 3 servers.
2014-05-21 11:26:42,612 [myid:] - INFO [main:QuorumPeerConfig@340] - Defaulting to majority quorums
2014-05-21 11:26:42,626 [myid:1] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2014-05-21 11:26:42,627 [myid:1] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2014-05-21 11:26:42,627 [myid:1] - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2014-05-21 11:26:42,646 [myid:1] - INFO [main:QuorumPeerMain@127] - Starting quorum peer
2014-05-21 11:26:42,695 [myid:1] - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
2014-05-21 11:26:42,744 [myid:1] - INFO [main:QuorumPeer@959] - tickTime set to 2000
2014-05-21 11:26:42,744 [myid:1] - INFO [main:QuorumPeer@979] - minSessionTimeout set to -1
2014-05-21 11:26:42,744 [myid:1] - INFO [main:QuorumPeer@990] - maxSessionTimeout set to -1
2014-05-21 11:26:42,744 [myid:1] - INFO [main:QuorumPeer@1005] - initLimit set to 5
2014-05-21 11:26:42,768 [myid:1] - INFO [main:QuorumPeer@473] - currentEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2014-05-21 11:26:42,940 [myid:1] - INFO [main:QuorumPeer@488] - acceptedEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2014-05-21 11:26:43,035 [myid:1] - INFO [Thread-1:QuorumCnxManager$Listener@504] - My election bind port: cloud001/172.24.241.56:3888
2014-05-21 11:26:43,050 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@714] - LOOKING
2014-05-21 11:26:43,054 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@815] - New election. My id = 1, proposed zxid=0x0
2014-05-21 11:26:43,057 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)
2014-05-21 11:26:43,085 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 2 at election address cloud002/172.18.19.37:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
at java.lang.Thread.run(Thread.java:744)
2014-05-21 11:26:43,263 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address cloud002/172.18.19.37:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2014-05-21 11:26:43,265 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 400
2014-05-21 11:26:43,667 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address cloud002/172.18.19.37:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2014-05-21 11:26:43,669 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 800
2014-05-21 11:26:44,471 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address cloud002/172.18.19.37:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2014-05-21 11:26:44,473 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 1600
2014-05-21 11:26:46,075 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address cloud002/172.18.19.37:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2014-05-21 11:26:46,076 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 3200
2014-05-21 11:26:49,278 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address cloud002/172.18.19.37:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2014-05-21 11:26:49,280 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 6400
2014-05-21 11:26:55,682 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address cloud002/172.18.19.37:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2014-05-21 11:26:55,684 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 12800
2014-05-21 11:27:08,539 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address cloud002/172.18.19.37:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2014-05-21 11:27:08,541 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 25600
2014-05-21 11:27:34,143 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address cloud002/172.18.19.37:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2014-05-21 11:27:34,145 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 51200
2014-05-21 11:28:25,347 [myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address cloud002/172.18.19.37:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2014-05-21 11:28:25,349 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 60000
2014-05-21 11:28:30,573 [myid:1] - INFO [cloud001/172.24.241.56:3888:QuorumCnxManager$Listener@511] - Received connection request /172.18.19.37:39108
2014-05-21 11:28:30,593 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)
2014-05-21 11:28:30,594 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)
2014-05-21 11:28:30,796 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@784] - FOLLOWING
2014-05-21 11:28:30,819 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@86] - TCP NoDelay set to: true
2014-05-21 11:28:30,830 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2014-05-21 11:28:30,830 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:host.name=cloud001
2014-05-21 11:28:30,830 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.version=1.7.0_45
2014-05-21 11:28:30,830 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.vendor=Oracle Corporation
2014-05-21 11:28:30,831 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.home=/usr/lib/jvm/jdk1.7.0_45/jre
2014-05-21 11:28:30,831 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.class.path=/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../build/classes:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../build/lib/*.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../lib/slf4j-log4j12-1.6.1.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../lib/slf4j-api-1.6.1.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../lib/netty-3.7.0.Final.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../lib/log4j-1.2.16.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../lib/jline-0.9.94.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../zookeeper-3.4.6.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../src/java/lib/*.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../conf:.:/usr/lib/jvm/jdk1.7.0_45/lib:/home/xuhui/hadoop-2.2.0/mahout-distribution-0.9/lib:/usr/lib/jvm/jdk1.7.0_45/jre/lib:.:/usr/lib/jvm/jdk1.7.0_45/lib:/lib:/usr/lib/jvm/jdk1.7.0_45/jre/lib:
2014-05-21 11:28:30,831 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.library.path=/usr/java/packages/lib/i386:/lib:/usr/lib
2014-05-21 11:28:30,831 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.io.tmpdir=/tmp
2014-05-21 11:28:30,831 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.compiler=<NA>
2014-05-21 11:28:30,836 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.name=Linux
2014-05-21 11:28:30,836 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.arch=i386
2014-05-21 11:28:30,836 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.version=3.8.0-29-generic
2014-05-21 11:28:30,837 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.name=xuhui
2014-05-21 11:28:30,837 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.home=/home/xuhui
2014-05-21 11:28:30,837 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.dir=/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6
2014-05-21 11:28:30,839 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@162] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /home/xuhui/hadoop-2.2.0/tmp/zookeeper/version-2 snapdir /home/xuhui/hadoop-2.2.0/tmp/zookeeper/version-2
2014-05-21 11:28:30,840 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 107786
2014-05-21 11:28:31,367 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@323] - Getting a diff from the leader 0x0
2014-05-21 11:28:31,371 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@240] - Snapshotting: 0x0 to /home/xuhui/hadoop-2.2.0/tmp/zookeeper/version-2/snapshot.0

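The nodes were started in order, cloud001 first and then cloud002. When a ZooKeeper cluster starts, every node tries to connect to the other nodes in the cluster, and a node that starts first cannot reach nodes that have not started yet. The exceptions in the first part of the log above can therefore be ignored. The later part of the log shows that once a Leader has been elected, the cluster settles into a stable state.
Other nodes may show similar messages; this is normal.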
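
The "Notification time out" values in the log above grow from 400 to 60000, doubling each time until they hit a ceiling. The sketch below reproduces that sequence under the assumption of simple doubling with a cap. The starting value and the cap are read off the log, not taken from the ZooKeeper source, and the function name is hypothetical.

```python
def notification_timeouts(first_ms=400, cap_ms=60000, count=9):
    """Return `count` timeouts, doubling each time but never exceeding cap_ms."""
    timeouts = []
    timeout = first_ms
    for _ in range(count):
        timeouts.append(timeout)
        timeout = min(timeout * 2, cap_ms)
    return timeouts


print(notification_timeouts())
# [400, 800, 1600, 3200, 6400, 12800, 25600, 51200, 60000]
```

This backoff is why the warnings become less frequent the longer the other node stays down.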
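5. Verify the installation
You can use the ZooKeeper script to check the startup status, including each node's role in the cluster (Leader or Follower). Run the command below on each node of the cluster; the output shown here is from cloud002: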
xuhui@cloud002:~/hadoop-2.2.0/zookeeper-3.4.6$ bin/zkServer.sh status
JMX enabled by default
Using config: /home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader
The status output above shows that cloud002 is the Leader of the cluster; the remaining node, cloud001, is a Follower.
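
When checking several nodes, the role can be picked out of the status output automatically. The sketch below extracts the value after "Mode:" from text shaped like the output shown above and checks that exactly one node reports itself as leader. The helper names are hypothetical, and the follower entry for cloud001 is inferred from the FOLLOWING line in its startup log rather than from a captured status output.

```python
import re


def parse_mode(status_output):
    """Return the value after 'Mode:' in zkServer.sh status output, or None."""
    match = re.search(r"^Mode:\s*(\S+)", status_output, flags=re.MULTILINE)
    return match.group(1) if match else None


def has_single_leader(modes):
    """True if exactly one node in {host: mode} reports 'leader'."""
    return list(modes.values()).count("leader") == 1


SAMPLE = """\
JMX enabled by default
Using config: /home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader
"""

modes = {"cloud002": parse_mode(SAMPLE), "cloud001": "follower"}
print(modes["cloud002"])           # leader
print(has_single_leader(modes))    # True
```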
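You can also connect to the ZooKeeper cluster with the client script. To a client, ZooKeeper is a single whole (an ensemble), and a connection to the cluster behaves as if the client had the whole service to itself. You can therefore open a connection to the service from any node, for example: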

xuhui@cloud002:~/hadoop-2.2.0/zookeeper-3.4.6$ bin/zkCli.sh -server cloud002:2181
Connecting to cloud002:2181
2014-05-21 11:38:55,520 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2014-05-21 11:38:55,523 [myid:] - INFO [main:Environment@100] - Client environment:host.name=cloud002
2014-05-21 11:38:55,524 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.7.0_45
2014-05-21 11:38:55,526 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2014-05-21 11:38:55,526 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/jdk1.7.0_45/jre
2014-05-21 11:38:55,526 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../build/classes:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../build/lib/*.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../lib/slf4j-log4j12-1.6.1.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../lib/slf4j-api-1.6.1.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../lib/netty-3.7.0.Final.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../lib/log4j-1.2.16.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../lib/jline-0.9.94.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../zookeeper-3.4.6.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../src/java/lib/*.jar:/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6/bin/../conf:.:/usr/lib/jvm/jdk1.7.0_45/lib:/usr/lib/jvm/jdk1.7.0_45/jre/lib:
2014-05-21 11:38:55,526 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/i386:/lib:/usr/lib
2014-05-21 11:38:55,526 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2014-05-21 11:38:55,526 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2014-05-21 11:38:55,526 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2014-05-21 11:38:55,526 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=i386
2014-05-21 11:38:55,527 [myid:] - INFO [main:Environment@100] - Client environment:os.version=3.8.0-29-generic
2014-05-21 11:38:55,527 [myid:] - INFO [main:Environment@100] - Client environment:user.name=xuhui
2014-05-21 11:38:55,527 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/home/xuhui
2014-05-21 11:38:55,527 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/home/xuhui/hadoop-2.2.0/zookeeper-3.4.6
2014-05-21 11:38:55,528 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=cloud002:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@a61d64
Welcome to ZooKeeper!
2014-05-21 11:38:55,552 [myid:] - INFO [main-SendThread(cloud002:2181):ClientCnxn$SendThread@975] - Opening socket connection to server cloud002/172.18.19.37:2181. Will not attempt to authenticate using SASL (unknown error)
2014-05-21 11:38:55,575 [myid:] - INFO [main-SendThread(cloud002:2181):ClientCnxn$SendThread@852] - Socket connection established to cloud002/172.18.19.37:2181, initiating session
JLine support is enabled
[zk: cloud002:2181(CONNECTING) 0] 2014-05-21 11:38:55,744 [myid:] - INFO [main-SendThread(cloud002:2181):ClientCnxn$SendThread@1235] - Session establishment complete on server cloud002/172.18.19.37:2181, sessionid = 0x2461cd2455b0000, negotiated timeout = 30000


WATCHER::


WatchedEvent state:SyncConnected type:None path:null


[zk: cloud002:2181(CONNECTED) 1] ls /
[zookeeper]
[zk: cloud002:2181(CONNECTED) 2]

The root path / currently contains a single child node, zookeeper.

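Summary notes
Hostname-to-IP mapping problems
When starting the ZooKeeper cluster, the log on node slave-01 (the host names in this section are those of the three-node cluster in the referenced article) may show errors like the following: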

java.net.SocketTimeoutException
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:109)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)
2012-01-08 06:37:46,026 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@697] - Notification time out: 6400
2012-01-08 06:37:57,431 - WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 2 at election address slave-02/202.106.199.35:3888
java.net.SocketTimeoutException
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:109)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)
2012-01-08 06:38:02,442 - WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 3 at election address slave-03/202.106.199.35:3888
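Clearly, when slave-01 tried to connect to the other nodes in the cluster (slave-02 and slave-03) at startup, the IP addresses that their hostnames resolved to did not match the addresses actually configured. The nodes could not establish links with one another, and the whole ZooKeeper cluster failed to start.
In the error log above, slave-02/202.106.199.35:3888 should have been slave-02/192.168.0.178:3888; hostname resolution returned the wrong mapping. To fix this, edit the /etc/hosts file on every node so that it maps the hostname of every node in the ZooKeeper cluster to its IP address.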
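
A mapping problem like this can be spotted before starting the cluster by comparing the hosts file with the addresses you expect. The sketch below works on hosts-file text passed in as a string and reports every hostname whose address differs from the expected one. The helper names are hypothetical; the cloud001 and cloud002 addresses are taken from the startup log earlier in this article, and the wrong address in the sample is the one from the error log above.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()    # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)       # keep the first entry for a name
    return mapping


def mismatches(hosts_text, expected):
    """Return {host: (actual_ip, expected_ip)} for every host that is wrong or missing."""
    actual = parse_hosts(hosts_text)
    return {
        host: (actual.get(host), ip)
        for host, ip in expected.items()
        if actual.get(host) != ip
    }


EXPECTED = {"cloud001": "172.24.241.56", "cloud002": "172.18.19.37"}

HOSTS = """\
127.0.0.1 localhost
172.24.241.56 cloud001
202.106.199.35 cloud002   # wrong address
"""

print(mismatches(HOSTS, EXPECTED))
# {'cloud002': ('202.106.199.35', '172.18.19.37')}
```

On a real node the text would come from reading /etc/hosts; an empty result means every expected mapping is present.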


Reference: http://blog.csdn.net/shirdrn/article/details/7183503#

