Hadoop's HA mechanism + ZooKeeper


 Hadoop HA configuration and a WordCount test

Part 1: Basic environment setup

1. Check whether the CentOS install is 32- or 64-bit:

$>getconf LONG_BIT

2. Switching between desktop mode and text mode:

1) Setting the mode from a terminal only changes it temporarily:

     $>init 3    switch to text mode

     $>init 5    switch to desktop mode

2) To change the mode permanently, edit the config file under /etc:

      $>sudo nano inittab   and change the last line of the file

     for text mode, set it to:    id:3:initdefault:

     for desktop mode, set it to: id:5:initdefault:

Note: when switching to desktop mode it is best to increase the VM's memory, to about 4 GB.

3. Configure a static IP, change the hostname, and set up the hosts file

    1) Do the configuration from desktop mode.

2) View the configured static IP:

     cat /etc/sysconfig/networking/devices/ifcfg-eth2

3) Change the hostname:

          nano /etc/sysconfig/network   (this machine's hostname)

4. Configure the hostname-to-IP mappings:

      nano /etc/hosts   (the hostname mapping table for the cluster)

5. Verify the configuration:

ping a hostname to check that the network configuration took effect

6. Disable the firewall

          1) Check the firewall status

                   service iptables status

          2) Stop the firewall

             service iptables stop

          3) Check whether the firewall starts on boot

                   chkconfig iptables --list

          4) Disable the firewall on boot

                   chkconfig iptables off

7. Configure passwordless SSH login

           1) Generate an SSH key pair

              go to the home directory

                 cd ~/.ssh

ssh-keygen -t rsa   (press Enter four times)

           2) This command generates two files: id_rsa (private key) and id_rsa.pub (public key).

              Copy the public key to every machine that should allow passwordless login:

                 ssh-copy-id localhost

        

 

Part 2: Configuring HA

1. At any given time only one of the two NameNodes may serve client requests, and it must be the one in ACTIVE state.

2. The standby node (the second NameNode) must be able to take over as ACTIVE seamlessly, which means the two NameNodes must stay in sync at all times.

3. Edits management is handled by a distributed service, qjournal, which relies on ZooKeeper.

4. To avoid split-brain during a state transition, fencing is used:

   fencing: send a kill command over ssh

   or run a custom shell script

  

5. Deployment layouts:

3 machines:

Machine 1: nn1, zkfc1, zk1, jn1, RM

Machine 2: nn2, zkfc2, zk2, jn2, RM

Machine 3: dn1, zk3, jn3, NM

 

7 machines:

Machine 1:   namenode   zkfc

Machine 2:   namenode   zkfc

Machine 3:   resourcemanager

Machine 4:   resourcemanager

Machine 5:   zookeeper  journalnode  datanode  nodemanager

Machine 6:   zookeeper  journalnode  datanode  nodemanager

Machine 7:   zookeeper  journalnode  datanode  nodemanager

Hadoop 2.0 has released stable versions with many new features, such as HDFS HA and YARN; the latest hadoop-2.4.1 adds YARN HA as well.

I will not go into the preliminaries in detail; the HA configuration below uses 3 nodes.

1. Change the Linux hostnames

2. Set the IPs

3. Map hostnames to IPs

4. Disable the firewall

5. Set up passwordless SSH

6. Install the JDK and configure environment variables

Cluster plan:

         hostname     IP               installed software           running processes

         s104         192.168.43.104   jdk, hadoop, zookeeper1      NN1, zkfc1, RM1, QPM, JN1

         s106         192.168.43.106   jdk, hadoop, zookeeper2      NN2, zkfc2, RM2, QPM, JN2

         s108         192.168.43.108   jdk, hadoop, zookeeper3      DN, NM, QPM, JN3

Notes:

1. In hadoop 2.0 there are normally two NameNodes, one in active state and one in standby. The active NameNode serves client requests; the standby serves none and only mirrors the active NameNode's state so that it can take over quickly if the active fails.

         hadoop 2.0 officially provides two HDFS HA solutions, NFS and QJM. Here we use the simpler QJM. In this scheme the active and standby NameNodes synchronize metadata through a group of JournalNodes; a write is considered successful once it reaches a majority of the JournalNodes. An odd number of JournalNodes is usually configured.

         A ZooKeeper cluster is also configured here for ZKFC (DFSZKFailoverController) failover: when the active NameNode goes down, the standby NameNode is automatically switched to active.

2. hadoop-2.2.0 still had the problem of a single ResourceManager, a single point of failure. hadoop-2.4.1 solves this with two ResourceManagers, one active and one standby, with their states coordinated through ZooKeeper.

Installation steps:

1. Install and configure the ZooKeeper cluster (on s104)

1) Upload the zk tarball to the ~/ directory

2) Unpack it:  tar -zxvf zookeeper-3.4.5.tar.gz -C app

3) Enter zookeeper-3.4.5, then the conf directory, and do the configuration (on one node first)

3.1 Create the zoo.cfg config file

                  $>cd zookeeper-3.4.5/conf

                  $>mv zoo_sample.cfg zoo.cfg

3.2 Edit the config file (zoo.cfg)

                        dataDir=/home/hadoop/app/zookeeper-3.4.5/data

                         server.1=s104:2888:3888

                         server.2=s106:2888:3888

                         server.3=s108:2888:3888

3.3 In the dataDir (/home/hadoop/app/zookeeper-3.4.5/data) create a myid file whose content is the N from server.N (so the file on server.2 contains 2):

                       echo "1" > myid

3.4 Copy the configured zk to the other nodes

                       scp -r /home/hadoop/app/zookeeper-3.4.5/ s106:/home/hadoop/app

                       scp -r /home/hadoop/app/zookeeper-3.4.5/ s108:/home/hadoop/app

3.5 Note: be sure to change the content of myid on the other nodes

                      on s106 change myid to 2 (echo "2" > myid)

                      on s108 change myid to 3 (echo "3" > myid)
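The myid step above can be sketched as a tiny script. The data directory below is a local stand-in for /home/hadoop/app/zookeeper-3.4.5/data so the snippet runs anywhere; the host-to-id mapping is the one used in this guide.

```shell
# Write this node's ZooKeeper id; it must match the node's server.N
# line in zoo.cfg (s104 -> 1, s106 -> 2, s108 -> 3 in this layout).
datadir=./zk-data-demo            # stand-in for the real dataDir
mkdir -p "$datadir"
myid=1                            # set per node: 1 on s104, 2 on s106, 3 on s108
echo "$myid" > "$datadir/myid"
cat "$datadir/myid"
```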

4) Start the cluster (after the remaining configuration is done, shut it down before the cluster test and restart everything in order)

                Start zk on each node; go to zookeeper's bin directory:

       ./zkServer.sh start

           check the default port:   netstat -nltp | grep 2181

           check the running state:  ./zkServer.sh status

5) Once everything is up, open the command-line client, connect to the cluster, and test reading and writing data:

           cd zookeeper-3.4.5/bin          ./zkCli.sh

6) ZooKeeper stores client data in a structure similar to a file tree; each entry is called a znode.

           Note: the ZooKeeper cluster can only come up while more than half of the configured nodes are running.
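The majority rule in the note above is plain integer arithmetic: an ensemble of n servers needs strictly more than n/2 of them alive. A quick check for this 3-node cluster:

```shell
# Smallest number of live servers that keeps an n-node ZooKeeper ensemble up.
n=3
majority=$(( n / 2 + 1 ))
echo "$majority"   # for n=3: losing one node is fine, losing two is not
```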

2. Install and configure the hadoop cluster (on s104)

2.1 Unpack

                            tar -zxvf hadoop-2.4.1.tar.gz -C app

2.2 Configure HDFS (in hadoop 2.0 all config files are under the $HADOOP_HOME/etc/hadoop directory)

                     #add hadoop to the environment variables

                            vim /etc/profile

                            export JAVA_HOME=/home/hadoop/app/jdk1.7.0_65

                            export HADOOP_HOME=/home/hadoop/app/hadoop-2.4.1

                            export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

                     #the hadoop 2.0 config files are all under $HADOOP_HOME/etc/hadoop

                            cd /home/hadoop/app/hadoop-2.4.1/etc/hadoop
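A small sanity check after `source /etc/profile` (the paths are the ones used above; adjust to your install). It only inspects $PATH, so it runs even on a machine without hadoop installed:

```shell
# Verify that the hadoop bin directory actually landed on PATH.
export JAVA_HOME=/home/hadoop/app/jdk1.7.0_65
export HADOOP_HOME=/home/hadoop/app/hadoop-2.4.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "hadoop on PATH" ;;
  *)                      echo "hadoop missing from PATH" ;;
esac
```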

2.2.1 Edit hadoop-env.sh

                                     export JAVA_HOME=/home/hadoop/app/jdk1.7.0_65

2.2.2 Edit core-site.xml

<configuration>
        <!-- set the nameservice of hdfs to ns1 -->
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://ns1/</value>
        </property>
        <!-- the hadoop temp directory -->
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/home/hadoop/app/hadoop-2.4.1/tmp</value>
        </property>
        <!-- the zookeeper addresses -->
        <property>
                <name>ha.zookeeper.quorum</name>
                <value>s104:2181,s106:2181,s108:2181</value>
        </property>
</configuration>

2.2.3 Edit hdfs-site.xml

<configuration>
        <!-- set the nameservice of hdfs to ns1; must match core-site.xml -->
        <property>
                <name>dfs.nameservices</name>
                <value>ns1</value>
        </property>
        <!-- ns1 has two NameNodes, nn1 and nn2 -->
        <property>
                <name>dfs.ha.namenodes.ns1</name>
                <value>nn1,nn2</value>
        </property>
        <!-- RPC address of nn1 -->
        <property>
                <name>dfs.namenode.rpc-address.ns1.nn1</name>
                <value>s104:9000</value>
        </property>
        <!-- HTTP address of nn1 -->
        <property>
                <name>dfs.namenode.http-address.ns1.nn1</name>
                <value>s104:50070</value>
        </property>
        <!-- RPC address of nn2 -->
        <property>
                <name>dfs.namenode.rpc-address.ns1.nn2</name>
                <value>s106:9000</value>
        </property>
        <!-- HTTP address of nn2 -->
        <property>
                <name>dfs.namenode.http-address.ns1.nn2</name>
                <value>s106:50070</value>
        </property>
        <!-- where the NameNode metadata (edits) is stored on the JournalNodes -->
        <property>
                <name>dfs.namenode.shared.edits.dir</name>
                <value>qjournal://s104:8485;s106:8485;s108:8485/ns1</value>
        </property>
        <!-- where the JournalNodes keep their data on local disk -->
        <property>
                <name>dfs.journalnode.edits.dir</name>
                <value>/home/hadoop/app/hadoop-2.4.1/journaldata</value>
        </property>
        <!-- enable automatic NameNode failover -->
        <property>
                <name>dfs.ha.automatic-failover.enabled</name>
                <value>true</value>
        </property>
        <!-- the failover proxy implementation -->
        <property>
                <name>dfs.client.failover.proxy.provider.ns1</name>
                <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <!-- fencing methods; multiple methods are separated by newlines, one per line -->
        <property>
                <name>dfs.ha.fencing.methods</name>
                <value>
                        sshfence
                        shell(/bin/true)
                </value>
        </property>
        <!-- the sshfence method requires passwordless ssh -->
        <property>
                <name>dfs.ha.fencing.ssh.private-key-files</name>
                <value>/home/hadoop/.ssh/id_rsa</value>
        </property>
        <!-- the sshfence connect timeout -->
        <property>
                <name>dfs.ha.fencing.ssh.connect-timeout</name>
                <value>30000</value>
        </property>
</configuration>

                           

2.2.4 Edit mapred-site.xml

<configuration>
        <!-- run MapReduce on yarn -->
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
</configuration>

                           

2.2.5 Edit yarn-site.xml

<configuration>
        <!-- enable RM high availability -->
        <property>
                <name>yarn.resourcemanager.ha.enabled</name>
                <value>true</value>
        </property>
        <!-- the RM cluster id -->
        <property>
                <name>yarn.resourcemanager.cluster-id</name>
                <value>yrc</value>
        </property>
        <!-- the RM ids -->
        <property>
                <name>yarn.resourcemanager.ha.rm-ids</name>
                <value>rm1,rm2</value>
        </property>
        <!-- the host of each RM -->
        <property>
                <name>yarn.resourcemanager.hostname.rm1</name>
                <value>s104</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname.rm2</name>
                <value>s106</value>
        </property>
        <!-- the zk cluster addresses -->
        <property>
                <name>yarn.resourcemanager.zk-address</name>
                <value>s104:2181,s106:2181,s108:2181</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>

                                    

2.2.6 Edit slaves (slaves lists the worker nodes; since both HDFS and yarn are started from s104, the slaves file on s104 names both the datanode hosts and the nodemanager hosts)

                                     s108

2.2.7 Configure passwordless login

                   #first configure passwordless login from s104 to s106 and s108

                   #generate a key pair on s104

                      ssh-keygen -t rsa

                   #copy the public key to the other nodes, including s104 itself

                      ssh-copy-id s104

                      ssh-copy-id s106

                      ssh-copy-id s108

                   #configure passwordless login from s106 to s104 and s108 (the two namenodes need passwordless ssh to each other; don't forget s106 -> s104)

                   #generate a key pair on s106

                      ssh-keygen -t rsa

                   #copy the public key to the other nodes

                      ssh-copy-id s104

                      ssh-copy-id s106

                      ssh-copy-id s108
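The key distribution above can be scripted as a loop. The hosts are this cluster's; the DRY_RUN guard is an illustration-only switch that just prints the commands, so the sketch runs even without the actual machines reachable:

```shell
# Copy the local public key to every node, including this one.
DRY_RUN=1                         # set to 0 on the real cluster
for host in s104 s106 s108; do
  if [ "$DRY_RUN" = 1 ]; then
    echo "would run: ssh-copy-id $host"
  else
    ssh-copy-id "$host"
  fi
done
```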

2.4 Copy the configured hadoop to the other nodes

                            scp -r /home/hadoop/app/hadoop-2.4.1/ hadoop@s106:/home/hadoop/app/

                            scp -r /home/hadoop/app/hadoop-2.4.1/ hadoop@s108:/home/hadoop/app/

###Note: follow the steps below strictly in order

2.5 Start the zookeeper cluster (start zk on each of s104, s106, s108)

                            cd /home/hadoop/app/zookeeper-3.4.5/bin/

                            ./zkServer.sh start

                            #check the status: one leader, two followers

                            ./zkServer.sh status

2.6 Start the journalnodes (run on each of s104, s106, s108)

                            cd /home/hadoop/app/hadoop-2.4.1

                            sbin/hadoop-daemon.sh start journalnode

                            #verify with jps: s104, s106, s108 each now have a JournalNode process

2.7 Format HDFS

                            #run on s104:

                            hdfs namenode -format

                          #formatting generates files under the hadoop.tmp.dir configured in core-site.xml, here /home/hadoop/app/hadoop-2.4.1/tmp; copy that directory to /home/hadoop/app/hadoop-2.4.1/ on s106:

                           scp -r tmp/ s106:/home/hadoop/app/hadoop-2.4.1/

                            ##alternatively, and recommended: run hdfs namenode -bootstrapStandby on s106

2.8 Format ZKFC (run on s104 only)

                            hdfs zkfc -formatZK

2.9 Start HDFS (run on s104)

                            sbin/start-dfs.sh

2.10 This is a 3-node HA setup, so the namenodes and resourcemanagers share nodes; the two resourcemanagers are on s104 and s106.

          on s104 run sbin/start-yarn.sh

              At this point hadoop-2.4.1 is fully configured; open a browser and visit:

                       http://192.168.43.104:50070

                       NameNode 's104:9000' (active)

                       http://192.168.43.106:50070

                       NameNode 's106:9000' (standby)

        

                  

        

         Verify YARN:

                   run the WordCount program from the demos that ship with hadoop:

                   hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar wordcount /user/input /user/output

          OK, done!
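What the WordCount job computes can be sanity-checked locally with plain shell, no cluster required: counts per word over a tiny input.

```shell
# Same aggregation as wordcount, on two toy lines of input.
printf 'hello world\nhello hadoop\n' \
  | tr -s ' ' '\n' | sort | uniq -c | awk '{print $2, $1}'
# prints: hadoop 1 / hello 2 / world 1
```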

Some commands for checking the cluster's state:

bin/hdfs dfsadmin -report        show status info for each hdfs node

bin/hdfs haadmin -getServiceState nn1           get the HA state of one namenode

sbin/hadoop-daemon.sh start namenode  start a single namenode process

sbin/hadoop-daemon.sh start zkfc   start a single zkfc process
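The state check above can be looped over both NameNode ids (nn1 and nn2, as configured in hdfs-site.xml). DRY_RUN is an illustration-only switch so the sketch runs without a live cluster:

```shell
# Print (or, on a live cluster, query) the HA state of each NameNode.
DRY_RUN=1
for nn in nn1 nn2; do
  if [ "$DRY_RUN" = 1 ]; then
    echo "would run: hdfs haadmin -getServiceState $nn"
  else
    bin/hdfs haadmin -getServiceState "$nn"
  fi
done
```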

                           

 

The HA configuration succeeded; the test procedure was as follows.

HA architecture:

(screenshots: the process lists on the S104, S106, and S108 nodes)

Initially the active node is s104.

When we kill the NameNode on s104, the active node automatically becomes s106 (see screenshot).

Manually restart the NameNode that was killed:

                   sbin/hadoop-daemon.sh start namenode

                   then visit http://192.168.43.104:50070 in a browser; as the screenshot shows, this node is now in standby state.

         The datanode is s108 throughout.

Verify yarn by running the WordCount program from the demos that ship with hadoop:

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar wordcount  /profile  /out

         OK, done!

(The final wordcount results and the output shown in the web UI were given as screenshots.)

