Hadoop Cluster (Part 4): Upgrading Hadoop


The Hadoop cluster installed in the earlier parts runs version 2.6; this article upgrades it to version 2.7.

Note that HBase runs on this cluster, so HBase must be stopped before the upgrade and started again afterwards.

For the earlier installation steps, see:

Hadoop Cluster (Part 1): ZooKeeper Setup

Hadoop Cluster (Part 2): HDFS Setup

Hadoop Cluster (Part 3): HBase Setup

The upgrade steps are as follows.

Cluster IP list

NameNode: 192.168.143.46, 192.168.143.103

JournalNode: 192.168.143.101, 192.168.143.102, 192.168.143.103

DataNode & HBase RegionServer: 192.168.143.196, 192.168.143.231, 192.168.143.182, 192.168.143.235, 192.168.143.41, 192.168.143.127

HBase Master: 192.168.143.103, 192.168.143.101

ZooKeeper: 192.168.143.101, 192.168.143.102, 192.168.143.103

1. First, confirm the path where Hadoop runs, then distribute the new release to that path on every node and extract it.

# ll /usr/local/hadoop/

total 493244

drwxrwxr-x 9 root root      4096 Mar 21  2017 hadoop-release -> hadoop-2.6.0-EDH-0u1-SNAPSHOT-HA-SECURITY

drwxr-xr-x 9 root root      4096 Oct 11 11:06 hadoop-2.7.1

-rw-r--r-- 1 root root 194690531 Oct  9 10:55 hadoop-2.7.1.tar.gz

drwxrwxr-x 7 root root      4096 May 21  2016 hbase-1.1.3

-rw-r--r-- 1 root root 128975247 Apr 10  2017 hbase-1.1.3.tar.gz

lrwxrwxrwx 1 root root        29 Apr 10  2017 hbase-release -> /usr/local/hadoop/hbase-1.1.3

Since this is an upgrade, the configuration files are left completely unchanged: copy the etc/hadoop directory from the original hadoop-2.6.0 release into hadoop-2.7.1, replacing the defaults there.
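The distribution and config copy can be scripted from the jump host. A minimal sketch, assuming the tarball sits in the current directory and root ssh access to every node; the host list is taken from the cluster list above:

for h in 192.168.143.46 192.168.143.103 192.168.143.101 192.168.143.102 192.168.143.196 192.168.143.231 192.168.143.182 192.168.143.235 192.168.143.41 192.168.143.127; do
  scp hadoop-2.7.1.tar.gz "$h:/usr/local/hadoop/"
  # hadoop-release still points at the 2.6.0 release here, so this copies the old config into 2.7.1
  ssh -t -q "$h" "cd /usr/local/hadoop && tar -xzf hadoop-2.7.1.tar.gz && cp -rf hadoop-release/etc/hadoop/* hadoop-2.7.1/etc/hadoop/"
done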

At this point, the pre-upgrade preparation is complete.

 

The upgrade procedure itself starts below. All commands are executed from a single jump host via shell scripts, which avoids repeatedly logging in to each node over ssh.
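The per-host commands below all follow a fixed pattern (ssh to the node, then sudo to the service user). A minimal illustrative helper for such a script, not part of the original procedure:

run_as() {
  # run_as <host> <user> '<command>': run <command> on <host> as the given service user
  local host=$1 user=$2 cmd=$3
  ssh -t -q "$host" sudo su -l "$user" -c "\"$cmd\""
}
# Usage example: check Java processes on a DataNode as the hdfs user
# run_as 192.168.143.196 hdfs "jps"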

## Stop HBase (run as the hbase user)

2. Stop the HBase Masters (run as the hbase user)

Check the status page to confirm which master is active, and stop the standby master first:

http://192.168.143.101:16010/master-status

master:

ssh -t -q 192.168.143.103  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ master"

ssh -t -q 192.168.143.103  sudo su -l hbase -c "jps"

ssh -t -q 192.168.143.101  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ master"

ssh -t -q 192.168.143.101  sudo su -l hbase -c "jps"

3. Stop the HBase RegionServers (run as the hbase user)

ssh -t -q 192.168.143.196  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"

ssh -t -q 192.168.143.231  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"

ssh -t -q 192.168.143.182  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"

ssh -t -q 192.168.143.235  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"

ssh -t -q 192.168.143.41   sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"

ssh -t -q 192.168.143.127  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ stop\ regionserver"

Check that the processes have stopped:

ssh -t -q 192.168.143.196  sudo su -l hbase -c "jps" 

ssh -t -q 192.168.143.231  sudo su -l hbase -c "jps"

ssh -t -q 192.168.143.182  sudo su -l hbase -c "jps"

ssh -t -q 192.168.143.235  sudo su -l hbase -c "jps"

ssh -t -q 192.168.143.41   sudo su -l hbase -c "jps"

ssh -t -q 192.168.143.127  sudo su -l hbase -c "jps"

## Stop the HDFS services

4. First, confirm which NameNode is active via the web UI; this NameNode must be started first later on.

https://192.168.143.46:50470/dfshealth.html#tab-overview
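The active/standby state can also be checked from the command line with hdfs haadmin. A sketch assuming the NameNode IDs configured in hdfs-site.xml are nn1 and nn2 (adjust to your dfs.ha.namenodes setting):

ssh -t -q 192.168.143.46   sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/bin/hdfs\ haadmin\ -getServiceState\ nn1"

ssh -t -q 192.168.143.46   sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/bin/hdfs\ haadmin\ -getServiceState\ nn2"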

5. Stop the NameNodes (run as the hdfs user)

NN: stop the standby NameNode first

ssh -t -q 192.168.143.103  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ namenode"

ssh -t -q 192.168.143.46   sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ namenode"

Check the status:

ssh -t -q 192.168.143.103  sudo su -l hdfs -c "jps"

ssh -t -q 192.168.143.46   sudo su -l hdfs -c "jps"

6. Stop the DataNodes (run as the hdfs user)

ssh -t -q 192.168.143.196  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"

ssh -t -q 192.168.143.231  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"

ssh -t -q 192.168.143.182  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"

ssh -t -q 192.168.143.235  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"

ssh -t -q 192.168.143.41   sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"

ssh -t -q 192.168.143.127  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ datanode"

7. Stop the ZKFCs (run as the hdfs user)

ssh -t -q 192.168.143.46   sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ zkfc"

ssh -t -q 192.168.143.103  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ zkfc"

8. Stop the JournalNodes (run as the hdfs user)

JN:

ssh -t -q 192.168.143.101  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ journalnode"

ssh -t -q 192.168.143.102  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ journalnode"

ssh -t -q 192.168.143.103  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ stop\ journalnode"

### Back up the NameNode data. Since this is a production environment, the existing metadata must be backed up so the upgrade can be rolled back if it fails.

9. Back up namenode1

ssh -t -q 192.168.143.46 "cp -r /data1/dfs/name    /data1/dfs/name.bak.20171011-2;ls -al /data1/dfs/;du -sm /data1/dfs/*" 

ssh -t -q 192.168.143.46 "cp -r /data2/dfs/name    /data2/dfs/name.bak.20171011-2;ls -al /data2/dfs/;du -sm /data2/dfs/*"

10. Back up namenode2

ssh -t -q 192.168.143.103 "cp -r /data1/dfs/name    /data1/dfs/name.bak.20171011-2;ls -al /data1/dfs/;du -sm /data1/dfs/*"

11. Back up the JournalNode data

ssh -t -q 192.168.143.101 "cp -r /data1/journalnode   /data1/journalnode.bak.20171011;ls -al /data1/dfs/;du -sm /data1/*"

ssh -t -q 192.168.143.102 "cp -r /data1/journalnode   /data1/journalnode.bak.20171011;ls -al /data1/dfs/;du -sm /data1/*"

ssh -t -q 192.168.143.103 "cp -r /data1/journalnode   /data1/journalnode.bak.20171011;ls -al /data1/dfs/;du -sm /data1/*"

The journal path can be found in hdfs-site.xml:

dfs.journalnode.edits.dir:  /data1/journalnode
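The same property can also be read with the getconf tool instead of opening the file; a small sketch following the ssh pattern used above:

ssh -t -q 192.168.143.101  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/bin/hdfs\ getconf\ -confKey\ dfs.journalnode.edits.dir"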

### Upgrade

12. Copy the files (already handled in advance; see step 1) and switch the hadoop-release soft link to the 2.7.1 release. The per-host command template is as follows (with $h standing for each host; it is expanded as a loop below and host by host in step 13):

ssh -t -q $h "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
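As a minimal loop form of the same template (the host list comes from the cluster list above):

for h in 192.168.143.46 192.168.143.103 192.168.143.101 192.168.143.102 192.168.143.196 192.168.143.231 192.168.143.182 192.168.143.235 192.168.143.41 192.168.143.127; do
  ssh -t -q "$h" "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"
done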

13. Switch the file soft link on every node (run as root)

ssh -t -q 192.168.143.46   "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"

ssh -t -q 192.168.143.103   "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"

ssh -t -q 192.168.143.101   "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"

ssh -t -q 192.168.143.102   "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"

ssh -t -q 192.168.143.196   "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"

ssh -t -q 192.168.143.231   "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"

ssh -t -q 192.168.143.182   "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"

ssh -t -q 192.168.143.235   "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"

ssh -t -q 192.168.143.41    "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"

ssh -t -q 192.168.143.127   "cd /usr/local/hadoop; rm hadoop-release; ln -s hadoop-2.7.1 hadoop-release"

Confirm the result:

ssh -t -q 192.168.143.46    "cd /usr/local/hadoop; ls -al"

ssh -t -q 192.168.143.103   "cd /usr/local/hadoop; ls -al"

ssh -t -q 192.168.143.101   "cd /usr/local/hadoop; ls -al"

ssh -t -q 192.168.143.102   "cd /usr/local/hadoop; ls -al"

ssh -t -q 192.168.143.196   "cd /usr/local/hadoop; ls -al"

ssh -t -q 192.168.143.231   "cd /usr/local/hadoop; ls -al"

ssh -t -q 192.168.143.182   "cd /usr/local/hadoop; ls -al"

ssh -t -q 192.168.143.235   "cd /usr/local/hadoop; ls -al"

ssh -t -q 192.168.143.41    "cd /usr/local/hadoop; ls -al"

ssh -t -q 192.168.143.127   "cd /usr/local/hadoop; ls -al"

### Start HDFS (run as the hdfs user)

14. Start the JournalNodes

JN:

ssh -t -q 192.168.143.101  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ journalnode"

ssh -t -q 192.168.143.102  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ journalnode"

ssh -t -q 192.168.143.103  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ journalnode"

ssh -t -q 192.168.143.101  sudo su -l hdfs -c "jps"

ssh -t -q 192.168.143.102  sudo su -l hdfs -c "jps"

ssh -t -q 192.168.143.103  sudo su -l hdfs -c "jps"

15. Start the first NameNode (the previously active one), with the -upgrade option

ssh 192.168.143.46

su - hdfs

/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start namenode -upgrade

16. Confirm the status via the web UI; only after everything looks healthy may the other NameNode be started.

https://192.168.143.46:50470/dfshealth.html#tab-overview

17. Start the first ZKFC, on 192.168.143.46

su - hdfs

/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start zkfc

18. Start the second NameNode

ssh 192.168.143.103

su - hdfs

/usr/local/hadoop/hadoop-release/bin/hdfs namenode -bootstrapStandby

/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start namenode

19. Start the second ZKFC

ssh 192.168.143.103

su - hdfs

/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh start zkfc

20. Start the DataNodes

ssh -t -q 192.168.143.196  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"

ssh -t -q 192.168.143.231  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"

ssh -t -q 192.168.143.182  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"

ssh -t -q 192.168.143.235  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"

ssh -t -q 192.168.143.41   sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"

ssh -t -q 192.168.143.127  sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\ start\ datanode"

Confirm the status:

ssh -t -q 192.168.143.196  sudo su -l hdfs -c "jps"

ssh -t -q 192.168.143.231  sudo su -l hdfs -c "jps"

ssh -t -q 192.168.143.182  sudo su -l hdfs -c "jps"

ssh -t -q 192.168.143.235  sudo su -l hdfs -c "jps"

ssh -t -q 192.168.143.41   sudo su -l hdfs -c "jps"

ssh -t -q 192.168.143.127  sudo su -l hdfs -c "jps"

21. Once everything is healthy, start HBase (run as the hbase user)

Start the HBase Masters; it is best to start the previously active master first.

ssh -t -q 192.168.143.101  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ master"

ssh -t -q 192.168.143.103  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ master"

Start the HBase RegionServers

ssh -t -q 192.168.143.196  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"

ssh -t -q 192.168.143.231  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"

ssh -t -q 192.168.143.182  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"

ssh -t -q 192.168.143.235  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"

ssh -t -q 192.168.143.41   sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"

ssh -t -q 192.168.143.127  sudo su -l hbase -c "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\ start\ regionserver"

22. The HBase region balancer must be switched on and off manually.

Log in to the HBase shell and run the following commands.

Enable:

balance_switch true

Disable:

balance_switch false
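If an interactive session is not convenient, the same command can be piped into hbase shell; a small sketch run as the hbase user on a master node:

su - hbase

echo "balance_switch true" | /usr/local/hadoop/hbase-release/bin/hbase shell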

 

23. Do not finalize the upgrade yet. Let the system run for about a week to confirm it is stable, and only then execute the finalize step.

Note: during this period disk usage may grow quickly; part of that space is released once the finalize step has been executed.

Finalize the upgrade: hdfs dfsadmin -finalizeUpgrade
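When the time comes, the finalize step can be executed from the jump host in the same way as the other commands; a sketch following the same pattern, run as the hdfs user on a NameNode:

ssh -t -q 192.168.143.46   sudo su -l hdfs -c "/usr/local/hadoop/hadoop-release/bin/hdfs\ dfsadmin\ -finalizeUpgrade"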

 

