Hadoop Ecosystem: Running Spark 1.6 and Spark 2.3.0 Side by Side on CDH 5.15.1



                                                  Author: Yin Zhengjie (尹正傑)

Copyright notice: this is an original work. Reproduction is not permitted, and violators will be held legally liable.

 

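  In my CDH 5.15.1 cluster the Spark installed by default is version 1.6. The developers complained to me that the previous big-data platform (hosted on UCloud, a cloud service) also ran Spark 1.6, that many Java APIs were unavailable, and that a lot of advanced features could not be used on 1.6, so we were forced to upgrade Spark. They asked for 2.3.0 or later. After reading the relevant material I put together these notes on deploying Spark 2.3.0. You can also refer to the official documentation: https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html
  If you have already deployed Kafka through CDH, adding a second Spark version should be easy, because the two follow essentially the same routine. If you are on the free edition of CDH, though, I do not recommend integrating Kafka through CDH, because there are some odd pitfalls waiting for you.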
 
 
I. Download the Spark 2.3 CSD jar
  As with the CDH Kafka integration, installing Spark 2 requires the matching CSD jar. Download location: http://archive.cloudera.com/spark2/csd/
1>. Choose the CSD version
2>. Install the download tool (wget)
[root@node101 ~]# yum -y install wget
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
10gen                                                                                                                                                                                       | 2.5 kB  00:00:00     
base                                                                                                                                                                                        | 3.6 kB  00:00:00     
centosplus                                                                                                                                                                                  | 3.4 kB  00:00:00     
epel                                                                                                                                                                                        | 3.2 kB  00:00:00     
extras                                                                                                                                                                                      | 3.4 kB  00:00:00     
mysql-connectors-community                                                                                                                                                                  | 2.5 kB  00:00:00     
mysql-tools-community                                                                                                                                                                       | 2.5 kB  00:00:00     
mysql56-community                                                                                                                                                                           | 2.5 kB  00:00:00     
updates                                                                                                                                                                                     | 3.4 kB  00:00:00     
(1/3): epel/x86_64/updateinfo                                                                                                                                                               | 933 kB  00:00:00     
(2/3): epel/x86_64/primary                                                                                                                                                                  | 3.6 MB  00:00:01     
(3/3): updates/7/x86_64/primary_db                                                                                                                                                          | 6.0 MB  00:00:01     
epel                                                                                                                                                                                                   12756/12756
Resolving Dependencies
--> Running transaction check
---> Package wget.x86_64 0:1.14-15.el7_4.1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===================================================================================================================================================================================================================
 Package                                        Arch                                             Version                                                      Repository                                      Size
===================================================================================================================================================================================================================
Installing:
 wget                                           x86_64                                           1.14-15.el7_4.1                                              base                                           547 k

Transaction Summary
===================================================================================================================================================================================================================
Install  1 Package

Total download size: 547 k
Installed size: 2.0 M
Downloading packages:
wget-1.14-15.el7_4.1.x86_64.rpm                                                                                                                                                             | 547 kB  00:00:00     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : wget-1.14-15.el7_4.1.x86_64                                                                                                                                                                     1/1 
  Verifying  : wget-1.14-15.el7_4.1.x86_64                                                                                                                                                                     1/1 

Installed:
  wget.x86_64 0:1.14-15.el7_4.1                                                                                                                                                                                    

Complete!
[root@node101 ~]# 
3>. Download the CSD jar
[root@node101 ~]# mkdir -p /opt/cloudera/csd && cd /opt/cloudera/csd
[root@node101 csd]# 
[root@node101 csd]# wget http://archive.cloudera.com/spark2/csd/SPARK2_ON_YARN-2.3.0.cloudera4.jar
--2018-10-31 00:17:57--  http://archive.cloudera.com/spark2/csd/SPARK2_ON_YARN-2.3.0.cloudera4.jar
Connecting to 10.9.137.250:3888... connected.
Proxy request sent, awaiting response... 200 OK
Length: 19037 (19K) [application/java-archive]
Saving to: ‘SPARK2_ON_YARN-2.3.0.cloudera4.jar’

100%[=========================================================================================================================================================================>] 19,037      --.-K/s   in 0.002s  

2018-10-31 00:17:57 (10.4 MB/s) - ‘SPARK2_ON_YARN-2.3.0.cloudera4.jar’ saved [19037/19037]

[root@node101 csd]# 
[root@node101 csd]# ll
total 20
-rw-r--r--. 1 root root 19037 Oct  5 05:45 SPARK2_ON_YARN-2.3.0.cloudera4.jar
[root@node101 csd]# 
4>. Change the ownership so that the jar belongs to the cloudera-scm user
[root@node101 csd]# ll
total 20
-rw-r--r--. 1 root root 19037 Oct  5 05:45 SPARK2_ON_YARN-2.3.0.cloudera4.jar
[root@node101 csd]# 
[root@node101 csd]# 
[root@node101 csd]# id cloudera-scm
uid=997(cloudera-scm) gid=995(cloudera-scm) groups=995(cloudera-scm)
[root@node101 csd]# 
[root@node101 csd]# 
[root@node101 csd]# chown cloudera-scm:cloudera-scm SPARK2_ON_YARN-2.3.0.cloudera4.jar 
[root@node101 csd]# 
[root@node101 csd]# ll
total 20
-rw-r--r--. 1 cloudera-scm cloudera-scm 19037 Oct  5 05:45 SPARK2_ON_YARN-2.3.0.cloudera4.jar
[root@node101 csd]# 
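
The ownership change above can be double-checked with a small helper. This is only a sketch: the function name `owned_by` is made up for illustration, and it assumes GNU `stat` (as shipped with CentOS 7), whose `-c '%U:%G'` format prints the owning user and group.

```shell
# Hypothetical helper: succeed only if FILE is owned by USER and GROUP.
# Assumes GNU coreutils stat (the -c option is not available on BSD/macOS).
owned_by() {
    file="$1"
    user="$2"
    group="$3"
    [ "$(stat -c '%U:%G' "$file")" = "${user}:${group}" ]
}

# Usage on the Cloudera Manager host (path and user taken from the transcript):
# owned_by /opt/cloudera/csd/SPARK2_ON_YARN-2.3.0.cloudera4.jar cloudera-scm cloudera-scm \
#     && echo "CSD jar ownership looks right"
```

Comparing user and group in one string keeps the check to a single `stat` call.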

  

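II. Download the Spark 2.3 parcel
  As with the CDH Kafka integration, installing Spark 2 also requires the matching parcel. Download location: http://archive.cloudera.com/spark2/parcels/2.3.0.cloudera4/
1>. Choose the Spark parcel version. It must match the CSD version above, and it must also match your operating system version (el7 here).
 
2>. Go to the parcel repository directory and back up the existing manifest.json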
[root@node101 ~]# cd /opt/cloudera/parcel-repo/
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# ll
total 2070564
-rwxr-xr-x. 1 root root 2120090032 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
-rwxr-xr-x. 1 root root         41 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha
-rw-r-----. 1 root root      81046 Oct 26 08:38 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.torrent
-rwxr-xr-x. 1 root root      73767 Oct 26 07:19 manifest.json
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# mv manifest.json manifest.json.`date +%F`
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# ll
total 2070564
-rwxr-xr-x. 1 root root 2120090032 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
-rwxr-xr-x. 1 root root         41 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha
-rw-r-----. 1 root root      81046 Oct 26 08:38 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.torrent
-rwxr-xr-x. 1 root root      73767 Oct 26 07:19 manifest.json.2018-10-31
[root@node101 parcel-repo]# 
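
The backup step above can be wrapped so that it refuses to overwrite a backup made earlier the same day. A minimal sketch; `backup_manifest` is a name I made up, and the date suffix simply mirrors the `date +%F` used in the transcript.

```shell
# Hypothetical helper: rename REPO/manifest.json to manifest.json.YYYY-MM-DD,
# refusing to clobber a backup that already exists.
backup_manifest() {
    repo="$1"
    src="${repo}/manifest.json"
    dst="${src}.$(date +%F)"
    if [ ! -f "$src" ]; then
        echo "no manifest.json in ${repo}" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "${dst} already exists, not overwriting" >&2
        return 1
    fi
    mv "$src" "$dst"
}

# Usage (directory taken from the transcript):
# backup_manifest /opt/cloudera/parcel-repo
```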
3>. Download the installation files for Spark 2.3
[root@node101 ~]# hostname
node101.yinzhengjie.org.cn
[root@node101 ~]# 
[root@node101 ~]# cd /opt/cloudera/parcel-repo/
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# ll
total 2070564
-rwxr-xr-x. 1 root root 2120090032 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
-rwxr-xr-x. 1 root root         41 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha
-rw-r-----. 1 root root      81046 Oct 26 08:38 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.torrent
-rwxr-xr-x. 1 root root      73767 Oct 26 07:19 manifest.json.2018-10-31
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# wget http://archive.cloudera.com/spark2/parcels/2.3.0.cloudera4/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel
--2018-10-31 00:36:28--  http://archive.cloudera.com/spark2/parcels/2.3.0.cloudera4/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel
Connecting to 10.9.137.250:3888... connected.
Proxy request sent, awaiting response... 200 OK
Length: 191904064 (183M) [binary/octet-stream]
Saving to: ‘SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel’

100%[=========================================================================================================================================================================>] 191,904,064  255KB/s   in 22m 2s 

2018-10-31 00:58:31 (142 KB/s) - ‘SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel’ saved [191904064/191904064]

[root@node101 parcel-repo]# ll
total 2257972
-rwxr-xr-x. 1 root root 2120090032 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
-rwxr-xr-x. 1 root root         41 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha
-rw-r-----. 1 root root      81046 Oct 26 08:38 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.torrent
-rwxr-xr-x. 1 root root      73767 Oct 26 07:19 manifest.json.2018-10-31
-rw-r--r--. 1 root root  191904064 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel
[root@node101 parcel-repo]# 
Download the Spark 2.3.0 parcel ([root@node101 parcel-repo]# wget http://archive.cloudera.com/spark2/parcels/2.3.0.cloudera4/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel)
[root@node101 parcel-repo]# ll
total 2257972
-rwxr-xr-x. 1 root root 2120090032 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
-rwxr-xr-x. 1 root root         41 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha
-rw-r-----. 1 root root      81046 Oct 26 08:38 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.torrent
-rwxr-xr-x. 1 root root      73767 Oct 26 07:19 manifest.json.2018-10-31
-rw-r--r--. 1 root root  191904064 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# wget http://archive.cloudera.com/spark2/parcels/2.3.0.cloudera4/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha1
--2018-10-31 01:42:02--  http://archive.cloudera.com/spark2/parcels/2.3.0.cloudera4/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha1
Connecting to 10.9.137.250:3888... connected.
Proxy request sent, awaiting response... 200 OK
Length: 41 [binary/octet-stream]
Saving to: ‘SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha1’

100%[=========================================================================================================================================================================>] 41          --.-K/s   in 0s      

2018-10-31 01:42:02 (3.01 MB/s) - ‘SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha1’ saved [41/41]

[root@node101 parcel-repo]# ll
total 2257976
-rwxr-xr-x. 1 root root 2120090032 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
-rwxr-xr-x. 1 root root         41 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha
-rw-r-----. 1 root root      81046 Oct 26 08:38 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.torrent
-rwxr-xr-x. 1 root root      73767 Oct 26 07:19 manifest.json.2018-10-31
-rw-r--r--. 1 root root  191904064 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel
-rw-r--r--. 1 root root         41 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha1
[root@node101 parcel-repo]# 
Download the checksum file for the Spark 2.3.0 parcel ([root@node101 parcel-repo]# wget http://archive.cloudera.com/spark2/parcels/2.3.0.cloudera4/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha1)
[root@node101 parcel-repo]# hostname
node101.yinzhengjie.org.cn
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# pwd
/opt/cloudera/parcel-repo
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# ll
total 2257976
-rwxr-xr-x. 1 root root 2120090032 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
-rwxr-xr-x. 1 root root         41 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha
-rw-r-----. 1 root root      81046 Oct 26 08:38 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.torrent
-rwxr-xr-x. 1 root root      73767 Oct 26 07:19 manifest.json.2018-10-31
-rw-r--r--. 1 root root  191904064 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel
-rw-r--r--. 1 root root         41 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha1
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# mv SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha1 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# ll
total 2257976
-rwxr-xr-x. 1 root root 2120090032 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
-rwxr-xr-x. 1 root root         41 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha
-rw-r-----. 1 root root      81046 Oct 26 08:38 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.torrent
-rwxr-xr-x. 1 root root      73767 Oct 26 07:19 manifest.json.2018-10-31
-rw-r--r--. 1 root root  191904064 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel
-rw-r--r--. 1 root root         41 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha
[root@node101 parcel-repo]# 
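Rename the downloaded checksum file from .sha1 to .sha, the same suffix the CDH parcel's checksum file already uses in this directory ([root@node101 parcel-repo]# mv SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha1 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha)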
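
Since the download went through a proxy and took over twenty minutes, it may be worth comparing the parcel against its checksum file before handing it to Cloudera Manager. The sketch below assumes the `.sha` file holds nothing but the hex SHA-1 digest (its 41-byte size in the listing is consistent with 40 hex characters plus a newline); `verify_parcel_sha` is a hypothetical name.

```shell
# Hypothetical helper: compare the SHA-1 of PARCEL with the digest stored in
# PARCEL.sha. Assumes the .sha file contains only the hex digest.
verify_parcel_sha() {
    parcel="$1"
    expected=$(tr -d '[:space:]' < "${parcel}.sha")
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    [ "$expected" = "$actual" ]
}

# Usage (file name taken from the transcript):
# cd /opt/cloudera/parcel-repo
# verify_parcel_sha SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel \
#     && echo "checksum matches" || echo "checksum MISMATCH"
```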
[root@node101 parcel-repo]# hostname
node101.yinzhengjie.org.cn
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# pwd
/opt/cloudera/parcel-repo
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# ll
total 2257976
-rwxr-xr-x. 1 root root 2120090032 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
-rwxr-xr-x. 1 root root         41 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha
-rw-r-----. 1 root root      81046 Oct 26 08:38 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.torrent
-rwxr-xr-x. 1 root root      73767 Oct 26 07:19 manifest.json.2018-10-31
-rw-r--r--. 1 root root  191904064 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel
-rw-r--r--. 1 root root         41 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha
[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# wget http://archive.cloudera.com/spark2/parcels/2.1.0.cloudera1/manifest.json
--2018-10-31 01:45:24--  http://archive.cloudera.com/spark2/parcels/2.1.0.cloudera1/manifest.json
Connecting to 10.9.137.250:3888... connected.
Proxy request sent, awaiting response... 200 OK
Length: 4677 (4.6K) [application/json]
Saving to: ‘manifest.json’

100%[=========================================================================================================================================================================>] 4,677       --.-K/s   in 0s      

2018-10-31 01:45:25 (229 MB/s) - ‘manifest.json’ saved [4677/4677]

[root@node101 parcel-repo]# 
[root@node101 parcel-repo]# ll
total 2257984
-rwxr-xr-x. 1 root root 2120090032 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
-rwxr-xr-x. 1 root root         41 Oct 26 07:19 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha
-rw-r-----. 1 root root      81046 Oct 26 08:38 CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.torrent
-rw-r--r--. 1 root root       4677 Feb  5  2018 manifest.json
-rwxr-xr-x. 1 root root      73767 Oct 26 07:19 manifest.json.2018-10-31
-rw-r--r--. 1 root root  191904064 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel
-rw-r--r--. 1 root root         41 Oct  5 05:45 SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel.sha
[root@node101 parcel-repo]# 
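Download manifest.json ([root@node101 parcel-repo]# wget http://archive.cloudera.com/spark2/parcels/2.1.0.cloudera1/manifest.json). Note that this URL points at the 2.1.0.cloudera1 directory, not the 2.3.0.cloudera4 directory the parcel came from; the manifest.json that sits next to the parcel is presumably the one that describes it.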
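
The transcript fetches manifest.json from the 2.1.0.cloudera1 directory while the parcel itself is 2.3.0.cloudera4, so a quick check of whether a given manifest mentions the parcel file at all can be useful. This is a rough sketch: `manifest_lists_parcel` is a made-up name, and it only looks for the quoted file name anywhere in the file rather than parsing the JSON.

```shell
# Hypothetical helper: succeed if MANIFEST contains the quoted parcel file name.
# This is a plain fixed-string search, not a JSON parse.
manifest_lists_parcel() {
    manifest="$1"
    parcel_name="$2"
    grep -F -q "\"${parcel_name}\"" "$manifest"
}

# Usage (names taken from the transcript):
# manifest_lists_parcel /opt/cloudera/parcel-repo/manifest.json \
#     SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179-el7.parcel \
#     || echo "manifest.json does not mention this parcel"
```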

4>. Restart the Cloudera Manager server (my CDH was deployed offline)

[root@node101 ~]# hostname
node101.yinzhengjie.org.cn
[root@node101 ~]# 
[root@node101 ~]# cd /opt/cloudera-manager/cm-5.15.1/etc/init.d/
[root@node101 init.d]# 
[root@node101 init.d]# ll
total 32
-rwxr-xr-x. 1 1106 4001 8871 Jul 31 06:28 cloudera-scm-agent
-rwxr-xr-x. 1 1106 4001 8417 Jul 31 06:28 cloudera-scm-server
-rwxr-xr-x. 1 1106 4001 4444 Jul 31 06:28 cloudera-scm-server-db
[root@node101 init.d]# 
[root@node101 init.d]# ./cloudera-scm-server restart
Stopping cloudera-scm-server:                              [  OK  ]
Starting cloudera-scm-server:                              [  OK  ]
[root@node101 init.d]# 
 
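III. Deploy Spark 2.3.0 through CDH
1>. Click Parcel
2>. Click Distribute
3>. Click Activate
4>. Click OK
5>. Activation completes
6>. Click Add Service
7>. Select Spark 2 and click Continue
 
8>. Assign roles for Spark 2.3.0
9>. Click Continue
10>. Wait for the service deployment to finish
11>. The Spark 2.3.0 service is added successfully
12>. Check the Spark 2.3.0 service on the Cloudera Manager home page

13>. Run Spark 2.3.0 on a host that has the gateway role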
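
On a gateway host the Spark 2 commands carry a `2` in their names (`spark2-shell`, `spark2-submit`, `pyspark2`), which is what lets them coexist with the Spark 1.6 commands. As a smoke test, something like the sketch below could submit the bundled SparkPi example to YARN. The helper `spark_pi_command` is hypothetical, and the default examples jar path is my assumption about where the parcel unpacks; check it on your own host and override it through `SPARK2_EXAMPLES_JAR` if it differs.

```shell
# Hypothetical helper: print a spark2-submit command line for the SparkPi example.
# The default jar path is an ASSUMPTION about the SPARK2 parcel layout.
spark_pi_command() {
    slices="${1:-10}"
    jar="${SPARK2_EXAMPLES_JAR:-/opt/cloudera/parcels/SPARK2/lib/spark2/examples/jars/spark-examples_2.11-2.3.0.cloudera4.jar}"
    printf 'spark2-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client %s %s\n' "$jar" "$slices"
}

# Usage: review the command first, then run it on the gateway host.
# spark_pi_command 10
# spark_pi_command 10 | sh
```

Printing the command instead of running it directly makes it easy to review the jar path before anything is submitted to the cluster.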

 
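IV. Errors you may hit when deploying Spark
1>. The YARN component is missing. Spark has two modes here: standalone mode and YARN mode.
  The error shown while deploying Spark:
      A valid dependency service of type "YARN (MR2 Included)" must be provided to create a new service of type "Spark 2".

  Solution:

    Simple. When Spark or Spark 2 is deployed in on-YARN mode, the message names the YARN service explicitly, so install the YARN service first and then install Spark 2.3.0.

2>. The requested executor memory exceeds the maximum memory YARN will allocate.

  The error shown when starting Spark 2:

    Required executor memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'. 

   Solution:

    As the error message says, the memory requested (1024 MB of executor memory plus 384 MB of overhead, 1408 MB in total) exceeds the configured maximum of 1024 MB. Follow the hint and raise the maximum container memory (yarn.scheduler.maximum-allocation-mb) and the container memory (yarn.nodemanager.resource.memory-mb) in the YARN service configuration. My advice is not to hand all of the machine's memory to YARN; leave some for the operating system. I recommend the 80/20 rule (20% of memory for the operating system, 80% for services).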
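
The arithmetic behind the error and the 80/20 suggestion can be sketched in a few lines of shell. The overhead rule used here (10% of executor memory, with a floor of 384 MB) is Spark's default for executors on YARN; the function names are made up for illustration, and 16384 MB is just an example host size.

```shell
# Hypothetical helper: container size YARN must grant for one executor, in MB.
# Overhead follows Spark's default on YARN: max(384, 10% of executor memory).
required_container_mb() {
    executor_mb="$1"
    overhead_mb=$(( executor_mb / 10 ))
    if [ "$overhead_mb" -lt 384 ]; then
        overhead_mb=384
    fi
    echo $(( executor_mb + overhead_mb ))
}

# Hypothetical helper: 80% of a host's memory, leaving 20% for the operating system.
yarn_share_mb() {
    total_mb="$1"
    echo $(( total_mb * 80 / 100 ))
}

# Usage:
# required_container_mb 1024    # prints 1408, which is above the 1024 MB threshold
# yarn_share_mb 16384           # prints 13107
```

So yarn.scheduler.maximum-allocation-mb needs to be at least 1408 for the default 1 GB executor to start, and yarn.nodemanager.resource.memory-mb should stay within the 80% share of each host.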

 
 
 

