Flink cluster installation and deployment
Standalone cluster mode
- Required dependencies
- Required software
- JAVA_HOME configuration
- Flink installation
- Configuring Flink
- Starting Flink
- Adding JobManager/TaskManager instances to the cluster
- Installation steps from a real-world environment
Required dependencies
Required software
Flink runs on all Unix-like environments, such as Linux, macOS, or Cygwin. A cluster consists of one master node and one or more worker nodes. Before you start installing, make sure the following software is present on every node:
- Java 1.8.x or higher
- ssh
If your cluster does not have this software, you need to install or upgrade it. Note: most Linux servers ship with ssh, but Java usually has to be installed yourself.
Passwordless SSH login must be configured across all nodes of the cluster.
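As a minimal sketch (the worker hostnames are examples, not from this guide), passwordless SSH is usually set up by generating a key pair on the master and pushing the public key to each worker:

```shell
# Generate a key pair; the throwaway path is only for this sketch --
# in practice the default ~/.ssh/id_rsa is fine.
key=$(mktemp -u)
ssh-keygen -q -t rsa -N "" -f "$key"
ls "$key" "$key.pub"

# Then push the public key to every worker and verify, e.g.:
#   ssh-copy-id worker1
#   ssh-copy-id worker2
#   ssh worker1 hostname   # should log in without a password prompt
```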
JAVA_HOME configuration
Flink requires JAVA_HOME to be set on all nodes of the cluster (master and workers), pointing to the Java installation on that machine.
You can configure this in the file conf/flink-conf.yaml via the env.java.home key.
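A minimal sketch of that setting; the JDK path is illustrative and must match your actual installation:

```yaml
# conf/flink-conf.yaml
env.java.home: /usr/local/jdk
```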
Flink installation
Go to the download page and grab an installation package. Make sure you pick a Flink package that matches your Hadoop version; if you don't plan to use Hadoop, any version will do.
After downloading the latest release, upload the package to your master node and extract it:
```shell
tar xzf flink-*.tgz
cd flink-*
```
Configuring Flink
After extracting, edit conf/flink-conf.yaml.
Set jobmanager.rpc.address to the IP address or hostname of the master node. You can also define the maximum amount of heap the JVM is allowed to allocate on each node, using jobmanager.heap.mb and taskmanager.heap.mb.
Both values are given in MB. If some nodes should be assigned more memory, you can override the default on those nodes via the FLINK_TM_HEAP environment variable.
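A sketch of such an override on one worker; the value is illustrative and, like the heap keys above, given in MB:

```shell
# Export FLINK_TM_HEAP on this worker before starting its TaskManager;
# it overrides the taskmanager.heap.mb default from flink-conf.yaml.
export FLINK_TM_HEAP=4096
echo "FLINK_TM_HEAP=$FLINK_TM_HEAP"
```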
Finally, you need to provide the list of nodes that will act as workers. Similar to an HDFS configuration, edit the file conf/slaves and enter the IP address or hostname of each worker node, one per line. Every worker node will run one TaskManager.
The following example illustrates a three-node setup (IP addresses 10.0.0.1 through 10.0.0.3, with hostnames master, worker1, and worker2) and shows the contents of the configuration files, which need to live at the same path on all machines:
```shell
vi /path/to/flink/conf/flink-conf.yaml
```
```yaml
jobmanager.rpc.address: 10.0.0.1
```
```shell
vi /path/to/flink/conf/slaves
```
```text
10.0.0.2
10.0.0.3
```
The Flink directory must be available at the same path on every worker node. You can either use a shared NFS directory or copy the entire Flink directory to every worker node.
For details on all available options, see the configuration page.
In particular, the following parameters are very important:
- Available memory for the JobManager (jobmanager.heap.mb)
- Available memory for each TaskManager (taskmanager.heap.mb)
- Number of available CPUs per machine (taskmanager.numberOfTaskSlots)
- Total number of CPUs in the cluster (parallelism.default)
- Temporary directories on each node (taskmanager.tmp.dirs)
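An illustrative flink-conf.yaml fragment with these keys; the values are placeholders to adapt to your own hardware:

```yaml
jobmanager.heap.mb: 1024          # JobManager JVM heap, in MB
taskmanager.heap.mb: 2048         # TaskManager JVM heap, in MB
taskmanager.numberOfTaskSlots: 4  # typically the number of CPUs per machine
parallelism.default: 8            # typically the total number of CPUs in the cluster
taskmanager.tmp.dirs: /tmp        # temporary directories
```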
Starting Flink
The script below starts a JobManager on the local node, then connects via SSH to every worker listed in the slaves file and starts a TaskManager on each of them. After that, Flink is up and running, and the locally running JobManager will accept jobs on the configured RPC port.
Make sure you are on the master node, inside the Flink directory:
```shell
bin/start-cluster.sh
```
To stop Flink, use the stop-cluster.sh script.
Adding JobManager or TaskManager instances to the cluster
You can add JobManagers and TaskManagers to a running cluster with the bin/jobmanager.sh and bin/taskmanager.sh scripts.
Adding a JobManager:
```shell
bin/jobmanager.sh ((start|start-foreground) cluster)|stop|stop-all
```
Adding a TaskManager:
```shell
bin/taskmanager.sh start|start-foreground|stop|stop-all
```
Installation steps from a real-world environment
The content above is a translation of the official documentation.
What follows are the steps I used to install Flink in a real environment.
Cluster layout: three machines, one master and two workers:
- hadoop100: JobManager
- hadoop101: TaskManager
- hadoop102: TaskManager

Notes:
1: Passwordless SSH login must be configured between these nodes (at a minimum, hadoop100 must be able to log in to hadoop101 and hadoop102 without a password).
2: JDK 1.8 or higher must be installed on every node, with the JAVA_HOME environment variable configured in /etc/profile, for example:
```shell
export JAVA_HOME=/usr/local/jdk
export PATH=.:$JAVA_HOME/bin:$PATH
```
1: Upload the Flink package to the /usr/local directory on hadoop100, then extract it:
```shell
cd /usr/local
tar -zxvf flink-1.4.1-bin-hadoop27-scala_2.11.tgz
```
2: Edit the Flink configuration files on hadoop100:
```shell
cd /usr/local/flink-1.4.1/conf
vi flink-conf.yaml
```
```yaml
# Change this parameter to the hostname of the master node
jobmanager.rpc.address: hadoop100
```
```shell
vi slaves
```
```text
hadoop101
hadoop102
```
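The same edits can also be scripted instead of made in vi. A self-contained sketch using sed against throwaway copies of the files (the temp paths are only for illustration; in practice you would edit the files under conf/ directly):

```shell
# Work on temp copies so this sketch is safe to run anywhere.
conf=$(mktemp)
slaves=$(mktemp)

# flink-conf.yaml: point the RPC address at the master node.
printf 'jobmanager.rpc.address: localhost\n' > "$conf"
sed -i 's/^jobmanager.rpc.address:.*/jobmanager.rpc.address: hadoop100/' "$conf"

# slaves: one worker hostname per line.
printf '%s\n' hadoop101 hadoop102 > "$slaves"

cat "$conf" "$slaves"
```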
3: Copy the configured Flink directory to the other two nodes:
```shell
scp -rq /usr/local/flink-1.4.1 hadoop101:/usr/local
scp -rq /usr/local/flink-1.4.1 hadoop102:/usr/local
```
4: Start the cluster from hadoop100:
```shell
cd /usr/local/flink-1.4.1
bin/start-cluster.sh
```
If the command succeeds, you should see log output like the following (the long 'hadoop classpath' line is printed once per daemon; the repeats are omitted here):
```text
Using the result of 'hadoop classpath' to augment the Hadoop classpath: /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar
Starting cluster.
Starting jobmanager daemon on host hadoop100.
Starting taskmanager daemon on host hadoop101.
Starting taskmanager daemon on host hadoop102.
```
5: Verify that the cluster started.
Check the processes. On hadoop100, run jps; you should see:
```text
3785 JobManager
```
On hadoop101, run jps; you should see:
```text
2534 TaskManager
```
On hadoop102, run jps; you should see:
```text
2402 TaskManager
```
As long as the corresponding JobManager and TaskManager processes show up, the cluster is running.
If startup failed, check the corresponding logs:
```shell
cd /usr/local/flink-1.4.1/log
```
For the JobManager node:
```shell
more flink-root-jobmanager-0-hadoop100.log
```
For the TaskManager nodes:
```shell
more flink-root-taskmanager-0-hadoop101.log
more flink-root-taskmanager-0-hadoop102.log
```
Look through these log files for exception messages.
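One quick way to scan a log for problems is grep. A self-contained sketch using a fake log file (in practice, point it at the flink-root-*.log files above):

```shell
# Fake log file so the sketch runs anywhere; contents are made up.
log=$(mktemp)
printf 'INFO  starting\nERROR could not bind to port 6123\n' > "$log"

# Count, then print, the lines that look like trouble:
grep -ciE 'error|exception' "$log"
grep -iE  'error|exception' "$log"
```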
6: Open the cluster's web UI:
http://hadoop100:8081
7: Stop the cluster. On hadoop100, run:
```shell
cd /usr/local/flink-1.4.1
bin/stop-cluster.sh
```
After the stop command you should see log output like the following (the repeated 'hadoop classpath' lines are again omitted after the first):
```text
Using the result of 'hadoop classpath' to augment the Hadoop classpath: /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar
Stopping taskmanager daemon (pid: 3321) on host hadoop101.
Stopping taskmanager daemon (pid: 3088) on host hadoop102.
Stopping jobmanager daemon (pid: 5341) on host hadoop100.
```
Running jps on the nodes again will show that the JobManager and TaskManager processes are gone.