Reposted from: https://www.cnblogs.com/dtmobile-ksw/p/11988132.html
Installing a Flink standalone cluster
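1. Download Flink: https://flink.apache.org/downloads.html
Official reference: https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/deployment/cluster_setup.html
If you already run a Hadoop platform and want Flink to read and write data on it, also download the Hadoop-compatible jar that matches your Hadoop version.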
2. Install JDK 8 and configure passwordless SSH between the nodes.
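Before moving on, it can help to confirm that each node really runs Java 8 and to list the commands that set up passwordless SSH. The sketch below assumes the `java -version` banner contains `version "1.8.…"`. The helper names `is_java8` and `print_ssh_setup` are made up for this illustration, and the second helper only prints the commands rather than running them.

```shell
# Succeeds when the `java -version` banner read from stdin reports Java 8.
is_java8() {
  grep -q 'version "1\.8\.'
}

# Prints (does not run) the commands for passwordless SSH to each given host.
print_ssh_setup() {
  echo 'ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa'
  for host in "$@"; do
    echo "ssh-copy-id $host"
  done
}

# Usage (java writes its banner to stderr, hence the redirect):
#   java -version 2>&1 | is_java8 && echo "Java 8 found"
#   print_ssh_setup worker01.hadoop.xxx.cn worker02.hadoop.xxx.cn
```

Printing the SSH commands first makes it easy to review them before running anything on the master node.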
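3. Choose which nodes act as master (JobManager) and slaves (TaskManager).
For example, take three nodes, with 10.0.0.1 as the master and the other two as slaves; the right-hand side shows what the related configuration files should contain.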
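4. Edit the configuration file flink-1.8.2/conf/flink-conf.yaml
Adjust the related settings (ports, memory) to your actual environment. The Java installation can also be set in this file through the env.java.home key (env.java.home: /path/to/you); the file is YAML, so a shell-style export JAVA_HOME line does not belong in it.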
jobmanager.rpc.address: master01.hadoop.xxx.cn
# The RPC port where the JobManager is reachable.
jobmanager.rpc.port: 6123
# The heap size for the JobManager JVM
jobmanager.heap.size: 1024m
# The heap size for the TaskManager JVM
taskmanager.heap.size: 1024m
# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
# This is the number of CPUs available on each machine. On a compute-only cluster, a 16-core server can be given 14 slots, leaving 2 cores for the system.
taskmanager.numberOfTaskSlots: 1
# The parallelism used for programs that did not specify any other parallelism.
# This is the default parallelism of tasks.
parallelism.default: 1
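Editing these keys by hand on several machines is error-prone, so a small helper can apply them consistently. This is only a sketch: `set_conf_key` is a hypothetical name, it handles simple top-level `key: value` lines only, and it assumes the value contains no `|`, `&` or backslash characters.

```shell
# Sets "key: value" in a flink-conf.yaml-style file: rewrites the existing
# top-level line for that key, or appends a new line if the key is absent.
set_conf_key() {
  file=$1; key=$2; value=$3
  # Escape the dots in the key so they match literally in the regex.
  pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
  if grep -q "^${pattern}:" "$file"; then
    # Write to a temp file and move it back (portable across sed variants).
    sed "s|^${pattern}:.*|${key}: ${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
  else
    echo "${key}: ${value}" >> "$file"
  fi
}

# Usage:
#   CONF=flink-1.8.2/conf/flink-conf.yaml
#   set_conf_key "$CONF" jobmanager.rpc.address master01.hadoop.xxx.cn
#   set_conf_key "$CONF" taskmanager.numberOfTaskSlots 14
```

Commented-out defaults (lines starting with `#`) are left untouched, so a key that exists only as a comment is appended at the end of the file.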
The more important configuration parameters (see the official reference linked above for the full list):
the amount of available memory per JobManager (jobmanager.heap.size, formerly jobmanager.heap.mb), the amount of available memory per TaskManager (taskmanager.heap.size, formerly taskmanager.heap.mb), the number of available CPUs per machine (taskmanager.numberOfTaskSlots), the total number of CPUs in the cluster (parallelism.default), and the temporary directories (io.tmp.dirs).
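The rule of thumb from the configuration comments (a 16-core server gets 14 slots, leaving 2 cores for the system) can be written as a one-line calculation. The sketch below is an illustration, not a Flink tool: `suggest_slots` is a hypothetical helper, and the default of 2 reserved cores simply mirrors that example.

```shell
# Suggests a value for taskmanager.numberOfTaskSlots as (cores - reserved),
# never going below 1.
# Arguments: core count (default: detected online cores), reserved cores (default: 2).
suggest_slots() {
  cores=${1:-$(getconf _NPROCESSORS_ONLN)}
  reserved=${2:-2}
  slots=$((cores - reserved))
  if [ "$slots" -lt 1 ]; then
    slots=1
  fi
  echo "$slots"
}

# Usage:
#   suggest_slots 16    # prints 14
#   suggest_slots       # uses the core count of the current machine
```

On machines shared with other services, a larger reserve is probably safer than the default of 2.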
5. Edit the configuration files flink-1.8.2/conf/masters and flink-1.8.2/conf/slaves
masters file: specifies the node that hosts the master, together with its port
master01.hadoop.xxx.cn:8081
slaves file: lists the slave (worker) nodes, one host per line
worker01.hadoop.xxx.cn
worker02.hadoop.xxx.cn
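Both files are plain host lists, so they can be generated rather than typed. A minimal sketch, assuming the conf directory already exists; `write_cluster_files` is a hypothetical helper name.

```shell
# Writes the masters and slaves files into the given Flink conf directory.
# Arguments: conf dir, master as "host:port", then one or more worker hosts.
write_cluster_files() {
  conf_dir=$1; master=$2
  shift 2
  echo "$master" > "$conf_dir/masters"
  # One worker host per line.
  printf '%s\n' "$@" > "$conf_dir/slaves"
}

# Usage:
#   write_cluster_files flink-1.8.2/conf master01.hadoop.xxx.cn:8081 \
#     worker01.hadoop.xxx.cn worker02.hadoop.xxx.cn
```

Note that the helper overwrites any existing masters and slaves files in that directory.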
6. Distribute the Flink package to every node
scp .....
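The exact scp arguments are left out above. One possible way to drive the copy from the slaves file is sketched here as a dry run. Everything in it is an assumption made for illustration: the helper name `print_distribute_cmds`, the use of `scp -r`, and the source and target paths, which the caller supplies.

```shell
# Prints one scp command per host listed in a slaves file (one host per line).
# Nothing is copied; review the output, then pipe it to `sh` to execute it.
# Arguments: slaves file, local Flink directory, target directory on the workers.
print_distribute_cmds() {
  slaves_file=$1; flink_dir=$2; target_dir=$3
  # The `|| [ -n "$host" ]` part also handles a last line without a newline.
  while IFS= read -r host || [ -n "$host" ]; do
    # Skip blank lines and comment lines.
    case $host in ''|'#'*) continue ;; esac
    echo "scp -r $flink_dir $host:$target_dir"
  done < "$slaves_file"
}

# Usage (the target directory is a hypothetical placeholder):
#   print_distribute_cmds flink-1.8.2/conf/slaves flink-1.8.2 /path/on/worker
```

The installation should end up at the same path on every node, since the start script logs in to each worker over SSH and launches Flink from there.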
7. Start the standalone cluster
bin/start-cluster.sh
Check the status:
Run jps on the master node
50963 StandaloneSessionClusterEntrypoint
Run jps on a worker node
3509 TaskManagerRunner
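The jps check can be scripted so that it returns an exit status instead of needing a visual scan. A small sketch, assuming the default jps output of one `<pid> <name>` pair per line; `has_jvm` is a hypothetical helper name.

```shell
# Succeeds if the jps output on stdin lists a JVM with the given name.
# jps prints "<pid> <name>" per line, so the name is the second field.
has_jvm() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Usage:
#   jps | has_jvm StandaloneSessionClusterEntrypoint && echo "JobManager is up"
#   jps | has_jvm TaskManagerRunner && echo "TaskManager is up"
```

Matching on the whole second field avoids false positives from process names that merely contain the same text.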
Open the web UI
http://master01.hadoop.xxx.cn:8081
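The same port also serves Flink's REST API, whose `/overview` endpoint returns a flat JSON object with fields such as `taskmanagers`, `slots-total` and `slots-available`. The sketch below pulls one integer field out of that response with sed. `json_int_field` is a hypothetical helper, and it is only adequate for a flat object like this one, not for nested JSON.

```shell
# Extracts an integer field from the flat JSON object read on stdin.
# Argument: the field name, for example taskmanagers or slots-total.
json_int_field() {
  sed -n "s/.*\"$1\": *\([0-9][0-9]*\).*/\1/p"
}

# Usage:
#   curl -s http://master01.hadoop.xxx.cn:8081/overview | json_int_field taskmanagers
```

With the two workers configured above, the `taskmanagers` field should read 2 once both have registered with the JobManager.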
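Run the word-count example bundled with Flink. Start the socket listener first, in a separate terminal, then submit the job: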
nc -l 9999
bin/flink run examples/streaming/SocketWindowWordCount.jar --hostname 172.xx.xx.xxx --port 9999
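Job fault tolerance in a Flink standalone cluster
1. If the JobManager goes down, the jobs that are running fail, so the JobManager should be set up for high availability (HA).
2. If a TaskManager goes down and spare TaskManager nodes are available, Flink automatically reschedules its tasks onto the other nodes.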