Theory reference: http://www.cnblogs.com/hseagle/p/3673147.html
Built on 3 hosts. The following are the operational steps only; look up the underlying theory online:
1. Add IP-to-hostname mappings to /etc/hosts; this fixes the problem of a WORKER on another host being unable to connect to the MASTER
$ cat /etc/hosts
192.168.1.6 node6
192.168.1.7 node7
192.168.1.8 node8
2. Create a spark user and set up passwordless SSH trust among the hosts, as sketched below
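A minimal sketch of this step, assuming a sudo-capable account; generate the key once per host, then copy it to every peer (hostnames from step 1):
$ sudo useradd -m spark && sudo passwd spark                   # create the spark user
$ su - spark
$ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa                     # key pair without a passphrase
$ for h in node6 node7 node8; do ssh-copy-id spark@$h; done    # authorize the key on all hosts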
3. Download the dependency packages and extract them
$ ll
total 426288
-rw-rw-r-- 1 spark spark 181435897 Sep 22 09:40 jdk-8u102-linux-x64.tar.gz
-rw-rw-r-- 1 spark spark 29086055 Sep 22 09:36 scala-2.11.11.tgz
-rw-rw-r-- 1 spark spark 203728858 Sep 22 09:41 spark-2.2.0-bin-hadoop2.7.tgz
-rw-rw-r-- 1 spark spark 22261552 Sep 22 09:40 zookeeper-3.4.8.tar.gz
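A sketch of the extraction, assuming everything is unpacked under ~/soft so the paths match the exports used below (the export itself is typically added to the spark user's ~/.bashrc):
$ mkdir -p ~/soft
$ for f in jdk-8u102-linux-x64.tar.gz scala-2.11.11.tgz spark-2.2.0-bin-hadoop2.7.tgz zookeeper-3.4.8.tar.gz; do tar -zxf $f -C ~/soft; done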
export SPARK_HOME=~/soft/spark-2.2.0-bin-hadoop2.7
4. Add the Spark configuration
cd $SPARK_HOME/conf
cp spark-env.sh.template spark-env.sh
cp slaves.template slaves
$ cat slaves
#localhost
192.168.1.6
192.168.1.7
192.168.1.8
$ cat spark-env.sh
#spark
export JAVA_HOME=~/soft/jdk1.8.0_102
export SCALA_HOME=~/soft/scala-2.11.11
#export SPARK_MASTER_IP=127.0.0.1
export SPARK_WORKER_CORES=12
export SPARK_WORKER_MEMORY=32g
export SPARK_HOME=~/soft/spark-2.2.0-bin-hadoop2.7
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=192.168.1.7:2181 -Dspark.deploy.zookeeper.dir=/spark"
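Note that spark.deploy.zookeeper.url above points at a single ZooKeeper node, so that node becomes a single point of failure for master recovery. If ZooKeeper also runs on node6 and node8 (an assumption; only node7 is configured above), all ensemble members can be listed:
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=192.168.1.6:2181,192.168.1.7:2181,192.168.1.8:2181 -Dspark.deploy.zookeeper.dir=/spark"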
5. Make sure steps 1-4 above have been carried out on every host; a sync sketch follows
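With the passwordless trust from step 2 in place, the software tree can be pushed from node6 instead of repeating the steps by hand (a sketch; editing /etc/hosts still needs root on each host):
$ for h in node7 node8; do rsync -a ~/soft/ $h:~/soft/; done   # push JDK, Scala, Spark and ZK to the other hosts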
6. Start ZooKeeper, standalone or as an ensemble (details omitted; a minimal sketch below)
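A minimal sketch of starting ZK, assuming zookeeper-3.4.8 was extracted to ~/soft and conf/zoo.cfg has already been prepared:
$ cd ~/soft/zookeeper-3.4.8
$ bin/zkServer.sh start       # repeat on every ensemble member
$ bin/zkServer.sh status      # verify: "Mode: standalone", or leader/follower in an ensemble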
7. Start Spark
cd $SPARK_HOME/sbin;
./start-all.sh (on the primary master node; starts the Master there plus a Worker on every host listed in slaves)
./start-master.sh (on each STANDBY master node)
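To confirm the daemons are up, jps can be run on each host:
$ jps    # expect a Master process where a master was started, and a Worker on every host listed in slaves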
8. Check the monitoring web UI (the page reports the master's status: ALIVE on the active master, STANDBY on the others)
http://192.168.1.6:8080
9. Test Spark
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://192.168.1.6:7077,192.168.1.7:7077,192.168.1.8:7077 ./examples/jars/spark-examples_2.11-2.2.0.jar
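The comma-separated --master list lets spark-submit find whichever master is currently ALIVE, so the job still runs after a failover. On success the driver output contains a line like "Pi is roughly 3.14" (the exact digits vary per run).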