1. Environment
Servers: one Alibaba Cloud server (master) and one Tencent Cloud server (slave)
OS: CentOS 7
Hadoop: hadoop-2.7.7.tar.gz
Java: jdk-8u172-linux-x64.tar.gz
2. Edit the hosts and hostname files
2.1 Edit the hosts file (/etc/hosts)
Comment out all of the existing entries.
Assuming master is the Alibaba Cloud machine, configure it there as follows,
where ip = the Alibaba Cloud internal IP and ip1 = the Tencent Cloud public IP:
ip master
ip1 slave
On the Tencent Cloud machine, configure ip = the Alibaba Cloud public IP and ip1 = the Tencent Cloud internal IP:
ip master
ip1 slave
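To make sure the new entries resolve, a quick ping from each side is a simple sanity check (note that if the cloud security group blocks ICMP, ping can fail even though name resolution works):
[root@master ~]# ping -c 3 slave
[root@slave ~]# ping -c 3 master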
2.2 Edit the hostname file
[root@master ~]# vim /etc/hostname
Delete the existing content.
On the master machine the file should contain: master; on the slave machine: slave.
Run "reboot" to restart the machine.
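Alternatively, on CentOS 7 you can set the hostname with hostnamectl, which rewrites /etc/hostname for you and takes effect without a reboot (hostnames as laid out above):
[root@master ~]# hostnamectl set-hostname master
[root@slave ~]# hostnamectl set-hostname slave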
Verify:
[root@master ~]# hostname
master
[root@slave ~]# hostname
slave
3. Passwordless SSH between the machines
3.1 Run the following on every machine (just press Enter at each prompt)
[root@master .ssh]# ssh-keygen -t rsa -P ''
Once it finishes, the .ssh directory will contain files like these:
[root@master ~]# cd ~/.ssh
[root@master .ssh]# ls
authorized_keys id_rsa id_rsa.pub known_hosts
3.2 Copy each machine's public key (the contents of id_rsa.pub) into every machine's authorized_keys; that is, every machine must hold the public keys of all the other machines.
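One way to do the copying (a sketch, assuming password login between the machines is still enabled) is ssh-copy-id, which appends your public key to the remote machine's authorized_keys; run it on each machine against the other:
[root@master .ssh]# ssh-copy-id root@slave
[root@slave .ssh]# ssh-copy-id root@master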
On master, run:
[root@master src]# ssh slave
to check whether the passwordless login works.
4. Install Java (only on master; after configuring, copy everything to the slave machine)
Download the JDK:
[root@master src]# wget http://download.oracle.com/otn-pub/java/jdk/8u171-b11/512cd62ec5174c3487ac17c61aaa89e8/jdk-8u171-linux-x64.tar.gz
(Note: this link points at 8u171, while the rest of this guide uses jdk-8u172-linux-x64.tar.gz; download the version that matches your tarball.)
4.1 Extract the JDK
[root@master src]# tar -zxvf jdk-8u172-linux-x64.tar.gz
Here I extract it into /usr/local/src.
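If the tarball sits somewhere else, tar's -C flag extracts it straight into the target directory, for example:
[root@master ~]# tar -zxvf jdk-8u172-linux-x64.tar.gz -C /usr/local/src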
4.2 Edit /etc/profile to configure the environment variables
Append to the end of the file:
# SET JAVA_PATH
export JAVA_HOME=/usr/local/src/jdk1.8.0_172
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
Note: adjust JAVA_HOME to wherever your JDK actually lives; mine is under /usr/local/src/.
Then source the file so the variables take effect:
[root@master src]# source /etc/profile
Verify:
[root@master ~]# java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
4.3 Send the JDK to the slave machine
[root@master src]# scp -r jdk1.8.0_172 root@slave:/usr/local/src/
4.4 Send the /etc/profile file to the slave machine
[root@master src]# scp /etc/profile root@slave:/etc/
Then source it on the slave and verify:
[root@slave ~]# source /etc/profile
[root@slave ~]# java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
5. Install and configure Hadoop
Download Hadoop:
[root@master src]# wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.7.7/hadoop-2.7.7.tar.gz
5.1 Extract Hadoop
[root@master src]# tar -zxvf hadoop-2.7.7.tar.gz
Here I extract it into /usr/local/src.
5.2 Configure the following files (under hadoop-2.7.7/etc/hadoop)
- hadoop-env.sh
Append at the end:
export JAVA_HOME=/usr/local/src/jdk1.8.0_172
- yarn-env.sh
Add on the line below "JAVA_HEAP_MAX=-Xmx1000m":
export JAVA_HOME=/usr/local/src/jdk1.8.0_172
- slaves
Delete the line "localhost" and add:
slave
- hdfs-site.xml
Edit it to:
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/src/hadoop-2.7.7/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/src/hadoop-2.7.7/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
</configuration>
- yarn-site.xml (all of the properties below are YARN settings, so they belong here rather than in mapred-site.xml)
Edit it to:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8035</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
    <!-- memory allocation -->
    <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>5</value>
    </property>
    <!-- log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
        <value>3600</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/tmp/logs</value>
    </property>
</configuration>
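The walkthrough above does not show a core-site.xml, but the tmp directory created in 5.3 and the port 9000 opened in section 6 imply one was configured; a minimal sketch of what it would contain (assumed values, adjust to your own paths):
- core-site.xml
<configuration>
    <!-- assumed from the port-9000 rule mentioned in section 6 -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <!-- assumed from the tmp directory created in 5.3 -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/src/hadoop-2.7.7/tmp</value>
    </property>
</configuration>
Likewise, if you want MapReduce jobs to run on YARN, mapred-site.xml (copied from mapred-site.xml.template) would normally set mapreduce.framework.name to yarn.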
If you're curious what each of these settings does, look them up on Baidu~
5.3 Create the directories
Back in the hadoop-2.7.7 directory, create three directories:
[root@master hadoop-2.7.7]# mkdir -p dfs/name
[root@master hadoop-2.7.7]# mkdir -p dfs/data
[root@master hadoop-2.7.7]# mkdir tmp
5.4 Edit /etc/profile to configure the environment variables
Append to the end of the file:
# SET HADOOP_PATH
export HADOOP_HOME=/usr/local/src/hadoop-2.7.7
export PATH=$PATH:$HADOOP_HOME/bin
Note: adjust HADOOP_HOME to wherever your Hadoop actually lives; mine is under /usr/local/src/.
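Optionally you can also put sbin on the PATH so the start/stop scripts work from any directory (this guide instead runs them from inside sbin in 5.7):
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin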
Then source the file so the variables take effect:
[root@master src]# source /etc/profile
Verify:
[root@master hadoop-2.7.7]# hadoop version
Hadoop 2.7.7
Subversion Unknown -r c1aad84bd27cd79c3d1a7dd58202a8c3ee1ed3ac
Compiled by stevel on 2018-07-18T22:47Z
Compiled with protoc 2.5.0
From source with checksum 792e15d20b12c74bd6f19a1fb886490
This command was run using /usr/local/src/hadoop-2.7.7/share/hadoop/common/hadoop-common-2.7.7.jar
5.5 Send Hadoop to the slave machine
[root@master src]# scp -r hadoop-2.7.7 root@slave:/usr/local/src/
5.6 Send the /etc/profile file to the slave machine
[root@master src]# scp /etc/profile root@slave:/etc/
Then source it on the slave and verify:
[root@slave ~]# hadoop version
Hadoop 2.7.7
Subversion Unknown -r c1aad84bd27cd79c3d1a7dd58202a8c3ee1ed3ac
Compiled by stevel on 2018-07-18T22:47Z
Compiled with protoc 2.5.0
From source with checksum 792e15d20b12c74bd6f19a1fb886490
This command was run using /usr/local/src/hadoop-2.7.7/share/hadoop/common/hadoop-common-2.7.7.jar
5.7 Format the cluster on master
[root@master logs]# hdfs namenode -format
Start the cluster:
[root@master sbin]# ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/src/hadoop-2.7.7/logs/hadoop-root-namenode-master.out
slave: starting datanode, logging to /usr/local/src/hadoop-2.7.7/logs/hadoop-root-datanode-slave.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /usr/local/src/hadoop-2.7.7/logs/hadoop-root-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop-2.7.7/logs/yarn-root-resourcemanager-master.out
slave: starting nodemanager, logging to /usr/local/src/hadoop-2.7.7/logs/yarn-root-nodemanager-slave.out
Verify:
[root@master sbin]# jps
5301 ResourceManager
5558 Jps
4967 NameNode
5150 SecondaryNameNode
[root@slave hadoop-2.7.7]# jps
8725 Jps
8470 DataNode
8607 NodeManager
At this point the cluster is up and running.
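As a quick smoke test beyond jps (assuming all the daemons above came up), you can try a simple HDFS operation; the second command should list the directory the first one created:
[root@master ~]# hdfs dfs -mkdir /test
[root@master ~]# hdfs dfs -ls /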
6. Pitfalls on Alibaba Cloud
- The Alibaba Cloud server has its firewall enabled by default; turn it off before starting (see the commands after this list).
- Alibaba Cloud servers only open ports 22, 80, and 443 by default, so you need to add firewall rules in the console to open port 9000; if uploading files to HDFS fails, open port 50010 as well.
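For the first point, a sketch of disabling the OS-level firewall on CentOS 7 (run on both machines; the port rules in the second point still have to be added in the cloud console):
[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl disable firewalld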