1. Base Environment
1.1 Installation Notes




2. Host Configuration
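Host configuration here usually means mapping the hostnames master, slave1 and slave2 (the names used throughout this article) to IP addresses in /etc/hosts on every node. A minimal sketch, assuming three nodes; the IP addresses and both function names are hypothetical placeholders, not the values from the original setup:

```shell
# Print /etc/hosts entries for the three nodes.
# The IP addresses are hypothetical placeholders; substitute your own.
print_cluster_hosts() {
  cat <<'EOF'
192.168.1.10 master
192.168.1.11 slave1
192.168.1.12 slave2
EOF
}

# Append the entries to a hosts file, skipping hostnames already present.
# $1: target file (defaults to /etc/hosts; writing there requires root).
add_cluster_hosts() {
  target="${1:-/etc/hosts}"
  print_cluster_hosts | while read -r ip name; do
    grep -qw "$name" "$target" 2>/dev/null || echo "$ip $name" >> "$target"
  done
}
```

Running add_cluster_hosts as root on each node should leave all three machines able to resolve one another by name; running it a second time adds nothing.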

3. Installing and Configuring Hadoop
3.1 Create the Directories
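The configuration files later in this article point hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir at /data/hdfs/tmp, /data/hdfs/name and /data/hdfs/data. A minimal sketch for creating those directories; the function name is made up for illustration:

```shell
# Create the local directories referenced by hadoop.tmp.dir,
# dfs.namenode.name.dir and dfs.datanode.data.dir.
# $1: base directory; /data/hdfs matches the values used in this article.
create_hdfs_dirs() {
  base="${1:-/data/hdfs}"
  mkdir -p "$base/tmp" "$base/name" "$base/data"
}
```

Calling create_hdfs_dirs with no argument, as a user allowed to write under /data, creates all three directories; mkdir -p makes the call safe to repeat.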
3.2 Download

3.3 Configure Environment Variables
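A minimal sketch of the environment variables, assuming Hadoop is unpacked at /data/hadoop-2.7.1 (the path used by the commands later in this article). Which profile file to put them in is an assumption; /etc/profile or ~/.bashrc are common choices:

```shell
# Hadoop install location used throughout this article.
export HADOOP_HOME="${HADOOP_HOME:-/data/hadoop-2.7.1}"
# Put the hadoop/hdfs commands (bin) and the start/stop scripts (sbin) on PATH.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

After re-sourcing the profile file, commands such as hdfs and start-dfs.sh can be called without their full path.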


3.4 Hadoop Configuration

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/data/hdfs/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>
</configuration>
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/data/hdfs/name</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/data/hdfs/data</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>
Note: the values of dfs.namenode.name.dir and dfs.datanode.data.dir must match the directories created earlier.
Copy the template to generate the XML file, using the following command:
cp mapred-site.xml.template mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

</configuration>
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:18040</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:18030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:18088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:18025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:18141</value>
    </property>
    <!-- Aux-service names may contain only letters, digits and underscores,
         so Hadoop 2.7.1 expects mapreduce_shuffle, not mapreduce.shuffle. -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

Finally, use scp to copy the entire hadoop-2.7.1 directory, including its subdirectories, to the same location on slave1 and slave2:
scp -r /data/hadoop-2.7.1 root@slave1:/data
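The command above covers slave1; the same copy is needed for slave2. A small loop over both hosts might look like the sketch below. The function name and the SCP override (used only to allow a dry run) are assumptions made for this example:

```shell
# Copy the Hadoop directory to each slave under the same parent path.
# Set SCP="echo scp" for a dry run that only prints the commands.
sync_hadoop_to_slaves() {
  src="${1:-/data/hadoop-2.7.1}"
  for host in slave1 slave2; do
    ${SCP:-scp} -r "$src" "root@$host:$(dirname "$src")"
  done
}
```

Running SCP="echo scp" sync_hadoop_to_slaves first shows exactly which commands would be executed before anything is transferred.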
Run the jps command on the master; the output is as follows:
4.3 Start the DataNodes

master
slave1
slave2
4.4 Start YARN
This shows that the ResourceManager is running normally.

4.5 Check Whether the Cluster Started Successfully:
jps
On the master:
SecondaryNameNode
ResourceManager
NameNode
On the slaves:
NodeManager
DataNode
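Checking the jps output by eye works for three nodes, but it can also be scripted. A minimal sketch, assuming jps prints one "pid name" pair per line; the function name is made up for this example:

```shell
# Check that every expected daemon name appears in jps output.
# Reads jps output on stdin; the arguments are the expected daemon names.
# Prints each missing daemon and returns non-zero if any is absent.
check_daemons() {
  jps_output=$(cat)
  missing=0
  for daemon in "$@"; do
    # Match the name as the whole second field, so that "NameNode"
    # is not satisfied by "SecondaryNameNode".
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "missing: $daemon"
      missing=1
    fi
  done
  return "$missing"
}
```

On the master this could be used as jps | check_daemons NameNode SecondaryNameNode ResourceManager, and on each slave as jps | check_daemons DataNode NodeManager.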
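5. Testing Hadoop
5.1 Testing HDFS
Finally, test whether the Hadoop cluster you built by hand works correctly. The test commands are shown in the figure below: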
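The original screenshot of the test commands is not available here, so the following is only a sketch of what a basic HDFS check could look like, not the author's exact commands. The /smoke-test path, the function name and the HDFS override variable are hypothetical:

```shell
# Minimal HDFS smoke test: create a directory, upload a file,
# list the directory and read the file back.
# HDFS defaults to the binary path used in this article;
# set HDFS="echo hdfs" for a dry run that only prints the commands.
hdfs_smoke_test() {
  hdfs_cmd="${HDFS:-/data/hadoop-2.7.1/bin/hdfs}"
  local_file=$(mktemp)
  echo "hello hadoop" > "$local_file"
  $hdfs_cmd dfs -mkdir -p /smoke-test &&
  $hdfs_cmd dfs -put -f "$local_file" /smoke-test/hello.txt &&
  $hdfs_cmd dfs -ls /smoke-test &&
  $hdfs_cmd dfs -cat /smoke-test/hello.txt
  status=$?
  rm -f "$local_file"
  return "$status"
}
```

If the cluster is healthy, the last command should print the uploaded line back; a failure at any step stops the chain and is reported through the return code.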
5.2 Check the Cluster Status
/data/hadoop-2.7.1/bin/hdfs dfsadmin -report
5.3 Testing YARN

5.4 Testing MapReduce
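One common way to exercise MapReduce on YARN is to run the pi example from the examples jar bundled with the Hadoop 2.7.1 distribution. Whether this is the job the author ran is not known; the function name and the HADOOP override variable below are assumptions for this sketch:

```shell
# Run the bundled "pi" example as a MapReduce-on-YARN check.
# Arguments to pi: number of map tasks, then samples per map
# (small values keep the run short).
# Set HADOOP="echo hadoop" for a dry run that only prints the command.
run_pi_example() {
  hadoop_home="${HADOOP_HOME:-/data/hadoop-2.7.1}"
  hadoop_cmd="${HADOOP:-$hadoop_home/bin/hadoop}"
  $hadoop_cmd jar \
    "$hadoop_home/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar" \
    pi 2 10
}
```

A successful run ends with an estimated value of Pi, and the job should also appear in the ResourceManager web UI.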

5.5 View HDFS in the Browser:
http://115.29.51.97:50070/dfshealth.html#tab-overview
6. Problems Encountered While Configuring and Running Hadoop
6.1 JAVA_HOME Is Not Set

If this error appears, edit /data/hadoop-2.7.1/etc/hadoop/hadoop-env.sh and add the JAVA_HOME path.
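A minimal sketch of that edit. The function name is hypothetical, and the JDK path is passed in because it differs from system to system:

```shell
# Set JAVA_HOME explicitly in hadoop-env.sh.
# $1: JDK path (hypothetical example: /usr/lib/jvm/java-8-openjdk-amd64)
# $2: hadoop-env.sh location (defaults to the path used in this article)
set_hadoop_java_home() {
  java_home="$1"
  env_file="${2:-/data/hadoop-2.7.1/etc/hadoop/hadoop-env.sh}"
  # Appending is enough: the file is sourced as a shell script,
  # so the last JAVA_HOME assignment wins.
  echo "export JAVA_HOME=$java_home" >> "$env_file"
}
```

The change has to be made on every node (or made once on the master before the directory is copied to the slaves).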
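6.2 Incompatible clusterIDs
Configuring a Hadoop cluster rarely succeeds on the first attempt; it usually goes through a cycle of configure -> run -> ... -> configure -> run. So when a DataNode fails to start, checking the logs often reveals the following problem: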
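This error typically shows up after the NameNode has been formatted again: formatting generates a new clusterID, while the DataNodes still hold the old one in their storage directory. A minimal sketch for comparing the two IDs, assuming the VERSION files sit under the name and data directories configured earlier (/data/hdfs/name/current/VERSION and /data/hdfs/data/current/VERSION); the function names are made up for this example:

```shell
# Print the clusterID stored in a Hadoop VERSION file.
cluster_id() {
  sed -n 's/^clusterID=//p' "$1"
}

# Compare the NameNode and DataNode clusterIDs.
# $1: NameNode VERSION file, $2: DataNode VERSION file.
# Returns 0 only if both IDs are present and identical.
cluster_ids_match() {
  nn_id=$(cluster_id "$1")
  dn_id=$(cluster_id "$2")
  [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}
```

Since the NameNode and DataNodes run on different machines here, cluster_id can be run on each node and the outputs compared. If they differ, the usual remedies are to set the DataNode's clusterID to the NameNode's value, or to clear the DataNode data directory, which discards any blocks stored there.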
6.3 The NativeCodeLoader Warning
While testing Hadoop, attentive readers may have noticed the warning in the screenshots:
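Learning is a process of continually imitating, practicing, and finally creating something original.
I may never write a post that surpasses what is already online on the same type and topic, so why write at all?
For me, a blog post is mainly a summary for myself. I write as if I had an audience, since teaching is the best way to learn (see the figure below). As for readers, if anyone picks up something useful along the way, it is a win-win.
Of course, my ability is limited and the article may contain inaccurate descriptions; corrections and additions are welcome!
Thank you for reading. If this article was useful to you, please give it a like as encouragement.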