The first post covered the overall installation steps. In practice, quite a few problems came up along the way; this second post lists all of the problems that were encountered:
1.1 Warning: the native library cannot be loaded
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
The hadoop-2.5.1 distribution from the official site is already built for 64-bit operating systems, yet this warning still appears.
1.1.1 Testing the native library
[root@cluster3 ~]# export HADOOP_ROOT_LOGGER=DEBUG,console
[root@cluster3 script]# hadoop fs -text /usr/local/script/hdfile1.txt
14/11/01 10:58:15 DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: /usr/local/hadoop/hadoop-2.5.1/lib/native/libhadoop.so.1.0.0: /lib64/libc.so.6: version `GLIBC_2.12' not found (required by /usr/local/hadoop/hadoop-2.5.1/lib/native/libhadoop.so.1.0.0)
14/11/01 10:58:15 DEBUG util.NativeCodeLoader: java.library.path=/usr/local/hadoop/hadoop-2.5.1/lib/native
14/11/01 10:58:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/01 10:58:15 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
[root@cluster1 lib64]# ll /lib64/libc.so.6
lrwxrwxrwx 1 root root 11 Oct 31 17:27 /lib64/libc.so.6 -> libc-2.5.so
The output above shows that libhadoop.so.1.0.0 requires GLIBC_2.12 while the system only provides libc-2.5, so at first glance glibc seems to need an upgrade. In fact it is enough to recompile Hadoop against the local glibc; upgrading glibc is not necessary.
Compile the Hadoop source code.
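A minimal sketch of rebuilding the native library from the Hadoop 2.5.1 source tree and verifying the result; the source directory path is only an example, and the build assumes the JDK, Maven, protobuf 2.5, cmake, and the zlib/openssl development packages are already installed:

# build the native libraries from the Hadoop source (run inside the source tree)
cd /usr/local/src/hadoop-2.5.1-src        # example path for the unpacked source
mvn package -Pdist,native -DskipTests -Dtar
# copy the freshly built libraries over the bundled ones
cp -r hadoop-dist/target/hadoop-2.5.1/lib/native/* /usr/local/hadoop/hadoop-2.5.1/lib/native/
# verify that the native library now loads against the local glibc
hadoop checknative -a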
2. Configure a local yum repository
Modify the yum configuration so that the local ISO is used as the yum repository.
Create the mount point and mount the ISO:
mkdir /mnt/cdrom
mount /dev/cdrom /mnt/cdrom
Copy the contents to the local disk:
cp -avf /mnt/cdrom /yum
Create the repo file:
vi /etc/yum.repos.d/CentOS-Local.repo
[Local]
name=Local Yum
baseurl=file:///yum/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
enabled=1
# cd /etc/yum.repos.d/
# mv CentOS-Base.repo CentOS-Base.repo.bak      # disable the default network yum repository
# cp CentOS-Media.repo CentOS-Media.repo.bak    # CentOS-Media.repo is the configuration file for the local yum repository
Edit the configuration file:
# vi CentOS-Media.repo
baseurl=file:///media/CentOS_6.3_Final/
enabled=1      # enable the repository
[root@cluster3 yum.repos.d]# yum -y install gcc
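After changing the repo files, it can help to rebuild yum's metadata cache and confirm the local repository is actually picked up before installing anything; a quick check using only standard yum commands:

# drop cached metadata from the old repositories
yum clean all
# list the enabled repositories; the Local / Media entry should appear here
yum repolist
# rebuild the metadata cache from the local repository
yum makecache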
3. Change the hostname after cloning a virtual machine
To change the hostname:
In /etc/sysconfig/network, set HOSTNAME to the new hostname.
In /etc/hosts, replace the old hostname with the new hostname.
Run reboot to restart the system.
Run hostname to check whether the change took effect.
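As a rough sketch, the same steps can be scripted; the old name cluster3 and the new name cluster4 below are only placeholders for whatever names the cloned machine actually uses:

# set the new hostname in the CentOS 6-style network file
sed -i 's/^HOSTNAME=.*/HOSTNAME=cluster4/' /etc/sysconfig/network
# replace the old hostname in /etc/hosts
sed -i 's/\bcluster3\b/cluster4/g' /etc/hosts
reboot
# after the reboot, confirm the new name
hostname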
4. Test job run
[root@cluster3 input]# hadoop dfs -mkdir /hadoop
[root@cluster3 input]# hadoop dfs -mkdir /hadoop/input
[root@cluster3 hadoop-2.5.1]# hadoop dfs -put /usr/local/hadoop/hadoop-2.5.1/test/text1.txt /hadoop/input
[root@cluster3 hadoop-2.5.1]# hadoop dfs -put /usr/local/hadoop/hadoop-2.5.1/test/text2.txt /hadoop/input
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
[root@cluster3 hadoop-2.5.1]# hadoop jar /usr/local/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /hadoop/input/* /hadoop/output
14/11/06 15:44:51 INFO client.RMProxy: Connecting to ResourceManager at cluster3/192.168.220.63:8032
14/11/06 15:44:52 INFO input.FileInputFormat: Total input paths to process : 2
14/11/06 15:44:52 INFO mapreduce.JobSubmitter: number of splits:2
14/11/06 15:44:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1415259711375_0001
14/11/06 15:44:53 INFO impl.YarnClientImpl: Submitted application application_1415259711375_0001
14/11/06 15:44:53 INFO mapreduce.Job: The url to track the job: http://cluster3:8088/proxy/application_1415259711375_0001/
14/11/06 15:44:53 INFO mapreduce.Job: Running job: job_1415259711375_0001
14/11/06 15:45:04 INFO mapreduce.Job: Job job_1415259711375_0001 running in uber mode : false
14/11/06 15:45:04 INFO mapreduce.Job: map 0% reduce 0%
14/11/06 15:45:57 INFO mapreduce.Job: map 100% reduce 0%
14/11/06 15:46:17 INFO mapreduce.Job: map 100% reduce 100%
14/11/06 15:46:18 INFO mapreduce.Job: Job job_1415259711375_0001 completed successfully
14/11/06 15:46:18 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=55
        FILE: Number of bytes written=291499
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=241
        HDFS: Number of bytes written=25
        HDFS: Number of read operations=9
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=106968
        Total time spent by all reduces in occupied slots (ms)=9679
        Total time spent by all map tasks (ms)=106968
        Total time spent by all reduce tasks (ms)=9679
        Total vcore-seconds taken by all map tasks=106968
        Total vcore-seconds taken by all reduce tasks=9679
        Total megabyte-seconds taken by all map tasks=109535232
        Total megabyte-seconds taken by all reduce tasks=9911296
    Map-Reduce Framework
        Map input records=2
        Map output records=4
        Map output bytes=41
        Map output materialized bytes=61
        Input split bytes=216
        Combine input records=4
        Combine output records=4
        Reduce input groups=3
        Reduce shuffle bytes=61
        Reduce input records=4
        Reduce output records=3
        Spilled Records=8
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=1085
        CPU time spent (ms)=3400
        Physical memory (bytes) snapshot=502984704
        Virtual memory (bytes) snapshot=2204106752
        Total committed heap usage (bytes)=257171456
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=25
    File Output Format Counters
        Bytes Written=25
[root@cluster3 hadoop-2.5.1]# hadoop dfs -ls /hadoop/
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Found 2 items
drwxr-xr-x   - root supergroup          0 2014-11-06 15:44 /hadoop/input
drwxr-xr-x   - root supergroup          0 2014-11-06 15:46 /hadoop/output
[root@cluster3 hadoop-2.5.1]# hadoop dfs -cat /hadoop/output/part-r-00000
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
hadoop  1
hello   2
world   1
5. Connection failure
[root@cluster3 hadoop-2.5.1]# hadoop jar /usr/local/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /hadoop/input/* /hadoop/output
14/11/06 11:28:15 INFO client.RMProxy: Connecting to ResourceManager at cluster3/192.168.220.63:8032
java.net.ConnectException: Call From cluster3/192.168.220.63 to cluster3:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Solution: the NameNode had not been started, so nothing was listening on cluster3:9000.
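A quick way to confirm and fix this, assuming the standard Hadoop 2.x sbin scripts under the install directory used throughout this post:

# check whether a NameNode process is running
jps | grep NameNode
# if it is not, start HDFS (NameNode, DataNodes, SecondaryNameNode)
/usr/local/hadoop/hadoop-2.5.1/sbin/start-dfs.sh
# or start only the NameNode daemon
/usr/local/hadoop/hadoop-2.5.1/sbin/hadoop-daemon.sh start namenode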
6. No DataNode
14/11/06 09:39:10 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hadoop/input/text1.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
Solution: hdfs namenode -format had been run several times. Each format gives the NameNode a new clusterID that no longer matches the existing DataNode storage, so the name and data directories have to be cleared manually before formatting again.
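A minimal sketch of the cleanup; the directory paths below are examples, and the real ones are whatever dfs.namenode.name.dir and dfs.datanode.data.dir point at in hdfs-site.xml (clear them on every node):

# stop HDFS before touching the storage directories
/usr/local/hadoop/hadoop-2.5.1/sbin/stop-dfs.sh
# example paths; clear the directories configured in hdfs-site.xml
rm -rf /usr/local/hadoop/hadoop-2.5.1/dfs/name/* /usr/local/hadoop/hadoop-2.5.1/dfs/data/*
# format once, then bring HDFS back up
hdfs namenode -format
/usr/local/hadoop/hadoop-2.5.1/sbin/start-dfs.sh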
7. Risk of data loss
2014-11-06 10:20:14,903 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
2014-11-06 10:20:14,903 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace edits storage directory (dfs.namenode.edits.dir) configured. Beware of data loss due to lack of redundant storage directories!
Fix: configure several directories for dfs.namenode.name.dir and dfs.namenode.edits.dir, each mounted on a different physical disk or on an NFS mount, so the fsimage and edit log are stored redundantly.
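As a sketch, the redundant directories are listed comma-separated in hdfs-site.xml; the paths below are only examples, not values taken from this cluster:

<!-- hdfs-site.xml: redundant NameNode metadata directories (example paths) -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data1/hadoop/name,file:///data2/hadoop/name,file:///nfs/hadoop/name</value>
</property>
<property>
  <name>dfs.namenode.edits.dir</name>
  <value>file:///data1/hadoop/edits,file:///data2/hadoop/edits</value>
</property>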
8. http://192.168.220.63:50070 cannot be reached, and the NodeManager starts but dies again after a short while.
Stop the firewall service:
[root@cluster3 hadoop]# service iptables stop
Disable it from starting at boot:
[root@cluster3 hadoop]# chkconfig iptables off
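A quick follow-up check, using the same SysV service tools already used above:

# confirm the firewall is stopped and will not come back after a reboot
service iptables status
chkconfig --list iptables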