Build:
0. Environment preparation
maven (download and install, configure the environment variables, edit settings.xml to add the Aliyun mirror)
gcc-c++
zlib-devel
autoconf
automake
libtool
All of these can be installed with yum: yum -y install gcc-c++ lzo-devel zlib-devel autoconf automake libtool
1. Download, build, and install LZO
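Before starting the build it can help to confirm the tools are actually on PATH. The following is a minimal sketch, not part of the original notes: the helper name missing_tools is hypothetical, and the command names (for example g++ for the gcc-c++ package) are assumptions about what the packages above install.

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper name used only for this sketch.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Command names assumed to come from the packages listed above.
missing_tools mvn gcc g++ autoconf automake libtool
```

If nothing is printed, every listed command was found.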
wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.10.tar.gz
tar -zxvf lzo-2.10.tar.gz
cd lzo-2.10
./configure --prefix=/usr/local/hadoop/lzo/
make
make install
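A quick way to confirm that the install landed under the chosen prefix is to look for the LZO header and library files. This is a sketch under assumptions: check_lzo_prefix is a hypothetical helper, and it only checks that include/lzo/lzo1x.h and some lib/liblzo2.* file exist, not that they are usable.

```shell
# Return 0 if the prefix contains an LZO header and a liblzo2 library file.
# check_lzo_prefix is a hypothetical helper name used only for this sketch.
check_lzo_prefix() {
  prefix=$1
  # lzo1x.h is one of the public headers installed by LZO 2.x.
  [ -f "$prefix/include/lzo/lzo1x.h" ] || return 1
  # Accept a static (.a) or shared (.so) build of the library.
  for lib in "$prefix"/lib/liblzo2.*; do
    [ -e "$lib" ] && return 0
  done
  return 1
}
```

For the prefix used above, the call would be: check_lzo_prefix /usr/local/hadoop/lzo && echo "LZO installed"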
2. Build hadoop-lzo from source
2.1 Download the hadoop-lzo source from https://github.com/twitter/hadoop-lzo/archive/master.zip
2.2 After unpacking, edit pom.xml so that the Hadoop version matches your cluster:
<hadoop.current.version>2.7.2</hadoop.current.version>
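2.3 Export two temporary environment variables so the build can find the LZO headers and libraries installed in step 1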
export C_INCLUDE_PATH=/usr/local/hadoop/lzo/include
export LIBRARY_PATH=/usr/local/hadoop/lzo/lib
2.4 Build
Change into hadoop-lzo-master and run the Maven build:
mvn package -Dmaven.test.skip=true
2.5 Go into target and copy hadoop-lzo-0.4.21-SNAPSHOT.jar onto Hadoop's classpath, e.g. ${HADOOP_HOME}/share/hadoop/common
2.6 Edit core-site.xml to add LZO compression support
<configuration>
<property>
<name>io.compression.codecs</name>
<value>
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.BZip2Codec,
org.apache.hadoop.io.compress.SnappyCodec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec
</value>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
</configuration>
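After editing, a simple text check can confirm that both LZO codec class names made it into the file. This is only a sketch: has_lzo_codecs is a hypothetical helper, and it greps for the class names without validating the XML itself.

```shell
# Return 0 if the given config file mentions both LZO codec classes.
# has_lzo_codecs is a hypothetical helper name used only for this sketch.
has_lzo_codecs() {
  conf_file=$1
  grep -q 'com\.hadoop\.compression\.lzo\.LzoCodec' "$conf_file" &&
    grep -q 'com\.hadoop\.compression\.lzo\.LzopCodec' "$conf_file"
}
```

Assuming the config lives under ${HADOOP_HOME}/etc/hadoop, the call would be: has_lzo_codecs "${HADOOP_HOME}/etc/hadoop/core-site.xml" && echo "LZO codecs configured"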
Aliyun mirror entry for Maven's settings.xml (see step 0); it goes inside the <mirrors> element:
<mirror>
<id>nexus-aliyun</id>
<mirrorOf>*</mirrorOf>
<name>Nexus aliyun</name>
<url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>
Configure LZO:
1) Download the hadoop-lzo project
https://github.com/twitter/hadoop-lzo/archive/master.zip
2) The download is named hadoop-lzo-master and is a zip archive. Unzip it, then build it with Maven to produce hadoop-lzo-0.4.20.jar.
3) Put the built hadoop-lzo-0.4.20.jar into hadoop-2.7.2/share/hadoop/common/
[atguigu@hadoop102 common]$ pwd
/opt/module/hadoop-2.7.2/share/hadoop/common
[atguigu@hadoop102 common]$ ls
hadoop-lzo-0.4.20.jar
4) Sync hadoop-lzo-0.4.20.jar to hadoop103 and hadoop104
[atguigu@hadoop102 common]$ xsync hadoop-lzo-0.4.20.jar
5) Add LZO compression support to core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>io.compression.codecs</name>
<value>
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.BZip2Codec,
org.apache.hadoop.io.compress.SnappyCodec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec
</value>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
</configuration>
6) Sync core-site.xml to hadoop103 and hadoop104
[atguigu@hadoop102 hadoop]$ xsync core-site.xml
7) Start the cluster and check it
[atguigu@hadoop102 hadoop-2.7.2]$ sbin/start-dfs.sh
[atguigu@hadoop103 hadoop-2.7.2]$ sbin/start-yarn.sh
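(1) Check the web UI and the processes
- Web UI: http://hadoop102:50070
- Processes: run jps on each node to check its state.
(2) If startup fails:
- Check the logs: /home/atguigu/module/hadoop-2.7.2/logs
- If HDFS has entered safe mode, leave it with hdfs dfsadmin -safemode leave
- As a last resort, stop all processes, delete the data and logs directories, then reformat with hdfs namenode -format (this erases the existing HDFS data)
Test: run wordcount with LZO-compressed output
hadoop jar /opt/module/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec /input /output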
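To confirm the test job really wrote LZO-compressed output, one option is to look for files ending in .lzo (the default extension used by LzopCodec) in the output listing. The sketch below assumes a listing with one path per line; has_lzo_output is a hypothetical helper, not something from the original notes.

```shell
# Read a file listing on stdin (one path per line, as the last column of
# `hadoop fs -ls` output) and succeed if any line ends in .lzo.
# has_lzo_output is a hypothetical helper name used only for this sketch.
has_lzo_output() {
  grep -q '\.lzo$'
}
```

With the job above, the call would be: hadoop fs -ls /output | has_lzo_output && echo "LZO output found"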