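I finally got Oozie, the mysterious elephant keeper, installed and configured. It had been bothering me for days, and when I saw the screen below I felt it was worth it!
Enough talk. Here is how the compiling and installing went:
(A Hadoop 2.5.2 HA environment is already set up, along with Hive, HBase, Flume and Storm.
Linux environment: CentOS 6.5 64-bit
JDK: 1.7
MySQL: already installed
Apache Maven 3.1.1
Download the Oozie package: oozie-4.1.0.tar.gz http://mirror.bit.edu.cn/apache/oozie/
Download ext-2.2.zip: the page http://oozie.apache.org/docs/4.0.1/DG_QuickStart.html has a link to ExtJS
)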
1. Compile
Go to http://mirrors.cnnic.cn/apache/oozie/4.2.0/ for the source package, then unpack it:
tar -zxvf oozie-4.2.0.tar.gz
cd oozie-4.2.0/bin
./mkdistro.sh -DskipTests -Phadoop-2 -Dhadoop.auth.version=2.5.2 -Ddistcp.version=2.5.2 -Dsqoop.version=1.4.4 -Dhive.version=0.13.1 -Dtomcat.version=7.0.52
Then came a long wait. Network problems kept getting in the way, so I reran the command above again and again, and eventually it got this far:

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Oozie Main ................................. SUCCESS [6.824s]
[INFO] Apache Oozie Hadoop Utils hadoop-2-4.2.0 .......... SUCCESS [9.525s]
[INFO] Apache Oozie Hadoop Distcp hadoop-2-4.2.0 ......... SUCCESS [0.444s]
[INFO] Apache Oozie Hadoop Auth hadoop-2-4.2.0 Test ...... SUCCESS [1.027s]
[INFO] Apache Oozie Hadoop Libs .......................... SUCCESS [0.101s]
[INFO] Apache Oozie Client ............................... SUCCESS [5:08.683s]
[INFO] Apache Oozie Share Lib Oozie ...................... SUCCESS [9.351s]
[INFO] Apache Oozie Share Lib HCatalog ................... SUCCESS [11.656s]
[INFO] Apache Oozie Share Lib Distcp ..................... SUCCESS [3.151s]
[INFO] Apache Oozie Core ................................. SUCCESS [3:53.804s]
[INFO] Apache Oozie Share Lib Streaming .................. SUCCESS [13.230s]
[INFO] Apache Oozie Share Lib Pig ........................ SUCCESS [15.454s]
[INFO] Apache Oozie Share Lib Hive ....................... SUCCESS [13.747s]
[INFO] Apache Oozie Share Lib Hive 2 ..................... SUCCESS [14.417s]
[INFO] Apache Oozie Share Lib Sqoop ...................... SUCCESS [5.546s]
[INFO] Apache Oozie Examples ............................. SUCCESS [10.178s]
[INFO] Apache Oozie Share Lib Spark ...................... SUCCESS [15.450s]
[INFO] Apache Oozie Share Lib ............................ SUCCESS [52.422s]
[INFO] Apache Oozie Docs ................................. FAILURE [9.477s]
[INFO] Apache Oozie WebApp ............................... SKIPPED
[INFO] Apache Oozie Tools ................................ SKIPPED
[INFO] Apache Oozie MiniOozie ............................ SKIPPED
[INFO] Apache Oozie Distro ............................... SKIPPED
[INFO] Apache Oozie ZooKeeper Security Tests ............. SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12:21.113s
[INFO] Finished at: Wed Oct 26 05:39:28 CST 2016
[INFO] Final Memory: 174M/482M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-site-plugin:2.0-beta-6:site (default) on project oozie-docs: The site descriptor cannot be resolved from the repository: Could not transfer artifact org.apache:apache:xml:site_en:16 from/to Codehaus repository (http://repository.codehaus.org/): repository.codehaus.org: Name or service not known
[ERROR] org.apache:apache:xml:16
[ERROR]
[ERROR] from the specified remote repositories:
[ERROR] central (http://repo1.maven.org/maven2, releases=true, snapshots=false),
[ERROR] ce d (https://repository.cloudera.com/cloudera/ext-release-local/, releases=true, snapshots=false),
[ERROR] Codehaus repository (http://repository.codehaus.org/, releases=true, snapshots=false),
[ERROR] cloudera com (https://repository.cloudera.com/content/repositories/releases/, releases=true, snapshots=false),
[ERROR] central maven (http://central.maven.org/maven2/, releases=true, snapshots=false),
[ERROR] apache.snapshots.repo (https://repository.apache.org/content/groups/snapshots, releases=true, snapshots=true),
[ERROR] datanucleus (http://www.datanucleus.org/downloads/maven2, releases=true, snapshots=false),
[ERROR] apache.snapshots (http://repository.apache.org/snapshots, releases=false, snapshots=true): Unknown host repository.codehaus.org: Name or service not known
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :oozie-docs
ERROR, Oozie distro creation failed
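With a log this long, the two facts worth pulling out are which module failed and which host could not be reached. Here is a minimal sketch, assuming the build output was saved with one [INFO]/[ERROR] entry per line. The function names failed_modules and unknown_hosts are made up for illustration and are not part of Maven or Oozie.

```shell
# Print the reactor modules that Maven marked as FAILURE.
# Reads a Maven build log on stdin (one log entry per line is assumed).
failed_modules() {
  sed -n 's/^\[INFO\] \(.*[^ .]\) \.\.* FAILURE.*$/\1/p'
}

# Print each host that the log reports as "Unknown host", without duplicates.
# Reads a Maven build log on stdin.
unknown_hosts() {
  sed -n 's/.*Unknown host \([^: ]*\).*/\1/p' | sort -u
}
```

If the output is saved to a file (build.log is a hypothetical name), failed_modules < build.log and unknown_hosts < build.log list the failing module and the unreachable repository host.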
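Out of options, I then tried compiling version 3.3.2 instead, with this result:
At that point the build really could not go any further either.
So I went online to look for the cause.
Everything I found said it was a problem with the Maven repository addresses, so I changed the repository configuration.
In pom.xml under the Oozie root directory,
I changed the repositories inside <repositories></repositories> as follows: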
<repositories>
  <repository>
    <id>cloudera com</id>
    <url>https://repository.cloudera.com/content/repositories/releases/</url>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>central</id>
    <url>http://repo1.maven.org/maven2</url>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>central maven</id>
    <url>http://central.maven.org/maven2/</url>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>Codehaus repository</id>
    <url>http://repository.codehaus.org/</url>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>apache.snapshots.repo</id>
    <url>https://repository.apache.org/content/groups/snapshots</url>
    <name>Apache Snapshots Repository</name>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>datanucleus</id>
    <url>http://www.datanucleus.org/downloads/maven2</url>
    <name>Datanucleus</name>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
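After that I continued compiling:
[INFO] Apache Oozie Docs ................................. FAILURE [9.477s]
Pinging the address seemed to work.
I really do not know why. I tried several times without success; presumably something still needs changing somewhere. I had no fix, so I will come back to it later.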
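Since the build error complains about an unknown host, one thing worth checking is whether every repository host in pom.xml actually resolves from the build machine. This is a rough sketch and not something I ran at the time. It assumes each <url> element sits on its own line and that getent is available (it ships with glibc on CentOS). The function names are hypothetical.

```shell
# Print the distinct host names used in <url> elements of a pom.xml.
# $1 = path to pom.xml; assumes each <url>...</url> is on its own line.
repo_hosts() {
  sed -n 's|.*<url>[a-z]*://\([^/<]*\).*|\1|p' "$1" | LC_ALL=C sort -u
}

# Report whether each repository host resolves from this machine.
# $1 = path to pom.xml
check_repo_hosts() {
  repo_hosts "$1" | while read -r host; do
    if getent hosts "$host" >/dev/null 2>&1; then
      echo "ok       $host"
    else
      echo "no DNS   $host"
    fi
  done
}
```

Running check_repo_hosts pom.xml from the Oozie root directory prints one line per host. Any host marked "no DNS" is a candidate for removal from the repository list.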
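2. Change of plan: use a prebuilt package
I went with the one from Cloudera:
http://archive.cloudera.com/cdh5/cdh/5/
This one: http://archive.cloudera.com/cdh5/cdh/5/oozie-4.1.0-cdh5.8.2.tar.gz
After downloading, unpack it: tar -zxvf oozie-4.1.0-cdh5.8.2.tar.gz
This build targets Hadoop 2.6,
so I switched it over to my own Hadoop version, 2.5.2.
The procedure is as follows (many thanks to the author);
it follows this write-up: http://www.mamicode.com/info-detail-490284.html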
1. Unpack
cp oozie-4.1.0-distro.tar.gz /home/hadoop
cd /home/hadoop
tar xvzf oozie-4.1.0-distro.tar.gz
/home/hadoop/oozie-4.1.0 is now the Oozie root directory.
2. Set environment variables
vi /etc/profile
export OOZIE_HOME=/home/hadoop/oozie-4.1.0
export PATH=$PATH:$OOZIE_HOME/bin
Afterwards, run source /etc/profile to make it take effect.
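If you script this step instead of editing /etc/profile by hand, rerunning the script should not pile up duplicate export lines. Here is a small sketch of one way to do that. append_once is a made-up helper name, not a standard command.

```shell
# Append a line to a file only if that exact line is not already present.
# $1 = file, $2 = line to add
append_once() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

For example, as root: append_once /etc/profile 'export OOZIE_HOME=/home/hadoop/oozie-4.1.0', followed by the same call for the PATH line. The single quotes keep $PATH and $OOZIE_HOME unexpanded in the file.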
3. Add the jar files
Create a libext folder under OOZIE_HOME:
mkdir libext
Copy all of the Hadoop jars into it:
cp $HADOOP_HOME/share/hadoop/*/hadoop-*.jar ./libext/
cp $HADOOP_HOME/share/hadoop/*/lib/*.jar ./libext/
cp mysql-connector-java-5.1.29-bin.jar ./libext/
Delete jasper*.jar, servlet-api.jar and jsp-api.jar from libext. They conflict with the jars under oozie-4.0.1/oozie-server/lib/, and the war would otherwise fail with:
org.eclipse.jdt.internal.compiler.CompilationResult.getProblems()[Lorg/eclipse/jdt/core/compiler/IProblem
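The cleanup of the conflicting jars can be wrapped in a small function so it is easy to repeat after copying jars again. This is only a sketch. The trailing wildcards on servlet-api and jsp-api are my assumption, added so that versioned file names (for example servlet-api-2.5.jar) are caught too. remove_conflicting_jars is a hypothetical name.

```shell
# Remove the jars that clash with the ones bundled under oozie-server/lib.
# $1 = libext directory
# rm -f stays quiet when a pattern matches nothing.
remove_conflicting_jars() {
  rm -f -- "$1"/jasper*.jar "$1"/servlet-api*.jar "$1"/jsp-api*.jar
}
```

From OOZIE_HOME this would be called as remove_conflicting_jars ./libext before building the war.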
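4. Build the war
bin/oozie-setup.sh prepare-war
This generates $OOZIE_HOME/oozie-server/webapps/oozie.war
Unpacking ext-2.2.zip gives an ext-2.2 folder, which has to go into oozie.war. The author of the referenced post waits until the service has been started, at which point oozie.war is unpacked into an oozie directory, and then drops ext-2.2 straight into it.
(What I did was download oozie.war to my desktop, open it with an archive tool and drag ext-2.2.zip into it. I found out later that this was unnecessary: when I opened the war, it was already there.)
Notes: 1. I saw online that the following command can build oozie.war with ext-2.2.zip already packed into it:
./addtowar.sh -inputwar $OOZIE_HOME/oozie.war -outputwar $OOZIE_HOME/oozie-server/webapps/oozie.war -hadoop 2.3.0 $HADOOP_HOME -extjs /home/oozie/ext-2.2.zip
2. You need the zip and unzip commands, otherwise you will get an error. As root, install them with yum -y install unzip and yum -y install zip.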
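To catch the missing zip/unzip problem before the war build fails halfway, a quick pre-flight check helps. Here is a minimal sketch. require_cmds is a name I made up; it relies only on the shell's built-in command -v.

```shell
# Return non-zero and name the missing tools if any listed command is absent.
# Arguments: the command names to look for.
require_cmds() {
  missing=""
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
  done
  if [ -n "$missing" ]; then
    echo "missing commands:$missing" >&2
    return 1
  fi
}
```

Calling require_cmds zip unzip before bin/oozie-setup.sh prepare-war tells you up front whether the yum install step is still needed.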
5. Edit the configuration
vi $OOZIE_HOME/conf/oozie-site.xml
<property>
  <name>oozie.service.JPAService.jdbc.driver</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>JDBC driver class.</description>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.url</name>
  <value>jdbc:mysql://mysql-server:3306/oozie</value>
  <description>JDBC URL.</description>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.username</name>
  <value>root</value>
  <description>DB user name.</description>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.password</name>
  <value>mapengbo</value>
  <description>DB user password.</description>
</property>
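After editing, it is worth reading the values back to confirm that the file contains what you think it does. The sketch below is a deliberately naive reader. It assumes Hadoop-style <property> blocks in which every <name> comes before its <value>, and it is not a real XML parser. get_property is a hypothetical helper name.

```shell
# Print the value of one property from a Hadoop-style *-site.xml file.
# $1 = XML file, $2 = property name
# Remembers the most recent <name> and prints the first <value> that follows
# a matching name. Works when tags are on separate lines or on one line.
get_property() {
  awk -v want="$2" '
    /<name>/ { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
    /<value>/ && n == want {
      v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
      print v; exit
    }
  ' "$1"
}
```

For instance, get_property $OOZIE_HOME/conf/oozie-site.xml oozie.service.JPAService.jdbc.url should echo the JDBC URL set above.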
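6. Create the database
Create a database named oozie and grant privileges on it:
CREATE DATABASE oozie;
grant all ON oozie.* TO 'shirdrn'@'oozie-server' IDENTIFIED BY '0o21e';
FLUSH PRIVILEGES;
Generate the required database tables and run the script:
bin/ooziedb.sh create -sqlfile oozie.sql -run
Check that the oozie database now contains the Oozie tables.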
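Copying SQL from web pages often brings along curly quotes, which MySQL rejects. Generating the statements from a small function keeps the quotes plain ASCII. This is just a sketch. oozie_db_sql is a made-up name, and the GRANT ... IDENTIFIED BY form is MySQL 5.x syntax (newer servers expect a separate CREATE USER).

```shell
# Print the SQL that creates the Oozie database and grants access to it.
# $1 = database, $2 = user, $3 = host the user connects from, $4 = password
oozie_db_sql() {
  cat <<EOF
CREATE DATABASE $1;
GRANT ALL ON $1.* TO '$2'@'$3' IDENTIFIED BY '$4';
FLUSH PRIVILEGES;
EOF
}
```

The output can be piped straight into the client, for example oozie_db_sql oozie shirdrn oozie-server 0o21e | mysql -u root -p.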
7. Start the service
bin/oozied.sh start
Open the console at http://hadoop1:11000/oozie (hadoop1 is my hostname).
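The server takes a little while to come up after oozied.sh start, so a script that goes on to use it should wait first. Below is a generic retry helper as a sketch. wait_until is a hypothetical name, and the probe command is whatever you pass in.

```shell
# Retry a command until it succeeds or the attempts run out.
# $1 = maximum attempts, $2 = seconds between attempts, rest = command to run
wait_until() {
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    tries=$((tries - 1))
    # Only sleep if another attempt is coming.
    [ "$tries" -gt 0 ] && sleep "$delay"
  done
  return 1
}
```

With the Oozie client on the PATH, something like wait_until 30 2 oozie admin -oozie http://hadoop1:11000/oozie -status should return once the server answers, assuming the client exits non-zero while the server is unreachable.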
IV. Configure Hadoop's JobHistory server and proxy user
Edit $HADOOP_HOME/etc/hadoop/mapred-site.xml
and $OOZIE_HOME/conf/hadoop-conf/core-site.xml and add the following configuration:
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>node3:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>node3:19888</value>
</property>
<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
  <value>${hadoop.tmp.dir}/mr/history-tmp</value>
</property>
<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>${hadoop.tmp.dir}/mr/history-done</value>
</property>
The following needs to be added to Hadoop's core-site.xml:
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>
Here root is the Hadoop user, and hadoop.proxyuser.root.groups names the groups whose members root is allowed to impersonate. Restart Hadoop once this is configured.
More generally, the properties have the form hadoop.proxyuser.[USER].hosts and hadoop.proxyuser.[USER].groups.
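To make the [USER] placeholder concrete, here is a sketch that prints the two properties for any user name. proxyuser_properties is a made-up helper. The default of * for hosts and groups mirrors the values used above, though a real cluster may want something narrower.

```shell
# Print the two proxyuser properties for a given user.
# $1 = user, $2 = allowed hosts (default *), $3 = allowed groups (default *)
proxyuser_properties() {
  user=$1
  hosts=${2:-*}
  grps=${3:-*}
  cat <<EOF
<property>
  <name>hadoop.proxyuser.$user.hosts</name>
  <value>$hosts</value>
</property>
<property>
  <name>hadoop.proxyuser.$user.groups</name>
  <value>$grps</value>
</property>
EOF
}
```

For example, proxyuser_properties hadoop prints the block to paste into core-site.xml when Oozie runs as the hadoop user rather than root.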
Start Hadoop's JobHistory service:
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver //I restarted the whole Hadoop cluster here
Restart Oozie:
bin/oozied.sh start
V. Client test
tar -zxvf oozie-client-4.1.0.tar.gz //I used the one built during my earlier oozie 4.2.0 compile, since the Cloudera package I downloaded did not include it
//Download link: http://pan.baidu.com/s/1eSBOdEi password: q1nw
tar -zxvf oozie-examples.tar.gz
tar -zxvf oozie-sharelib-4.1.0.tar.gz
hdfs dfs -put examples hdfs:/myserver/user/hadoop/
hdfs dfs -put share /user/hadoop/ //I later found this does not work; oozie-site.xml has to point at a local directory path instead,
//by setting oozie.service.WorkflowAppService.system.libpath
A. Edit $OOZIE_HOME/conf/oozie-site.xml and add the following:
<property>
  <name>oozie.service.WorkflowAppService.system.libpath</name>
  <value>file:///home/${user.name}/oozie-4.1.0-cdh5.8.2/share/lib</value>
</property>
B. Edit $OOZIE_HOME/conf/hadoop-conf/core-site.xml and add the following. The values should match the Hadoop configuration; I looked them up at http://<your MapReduce hostname>:8088/conf and changed them accordingly (the same goes for the scheduler address):
<property>
  <name>yarn.resourcemanager.address</name>
  <value>node1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>node1:8030</value>
</property>
C. Change the oozie.service.HadoopAccessorService.hadoop.configurations property so that its value is *=HADOOP_HOME/etc/hadoop
//I did not really configure this one; you can look at how this person set it up: http://heylinux.com/archives/2836.html
D. Edit $OOZIE_HOME/examples/apps/map-reduce/job.properties (YARN no longer has a JobTracker, so jobTracker below takes the value of yarn.resourcemanager.address; oozie.wf.application.path is the HDFS path of the Oozie example application)
nameNode=hdfs://node1:9000
jobTracker=node1:8032
queueName=default
examplesRoot=examples
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce
outputDir=map-reduce
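If you move between clusters, only the first two lines of job.properties change, so the file is easy to generate. Here is a sketch of that. job_properties is a hypothetical function name, and the remaining lines are copied from the example above.

```shell
# Print a job.properties for the map-reduce example.
# $1 = NameNode URI, $2 = ResourceManager address (used as jobTracker on YARN)
job_properties() {
  printf 'nameNode=%s\n' "$1"
  printf 'jobTracker=%s\n' "$2"
  # The quoted 'EOF' keeps ${...} literal so that Oozie, not the shell,
  # expands these placeholders.
  cat <<'EOF'
queueName=default
examplesRoot=examples
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce
outputDir=map-reduce
EOF
}
```

For the cluster in this post that would be job_properties hdfs://node1:9000 node1:8032 > $OOZIE_HOME/examples/apps/map-reduce/job.properties.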
Run the oozie script from $OOZIE_HOME/oozie-client-4.0.1/bin to execute the workflow:
./oozie job -oozie http://node3:11000/oozie -config $OOZIE_HOME/examples/apps/map-reduce/job.properties -run
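To follow the job from the command line as well as the console, you need its id. Here is a small sketch for capturing it, assuming the client prints the id on a line of the form "job: <id>" after a successful submit. parse_job_id is a made-up name.

```shell
# Extract the workflow id from the "job: <id>" line printed on submission.
# Reads the client output on stdin.
parse_job_id() {
  sed -n 's/^job: *//p'
}
```

Piping the submit command above through parse_job_id and storing the result in a variable, say id, lets you check progress later with ./oozie job -oozie http://node3:11000/oozie -info "$id".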
Open the console at http://hadoop1:11000/oozie
Done!