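Oozie supports a Java action. A Java action runs the public static void main(String[] args) method of the Java class specified in the workflow, and it executes on the Hadoop cluster as a map-reduce job with a single mapper task.
The workflow job waits for the current Java program to finish before moving on to the next action, which means we can chain several actions to call several classes. When the Java class runs and exits normally, control follows the ok transition; when an exception is thrown, control follows the error transition.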
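To make the ok/error behaviour concrete, here is a minimal sketch of a class that a Java action could run. The class name SampleJavaAction and the word-counting logic are hypothetical and only serve as an illustration; the point is that returning normally from main leads to the ok transition, while an uncaught exception leads to the error transition.

```java
// Hypothetical main class for an Oozie Java action; the name is illustrative only.
public class SampleJavaAction {

    // Does the actual work; throws on invalid input so the action can fail.
    static int countWords(String text) {
        if (text == null || text.trim().isEmpty()) {
            throw new IllegalArgumentException("input text is empty");
        }
        return text.trim().split("\\s+").length;
    }

    // Oozie invokes this method inside the single mapper task.
    public static void main(String[] args) {
        String input = args.length > 0 ? args[0] : "hello oozie java action";
        // Returning normally sends the workflow to the <ok> transition.
        // An uncaught exception thrown here sends it to the <error> transition.
        System.out.println("words=" + countWords(input));
    }
}
```

Because success is signalled by main returning normally, the sketch lets main simply return instead of calling System.exit().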
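A Java action is made up of the following elements:
• job-tracker (required)
• name-node (required)
• prepare --- deletes files or creates directories before the job starts
• configuration --- passes the properties configured inside it to the job
• main-class (required) --- the fully qualified name (package.class) of the Java class to run
• java-opts --- options passed to the JVM that runs the Java class
• arg --- arguments passed to the Java application's main method
• file --- makes additional files available to the job, such as required jars
• archive --- makes archives available to the job
• capture-output --- captures the output of the Java program so that later nodes in the workflow can read it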
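As a sketch of how capture-output can be used from the Java side: when the element is present, the Java class is expected to write a Java properties file to the path given by the system property oozie.action.output.properties. The class name CaptureOutputSketch and the key recordCount below are hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Properties;

// Hypothetical class showing how a Java action can hand values back to the workflow.
public class CaptureOutputSketch {

    // Writes the key/value pairs as a Java properties file to the given path.
    static void writeOutput(File target, Properties values) throws IOException {
        try (OutputStream out = new FileOutputStream(target)) {
            values.store(out, null);
        }
    }

    public static void main(String[] args) throws IOException {
        // Oozie sets this system property when <capture-output/> is present.
        String path = System.getProperty("oozie.action.output.properties");
        if (path == null) {
            System.out.println("capture-output is not active; nothing written.");
            return;
        }
        Properties values = new Properties();
        values.setProperty("recordCount", "42"); // hypothetical key and value
        writeOutput(new File(path), values);
    }
}
```

A later node could then read the value with an EL expression such as ${wf:actionData('NODE-NAME')['recordCount']}, where NODE-NAME is the name of the action that wrote it.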
The syntax of the action is as follows:
<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:0.1">
...
<action name="[NODE-NAME]">
<java>
<job-tracker>[JOB-TRACKER]</job-tracker>
<name-node>[NAME-NODE]</name-node>
<prepare>
<delete path="[PATH]"/>
...
<mkdir path="[PATH]"/>
...
</prepare>
<job-xml>[JOB-XML]</job-xml>
<configuration>
<property>
<name>[PROPERTY-NAME]</name>
<value>[PROPERTY-VALUE]</value>
</property>
...
</configuration>
<main-class>[MAIN-CLASS]</main-class>
<java-opts>[JAVA-STARTUP-OPTS]</java-opts>
<arg>ARGUMENT</arg>
...
<file>[FILE-PATH]</file>
...
<archive>[FILE-PATH]</archive>
...
<capture-output />
</java>
<ok to="[NODE-NAME]"/>
<error to="[NODE-NAME]"/>
</action>
...
</workflow-app>
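To call a Java class, three things are required: 1. workflow.xml (the file name cannot be changed) 2. job.properties (the file name can be changed) 3. the jar containing the class
The example from the official site: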
<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
...
<action name="myfirstjavajob">
<java>
<job-tracker>foo:8021</job-tracker>
<name-node>bar:8020</name-node>
<prepare>
<delete path="${jobOutput}"/>
</prepare>
<configuration>
<property>
<name>mapred.queue.name</name>
<value>default</value>
</property>
</configuration>
<main-class>org.apache.oozie.MyFirstMainClass</main-class>
<java-opts>-Dblah</java-opts>
<arg>argument1</arg>
<arg>argument2</arg>
</java>
<ok to="myotherjob"/>
<error to="errorcleanup"/>
</action>
...
</workflow-app>
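The official example passes two arg elements and one java-opts value. The sketch below shows how these would appear inside the Java class: each arg becomes one entry of the args array, in the order given, and -Dblah defines a JVM system property named blah. The class name ArgsSketch is hypothetical and is not the MyFirstMainClass from the example.

```java
// Hypothetical class illustrating how <arg> and <java-opts> values reach the program.
public class ArgsSketch {

    // Formats the arguments in the order they were received.
    static String describe(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < args.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append("args[").append(i).append("]=").append(args[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // With the workflow above, args would hold "argument1" and "argument2".
        System.out.println(describe(args));
        // -Dblah in <java-opts> defines the system property "blah".
        System.out.println("blah defined: " + (System.getProperty("blah") != null));
    }
}
```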
An example from our own work:
1. workflow.xml --- placed in an HDFS directory
<workflow-app name="java-example1" xmlns="uri:oozie:workflow:0.5">
<start to="java-Action"/>
<action name="java-Action">
<java>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<main-class>test1.OzzieTest1</main-class>
<capture-output/>
</java>
<ok to="java-Action2"/>
<error to="fail"/>
</action>
<action name="java-Action2">
<java>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<main-class>test1.OzzieTest1</main-class>
<capture-output/>
</java>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
Note the following:
In <workflow-app name="java-example1" xmlns="uri:oozie:workflow:0.5">, if the workflow schema version is set to 0.2, the workflow's Graph view is not displayed, as shown in the figure below:

2. job.properties --- can be kept on the local filesystem
# HDFS address
nameNode=hdfs://hgdp-001:8020
# JobTracker address (on a YARN cluster, the ResourceManager address)
jobTracker=hgdp-001:8032
# queue the job is submitted to
queueName=default
hdfspath=user/root
# root directory of this application
examplesRoot=ocn-itv-oozie
# whether to load the Oozie system share lib
oozie.use.system.libpath=True
# user lib directory (holds the required jars)
oozie.libpath=${nameNode}/${hdfspath}/${examplesRoot}/lib/
# HDFS directory that contains workflow.xml
oozie.wf.application.path=${nameNode}/${hdfspath}/${examplesRoot}/wf/wf4/
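To see which HDFS path the ${...} references in job.properties end up pointing at, the substitution can be sketched in plain Java. This is an illustrative resolver only, not Oozie's own implementation; the class name PropertyExpansionSketch and the resolve method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that expands ${name} references; not Oozie's implementation.
public class PropertyExpansionSketch {

    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${name} with its value from props; unknown names are left as-is.
    static String resolve(String value, Map<String, String> props) {
        String current = value;
        // The pass count is bounded so that cyclic definitions cannot loop forever.
        for (int pass = 0; pass < 10; pass++) {
            Matcher m = VAR.matcher(current);
            StringBuffer sb = new StringBuffer();
            boolean replaced = false;
            while (m.find()) {
                String replacement = props.get(m.group(1));
                if (replacement == null) {
                    replacement = m.group(0);
                } else {
                    replaced = true;
                }
                m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
            }
            m.appendTail(sb);
            current = sb.toString();
            if (!replaced) {
                break;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("nameNode", "hdfs://hgdp-001:8020");
        props.put("hdfspath", "user/root");
        props.put("examplesRoot", "ocn-itv-oozie");
        props.put("oozie.wf.application.path",
                "${nameNode}/${hdfspath}/${examplesRoot}/wf/wf4/");
        // prints hdfs://hgdp-001:8020/user/root/ocn-itv-oozie/wf/wf4/
        System.out.println(resolve(props.get("oozie.wf.application.path"), props));
    }
}
```

The printed path is the directory into which workflow.xml has to be uploaded; the lib/ directory next to wf/ resolves the same way for oozie.libpath.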
3. Running Oozie:
Start the job: oozie job -config job.properties -run -oozie http://xxxx(address):11000/oozie
