Submitting Oozie Jobs Programmatically with the Oozie Java Client


1. Setting up the Eclipse environment

Create a new Java project in Eclipse and import the required dependency JARs.


Next, write the Java program that submits the Oozie job. The official Oozie documentation is a useful reference here.

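Before running the submission program, package the job code into a JAR, define workflow.xml, and upload both to HDFS. Then set the job properties in the program. Here I used the samples from oozie-examples.tar.gz directly.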

Part of the code is shown below:

OozieClient wc = new OozieClient("http://192.168.121.35:11000/oozie");

// create the workflow job configuration
Properties conf = wc.createConfiguration();
conf.setProperty(OozieClient.APP_PATH, "hdfs://datanode1:8020/user/cdhfive/examples/apps/map-reduce");

// set the workflow parameters
conf.setProperty("nameNode", "hdfs://datanode1:8020");
conf.setProperty("jobTracker", "datanode1:8032");
conf.setProperty("inputDir", "/user/cdhfive/examples/input-data");
// conf.setProperty("outputDir", "hdfs://192.168.121.35:8020/user/cdhfive/examples/output-data");
conf.setProperty("outputDir", "/user/cdhfive/examples/output-data");
conf.setProperty("queueName", "default");
conf.setProperty("examplesRoot", "examples");
conf.setProperty("user.name", "cdhfive");

Note the following when setting the workflow parameters in the code:

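① Every variable defined in workflow.xml must be set in the program. For example, for ${jobTracker} in workflow.xml, the Java program needs the statement:

conf.setProperty("jobTracker", "datanode1:8032");

The value must also follow the format that the variable expects.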
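Instead of hard-coding each parameter, the values can be kept in job.properties-style text and copied into the configuration. The following is a minimal sketch using only java.util.Properties; the class name JobParams and the method loadInto are hypothetical helpers for illustration, not part of the Oozie client API. In a real program the Properties object would come from wc.createConfiguration().

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper (not part of Oozie): copies key=value pairs into a job configuration.
class JobParams {
    // Parses job.properties-style text and sets every entry on conf.
    static Properties loadInto(Properties conf, String propertiesText) {
        Properties parsed = new Properties();
        try {
            parsed.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // A StringReader should not fail, but load() declares IOException.
            throw new UncheckedIOException(e);
        }
        for (String key : parsed.stringPropertyNames()) {
            conf.setProperty(key, parsed.getProperty(key));
        }
        return conf;
    }

    public static void main(String[] args) {
        String text = "nameNode=hdfs://datanode1:8020\n"
                    + "jobTracker=datanode1:8032\n"
                    + "queueName=default\n";
        // Assumption: a plain Properties object stands in for wc.createConfiguration().
        Properties conf = loadInto(new Properties(), text);
        System.out.println(conf.getProperty("jobTracker"));
    }
}
```

Keeping the values outside the code makes it easier to point the same program at a different cluster without recompiling.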

 

2. Problems encountered during job submission, and how to fix them

ⓐError starting action [mr-node]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Permission denied: user=hapjin, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x

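I submitted the job from Eclipse running as the user hapjin on my local Windows machine, while the cluster runs on remote virtual machines. The job was therefore executed as hapjin, who has no write access under /user, and it failed with a permission error.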

To fix this, specify the job's user name at submission time: conf.setProperty("user.name", "cdhfive")

 

ⓑ Unresolved variable error. This happens when workflow.xml defines variables such as ${examplesRoot} but the Java code never assigns them a value with conf.setProperty(key, value).

javax.servlet.jsp.el.ELException: variable [examplesRoot] cannot be resolved

Fix: give every variable defined in workflow.xml a value in the Java code with conf.setProperty.
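This kind of error can be caught before submission by scanning the workflow definition for ${name} references and comparing them with the configuration. The sketch below is an illustration only: WorkflowVariableCheck and missingVariables are hypothetical names, the regular expression only recognises plain variables (it deliberately skips EL function calls such as ${wf:user()}), and it assumes the workflow.xml content has already been read into a string.

```java
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-submission check (not part of Oozie).
class WorkflowVariableCheck {
    // Matches simple references like ${jobTracker}; does not match ${wf:user()}.
    private static final Pattern SIMPLE_VAR =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_.]*)\\}");

    // Returns the variables referenced in workflowXml that have no value in conf,
    // in order of first appearance.
    static Set<String> missingVariables(String workflowXml, Properties conf) {
        Set<String> missing = new LinkedHashSet<>();
        Matcher m = SIMPLE_VAR.matcher(workflowXml);
        while (m.find()) {
            String name = m.group(1);
            if (conf.getProperty(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String xml = "<job-tracker>${jobTracker}</job-tracker>"
                   + "<name-node>${nameNode}</name-node>"
                   + "<delete path=\"${nameNode}/user/${wf:user()}/${examplesRoot}\"/>";
        Properties conf = new Properties();
        conf.setProperty("nameNode", "hdfs://datanode1:8020");
        System.out.println("Unset variables: " + missingVariables(xml, conf));
    }
}
```

Running such a check before wc.run(conf) reports every unset variable at once, rather than one ELException per submission attempt.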

The complete program is shown below:

package test;

import java.util.Properties;

import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.OozieClientException;
import org.apache.oozie.client.WorkflowJob.Status;

public class CommitJob {
    public static void main(String[] args) {
        // get an OozieClient for the Oozie server
        OozieClient wc = new OozieClient("http://192.168.121.35:11000/oozie");

        // create the workflow job configuration
        Properties conf = wc.createConfiguration();
        conf.setProperty(OozieClient.APP_PATH, "hdfs://datanode1:8020/user/cdhfive/examples/apps/map-reduce");

        // set the workflow parameters; every ${variable} used in workflow.xml needs a value
        conf.setProperty("nameNode", "hdfs://datanode1:8020");
        conf.setProperty("jobTracker", "datanode1:8032");
        conf.setProperty("inputDir", "/user/cdhfive/examples/input-data");
        // conf.setProperty("outputDir", "hdfs://192.168.121.35:8020/user/cdhfive/examples/output-data");
        conf.setProperty("outputDir", "/user/cdhfive/examples/output-data");
        conf.setProperty("queueName", "default");
        conf.setProperty("examplesRoot", "examples");
        // run the job as a cluster user that has write permission on HDFS
        conf.setProperty("user.name", "cdhfive");

        // submit and start the workflow job
        try {
            String jobId = wc.run(conf);
            System.out.println("Workflow job submitted");

            // poll every 10 seconds while the workflow job is RUNNING
            while (wc.getJobInfo(jobId).getStatus() == Status.RUNNING) {
                System.out.println("Workflow job running...");
                try {
                    Thread.sleep(10 * 1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
            System.out.println("Workflow job completed!");
            // print the final job information
            System.out.println(wc.getJobInfo(jobId));
        } catch (OozieClientException e) {
            e.printStackTrace();
        }
    }
}


 

3. How Oozie handles errors

If the failure is of transient nature, Oozie will perform retries after a pre-defined time interval. The number of retries and timer interval for a type of action must be pre-configured at Oozie level. Workflow jobs can override such configuration.

Examples of a transient failures are network problems or a remote system temporary unavailable.

If the failure is of non-transient nature, Oozie will suspend the workflow job until an manual or programmatic intervention resumes the workflow job and the action start or end is retried.

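In other words, if the failure is transient, for example because of a network problem or a remote system that is temporarily unavailable, Oozie retries at a predefined interval. If the failure is not transient, Oozie suspends the workflow job, and manual or programmatic intervention is needed before the job can resume.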
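This behaviour matters for a client that polls the job status. A loop that only tests for RUNNING also exits when the job is SUSPENDED, so a suspended job would be reported as completed. The sketch below shows one way to separate the cases. It is self-contained for illustration: the nested Status enum is a local stand-in that mirrors the constant names of org.apache.oozie.client.WorkflowJob.Status, and StatusPolicy, Next and decide are hypothetical names.

```java
// Hypothetical polling policy (not part of Oozie).
class StatusPolicy {
    // Local stand-in mirroring the names in org.apache.oozie.client.WorkflowJob.Status.
    enum Status { PREP, RUNNING, SUCCEEDED, KILLED, FAILED, SUSPENDED }

    // What the polling client should do next.
    enum Next { KEEP_WAITING, NEEDS_INTERVENTION, DONE }

    static Next decide(Status status) {
        switch (status) {
            case PREP:
            case RUNNING:
                // The job has not reached an end state yet (Oozie may be retrying an action).
                return Next.KEEP_WAITING;
            case SUSPENDED:
                // Non-transient failure: someone has to resume the job.
                return Next.NEEDS_INTERVENTION;
            default:
                // SUCCEEDED, KILLED and FAILED are end states.
                return Next.DONE;
        }
    }

    public static void main(String[] args) {
        for (Status s : Status.values()) {
            System.out.println(s + " -> " + decide(s));
        }
    }
}
```

In a real client, the value passed to decide would come from wc.getJobInfo(jobId).getStatus(), and a suspended job could then be resumed with the Oozie client or the command-line tool once the underlying problem is fixed.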

 

