Oozie原理 以及 Action執行模型簡單分析


一,Oozie 內部結構簡單分析(Oozie Internals)

 Oozie是Hadoop的工作流管理系統,正如論文《Oozie: towards a scalable workflow management system for Hadoop》所說:工作流提供了一種聲明式的框架來有效地管理各種各樣的作業,有四個大的需求:可擴展性、多租戶、Hadoop 安全性、可操作性。

Oozie的架構圖如下:

 Oozie提供了RESTful API接口來接受用戶的提交請求(提交工作流作業)。其實,在命令行使用oozie -job xxx命令提交作業,本質上也是發HTTP請求向OozieServer提交作業。

After the workflow submission to Oozie, workflow engine layer drives the execution and associated transitions. 
The workflow engine accomplishes these through a set of pre-defined internal sub-tasks called Commands.

當提交了workflow后,由工作流引擎負責workflow的執行以及狀態的轉換。比如,從一個Action執行到下一個Action,或者workflow狀態由Suspend變成KILLED

 

Most  of  the  commands  are  stored  in  an  internal  priority  queue from where a pool of worker threads picks up
and executes those commands. There are two types of commands:
some are executed when the user submits the request and others are executed asynchronously.

這里有兩種類型的Commands,一種是同步執行的,另一種是異步執行的。

用戶在HDFS上部署好作業(MR作業),然后向Oozie提交Workflow,Oozie以異步方式將作業(MR作業)提交給Hadoop。這也是為什么當調用Oozie 的RESTful接口提交作業之后能立即返回一個jobId的原因,用戶程序不必等待作業執行完成(因為有些大作業可能會執行很久(幾個小時甚至幾天))。Oozie在后台以異步方式,再將workflow對應的Action提交給hadoop執行。

Oozie  splits  larger  workflow  management tasks  (not  Hadoop  jobs)  into  smaller  manageable  subtasks
and asynchronously processes them using a pre-defined state transition model.

 

此外,Oozie提供了一個access layer訪問底層的集群資源。這是Hadoop Security的一個方面吧。

Oozie  provides  a  generic  Hadoop  access  layer restricted through Kerberos authentication to 
access Hadoop’s Job Tracker and Name Node components.

 

Oozie的水平可擴展性和垂直可擴展性

水平可擴展性體現在以下幾個方面:

①具體的作業執行上下文不是在Oozie Server process中。這個在Oozie的Action執行模型中會提到。也就是說:Oozie Server只負責執行workflow,而workflow中的Action,比如MapReduce Action或者Java Action的執行是以集群的方式執行的。Oozie Server只負責查詢這些Action的執行狀態和結果,從而降低了Oozie Server的負載。

Oozie needs to execute different types of jobs as part of workflow processing. If the jobs are executed in the context of the server process, 
there will be twoissues: 1) fewer jobs could run simultaneously due to limited resources in a server process causing significant penalty in scalability and 2) the user application could directly impact the Oozie server performance.

通過將實際作業(MR Action or JAVA Action)的運行交給Hadoop來管理並執行,Oozie Server只負責查詢作業的狀態...如果用戶提交的workflow增多了,只需要簡單地增加Oozie Server 即可。

②作業的狀態持久化到關系數據庫中(以后考慮使用Zookeeper),由於作業(比如MR Action狀態)狀態存儲在數據庫中,而不是在單機的內存中,故很擴容。此外,上面還提到了,實際作業的具體執行是由Hadoop執行的。

 Oozie stores the job states into a persistent store. This approach  enables 
multiple Oozie servers to run simultaneously from different machines.

 

垂直可擴展性體現在:

①線程池以及隊列中的Commands的正確配置與使用。

②異步作業提交模型--減少線程的阻塞

Oozie  often uses a pre-defined timeout for any external communication. Oozie follows an asynchronous job execution pattern for interaction with 
external systems. For example, when a job is submitted to the Hadoop Job Tracker, Oozie does not wait for the job to finish
since it may take a long time. Instead Oozie quickly returns the worker thread back to the thread pool and
later checks for job completion in a separate interaction using a different thread.

 

③使用內存鎖的事務模型 而不是 persistent model?--有點不懂

 In order to maximize resource usage, the persistent store connections are held for the shortest possible duration. To this end,
we chose a memory lock based transaction model instead of a persistent store based one; the latter is often more expensive to hold for long time.

 

最后看看Oozie是怎么從Hadoop集群中獲取作業的執行結果的?---回調 和 輪詢 並用

回調是為了降低開銷,輪詢是為了保證可靠性。

When Oozie starts a MapReduce job, it provides a unique callback URL as part of the MapReduce  job  configuration;  the Hadoop  Job  Tracker 
invokes the given URL to notify the completion of the job. For cases where the Job Tracker failed to invoke the callback URL for any reason
(i.e. a transient network failure), the system has a mechanism to poll the Job Tracker for determining the completion of the MapReduce job.

 

二,Oozie的Action執行模型(Action Execution Model)

A fundamental design principle in Oozie is that the Oozie server never runs user code other than the execution of the workflow itself. 
This ensures better service stability byisolating user code away from Oozie’s code. The Oozie server is also stateless and the launcher job
makes it possible for it to stay that way. By leveraging Hadoop for running the launcher,
handling job failures and recoverability becomes easier for the stateless Oozie server.

①Oozie never run user code other than the execution of the workflow itself. 比如,這里的usercode就是用戶編寫的MapReduce程序。

The Oozie server is also stateless and the launcher job.....

oozie server的無狀態其實就是它把作業的執行信息持久化到數據庫了。

Action的執行模型圖如下:

 

Oozie runs the actual actions through a launcher job, which itself is a Hadoop 
Map‐Reduce job that runs on the Hadoop cluster. The launcher is a map-only 
job that runs only one mapper.

Oozie通過 launcher job 運行某個具體的Action。launcher job是一個 map-only的MR作業,而且並不知道它將在集群的哪台機器上執行這個MR作業。

在上圖中,Oozie Client提交了一個workflow給Oozie Server。這個workflow里面要執行具體的Hive作業(Hive Action)

首先Oozie Server會啟動一個MR作業,也就是launcher job,由launcher job來發起具體的Hive作業。(Hive作業本質上是MR作業)

而我們知道:launcher job是個MR作業,它需要占用slot,也就是說:每提交一個workflow作業,都會創建一個launcher job並占用一個slot,如果底層Hadoop集群slot個數很少,而Oozie提交的作業又很多,launcher job把 slot用完了,使得實際執行Action已經沒有slot可用了,這就會導致死鎖。當然,可以通過配置Oozie的相關參數來避免Oozie發起太多的launcher job

另外,對於MR Action(Hive Action),launcher job 並不需要等到它發起的Action執行完畢后才退出。事實上:MR Action的launcher並不會等待MR作業執行完畢后才退出。

The  <map-reduce>launcher is the exception and it exits right after launching the actual job instead of waiting for it to complete.

 

另外,正是由於這個“launcher job 機制”,當需要將作業交給Oozie來管理運行時,需要將作業相關的配置文件先在HDFS上部署好,然后向Oozie Server發RESTful請求提交作業。

 


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM