Yarn任務提交流程(源碼分析)


關鍵詞:yarn rm mapreduce 提交

Based on Hadoop 2.7.1

JobSubmitter

  • addMRFrameworkToDistributedCache(Configuration conf) : mapreduce.application.framework.path, 用於指定其他framework的hdfs 路徑配置,默認yarn的可以不管
  • Token相關的方法:讀取認證信息(支持二進制、json),並將其添加至相應的fileSystem中,以便以同樣權限訪問文件系統
  • copyAndConfigureFiles(Job job, Path jobSubmitDir): 上傳配置、jar、files、libjars、archives等
  • submitJobInternal: 真正的提交任務接口

核心代碼提交鏈

  1. JobSubmitter -> 
  2. ClientProtocol(YARNRunner) -> 
  3. ResourceMgrDelegate -> 
  4. YarnClient(YarnClientImpl).submitApplication( ApplicationSubmissionContext appContext) -> 
  5. 【RM】ApplicationClientProtocol(ClientRMService).submitApplication( SubmitApplicationRequest request) -> // fill ASC with dft values
  6. RMAppManager.submitApplication( ApplicationSubmissionContext submissionContext, long submitTime, String user) -> 

  • ApplicationSubmissionContext 提交上下文,包含application各種元信息
  • SubmitApplicationRequest 提交Request對象
// Dispatcher is not yet started at this time, so these START events // enqueued should be guaranteed to be first processed when dispatcher // gets started. this.rmContext.getDispatcher().getEventHandler() .handle(new RMAppEvent(applicationId, RMAppEventType.START)); 

START -> APP_NEW_SAVED 

 stateMachineFactory.addTransition(RMAppState.NEW, RMAppState.NEW_SAVING, RMAppEventType.START, new RMAppNewlySavingTransition()) //... private static final class RMAppNewlySavingTransition extends RMAppTransition { @Override public void transition(RMAppImpl app, RMAppEvent event) { // If recovery is enabled then store the application information in a // non-blocking call so make sure that RM has stored the information // needed to restart the AM after RM restart without further client // communication LOG.info("Storing application with id " + app.applicationId); app.rmContext.getStateStore().storeNewApplication(app); } } public synchronized void storeNewApplication(RMApp app) { ApplicationSubmissionContext context = app .getApplicationSubmissionContext(); assert context instanceof ApplicationSubmissionContextPBImpl; ApplicationStateData appState = ApplicationStateData.newInstance( app.getSubmitTime(), app.getStartTime(), context, app.getUser()); dispatcher.getEventHandler().handle(new RMStateStoreAppEvent(appState)); } private static final class AddApplicationToSchedulerTransition extends RMAppTransition { @Override public void transition(RMAppImpl app, RMAppEvent event) { app.handler.handle(new AppAddedSchedulerEvent(app.applicationId, app.submissionContext.getQueue(), app.user, app.submissionContext.getReservationID())); } }

AppAddedSchedulerEvent 會由配置的Scheduler來handle。

P.S. 看 event 部分代碼的方法,

  1. 找出狀態,比如 APP_NEW_SAVED,
  2. 找出handle這個狀態的事件類
  3. 找出處理這個事件的具體邏輯 (這里可能邏輯最復雜)
  4. 找下一個事件
  5. 重復。。

ApplicationMaster

START -> APPNEWSAVED -> APP_ACCEPTED ....

后面是一些attempt的啟動等各種事件的反復。直接跳到 AM 部分。

ResourceManager內有 createApplicationMasterLauncher() 和 createApplicationMasterService() 

private void launch() throws IOException, YarnException { connect(); ContainerId masterContainerID = masterContainer.getId(); ApplicationSubmissionContext applicationContext = application.getSubmissionContext(); LOG.info("Setting up container " + masterContainer + " for AM " + application.getAppAttemptId()); ContainerLaunchContext launchContext = createAMContainerLaunchContext(applicationContext, masterContainerID); StartContainerRequest scRequest = StartContainerRequest.newInstance(launchContext, masterContainer.getContainerToken()); List<StartContainerRequest> list = new ArrayList<StartContainerRequest>(); list.add(scRequest); StartContainersRequest allRequests = StartContainersRequest.newInstance(list); StartContainersResponse response = containerMgrProxy.startContainers(allRequests); if (response.getFailedRequests() != null && response.getFailedRequests().containsKey(masterContainerID)) { Throwable t = response.getFailedRequests().get(masterContainerID).deSerialize(); parseAndThrowException(t); } else { LOG.info("Done launching container " + masterContainer + " for AM " + application.getAppAttemptId()); } } private ContainerLaunchContext createAMContainerLaunchContext( ApplicationSubmissionContext applicationMasterContext, ContainerId containerID) throws IOException { // Construct the actual Container ContainerLaunchContext container = applicationMasterContext.getAMContainerSpec(); LOG.info("Command to launch container " + containerID + " : " + StringUtils.arrayToString(container.getCommands().toArray( new String[0]))); // Finalize the container setupTokens(container, containerID); return container; } 

注意以上其中兩行:

  • ContainerLaunchContext launchContext = createAMContainerLaunchContext(applicationContext, masterContainerID) 創建 AM 請求
  • StartContainersResponse response = containerMgrProxy.startContainers(allRequests); 啟動AM的容器並在容器內啟動AM。
  @Override public ContainerLaunchContext getAMContainerSpec() { ApplicationSubmissionContextProtoOrBuilder p = viaProto ? proto : builder; if (this.amContainer != null) { return amContainer; } // Else via proto if (!p.hasAmContainerSpec()) { return null; } amContainer = convertFromProtoFormat(p.getAmContainerSpec()); return amContainer; } public class ApplicationSubmissionContextPBImpl extends ApplicationSubmissionContext { ApplicationSubmissionContextProto proto = ApplicationSubmissionContextProto.getDefaultInstance(); ApplicationSubmissionContextProto.Builder builder = null; boolean viaProto = false; private ApplicationId applicationId = null; private Priority priority = null; private ContainerLaunchContext amContainer = null; private Resource resource = null; private Set<String> applicationTags = null; private ResourceRequest amResourceRequest = null; private LogAggregationContext logAggregationContext = null; private ReservationId reservationId = null; /// ... }

接下來便是啟動后的AppMaster 創建job,並通過AMRMClient向ResourceManager申請資源等。


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM