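On the YARN application details page
http://resourcemanager/cluster/app/$applicationId
or with the application command
yarn application -status $applicationId
you can only see the cumulative resource × time totals since the application started, for example: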
Aggregate Resource Allocation : 3962853 MB-seconds, 1466 vcore-seconds
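Since this line is resource multiplied by time, dividing by the elapsed time gives only an average, not the current usage. Here is a minimal sketch that parses the line above and computes that average. The class name AggregateAllocation and the elapsed time of 100 seconds are hypothetical and used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of YARN: parses the "Aggregate Resource Allocation" line.
final class AggregateAllocation {
    private static final Pattern LINE =
            Pattern.compile("(\\d+) MB-seconds, (\\d+) vcore-seconds");

    final long mbSeconds;
    final long vcoreSeconds;

    AggregateAllocation(long mbSeconds, long vcoreSeconds) {
        this.mbSeconds = mbSeconds;
        this.vcoreSeconds = vcoreSeconds;
    }

    // Extracts the two counters from a line of `yarn application -status` output.
    static AggregateAllocation parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("unexpected line: " + line);
        }
        return new AggregateAllocation(Long.parseLong(m.group(1)), Long.parseLong(m.group(2)));
    }

    // Average memory held over the given run time, in MB.
    double averageMb(long elapsedSeconds) {
        return (double) mbSeconds / elapsedSeconds;
    }

    // Average number of vcores held over the given run time.
    double averageVcores(long elapsedSeconds) {
        return (double) vcoreSeconds / elapsedSeconds;
    }

    public static void main(String[] args) {
        AggregateAllocation a = parse(
                "Aggregate Resource Allocation : 3962853 MB-seconds, 1466 vcore-seconds");
        long elapsedSeconds = 100; // hypothetical run time, not taken from the article
        System.out.printf("average MB: %.2f, average vcores: %.2f%n",
                a.averageMb(elapsedSeconds), a.averageVcores(elapsedSeconds));
    }
}
```

An application that ramps up or releases containers over time can have a current usage far from this average, which is why the aggregate alone is not enough.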
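Neither place shows the application's current, real-time resource usage, such as how much memory and how many cores it holds right now. Tracing through the YARN code shows that this statistic does exist: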
org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport
public static ApplicationResourceUsageReport newInstance(
    int numUsedContainers, int numReservedContainers, Resource usedResources,
    Resource reservedResources, Resource neededResources, long memorySeconds,
    long vcoreSeconds) {
  ApplicationResourceUsageReport report =
      Records.newRecord(ApplicationResourceUsageReport.class);
  report.setNumUsedContainers(numUsedContainers);
  report.setNumReservedContainers(numReservedContainers);
  report.setUsedResources(usedResources);
  report.setReservedResources(reservedResources);
  report.setNeededResources(neededResources);
  report.setMemorySeconds(memorySeconds);
  report.setVcoreSeconds(vcoreSeconds);
  return report;
}
Here usedResources is the current real-time resource usage, covering both memory and CPU. This report is returned by a method on the YarnScheduler interface:
org.apache.hadoop.yarn.server.resourcemanager.scheduler.YarnScheduler
/**
 * Get a resource usage report from a given app attempt ID.
 * @param appAttemptId the id of the application attempt
 * @return resource usage report for this given attempt
 */
@LimitedPrivate("yarn")
@Evolving
ApplicationResourceUsageReport getAppResourceUsageReport(
    ApplicationAttemptId appAttemptId);
The getAppResourceUsageReport method is called by RMAppAttemptImpl.getApplicationResourceUsageReport:
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl
@Override
public ApplicationResourceUsageReport getApplicationResourceUsageReport() {
  this.readLock.lock();
  try {
    ApplicationResourceUsageReport report =
        scheduler.getAppResourceUsageReport(this.getAppAttemptId());
    if (report == null) {
      report = RMServerUtils.DUMMY_APPLICATION_RESOURCE_USAGE_REPORT;
    }
    AggregateAppResourceUsage resUsage =
        this.attemptMetrics.getAggregateAppResourceUsage();
    report.setMemorySeconds(resUsage.getMemorySeconds());
    report.setVcoreSeconds(resUsage.getVcoreSeconds());
    return report;
  } finally {
    this.readLock.unlock();
  }
}
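So the report carries two kinds of numbers: an instantaneous value from the scheduler (usedResources) and accumulated totals from the attempt metrics (memorySeconds, vcoreSeconds). The following is a simplified model of how such totals relate to the instantaneous value. It is not YARN's actual bookkeeping code: the UsageIntegrator class and the second interval's figures (28160 MB, 10 vcores) are made up for illustration, while 56320 MB and 21 vcores are the values seen later in this article.

```java
// Hypothetical model, not YARN code: accumulates resource x time from usage intervals.
final class UsageIntegrator {
    private long memorySeconds;
    private long vcoreSeconds;

    // Records that usedMb megabytes and usedVcores cores were held for `seconds` seconds.
    void hold(long usedMb, int usedVcores, long seconds) {
        memorySeconds += usedMb * seconds;
        vcoreSeconds += (long) usedVcores * seconds;
    }

    long memorySeconds() {
        return memorySeconds;
    }

    long vcoreSeconds() {
        return vcoreSeconds;
    }

    public static void main(String[] args) {
        UsageIntegrator usage = new UsageIntegrator();
        usage.hold(56320, 21, 60); // first minute at full size
        usage.hold(28160, 10, 30); // assumed: app shrinks for the next 30 seconds
        System.out.println(usage.memorySeconds() + " MB-seconds, "
                + usage.vcoreSeconds() + " vcore-seconds");
    }
}
```

The totals only ever grow, so they say nothing about whether the application currently holds 56320 MB, 28160 MB, or nothing at all. That information lives only in usedResources.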
RMAppAttemptImpl.getApplicationResourceUsageReport is called from two places:
First caller
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl
public ApplicationReport createAndGetApplicationReport(String clientUserName,
    boolean allowAccess) {
  ...
  appUsageReport = currentAttempt.getApplicationResourceUsageReport();
  ...
RMAppImpl.createAndGetApplicationReport is called by ClientRMService.getApplications and ClientRMService.getApplicationReport, which correspond to the following commands respectively:
yarn application -list
yarn application -status $applicationId
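Neither command includes usedResources in its output, even though the report contains it. Perhaps the authors felt that the real-time usage figure was not that important.
For details, see: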
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService
Second caller
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.AppInfo
public AppInfo(RMApp app, Boolean hasAccess, String schemePrefix) {
  ...
  ApplicationResourceUsageReport resourceReport =
      attempt.getApplicationResourceUsageReport();
  if (resourceReport != null) {
    Resource usedResources = resourceReport.getUsedResources();
    allocatedMB = usedResources.getMemory();
    allocatedVCores = usedResources.getVirtualCores();
    runningContainers = resourceReport.getNumUsedContainers();
  }
  ...
This constructor is invoked from RMWebServices.getApp and RMWebServices.getApps. These are web service endpoints, and their URLs are respectively:
http://resourcemanager/ws/v1/cluster/apps/$applicationId
http://resourcemanager/ws/v1/cluster/apps?state=RUNNING
The responses from these two endpoints do include the real-time resource usage:
<allocatedMB>56320</allocatedMB>
<allocatedVCores>21</allocatedVCores>
These correspond to the real-time memory usage and the real-time CPU (vcore) usage, respectively.
For details, see:
org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices
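To read these two fields from a program, something like the following sketch may help. It assumes Java 11+ (for java.net.http) and that the endpoint returns XML when asked with an Accept: application/xml header. The class name AppUsageFetcher, its method names, and the way the base URL and application id are passed on the command line are all hypothetical. When run without arguments it parses a small sample that contains only the two fields shown above, wrapped in an <app> element.

```java
import java.io.ByteArrayInputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical client, for illustration only.
final class AppUsageFetcher {

    // Returns the numeric text of the first element named `tag` in the XML document.
    static long readLong(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tag);
            if (nodes.getLength() == 0) {
                throw new IllegalArgumentException("no <" + tag + "> element in response");
            }
            return Long.parseLong(nodes.item(0).getTextContent().trim());
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse response", e);
        }
    }

    // Fetches /ws/v1/cluster/apps/$applicationId from the given base URL as XML.
    static String fetchAppXml(String baseUrl, String applicationId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                        URI.create(baseUrl + "/ws/v1/cluster/apps/" + applicationId))
                .header("Accept", "application/xml")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Assumed usage: args[0] = base URL such as http://resourcemanager, args[1] = application id.
        String xml = args.length >= 2
                ? fetchAppXml(args[0], args[1])
                : "<app><allocatedMB>56320</allocatedMB>"
                        + "<allocatedVCores>21</allocatedVCores></app>"; // offline sample
        System.out.println("allocatedMB=" + readLong(xml, "allocatedMB"));
        System.out.println("allocatedVCores=" + readLong(xml, "allocatedVCores"));
    }
}
```

Note that readLong returns only the first matching element, so for the list endpoint (apps?state=RUNNING) it would report the first application only. Iterating over each app element would be needed there.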
If you find that a Spark application is using more memory than you allocated to it, see: https://www.cnblogs.com/barneywill/p/10102353.html