If a Spark application submitted to a YARN cluster stays in the ACCEPTED state indefinitely, neither running nor failing, the client log looks like this:
15/06/14 11:33:33 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:34 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:35 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:36 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:37 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:38 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:39 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:40 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:41 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
This usually happens when several users submit applications to the cluster at the same time, or one user submits several applications at once, and YARN can no longer allocate resources for new ApplicationMasters. The fix is to edit the Hadoop configuration file /etc/hadoop/conf/capacity-scheduler.xml and raise yarn.scheduler.capacity.maximum-am-resource-percent from its default of 0.1 to 0.5. As the name suggests, this option caps the fraction of cluster resources that may be used to run ApplicationMasters: with the default 0.1, at most 10% of the cluster can host AMs, so once that quota is filled, every additional application waits in ACCEPTED until an AM slot frees up. Raising it to 0.5 lets more applications run concurrently, and you can raise it further depending on your workload. This also shows that, by default, YARN reserves most of the cluster for task containers rather than for the ApplicationMasters themselves.
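A minimal sketch of the change in capacity-scheduler.xml (the property name and its 0.1 default are standard Capacity Scheduler settings; the value 0.5 is the adjustment described above):

  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <!-- Maximum fraction of cluster resources usable by ApplicationMasters.
         Default is 0.1; raised here so more applications can run concurrently. -->
    <value>0.5</value>
  </property>

Since this file belongs to the Capacity Scheduler, the ResourceManager can typically reload it without a restart via yarn rmadmin -refreshQueues; if your distribution manages configuration differently, restart the ResourceManager instead.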