1. Symptom: the NodeManager on one node fails to start.
The log shows the following error:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:192)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:425)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:472)
Caused by: java.io.IOException: Cannot run program "/opt/hadoop-yarn/bin/container-executor": error=13, Permission denied
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:485)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:169)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:190)
    ... 3 more
Caused by: java.io.IOException: error=13, Permission denied
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
    ... 8 more
2. The permissions on container-executor are as shown in the figure:
3. Troubleshooting
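The YARN daemon runs as the user mr, but mr is not a member of the users group, so it has no permission to execute container-executor. This is what produces the "error=13, Permission denied" in the log.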
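The permission check behind this diagnosis can be sketched as a small model of how the kernel picks the owner, group, or other permission bits for a non-root process. The screenshot of the real permissions is not reproduced here, so every value below is an assumption: mode 6050 with owner root and group users is used as a plausible setup for LinuxContainerExecutor, and the numeric uid/gid values are made up for illustration.

```python
import stat


def may_execute(mode, file_uid, file_gid, uid, gids):
    """Return True if a non-root process (uid, gids) may execute the file.

    Only one permission class applies: owner if the uid matches,
    otherwise group if the file's gid is among the process's groups,
    otherwise other. Root's override is deliberately not modelled.
    """
    if uid == file_uid:
        return bool(mode & stat.S_IXUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IXOTH)


# Hypothetical values; the real ones come from `ls -l` and `id mr`.
MODE = 0o6050                  # ---Sr-s---: only the group may execute
ROOT_UID, USERS_GID = 0, 100   # assumed owner root, assumed group users
MR_UID, MR_GID = 1001, 1001    # assumed ids for the mr user

# mr outside the users group: falls through to the "other" bits.
before = may_execute(MODE, ROOT_UID, USERS_GID, MR_UID, {MR_GID})
# mr inside the users group: the group execute bit applies.
after = may_execute(MODE, ROOT_UID, USERS_GID, MR_UID, {MR_GID, USERS_GID})

print(before, after)  # False True
```

Under these assumptions, group membership is the only thing that changes between the failing and the working case, which matches the behaviour described above.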
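Solution: add mr to the users group. To keep the impact small, run usermod -a -G users mr as root, which changes only mr's supplementary groups. Note that usermod -G users mr without -a replaces the whole supplementary group list, so mr would be removed from any other supplementary groups it already belongs to. The new membership applies only to newly started processes, so start the NodeManager again afterwards.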
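The difference between usermod -G and usermod -a -G can be illustrated with a tiny model of the resulting supplementary group list. This is only a sketch of the documented semantics, not a call into usermod itself; the function name usermod_groups and the pre-existing group hadoop are hypothetical.

```python
def usermod_groups(current, requested, append):
    """Model the supplementary groups after `usermod [-a] -G requested user`.

    Without -a (append=False) the list is replaced by `requested`;
    with -a (append=True) `requested` is added to the existing list.
    """
    requested = set(requested)
    return (set(current) | requested) if append else requested


# Hypothetical starting point: mr already belongs to a group named hadoop.
existing = {"hadoop"}

print(sorted(usermod_groups(existing, ["users"], append=False)))  # ['users']
print(sorted(usermod_groups(existing, ["users"], append=True)))   # ['hadoop', 'users']
```

If mr has no other supplementary groups, both forms give the same result; the -a form is simply the safer default.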
Other:
The same error can also be caused by other permission problems. For those cases, see:
https://blog.csdn.net/lsr40/article/details/79554901