spark on yarn exitCode: -104

When running a Spark job, the application dies some time after each launch, anywhere from an hour to two or three days later, and the YARN log reports the following error:

AM Container for appattempt_1554609747730_49028_000001 exited with exitCode: -104
For more detailed output, check application tracking page:http:/xxx:8088/cluster/app/application_1554609747730_49028 Then, click on links to logs of each attempt.
Diagnostics: Container [pid=14954,containerID=container_e06_1554609747730_49028_01_000001] is running beyond physical memory limits. Current usage: 2.5 GB of 2.5 GB physical memory used; 4.3 GB of 12.2 TB virtual memory used. Killing container.
Dump of the process-tree for container_e06_1554609747730_49028_01_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 15519 14954 14954 14954 (java) 23448925 131083 4470353920 656393 /usr/lib/jvm/java-1.8.0/bin/java -server -Xmx2048m -Djava.io.tmpdir=/mnt/disk1/yarn/usercache/hadoop/appcache/application_1554609747730_49028/container_e06_1554609747730_49028_01_000001/tmp -Dlog4j.ignoreTCL=true -Dspark.yarn.app.container.log.dir=/mnt/disk4/log/hadoop-yarn/containers/application_1554609747730_49028/container_e06_1554609747730_49028_01_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class com.miaoke.job.online.realhouse.RealClassBeforeStuNum --jar file:/data/job/mkspark.jar --properties-file /mnt/disk1/yarn/usercache/hadoop/appcache/application_1554609747730_49028/container_e06_1554609747730_49028_01_000001/__spark_conf__/__spark_conf__.properties
|- 14954 14952 14954 14954 (bash) 5 7 115855360 358 /bin/bash -c LD_LIBRARY_PATH=/usr/lib/hadoop-current/lib/native::/usr/lib/hadoop-current/lib/native::/opt/apps/ecm/service/hadoop/2.7.2-1.2.15/package/hadoop-2.7.2-1.2.15/lib/native:/usr/lib/hadoop-current/lib/native::/opt/apps/ecm/service/hadoop/2.7.2-1.2.15/package/hadoop-2.7.2-1.2.15/lib/native:/opt/apps/ecm/service/hadoop/2.7.2-1.2.15/package/hadoop-2.7.2-1.2.15/lib/native /usr/lib/jvm/java-1.8.0/bin/java -server -Xmx2048m -Djava.io.tmpdir=/mnt/disk1/yarn/usercache/hadoop/appcache/application_1554609747730_49028/container_e06_1554609747730_49028_01_000001/tmp '-Dlog4j.ignoreTCL=true' -Dspark.yarn.app.container.log.dir=/mnt/disk4/log/hadoop-yarn/containers/application_1554609747730_49028/container_e06_1554609747730_49028_01_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class 'com.miaoke.job.online.realhouse.RealClassBeforeStuNum' --jar file:/data/job/mkspark.jar --properties-file /mnt/disk1/yarn/usercache/hadoop/appcache/application_1554609747730_49028/container_e06_1554609747730_49028_01_000001/__spark_conf__/__spark_conf__.properties 1> /mnt/disk4/log/hadoop-yarn/containers/application_1554609747730_49028/container_e06_1554609747730_49028_01_000001/stdout 2> /mnt/disk4/log/hadoop-yarn/containers/application_1554609747730_49028/container_e06_1554609747730_49028_01_000001/stderr
 
Container killed on request. Exit code is 143

 

Solution:

  The container's physical memory usage exceeded its limit: YARN's NodeManager monitors each container, and once usage crosses the threshold it forcibly kills the container process. (Exit code -104 is YARN's status for exceeding the physical memory limit; the 143 in the log is the signal-based exit code for the SIGTERM the container receives, 128 + 15.)
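  As a rough sanity check on the 2.5 GB figure in the log: assuming the default overhead formula memoryOverhead = max(384 MB, 0.10 × driverMemory) and a YARN scheduler allocation increment of 512 MB (both are assumptions; neither value appears in the log), the numbers line up:

   driver heap (-Xmx2048m)              2048 MB
   default overhead, max(384, 204.8)   + 384 MB
   requested container size             2432 MB
   rounded up to a 512 MB multiple      2560 MB ≈ 2.5 GB

  Raising memoryOverhead raises the container limit accordingly, giving off-heap allocations (Netty buffers, thread stacks, direct memory, and so on) room to grow before the NodeManager kills the container.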

  To increase memoryOverhead, add the following parameters to the Spark client's spark-defaults.conf configuration file, or pass them with --conf on the submit command (examples follow the list below).

   spark.yarn.driver.memoryOverhead: sets the off-heap (overhead) memory size; used in cluster mode, where the driver runs inside the ApplicationMaster.

   spark.yarn.am.memoryOverhead: sets the off-heap (overhead) memory size for the ApplicationMaster; used in client mode, where the AM runs separately from the driver.

   --conf spark.yarn.driver.memoryOverhead=768m
   --conf spark.yarn.am.memoryOverhead=768m
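  Equivalently, in spark-defaults.conf (the 768m value mirrors the flags above; it is a starting point to tune, not a universally correct figure):

   spark.yarn.driver.memoryOverhead  768m
   spark.yarn.am.memoryOverhead      768m

  And a minimal cluster-mode submit sketch reusing the class and jar paths from the log above; the memory values are illustrative, not prescriptive:

   spark-submit \
     --master yarn \
     --deploy-mode cluster \
     --class com.miaoke.job.online.realhouse.RealClassBeforeStuNum \
     --driver-memory 2g \
     --conf spark.yarn.driver.memoryOverhead=768m \
     /data/job/mkspark.jar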

 

