Spark Runtime Issues Memo 1 (Collected from the Web)


Problem 1

ERROR storage.DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /hadoop/application_1415632483774_448143/spark-local-20141127115224-9ca8/04/shuffle_1_1562_27

java.io.FileNotFoundException: /hadoop/application_1415632483774_448143/spark-local-20141127115224-9ca8/04/shuffle_1_1562_27 (No such file or directory)

Solution: On the surface this looks as if there is nowhere left to write the shuffle output. If the stack trace further down points to a local disk space problem, clearing the disk is enough. For the error above, however, the real cause is that the executor was allocated too little memory; reducing executor-cores and increasing executor-memory should resolve it.
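A minimal sketch of that adjustment, assuming the job builds its own SparkConf in Scala; the app name and the values 8g / 2 are placeholders to tune for the actual cluster (the same settings can also be passed to spark-submit as --executor-memory and --executor-cores):

import org.apache.spark.{SparkConf, SparkContext}

// Give each executor more heap and fewer concurrent tasks, so every task
// gets a larger share of the executor's memory during the shuffle.
val conf = new SparkConf()
  .setAppName("shuffle-memory-tuning")   // hypothetical app name
  .set("spark.executor.memory", "8g")    // larger heap per executor
  .set("spark.executor.cores", "2")      // fewer tasks sharing that heap
val sc = new SparkContext(conf)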

 

Problem 2

ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@pc-jfqdfx31:48586] -> [akka.tcp://sparkDriver@pc-jfqdfx30:41656] disassociated! Shutting down.
15/07/23 10:50:56 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM

Solution: This error is rather cryptic; the message alone does not reveal the cause, but ultimately it is still a memory problem. There are two ways to fix it. Option 1: as above, increase executor-memory and reduce executor-cores. Option 2: increase the executor memory overhead (the spark.yarn.executor.memoryOverhead setting on YARN), although this does not address the root cause. So if the cluster has the resources, prefer Option 1.
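A sketch of Option 2, assuming Spark on YARN of that era where the off-heap allowance is controlled by spark.yarn.executor.memoryOverhead (value in MB); the numbers are illustrative only:

import org.apache.spark.SparkConf

// Reserve extra non-heap room per executor so YARN does not kill the container,
// which is what makes the driver see the executor as "disassociated".
val conf = new SparkConf()
  .set("spark.executor.memory", "8g")
  .set("spark.yarn.executor.memoryOverhead", "2048")   // extra MB on top of the 8g heap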

In addition, this error also shows up when calling partitionBy(new HashPartitioner(partition-num)): if partition-num is much too large or too small, the same error is thrown. At bottom it is still a memory issue, but in that case increasing the memory or the overhead does not help; the partition-num value itself has to be tuned.
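For the partitionBy case, a rough sketch using the SparkContext from the first example; the input path, the key extraction and the value 200 are assumptions, and the point is simply to size the partition count to the data volume rather than picking an extreme number:

import org.apache.spark.HashPartitioner

// Aim for partitions of moderate size (for example on the order of 100-200 MB each)
// instead of a handful of huge partitions or tens of thousands of tiny ones.
val pairs = sc.textFile("hdfs:///path/to/input")
  .map(line => (line.split("\t")(0), line))
val repartitioned = pairs.partitionBy(new HashPartitioner(200))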

 

Problem 3

Container is running beyond physical memory limits

Check the virtual-to-physical memory ratio from the Hive shell; the default is 2.1:
hive> set yarn.nodemanager.vmem-pmem-ratio;
yarn.nodemanager.vmem-pmem-ratio=2.1
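Roughly speaking, this ratio is multiplied by a container's physical memory allocation to get its virtual memory ceiling: with a 4 GB container and the default ratio of 2.1, the NodeManager kills the container once its virtual memory use passes 4 GB × 2.1 ≈ 8.4 GB (the 4 GB figure here is only an example).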

Solution:

Step 1: Change the YARN configuration by adding the following property to both the ResourceManager Default Group and the Gateway Default Group:
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>10</value>
</property>

Step 2: Change the following settings in YARN:
Change mapreduce.map.memory.mb from 0G to 4G
Change mapreduce.reduce.memory.mb from 0G to 4G
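If these values are maintained directly in mapred-site.xml rather than through a management console, the equivalent entries would look roughly like the following (4096 MB = 4 G; the exact value is an assumption to adapt):

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>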

Step 3: Restart the services with stale configuration, then verify the settings in Hive:

hive> set yarn.nodemanager.vmem-pmem-ratio;
yarn.nodemanager.vmem-pmem-ratio=10

hive> set mapreduce.map.memory.mb;
mapreduce.map.memory.mb=4

hive> set mapreduce.reduce.memory.mb;
mapreduce.reduce.memory.mb=4

 

Summary:

1. The stack trace printed on the Spark console is often only the surface symptom; the stack trace that actually explains the failure is usually in the YARN logs.

2. Errors found in the YARN logs should be investigated and resolved first.

3. Pay attention to how mapreduce.map.memory.mb, mapreduce.reduce.memory.mb, yarn.scheduler.minimum-allocation-mb, mapreduce.map.java.opts and mapreduce.reduce.java.opts are set.

 

 

There are memory settings that can be set at the YARN container level and also at the mapper and reducer level. Memory is requested in increments of the YARN container size. Mapper and reducer tasks run inside a container.

mapreduce.map.memory.mb and mapreduce.reduce.memory.mb

The above parameters define the upper memory limit for a map or reduce task; if the memory used by the task exceeds this limit, the corresponding container is killed.

These parameters determine the maximum amount of memory that can be assigned to mapper and reducer tasks respectively. For example, a mapper is bound by the upper memory limit defined in the configuration parameter mapreduce.map.memory.mb.

However, if yarn.scheduler.minimum-allocation-mb is greater than mapreduce.map.memory.mb, then yarn.scheduler.minimum-allocation-mb is respected and containers of that size are handed out.
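For instance, if yarn.scheduler.minimum-allocation-mb is 2048 while mapreduce.map.memory.mb is 1536, YARN still hands out 2048 MB containers for the map tasks (these numbers are only illustrative).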

These parameters need to be set carefully; if they are not set properly, the result can be poor performance or OutOfMemory errors.

mapreduce.reduce.java.opts and mapreduce.map.java.opts

These options set the JVM arguments (typically the heap size) for the map/reduce task. Their value needs to be less than the upper bound defined by mapreduce.map.memory.mb / mapreduce.reduce.memory.mb, since the task JVM must fit within the memory allocated to its container.
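As a sketch of that relationship (the 80% rule of thumb and the concrete numbers below are common guidance, not values taken from this article), the heap given via -Xmx is kept comfortably below the container request so that JVM overhead still fits inside the container:

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <!-- roughly 80% of the 4096 MB container request -->
  <value>-Xmx3276m</value>
</property>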

