MapReduce: shuffle error


Error message

The reduce container reported the following error:

2020-07-01 14:06:19,276 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#78
	at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
	at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:395)
	at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:310)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.openShuffleUrl(Fetcher.java:291)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:330)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:198)

Other log messages

2020-07-01 14:06:19,404 INFO [fetcher#60] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#60 - MergeManager returned status WAIT ...
2020-07-01 14:06:19,404 INFO [fetcher#8] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: xxx118:13562 freed by fetcher#8 in 0ms
2020-07-01 14:06:19,404 INFO [fetcher#60] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: xxx162:13562 freed by fetcher#60 in 0ms
2020-07-01 14:06:19,404 INFO [fetcher#6] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#6 - MergeManager returned status WAIT ...
2020-07-01 14:06:19,404 INFO [fetcher#6] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: xxx144:13562 freed by fetcher#6 in 2ms

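The logs show that the reduce task failed while copying data from the map outputs (the shuffle), and that it happened during the merge stage: the fetchers keep being told to WAIT by the MergeManager.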

Solution:

Change the following mapred parameter

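mapreduce.reduce.shuffle.memory.limit.percent=0.1 # default is 0.25

This can be set in the job code or in mapred-site.xml. The parameter caps how much of the reducer's in-memory shuffle buffer a single fetched map output may occupy; a map output larger than that fraction is written to disk instead of being held in memory. Lowering it sends more of the larger map outputs to disk and leaves more room in the buffer for the rest.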
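To get a feel for what the change does, here is a rough sketch of the arithmetic. It assumes the reducer sizes its shuffle buffer as heap * mapreduce.reduce.shuffle.input.buffer.percent and caps a single map output at that buffer * mapreduce.reduce.shuffle.memory.limit.percent, which is how Hadoop's MergeManagerImpl appears to compute it. The class name ShuffleLimits, the 1 GiB heap, and the 100 MiB map output are made up for illustration; they are not taken from the failing job.

```java
public class ShuffleLimits {

    // Size of the in-memory shuffle buffer, assumed to be
    // heap * mapreduce.reduce.shuffle.input.buffer.percent.
    static long memoryLimit(long maxHeapBytes, double inputBufferPercent) {
        return (long) (maxHeapBytes * inputBufferPercent);
    }

    // Largest single map output kept in memory, assumed to be
    // memoryLimit * mapreduce.reduce.shuffle.memory.limit.percent.
    static long maxSingleShuffleLimit(long memoryLimit, double memoryLimitPercent) {
        return (long) (memoryLimit * memoryLimitPercent);
    }

    // A map output is shuffled to memory only if it is below the single-output cap;
    // otherwise it is assumed to go to disk.
    static boolean canShuffleToMemory(long requestedSize, long maxSingleShuffleLimit) {
        return requestedSize < maxSingleShuffleLimit;
    }

    public static void main(String[] args) {
        long heap = 1024L * 1024 * 1024;          // hypothetical 1 GiB reducer heap
        long buffer = memoryLimit(heap, 0.70);    // 0.70 is the documented default
        long mapOutput = 100L * 1024 * 1024;      // hypothetical 100 MiB map output

        for (double percent : new double[] {0.25, 0.10}) {
            long cap = maxSingleShuffleLimit(buffer, percent);
            System.out.println("memory.limit.percent=" + percent
                    + " cap=" + cap + " bytes, 100 MiB output in memory: "
                    + canShuffleToMemory(mapOutput, cap));
        }
    }
}
```

With the lower value the cap shrinks proportionally, so map outputs that previously competed for the in-memory buffer are written to disk instead.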

