Reference: http://www.jianshu.com/p/791137760c14
After a Spark Streaming program had been running for some time, the following exception appeared:
ERROR JobScheduler: Error running job streaming job 1496767480000 ms.0 org.apache.spark.SparkException:
Job aborted due to stage failure:
Task 13 in stage 37560.0 failed 4 times,
most recent failure: Lost task 13.3 in stage 37560.0
(TID 3891416, 192.169.2.33, executor 1):
kafka.common.OffsetOutOfRangeException
If a message is too large, exceeding the default fetch.message.max.bytes=1m
configuration, Spark Streaming throws an OffsetOutOfRangeException and the job stops.
Solution: in the Kafka consumer configuration, set fetch.message.max.bytes to a larger value,
for example 50 MB (1024*1024*50):
fetch.message.max.bytes=52428800
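A minimal sketch of how this setting might be passed to Spark Streaming's Kafka direct stream (shown here in pyspark style; the broker address and topic name are placeholders, and with the Scala API the same key would go into the kafkaParams Map for KafkaUtils.createDirectStream):

```python
# Sketch: raising fetch.message.max.bytes in the consumer params handed
# to Spark Streaming's Kafka direct stream. Broker and topic below are
# placeholders for illustration, not values from the original article.
kafka_params = {
    "metadata.broker.list": "broker1:9092",       # placeholder broker
    # Default is 1 MB; raise it so larger messages can still be fetched.
    "fetch.message.max.bytes": str(1024 * 1024 * 50),  # 50 MB
}

# With pyspark this dict would then be passed as, e.g.:
# KafkaUtils.createDirectStream(ssc, ["my_topic"], kafka_params)
print(kafka_params["fetch.message.max.bytes"])
```

Note that Kafka consumer properties are passed as strings, which is why the 50 MB value is stringified above.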