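- The driver reports the error below, and the same failure is raised at the collect call in my own code. The top user query does not fail, but the top file query does; my guess is that this is because there are far more files than users.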
20/08/24 08:37:15 ERROR MicroBatchExecution: Query [id = de341482-5e75-4c34-b924-146a7eb6c9b0, runId = 13007eb2-10eb-4ef0-a799-dc048a7fc0bf] terminated with error org.apache.spark.SparkException: Job aborted due to stage failure: ShuffleMapStage 4 (start at top_n.scala:646) has failed the maximum allowable number of times: 4. Most recent failure reason: org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 2 at org.apache.spark.MapOutputTracker$.$anonfun$convertMapStatuses$2(MapOutputTracker.scala:1010) a
Executor errors:
20/08/24 08:30:25 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:30:43 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:30:58 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:31:17 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:31:35 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:31:53 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:32:12 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:32:25 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
ref:
https://blog.csdn.net/lingbo229/article/details/84943560
Solution:
Increased memory from 16G to 24G, switched to the G1 garbage collector, and enabled GC logging:
"spark.executor.extraJavaOptions": "-XX:+UseG1GC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps",
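
For reference, the same settings can be expressed in Scala when building the session. This is only a sketch under assumptions: the note above does not say which memory setting was raised, so `spark.executor.memory` is a guess based on the executor-side allocation warnings; the object name and app name are hypothetical; and the job is assumed to be launched through spark-submit, which supplies the master. The GC logging flags are copied from the note and are JDK 8 style. On JDK 9 and later, `-XX:+PrintGCDateStamps` is no longer accepted and unified logging (`-Xlog:gc*`) is the usual replacement.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, for illustration only.
object TopNConfigSketch {
  // JVM flags copied verbatim from the note above (JDK 8 style GC logging).
  val executorJvmOptions: String = Seq(
    "-XX:+UseG1GC",
    "-verbose:gc",
    "-XX:+PrintGCDetails",
    "-XX:+PrintGCDateStamps"
  ).mkString(" ")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("top-n-config-sketch") // hypothetical app name
      // Assumption: the 16G -> 24G change refers to executor memory.
      .config("spark.executor.memory", "24g")
      .config("spark.executor.extraJavaOptions", executorJvmOptions)
      .getOrCreate()

    // Print the effective values so the settings can be checked in the driver log.
    val conf = spark.sparkContext.getConf
    println(conf.get("spark.executor.memory"))
    println(conf.get("spark.executor.extraJavaOptions"))

    spark.stop()
  }
}
```

Executor settings only take effect if they are set before the SparkContext is created, so setting them on an already running session will not resize the executors.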
