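Problem description: I ran a job on Hadoop. It started and ran normally, and everything seemed fine. After a while, however, the job stalled and stayed blocked until it timed out and exited (see the log below).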
14/08/19 21:17:51 INFO mapred.JobClient: map 99% reduce 71%
14/08/19 21:17:54 INFO mapred.JobClient: map 99% reduce 75%
14/08/19 21:17:57 INFO mapred.JobClient: map 99% reduce 79%
14/08/19 21:18:00 INFO mapred.JobClient: map 99% reduce 83%
14/08/19 21:18:03 INFO mapred.JobClient: map 99% reduce 87%
14/08/19 21:18:06 INFO mapred.JobClient: map 99% reduce 91%
This happens because the program hit an exception that made a task fail, yet Hadoop neither exited nor restarted the task.
Exception 1: a bug in the program itself
attempt_201408192045_0002_m_000196_2: [2014-08-19 21:16:44 WARN ] [main] (org.apache.hadoop.mapred.Child:291) - Error running child
attempt_201408192045_0002_m_000196_2: java.io.IOException: Index: 0, Size: 0
attempt_201408192045_0002_m_000196_2:     at com.ict.hadoop.WXExtraction$Map.map(WXExtraction.java:61)
attempt_201408192045_0002_m_000196_2:     at com.ict.hadoop.WXExtraction$Map.map(WXExtraction.java:1)
attempt_201408192045_0002_m_000196_2:     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
attempt_201408192045_0002_m_000196_2:     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
attempt_201408192045_0002_m_000196_2:     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
attempt_201408192045_0002_m_000196_2:     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
attempt_201408192045_0002_m_000196_2:     at java.security.AccessController.doPrivileged(Native Method)
attempt_201408192045_0002_m_000196_2:     at javax.security.auth.Subject.doAs(Subject.java:416)
attempt_201408192045_0002_m_000196_2:     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
attempt_201408192045_0002_m_000196_2:     at org.apache.hadoop.mapred.Child.main(Child.java:264)
attempt_201408192045_0002_m_000196_2: Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
attempt_201408192045_0002_m_000196_2:     at java.util.ArrayList.rangeCheck(ArrayList.java:571)
attempt_201408192045_0002_m_000196_2:     at java.util.ArrayList.get(ArrayList.java:349)
attempt_201408192045_0002_m_000196_2:     at com.ict.wxparser.parser.WXParser.getMsgContent(WXParser.java:188)
attempt_201408192045_0002_m_000196_2:     at com.ict.wxparser.parser.WXParser.parseLine(WXParser.java:137)
attempt_201408192045_0002_m_000196_2:     at com.ict.hadoop.WXExtraction$Map.map(WXExtraction.java:57)
attempt_201408192045_0002_m_000196_2:     ... 9 more
attempt_201408192045_0002_m_000196_2: [2014-08-19 21:16:44 INFO ] [main] (org.apache.hadoop.mapred.Task:956) - Runnning cleanup for the task
14/08/19 21:17:18 INFO mapred.JobClient: Task Id : attempt_201408192045_0002_m_000196_3, Status : FAILED
java.io.IOException: Index: 0, Size: 0
    at com.ict.hadoop.WXExtraction$Map.map(WXExtraction.java:61)
    at com.ict.hadoop.WXExtraction$Map.map(WXExtraction.java:1)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:571)
    at java.util.ArrayList.get(ArrayList.java:349)
    at com.ict.wxparser.parser.WXParser.getMsgContent(WXParser.java:188)
    at com.ict.wxparser.parser.WXParser.parseLine(WXParser.java:137)
    at com.ict.hadoop.WXExtraction$Map.map(WXExtrac
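The key to solving this is to fix the code so that the task can run to completion. In this trace, the root cause is an ArrayList.get(0) call on an empty list inside WXParser.getMsgContent.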
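The trace above comes from reading index 0 of an empty list. Below is a minimal sketch of the kind of guard that avoids it, assuming that records with no fields can simply be skipped. The class SafeParseSketch and its methods are hypothetical names for illustration; this is not the actual WXParser code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SafeParseSketch {
    /** Returns the first element, or null when the list is null or empty. */
    public static String firstOrNull(List<String> fields) {
        if (fields == null || fields.isEmpty()) {
            return null; // avoids IndexOutOfBoundsException: Index: 0, Size: 0
        }
        return fields.get(0);
    }

    /** Keeps the first field of each record; skips records with no fields. */
    public static List<String> extractAll(List<List<String>> records) {
        List<String> out = new ArrayList<String>();
        for (List<String> record : records) {
            String content = firstOrNull(record);
            if (content != null) {
                out.add(content);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> records = new ArrayList<List<String>>();
        records.add(Arrays.asList("hello", "x"));
        records.add(new ArrayList<String>()); // malformed record: no fields
        records.add(Arrays.asList("world"));
        System.out.println(extractAll(records)); // prints [hello, world]
    }
}
```

In a real mapper, the same check would go before the list access, so that one malformed input line is skipped (and ideally counted) rather than failing the whole task attempt.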
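Exception 2: org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: unable to create new native thread
This error means the task ran out of memory: the JVM throws an OutOfMemoryError and the task fails. This particular message usually means the JVM could not obtain a new thread from the operating system (for example because of a per-user process limit or exhausted native memory), rather than that the Java heap is full.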
Solutions:
1. Increase the value of HADOOP_HEAPSIZE in hadoop-env.sh.
2. Increase the value of mapred.child.java.opts in mapred-site.xml (the default is 200 MB).
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>
3. Decrease the values of mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum in mapred-site.xml.
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>15</value>
</property>
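The child heap size and the slot counts interact: each slot can run its own child JVM, so the worst-case heap demand on one node is roughly the number of slots times the per-child heap. The following is a rough sketch of that arithmetic. The 15 map slots and 2048 MB come from the settings above; the 2 reduce slots are an assumed value for illustration, and the class name SlotMemorySketch is hypothetical. It counts Java heap only, not thread stacks or other native memory.

```java
public class SlotMemorySketch {
    /** Worst-case total child heap in MB if every slot runs a task at once. */
    public static long worstCaseChildHeapMb(int mapSlots, int reduceSlots, long childHeapMb) {
        return (long) (mapSlots + reduceSlots) * childHeapMb;
    }

    public static void main(String[] args) {
        // 15 map slots and -Xmx2048m are taken from the configuration above;
        // 2 reduce slots is an assumption made only for this example.
        long totalMb = worstCaseChildHeapMb(15, 2, 2048);
        System.out.println(totalMb + " MB"); // prints "34816 MB"
    }
}
```

If the result exceeds the physical memory of the node, either lowering the slot counts or choosing a smaller -Xmx is probably needed.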
