Troubleshooting a Broken Pipe Error When Using a Python Transform in Hive


There is a Hive table whose column delimiter is a colon (:). One of its columns, utime, holds a Timestamp that needs to be converted to a weekday and stored in a new table.

The Python transform for this pipeline, weekday.py, is only a few lines:
import sys
import datetime

for line in sys.stdin:
    line = line.strip()
    # split on the table's column delimiter, a colon
    uid, mid, rating, utime = line.split(':')
    weekday = datetime.datetime.fromtimestamp(float(utime)).isoweekday()
    print '\t'.join([uid, mid, rating, str(weekday)])
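
As a quick sanity check, the script can be fed a hand-made colon-delimited row locally (sample values; the exact weekday printed depends on the local timezone):

echo "11:2791:4:978903186" | python weekday.py

Run this way, the script behaves exactly as intended.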
 
The HQL query is just as simple:
select 
transform(uid,mid,rating,utime) 
using 'python weekday.py' as (uid,mid,rating,weekday) 
from rating
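
Note that the script has to be shipped with the job before TRANSFORM can call it; this is normally done with Hive's add file (the path shown here is an assumption):

add file /tmp/weekday.py;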
 
The job errors out as soon as Stage-1 finishes!
 
Troubleshooting process:
1. The log Hive itself gives is of little use. Hive log:
 INFO exec.Task: 2015-07-07 16:34:57,938 Stage-1 map = 0%,  reduce = 0%
INFO exec.Task: 2015-07-07 16:35:30,262 Stage-1 map = 100%,  reduce = 0%
ERROR exec.Task: Ended Job = job_1431587697935_0210 with errors
ERROR operation.Operation: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 20001 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. An error occurred while reading or writing to your custom script. It may have crashed with an error. 
 at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
 at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:156)
 at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
 at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
 at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 
2. Not ready to give up! I turned on Hive's debug logging and looked again.
   Since I had been connecting through Beeline the whole time, the log came out the same as in step 1, so nothing gained. Then it occurred to me: why not try the Hive CLI and see?
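
For reference, the log level can be raised straight from the CLI via a standard Hive logging property (the exact invocation may vary by version):

hive --hiveconf hive.root.logger=DEBUG,console

Run from the Hive CLI this way, the query at last produced some meaningful diagnostics: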
Task with the most failures(4): 
-----
Task ID:
  task_1431587697935_0210_m_000000
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"uid":11,"mid":2791,"rating":4,"utime":"978903186"}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"uid":11,"mid":2791,"rating":4,"utime":"978903186"}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20001]: An error occurred while reading or writing to your custom script. It may have crashed with an error.
at org.apache.hadoop.hive.ql.exec.ScriptOperator.process(ScriptOperator.java:456)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)

Caused by: java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:345)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.hadoop.hive.ql.exec.TextRecordWriter.write(TextRecordWriter.java:53)
at org.apache.hadoop.hive.ql.exec.ScriptOperator.process(ScriptOperator.java:425)

3. Based on the suspicious row the previous step pointed at, I suspected bad data was breaking the processing. But when I moved the failing row into a separate table and processed it on its own, it went through without a hitch. Fine, keep digging.
 
4. Before going any further I needed to understand: what is java.io.IOException: Broken pipe?
    It happens when one end writes to a pipe whose other end has already exited or gone away, so nothing drains the data from the pipe and the write fails. Here that means: while Hive streaming was feeding input to weekday.py, weekday.py terminated abnormally; when streaming came back with more data, there was no weekday.py left to receive it, hence the Broken pipe.
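
The failure mode is easy to reproduce outside Hive. A minimal sketch in the same Python 2 style as weekday.py: the parent plays the role of Hive's streaming writer, and the short-lived child plays a crashed weekday.py (all values here are illustrative):

import subprocess

# a child that exits immediately without reading its stdin,
# mimicking weekday.py crashing partway through the stream
child = subprocess.Popen(['python', '-c', 'import sys; sys.exit(1)'],
                         stdin=subprocess.PIPE)
child.wait()  # the child is gone; the read end of the pipe is closed

try:
    # keep writing until the OS pipe buffer fills and the write fails
    for _ in range(100000):
        child.stdin.write('11:2791:4:978903186\n')
    child.stdin.flush()
except IOError, e:
    print 'writer sees: %s' % e  # [Errno 32] Broken pipe

In Hive's case the writer is Java code (the TextRecordWriter in the stack trace above), so the same condition surfaces as java.io.IOException: Broken pipe.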
 
5. With Broken pipe understood and the earlier messages in hand, the problem had to be in weekday.py. And since the failure happens inside MapReduce, the next stop is the task's stderr on YARN.
     Through the ResourceManager UI, the stderr in the logs of the corresponding application showed:
Traceback (most recent call last):
  File "weekday_mapper.py", line 5, in <module>
    uid,mid,rating,utime=line.split(':')
ValueError: need more than 1 value to unpack
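
This ValueError is exactly what a 4-way unpack raises in Python 2 whenever split(':') finds no colon and hands back a single field, as a quick interpreter session shows (the sample line is made up):

>>> 'line_without_any_colon'.split(':')
['line_without_any_colon']
>>> uid, mid, rating, utime = 'line_without_any_colon'.split(':')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: need more than 1 value to unpack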
 
6. Judging from the Python error, some rows' delimiter (:) must be off, so that split could not return the expected 4 values (uid, mid, rating, utime). I checked the data format every way I could think of; everything looked normal. The only move left was to add exception handling to the script.
With exception handling in place the query no longer errors out, but the SELECT returns 0 rows:
import sys
import datetime

for line in sys.stdin:
    try:
        line = line.strip()
        uid, mid, rating, utime = line.split(':')
        weekday = datetime.datetime.fromtimestamp(float(utime)).isoweekday()
        print '\t'.join([uid, mid, rating, str(weekday)])
    except Exception, ex:
        # silently drop any row that fails -- which, as it turns out,
        # is every row, hence the empty result
        pass
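
A bare pass throws away the very evidence we need. A more diagnosable variant (a sketch, not what was actually run) reports each rejected line to stderr, which for a transform script lands in the YARN task logs:

import sys
import datetime

for line in sys.stdin:
    try:
        line = line.strip()
        uid, mid, rating, utime = line.split(':')
        weekday = datetime.datetime.fromtimestamp(float(utime)).isoweekday()
        print '\t'.join([uid, mid, rating, str(weekday)])
    except Exception, ex:
        # report the offending line instead of swallowing it;
        # stderr from the script shows up in the YARN container logs
        sys.stderr.write('skipped %r: %s\n' % (line, ex))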
 
7. The problem was now pinned down: the script mishandles the data it is given. Yet pulling the table's data file straight off HDFS and running it through the script worked fine:
hdfs dfs -cat /user/hive/warehouse/test.db/t/000000_0|python /tmp/weekday_mapper.py
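
To double-check what the raw bytes look like, the first line of the file can be dumped character by character (standard hdfs and od usage; the path is the same as above):

hdfs dfs -cat /user/hive/warehouse/test.db/t/000000_0 | head -1 | od -c

The dump shows the colons one would expect, so the file itself is fine.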
Finally, I began to suspect that the format TRANSFORM feeds to the script is not the same as the table's own format, and checked the official documentation:

By default, columns will be transformed to STRING and delimited by TAB before feeding to the user script.

So I changed uid,mid,rating,utime=line.split(':') in the script to uid,mid,rating,utime=line.split('\t') and tried once more. Success!
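
For completeness, the working weekday.py differs from the original only in the delimiter passed to split:

import sys
import datetime

for line in sys.stdin:
    line = line.strip()
    # TRANSFORM feeds columns as TAB-separated strings, so split on '\t'
    uid, mid, rating, utime = line.split('\t')
    weekday = datetime.datetime.fromtimestamp(float(utime)).isoweekday()
    print '\t'.join([uid, mid, rating, str(weekday)])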

 

Summary

  1. Fundamentals matter; only knowledge organized into a system of your own can be called up at will. A long road ahead!

  2. Guessing from experience sometimes helps a lot, and sometimes outsmarts itself. So take the logs seriously, and treat them as the basis for reproducing what actually happened.

