sqoop從DB2遷移數據到HDFS


Sqoop import job failed to read data from DB2 database which has UTF8 encoding. Essentially, even the data cannot be read at DB2 with select queries as there are some characters which are not in UTF8.

Sqoop job will throw an error similar to below:

Error: java.io.IOException: SQLException in nextKeyValue
        at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:265)
..
..
Caused by: com.ibm.db2.jcc.am.SqlException: [jcc][t4][1065][12306][4.19.26] Caught java.io.CharConversionException.  See attached Throwable for details. ERRORCODE=-4220, SQLSTATE=null
        at com.ibm.db2.jcc.am.kd.a(Unknown Source)
        at com.ibm.db2.jcc.am.kd.a(Unknown Source)
..
..
Caused by: java.nio.charset.MalformedInputException: Input length = 527
        at com.ibm.db2.jcc.am.s.a(Unknown Source)
        ... 22 more
Caused by: sun.io.MalformedInputException
        at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java:105)
        ... 23 more
2018-09-10 06:01:34,879 INFO mapreduce.Job:  map 0% reduce 0%
2018-09-10 06:01:45,942 INFO mapreduce.Job:  map 100% reduce 0%
2018-09-10 06:02:02,039 INFO mapreduce.Job: Task Id : attempt_1535965915754_0038_m_000000_2, Status : FAILED
Error: java.io.IOException: SQLException in nextKeyValue
        at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1988)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: com.ibm.db2.jcc.am.SqlException: [jcc][t4][1065][12306][4.16.53] Caught java.io.CharConversionException.  See attached Throwable for details. ERRORCODE=-4220, SQLSTATE=null
        at com.ibm.db2.jcc.am.fd.a(fd.java:723)
        at com.ibm.db2.jcc.am.fd.a(fd.java:60)
        at com.ibm.db2.jcc.am.fd.a(fd.java:112)
        at com.ibm.db2.jcc.am.jc.a(jc.java:2870)
        at com.ibm.db2.jcc.am.jc.p(jc.java:527)
        at com.ibm.db2.jcc.am.jc.N(jc.java:1563)
        at com.ibm.db2.jcc.am.ResultSet.getStringX(ResultSet.java:1153)
        at com.ibm.db2.jcc.am.ResultSet.getString(ResultSet.java:1128)
        at org.apache.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:71)
        at com.cloudera.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:61)
        at PC_KPI_PC_INCIDENT_CFIUS_CONSTRAINED.readFields(PC_KPI_PC_INCIDENT_CFIUS_CONSTRAINED.java:197)
        at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:244)
        ... 12 more
Caused by: java.nio.charset.MalformedInputException: Input length = 574820
        at com.ibm.db2.jcc.am.r.a(r.java:19)
        at com.ibm.db2.jcc.am.jc.a(jc.java:2862)
        ... 20 more
Caused by: sun.io.MalformedInputException
        at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java:167)
        at com.ibm.db2.jcc.am.r.a(r.java:16)
        ... 21 more

 

 

 

解決辦法:

需要在yarn的mapred-site.xml文件中添加如下配置:

<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1024m -Ddb2.jcc.charsetDecoderEncoder=3</value>
</property>

http://www-01.ibm.com/support/docview.wss?uid=swg21684365


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM