Problem scenario
Today I had a requirement to load the output of a Hive job into Oracle: the downstream consumer reads directly from an Oracle table and does not accept flat files.
The Hive data is first dumped to a local file with the following script:
[hive@hadoop101 tool]$ more hive_2_file.sh
#!/bin/sh
path=$1
query=$2
file=$3
field=$4

beeline -u 'jdbc:hive2://110.110.1.110:9000' -n username -p password -e "
insert overwrite directory '/warehouse/servpath/download/${path}'
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
    'field.delim'='${field}',
    'serialization.format'='',
    'serialization.null.format'=''
)
${query}
"

hadoop fs -getmerge /warehouse/servpath/download/${path} ${file}
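For reference, a hypothetical invocation of the script could look like the sketch below. The query, the source table dw.temp_info, and the output file path are illustrative assumptions, not the original job's values:

# export one month of data and merge it into a single tab-delimited local file
sh hive_2_file.sh \
    TEMP_INFO_20200909 \
    "select month_id, prov_id, user_id, country_stay_days from dw.temp_info where month_id = '202009'" \
    /warehouse/servpath/downlo/TEMP_INFO_20200909.txt \
    '\t'
# Note: '\t' is passed straight into the SERDEPROPERTIES clause; depending on the
# shell and Hive version you may need to pass an actual tab character instead.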
The SQL*Loader control file for the load:
load data
characterset utf8
infile '/warehouse/servpath/downlo/TEMP_INFO_20200909.txt'
append into table TEMP_INFO_20200909
fields terminated by "\t"
TRAILING NULLCOLS
(
    MONTH_ID,
    PROV_ID,
    USER_ID,
    COUNTRY_STAY_DAYS
)
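To rule out a mismatch between the data and the table definition, it can help to inspect the target table's columns first. A minimal sketch, assuming sqlplus is installed on the load host and reusing the scott/tieger@orcl connection from the load script below:

sqlplus -S scott/tieger@orcl <<'EOF'
-- show column names, types and declared lengths of the target table
DESC TEMP_INFO_20200909
EXIT
EOF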
The load is driven by this script:
#!/bin/sh
table=$1
sqlldr scott/tieger@orcl \
    control=/TEMP_INFO_20200909/${table}.ctl \
    log=/TEMP_INFO_20200909/${table}.log \
    bad=/TEMP_INFO_20200909/${table}.bad \
    rows=1100000000 \
    direct=true \
    skip_index_maintenance=TRUE
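Assuming the script above is saved as, say, sqlldr_load.sh (the file name is not given in the original), it takes the table name as its only argument:

sh sqlldr_load.sh TEMP_INFO_20200909
# afterwards the log reports how many rows were loaded or rejected
more /TEMP_INFO_20200909/TEMP_INFO_20200909.log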
To my surprise, it failed with an error:
Record 5031: Rejected - Error on table TEMP_INFO_20200909, column COUNTRY_STAY_DAYS.
Field in data file exceeds maximum length
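One way to verify how long the values in that column really are is a quick awk pass over the merged file. A sketch, assuming the tab-delimited layout above where COUNTRY_STAY_DAYS is the fourth field:

# print the length of the longest COUNTRY_STAY_DAYS value in the data file
awk -F'\t' 'length($4) > max { max = length($4) } END { print "longest COUNTRY_STAY_DAYS length:", max }' \
    /warehouse/servpath/downlo/TEMP_INFO_20200909.txt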
The values in the file, however, do not exceed the column's maximum length in the table. After searching online, the suggested fix was to modify the control file and append char(300) to the field named in the error:
load data
characterset utf8
infile '/warehouse/servpath/downlo/TEMP_INFO_20200909.txt'
append into table TEMP_INFO_20200909
fields terminated by "\t"
TRAILING NULLCOLS
(
    MONTH_ID,
    PROV_ID,
    USER_ID,
    COUNTRY_STAY_DAYS char(300)
)
After rerunning the load, the problem was gone. The reason is that SQL*Loader treats a character field as at most 255 bytes by default unless a length is declared in the control file; declaring COUNTRY_STAY_DAYS char(300) raises that per-field buffer, which is independent of the column's width in the Oracle table.