1. Install Sqoop
See http://www.cnblogs.com/Richardzhu/p/3322635.html
Add the SQOOP_HOME-related environment variables, then apply them with source ~/.bashrc or source /etc/profile.
Run sqoop help to check the installation; if no error is reported, Sqoop is installed correctly.
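A minimal sketch of the variables to add, assuming Sqoop is unpacked under /usr/local/sqoop (the path is illustrative):
# appended to ~/.bashrc or /etc/profile
export SQOOP_HOME=/usr/local/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
# apply and verify
source ~/.bashrc
sqoop help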
2. Transferring data between systems
MySQL to HBase
sqoop import --connect jdbc:mysql://54.0.88.53:3306/chen --username root --password password --table hivetest --hbase-create-table --hbase-table test --column-family tbl_name --hbase-row-key tbl_type
--hbase-row-key specifies which column of the source table is used as the rowkey of the new HBase table; --column-family names the column family that holds all the remaining columns.
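To spot-check the result, the new table can be scanned from the HBase shell (a quick verification, not part of the original steps; the LIMIT value is arbitrary):
echo "scan 'test', {LIMIT => 5}" | hbase shell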
MySQL to Hive
Copy the table structure:
sqoop create-hive-table --connect jdbc:mysql://54.0.88.53:3306/chen --table hivetest --username root --password password --hive-table hivetest
Import the data (no conflict if the table already exists; it is created if it does not).
Note: running the import multiple times appends (incrementally loads) data into Hive.
sqoop import --connect jdbc:mysql://54.0.88.53:3306/chen --username root --password password --table hivetest --hive-import
sqoop import --connect 'jdbc:sqlserver://192.168.1.80;username=test;password=test;database=ba' --table=monthly_list_cdr_ac --hive-import -m 14 --hive-table monthly_list_cdr_ac --split-by day_date --hive-partition-key dt --hive-partition-value 20130531
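If you want to pull only new rows on each run instead of re-importing everything, Sqoop's incremental append mode can be used; a sketch assuming the source table has a monotonically increasing column named id (the column name, last value, and target dir are illustrative):
sqoop import --connect jdbc:mysql://54.0.88.53:3306/chen --username root --password password --table hivetest --target-dir /apps/hive/warehouse/hivetest --incremental append --check-column id --last-value 10000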
Hive to MySQL (same as exporting from HDFS)
Note: without a primary key, running the export multiple times keeps appending (duplicate) rows into MySQL.
sqoop export --connect jdbc:mysql://54.0.88.53:3306/chen --username root --password password --table detail3 --export-dir /apps/hive/warehouse/detail3 --input-fields-terminated-by '\|'
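If the MySQL table does have a primary or unique key and you want updates instead of duplicates, Sqoop's update mode can be used; a sketch assuming detail3 has a key column named id (the column name is illustrative):
sqoop export --connect jdbc:mysql://54.0.88.53:3306/chen --username root --password password --table detail3 --export-dir /apps/hive/warehouse/detail3 --input-fields-terminated-by '\|' --update-key id --update-mode allowinsert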
Connect to MySQL and list the tables in a database:
sqoop list-tables --connect jdbc:mysql://localhost:3306/chen --username root --password password
sqoop import --connect jdbc:mysql://mysqlserver_IP/databaseName --table testtable -m 1
sqoop import --connect jdbc:mysql://10.233.45.104:3306/test --username root --password root --table testa --hive-import -m 1
Here mysqlserver_IP is the MySQL server address, databaseName is the database name, and testtable is the table name. -m 1 runs a single map task (the default is 4 maps); this imports the table as files on HDFS.
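When running with more than one map task against a table that lacks a primary key, a split column has to be supplied explicitly; a sketch assuming testtable has a numeric column named id (illustrative):
sqoop import --connect jdbc:mysql://mysqlserver_IP/databaseName --table testtable -m 4 --split-by id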
Problem 1:
INFO mapred.JobClient: Task Id : attempt_201108051007_0010_m_000000_0, Status : FAILED
java.util.NoSuchElementException
This error means the fields Sqoop parses from the file do not line up with the columns of the MySQL table. You need to tell Sqoop the file's field delimiter so that it can split the fields correctly; Hive's default field delimiter is '\001'.
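For example, when exporting a Hive-managed table that still uses the default delimiter, the delimiter can be passed explicitly (a sketch reusing the connection and warehouse path from above):
sqoop export --connect jdbc:mysql://54.0.88.53:3306/chen --username root --password password --table hivetest --export-dir /apps/hive/warehouse/hivetest --input-fields-terminated-by '\001'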
Other data import/export
Loading a result set into MySQL
Load from a local file:
load data local inpath '/home/labs/kang/award.txt' overwrite into table award;
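load data local inpath just moves the file into the table's directory, so the target table must already exist and be declared with a delimiter that matches the file. A sketch with hypothetical columns, assuming award.txt is tab-delimited:
hive -e "CREATE TABLE IF NOT EXISTS award (id STRING, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'"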
Sqoop export: make sure the character encoding matches, and remember to delete the .java files generated in the current directory.
sqoop export --connect "jdbc:mysql://54.0.88.53:3306/mydb?useUnicode=true&characterEncoding=UTF-8" --username root --password password --table china --export-dir /apps/hive/warehouse/china --input-fields-terminated-by '|'
To load a Hive table into HBase, first concatenate the rowkey and the value:
insert overwrite table detail3 select concat(cust_no, sa_tx_dt, tx_log_no), concat(cust_no,"\|", sa_tx_dt,"\|", tx_log_no,"\|", sa_tx_tm,"\|", temp,"\|", cust_acct_no,"\|", sa_tx_crd_no,"\|", cr_tx_amt,"\|", acct_bal,"\|", f_fare,"\|", dr_cr_cod,"\|", tran_cd,"\|", tx_type,"\|", xt_op_trl,"\|", xt_op_trl2,"\|", bus_inst_no,"\|", canal,"\|", sa_op_acct_no_32,"\|", sa_op_cust_name,"\|", sa_op_bank_no,"\|", cr_cust_docag_stno,"\|", sa_otx_flg,"\|", sa_rmrk,"\|", other,"\|", tlr_no,"\|") from detail2;
drop table hbase_detail3;
-- create the external HBase-backed table
CREATE EXTERNAL TABLE hbase_detail3(key string, values string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = "values:val") TBLPROPERTIES("hbase.table.name" = "detail3");
insert overwrite table hbase_detail3 select * from detail3;
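An EXTERNAL Hive table backed by HBase expects the underlying HBase table to exist already; a sketch creating it and spot-checking the rows after the final insert (column family name taken from the mapping above):
echo "create 'detail3', 'values'" | hbase shell
echo "scan 'detail3', {LIMIT => 3}" | hbase shell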
Local file to HBase
hive -e "select * from hivetest" >> hive.csv hive.tsv hadoop fs -put hive.tsv /user/hdfs/chen hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,info:tbl_type hbase_hive /user/hdfs/chen/hive.csv hbase org.apache.hadoop.hbase.mapreduce.Driver import hbase_hive ./hive.csv