Before learning about sqoop job, it is best to first learn the basic sqoop import/export commands:
sqoop import: load data from MySQL into Hive
sqoop import: load data from MySQL into HDFS
sqoop export: export data from Hive to MySQL
sqoop job
A sqoop job saves a command and its parameter configuration under a name, so the same task can be invoked again later without retyping it.
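Before walking through a concrete example, here is a sketch of the `sqoop job` subcommands we will use below (Sqoop 1.x CLI; the bare `--` separates the job tool's own options from the saved tool and its arguments):

```shell
# General shape of the sqoop job tool (Sqoop 1.x).
# Everything after the standalone "--" is the tool to save plus its arguments.
sqoop job --create <job-name> -- <tool> [tool-arguments]   # save a job
sqoop job --list                                           # list saved jobs
sqoop job --show <job-name>                                # print a job's saved options
sqoop job --exec <job-name>                                # run a saved job
sqoop job --delete <job-name>                              # remove a saved job
```

These are command fragments that require a live Sqoop installation, so they are shown for reference rather than as a runnable script.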
Next, let's build a job that imports from MySQL into Hive.
- Create a table in MySQL, named sqoop_job
CREATE TABLE `sqoop_job` (
  `id` int(11) DEFAULT NULL,
  `name` varchar(255) DEFAULT NULL,
  `jobname` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
- Insert test data into sqoop_job
insert into sqoop_job values(1,"name1","jobname1");
insert into sqoop_job values(2,"name2","jobname2");
insert into sqoop_job values(3,"name3","jobname3");
- Sync the MySQL table structure to Hive
sqoop create-hive-table --connect jdbc:mysql://localhost:3306/sqooptest --username root --password 123qwe --table sqoop_job --hive-table sqoop_job --fields-terminated-by ','
- Create a sqoop job for the import task
sqoop job --create sqoopimport1 -- import --connect jdbc:mysql://localhost:3306/sqooptest --username root --password 123qwe --table sqoop_job --hive-import --hive-table sqoop_job --fields-terminated-by ',' -m 1
Note the space between `--` and `import`: it is required, since the bare `--` marks where the saved tool's arguments begin. Once the job is created, list the current jobs with:
sqoop job --list
Sqoop can also display the saved parameter configuration of a job,
using the command sqoop job --show jobname
EFdeMacBook-Pro:sbin FengZhen$ sqoop job --show sqoopimport1
Job: sqoopimport1
Tool: import
Options:
----------------------------
verbose = false
db.connect.string = jdbc:mysql://localhost:3306/sqooptest
codegen.output.delimiters.escape = 0
codegen.output.delimiters.enclose.required = false
codegen.input.delimiters.field = 0
hbase.create.table = false
db.require.password = true
hdfs.append.dir = false
db.table = sqoop_job
codegen.input.delimiters.escape = 0
import.fetch.size = null
accumulo.create.table = false
codegen.input.delimiters.enclose.required = false
db.username = root
reset.onemapper = false
codegen.output.delimiters.record = 10
import.max.inline.lob.size = 16777216
hbase.bulk.load.enabled = false
hcatalog.create.table = false
db.clear.staging.table = false
codegen.input.delimiters.record = 0
enable.compression = false
hive.overwrite.table = false
hive.import = true
codegen.input.delimiters.enclose = 0
hive.table.name = sqoop_job
accumulo.batch.size = 10240000
hive.drop.delims = false
codegen.output.delimiters.enclose = 0
hdfs.delete-target.dir = false
codegen.output.dir = .
codegen.auto.compile.dir = true
relaxed.isolation = false
mapreduce.num.mappers = 1
accumulo.max.latency = 5000
import.direct.split.size = 0
codegen.output.delimiters.field = 44
export.new.update = UpdateOnly
incremental.mode = None
hdfs.file.format = TextFile
codegen.compile.dir = /tmp/sqoop-FengZhen/compile/546e29b092f451585b5c8547b3e9985e
direct.import = false
hive.fail.table.exists = false
db.batch = false
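Note `db.require.password = true` in the output above: by default the local metastore does not record the database password, so `sqoop job --exec` will prompt for it interactively. If the security trade-off is acceptable (the password is then stored on disk), the usual workaround is to enable password recording in conf/sqoop-site.xml before creating the job:

```xml
<!-- conf/sqoop-site.xml: let saved jobs store the DB password
     so that "sqoop job --exec" does not prompt for it. -->
<property>
  <name>sqoop.metastore.client.record.password</name>
  <value>true</value>
</property>
```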
- Execute the job
sqoop job --exec sqoopimport1
After it completes successfully, verify the data in the Hive table:
hive> select * from sqoop_job;
OK
1	name1	jobname1
2	name2	jobname2
3	name3	jobname3
Time taken: 1.618 seconds, Fetched: 3 row(s)
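Saved jobs really pay off with incremental imports, because Sqoop updates the stored `--last-value` in the job's metadata after every successful run. A hedged sketch reusing the same table (the job name `sqoopincr1` and the target directory are made up for illustration; this imports to HDFS rather than Hive, and assumes `id` only ever grows):

```shell
# Hypothetical incremental variant of the job above: each --exec imports
# only rows whose id is greater than the last value Sqoop recorded.
sqoop job --create sqoopincr1 -- import \
  --connect jdbc:mysql://localhost:3306/sqooptest \
  --username root --password 123qwe \
  --table sqoop_job \
  --incremental append \
  --check-column id \
  --last-value 0 \
  --target-dir /user/FengZhen/sqoop_job_incr \
  -m 1

# Run it repeatedly; Sqoop remembers the high-water mark between runs.
sqoop job --exec sqoopincr1
```

Like the other commands in this post, this fragment needs a live Sqoop/Hadoop setup to actually run.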
Done.

