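Introduction:
DataX is an offline synchronization tool for heterogeneous data sources. It aims to provide stable, efficient data synchronization between sources such as relational databases (MySQL, Oracle, etc.), HDFS, Hive, ODPS, HBase, and FTP. GitHub: https://github.com/alibaba/DataX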
1 Things to note
DataX does not currently support MySQL 8.x out of the box; the source code needs the following changes:
- The OriginalConfPretreatmentUtil class references the parameters appended by DataBaseType. For MySQL 8, change zeroDateTimeBehavior=convertToNull to zeroDateTimeBehavior=CONVERT_TO_NULL
Before: suffix = "yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true";
After: suffix = "yearIsDateType=false&zeroDateTimeBehavior=CONVERT_TO_NULL&tinyInt1isBit=false&rewriteBatchedStatements=true";

- MySQL driver: in the pom files of the MySQL reader and writer, change the dependency to
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>8.0.11</version>
</dependency>
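- Run Maven clean install with tests skipped (for example with -Dmaven.test.skip=true)
- Copy the datax plugin directories generated under the reader's and writer's target directories into a plugin directory at the same level as bin in the core project (a source build does not create this directory, so create it first)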
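To make the first change above more concrete, here is a minimal Python sketch of how a fixed parameter suffix of this kind ends up on a JDBC URL. It is only an illustration, not DataX's Java code: the function name append_jdbc_suffix and the sample URL are assumptions, and the separator logic is the usual "? or &" rule rather than something taken from the DataX source.

```python
# Illustrative only: DataX itself does this in Java. The helper name and the
# sample URL below are hypothetical.
MYSQL8_SUFFIX = (
    "yearIsDateType=false&zeroDateTimeBehavior=CONVERT_TO_NULL"
    "&tinyInt1isBit=false&rewriteBatchedStatements=true"
)


def append_jdbc_suffix(jdbc_url: str, suffix: str = MYSQL8_SUFFIX) -> str:
    """Append the suffix with '&' if the URL already has a query string, else '?'."""
    separator = "&" if "?" in jdbc_url else "?"
    return jdbc_url + separator + suffix


if __name__ == "__main__":
    # Placeholder host and database, not values from a real deployment.
    print(append_jdbc_suffix("jdbc:mysql://127.0.0.1:3306/test1?serverTimezone=UTC"))
```

Because the suffix is added to every MySQL URL, the old convertToNull spelling would reach the 8.x driver on each connection, which is why the constant has to change.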
2 Usage
- Directory layout

- JSON template
{
    "job": {
        "setting": {
            "speed": {
                "byte": 10485760
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "column": [
                            {
                                "value": "DataX",
                                "type": "string"
                            },
                            {
                                "value": 19890604,
                                "type": "long"
                            },
                            {
                                "value": "1989-06-04 00:00:00",
                                "type": "date"
                            },
                            {
                                "value": true,
                                "type": "bool"
                            },
                            {
                                "value": "test",
                                "type": "bytes"
                            }
                        ],
                        "sliceRecordCount": 100000
                    }
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {
                        "print": false,
                        "encoding": "UTF-8"
                    }
                }
            }
        ]
    }
}
- MySQL example JSON (more examples can be found on the project's GitHub page)
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "column": ["id", "name"],
                        "where": "id>0",
                        "connection": [
                            {
                                "table": [
                                    "user"
                                ],
                                "jdbcUrl": [
                                    "jdbc:mysql://47.101.137.97:3306/test1?serverTimezone=UTC"
                                ]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "column": ["id", "name"],
                        "connection": [
                            {
                                "table": [
                                    "user"
                                ],
                                "jdbcUrl": "jdbc:mysql://47.101.137.97:3306/test2?serverTimezone=UTC"
                            }
                        ]
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": 1,
                "byte": 104857600
            },
            "errorLimit": {
                "record": 10,
                "percentage": 0.05
            }
        }
    }
}
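When several tables need the same kind of job, the JSON above can be generated rather than written by hand. The following is a rough Python sketch under a few assumptions: the function name build_mysql_job, its parameters, and the placeholder host and credentials are all hypothetical. It reproduces the shape of the example, including the detail that the reader's jdbcUrl is a list while the writer's is a single string.

```python
import json


def build_mysql_job(src_url, dst_url, table, columns, username, password,
                    where=None, channel=1):
    """Return a dict shaped like the MySQL-to-MySQL job example (hypothetical helper)."""
    reader_param = {
        "username": username,
        "password": password,
        "column": list(columns),
        # The reader takes a list of JDBC URLs.
        "connection": [{"table": [table], "jdbcUrl": [src_url]}],
    }
    if where:
        reader_param["where"] = where
    writer_param = {
        "username": username,
        "password": password,
        "column": list(columns),
        # The writer takes a single JDBC URL string.
        "connection": [{"table": [table], "jdbcUrl": dst_url}],
    }
    return {
        "job": {
            "content": [{
                "reader": {"name": "mysqlreader", "parameter": reader_param},
                "writer": {"name": "mysqlwriter", "parameter": writer_param},
            }],
            "setting": {"speed": {"channel": channel}},
        }
    }


if __name__ == "__main__":
    # Placeholder connection details; replace with real ones before use.
    job = build_mysql_job(
        "jdbc:mysql://127.0.0.1:3306/test1?serverTimezone=UTC",
        "jdbc:mysql://127.0.0.1:3306/test2?serverTimezone=UTC",
        table="user", columns=["id", "name"],
        username="root", password="change-me", where="id>0",
    )
    print(json.dumps(job, indent=4))
```

The printed JSON can be saved as a job file such as test.json under the job directory. The sketch leaves out errorLimit and the byte speed limit for brevity; they can be added to the setting dict in the same way.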
- Run
Go to DataX's bin directory (e.g. /Users/xuzhihui/test/backend/DataX-master/core/target/datax/bin), then run
python datax.py ../job/test.json
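If the job needs to be started from another script (a scheduler, for instance), the same command can be wrapped in Python. This is a small sketch, not part of DataX: build_datax_command and run_job are hypothetical names, and the interpreter is left as a parameter because which Python version datax.py expects may depend on the DataX release.

```python
import subprocess
from pathlib import Path


def build_datax_command(datax_home, job_file, python_exe="python"):
    """Build the argument list equivalent to: python datax.py <job file>."""
    launcher = Path(datax_home) / "bin" / "datax.py"
    return [python_exe, str(launcher), str(job_file)]


def run_job(datax_home, job_file, python_exe="python"):
    """Run the job and return the launcher's exit code (0 normally means success)."""
    cmd = build_datax_command(datax_home, job_file, python_exe)
    completed = subprocess.run(cmd)
    return completed.returncode


# Example call (the paths are placeholders):
# exit_code = run_job("/path/to/datax", "/path/to/datax/job/test.json")
```

Passing the arguments as a list avoids shell quoting problems when the job path contains spaces.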
- Result

