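How to snapshot a table that already contains data after adding it to table.include.list on a running connector
> Development environment: Debezium 1.3.Final
As the title says, the parameter to introduce here is "snapshot.new.tables". It is a slightly odd one, because it has been deliberately left out of the official documentation. The explanation given in the official issue tracker is at https://issues.redhat.com/browse/DBZ-1977
An example configuration: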
{
  "connector.class": "io.debezium.connector.mysql.MySqlConnector",
  "database.user": "debezium_mysql",
  "tasks.max": "1",
  "database.history.kafka.bootstrap.servers": "cdh04:9092,cdh05:9092,cdh06:9092",
  "database.history.kafka.topic": "ninth_studio_connector_history",
  "database.history.kafka.recovery.poll.interval.ms": "5000",
  "database.server.name": "ninth_studio_connector",
  "database.port": "3306",
  "tombstones.on.delete": "false",
  "snapshot.new.tables": "parallel",
  "database.hostname": "common.mysql.test.local",
  "database.password": "********",
  "database.serverTimezone": "UTC",
  "table.include.list": "ninth_studio.wehub_action_logs,ninth_studio.ab_py_assistant_binds",
  "database.include.list": "ninth_studio"
}
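As a rough sketch of the change described here, the Python snippet below takes an existing connector configuration, appends one table to table.include.list, and sets snapshot.new.tables. The helper name `add_table_for_snapshot` is made up for illustration; only the property names and table names come from the configuration in this post.

```python
def add_table_for_snapshot(config, table, mode="parallel"):
    """Return a copy of a connector config with `table` added to table.include.list."""
    updated = dict(config)  # leave the caller's dict untouched
    current = updated.get("table.include.list", "")
    tables = [t.strip() for t in current.split(",") if t.strip()]
    if table not in tables:
        tables.append(table)
    updated["table.include.list"] = ",".join(tables)
    # Ask the connector to snapshot tables that were added after the first start.
    updated["snapshot.new.tables"] = mode
    return updated


# A trimmed-down config; a real one carries all the other properties as well.
config = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "table.include.list": "ninth_studio.wehub_action_logs",
}
new_config = add_table_for_snapshot(config, "ninth_studio.ab_py_assistant_binds")
print(new_config["table.include.list"])
```

Returning a copy rather than mutating the input keeps the previous configuration around, which is handy if the change has to be rolled back.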
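P.S.: In version 1.3, after setting snapshot.mode = when_needed and snapshot.include.collection.list, the connector occasionally stops picking up changes. It only resumes reading after the configuration is updated. Any change will do, since the point is to make Connect restart the connector; a plain restart through the REST API has no effect.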
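One way to push such a configuration update is Kafka Connect's `PUT /connectors/{name}/config` endpoint. The sketch below only builds the request with the standard library and does not send it. The worker address `http://localhost:8083`, the connector name, and the helper name `build_config_update` are assumptions for illustration.

```python
import json
import urllib.request


def build_config_update(base_url, connector, config):
    """Build (but do not send) a PUT request that replaces the connector's config."""
    url = f"{base_url.rstrip('/')}/connectors/{connector}/config"
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_config_update(
    "http://localhost:8083",           # assumed Connect worker address
    "ninth_studio_connector",          # assumed connector name
    {"snapshot.mode": "when_needed"},  # shortened; send the complete config in practice
)
# urllib.request.urlopen(request)  # uncomment to actually send the update
```

This endpoint replaces the whole configuration, so the body should be the full property map with the one changed value, not just the changed property.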
Time zone issues
Reference for the settings: https://my.oschina.net/dacoolbaby/blog/3096451
This author has written up many Debezium pitfalls, and a lot of it is worth borrowing.
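Decimal data type conversion
Reference: https://blog.csdn.net/u012551524/article/details/83546765
With the default decimal.handling.mode, precise, a decimal value shows up as something like "F3A=". Setting the mode to double makes it display as a normal number.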
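The odd-looking string is not corruption. In precise mode the value travels as Kafka Connect's Decimal logical type, that is, the unscaled integer as big-endian two's-complement bytes, which a JSON converter renders as base64. The sketch below decodes such a string on the consumer side. The scale is normally read from the field's schema parameters; the value 2 used here is only an assumption, and `decode_precise_decimal` is a hypothetical helper name.

```python
import base64
from decimal import Decimal


def decode_precise_decimal(encoded, scale):
    """Decode a base64 string carrying an unscaled big-endian integer into a Decimal."""
    raw = base64.b64decode(encoded)
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Shift the decimal point left by `scale` digits.
    return Decimal(unscaled).scaleb(-scale)


# "F3A=" is the bytes 0x17 0x70, i.e. the unscaled integer 6000.
print(decode_precise_decimal("F3A=", 2))  # 60.00 with the assumed scale of 2
```

Decoding like this keeps exact precision, whereas switching to double trades exactness for readability.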
Exception caused by expired binlog files
The exception looks like this:
Connector requires binlog file 'mysql-bin.000003', but MySQL only has mysql-bin.000041, mysql-bin.000042 (io.debezium.connector.mysql.MySqlConnectorTask:330)
[2018-10-03 12:48:43,752] INFO Stopping MySQL connector task (io.debezium.connector.mysql.MySqlConnectorTask:245)
[2018-10-03 12:48:43,752] INFO WorkerSourceTask{id=debezium-mysql-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSourceTask:397)
[2018-10-03 12:48:43,752] INFO WorkerSourceTask{id=debezium-mysql-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask:414)
[2018-10-03 12:48:43,753] ERROR WorkerSourceTask{id=debezium-mysql-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:177)
org.apache.kafka.connect.errors.ConnectException: The connector is trying to read binlog starting at binlog file 'mysql-bin.000003', pos=715349422, skipping 982 events plus 40 rows, but this is no longer available on the server. Reconfigure the connector to use a snapshot when needed.
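The simplest fix in this situation is to set "snapshot.mode" to when_needed. To avoid the problem at the root, consider raising MySQL's binlog expiration setting, expire_logs_days (on MySQL 8.0 the corresponding variable is binlog_expire_logs_seconds), so that binlog files are kept longer than the connector is ever likely to be down.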
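To see in advance whether a connector is about to hit this error, the binlog file recorded in its offset can be compared with the files the server still has, for instance the names returned by `SHOW BINARY LOGS`. The snippet below is a minimal sketch of that comparison using the file names from the error message; `binlog_still_available` is a hypothetical helper, and fetching the two inputs is left out.

```python
def binlog_still_available(required_file, server_files):
    """Return True if the binlog file the connector wants is still on the server."""
    return required_file in set(server_files)


# File names taken from the error message in this post.
server_files = ["mysql-bin.000041", "mysql-bin.000042"]
print(binlog_still_available("mysql-bin.000003", server_files))  # False
```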
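P.S.: Sometimes no exception is thrown at all, and the log just keeps printing `Couldn't commit processed log positions with the source database due to a concurrent connector shutdown or restart`
In that case, try restarting the Connect service and then check the task status again.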
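Checking the task status can be scripted against Kafka Connect's `GET /connectors/{name}/status` endpoint. The sketch below parses a response of that shape and lists the tasks that have failed. The sample payload is hand-written for illustration (the worker address in it is invented), and `failed_tasks` is a hypothetical helper name.

```python
import json


def failed_tasks(status_payload):
    """Return the ids of tasks whose state is FAILED in a Connect status response."""
    status = json.loads(status_payload)
    return [t["id"] for t in status.get("tasks", []) if t.get("state") == "FAILED"]


# Hand-written sample shaped like a GET /connectors/{name}/status response.
sample = json.dumps({
    "name": "debezium-mysql",
    "connector": {"state": "RUNNING", "worker_id": "10.0.0.1:8083"},
    "tasks": [{"id": 0, "state": "FAILED", "worker_id": "10.0.0.1:8083"}],
})
print(failed_tasks(sample))  # [0]
```

Note that the connector itself can report RUNNING while one of its tasks has failed, so the tasks array is the part worth looking at.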