1 Maxwell
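Maxwell is an open-source tool from the US company Zendesk, written in Java, that captures MySQL changes in real time. Like Canal, it works by reading the binlog.
1.1 Tool comparison
1 Maxwell has no server+client mode like Canal. It runs as a single server that sends data to a message queue or to Redis.
2 Canal can only capture new changes and has no way to handle historical data that already exists. Maxwell's standout feature is bootstrap, which exports the complete historical data for initialization.
3 Maxwell does not support HA out of the box, but it does support resuming from a checkpoint: once the error is fixed and Maxwell is restarted, it continues reading from the last recorded position.
4 Maxwell only outputs JSON, whereas Canal in server+client mode lets you define a custom format.
5 Maxwell is more lightweight than Canal.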
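To make point 2 concrete, here is a minimal Python sketch of how a consumer might separate bootstrapped (historical) rows from live changes. It relies on the event types bootstrap-start, bootstrap-insert and bootstrap-complete described in the Maxwell documentation. The sample messages and the function name split_bootstrap are hypothetical and only illustrate the shape of the data.

```python
import json

# Event types emitted while a table is being bootstrapped.
BOOTSTRAP_TYPES = {"bootstrap-start", "bootstrap-insert", "bootstrap-complete"}

def split_bootstrap(lines):
    """Separate bootstrapped rows (historical data) from live binlog events."""
    historical, live = [], []
    for line in lines:
        event = json.loads(line)
        kind = event["type"]
        if kind == "bootstrap-insert":
            historical.append(event["data"])
        elif kind not in BOOTSTRAP_TYPES:
            live.append(event)
        # bootstrap-start / bootstrap-complete are markers without row data.
    return historical, live

# Hypothetical sample messages, not captured from a real run.
sample = [
    '{"database":"gmall-2020-04","table":"z_user_info","type":"bootstrap-start","ts":1589385000,"data":{}}',
    '{"database":"gmall-2020-04","table":"z_user_info","type":"bootstrap-insert","ts":1589385000,"data":{"id":1,"user_name":"a"}}',
    '{"database":"gmall-2020-04","table":"z_user_info","type":"bootstrap-insert","ts":1589385000,"data":{"id":2,"user_name":"b"}}',
    '{"database":"gmall-2020-04","table":"z_user_info","type":"bootstrap-complete","ts":1589385000,"data":{}}',
    '{"database":"gmall-2020-04","table":"z_user_info","type":"insert","ts":1589385314,"xid":82982,"commit":true,"data":{"id":3,"user_name":"c"}}',
]

historical, live = split_bootstrap(sample)
```

Here historical holds the two bootstrapped rows and live holds the single regular insert event.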
1.2 Installing Maxwell
Extract maxwell-1.25.0.tar.gz into a directory of your choice.
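1.3 Preparation before use
Create a maxwell database in MySQL to store Maxwell's metadata.
CREATE DATABASE maxwell;
Create an account that can operate on this database. (The IDENTIFIED BY clause inside GRANT works on MySQL 5.7 and earlier; on MySQL 8.0, create the user with CREATE USER first.)
GRANT ALL PRIVILEGES ON maxwell.* TO 'maxwell'@'%' IDENTIFIED BY '123123';
Grant this account the privileges it needs to monitor the other databases.
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'maxwell'@'%';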
1.4 Using Maxwell to capture MySQL data
Create a configuration file. It can live anywhere, as long as --config in the start command points to it; here it is /opt/module/maxwell/config.properties.
producer=kafka
kafka.bootstrap.servers=hadoop1:9092,hadoop2:9092,hadoop3:9092
kafka_topic=ODS_DB_GMALL2020_M
host=hadoop2
user=maxwell
password=123123
client_id=maxwell_1
Start Maxwell:
/opt/module/maxwell/bin/maxwell --config /opt/module/maxwell/config.properties >/dev/null 2>&1 &
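Because the start command discards all output, a typo in the configuration file is easy to miss. The following Python sketch checks the file contents before starting Maxwell. The parser handles only plain key=value lines, and the REQUIRED tuple is an assumption made for this walkthrough, not a list taken from the Maxwell documentation.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

# Assumption: the keys this walkthrough depends on.
REQUIRED = ("producer", "kafka.bootstrap.servers", "kafka_topic", "host", "user", "password")

CONFIG = """\
producer=kafka
kafka.bootstrap.servers=hadoop1:9092,hadoop2:9092,hadoop3:9092
kafka_topic=ODS_DB_GMALL2020_M
host=hadoop2
user=maxwell
password=123123
client_id=maxwell_1
"""

props = parse_properties(CONFIG)
missing = [key for key in REQUIRED if key not in props]
brokers = props["kafka.bootstrap.servers"].split(",")
```

In practice CONFIG would be read from the file passed to --config; an empty missing list means every key used in this walkthrough is present.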
1.5 Modify or insert MySQL data and consume from Kafka to observe the output
First create the Kafka topic:
/ext/kafka_2.11-1.0.0/bin/kafka-topics.sh --create --topic ODS_DB_GMALL2020_M --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181 --partitions 12 --replication-factor 1
Run a test statement:
INSERT INTO z_user_info VALUES(30,'zhang3','13810001010'),(31,'li4','1389999999');
Compare the output of the two tools.
canal:
{"data":[{"id":"30","user_name":"zhang3","tel":"13810001010"},{"id":"31","user_name":"li4","tel":"1389999999"}],"database":"gmall-2020-04","es":1589385314000,"id":2,"isDdl":false,"mysqlType":{"id":"bigint(20)","user_name":"varchar(20)","tel":"varchar(20)"},"old":null,"pkNames":["id"],"sql":"","sqlType":{"id":-5,"user_name":12,"tel":12},"table":"z_user_info","ts":1589385314116,"type":"INSERT"}
maxwell:
{"database":"gmall-2020-04","table":"z_user_info","type":"insert","ts":1589385314,"xid":82982,"xoffset":0,"data":{"id":30,"user_name":"zhang3","tel":"13810001010"}}
{"database":"gmall-2020-04","table":"z_user_info","type":"insert","ts":1589385314,"xid":82982,"commit":true,"data":{"id":31,"user_name":"li4","tel":"1389999999"}}
Run an update:
UPDATE z_user_info SET user_name='wang55' WHERE id IN(30,31);
canal:
{"data":[{"id":"30","user_name":"wang55","tel":"13810001010"},{"id":"31","user_name":"wang55","tel":"1389999999"}],"database":"gmall-2020-04","es":1589385508000,"id":3,"isDdl":false,"mysqlType":{"id":"bigint(20)","user_name":"varchar(20)","tel":"varchar(20)"},"old":[{"user_name":"zhang3"},{"user_name":"li4"}],"pkNames":["id"],"sql":"","sqlType":{"id":-5,"user_name":12,"tel":12},"table":"z_user_info","ts":1589385508676,"type":"UPDATE"}
maxwell:
{"database":"gmall-2020-04","table":"z_user_info","type":"update","ts":1589385508,"xid":83206,"xoffset":0,"data":{"id":30,"user_name":"wang55","tel":"13810001010"},"old":{"user_name":"zhang3"}}
{"database":"gmall-2020-04","table":"z_user_info","type":"update","ts":1589385508,"xid":83206,"commit":true,"data":{"id":31,"user_name":"wang55","tel":"1389999999"},"old":{"user_name":"li4"}}
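In the Maxwell update records above, data holds the full row after the change, while old holds only the columns whose values changed. A minimal Python sketch for rebuilding the row as it was before the update follows; the function name before_image is made up for this example.

```python
import json

def before_image(event):
    """Rebuild the pre-update row by overlaying "old" on top of "data"."""
    if event["type"] != "update":
        raise ValueError("before_image expects an update event")
    # "old" only lists changed columns, so unchanged ones come from "data".
    return {**event["data"], **event.get("old", {})}

# First Maxwell update record from the output above.
update = json.loads(
    '{"database":"gmall-2020-04","table":"z_user_info","type":"update",'
    '"ts":1589385508,"xid":83206,"xoffset":0,'
    '"data":{"id":30,"user_name":"wang55","tel":"13810001010"},'
    '"old":{"user_name":"zhang3"}}'
)

previous = before_image(update)
```

For this record previous contains user_name zhang3 together with the unchanged id and tel values.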
Run a delete:
DELETE FROM z_user_info WHERE id IN(30,31);
canal:
{"data":[{"id":"30","user_name":"wang55","tel":"13810001010"},{"id":"31","user_name":"wang55","tel":"1389999999"}],"database":"gmall-2020-04","es":1589385644000,"id":4,"isDdl":false,"mysqlType":{"id":"bigint(20)","user_name":"varchar(20)","tel":"varchar(20)"},"old":null,"pkNames":["id"],"sql":"","sqlType":{"id":-5,"user_name":12,"tel":12},"table":"z_user_info","ts":1589385644829,"type":"DELETE"}
maxwell:
{"database":"gmall-2020-04","table":"z_user_info","type":"delete","ts":1589385644,"xid":83367,"xoffset":0,"data":{"id":30,"user_name":"wang55","tel":"13810001010"}}
{"database":"gmall-2020-04","table":"z_user_info","type":"delete","ts":1589385644,"xid":83367,"commit":true,"data":{"id":31,"user_name":"wang55","tel":"1389999999"}}
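The three Maxwell event types seen so far are enough to keep a downstream copy of the table in sync. The Python sketch below replays trimmed copies of the events above against an in-memory dict. It assumes the primary key column is id and that updates never change it; apply_event is a hypothetical helper, not part of Maxwell.

```python
def apply_event(table, event, key="id"):
    """Apply one Maxwell row event to an in-memory copy of the table.

    Assumes `key` is the primary key column and is never modified by an update.
    """
    row = event["data"]
    kind = event["type"]
    if kind in ("insert", "update"):
        table[row[key]] = row        # "data" is the full current row
    elif kind == "delete":
        table.pop(row[key], None)    # "data" is the row that was removed
    return table

# Trimmed copies of the events above; only the first delete is replayed
# so that one row remains visible at the end.
events = [
    {"type": "insert", "data": {"id": 30, "user_name": "zhang3", "tel": "13810001010"}},
    {"type": "insert", "data": {"id": 31, "user_name": "li4", "tel": "1389999999"}},
    {"type": "update", "data": {"id": 30, "user_name": "wang55", "tel": "13810001010"},
     "old": {"user_name": "zhang3"}},
    {"type": "update", "data": {"id": 31, "user_name": "wang55", "tel": "1389999999"},
     "old": {"user_name": "li4"}},
    {"type": "delete", "data": {"id": 30, "user_name": "wang55", "tel": "13810001010"}},
]

replica = {}
for event in events:
    apply_event(replica, event)
```

After the replay, replica contains only row 31 with its updated user_name.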
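Summary of the data characteristics:
(1) Log structure
canal: each SQL statement produces one record. If the statement affects multiple rows, they are collected into an array inside that record (the data field is an array even when only one row is affected).
maxwell: produces one record per affected row. To tell whether several records came from the same statement, compare their xid: records with the same xid belong to the same transaction, which for an autocommit statement means the same SQL statement. In the samples above, the last record of each transaction also carries "commit":true.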
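A consumer that wants Canal-like batches from Maxwell output can regroup the per-row records by xid. The following Python sketch closes a batch when it sees "commit": true. It assumes every record carries an xid and uses trimmed copies of the insert and update records above; group_by_transaction is a made-up name.

```python
def group_by_transaction(events):
    """Yield one list of row events per transaction, in commit order."""
    pending = {}
    for event in events:
        xid = event["xid"]
        pending.setdefault(xid, []).append(event)
        if event.get("commit"):
            # The last row of a transaction is flagged with commit=true.
            yield pending.pop(xid)

# Trimmed copies of the Maxwell records shown earlier.
events = [
    {"type": "insert", "xid": 82982, "xoffset": 0, "data": {"id": 30, "user_name": "zhang3"}},
    {"type": "insert", "xid": 82982, "commit": True, "data": {"id": 31, "user_name": "li4"}},
    {"type": "update", "xid": 83206, "xoffset": 0, "data": {"id": 30, "user_name": "wang55"}},
    {"type": "update", "xid": 83206, "commit": True, "data": {"id": 31, "user_name": "wang55"}},
]

batches = list(group_by_transaction(events))
ids_per_batch = [[e["data"]["id"] for e in batch] for batch in batches]
```

Each batch here corresponds to one of the two statements that were executed, with both affected rows inside it.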
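(2) Numeric types
maxwell: when the source column is numeric, the original type is preserved and the value is emitted as a JSON number, without quotes.
canal: every value is converted to a string.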
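Since Canal ships the column types alongside the string values, a consumer can restore numeric types itself. Below is a minimal Python sketch that casts values based on mysqlType. The type sets cover only integer and floating-point columns and are an assumption of this example; other types, such as decimal or date columns, are left as strings.

```python
INT_TYPES = {"tinyint", "smallint", "mediumint", "int", "integer", "bigint"}
FLOAT_TYPES = {"float", "double"}

def cast_value(value, mysql_type):
    """Convert a Canal string value to int/float according to its MySQL type."""
    if value is None:
        return None
    # "bigint(20) unsigned" -> "bigint"
    base = mysql_type.split("(")[0].split()[0].lower()
    if base in INT_TYPES:
        return int(value)
    if base in FLOAT_TYPES:
        return float(value)
    return value

def cast_rows(message):
    """Return the rows of a Canal message with numeric columns cast."""
    types = message["mysqlType"]
    return [
        {col: cast_value(val, types[col]) for col, val in row.items()}
        for row in message["data"]
    ]

# The Canal insert message from above, trimmed to the fields used here.
canal_insert = {
    "data": [
        {"id": "30", "user_name": "zhang3", "tel": "13810001010"},
        {"id": "31", "user_name": "li4", "tel": "1389999999"},
    ],
    "mysqlType": {"id": "bigint(20)", "user_name": "varchar(20)", "tel": "varchar(20)"},
}

typed = cast_rows(canal_insert)
```

After casting, id is an integer as in the Maxwell output, while tel stays a string because its column is varchar(20).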
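(3) Source column definitions
canal: each message carries the table schema (mysqlType, sqlType, pkNames). maxwell: messages are more compact and carry no schema.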
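If downstream code is written against the Maxwell shape, a Canal message can be flattened into per-row records with the schema fields dropped. This Python sketch shows one possible mapping. Using es divided by 1000 as ts is an assumption based on the samples above, where the two values line up; canal_to_rows is a hypothetical helper, and values stay as strings.

```python
def canal_to_rows(message):
    """Flatten one Canal message into Maxwell-style per-row records."""
    rows = message.get("data") or []
    olds = message.get("old") or [None] * len(rows)
    records = []
    for row, old in zip(rows, olds):
        record = {
            "database": message["database"],
            "table": message["table"],
            "type": message["type"].lower(),
            "ts": message["es"] // 1000,  # milliseconds -> seconds (assumption)
            "data": row,
        }
        if old is not None:
            record["old"] = old
        records.append(record)
    return records

# The Canal update message from above, trimmed to the fields used here.
canal_update = {
    "data": [
        {"id": "30", "user_name": "wang55", "tel": "13810001010"},
        {"id": "31", "user_name": "wang55", "tel": "1389999999"},
    ],
    "database": "gmall-2020-04",
    "es": 1589385508000,
    "mysqlType": {"id": "bigint(20)", "user_name": "varchar(20)", "tel": "varchar(20)"},
    "old": [{"user_name": "zhang3"}, {"user_name": "li4"}],
    "pkNames": ["id"],
    "table": "z_user_info",
    "type": "UPDATE",
}

records = canal_to_rows(canal_update)
```

The result is two records, one per affected row, without mysqlType or pkNames. There is no xid in a Canal message, so that field cannot be reproduced.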