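Real-Time E-Commerce Data Warehouse (10): Data Collection (9): Database Data Collection (4): Getting Started with Maxwell and Installing It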


1  Maxwell

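Maxwell is an open-source tool from the US company Zendesk, written in Java, that captures MySQL changes in real time. Like Canal, it works by reading the binlog.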

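1.1  Tool comparison

1 Maxwell does not have Canal's server + client model. There is only a server, which sends the data to a message queue or to Redis.

2 A highlight of Maxwell: Canal can only capture new changes and has no way to handle historical data that already exists, whereas Maxwell has a bootstrap feature that can export the complete historical data for initialization, which is very handy.

3 Maxwell has no built-in HA support, but it does support resuming from a checkpoint: once the fault is fixed and Maxwell is restarted, it continues reading from the position where it stopped.

4 Maxwell only outputs JSON, whereas Canal in server + client mode lets the client define its own format.

5 Maxwell is more lightweight than Canal.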
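
As a rough illustration of point 2, a downstream consumer usually has to fold bootstrapped rows into the same code path as live changes. The sketch below assumes the record types that Maxwell's bootstrap output is documented to use (`bootstrap-start`, `bootstrap-insert`, `bootstrap-complete`); the `normalize` function is a hypothetical helper, not part of Maxwell.

```python
import json

# Record types assumed to mark the start and end of a bootstrap run;
# they carry no row data.
BOOTSTRAP_MARKERS = {"bootstrap-start", "bootstrap-complete"}


def normalize(raw: str):
    """Return (type, row) for a record that carries a row, or None for a marker."""
    record = json.loads(raw)
    kind = record["type"]
    if kind in BOOTSTRAP_MARKERS:
        return None
    if kind == "bootstrap-insert":
        # Treat a historical row exactly like a freshly inserted one.
        kind = "insert"
    return kind, record["data"]
```

With this in place, initialization and incremental processing can share one handler that only distinguishes insert, update and delete.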

1.2  Installing Maxwell

     Extract maxwell-1.25.0.tar.gz into a directory of your choice (the commands below assume /opt/module/maxwell).

1.3    Preparation before use

Create a database named maxwell in MySQL to store Maxwell's metadata.

CREATE DATABASE maxwell ;

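Then create an account that has full access to that database:

GRANT ALL PRIVILEGES ON maxwell.* TO 'maxwell'@'%' IDENTIFIED BY '123123';

Grant the same account the privileges it needs to monitor the other databases:

GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'maxwell'@'%';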

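1.4   Using Maxwell to monitor and capture MySQL data

Create a configuration file for Maxwell. It can live anywhere; the start command below expects it at /opt/module/maxwell/config.properties.

# Send change events to Kafka
producer=kafka
kafka.bootstrap.servers=hadoop1:9092,hadoop2:9092,hadoop3:9092
kafka_topic=ODS_DB_GMALL2020_M

# Connection to the MySQL instance being monitored
host=hadoop2
user=maxwell
password=123123

# Identifier of this Maxwell instance; its binlog position is stored under this id
client_id=maxwell_1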

Start Maxwell:

/opt/module/maxwell/bin/maxwell --config  /opt/module/maxwell/config.properties >/dev/null 2>&1 &

1.5   Modify or insert MySQL data and consume from Kafka to observe the result

Create the topic that Maxwell writes to:

/ext/kafka_2.11-1.0.0/bin/kafka-topics.sh --create --topic ODS_DB_GMALL2020_M --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181     --partitions 12 --replication-factor 1

Run a test statement:

INSERT INTO z_user_info VALUES(30,'zhang3','13810001010'),(31,'li4','1389999999');

Comparison of the output

Canal:

{"data":[{"id":"30","user_name":"zhang3","tel":"13810001010"},{"id":"31","user_name":"li4","tel":"1389999999"}],"database":"gmall-2020-04","es":1589385314000,"id":2,"isDdl":false,"mysqlType":{"id":"bigint(20)","user_name":"varchar(20)","tel":"varchar(20)"},"old":null,"pkNames":["id"],"sql":"","sqlType":{"id":-5,"user_name":12,"tel":12},"table":"z_user_info","ts":1589385314116,"type":"INSERT"}

Maxwell:

{"database":"gmall-2020-04","table":"z_user_info","type":"insert","ts":1589385314,"xid":82982,"xoffset":0,"data":{"id":30,"user_name":"zhang3","tel":"13810001010"}}

{"database":"gmall-2020-04","table":"z_user_info","type":"insert","ts":1589385314,"xid":82982,"commit":true,"data":{"id":31,"user_name":"li4","tel":"1389999999"}}

Run an update:

UPDATE z_user_info SET user_name='wang55' WHERE id IN(30,31);

Canal:

{"data":[{"id":"30","user_name":"wang55","tel":"13810001010"},{"id":"31","user_name":"wang55","tel":"1389999999"}],"database":"gmall-2020-04","es":1589385508000,"id":3,"isDdl":false,"mysqlType":{"id":"bigint(20)","user_name":"varchar(20)","tel":"varchar(20)"},"old":[{"user_name":"zhang3"},{"user_name":"li4"}],"pkNames":["id"],"sql":"","sqlType":{"id":-5,"user_name":12,"tel":12},"table":"z_user_info","ts":1589385508676,"type":"UPDATE"}

Maxwell:

{"database":"gmall-2020-04","table":"z_user_info","type":"update","ts":1589385508,"xid":83206,"xoffset":0,"data":{"id":30,"user_name":"wang55","tel":"13810001010"},"old":{"user_name":"zhang3"}}

{"database":"gmall-2020-04","table":"z_user_info","type":"update","ts":1589385508,"xid":83206,"commit":true,"data":{"id":31,"user_name":"wang55","tel":"1389999999"},"old":{"user_name":"li4"}}
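
In the Maxwell update records above, `data` holds the row after the update and `old` holds only the columns that changed, with their previous values. A minimal sketch of rebuilding the full pre-update row from one such record follows; `before_image` is a hypothetical helper name, and the input is the first Maxwell record from this update.

```python
import json


def before_image(record: dict) -> dict:
    """Rebuild the full pre-update row from a Maxwell update record."""
    # Start from the post-update row, then overwrite the changed columns
    # with the previous values found in "old".
    return {**record["data"], **record.get("old", {})}


line = (
    '{"database":"gmall-2020-04","table":"z_user_info","type":"update",'
    '"ts":1589385508,"xid":83206,"xoffset":0,'
    '"data":{"id":30,"user_name":"wang55","tel":"13810001010"},'
    '"old":{"user_name":"zhang3"}}'
)
print(before_image(json.loads(line)))
# {'id': 30, 'user_name': 'zhang3', 'tel': '13810001010'}
```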

 

Run a delete:

DELETE FROM z_user_info WHERE id IN(30,31);

Canal:

{"data":[{"id":"30","user_name":"wang55","tel":"13810001010"},{"id":"31","user_name":"wang55","tel":"1389999999"}],"database":"gmall-2020-04","es":1589385644000,"id":4,"isDdl":false,"mysqlType":{"id":"bigint(20)","user_name":"varchar(20)","tel":"varchar(20)"},"old":null,"pkNames":["id"],"sql":"","sqlType":{"id":-5,"user_name":12,"tel":12},"table":"z_user_info","ts":1589385644829,"type":"DELETE"}

Maxwell:

{"database":"gmall-2020-04","table":"z_user_info","type":"delete","ts":1589385644,"xid":83367,"xoffset":0,"data":{"id":30,"user_name":"wang55","tel":"13810001010"}}

{"database":"gmall-2020-04","table":"z_user_info","type":"delete","ts":1589385644,"xid":83367,"commit":true,"data":{"id":31,"user_name":"wang55","tel":"1389999999"}}

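Summary of the data characteristics:

Log structure

Canal emits one log record per SQL statement. If the statement affects several rows, they are all collected into that single record as an array (the data field is an array even when only one row is affected).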
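
If downstream code prefers one record per row, a Canal message can be split up. The following is only a sketch under the assumption that the message has the fields seen in the samples above (`data`, `old`, `database`, `table`, `type`); `flatten_canal` is a hypothetical helper.

```python
def flatten_canal(message: dict) -> list:
    """Split one Canal message into one record per affected row."""
    rows = message.get("data") or []
    # "old" is null for inserts and deletes; pad it so zip() still pairs up.
    olds = message.get("old") or [None] * len(rows)
    return [
        {
            "database": message["database"],
            "table": message["table"],
            "type": message["type"].lower(),
            "data": row,
            "old": old,
        }
        for row, old in zip(rows, olds)
    ]
```

Applied to the UPDATE sample, this yields two records, one per user, each with its own `old` entry.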

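Maxwell emits one log record per affected row. To tell whether several records were produced by the same SQL statement, compare their xid: records with the same xid belong together. Strictly speaking, xid identifies the transaction, so this maps to a single statement only when that statement runs in its own transaction, as in the tests above.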
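
The grouping by xid can be sketched as follows. The helper names are hypothetical, and the completeness check relies on what the samples above show: the last record of each xid carries `"commit": true`.

```python
import json
from collections import defaultdict


def group_by_xid(lines):
    """Group Maxwell records by xid, keeping arrival order inside each group."""
    groups = defaultdict(list)
    for line in lines:
        record = json.loads(line)
        groups[record["xid"]].append(record)
    return dict(groups)


def is_complete(records) -> bool:
    """A group counts as complete once its last record has "commit": true."""
    return bool(records) and records[-1].get("commit") is True
```

A consumer could buffer records per xid and only hand a group to the next stage once `is_complete` returns True.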

 

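Numeric types

   When the original value is numeric, Maxwell preserves its type: it does not wrap the value in double quotes, so it is not turned into a string.

   Canal converts every value to a string.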
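
One way to recover numeric values from Canal output is to use the `sqlType` field that accompanies each message. The sketch below assumes those numbers are JDBC type codes from `java.sql.Types` (the `-5` for the `bigint` column and `12` for the `varchar` columns in the samples are consistent with that); `cast_row` is a hypothetical helper and only handles integer types.

```python
# JDBC type codes (java.sql.Types) treated as integers in this sketch:
# TINYINT = -6, SMALLINT = 5, INTEGER = 4, BIGINT = -5.
INTEGER_SQL_TYPES = {-6, 5, 4, -5}


def cast_row(row: dict, sql_types: dict) -> dict:
    """Convert Canal's string values back to int for integer-typed columns."""
    out = {}
    for column, value in row.items():
        if value is not None and sql_types.get(column) in INTEGER_SQL_TYPES:
            out[column] = int(value)
        else:
            # Strings, NULLs and types not covered here are passed through.
            out[column] = value
    return out


row = {"id": "30", "user_name": "zhang3", "tel": "13810001010"}
sql_types = {"id": -5, "user_name": 12, "tel": 12}
print(cast_row(row, sql_types))
# {'id': 30, 'user_name': 'zhang3', 'tel': '13810001010'}
```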

 

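Field definitions of the source table

Canal messages carry the table structure along with the data (mysqlType, sqlType, pkNames). Maxwell messages are more concise.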

 

