pt-table-checksum 3.0.4 fails to detect master/slave data differences


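Several people in the chat group have asked why pt-table-checksum 3.0.4 reports no differences when a table clearly differs between master and slave. I tested this myself a while ago but never got around to writing it up ^_-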

1. Basic environment

VMware10.0+CentOS6.9+MySQL5.7.19

ROLE    HOSTNAME  BASEDIR           DATADIR                     IP              PORT
Master  ZST1      /usr/local/mysql  /data/mysql/mysql3306/data  192.168.85.132  3306
Slave   ZST2      /usr/local/mysql  /data/mysql/mysql3306/data  192.168.85.133  3306

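Replication is a one-master, one-slave setup built on row-based binary logging plus GTID: Master -> Slave

2. Constructing inconsistent data

The sakila sample database is used for the test.

# flush logs on the master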
mydba@192.168.85.132,3306 [sakila]> flush logs;

# enable general_log on the master
[root@ZST1 ~]# rm -rf /data/mysql/mysql3306/data/mysql-general.log
mydba@192.168.85.132,3306 [sakila]> set global general_log_file='/data/mysql/mysql3306/data/mysql-general.log';
mydba@192.168.85.132,3306 [sakila]> set global general_log =1;
mydba@192.168.85.132,3306 [sakila]> show variables like 'general_log%';

# modify some rows on the slave to create an inconsistency
mydba@192.168.85.133,3306 [sakila]> delete from sakila.actor where actor_id<=3; # fails because of a foreign key constraint
mydba@192.168.85.133,3306 [sakila]> update sakila.actor set last_name=first_name where actor_id<=3;
# sakila.actor data on the master
mydba@192.168.85.132,3306 [sakila]> select * from sakila.actor limit 3;
+----------+------------+-----------+---------------------+
| actor_id | first_name | last_name | last_update         |
+----------+------------+-----------+---------------------+
|        1 | PENELOPE   | GUINESS   | 2006-02-15 04:34:33 |
|        2 | NICK       | WAHLBERG  | 2006-02-15 04:34:33 |
|        3 | ED         | CHASE     | 2006-02-15 04:34:33 |
+----------+------------+-----------+---------------------+
# sakila.actor data on the slave
mydba@192.168.85.133,3306 [sakila]> select * from sakila.actor limit 3;
+----------+------------+-----------+---------------------+
| actor_id | first_name | last_name | last_update         |
+----------+------------+-----------+---------------------+
|        1 | PENELOPE   | PENELOPE  | 2017-11-08 09:54:10 |
|        2 | NICK       | NICK      | 2017-11-08 09:54:10 |
|        3 | ED         | ED        | 2017-11-08 09:54:10 |
+----------+------------+-----------+---------------------+

Some rows were modified on the slave only, so master and slave are now inconsistent.

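3. pt-table-checksum

3.1 Checking whether the data is consistent

pt-table-checksum can be run from any machine that can connect to the master. I ran it on the slave and pointed the connection options at the master.

# run pt-table-checksum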
[root@ZST2 ~]# pt-table-checksum --nocheck-binlog-format --nocheck-replication-filters --recursion-method=hosts --replicate=sakila.checksums --databases=sakila --tables=actor,city --host=192.168.85.132 --port=3306 --user=mydba --password=mysql5719
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
11-08T09:57:01      0      0      200       1       0   0.075 sakila.actor
11-08T09:57:01      0      0      600       1       0   0.034 sakila.city
[root@ZST2 ~]# 

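DIFFS=0 means no differences were found. The master and slave data are in fact inconsistent, yet the tool did not detect it.
Copy the general log and binlog from the master to another directory for later analysis.

# copy the general log and binlog files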
[root@ZST1 ~]# cp /data/mysql/mysql3306/data/mysql-general.log /data/backup/mysql-general.log.ptchecksum3306
[root@ZST1 ~]# cp /data/mysql/mysql3306/logs/mysql-bin.000083 /data/backup/mysql-bin.000083.ptchecksum3306

3.2 Inspecting the general log

[root@ZST1 ~]# cat /data/backup/mysql-general.log.ptchecksum3306
/usr/local/mysql/bin/mysqld, Version: 5.7.19-log (MySQL Community Server (GPL)). started with:
Tcp port: 3306  Unix socket: /tmp/mysql3306.sock
Time                 Id Command    Argument
2017-11-08T01:57:01.750917Z        20 Connect   mydba@192.168.85.133 on  using TCP/IP
2017-11-08T01:57:01.751564Z        20 Query     set autocommit=1
2017-11-08T01:57:01.752220Z        20 Query     SHOW VARIABLES LIKE 'innodb\_lock_wait_timeout'
2017-11-08T01:57:01.757028Z        20 Query     SET SESSION innodb_lock_wait_timeout=1
2017-11-08T01:57:01.757521Z        20 Query     SHOW VARIABLES LIKE 'wait\_timeout'
2017-11-08T01:57:01.760950Z        20 Query     SET SESSION wait_timeout=10000
2017-11-08T01:57:01.761400Z        20 Query     SELECT @@SQL_MODE
2017-11-08T01:57:01.761772Z        20 Query     SET @@SQL_QUOTE_SHOW_CREATE = 1/*!40101, @@SQL_MODE='NO_AUTO_VALUE_ON_ZERO,ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'*/
2017-11-08T01:57:01.762140Z        20 Query     SELECT @@server_id /*!50038 , @@hostname*/
2017-11-08T01:57:01.762475Z        20 Query     SELECT @@SQL_MODE
2017-11-08T01:57:01.762772Z        20 Query     SET SQL_MODE=',NO_AUTO_VALUE_ON_ZERO,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'
2017-11-08T01:57:01.763098Z        20 Query     SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
2017-11-08T01:57:01.763504Z        20 Query     SHOW VARIABLES LIKE 'wsrep_on'
2017-11-08T01:57:01.766949Z        20 Query     SELECT @@SERVER_ID
2017-11-08T01:57:01.767470Z        20 Query     SHOW SLAVE HOSTS
2017-11-08T01:57:01.787329Z        20 Query     SHOW VARIABLES LIKE 'wsrep_on'
2017-11-08T01:57:01.790712Z        20 Query     SELECT @@SERVER_ID
2017-11-08T01:57:01.794388Z        20 Query     SHOW VARIABLES LIKE 'wsrep_on'
2017-11-08T01:57:01.797637Z        20 Query     SELECT @@SERVER_ID
2017-11-08T01:57:01.801356Z        20 Query     SHOW DATABASES LIKE 'sakila'
2017-11-08T01:57:01.802164Z        20 Query     CREATE DATABASE IF NOT EXISTS `sakila` /* pt-table-checksum */
2017-11-08T01:57:01.802951Z        20 Query     USE `sakila`
2017-11-08T01:57:01.803300Z        20 Query     SHOW TABLES FROM `sakila` LIKE 'checksums'
2017-11-08T01:57:01.806111Z        20 Query     CREATE TABLE IF NOT EXISTS `sakila`.`checksums` (
     db             CHAR(64)     NOT NULL,
     tbl            CHAR(64)     NOT NULL,
     chunk          INT          NOT NULL,
     chunk_time     FLOAT            NULL,
     chunk_index    VARCHAR(200)     NULL,
     lower_boundary TEXT             NULL,
     upper_boundary TEXT             NULL,
     this_crc       CHAR(40)     NOT NULL,
     this_cnt       INT          NOT NULL,
     master_crc     CHAR(40)         NULL,
     master_cnt     INT              NULL,
     ts             TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
     PRIMARY KEY (db, tbl, chunk),
     INDEX ts_db_tbl (ts, db, tbl)
  ) ENGINE=InnoDB DEFAULT CHARSET=utf8
2017-11-08T01:57:01.825908Z        20 Query     SHOW GLOBAL STATUS LIKE 'Threads_running'
2017-11-08T01:57:01.828793Z        20 Query     SELECT CONCAT(@@hostname, @@port)
2017-11-08T01:57:01.844001Z        20 Query     SELECT CRC32('test-string')
2017-11-08T01:57:01.844518Z        20 Query     SELECT CRC32('a')
2017-11-08T01:57:01.845025Z        20 Query     SELECT CRC32('a')
2017-11-08T01:57:01.845517Z        20 Query     SHOW VARIABLES LIKE 'wsrep_on'
2017-11-08T01:57:01.849157Z        20 Query     SHOW DATABASES
2017-11-08T01:57:01.850038Z        20 Query     SHOW /*!50002 FULL*/ TABLES FROM `sakila`
2017-11-08T01:57:01.851486Z        20 Query     /*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := '', @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
2017-11-08T01:57:01.851943Z        20 Query     USE `sakila`
2017-11-08T01:57:01.852408Z        20 Query     SHOW CREATE TABLE `sakila`.`actor`
2017-11-08T01:57:01.853034Z        20 Query     /*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
2017-11-08T01:57:01.854092Z        20 Query     EXPLAIN SELECT * FROM `sakila`.`actor` WHERE 1=1
2017-11-08T01:57:01.857374Z        20 Query     USE `sakila`
2017-11-08T01:57:01.857990Z        20 Query     DELETE FROM `sakila`.`checksums` WHERE db = 'sakila' AND tbl = 'actor'
2017-11-08T01:57:01.877626Z        20 Query     USE `sakila`
2017-11-08T01:57:01.878413Z        20 Query     EXPLAIN SELECT COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `actor_id`, convert(`first_name` using utf8mb4), convert(`last_name` using utf8mb4), UNIX_TIMESTAMP(`last_update`))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `sakila`.`actor` /*explain checksum table*/
2017-11-08T01:57:01.879347Z        20 Query     REPLACE INTO `sakila`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT 'sakila', 'actor', '1', NULL, NULL, NULL, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `actor_id`, convert(`first_name` using utf8mb4), convert(`last_name` using utf8mb4), UNIX_TIMESTAMP(`last_update`))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `sakila`.`actor` /*checksum table*/
2017-11-08T01:57:01.881166Z        20 Query     SHOW WARNINGS
2017-11-08T01:57:01.881764Z        20 Query     SELECT this_crc, this_cnt FROM `sakila`.`checksums` WHERE db = 'sakila' AND tbl = 'actor' AND chunk = '1'
2017-11-08T01:57:01.897051Z        20 Query     UPDATE `sakila`.`checksums` SET chunk_time = '0.001821', master_crc = '6816983c', master_cnt = '200' WHERE db = 'sakila' AND tbl = 'actor' AND chunk = '1'
2017-11-08T01:57:01.900914Z        20 Query     SHOW GLOBAL STATUS LIKE 'Threads_running'
2017-11-08T01:57:01.930534Z        20 Query     /*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := '', @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
2017-11-08T01:57:01.931387Z        20 Query     USE `sakila`
2017-11-08T01:57:01.932194Z        20 Query     SHOW CREATE TABLE `sakila`.`city`
2017-11-08T01:57:01.933399Z        20 Query     /*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
2017-11-08T01:57:01.935136Z        20 Query     EXPLAIN SELECT * FROM `sakila`.`city` WHERE 1=1
2017-11-08T01:57:01.940169Z        20 Query     USE `sakila`
2017-11-08T01:57:01.941026Z        20 Query     DELETE FROM `sakila`.`checksums` WHERE db = 'sakila' AND tbl = 'city'
2017-11-08T01:57:01.942010Z        20 Query     USE `sakila`
2017-11-08T01:57:01.943012Z        20 Query     EXPLAIN SELECT COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `city_id`, convert(`city` using utf8mb4), `country_id`, UNIX_TIMESTAMP(`last_update`))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `sakila`.`city` /*explain checksum table*/
2017-11-08T01:57:01.945033Z        20 Query     REPLACE INTO `sakila`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT 'sakila', 'city', '1', NULL, NULL, NULL, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `city_id`, convert(`city` using utf8mb4), `country_id`, UNIX_TIMESTAMP(`last_update`))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `sakila`.`city` /*checksum table*/
2017-11-08T01:57:01.960088Z        20 Query     SHOW WARNINGS
2017-11-08T01:57:01.960938Z        20 Query     SELECT this_crc, this_cnt FROM `sakila`.`checksums` WHERE db = 'sakila' AND tbl = 'city' AND chunk = '1'
2017-11-08T01:57:01.961674Z        20 Query     UPDATE `sakila`.`checksums` SET chunk_time = '0.015889', master_crc = '4d700c4', master_cnt = '600' WHERE db = 'sakila' AND tbl = 'city' AND chunk = '1'
2017-11-08T01:57:01.964712Z        20 Query     SHOW GLOBAL STATUS LIKE 'Threads_running'
2017-11-08T01:57:01.972503Z        20 Quit
[root@ZST1 ~]# 

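Logic seen in the general log
• Set SESSION options
• Create the checksums table
• For each table to be checked, do the following
  • DELETE: remove that table's existing rows from checksums
  • EXPLAIN: get the execution plan of the query that computes this_cnt and this_crc
  • REPLACE INTO: compute this_cnt and this_crc for the table and store them
  • UPDATE: set master_crc and master_cnt to the this_crc and this_cnt values just read back
On the master all of this runs as ordinary SQL statements. The session never switches its binlog format to STATEMENT, and the master has binlog_format='ROW', so for these data changes the binlog records the final result of each statement (concrete values) rather than the SQL text.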
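To make the REPLACE INTO ... SELECT from the general log easier to follow, here is a minimal Python sketch of the per-chunk checksum expression (COUNT(*) plus BIT_XOR over CRC32(CONCAT_WS('#', ...))). It assumes that Python's zlib.crc32 yields the same value as MySQL's CRC32(), since both implement the standard CRC-32. The function names are hypothetical, and the last_update values are placeholder integers, because the real UNIX_TIMESTAMP() result depends on the session time zone.

```python
import zlib

def row_crc(row):
    """CRC32 of the row's columns joined with '#', like CRC32(CONCAT_WS('#', ...))."""
    # CONCAT_WS skips NULL arguments, so None values are left out here as well.
    joined = "#".join(str(col) for col in row if col is not None)
    return zlib.crc32(joined.encode("utf-8"))

def chunk_checksum(rows):
    """Return (cnt, crc): COUNT(*) and the lowercase hex XOR of all row CRCs."""
    acc = 0
    for row in rows:
        acc ^= row_crc(row)  # BIT_XOR aggregate
    return len(rows), format(acc, "x")  # LOWER(CONV(..., 10, 16)); empty chunk gives '0'

# Illustrative rows only: (actor_id, first_name, last_name, placeholder timestamp)
master_rows = [
    (1, "PENELOPE", "GUINESS", 1000),
    (2, "NICK", "WAHLBERG", 1000),
    (3, "ED", "CHASE", 1000),
]
slave_rows = [
    (1, "PENELOPE", "PENELOPE", 2000),
    (2, "NICK", "NICK", 2000),
    (3, "ED", "ED", 2000),
]

print("master:", chunk_checksum(master_rows))
print("slave: ", chunk_checksum(slave_rows))
```

The row counts match while the crc values do not, which is the kind of difference the tool is supposed to surface, provided each server computes the checksum on its own data.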

3.3 Inspecting the binlog

[root@ZST1 ~]# mysqlbinlog -v --base64-output=decode-rows /data/backup/mysql-bin.000083.ptchecksum3306

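The binlog logic is: first create the checksums table, then delete -> insert -> update on checksums, all logged with concrete values.
The last step on the master for sakila.actor is this SELECT followed by the UPDATE: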

SELECT this_crc, this_cnt FROM `sakila`.`checksums` WHERE db = 'sakila' AND tbl = 'actor' AND chunk = '1'
UPDATE `sakila`.`checksums` SET chunk_time = '0.001821', master_crc = '6816983c', master_cnt = '200' WHERE db = 'sakila' AND tbl = 'actor' AND chunk = '1'

In the binlog it appears as follows (and is applied to the slave unchanged):

[root@ZST1 ~]# mysqlbinlog -v --base64-output=decode-rows /data/backup/mysql-bin.000083.ptchecksum3306
...
COMMIT/*!*/;
# at 1604
#171108  9:57:01 server id 1323306  end_log_pos 1669 CRC32 0x16bf0702   GTID    last_committed=3        sequence_number=4       rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '8ab82362-9c37-11e7-a858-000c29c1025c:575'/*!*/;
# at 1669
#171108  9:57:01 server id 1323306  end_log_pos 1743 CRC32 0xee6b7639   Query   thread_id=20    exec_time=0     error_code=0
SET TIMESTAMP=1510106221/*!*/;
BEGIN
/*!*/;
# at 1743
#171108  9:57:01 server id 1323306  end_log_pos 1823 CRC32 0x589cc01f   Table_map: `sakila`.`checksums` mapped to number 248
# at 1823
#171108  9:57:01 server id 1323306  end_log_pos 1950 CRC32 0xc0604f63   Update_rows: table id 248 flags: STMT_END_F
### UPDATE `sakila`.`checksums`
### WHERE
###   @1='sakila'
###   @2='actor'
###   @3=1
###   @4=NULL
###   @5=NULL
###   @6=NULL
###   @7=NULL
###   @8='6816983c'
###   @9=200
###   @10=NULL
###   @11=NULL
###   @12=1510106221
### SET
###   @1='sakila'
###   @2='actor'
###   @3=1
###   @4=0.001821            
###   @5=NULL
###   @6=NULL
###   @7=NULL
###   @8='6816983c'
###   @9=200
###   @10='6816983c'
###   @11=200
###   @12=1510106221
# at 1950
#171108  9:57:01 server id 1323306  end_log_pos 1981 CRC32 0x1f197fe6   Xid = 198
COMMIT/*!*/;
# at 1981

In other words, the slave never computes the CRC32 itself; it simply receives a complete copy of the master's checksums rows.
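The effect can be illustrated with a small Python simulation. This is only a toy model under stated assumptions, not how MySQL replication is implemented: run_checksum and crc_of are hypothetical helpers, and the model reduces replication to one question, namely whether the slave re-executes the checksum statement (STATEMENT) or receives the resulting row image (ROW).

```python
import zlib

def crc_of(rows):
    """Toy chunk checksum: XOR of per-row CRC32 values, as lowercase hex."""
    acc = 0
    for row in rows:
        acc ^= zlib.crc32("#".join(map(str, row)).encode("utf-8"))
    return format(acc, "x")

def run_checksum(master_rows, slave_rows, binlog_format):
    """Return the checksums row as it ends up on the slave for one chunk."""
    master_this = crc_of(master_rows)    # computed on the master by REPLACE ... SELECT
    if binlog_format == "STATEMENT":
        slave_this = crc_of(slave_rows)  # the statement is re-executed on slave data
    elif binlog_format == "ROW":
        slave_this = master_this         # the master's row image is applied as-is
    else:
        raise ValueError(f"unsupported binlog_format: {binlog_format}")
    # The UPDATE carries literal values, so master_crc is identical in both formats.
    return {"this_crc": slave_this, "master_crc": master_this}

master = [(1, "PENELOPE", "GUINESS"), (2, "NICK", "WAHLBERG"), (3, "ED", "CHASE")]
slave = [(1, "PENELOPE", "PENELOPE"), (2, "NICK", "NICK"), (3, "ED", "ED")]

for fmt in ("ROW", "STATEMENT"):
    row = run_checksum(master, slave, fmt)
    print(fmt, "difference detected:", row["this_crc"] != row["master_crc"])
```

In the ROW case this_crc and master_crc are equal by construction, so no difference can ever be reported, which matches the DIFFS=0 result seen earlier.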

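3.4 How to work around it

In my opinion the check can only work with the STATEMENT format, because both sides have to compute the CRC32 themselves. After that, the master's master_crc and master_cnt are replicated to the slave, and finally the master_* and this_* columns are compared on the slave. In this run pt-table-checksum 3.0.4 never issued SET @@binlog_format='STATEMENT' (it does not appear anywhere in the general log), so I recommend not using this version.
There is a crude workaround, useful only for seeing the differences (do not use it in production): before running pt-table-checksum, run set global binlog_format='STATEMENT'; on the master.

# on the master, change binlog_format to STATEMENT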
mydba@192.168.85.132,3306 [sakila]> set global binlog_format='STATEMENT';

# run pt-table-checksum from the slave
[root@ZST2 ~]# pt-table-checksum --nocheck-binlog-format --nocheck-replication-filters --recursion-method=hosts --replicate=sakila.checksums --databases=sakila --tables=actor,city --host=192.168.85.132 --port=3306 --user=mydba --password=mysql5719
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
11-08T12:40:27      0      1      200       1       0   0.015 sakila.actor
11-08T12:40:27      0      0      600       1       0   0.024 sakila.city
[root@ZST2 ~]# 

DIFFS=1 shows that the sakila.actor table differs between master and slave.
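If the report is collected by a script, the DIFFS column can be read mechanically. The following Python sketch parses the text output shown in this post and lists the tables with DIFFS greater than zero. It assumes the whitespace-separated column layout seen above (TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE); tables_with_diffs is a hypothetical helper, not part of Percona Toolkit.

```python
def tables_with_diffs(output):
    """Return the names of tables whose DIFFS column is greater than zero."""
    flagged = []
    header = None
    for line in output.splitlines():
        fields = line.split()
        if fields[:1] == ["TS"]:
            header = fields  # remember the column names
            continue
        if header is None or len(fields) != len(header):
            continue  # skip shell prompts, blank lines and anything malformed
        record = dict(zip(header, fields))
        if int(record["DIFFS"]) > 0:
            flagged.append(record["TABLE"])
    return flagged

# Output copied from the STATEMENT-format run in this post.
sample_output = """\
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
11-08T12:40:27      0      1      200       1       0   0.015 sakila.actor
11-08T12:40:27      0      0      600       1       0   0.024 sakila.city
[root@ZST2 ~]#
"""

print(tables_with_diffs(sample_output))  # → ['sakila.actor']
```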

# difference details
mydba@192.168.85.133,3306 [sakila]> SELECT db,tbl,SUM(this_cnt) AS total_rows,COUNT(*) AS chunks
FROM sakila.checksums 
WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc))
GROUP BY db,tbl;
+--------+-------+------------+--------+
| db     | tbl   | total_rows | chunks |
+--------+-------+------------+--------+
| sakila | actor |        200 |      1 |
+--------+-------+------------+--------+
1 row in set (0.00 sec)

The key is to compare master_cnt with this_cnt and master_crc with this_crc.
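The WHERE clause of that query can be restated as a small Python predicate, which makes the NULL handling explicit. This is a sketch under the assumption that SQL NULL is represented as None and that a comparison involving NULL does not count as true; chunk_differs is a hypothetical name, and 'deadbeef' is an invented this_crc used only for illustration.

```python
def chunk_differs(master_cnt, this_cnt, master_crc, this_crc):
    """Mirror: master_cnt <> this_cnt OR master_crc <> this_crc
    OR ISNULL(master_crc) <> ISNULL(this_crc)."""
    # In SQL a comparison with NULL is NULL (not true), so guard against None first.
    cnt_diff = None not in (master_cnt, this_cnt) and master_cnt != this_cnt
    crc_diff = None not in (master_crc, this_crc) and master_crc != this_crc
    null_diff = (master_crc is None) != (this_crc is None)
    return cnt_diff or crc_diff or null_diff

# ROW format: the slave row is a copy of the master row, so nothing is flagged.
print(chunk_differs(200, 200, "6816983c", "6816983c"))  # → False

# STATEMENT format: the slave computed its own this_crc (invented value here).
print(chunk_differs(200, 200, "6816983c", "deadbeef"))  # → True

# master_crc still NULL on the slave, e.g. the UPDATE has not been applied yet.
print(chunk_differs(None, 200, None, "6816983c"))       # → True
```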

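4. pt-table-sync

4.1 Repairing the inconsistency

The previous section detected the master/slave inconsistency; now pt-table-sync is used to repair the data.

# print the statements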
[root@ZST2 ~]# pt-table-sync --replicate=sakila.checksums --sync-to-master h=192.168.85.133,u=mydba,p=mysql5719,P=3306 --databases=sakila --charset=utf8 --print
REPLACE INTO `sakila`.`actor`(`actor_id`, `first_name`, `last_name`, `last_update`) VALUES ('1', 'PENELOPE', 'GUINESS', '2006-02-15 04:34:33') /*percona-toolkit src_db:sakila src_tbl:actor src_dsn:A=utf8,P=3306,h=192.168.85.132,p=...,u=mydba dst_db:sakila dst_tbl:actor dst_dsn:A=utf8,P=3306,h=192.168.85.133,p=...,u=mydba lock:1 transaction:1 changing_src:sakila.checksums replicate:sakila.checksums bidirectional:0 pid:3365 user:uest host:ZST2*/;
REPLACE INTO `sakila`.`actor`(`actor_id`, `first_name`, `last_name`, `last_update`) VALUES ('2', 'NICK', 'WAHLBERG', '2006-02-15 04:34:33') /*percona-toolkit src_db:sakila src_tbl:actor src_dsn:A=utf8,P=3306,h=192.168.85.132,p=...,u=mydba dst_db:sakila dst_tbl:actor dst_dsn:A=utf8,P=3306,h=192.168.85.133,p=...,u=mydba lock:1 transaction:1 changing_src:sakila.checksums replicate:sakila.checksums bidirectional:0 pid:3365 user:uest host:ZST2*/;
REPLACE INTO `sakila`.`actor`(`actor_id`, `first_name`, `last_name`, `last_update`) VALUES ('3', 'ED', 'CHASE', '2006-02-15 04:34:33') /*percona-toolkit src_db:sakila src_tbl:actor src_dsn:A=utf8,P=3306,h=192.168.85.132,p=...,u=mydba dst_db:sakila dst_tbl:actor dst_dsn:A=utf8,P=3306,h=192.168.85.133,p=...,u=mydba lock:1 transaction:1 changing_src:sakila.checksums replicate:sakila.checksums bidirectional:0 pid:3365 user:uest host:ZST2*/;
[root@ZST2 ~]# 

# execute the statements
[root@ZST2 ~]# pt-table-sync --replicate=sakila.checksums --sync-to-master h=192.168.85.133,u=mydba,p=mysql5719,P=3306 --databases=sakila --charset=utf8 --execute
REPLACE statements on sakila.actor can adversely affect child table `sakila`.`film_actor` because it has an ON UPDATE CASCADE foreign key constraint. See --[no]check-child-tables in the documentation for more information. --check-child-tables error  while doing sakila.actor on 192.168.85.133
[root@ZST2 ~]# 

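--execute runs the statements that --print displayed. REPLACE INTO effectively performs a delete followed by an insert, and sakila.actor is referenced by the child table film_actor through a foreign key (the same constraint that made the delete fail when the inconsistent data was constructed). pt-table-sync therefore stops at its --check-child-tables check, as the message above shows, and the repair does not succeed.
For details on pt-table-checksum and pt-table-sync, see the articles "pt-table-checksum解讀" (an explanation of pt-table-checksum) and "使用pt-table-checksum及pt-table-sync校驗復制一致性" (verifying replication consistency with pt-table-checksum and pt-table-sync).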

