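MySQL introduced GTID-based replication in version 5.6. Compared with traditional replication it is friendlier to operate: a GTID states directly which server produced a transaction and how many transactions that server has produced.
Today we look at two fields in the output of show slave status on a slave: Retrieved_Gtid_Set and Executed_Gtid_Set.
Retrieved_Gtid_Set: the transactions the slave has received from the master
Executed_Gtid_Set: the transactions the slave itself has executed
The rest of this post explains what these two fields mean.
First, look at the server-uuid of the master and of the slave.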
Master:
[root@localhost][db1]> show variables like '%uuid%';
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | 2a09ee6e-645d-11e7-a96c-000c2953a1cb |
+---------------+--------------------------------------+
1 row in set (0.00 sec)
Slave
[root@localhost][(none)]> show variables like '%uuid%';
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | 8ce853fc-6f8a-11e7-8940-000c29e3f5ab |
+---------------+--------------------------------------+
1 row in set (0.01 sec)
The master's server-id is 10 and the slave's server-id is 20.
Once replication is set up, and before any data has been written, show slave status looks like this:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_UUID: 2a09ee6e-645d-11e7-a96c-000c2953a1cb
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
After creating a table on the master and inserting two rows, it looks like this:
[root@localhost][db1]> create table t2 ( id int);
Query OK, 0 rows affected (0.07 sec)

[root@localhost][db1]> insert into t2 select 1;
Query OK, 1 row affected (0.07 sec)
Records: 1 Duplicates: 0 Warnings: 0

[root@localhost][db1]> insert into t2 select 2;
Query OK, 1 row affected (0.02 sec)
Records: 1 Duplicates: 0 Warnings: 0
Here autocommit=1, so creating the table and inserting two rows adds up to three transactions.
On the slave: show slave status\G
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_UUID: 2a09ee6e-645d-11e7-a96c-000c2953a1cb
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3
Executed_Gtid_Set: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3
Auto_Position: 1
On the master: show master status
+------------------+----------+--------------+------------------+------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
+------------------+----------+--------------+------------------+------------------------------------------+
| mysql-bin.000001 |      912 |              |                  | 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3 |
+------------------+----------+--------------+------------------+------------------------------------------+
1 row in set (0.00 sec)
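The master's Executed_Gtid_Set is: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3
The slave's Retrieved_Gtid_Set is: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3
and its Executed_Gtid_Set is: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3
In other words, the master produced 3 transactions, and the slave received all 3 and has executed every one of them.
Here 2a09ee6e-645d-11e7-a96c-000c2953a1cb is the master's server-uuid, which can be confirmed by decoding the binlog on the slave: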
# at 154
#170823 0:38:38 server id 10 end_log_pos 219 CRC32 0x6268641f GTID last_committed=0 sequence_number=1
SET @@SESSION.GTID_NEXT= '2a09ee6e-645d-11e7-a96c-000c2953a1cb:1'/*!*/;
# at 219
#170823 0:38:38 server id 10 end_log_pos 316 CRC32 0x6c837618 Query thread_id=103 exec_time=0 error_code=0
use `db1`/*!*/;
SET TIMESTAMP=1503419918/*!*/;
SET @@session.pseudo_thread_id=103/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
create table t2 ( id int)
/*!*/;
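The event carries server id 10 and sets GTID_NEXT to 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1, which is the CREATE TABLE. The statements executed under 2a09ee6e-645d-11e7-a96c-000c2953a1cb:2 and 2a09ee6e-645d-11e7-a96c-000c2953a1cb:3 are omitted here.
This illustrates the point made at the start of the article: a GTID states directly which server produced a transaction and how many transactions it has produced.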
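To make the "who produced it, and how many" reading concrete, here is a minimal Python sketch that parses a GTID set string and counts transactions per originating server UUID. The function names `parse_gtid_set` and `count_transactions` are made up for illustration; this is not how MySQL itself parses the value, only a reading aid for the `uuid:start-end` notation shown above.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-3:5, uuid2:7' into {uuid: [(start, end), ...]}."""
    result = {}
    # SHOW SLAVE STATUS may wrap long sets across lines after each comma.
    for entry in text.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        # A UUID contains hyphens but no colons, so ':' separates it
        # from its interval list.
        uuid, *intervals = entry.split(":")
        ranges = result.setdefault(uuid, [])
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A single number such as '1' is the interval 1-1.
            ranges.append((int(start), int(end or start)))
    return result


def count_transactions(gtid_set):
    """Return {uuid: number of transactions} for a parsed GTID set."""
    return {
        uuid: sum(end - start + 1 for start, end in ranges)
        for uuid, ranges in gtid_set.items()
    }


if __name__ == "__main__":
    executed = "2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3"
    for uuid, count in count_transactions(parse_gtid_set(executed)).items():
        print(uuid, "produced", count, "transactions")
```

With the set from the example, the sketch reports 3 transactions for the master's UUID, matching the one CREATE TABLE and two INSERT statements.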
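So how does an odd-looking GTID set like the one below come about? Start with the executed transactions:
Executed_Gtid_Set: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-33,
8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1
The part 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-33 is easy to read: transactions 1 through 33 from the master have been executed. What about 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1? That is simple too, and there are two ways it can happen:
No. 1: data was written on the slave (that is, a row was inserted directly on the slave)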
[root@localhost][db1]> insert into t2 select 1;
Query OK, 1 row affected (0.03 sec)
Records: 1 Duplicates: 0 Warnings: 0
show slave status\G;
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_UUID: 2a09ee6e-645d-11e7-a96c-000c2953a1cb
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3
Executed_Gtid_Set: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3,
8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1
Auto_Position: 1
Replicate_Rewrite_DB:
The executed set now contains the transactions from the master, 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3, as well as the slave's own write, 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1. We can decode the binlog to check:
mysqlbinlog -vv mysql-bin.000001 --include-gtids='8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1'
# at 896
#170823 0:59:19 server id 20 end_log_pos 961 CRC32 0x0492528a GTID last_committed=3 sequence_number=4
SET @@SESSION.GTID_NEXT= '8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1'/*!*/;
# at 961
#170823 0:59:19 server id 20 end_log_pos 1032 CRC32 0xbf545cca Query thread_id=25 exec_time=0 error_code=0
SET TIMESTAMP=1503421159/*!*/;
SET @@session.pseudo_thread_id=25/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 1032
#170823 0:59:19 server id 20 end_log_pos 1079 CRC32 0x2f2de3ec Rows_query
# insert into t2 select 1
# at 1079
#170823 0:59:19 server id 20 end_log_pos 1123 CRC32 0x18fe1c5c Table_map: `db1`.`t2` mapped to number 109
# at 1123
#170823 0:59:19 server id 20 end_log_pos 1163 CRC32 0x163a708e Write_rows: table id 109 flags: STMT_END_F
BINLOG '
52KcWR0UAAAALwAAADcEAACAABdpbnNlcnQgaW50byB0MiBzZWxlY3QgMezjLS8=
52KcWRMUAAAALAAAAGMEAAAAAG0AAAAAAAEAA2RiMQACdDIAAQMAAVwc/hg=
52KcWR4UAAAAKAAAAIsEAAAAAG0AAAAAAAEAAgAB//4BAAAAjnA6Fg==
'/*!*/;
### INSERT INTO `db1`.`t2`
### SET
### @1=1 /* INT meta=0 nullable=1 is_null=0 */
# at 1163
#170823 0:59:19 server id 20 end_log_pos 1194 CRC32 0xe3347ac1 Xid = 68
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
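The binlog shows clearly that this write was made on the slave itself: the events carry server id 20 and the slave's own UUID.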
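Spotting writes that did not come from the master can be automated by comparing each UUID in Executed_Gtid_Set against Master_UUID. The following is a small sketch under that assumption; `split_by_origin` is a hypothetical helper name, and the input strings are simply the field values copied from show slave status.

```python
def split_by_origin(executed_gtid_set, master_uuid):
    """Split Executed_Gtid_Set entries into (from_master, from_elsewhere).

    Each entry is kept as its original 'uuid:intervals' string.
    """
    from_master, from_elsewhere = [], []
    # The field may be wrapped across lines after each comma.
    for entry in executed_gtid_set.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        uuid = entry.split(":", 1)[0]
        if uuid == master_uuid:
            from_master.append(entry)
        else:
            from_elsewhere.append(entry)
    return from_master, from_elsewhere


if __name__ == "__main__":
    master_uuid = "2a09ee6e-645d-11e7-a96c-000c2953a1cb"
    executed = (
        "2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3,\n"
        "8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1"
    )
    replicated, other = split_by_origin(executed, master_uuid)
    print("replicated from master:", replicated)
    print("written elsewhere:", other)
```

Any entry in the second list is worth investigating: it is either a direct write on the slave, as in this case, or a leftover from an earlier topology, as in the switchover case.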
No. 2: a master/slave switchover (performed here with MHA)
Master_Server_Id: 20
Master_UUID: 8ce853fc-6f8a-11e7-8940-000c29e3f5ab
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1
Executed_Gtid_Set: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3,
8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1
Auto_Position: 1
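After the switchover the master's server-id is 20. The output means that this server has received 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1 from the new master and has executed it; that transaction is the row written on the former slave earlier. 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3 are the 3 transactions executed by the former master. If one more row is now inserted on the new master, the fields change as follows: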
Retrieved_Gtid_Set: 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1-2
Executed_Gtid_Set: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3,
8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1-2
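Next, GTID sets that do not start at 1, such as 2a09ee6e-645d-11e7-a96c-000c2953a1cb:37-45. This is caused by binlogs having been purged. We can test this and then look at the gtid_purged variable.
Binlogs cannot stay on the server forever; they have to be purged periodically (expire_logs_days controls the automatic expiry period), otherwise they will eventually fill the disk. gtid_purged records the set of transactions whose binlogs have been purged, and it is a subset of gtid_executed. In MySQL 5.6 and 5.7 this variable can be set manually only when gtid_executed is empty, and setting it also updates gtid_executed to the same value as gtid_purged. An empty gtid_executed means either that GTID-based replication has never been started on this server or that RESET MASTER has been executed. RESET MASTER also empties gtid_purged, so gtid_purged always remains a subset of gtid_executed.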
[root@localhost][db1]> show master logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.000001 |      3530 |
+------------------+-----------+
1 row in set (0.00 sec)

[root@localhost][db1]> flush logs;
Query OK, 0 rows affected (0.05 sec)

[root@localhost][db1]> show master logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.000001 |      3577 |
| mysql-bin.000002 |       234 |
+------------------+-----------+
2 rows in set (0.00 sec)

[root@localhost][db1]> PURGE BINARY LOGS TO 'mysql-bin.000002';
Query OK, 0 rows affected (0.01 sec)

[root@localhost][db1]> show master logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.000002 |       234 |
+------------------+-----------+
1 row in set (0.00 sec)
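The slave picks up the change only after it has been restarted. When a MySQL server starts, it reads the binlog files to initialize gtid_executed and gtid_purged, so that their values match those from the previous run.
gtid_executed: the GTIDs that have been executed on this server and written to its log
gtid_purged: the GTIDs that have been executed on this server but whose binlogs have since been removed, for example by PURGE BINARY LOGS TO
Before the restart: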
Retrieved_Gtid_Set: 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1-9
Executed_Gtid_Set: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3,
8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1-9
After restarting and inserting one more row:
Master_Server_Id: 20
Master_UUID: 8ce853fc-6f8a-11e7-8940-000c29e3f5ab
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:10
Executed_Gtid_Set: 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3,
8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1-10
Auto_Position: 1

[root@localhost][(none)]> show variables like 'gtid_purged';
+---------------+------------------------------------------------------------------------------------+
| Variable_name | Value                                                                              |
+---------------+------------------------------------------------------------------------------------+
| gtid_purged   | 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3, 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1-9 |
+---------------+------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
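Both 2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3 and 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1-9 were stored in mysql-bin.000001, which has been purged, so they now appear in gtid_purged. Only 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:10 was retrieved after the restart, which is why Retrieved_Gtid_Set no longer starts at 1.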
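The rule that gtid_purged is always a subset of gtid_executed can be checked on the values above. MySQL offers the SQL function GTID_SUBSET(set1, set2) for this on the server side; the Python sketch below only models the same idea offline, with the hypothetical helper names `expand_gtid_set` and `is_gtid_subset`. It expands every interval into individual numbers, which is fine for small examples but would be wasteful for sets covering millions of transactions.

```python
def expand_gtid_set(text):
    """Expand 'uuid:1-3:5, uuid2:7' into a set of (uuid, number) pairs."""
    members = set()
    for entry in text.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        uuid, *intervals = entry.split(":")
        for interval in intervals:
            start, _, end = interval.partition("-")
            last = int(end or start)  # '7' alone means the interval 7-7
            members.update((uuid, n) for n in range(int(start), last + 1))
    return members


def is_gtid_subset(inner, outer):
    """True if every GTID in `inner` also appears in `outer`."""
    return expand_gtid_set(inner) <= expand_gtid_set(outer)


if __name__ == "__main__":
    gtid_purged = (
        "2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3, "
        "8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1-9"
    )
    gtid_executed = (
        "2a09ee6e-645d-11e7-a96c-000c2953a1cb:1-3, "
        "8ce853fc-6f8a-11e7-8940-000c29e3f5ab:1-10"
    )
    print(is_gtid_subset(gtid_purged, gtid_executed))  # True
    print(is_gtid_subset(gtid_executed, gtid_purged))  # False
```

The second call is False because 8ce853fc-6f8a-11e7-8940-000c29e3f5ab:10 has been executed but its binlog has not been purged.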
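The two experiments below show what happens, and what can be done about it, when logs are purged on the master before a slave has fetched them:
Experiment 1: the GTIDs of the transactions the slave needs have already been purged on the master
The output of show global variables like '%gtid%' includes a GTID-related variable named gtid_purged. Its name and the official documentation tell us that it records the set of GTIDs that have been executed on this server but whose logs have been removed by PURGE BINARY LOGS TO.
In this section we test what happens when the master purges GTID events that a slave has not yet fetched.
Run the following commands on the master: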
master [localhost] {msandbox} (test) > show global variables like '%gtid%';
+---------------------------------+----------------------------------------+
| Variable_name                   | Value                                  |
+---------------------------------+----------------------------------------+
| binlog_gtid_simple_recovery     | OFF                                    |
| enforce_gtid_consistency        | ON                                     |
| gtid_executed                   | 24024e52-bd95-11e4-9c6d-926853670d0b:1 |
| gtid_mode                       | ON                                     |
| gtid_owned                      |                                        |
| gtid_purged                     |                                        |
| simplified_binlog_gtid_recovery | OFF                                    |
+---------------------------------+----------------------------------------+
7 rows in set (0.01 sec)

master [localhost] {msandbox} (test) > flush logs;create table gtid_test2 (ID int) engine=innodb;
Query OK, 0 rows affected (0.04 sec)

Query OK, 0 rows affected (0.02 sec)

master [localhost] {msandbox} (test) > flush logs;create table gtid_test3 (ID int) engine=innodb;
Query OK, 0 rows affected (0.04 sec)

Query OK, 0 rows affected (0.04 sec)

master [localhost] {msandbox} (test) > show master status;
+------------------+----------+--------------+------------------+------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
+------------------+----------+--------------+------------------+------------------------------------------+
| mysql-bin.000005 |      359 |              |                  | 24024e52-bd95-11e4-9c6d-926853670d0b:1-3 |
+------------------+----------+--------------+------------------+------------------------------------------+
1 row in set (0.00 sec)

master [localhost] {msandbox} (test) > purge binary logs to 'mysql-bin.000004';
Query OK, 0 rows affected (0.03 sec)

master [localhost] {msandbox} (test) > show global variables like '%gtid%';
+---------------------------------+------------------------------------------+
| Variable_name                   | Value                                    |
+---------------------------------+------------------------------------------+
| binlog_gtid_simple_recovery     | OFF                                      |
| enforce_gtid_consistency        | ON                                       |
| gtid_executed                   | 24024e52-bd95-11e4-9c6d-926853670d0b:1-3 |
| gtid_mode                       | ON                                       |
| gtid_owned                      |                                          |
| gtid_purged                     | 24024e52-bd95-11e4-9c6d-926853670d0b:1   |
| simplified_binlog_gtid_recovery | OFF                                      |
+---------------------------------+------------------------------------------+
7 rows in set (0.00 sec)
Now set up replication from scratch on slave2. Run the following commands on slave2:
slave2 [localhost] {msandbox} ((none)) > change master to master_host='127.0.0.1',master_port =21288,master_user='rsandbox',master_password='rsandbox',master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.04 sec)

slave2 [localhost] {msandbox} ((none)) > start slave;
Query OK, 0 rows affected (0.01 sec)

slave2 [localhost] {msandbox} ((none)) > show slave status\G
*************************** 1. row ***************************
......
Slave_IO_Running: No
Slave_SQL_Running: Yes
......
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 0
Relay_Log_Space: 151
......
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
Last_SQL_Errno: 0
Last_SQL_Error:
......
Auto_Position: 1
1 row in set (0.00 sec)
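The failure can be anticipated before running start slave by comparing the master's gtid_purged with the slave's gtid_executed. The Python sketch below is a simplified model of that comparison, not the server's actual logic: it assumes the I/O thread fails with error 1236 whenever the master has purged at least one GTID the slave has not executed. The helper names `expand_gtid_set`, `missing_on_master` and `can_auto_position` are invented for this illustration.

```python
def expand_gtid_set(text):
    """Expand 'uuid:1-3:5, uuid2:7' into a set of (uuid, number) pairs."""
    members = set()
    for entry in text.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        uuid, *intervals = entry.split(":")
        for interval in intervals:
            start, _, end = interval.partition("-")
            last = int(end or start)
            members.update((uuid, n) for n in range(int(start), last + 1))
    return members


def missing_on_master(master_gtid_purged, slave_gtid_executed):
    """GTIDs the slave still needs but the master can no longer send."""
    purged = expand_gtid_set(master_gtid_purged)
    executed = expand_gtid_set(slave_gtid_executed)
    return purged - executed


def can_auto_position(master_gtid_purged, slave_gtid_executed):
    """Assumed rule: auto-positioning works only if nothing is missing."""
    return not missing_on_master(master_gtid_purged, slave_gtid_executed)


if __name__ == "__main__":
    master_purged = "24024e52-bd95-11e4-9c6d-926853670d0b:1"
    # A freshly initialized slave2 has executed nothing yet.
    print(can_auto_position(master_purged, ""))  # False
```

For Experiment 1 the sketch reports False: the master has purged 24024e52-bd95-11e4-9c6d-926853670d0b:1 and the empty slave2 has never executed it, which is the situation error 1236 describes.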
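Experiment 2: skip the purged part and force replication to proceed
In production you occasionally meet this situation: a slave has been restored from a backup (or loaded with load data infile), and the DBA can guarantee that its data matches the master's; or, even if it does not match exactly, the differences will not break replication later (for example, the master only ever runs INSERTs and never UPDATEs). Under that premise we still want the slave to replicate from the master, so we need to skip the part the master has already purged. How is that done in practice?
We continue with the state left by Experiment 1:
First confirm what has been purged on the master. The output below shows that the master no longer has the logs for the single transaction 24024e52-bd95-11e4-9c6d-926853670d0b:1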
master [localhost] {msandbox} (test) > show global variables like '%gtid%';
+---------------------------------+------------------------------------------+
| Variable_name                   | Value                                    |
+---------------------------------+------------------------------------------+
| binlog_gtid_simple_recovery     | OFF                                      |
| enforce_gtid_consistency        | ON                                       |
| gtid_executed                   | 24024e52-bd95-11e4-9c6d-926853670d0b:1-3 |
| gtid_mode                       | ON                                       |
| gtid_owned                      |                                          |
| gtid_purged                     | 24024e52-bd95-11e4-9c6d-926853670d0b:1   |
| simplified_binlog_gtid_recovery | OFF                                      |
+---------------------------------+------------------------------------------+
7 rows in set (0.00 sec)
On the slave, skip the purged part with set global gtid_purged='xxxx':
slave2 [localhost] {msandbox} ((none)) > stop slave;
Query OK, 0 rows affected (0.04 sec)

slave2 [localhost] {msandbox} ((none)) > set global gtid_purged = '24024e52-bd95-11e4-9c6d-926853670d0b:1';
Query OK, 0 rows affected (0.05 sec)

slave2 [localhost] {msandbox} ((none)) > start slave;
Query OK, 0 rows affected (0.01 sec)

slave2 [localhost] {msandbox} ((none)) > show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
......
Master_Log_File: mysql-bin.000005
Read_Master_Log_Pos: 359
Relay_Log_File: mysql_sandbox21290-relay-bin.000004
Relay_Log_Pos: 569
Relay_Master_Log_File: mysql-bin.000005
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
......
Exec_Master_Log_Pos: 359
Relay_Log_Space: 873
......
Master_Server_Id: 1
Master_UUID: 24024e52-bd95-11e4-9c6d-926853670d0b
Master_Info_File: /data/mysql/rsandbox_mysql-5_6_23/node2/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
......
Retrieved_Gtid_Set: 24024e52-bd95-11e4-9c6d-926853670d0b:2-3
Executed_Gtid_Set: 24024e52-bd95-11e4-9c6d-926853670d0b:1-3
Auto_Position: 1
1 row in set (0.00 sec)
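The slave now replicates normally and has caught up on the binlog events in the range 24024e52-bd95-11e4-9c6d-926853670d0b:2-3. Transaction 1 appears in Executed_Gtid_Set only because gtid_purged marked it as already applied; it was never fetched from the master.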
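The Retrieved_Gtid_Set of 2-3 is what is left after removing the slave's executed set from the master's executed set. MySQL exposes this operation as the SQL function GTID_SUBTRACT(set1, set2); the Python sketch below reproduces the arithmetic offline for small sets. The helper names `expand_gtid_set`, `format_gtid_set` and `gtid_subtract` are chosen for this example only, and the output formatting is an approximation of the server's notation.

```python
def expand_gtid_set(text):
    """Expand 'uuid:1-3:5, uuid2:7' into a set of (uuid, number) pairs."""
    members = set()
    for entry in text.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        uuid, *intervals = entry.split(":")
        for interval in intervals:
            start, _, end = interval.partition("-")
            last = int(end or start)
            members.update((uuid, n) for n in range(int(start), last + 1))
    return members


def format_gtid_set(members):
    """Collapse (uuid, number) pairs back into 'uuid:a-b:c' notation."""
    by_uuid = {}
    for uuid, n in sorted(members):
        ranges = by_uuid.setdefault(uuid, [])
        if ranges and ranges[-1][1] == n - 1:
            ranges[-1][1] = n  # extend the current interval
        else:
            ranges.append([n, n])  # start a new interval
    parts = []
    for uuid, ranges in by_uuid.items():
        intervals = ":".join(
            str(a) if a == b else f"{a}-{b}" for a, b in ranges
        )
        parts.append(f"{uuid}:{intervals}")
    return ",".join(parts)


def gtid_subtract(left, right):
    """GTIDs that are in `left` but not in `right`, as a GTID set string."""
    return format_gtid_set(expand_gtid_set(left) - expand_gtid_set(right))


if __name__ == "__main__":
    master_executed = "24024e52-bd95-11e4-9c6d-926853670d0b:1-3"
    slave_executed = "24024e52-bd95-11e4-9c6d-926853670d0b:1"  # after SET gtid_purged
    print(gtid_subtract(master_executed, slave_executed))
    # 24024e52-bd95-11e4-9c6d-926853670d0b:2-3
```

Running the same subtraction before setting gtid_purged, with an empty slave set, returns 1-3, which includes the purged transaction 1 and explains why the I/O thread could not start.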
https://blog.csdn.net/woailyoo0000/article/details/88981380