Recovering a Slave in a MySQL 5.7 (5.6) GTID Environment: Approaches (and Clever Tricks)

Reposted article, 2016-08-09 17:08. Category: MySQL database.

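Before discussing how to recover a slave, we need to understand two concepts:

GTID_EXECUTED: the set of transactions that have already been recorded in the binary log.

GTID_PURGED: the set of transactions that have already been removed (purged) from the binary log. It is always a subset of GTID_EXECUTED.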
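The relationship between the two sets can be illustrated with a small Python sketch. This is only a model for reasoning about GTID sets, not anything MySQL ships: the helper names `parse_gtid_set` and `is_subset` are hypothetical, and the sample values simply follow the `uuid:first-last` notation used throughout this article.

```python
UUID = "a97983fc-5a29-11e6-9d28-000c29d4dc3f"


def parse_gtid_set(text):
    """Parse 'uuid:1-15:20,uuid2:3' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid, set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A single number such as '22' is an interval of length one.
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def is_subset(inner, outer):
    """True if every GTID in `inner` is also in `outer`."""
    return all(nums <= outer.get(uuid, set()) for uuid, nums in inner.items())


# Sample values (illustrative): 15 transactions executed, the first 13 purged.
executed = parse_gtid_set(UUID + ":1-15")
purged = parse_gtid_set(UUID + ":1-13")

# Transactions that are still available in the binary logs.
still_in_binlog = sorted(executed[UUID] - purged[UUID])

print(is_subset(purged, executed))
print(still_in_binlog)
```

Only the transactions in `still_in_binlog` can still be shipped to a slave from this server's binary logs, which is why the two variables matter for everything that follows.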
        

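Before going further, let's look at how to create a new GTID-based slave.

With the two variables above understood, we only need to:

1. Take a backup from the master and record the master's gtid_executed value at the time of the backup.

2. When restoring that backup on the new slave, set the slave's gtid_purged to the master's gtid_executed value recorded at backup time.

mysqldump can do both of these for us.

Current state on the master (3301):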
         
[zejin] 3301>show global variables like 'gtid_executed';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_executed | a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-15 |
+---------------+-------------------------------------------+
1 row in set (0.00 sec)

[zejin] 3301>show global variables like 'gtid_purged';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_purged   | a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-13 |
+---------------+-------------------------------------------+
1 row in set (0.00 sec)
         
step1: Take a full backup with mysqldump

mysqldump --all-databases --single-transaction --triggers --routines --events --host=127.0.0.1 --port=3301 --user=root --password=123 > dump3301.sql

Opening dump3301.sql, we can see the following statement:

SET @@GLOBAL.GTID_PURGED='a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-15';

This value is the gtid_executed value of master 3301.
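To check that value without opening the dump by hand, a short Python sketch like the one below can pull it out. It assumes the statement has the 5.6/5.7 form shown above; the function name `find_gtid_purged` and the sample text are hypothetical, and newer mysqldump versions may write the statement differently.

```python
import re

# Matches: SET @@GLOBAL.GTID_PURGED='...';  (the form shown in this article)
GTID_PURGED_RE = re.compile(
    r"SET\s+@@GLOBAL\.GTID_PURGED\s*=\s*'([^']*)'", re.IGNORECASE
)


def find_gtid_purged(dump_text):
    """Return the GTID set recorded in a dump, or None if there is none."""
    match = GTID_PURGED_RE.search(dump_text)
    if match is None:
        return None
    # A set with several UUIDs may be wrapped across lines; drop whitespace.
    return "".join(match.group(1).split())


# Hypothetical excerpt standing in for the contents of dump3301.sql.
sample = """-- excerpt of a dump file
SET @@GLOBAL.GTID_PURGED='a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-15';
CREATE TABLE t_users (id INT, name VARCHAR(20));
"""

print(find_gtid_purged(sample))
```

For a real dump, the file contents would be read first, for example with `open("dump3301.sql").read()`, keeping in mind that a large dump may be better scanned line by line.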
         
step2: Start a brand-new instance, 3303. Make sure enforce_gtid_consistency and gtid_mode=on are set in its configuration file.

mysqld_safe --defaults-file=/home/mysql/my3303.cnf &

At this point the new instance 3303 should be in the following state:

[(none)] 3303>show global variables like 'gtid_executed';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_executed |       |
+---------------+-------+
1 row in set (0.01 sec)

[(none)] 3303>show global variables like 'gtid_purged';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_purged   |       |
+---------------+-------+
1 row in set (0.00 sec)
         
step3: Import the backup file and check the variables:

mysql -uroot -h127.0.0.1 -p123 -P3303 < dump3301.sql

[(none)] 3303>show global variables like 'gtid_executed';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_executed | a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-15 |
+---------------+-------------------------------------------+
1 row in set (0.02 sec)

[(none)] 3303>show global variables like 'gtid_purged';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_purged   | a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-15 |
+---------------+-------------------------------------------+
1 row in set (0.00 sec)
         
step4: Run the change master statement to set up replication

[zejin] 3303>change master to master_host='192.168.1.240',master_port=3301,master_user='repl',master_password='123',master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.01 sec)

[zejin] 3303>start slave;
Query OK, 0 rows affected (0.00 sec)

[zejin] 3303>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.240
                  Master_User: repl
                  Master_Port: 3301
                Connect_Retry: 60
              Master_Log_File: binlog57.000014
          Read_Master_Log_Pos: 194
               Relay_Log_File: zejin240-relay-bin.000002
                Relay_Log_Pos: 365
        Relay_Master_Log_File: binlog57.000014
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 194
              Relay_Log_Space: 575
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 3301
                  Master_UUID: a97983fc-5a29-11e6-9d28-000c29d4dc3f
             Master_Info_File: /home/mysql/I3303/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-15
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

This completes adding a new slave to the GTID replication environment.

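Suppose we currently have one master with two slaves:

master (3301)

slave (3302)

slave (3303)

Consider the following failure scenario: for whatever reason, the master has already purged some binlogs that a slave has not yet received (for example, the slave was stopped for a while, and in the meantime the master purged some of its binlogs).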
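The failure condition can be expressed as set arithmetic: with auto-positioning, the slave cannot continue once the master has purged a GTID that the slave has not executed. The Python sketch below models that check. The helper names are hypothetical (`gtid_subtract` mirrors the idea of MySQL's `GTID_SUBTRACT()` function, but is not that function), and the numbers mirror the scenario in this section: the master has purged 1-20 while the slave has only executed 1-16.

```python
UUID = "a97983fc-5a29-11e6-9d28-000c29d4dc3f"


def parse_gtid_set(text):
    """Parse 'uuid:1-15:20,uuid2:3' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid, set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def gtid_subtract(left, right):
    """GTIDs that are in `left` but not in `right`."""
    result = {}
    for uuid, numbers in left.items():
        remaining = numbers - right.get(uuid, set())
        if remaining:
            result[uuid] = remaining
    return result


def can_auto_position(master_purged, slave_executed):
    """The slave can catch up only if it already has everything the master purged."""
    return not gtid_subtract(master_purged, slave_executed)


master_purged = parse_gtid_set(UUID + ":1-20")
slave_executed = parse_gtid_set(UUID + ":1-16")

# Transactions the slave still needs but the master can no longer send.
lost = gtid_subtract(master_purged, slave_executed)

print(can_auto_position(master_purged, slave_executed))
print(sorted(lost[UUID]))
```

When `lost` is non-empty, the I/O thread stops with error 1236, and the missing transactions have to come from somewhere other than the master's binlogs.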
        

The master's current state is:
         
 
          [zejin] 3301>show global variables like 'gtid_executed';
+---------------+-------------------------------------------+
| Variable_name | Value |
+---------------+-------------------------------------------+
| gtid_executed | a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-21 |
+---------------+-------------------------------------------+
1 row in set (0.00 sec)
 
[zejin] 3301>show global variables like 'gtid_purged';
+---------------+-------------------------------------------+
| Variable_name | Value |
+---------------+-------------------------------------------+
| gtid_purged | a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-20 |
+---------------+-------------------------------------------+
1 row in set (0.00 sec)
 
[zejin] 3301>select * from t_users;
+----+------+
| id | name |
+----+------+
| 1 | chen |
| 2 | ok |
| 3 | li |
+----+------+
3 rows in set (0.00 sec) 
         
 
On slave 3303, we see the following error:
        

 
           Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.' 
         

 
          [zejin] 3303>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 192.168.1.240
                  Master_User: repl
                  Master_Port: 3301
                Connect_Retry: 60
              Master_Log_File: binlog57.000014
          Read_Master_Log_Pos: 457
               Relay_Log_File: zejin240-relay-bin.000003
                Relay_Log_Pos: 4
        Relay_Master_Log_File: binlog57.000014
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 457
              Relay_Log_Space: 194
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 3301
                  Master_UUID: a97983fc-5a29-11e6-9d28-000c29d4dc3f
             Master_Info_File: /home/mysql/I3303/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 160809 17:25:39
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: a97983fc-5a29-11e6-9d28-000c29d4dc3f:16
            Executed_Gtid_Set: a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-16
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)


[zejin] 3303>select * from t_users;
+----+------+
| id | name |
+----+------+
|  1 | li   |
|  2 | zhou |
+----+------+
2 rows in set (0.00 sec) 
         
 

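Replication is broken, and the data is no longer consistent.

Now let's see how to recover.

GTIDs are globally unique, and the transactions 3303 is missing have been replicated normally to slave 3302. We can therefore point 3303 at 3302, let it catch up, and then point it back at master 3301. (This assumes that 3302 writes replicated transactions to its own binlog and has not purged them, i.e. it still holds the GTID transactions that 3303 never received from master 3301.)

The procedure is as follows: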
          
 
            [zejin] 3303>change master to master_host='192.168.1.240',master_port=3302,master_user='repl',master_password='123',master_auto_position=1;

[zejin] 3303>start slave;
Query OK, 0 rows affected (0.03 sec)

[zejin] 3303>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.240
                  Master_User: repl
                  Master_Port: 3302
                Connect_Retry: 60
              Master_Log_File: binlog57.000007
          Read_Master_Log_Pos: 1723
               Relay_Log_File: zejin240-relay-bin.000002
                Relay_Log_Pos: 1687
        Relay_Master_Log_File: binlog57.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1723
              Relay_Log_Space: 1937
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 3302
                  Master_UUID: 5cee6f9f-5ab8-11e6-a081-000c29d4dc3f
             Master_Info_File: /home/mysql/I3303/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: a97983fc-5a29-11e6-9d28-000c29d4dc3f:17-21
            Executed_Gtid_Set: a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-21
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

[zejin] 3303>select * from t_users;
+----+------+
| id | name |
+----+------+
|  1 | chen |
|  2 | ok   |
|  3 | li   |
+----+------+
3 rows in set (0.00 sec)


The data now fully matches the master. Once replication is running normally, switch back to master 3301:
[zejin] 3303>change master to master_host='192.168.1.240',master_port=3301,master_user='repl',master_password='123',master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.01 sec)

[zejin] 3303>start slave;
Query OK, 0 rows affected (0.00 sec) 
           
 

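The approach above relies on another slave having received all of the master's binlogs. What if the topology is just M-S (one master, one slave) and the same problem occurs? In that case we have the following two methods.

The master's current state is: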
          
 
            [zejin] 3301>show global variables like '%gtid%';
+----------------------------------+-------------------------------------------+
| Variable_name                    | Value                                     |
+----------------------------------+-------------------------------------------+

| gtid_executed                    | a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-27 |
……
| gtid_purged                      | a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-25 |
……
+----------------------------------+-------------------------------------------+
8 rows in set (0.00 sec)
 
           

              
          

The slave's state is:
          
 
            [zejin] 3303>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 192.168.1.240
                  Master_User: repl
                  Master_Port: 3301
                Connect_Retry: 60
              Master_Log_File: binlog57.000016
          Read_Master_Log_Pos: 729
               Relay_Log_File: zejin240-relay-bin.000003
                Relay_Log_Pos: 4
        Relay_Master_Log_File: binlog57.000016
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 729
              Relay_Log_Space: 194
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 3301
                  Master_UUID: a97983fc-5a29-11e6-9d28-000c29d4dc3f
             Master_Info_File: /home/mysql/I3303/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 160809 17:54:42
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: a97983fc-5a29-11e6-9d28-000c29d4dc3f:22
            Executed_Gtid_Set: a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-22
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec) 
           
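This is the same type of error as before. The recovery idea is:

Set gtid_purged on the slave so that the slave only asks the master for transactions the master has not yet purged, then use a third-party consistency tool to bring the data back in sync.

We first need to run reset master on the slave to clear its GTID information (this also removes the slave's own binary logs). Setting the variable directly fails with the following error: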
           
 
            [zejin] 3303>set global GTID_PURGED="a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-26";
ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty. 
           
 
The correct procedure is as follows (run on the slave):
 
            [zejin] 3303>reset master;
Query OK, 0 rows affected (0.02 sec)

[zejin] 3303>set global GTID_PURGED="a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-26";
Query OK, 0 rows affected (0.00 sec)

[zejin] 3303>start slave;
Query OK, 0 rows affected (0.00 sec)

[zejin] 3303>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.240
                  Master_User: repl
                  Master_Port: 3301
                Connect_Retry: 60
              Master_Log_File: binlog57.000018
          Read_Master_Log_Pos: 728
               Relay_Log_File: zejin240-relay-bin.000004
                Relay_Log_Pos: 718
        Relay_Master_Log_File: binlog57.000018
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 728
              Relay_Log_Space: 968
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 3301
                  Master_UUID: a97983fc-5a29-11e6-9d28-000c29d4dc3f
             Master_Info_File: /home/mysql/I3303/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: a97983fc-5a29-11e6-9d28-000c29d4dc3f:22:27
            Executed_Gtid_Set: a97983fc-5a29-11e6-9d28-000c29d4dc3f:1-27
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

 
           

Of course, after doing this the data is not consistent, so pt-table-checksum and pt-table-sync can then be used to restore consistency.
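To see exactly which transactions were skipped, and therefore why the data differs, one can subtract the slave's old Executed_Gtid_Set (1-22) from the gtid_purged value that was just set (1-26). The Python sketch below is a rough illustration for a single server UUID; `parse_numbers` and `to_intervals` are hypothetical helper names, not part of MySQL or Percona Toolkit.

```python
def parse_numbers(intervals):
    """Parse the part after the UUID, e.g. '1-22' or '22:27', into a set."""
    numbers = set()
    for interval in intervals.split(":"):
        first, _, last = interval.partition("-")
        numbers.update(range(int(first), int(last or first) + 1))
    return numbers


def to_intervals(numbers):
    """Format a set of transaction numbers back into GTID interval notation."""
    if not numbers:
        return ""
    ordered = sorted(numbers)
    parts = []
    start = prev = ordered[0]
    for n in ordered[1:]:
        if n == prev + 1:
            prev = n
            continue
        parts.append((start, prev))
        start = prev = n
    parts.append((start, prev))
    return ":".join(str(a) if a == b else f"{a}-{b}" for a, b in parts)


old_executed = parse_numbers("1-22")  # slave's Executed_Gtid_Set before reset master
new_purged = parse_numbers("1-26")    # value assigned to GTID_PURGED on the slave

# Transactions the slave now claims to have but never actually applied.
skipped = to_intervals(new_purged - old_executed)

print(skipped)
```

The changes made by the skipped transactions are the ones the checksum and sync tools have to repair.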
           

              
          

              
          

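There is another method: rebuild the slave, creating it the same way as at the beginning of this article. However, because the slave already holds some GTID information, reset master must be run on the slave before restoring. The steps are as follows.

On the slave: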
           
 
reset master;
source dump3301.sql;
change master to master_host='192.168.1.240',master_port=3301,master_user='repl',master_password='123',master_auto_position=1;
start slave;
show slave status\G 
           
 
This completes the recovery of the broken slave.
          
