Introduction
MySQL Group Replication (MGR) is, as the name says, group-based replication for MySQL, but in practice it is a high-availability cluster architecture. It is currently available only in MySQL 5.7 and MySQL 8.0.
Released officially by MySQL in December 2016, it is a solution for high availability, high scalability, and high reliability in MySQL clusters.
It is MySQL's own implementation of the group-replication concept, drawing heavily on MariaDB Galera Cluster and Percona XtraDB Cluster.
MySQL Group Replication is built on top of XCom, a Paxos-based group communication engine. It is this XCom infrastructure, keeping the database state machines consistent across nodes, that guarantees transactional consistency between nodes both in theory and in practice.
It extends the familiar master-slave replication model: multiple nodes form one database cluster, a transaction may commit only after more than half of the nodes agree to it, and every node maintains a database state machine, which keeps transactions consistent across nodes.
Characteristics of group replication:
● Strong consistency
Built on native replication plus the Paxos protocol and shipped as a plugin, it provides consistent, safe data guarantees.
● High fault tolerance
The group keeps working as long as a majority of nodes are alive. Failure detection is automatic; when different nodes contend for the same resource, the conflict is resolved without error on a first-committer-wins basis, and automatic split-brain protection is built in.
● High scalability
Adding and removing nodes is automatic. A new node synchronizes state from the other nodes until it catches up; when a node is removed, the remaining nodes update and maintain the group membership automatically.
● High flexibility
There are single-primary and multi-primary modes. In single-primary mode a primary is elected automatically and all updates go to it;
in multi-primary mode, every server can handle updates concurrently.
Advantages:
Strong consistency: group replication is based on native replication plus the Paxos protocol.
High fault tolerance: automatic failure detection evicts a crashed node while the remaining nodes keep serving (similar to a ZooKeeper ensemble); write conflicts between nodes are resolved first-committer-wins, and split-brain protection is built in.
High scalability: nodes can be added and removed online at any time; a new node synchronizes automatically until it is consistent with the rest, and group membership is maintained automatically.
High flexibility: it installs as an ordinary plugin (the .so ships with MySQL from 5.7.17 onward) and offers single-primary and multi-primary modes. In single-primary mode only the primary is writable; the other members are put into super_read_only and accept reads only, and a new primary is elected automatically on failure.
Disadvantages:
It is still young and not entirely stable, its performance currently lags slightly behind PXC, and it is very demanding of network stability; at minimum, all members should be in the same data center.
Use cases:
1. Elastic replication environments
Group replication makes it easy to grow and shrink the number of database instances in the cluster.
2. Highly available database environments
Group replication tolerates instance failures; as long as a majority of the servers in the cluster is available, the database service as a whole stays available.
3. A replacement for traditional master-slave replication topologies.
Prerequisites for group replication:
1. Only the InnoDB storage engine is supported.
2. Every table must have a primary key.
3. Only IPv4 networking is supported.
4. It requires high network bandwidth (typically a gigabit LAN) and low network latency.
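Before enabling MGR it is worth auditing the instance against requirements 1 and 2. The following is a sketch against information_schema (adjust the excluded schemas to your environment); it lists non-InnoDB tables and tables without a primary key:

```sql
-- Tables not using InnoDB
SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE table_schema NOT IN ('mysql','information_schema','performance_schema','sys')
  AND table_type = 'BASE TABLE'
  AND engine <> 'InnoDB';

-- Tables without a primary key
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
  ON  c.table_schema    = t.table_schema
  AND c.table_name      = t.table_name
  AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_schema NOT IN ('mysql','information_schema','performance_schema','sys')
  AND t.table_type = 'BASE TABLE'
  AND c.constraint_name IS NULL;
```

Any table these queries return must be fixed (converted to InnoDB, or given a primary key) before it can replicate in the group.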
What are the limitations of group replication?
1. Replication Event Checksums
Because of how the code is designed, group replication cannot yet use binlog checksums; to use it you must configure binlog-checksum=NONE.
2. Gap Locks
The certification process does not take gap locks into account (MySQL's gap locks exist to prevent phantom reads).
3. Table Locks and Named Locks
The certification process ignores table-level locks and named locks.
4. SERIALIZABLE Isolation Level
Group replication does not support the SERIALIZABLE isolation level.
5. Concurrent DDL versus DML Operations
In multi-primary mode, concurrent DDL and DML operations are not supported.
6. Foreign Keys with Cascading Constraints
Multi-primary mode does not support foreign keys with cascading constraints.
7. Very Large Transactions
Group replication does not support very large transactions.
The group replication protocol:
MGR uses the Paxos algorithm to solve the consistency problem. My own mental model is a courtroom:
The people on the floor are all proposers; they may put forward their opinions and suggestions, but the final decision rests with the judges, of whom there may be one or several.
The proposers race to reach the judges. Each proposer submits a proposal, and each proposal is assigned a unique ID. If a proposer manages to reach a majority of the judges (more than half), the proposal can be accepted.
A judge records the proposal's unique ID and compares it with the previous ID: if the new ID is greater than the old one, the proposal is taken up, and the judges then decide on it together by majority vote.
If the ID is less than or equal to the old one, the judge rejects the proposal; the proposer then increments the ID and submits again, until it is accepted.
At any given moment, only one proposal can be accepted.
MGR conflict detection
In MGR multi-primary mode, a transaction is not checked up front while it executes; at commit time, the member communicates with the other nodes to reach a decision on whether the transaction may commit. When several nodes modify the same rows, the conflict is detected during this commit-time certification, and the transaction that committed first wins. For example, if transaction t1 reaches certification before t2 for the same row, t1 passes certification and commits, while t2 is rolled back. Clearly, concurrent modification of the same rows under multi-point writing performs poorly because of the many rollbacks, so MySQL officially recommends routing modifications of the same rows to a single node: local locks on that node then serialize the writers, reducing rollbacks and improving performance.
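First-committer-wins can be seen directly in multi-primary mode. A hedged illustration, assuming a hypothetical table t (with primary key id) that already exists on all members; run the two sessions against two different nodes:

```sql
-- Session A on node1, session B on node2, both touching the same row:
-- Session A:
BEGIN;
UPDATE t SET val = 'from-node1' WHERE id = 1;

-- Session B (no cross-node lock blocks this; certification happens at commit):
BEGIN;
UPDATE t SET val = 'from-node2' WHERE id = 1;

-- Session A commits first and wins certification:
COMMIT;

-- Session B's commit now fails certification and the transaction is rolled
-- back (typically with an error telling you the plugin aborted the transaction):
COMMIT;
```

This is exactly why the text recommends sending writes to the same rows through a single node.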
How a new node joins an MGR group
In MGR, when a new node asks to join the group, a View_change event is generated in the group; every online member writes this event to its binlog, and the joining node records it as well. The new node then goes through two phases.
- Phase one: the new node picks one online member of the group as its donor and, over a standard asynchronous replication channel, pulls the donor's binlog and applies it. At the same time, the new node receives the transactions currently being exchanged within the group and caches them in a queue. Once the binlog has been applied up to the View_change event, the asynchronous channel is closed and phase two begins.
- Phase two: the new node processes the group transactions cached in the queue; when the queue drains to zero, the node's state within the group becomes ONLINE.
If anything goes wrong in phase one, the new node automatically picks another online member as donor; if that fails as well, it keeps trying until every online member has been tried, then sleeps for a while and retries. The sleep time and retry count are controlled by the corresponding parameters.
In phase one, the starting point for applying binlog is determined by the new node's gtid_executed, and the end point by the View_change event. Because this phase uses traditional asynchronous binlog replication, a node built from an old backup may find that the required binlog is no longer available: the node then sits in the RECOVERING state, and after the configured interval and number of retries, recovery fails and the node leaves the group. In the other failure mode the binlog can be found but is so large that applying it takes very long, and the phase-two cache queue may also grow large, so the whole recovery drags on. It is therefore recommended to seed a new node from the most recent full backup available.
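While a node is in phase one, the recovery can be watched from SQL. A sketch; the last two variables are the retry count and sleep interval the text refers to:

```sql
-- The point the joining node starts applying the donor's binlog from:
SELECT @@global.gtid_executed;

-- The joining member shows RECOVERING until both phases finish, then ONLINE:
SELECT member_host, member_state
FROM performance_schema.replication_group_members;

-- Retry count and reconnect interval used when donors keep failing:
SHOW VARIABLES LIKE 'group_replication_recovery_retry_count';
SHOW VARIABLES LIKE 'group_replication_recovery_reconnect_interval';
```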
Installation
1. Environment planning
| IP address      | MySQL version | DB port | Server-ID | MGR port | OS         |
| 192.168.202.174 | MySQL 5.7.25  | 3306    | 174       | 24901    | CentOS 7.5 |
| 192.168.202.175 | MySQL 5.7.25  | 3306    | 175       | 24901    | CentOS 7.5 |
| 192.168.202.176 | MySQL 5.7.25  | 3306    | 176       | 24901    | CentOS 7.5 |
Multi-primary mode is best run with at least three nodes; for single-primary mode it depends on the situation, but a group can have at most 9 members. Keep the servers' specifications as equal as possible: like PXC, MGR suffers from the weakest-link effect.
Note in particular that the MySQL service port and the MGR service port are two different things and must be kept apart.
Unique server-id values are mandatory, just as they already are for plain master-slave replication.
2. Prepare the servers
Disable SELinux:
vim /etc/selinux/config
SELINUX=disabled
Map the hostnames:
vim /etc/hosts
192.168.202.174 node1
192.168.202.175 node2
192.168.202.176 node3
Open the firewall ports:
firewall-cmd --zone=public --add-port=3306/tcp --permanent
firewall-cmd --zone=public --add-port=24901/tcp --permanent
firewall-cmd --reload
firewall-cmd --zone=public --list-ports
3. Install and deploy
For installing MySQL 5.7.25 itself, see https://www.cnblogs.com/EikiXu/p/10595093.html
So let's go straight to installing MGR. As mentioned above, from MySQL 5.7.17 onward the MGR plugin ships with the server; it simply is not installed by default, much like the semisynchronous replication plugins, so none of its options exist yet.
Every server in the cluster must have the MGR plugin installed for the feature to work.
As you can see, it is absent at first:
(root@localhost) 11:42:26 [(none)]> show plugins;
+----------------------------+----------+--------------------+---------+---------+
| Name                       | Status   | Type               | Library | License |
+----------------------------+----------+--------------------+---------+---------+
| binlog                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| mysql_native_password      | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| sha256_password            | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| InnoDB                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| INNODB_TRX                 | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_LOCKS               | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_LOCK_WAITS          | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP                 | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP_RESET           | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMPMEM              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMPMEM_RESET        | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP_PER_INDEX       | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP_PER_INDEX_RESET | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_BUFFER_PAGE         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_BUFFER_PAGE_LRU     | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_BUFFER_POOL_STATS   | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_TEMP_TABLE_INFO     | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_METRICS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_DEFAULT_STOPWORD | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_DELETED          | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_BEING_DELETED    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_CONFIG           | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_INDEX_CACHE      | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_INDEX_TABLE      | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_TABLES          | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_TABLESTATS      | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_INDEXES         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_COLUMNS         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_FIELDS          | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_FOREIGN         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_FOREIGN_COLS    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_TABLESPACES     | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_DATAFILES       | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_VIRTUAL         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| MyISAM                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| MRG_MYISAM                 | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| MEMORY                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| CSV                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| PERFORMANCE_SCHEMA         | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| BLACKHOLE                  | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| partition                  | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| ARCHIVE                    | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| FEDERATED                  | DISABLED | STORAGE ENGINE     | NULL    | GPL     |
| ngram                      | ACTIVE   | FTPARSER           | NULL    | GPL     |
+----------------------------+----------+--------------------+---------+---------+
The MGR parameters are not loaded either; only one unrelated variable matches:
(root@localhost) 11:43:27 [(none)]> show variables like 'group%';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| group_concat_max_len | 1024  |
+----------------------+-------+
1 row in set (0.01 sec)
Next, check the current plugin directory:
(root@localhost) 11:43:49 [(none)]> show variables like 'plugin_dir';
+---------------+------------------------------+
| Variable_name | Value                        |
+---------------+------------------------------+
| plugin_dir    | /usr/local/mysql/lib/plugin/ |
+---------------+------------------------------+
1 row in set (0.00 sec)
Then verify that the MGR plugin file we need is actually there:
[root@node1 mysqldata]# ll /usr/local/mysql/lib/plugin/ |grep group_replication
-rwxr-xr-x. 1 mysql mysql 16393641 Dec 21 19:11 group_replication.so
Finally, go back into the mysql server and install it:
(root@localhost) 11:45:25 [(none)]> install PLUGIN group_replication SONAME 'group_replication.so';
Query OK, 0 rows affected (0.10 sec)
Now it is there:
(root@localhost) 11:45:26 [(none)]> show plugins;
+----------------------------+----------+--------------------+----------------------+---------+
| Name                       | Status   | Type               | Library              | License |
+----------------------------+----------+--------------------+----------------------+---------+
| binlog                     | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| mysql_native_password      | ACTIVE   | AUTHENTICATION     | NULL                 | GPL     |
| sha256_password            | ACTIVE   | AUTHENTICATION     | NULL                 | GPL     |
........
| ARCHIVE                    | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| FEDERATED                  | DISABLED | STORAGE ENGINE     | NULL                 | GPL     |
| ngram                      | ACTIVE   | FTPARSER           | NULL                 | GPL     |
| group_replication          | ACTIVE   | GROUP REPLICATION  | group_replication.so | GPL     |
+----------------------------+----------+--------------------+----------------------+---------+
45 rows in set (0.00 sec)
And looking at the MGR parameters again, there are now plenty of them:
(root@localhost) 11:45:47 [(none)]> show variables like 'group%';
+----------------------------------------------------+----------------------+
| Variable_name                                      | Value                |
+----------------------------------------------------+----------------------+
| group_concat_max_len                               | 1024                 |
| group_replication_allow_local_disjoint_gtids_join  | OFF                  |
| group_replication_allow_local_lower_version_join   | OFF                  |
| group_replication_auto_increment_increment         | 7                    |
| group_replication_bootstrap_group                  | OFF                  |
| group_replication_components_stop_timeout          | 31536000             |
| group_replication_compression_threshold            | 1000000              |
| group_replication_enforce_update_everywhere_checks | OFF                  |
| group_replication_exit_state_action                | READ_ONLY            |
| group_replication_flow_control_applier_threshold   | 25000                |
| group_replication_flow_control_certifier_threshold | 25000                |
| group_replication_flow_control_mode                | QUOTA                |
| group_replication_force_members                    |                      |
| group_replication_group_name                       |                      |
| group_replication_group_seeds                      |                      |
| group_replication_gtid_assignment_block_size       | 1000000              |
| group_replication_ip_whitelist                     | AUTOMATIC            |
| group_replication_local_address                    |                      |
| group_replication_member_weight                    | 50                   |
| group_replication_poll_spin_loops                  | 0                    |
| group_replication_recovery_complete_at             | TRANSACTIONS_APPLIED |
| group_replication_recovery_reconnect_interval      | 60                   |
| group_replication_recovery_retry_count             | 10                   |
| group_replication_recovery_ssl_ca                  |                      |
| group_replication_recovery_ssl_capath              |                      |
| group_replication_recovery_ssl_cert                |                      |
| group_replication_recovery_ssl_cipher              |                      |
| group_replication_recovery_ssl_crl                 |                      |
| group_replication_recovery_ssl_crlpath             |                      |
| group_replication_recovery_ssl_key                 |                      |
| group_replication_recovery_ssl_verify_server_cert  | OFF                  |
| group_replication_recovery_use_ssl                 | OFF                  |
| group_replication_single_primary_mode              | ON                   |
| group_replication_ssl_mode                         | DISABLED             |
| group_replication_start_on_boot                    | ON                   |
| group_replication_transaction_size_limit           | 0                    |
| group_replication_unreachable_majority_timeout     | 0                    |
+----------------------------------------------------+----------------------+
37 rows in set (0.01 sec)
Some of the values above were preconfigured by me; they are explained in detail below.
4. Configure MGR
As anyone familiar with MySQL knows, most settings can be applied online with set global, so you are not limited to the config file; both the parameters and the commands are given here.
Suppose we first write the settings into my.cnf.
First, MGR absolutely requires GTID, so it must be enabled; note that this one really does require a restart to take effect:
#Enable GTID; mandatory
gtid_mode=on
#Enforce GTID consistency
enforce-gtid-consistency=on
Next, the general-purpose settings:
#Binlog format; MGR requires ROW (and ROW is the best choice even without MGR)
binlog_format=row
#server-id must be unique
server-id = 174
#MGR uses optimistic locking, so the manual recommends READ-COMMITTED to reduce lock granularity
transaction_isolation = READ-COMMITTED
#Members cross-check binlog data during failure recovery, so each server must log
#the already-executed binlog it receives from the others, distinguished by GTID
log-slave-updates=1
#Binlog checksum rule: CRC32 from 5.6 onward, NONE in older versions; MGR requires NONE
binlog_checksum=NONE
#For safety, MGR requires the slave metadata to be stored in tables, otherwise it errors out
master_info_repository=TABLE
#Same as above
relay_log_info_repository=TABLE
#Number of parallel applier SQL threads
slave-parallel-workers=N
#Parallel replication based on group commit
slave-parallel-type=LOGICAL_CLOCK
#Commit transactions on the slave in relay-log order
slave-preserve-commit-order=1
#Adjust to your installation
plugin_dir=/usr/local/mysql/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so;group_replication.so"
report_host=192.168.202.174
report_port=3306
And finally MGR's own configuration parameters:
#Algorithm used to hash transaction write sets; the manual recommends XXHASH64
transaction_write_set_extraction = XXHASH64
#Effectively the group's name, a UUID; it must not collide with the UUID of any other GTID set
#in the cluster. Generate a fresh one with uuidgen. It identifies this group on the internal
#network and is also the UUID used for this group's own GTIDs
loose-group_replication_group_name = 'cc5e2627-2285-451f-86e6-0be21581539f'
#IP whitelist; by default only 127.0.0.1 is allowed, so no external host can connect. Set as security requires
loose-group_replication_ip_whitelist = '127.0.0.1/8,192.168.202.0/24'
#Whether group replication starts automatically with the server; not recommended, to avoid
#disturbing data accuracy in unusual crash-recovery situations
loose-group_replication_start_on_boot = OFF
#This member's MGR address, host:port -- the MGR port, not the database port
loose-group_replication_local_address = '192.168.202.174:24901'
#The members of this MGR group, again using the MGR ports, not the database ports
loose-group_replication_group_seeds = '192.168.202.174:24901,192.168.202.175:24901,192.168.202.176:24901'
#Bootstrap mode, used to seed the membership when building (or rebuilding) the group;
#enable it on only one member of the cluster
loose-group_replication_bootstrap_group = OFF
#Single-primary mode: if ON, this instance is the primary and handles reads and writes while
#the others are read-only; OFF means multi-primary mode
loose-group_replication_single_primary_mode = off
#In multi-primary mode, force every member to check whether each operation is allowed;
#can be turned off when not running multi-primary
loose-group_replication_enforce_update_everywhere_checks = on
A few of these parameters deserve a closer look:
group_replication_group_name: this must be a UUID of its own, distinct from the GTID UUIDs of every other database in the cluster; on Linux you can generate a fresh one with uuidgen.
group_replication_ip_whitelist: the whitelist is really a security setting, and covering the whole internal network is not ideal; I set it this way purely for convenience. It can be changed dynamically with set global, which is handy.
group_replication_start_on_boot: there are two reasons not to start with the system. First, in extreme crash-recovery situations an automatic start could compromise data accuracy; second, it can interfere with node add/remove operations.
group_replication_local_address: note in particular that this port is not the database service port but the MGR service port, used for member-to-member communication, and you must make sure the port is free.
group_replication_group_seeds: the IP addresses and ports of the members under this group's control; this port is also the MGR service port. It can be changed with set global to add and remove nodes.
group_replication_bootstrap_group: note that only one server bootstraps the group, so every other server in the cluster leaves this at the default OFF and enables it with set global only when needed.
group_replication_single_primary_mode: depends on whether you want single-primary or multi-primary mode. Single-primary mode resembles semisynchronous replication, but with stronger guarantees: the primary reports a write as successful only after more than half of the group has accepted it, giving higher data consistency, which is why it is generally the recommended choice for financial services. Multi-primary mode looks faster, but the probability of transaction conflicts is correspondingly higher; MGR resolves them first-committer-wins internally, but that cannot be ignored, and under high concurrency it can be fatal. So with multiple primaries it is usually best to partition the workload, one connection address per database, separating things at the logical level to avoid any chance of conflict.
group_replication_enforce_update_everywhere_checks: in single-primary mode there is no concurrent multi-primary writing, so this strict check can be turned off, since such operations no longer exist. In multi-primary mode it must be on, or the data may end up corrupted.
Applied dynamically with set global, the same settings look like this:
set global transaction_write_set_extraction='XXHASH64';
set global group_replication_start_on_boot=OFF;
set global group_replication_bootstrap_group = OFF;
set global group_replication_group_name= 'cc5e2627-2285-451f-86e6-0be21581539f';
set global group_replication_local_address='10.0.2.5:33081';
set global group_replication_group_seeds='10.0.2.5:33081,10.0.2.6:33081,10.0.2.7:33081';
set global group_replication_ip_whitelist = '127.0.0.1/8,192.168.1.0/24,10.0.0.1/8,10.18.89.49/22';
set global group_replication_single_primary_mode=off;
set global group_replication_enforce_update_everywhere_checks=on;
Note in particular that these settings must be identical on every database server in the same group, otherwise you will get errors or other strange behavior. Of course, server-id and the local IP address and port still have to differ per machine.
With the configuration done we are ready to start, but the startup order matters, so pay attention.
5. Start the MGR cluster
As said above, MGR members must be started in order, because one of the databases has to bootstrap the group before the others can join it smoothly.
In single-primary mode, the intended primary must be started first and do the bootstrapping, otherwise it will not become the primary.
When anything goes wrong, check the MySQL error log file mysql.err; there is usually a relevant error message.
Back to the task: let's bootstrap from 192.168.202.174 (node1). First log in to the local mysql server:
SET SQL_LOG_BIN=0;
#Bootstrap the group. Note: only this one server bootstraps; skip this step on the other two
mysql> SET GLOBAL group_replication_bootstrap_group=ON;
#Create the replication user and grant privileges; every server in the cluster needs this
mysql> create user 'sroot'@'%' identified by '123123';
mysql> grant REPLICATION SLAVE on *.* to 'sroot'@'%' with grant option;
#Clear out any old GTID information to avoid conflicts
mysql> reset master;set sql_log_bin=1;
#Set the recovery channel credentials (the user just granted); note the syntax
#differs from an ordinary master-slave CHANGE MASTER
mysql> CHANGE MASTER TO MASTER_USER='sroot', MASTER_PASSWORD='123123' FOR CHANNEL 'group_replication_recovery';
#Start MGR
mysql> start group_replication;
#Check that it started: ONLINE means success
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | a29a1b91-4908-11e8-848b-08002778eea7 | ubuntu      | 3308        | ONLINE       | PRIMARY     | 8.0.11         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
1 row in set (0.02 sec)
#Now the bootstrap flag can be switched off again
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
Then on the other two servers, 192.168.202.175 and 192.168.202.176, again from the local mysql client:
#No bootstrapping needed here; the rest is much the same
#The user and grants are still required
SET SQL_LOG_BIN=0;
mysql> create user 'sroot'@'%' identified by '123123';
mysql> grant REPLICATION SLAVE on *.* to 'sroot'@'%' with grant option;
#Clear out any old GTID information to avoid conflicts
mysql> reset master;SET SQL_LOG_BIN=1;
#Set the recovery channel credentials, as before
mysql> CHANGE MASTER TO MASTER_USER='sroot', MASTER_PASSWORD='123123' FOR CHANNEL 'group_replication_recovery';
#Start MGR
mysql> start group_replication;
#Check that it started: ONLINE means success
(root@localhost) 13:40:12 [(none)]> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | dbab147e-5a88-11e9-b9c3-000c297c0a9d | node1       | 3306        | ONLINE       |
| group_replication_applier | f8d7e601-5a88-11e9-bc6b-000c29f334ce | node2       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
By the same logic, on 192.168.202.176 it should look like this:
(root@localhost) 13:42:04 [(none)]> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 0c22257b-5a89-11e9-bebd-000c29741024 | node3       | 3306        | ONLINE       |
| group_replication_applier | dbab147e-5a88-11e9-b9c3-000c297c0a9d | node1       | 3306        | ONLINE       |
| group_replication_applier | f8d7e601-5a88-11e9-bc6b-000c29f334ce | node2       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)
Seeing MEMBER_STATE at ONLINE for every member means they are all connected. A member that fails is evicted from the group and shows ERROR locally, at which point you need to check that machine's MySQL error log, mysql.err.
Note that this is multi-primary mode, so MEMBER_ROLE shows PRIMARY everywhere; in single-primary mode only one member shows PRIMARY and the rest show SECONDARY.
6. Usage
In multi-primary mode, all of the following connections accept both reads and writes:
mysql -usroot -p123123 -h192.168.202.174 -P3306
mysql -usroot -p123123 -h192.168.202.175 -P3306
mysql -usroot -p123123 -h192.168.202.176 -P3306
I won't walk through the operations: create, insert, and delete work exactly as on an ordinary MySQL, and you will see the data appear on the other servers.
In single-primary mode, only the PRIMARY member can write; SECONDARY members can read but not write, like this:
mysql> select * from ttt;
+----+--------+
| id | name   |
+----+--------+
|  1 | ggg    |
|  2 | ffff   |
|  3 | hhhhh  |
|  4 | tyyyyy |
|  5 | aaaaaa |
+----+--------+
5 rows in set (0.00 sec)

mysql> delete from ttt where id = 5;
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I won't expand on day-to-day operations here; once the cluster is up you can experiment at leisure.
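A minimal smoke test, with hypothetical database and table names (mgr_test, t1); run the writes on any writable member and the read on a different member:

```sql
-- On a writable member (any node in multi-primary mode, the PRIMARY in single-primary mode):
CREATE DATABASE mgr_test;
CREATE TABLE mgr_test.t1 (
    id   INT NOT NULL PRIMARY KEY,   -- a primary key is mandatory under MGR
    name VARCHAR(20)
) ENGINE = InnoDB;                   -- only InnoDB is supported
INSERT INTO mgr_test.t1 VALUES (1, 'hello');

-- On another member: the row should already be visible there
SELECT * FROM mgr_test.t1;
```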
7. Administration and maintenance
To verify what was said above, first look at the current GTID set and the slave status:
#Check the GTID set: the UUID is the group UUID configured earlier
mysql> show master status;
+------------------+----------+--------------+------------------+---------------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                                 |
+------------------+----------+--------------+------------------+---------------------------------------------------+
| mysql-bin.000003 |     4801 |              |                  | cc5e2627-2285-451f-86e6-0be21581539f:1-23:1000003 |
+------------------+----------+--------------+------------------+---------------------------------------------------+
1 row in set (0.00 sec)

#Slave status is empty, because this is not a master-slave topology at all
mysql> show slave status;
Empty set (0.00 sec)
One of the commands above already queries this node's membership information; here are some of the commonly used commands:
#List every member of the group
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | a29a1b91-4908-11e8-848b-08002778eea7 | ubuntu      | 3308        | ONLINE       | PRIMARY     | 8.0.11         |
| group_replication_applier | af892b6e-49ca-11e8-9c9e-080027b04376 | ubuntu      | 3308        | ONLINE       | SECONDARY   | 8.0.11         |
| group_replication_applier | d058176a-51cf-11e8-8c95-080027e7b723 | ubuntu      | 3308        | ONLINE       | SECONDARY   | 8.0.11         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

#Synchronization within the group: the current replication status, member by member
mysql> select * from performance_schema.replication_group_member_stats\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
VIEW_ID: 15258529121778212:5
MEMBER_ID: a29a1b91-4908-11e8-848b-08002778eea7
COUNT_TRANSACTIONS_IN_QUEUE: 0
COUNT_TRANSACTIONS_CHECKED: 9
COUNT_CONFLICTS_DETECTED: 0
COUNT_TRANSACTIONS_ROWS_VALIDATING: 0
TRANSACTIONS_COMMITTED_ALL_MEMBERS: cc5e2627-2285-451f-86e6-0be21581539f:1-23:1000003
LAST_CONFLICT_FREE_TRANSACTION: cc5e2627-2285-451f-86e6-0be21581539f:23
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
COUNT_TRANSACTIONS_REMOTE_APPLIED: 3
COUNT_TRANSACTIONS_LOCAL_PROPOSED: 9
COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
*************************** 2. row ***************************
CHANNEL_NAME: group_replication_applier
VIEW_ID: 15258529121778212:5
MEMBER_ID: af892b6e-49ca-11e8-9c9e-080027b04376
COUNT_TRANSACTIONS_IN_QUEUE: 0
COUNT_TRANSACTIONS_CHECKED: 9
COUNT_CONFLICTS_DETECTED: 0
COUNT_TRANSACTIONS_ROWS_VALIDATING: 0
TRANSACTIONS_COMMITTED_ALL_MEMBERS: cc5e2627-2285-451f-86e6-0be21581539f:1-23:1000003
LAST_CONFLICT_FREE_TRANSACTION: cc5e2627-2285-451f-86e6-0be21581539f:23
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
COUNT_TRANSACTIONS_REMOTE_APPLIED: 10
COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
*************************** 3. row ***************************
CHANNEL_NAME: group_replication_applier
VIEW_ID: 15258529121778212:5
MEMBER_ID: d058176a-51cf-11e8-8c95-080027e7b723
COUNT_TRANSACTIONS_IN_QUEUE: 0
COUNT_TRANSACTIONS_CHECKED: 9
COUNT_CONFLICTS_DETECTED: 0
COUNT_TRANSACTIONS_ROWS_VALIDATING: 0
TRANSACTIONS_COMMITTED_ALL_MEMBERS: cc5e2627-2285-451f-86e6-0be21581539f:1-23:1000003
LAST_CONFLICT_FREE_TRANSACTION: cc5e2627-2285-451f-86e6-0be21581539f:23
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
COUNT_TRANSACTIONS_REMOTE_APPLIED: 9
COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
3 rows in set (0.00 sec)

#Status of each replication channel on this server
mysql> select * from performance_schema.replication_connection_status\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
GROUP_NAME: cc5e2627-2285-451f-86e6-0be21581539f
SOURCE_UUID: cc5e2627-2285-451f-86e6-0be21581539f
THREAD_ID: NULL
SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 0
LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00.000000
RECEIVED_TRANSACTION_SET: cc5e2627-2285-451f-86e6-0be21581539f:1-23:1000003
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
LAST_QUEUED_TRANSACTION: cc5e2627-2285-451f-86e6-0be21581539f:23
LAST_QUEUED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2018-05-09 16:38:08.035692
LAST_QUEUED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP: 2018-05-09 16:38:08.031639
LAST_QUEUED_TRANSACTION_END_QUEUE_TIMESTAMP: 2018-05-09 16:38:08.031753
QUEUEING_TRANSACTION:
QUEUEING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
QUEUEING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
QUEUEING_TRANSACTION_START_QUEUE_TIMESTAMP: 0000-00-00 00:00:00.000000
*************************** 2. row ***************************
CHANNEL_NAME: group_replication_recovery
GROUP_NAME:
SOURCE_UUID:
THREAD_ID: NULL
SERVICE_STATE: OFF
COUNT_RECEIVED_HEARTBEATS: 0
LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00.000000
RECEIVED_TRANSACTION_SET:
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
LAST_QUEUED_TRANSACTION:
LAST_QUEUED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
LAST_QUEUED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP: 0000-00-00 00:00:00.000000
LAST_QUEUED_TRANSACTION_END_QUEUE_TIMESTAMP: 0000-00-00 00:00:00.000000
QUEUEING_TRANSACTION:
QUEUEING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
QUEUEING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
QUEUEING_TRANSACTION_START_QUEUE_TIMESTAMP: 0000-00-00 00:00:00.000000
2 rows in set (0.00 sec)

#Whether each channel on this server is enabled (ON = enabled)
mysql> select * from performance_schema.replication_applier_status;
+----------------------------+---------------+-----------------+----------------------------+
| CHANNEL_NAME               | SERVICE_STATE | REMAINING_DELAY | COUNT_TRANSACTIONS_RETRIES |
+----------------------------+---------------+-----------------+----------------------------+
| group_replication_applier  | ON            | NULL            | 0                          |
| group_replication_recovery | OFF           | NULL            | 0                          |
+----------------------------+---------------+-----------------+----------------------------+
2 rows in set (0.00 sec)

#In single-primary mode, find which member is the primary (shown by UUID only)
mysql> select * from performance_schema.global_status where VARIABLE_NAME='group_replication_primary_member';
+----------------------------------+--------------------------------------+
| VARIABLE_NAME                    | VARIABLE_VALUE                       |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | a29a1b91-4908-11e8-848b-08002778eea7 |
+----------------------------------+--------------------------------------+
1 row in set (0.00 sec)
For example:
mysql> show global variables like 'server_uuid';
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | af892b6e-49ca-11e8-9c9e-080027b04376 |
+---------------+--------------------------------------+
1 row in set (0.00 sec)

mysql> show global variables like 'super%';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| super_read_only | ON    |
+-----------------+-------+
1 row in set (0.00 sec)
Clearly this server is not the primary: super_read_only is on.
8. Switching to multi-primary mode
Changing modes requires restarting group replication, so you must first stop group replication on every node, set group_replication_single_primary_mode=OFF (and the related parameters), then start group replication again.
# Stop group replication (on all nodes):
mysql> stop group_replication;
mysql> set global group_replication_single_primary_mode=OFF;
mysql> set global group_replication_enforce_update_everywhere_checks=ON;

# On any one node:
mysql> SET GLOBAL group_replication_bootstrap_group=ON;
mysql> START GROUP_REPLICATION;
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;

# On the other nodes:
mysql> START GROUP_REPLICATION;

# Check the group: MEMBER_ROLE is PRIMARY on every node
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 5254cf46-8415-11e8-af09-fa163eab3dcf | 192.168.56.102 | 3306        | ONLINE       | PRIMARY     | 8.0.11         |
| group_replication_applier | 601a4025-8415-11e8-b2b6-fa163e767b9a | 192.168.56.103 | 3306        | ONLINE       | PRIMARY     | 8.0.11         |
| group_replication_applier | 8cb3f19b-8414-11e8-9d34-fa163eda7360 | 192.168.56.101 | 3306        | ONLINE       | PRIMARY     | 8.0.11         |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
Every node is ONLINE with role PRIMARY: the multi-primary MGR group is up.
Switching back to single-primary mode
# On all nodes:
mysql> stop group_replication;
mysql> set global group_replication_enforce_update_everywhere_checks=OFF;
mysql> set global group_replication_single_primary_mode=ON;

# On the primary (192.168.56.101):
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

# On the secondaries (192.168.56.102, 192.168.56.103):
START GROUP_REPLICATION;

# Check the MGR group
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 5254cf46-8415-11e8-af09-fa163eab3dcf | 192.168.56.102 | 3306        | ONLINE       | SECONDARY   | 8.0.11         |
| group_replication_applier | 601a4025-8415-11e8-b2b6-fa163e767b9a | 192.168.56.103 | 3306        | ONLINE       | SECONDARY   | 8.0.11         |
| group_replication_applier | 8cb3f19b-8414-11e8-9d34-fa163eda7360 | 192.168.56.101 | 3306        | ONLINE       | PRIMARY     | 8.0.11         |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
9. Troubleshooting
1. Note: the user creation and password changes earlier must be done with binlog logging disabled (and re-enabled afterwards); otherwise START GROUP_REPLICATION fails.
The error is:
ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.
The MySQL error log shows:
2017-04-17T12:06:55.113724+08:00 0 [ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 423ccc44-2318-11
e7-96e9-fa163e6e90d0:1-4 > Group transactions: 7e29f043-2317-11e7-9594-fa163e98778e:1-5,
test-testtest-test-testtesttest:1-6'
2017-04-17T12:06:55.113825+08:00 0 [ERROR] Plugin group_replication reported: 'The member contains transactions not present in the group. The member will now exit the group.'
2017-04-17T12:06:55.113835+08:00 0 [Note] Plugin group_replication reported: 'To force this member into the group you can use the group_replication_allow_local_disjoint_gtids_join option'
2017-04-17T12:06:55.113947+08:00 3 [Note] Plugin group_replication reported: 'Going to wait for view modification'
2017-04-17T12:06:55.114493+08:00 0 [Note] Plugin group_replication reported: 'getstart group_id 4317e324'
2017-04-17T12:07:00.054225+08:00 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate'
2017-04-17T12:07:00.056324+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-04-17T12:07:00.056349+08:00 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit'
2017-04-17T12:07:00.057272+08:00 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-04-17T12:07:00.057288+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-04-17T12:07:05.069548+08:00 3 [Note] Plugin group_replication reported: 'auto_increment_increment is reset to 1'
2017-04-17T12:07:05.069644+08:00 3 [Note] Plugin group_replication reported: 'auto_increment_offset is reset to 1'
2017-04-17T12:07:05.070107+08:00 9 [Note] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
Fix:
As the log suggests, enable the group_replication_allow_local_disjoint_gtids_join option at the mysql prompt:
mysql> set global group_replication_allow_local_disjoint_gtids_join=ON;
Then start group replication again:
mysql> START GROUP_REPLICATION;
2. Cannot connect to the master; the error is:
2017-04-17T16:18:14.756191+08:00 25 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2017-04-17T16:18:14.814193+08:00 25 [ERROR] Slave I/O for channel 'group_replication_recovery': error connecting to master'repl_user@host-192-168-99-156:3306' - retry-time: 60 retries: 1, Error_code: 2005
2017-04-17T16:18:14.814219+08:00 25 [Note] Slave I/O thread for channel 'group_replication_recovery' killed while connecting to master
2017-04-17T16:18:14.814227+08:00 25 [Note] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'FIRST', position 4
2017-04-17T16:18:14.814342+08:00 19 [ERROR] Plugin group_replication reported: 'There was an error when connecting to the donor server. Check group replication recovery's connection credentials.'
Fix:
Add host mappings:
vim /etc/hosts
192.168.99.156 db1 host-192-168-99-156
192.168.99.157 db2 host-192-168-99-157
192.168.99.158 db3 host-192-168-99-158
Then restart group replication:
mysql> stop group_replication;
Query OK, 0 rows affected (8.76 sec)
mysql> start group_replication;
Query OK, 0 rows affected (2.51 sec)
3. ERROR 3092 again:
ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.
The error log shows:
2018-06-12T09:50:17.911574Z 2 [ERROR] Plugin group_replication reported: 'binlog_checksum should be NONE for Group Replication'
Fix (run on all three nodes, and don't forget to update the config file as well):
mysql> show variables like '%binlog_checksum%';
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| binlog_checksum | CRC32 |
+-----------------+-------+
1 row in set (0.00 sec)
mysql> set @@global.binlog_checksum='none';
Query OK, 0 rows affected (0.09 sec)
mysql> show variables like '%binlog_checksum%';
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| binlog_checksum | NONE |
+-----------------+-------+
1 row in set (0.00 sec)
mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (12.40 sec)
References:
https://dev.mysql.com/doc/refman/5.7/en/group-replication.html
https://blog.csdn.net/mchdba/article/details/54381854
http://www.voidcn.com/article/p-aiixfpfr-brr.html
https://www.cnblogs.com/luoahong/articles/8043035.html
https://blog.csdn.net/i_am_wangjian/article/details/80508663