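I. Introduction to MariaDB Galera Cluster Deployment
MariaDB is a fork of MySQL that is widely used in open-source projects, the popular OpenStack among them. To keep services highly available while increasing the load the system can carry, a clustered deployment is essential.
About MariaDB Galera Cluster
Galera Cluster is a free, open-source high-availability clustering solution developed by the third-party company Codership, designed so that no committed data is lost. Its official site is http://galeracluster.com/. It patches the MySQL InnoDB storage engine with wsrep (virtually synchronous replication), and Percona and MariaDB both bundle it in their own distributions.
MariaDB Galera Cluster is a synchronous multi-master cluster for MariaDB. It supports only the XtraDB/InnoDB storage engines (there is experimental support for MyISAM; see the wsrep_replicate_myisam system variable).
Main features of MariaDB Galera Cluster: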
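- Synchronous replication
- True multi-master: every node can read from and write to the database at the same time
- Automatic membership control: failed nodes are removed from the cluster automatically
- Data is copied automatically to newly joined nodes
- True parallel replication, at row level
- Clients connect to the cluster directly, and it behaves just like a standalone MySQL server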
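Advantages:
- Because it is multi-master, there is no slave lag
- No transactions are lost
- Both reads and writes can be spread across nodes
- Lower client latency
- Data is synchronized between nodes, whereas master/slave replication is asynchronous and the binlogs on different slaves may differ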
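Disadvantages:
- Adding a new node is expensive, because the full data set has to be copied
- It does not really solve write scaling: every write is applied on every node
- There are as many copies of the data as there are nodes
- Committing a transaction requires communication across nodes (in effect a distributed transaction), so writes are considerably slower than with master/slave replication; the more nodes there are, the slower writes become, and deadlocks and rollbacks become more frequent
- It is demanding on the network: if the network is unstable, two nodes can lose contact, the cluster can split-brain, and the service becomes unavailable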
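There are also some limitations:
- Only the InnoDB/XtraDB storage engines are supported. Writes to tables of any other engine, including the mysql.* tables, are not replicated. DDL statements are replicated, but a direct write such as INSERT INTO mysql.user (a MyISAM table) is not
- DELETE is not supported on tables without a primary key. Rows in such tables may be stored in a different order on different nodes, so SELECT … LIMIT … can return different result sets
- Explicit table locking is not supported: LOCK TABLES / UNLOCK TABLES, FLUSH TABLES with a table list WITH READ LOCK, and the lock functions GET_LOCK() and RELEASE_LOCK(). The global FLUSH TABLES WITH READ LOCK is supported
- The general query log cannot be written to a table; if it is enabled, it can only be written to a file
- Large transactions are not allowed: a transaction cannot exceed wsrep_max_ws_rows=131072 rows, and its write set cannot exceed wsrep_max_ws_size=1073741824 bytes (1 GB); otherwise the client receives an error
- The cluster uses optimistic concurrency control, so conflicts can surface at commit time. If two transactions on different nodes write to the same row and commit, the one that loses is rolled back and its client receives a deadlock error
- XA (distributed) transactions are not supported by Codership Galera Cluster and may be rolled back at commit
- The write throughput of the whole cluster is limited by its weakest node, so all nodes should use the same configuration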
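One common way to stay under the write-set limits listed above is to split a bulk change into many small transactions. The sketch below only generates the statements; the table name crm.audit_log, the integer primary key column id, and the helper name emit_batched_deletes are all hypothetical and chosen for illustration.

```shell
# emit_batched_deletes TABLE FIRST_ID LAST_ID BATCH_SIZE
# Prints one DELETE per id range so that, with autocommit on, each statement
# is its own small transaction (and therefore its own small write set).
# TABLE and its integer primary key "id" are assumptions for this sketch.
emit_batched_deletes() {
  table=$1
  lo=$2
  last_id=$3
  batch=$4
  while [ "$lo" -le "$last_id" ]; do
    hi=$((lo + batch - 1))
    if [ "$hi" -gt "$last_id" ]; then
      hi=$last_id
    fi
    printf 'DELETE FROM %s WHERE id BETWEEN %d AND %d;\n' "$table" "$lo" "$hi"
    lo=$((hi + 1))
  done
}

# Example: ids 1..25 in batches of 10 yield three statements.
emit_batched_deletes crm.audit_log 1 25 10
```

The generated statements could be piped into the mysql client. A batch size well below wsrep_max_ws_rows leaves headroom, and smaller batches also reduce the chance of certification conflicts with writes on other nodes.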
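Technology:
Galera Cluster replication is certification-based. It works as follows:
When a client issues a COMMIT, and before the transaction is actually committed, all changes the transaction made to the database are collected into a write set, and the write set is sent to the other nodes.
On every node the write set then goes through a deterministic certification test that uses the primary keys collected in it. The result decides whether the node applies the write set. If the test fails, the node discards the write set; if it passes, the transaction commits. The workflow is shown in the figure below:
[Figure: certification-based replication workflow]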
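The certification test can be pictured with a toy model. This is a deliberately simplified sketch, not Galera's implementation: it assumes each write set carries the primary keys it touched plus the global sequence number (seqno) its transaction last saw, and it rejects a write set if any of those keys was written by a transaction certified later than that seqno (first committer wins). All names here are made up for illustration.

```shell
# Toy certification index: one "key seqno" pair per line, recording the
# seqno of the last certified write to each key.
certified_writes=""
global_seqno=0
verdict=""

# last_seqno_for KEY: seqno of the last certified write to KEY (0 if none).
last_seqno_for() {
  printf '%s\n' "$certified_writes" |
    awk -v k="$1" '$1 == k { s = $2 } END { print s + 0 }'
}

# certify SEEN_SEQNO KEY...
# Sets $verdict to "commit" or "rollback". SEEN_SEQNO is the last seqno the
# transaction had observed when it ran; keys must not contain spaces.
certify() {
  seen_seqno=$1
  shift
  for key in "$@"; do
    if [ "$(last_seqno_for "$key")" -gt "$seen_seqno" ]; then
      verdict=rollback   # a later-certified write already touched this key
      return 0
    fi
  done
  global_seqno=$((global_seqno + 1))
  for key in "$@"; do
    certified_writes="$certified_writes
$key $global_seqno"
  done
  verdict=commit
}

certify 0 "innodb_tbl:1"; echo "first writer:  $verdict"
certify 0 "innodb_tbl:1"; echo "second writer: $verdict"
```

Because every node runs the same test on the same write sets in the same order, every node reaches the same verdict without a second round of messages, which is the point of certification-based replication.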
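A new node joins the cluster as follows:
The node that joins is called the joiner, and the node that supplies it with data is called the donor. First, the joiner's seqno (transaction sequence number), recorded in its local grastate.dat file, is checked against the galera.cache file on the donor. If the seqno is still there, an Incremental State Transfer (IST) is performed and only the missing transactions are sent; if not, a full State Snapshot Transfer (SST) is performed. SST offers three full-copy methods: mysqldump, rsync and xtrabackup. The method is selected with the wsrep_sst_method parameter.
Note:
SST is a full copy of the data from the donor to the joiner. It is typically used when a new node joins: to get in sync with the cluster, the new node has to copy the data from a node that is already a member. PXC (Percona XtraDB Cluster) provides the same three SST methods: mysqldump, rsync and xtrabackup.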
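The IST-or-SST decision can be sketched as a small comparison. This is a simplification under stated assumptions: it looks only at the joiner's saved seqno and at the oldest seqno the donor still caches (reported by the wsrep_local_cached_downto status variable), and it ignores the cluster UUID check that a real node also performs. The function names are hypothetical.

```shell
# saved_seqno FILE: print the seqno recorded in a grastate.dat file.
saved_seqno() {
  awk '$1 == "seqno:" { print $2 }' "$1"
}

# choose_state_transfer JOINER_SEQNO DONOR_CACHED_DOWNTO
# Prints IST when the donor's cache still holds every write set after the
# joiner's position, otherwise SST. A seqno of -1 means "position unknown",
# which is what grastate.dat contains after an unclean shutdown.
choose_state_transfer() {
  if [ "$1" -ge 0 ] && [ $(($1 + 1)) -ge "$2" ]; then
    echo IST
  else
    echo SST
  fi
}

# Example: the joiner stopped at seqno 9 and the donor caches from seqno 4 on.
choose_state_transfer 9 4
```

This also shows why the gcache.size setting matters: the larger the donor's cache, the longer a node can stay offline and still rejoin with IST instead of a full copy.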
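XtraBackup is the recommended method. Some additional background on XtraBackup:
In XtraBackup 2.1.x, a backup taken with innobackupex proceeds as follows:
1. Back up the InnoDB table data
2. Take the global read lock with FLUSH TABLES WITH READ LOCK
3. Copy the .frm files and the MyISAM table data
4. Record the current binlog file name and position
5. Finish the background copy of the redo log
6. Release the lock with UNLOCK TABLES
As the steps show, the lock is held while the .frm files and MyISAM data are copied, so backing up several large MyISAM tables keeps the tables locked for the duration of that copy.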
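II. Preparing the Environment
Environment: a MariaDB cluster needs at least three servers (running with only two requires special configuration; see the official documentation).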
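The three-server minimum follows from the quorum rule: after a network split, a partition stays primary only if it can still reach more than half of the previous members. A minimal sketch of that arithmetic, assuming every node has the default weight of 1 (the helper name has_quorum is made up for this example):

```shell
# has_quorum PREVIOUS_SIZE REACHABLE
# Prints "yes" when REACHABLE nodes (counting the local node) are a strict
# majority of the PREVIOUS_SIZE nodes that formed the last primary component.
# Assumes equal node weights.
has_quorum() {
  if [ $(($2 * 2)) -gt "$1" ]; then
    echo yes
  else
    echo no
  fi
}

echo "3 nodes, 1 lost: $(has_quorum 3 2)"
echo "2 nodes, 1 lost: $(has_quorum 2 1)"
```

With two nodes, losing the link leaves each side with exactly half, so neither side keeps quorum; with three nodes, the side that still has two members carries on.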
1. Hardware planning
2. OS version
[root@mariadb-node1 ~]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
3. Disable SELinux
[root@mariadb-node1 ~]# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
[root@mariadb-node1 ~]# setenforce 0
setenforce: SELinux is disabled
4. Disable the firewall
[root@mariadb-node1 ~]# systemctl stop firewalld.service
[root@mariadb-node1 ~]# systemctl disable firewalld.service
5. Configure hostname resolution
[root@mariadb-node1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.120 mariadb-node1
192.168.1.121 mariadb-node2
192.168.1.122 mariadb-node3
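6. Raise the file descriptor limit
vi /etc/security/limits.conf
* soft nofile 65536
* hard nofile 65536
vi /etc/sysctl.conf
fs.file-max=655350
net.ipv4.ip_local_port_range = 1025 65000
net.ipv4.tcp_tw_recycle = 1
Note that net.ipv4.tcp_tw_recycle is known to drop connections from clients behind NAT and was removed in Linux 4.12, so leave it out if clients reach the database through NAT or the kernel is newer.
Finally, apply the settings:
# sysctl -p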
Install the Percona XtraBackup hot-backup tool
Download URL:
tar -zxvf percona-xtrabackup-2.4.6-Linux-x86_64.tar.gz
cd percona-xtrabackup-2.4.6-Linux-x86_64/bin/
cp -a * /usr/bin/
Create the user name and password that XtraBackup uses for backups:
MariaDB [(none)]> grant all on *.* to 'galera'@'localhost' identified by '123456';
7. Configure the MariaDB repository
Note: as of the MariaDB 10.1 series (version 10.1.20 is used here), Galera Cluster support is included in the standard MariaDB server, so the separate MariaDB Galera Server build used with earlier series is no longer required.
Deploy MariaDB Galera Cluster with YUM.
# Configure the MariaDB repository on all three machines
# The openstack-newton repository is used here; it includes MariaDB
yum install centos-release-openstack-newton -y
# Check that the openstack-newton repository is present
[root@mariadb-node1 ~]# cd /etc/yum.repos.d/
[root@mariadb-node1 yum.repos.d]# ll
total 52
-rw-r--r--. 1 root root 2573 Nov 21 2014 CentOS-Base.repo    # Aliyun yum mirror
-rw-r--r--. 1 root root 1664 Dec 9 2015 CentOS-Base.repo.backup
-rw-r--r-- 1 root root 1056 Sep 6 2016 CentOS-Ceph-Jewel.repo
-rw-r--r--. 1 root root 1309 Dec 9 2015 CentOS-CR.repo
-rw-r--r--. 1 root root 649 Dec 9 2015 CentOS-Debuginfo.repo
-rw-r--r--. 1 root root 290 Dec 9 2015 CentOS-fasttrack.repo
-rw-r--r--. 1 root root 630 Dec 9 2015 CentOS-Media.repo
-rw-r--r-- 1 root root 1113 Jun 23 2017 CentOS-OpenStack-newton.repo
-rw-r--r-- 1 root root 509 Sep 12 22:11 CentOS-QEMU-EV.repo
-rw-r--r--. 1 root root 1331 Dec 9 2015 CentOS-Sources.repo
-rw-r--r--. 1 root root 1952 Dec 9 2015 CentOS-Vault.repo
-rw-r--r--. 1 root root 951 Oct 3 01:44 epel.repo    # Aliyun EPEL mirror
-rw-r--r--. 1 root root 1050 Oct 3 01:44 epel-testing.repo
# Refresh the cache
yum clean all
yum makecache
III. Installing MariaDB Galera Cluster (note: run on all three machines, with yum pointed at the Aliyun mirrors)
yum install mariadb mariadb-galera-server mariadb-galera-common galera rsync -y
# Configure MariaDB
Next, configure MariaDB Galera Cluster by editing /etc/my.cnf.d/server.cnf on every node of the cluster. The contents for each node are as follows:
1. /etc/my.cnf.d/server.cnf on node 192.168.1.120:
[root@mariadb-node1 ~]# cat /etc/my.cnf.d/server.cnf
[server]
[mysqld]
server_id=129
datadir=/app/galera
user=mysql
skip-external-locking
skip-name-resolve
character-set-server=utf8
[galera]
wsrep_causal_reads=ON
wsrep_provider_options="gcache.size=4G"
wsrep_certify_nonPK=ON
query_cache_size=0
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name=MariaDB-Galera-Cluster
wsrep_cluster_address="gcomm://192.168.1.120,192.168.1.121,192.168.1.122"
wsrep_node_name=mariadb-a04
wsrep_node_address=192.168.1.120
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_slave_threads=8
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=2G
wsrep_sst_method=rsync
[embedded]
[mariadb]
[mariadb-10.1]
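# The configuration above synchronizes data with rsync. To use xtrabackup instead (recommended), set:
wsrep_sst_auth=galera:123456
# rsync, the default, takes a full copy but needs a global read lock (FLUSH TABLES WITH READ LOCK) on the donor. The xtrabackup hot backup is recommended because it locks tables only while backing up the .frm table definition files.
wsrep_sst_method=xtrabackup-v2
2. /etc/my.cnf.d/server.cnf on node 192.168.1.121: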
[root@mariadb-node2 ~]# vi /etc/my.cnf.d/server.cnf
[server]
[mysqld]
server_id=129
datadir=/app/galera
user=mysql
skip-external-locking
skip-name-resolve
character-set-server=utf8
[galera]
wsrep_causal_reads=ON
wsrep_provider_options="gcache.size=4G"
wsrep_certify_nonPK=ON
query_cache_size=0
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name=MariaDB-Galera-Cluster
wsrep_cluster_address="gcomm://192.168.1.120,192.168.1.121,192.168.1.122"
wsrep_node_name=mariadb-node2
wsrep_node_address=192.168.1.121
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_slave_threads=8
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=2G
wsrep_sst_method=rsync
[embedded]
[mariadb]
[mariadb-10.1]
3. /etc/my.cnf.d/server.cnf on node 192.168.1.122:
[root@mariadb-node3 ~]# vi /etc/my.cnf.d/server.cnf
[server]
[mysqld]
server_id=130
datadir=/app/galera
user=mysql
skip-external-locking
skip-name-resolve
character-set-server=utf8
[galera]
wsrep_causal_reads=ON
wsrep_provider_options="gcache.size=4G"
wsrep_certify_nonPK=ON
query_cache_size=0
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name=MariaDB-Galera-Cluster
wsrep_cluster_address="gcomm://192.168.1.120,192.168.1.121,192.168.1.122"
wsrep_node_name=mariadb-a05
wsrep_node_address=192.168.1.122
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_slave_threads=8
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=2G
wsrep_sst_method=rsync
[embedded]
[mariadb]
[mariadb-10.1]
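The three files differ only in a few settings, and the wsrep ones that must be unique per node are wsrep_node_name and wsrep_node_address. A small sketch of generating those lines instead of editing them by hand; the helper name render_node_settings and the CLUSTER_NODES variable are assumptions made for this example, not part of MariaDB.

```shell
# Comma-separated member list shared by every node (taken from the files above).
CLUSTER_NODES="192.168.1.120,192.168.1.121,192.168.1.122"

# render_node_settings NODE_NAME NODE_ADDRESS
# Prints the [galera] lines that identify one node. The rest of server.cnf
# can be copied to every node unchanged.
render_node_settings() {
  cat <<EOF
wsrep_cluster_address="gcomm://${CLUSTER_NODES}"
wsrep_node_name=$1
wsrep_node_address=$2
EOF
}

# Example: the node-specific lines for the third node.
render_node_settings mariadb-a05 192.168.1.122
```

Generating the per-node lines from one list makes it harder to end up with two nodes sharing a name or pointing at the wrong address.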
4. Initialize MariaDB on the first node (192.168.1.120):
mysql_install_db --defaults-file=/etc/my.cnf.d/server.cnf --user=mysql
5. Bootstrap the cluster on node 192.168.1.120 (the very first start must use --wsrep-new-cluster; later restarts do not need it)
mysqld_safe --defaults-file=/etc/my.cnf.d/server.cnf --user=mysql --wsrep-new-cluster &
6. Initialize MariaDB, then set the root password and security options (on node 192.168.1.120)
[root@mariadb-node1 ~]# mysql_install_db --defaults-file=/etc/my.cnf.d/server.cnf --user=mysql
Installing MariaDB/MySQL system tables in '/app/galera' ...
2017-12-24 3:42:27 139950307854528 [Note] /usr/libexec/mysqld (mysqld 10.1.20-MariaDB) starting as process 22527 ...
2017-12-24 3:42:27 139950307854528 [Note] WSREP: Read nil XID from storage engines, skipping position init
2017-12-24 3:42:27 139950307854528 [Note] WSREP: wsrep_load(): loading provider library 'none'
2017-12-24 3:42:28 139950307854528 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2017-12-24 3:42:28 139950307854528 [Note] InnoDB: The InnoDB memory heap is disabled
2017-12-24 3:42:28 139950307854528 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-12-24 3:42:28 139950307854528 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2017-12-24 3:42:28 139950307854528 [Note] InnoDB: Compressed tables use zlib 1.2.7
2017-12-24 3:42:28 139950307854528 [Note] InnoDB: Using Linux native AIO
2017-12-24 3:42:28 139950307854528 [Note] InnoDB: Using SSE crc32 instructions
2017-12-24 3:42:28 139950307854528 [Note] InnoDB: Initializing buffer pool, size = 2.0G
2017-12-24 3:42:29 139950307854528 [Note] InnoDB: Completed initialization of buffer pool
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: The first specified data file ./ibdata1 did not exist: a new database to be created!
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Setting file ./ibdata1 size to 12 MB
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Database physically writes the file full: wait...
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Setting log file ./ib_logfile101 size to 48 MB
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Setting log file ./ib_logfile1 size to 48 MB
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
2017-12-24 3:42:30 139950307854528 [Warning] InnoDB: New log files created, LSN=45883
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Doublewrite buffer not found: creating new
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Doublewrite buffer created
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: 128 rollback segment(s) are active.
2017-12-24 3:42:30 139950307854528 [Warning] InnoDB: Creating foreign key constraint system tables.
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Foreign key constraint system tables created
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Creating tablespace and datafile system tables.
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Tablespace and datafile system tables created.
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Creating zip_dict and zip_dict_cols system tables.
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: zip_dict and zip_dict_cols system tables created.
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Waiting for purge to start
2017-12-24 3:42:30 139950307854528 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.34-79.1 started; log sequence number 0
2017-12-24 3:42:30 139947298572032 [Note] InnoDB: Dumping buffer pool(s) not yet started
OK
Filling help tables...
2017-12-24 3:42:34 139725903902912 [Note] /usr/libexec/mysqld (mysqld 10.1.20-MariaDB) starting as process 22557 ...
2017-12-24 3:42:34 139725903902912 [Note] WSREP: Read nil XID from storage engines, skipping position init
2017-12-24 3:42:34 139725903902912 [Note] WSREP: wsrep_load(): loading provider library 'none'
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: The InnoDB memory heap is disabled
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: Compressed tables use zlib 1.2.7
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: Using Linux native AIO
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: Using SSE crc32 instructions
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: Initializing buffer pool, size = 2.0G
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: Completed initialization of buffer pool
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: Highest supported file format is Barracuda.
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: 128 rollback segment(s) are active.
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: Waiting for purge to start
2017-12-24 3:42:34 139725903902912 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.34-79.1 started; log sequence number 1622818
2017-12-24 3:42:34 139722923599616 [Note] InnoDB: Dumping buffer pool(s) not yet started
OK
Creating OpenGIS required SP-s...
2017-12-24 3:42:38 139838952904896 [Note] /usr/libexec/mysqld (mysqld 10.1.20-MariaDB) starting as process 22587 ...
2017-12-24 3:42:38 139838952904896 [Note] WSREP: Read nil XID from storage engines, skipping position init
2017-12-24 3:42:38 139838952904896 [Note] WSREP: wsrep_load(): loading provider library 'none'
2017-12-24 3:42:38 139838952904896 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2017-12-24 3:42:38 139838952904896 [Note] InnoDB: The InnoDB memory heap is disabled
2017-12-24 3:42:38 139838952904896 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-12-24 3:42:38 139838952904896 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2017-12-24 3:42:38 139838952904896 [Note] InnoDB: Compressed tables use zlib 1.2.7
2017-12-24 3:42:38 139838952904896 [Note] InnoDB: Using Linux native AIO
2017-12-24 3:42:38 139838952904896 [Note] InnoDB: Using SSE crc32 instructions
2017-12-24 3:42:38 139838952904896 [Note] InnoDB: Initializing buffer pool, size = 2.0G
2017-12-24 3:42:39 139838952904896 [Note] InnoDB: Completed initialization of buffer pool
2017-12-24 3:42:39 139838952904896 [Note] InnoDB: Highest supported file format is Barracuda.
2017-12-24 3:42:39 139838952904896 [Note] InnoDB: 128 rollback segment(s) are active.
2017-12-24 3:42:39 139838952904896 [Note] InnoDB: Waiting for purge to start
2017-12-24 3:42:39 139838952904896 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.34-79.1 started; log sequence number 1622828
2017-12-24 3:42:39 139835976857344 [Note] InnoDB: Dumping buffer pool(s) not yet started
OK
To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system
PLEASE REMEMBER TO SET A PASSWORD FOR THE MariaDB root USER !
To do so, start the server, then issue the following commands:
'/usr/bin/mysqladmin' -u root password 'new-password'
'/usr/bin/mysqladmin' -u root -h 192.168.1.120 password 'new-password'
Alternatively you can run:
'/usr/bin/mysql_secure_installation'
which will also give you the option of removing the test
databases and anonymous user created by default. This is
strongly recommended for production servers.
See the MariaDB Knowledgebase at http://mariadb.com/kb or the
MySQL manual for more instructions.
You can start the MariaDB daemon with:
cd '/usr' ; /usr/bin/mysqld_safe --datadir='/app/galera'
You can test the MariaDB daemon with mysql-test-run.pl
cd '/usr/mysql-test' ; perl mysql-test-run.pl
Please report any problems at http://mariadb.org/jira
The latest information about MariaDB is available at http://mariadb.org/.
You can find additional information about the MySQL part at:
http://dev.mysql.com
Consider joining MariaDB's strong and vibrant community:
https://mariadb.org/get-involved/
# Start MariaDB on node 192.168.1.120
[root@mariadb-node1 ~]# mysqld_safe --defaults-file=/etc/my.cnf.d/server.cnf --user=mysql --wsrep-new-cluster &
[1] 22614
[root@mariadb-node1 ~]# 171224 03:43:26 mysqld_safe Logging to '/app/galera/mariadb-node1.err'.
171224 03:43:26 mysqld_safe Starting mysqld daemon with databases from /app/galera
171224 03:43:26 mysqld_safe WSREP: Running position recovery with --log_error='/app/galera/wsrep_recovery.vISRp0' --pid-file='/app/galera/mariadb-node1-recover.pid'
171224 03:43:29 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
[root@mariadb-node1 ~]#
# Check that mysqld is running
[root@mariadb-node1 ~]# netstat -lntup
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1080/sshd
tcp 0 0 0.0.0.0:4567 0.0.0.0:* LISTEN 22806/mysqld
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1179/master
tcp6 0 0 :::3306 :::* LISTEN 22806/mysqld
tcp6 0 0 :::22 :::* LISTEN 1080/sshd
tcp6 0 0 ::1:25 :::* LISTEN 1179/master
7. Initialize MariaDB on the other two nodes (192.168.1.121 and 192.168.1.122)
mysql_install_db --defaults-file=/etc/my.cnf.d/server.cnf --user=mysql
8. Start MariaDB (on 192.168.1.121 and 192.168.1.122)
mysqld_safe --defaults-file=/etc/my.cnf.d/server.cnf --user=mysql &
IV. Verification (note: same steps on all three nodes)
1. Check the cluster nodes
[root@mariadb-node1 ~]# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 12
Server version: 10.1.20-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
+--------------------+
3 rows in set (0.00 sec)
# Check how many nodes are in the cluster
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
1 row in set (0.00 sec)
# Check the cluster status
MariaDB [(none)]> show global status like 'ws%';
+------------------------------+----------------------------------------------------------+
| Variable_name | Value |
+------------------------------+----------------------------------------------------------+
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 0.000000 |
| wsrep_causal_reads | 8 |
| wsrep_cert_deps_distance | 0.000000 |
| wsrep_cert_index_size | 0 |
| wsrep_cert_interval | 0.000000 |
| wsrep_cluster_conf_id | 2 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | 8d1b5d98-e819-11e7-96c9-9a239fd041bf |
| wsrep_cluster_status | Primary |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 0.000000 |
| wsrep_connected | ON |
| wsrep_desync_count | 0 |
| wsrep_evs_delayed | |
| wsrep_evs_evict_list | |
| wsrep_evs_repl_latency | 0/0/0/0/0 |
| wsrep_evs_state | OPERATIONAL |
| wsrep_flow_control_paused | 0.000000 |
| wsrep_flow_control_paused_ns | 0 |
| wsrep_flow_control_recv | 0 |
| wsrep_flow_control_sent | 0 |
| wsrep_gcomm_uuid | 8d1a9b25-e819-11e7-8d07-62fb3ba88975 |
| wsrep_incoming_addresses | 192.168.1.122:3306,192.168.1.121:3306,192.168.1.120:3306 |
| wsrep_last_committed | 3 |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_cached_downto | 18446744073709551615 |
| wsrep_local_cert_failures | 0 |
| wsrep_local_commits | 0 |
| wsrep_local_index | 2 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_avg | 0.375000 |
| wsrep_local_recv_queue_max | 3 |
| wsrep_local_recv_queue_min | 0 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_avg | 0.000000 |
| wsrep_local_send_queue_max | 1 |
| wsrep_local_send_queue_min | 0 |
| wsrep_local_state | 4 |
| wsrep_local_state_comment | Synced |
| wsrep_local_state_uuid | 8d1b5d98-e819-11e7-96c9-9a239fd041bf |
| wsrep_protocol_version | 7 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <info@codership.com> |
| wsrep_provider_version | 3.16(r5c765eb) |
| wsrep_ready | ON |
| wsrep_received | 8 |
| wsrep_received_bytes | 861 |
| wsrep_repl_data_bytes | 0 |
| wsrep_repl_keys | 0 |
| wsrep_repl_keys_bytes | 0 |
| wsrep_repl_other_bytes | 0 |
| wsrep_replicated | 0 |
| wsrep_replicated_bytes | 0 |
| wsrep_thread_count | 9 |
+------------------------------+----------------------------------------------------------+
58 rows in set (0.00 sec)
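The cluster is working normally.
Notes:
wsrep_cluster_status is Primary: the node belongs to the primary component (the part of the cluster that holds quorum) and serves reads and writes normally.
wsrep_ready is ON: the node is ready to accept queries.
wsrep_cluster_size is 3: the cluster has three nodes.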
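The same checks can be scripted. The sketch below is one possible health check, not an official tool: it reads tab-separated name/value pairs such as the mysql client prints in batch mode, and reports healthy only when the values called out above (plus the Synced state from the status listing) look right. The function name check_wsrep_status is made up for this example.

```shell
# check_wsrep_status EXPECTED_SIZE
# Reads "name<TAB>value" lines on stdin and prints "healthy" only when the
# node is in the primary component, ready, synced, and sees EXPECTED_SIZE nodes.
check_wsrep_status() {
  awk -v want="$1" '
    $1 == "wsrep_cluster_status"      { status = $2 }
    $1 == "wsrep_ready"               { ready = $2 }
    $1 == "wsrep_cluster_size"        { size = $2 }
    $1 == "wsrep_local_state_comment" { state = $2 }
    END {
      if (status == "Primary" && ready == "ON" && size == want && state == "Synced")
        print "healthy"
      else
        printf "unhealthy: status=%s ready=%s size=%s state=%s\n", status, ready, size, state
    }'
}

# Possible use against a live node (-N drops the header row, -B prints tab-separated rows):
#   mysql -uroot -p -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_%'" | check_wsrep_status 3
```

Run on every node, a check like this catches the case where one node has dropped to non-Primary while the others still report a healthy cluster.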
2. Test with a MyISAM table (on 192.168.1.120)
[root@mariadb-node1 ~]# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 13
Server version: 10.1.20-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
# Create the database
MariaDB [(none)]> create database crm character set=utf8;
Query OK, 1 row affected (0.00 sec)
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| crm |
| information_schema |
| mysql |
| performance_schema |
+--------------------+
4 rows in set (0.00 sec)
MariaDB [(none)]> use crm
Database changed
# Create a MyISAM table
MariaDB [crm]> create table myisam_tbl (id int,name text) ENGINE MyISAM;
Query OK, 0 rows affected (0.00 sec)
MariaDB [crm]> insert into myisam_tbl values(1,'jojo');
Query OK, 1 row affected (0.00 sec)
MariaDB [crm]> insert into myisam_tbl values(1,'nulige');
Query OK, 1 row affected (0.00 sec)
MariaDB [crm]> show tables;
+---------------+
| Tables_in_crm |
+---------------+
| myisam_tbl |
+---------------+
# View the table contents (on 192.168.1.120)
MariaDB [crm]> select * from myisam_tbl;
+------+--------+
| id | name |
+------+--------+
| 1 | jojo |
| 1 | nulige |
+------+--------+
2 rows in set (0.00 sec)
# View the database on the other nodes (192.168.1.121 and 192.168.1.122)
[root@mariadb-node2 ~]# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 14
Server version: 10.1.20-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| crm |
| information_schema |
| mysql |
| performance_schema |
+--------------------+
4 rows in set (0.00 sec)
MariaDB [(none)]> use crm;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MariaDB [crm]> show tables;
+---------------+
| Tables_in_crm |
+---------------+
| myisam_tbl |
+---------------+
1 row in set (0.00 sec)
# View the table contents: the rows were not replicated
MariaDB [crm]> select * from myisam_tbl;
Empty set (0.00 sec)
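Note: as shown, Galera does not replicate the data in MyISAM tables. The table itself exists on the other nodes because DDL is replicated, but its rows are not. Galera supports only the XtraDB/InnoDB storage engines (there is experimental MyISAM support; see the wsrep_replicate_myisam system variable).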
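Since only InnoDB/XtraDB tables replicate, one practical follow-up is to find any remaining MyISAM tables and convert them. The sketch below only prints the ALTER TABLE statements for review; the helper name emit_engine_conversions is hypothetical, and the query in the comment is one way to produce its input.

```shell
# emit_engine_conversions
# Reads "schema<TAB>table" lines on stdin and prints one ALTER TABLE statement
# per table. Nothing is executed; the output is meant to be reviewed first.
emit_engine_conversions() {
  awk -F '\t' 'NF == 2 { printf "ALTER TABLE `%s`.`%s` ENGINE=InnoDB;\n", $1, $2 }'
}

# Possible source of the input (system schemas are left alone):
#   mysql -uroot -p -N -B -e "SELECT table_schema, table_name
#     FROM information_schema.tables
#     WHERE engine = 'MyISAM'
#       AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')"

printf 'crm\tmyisam_tbl\n' | emit_engine_conversions
```

ALTER TABLE is DDL, so running a generated statement on one node converts the table definition cluster-wide. Rows that previously existed on only one node, like the two inserted above, are not copied by the conversion and would need to be reloaded.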
3. Verify an InnoDB table
[root@mariadb-node1 ~]# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 14
Server version: 10.1.20-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
# Create the database
MariaDB [(none)]> create database kuaiwei character set=utf8;
Query OK, 1 row affected (0.01 sec)
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| crm |
| information_schema |
| kuaiwei |
| mysql |
| performance_schema |
+--------------------+
5 rows in set (0.00 sec)
MariaDB [(none)]> use kuaiwei
Database changed
# Create the table
MariaDB [kuaiwei]> create table innodb_tbl(id int,name text) ENGINE InnoDB;
Query OK, 0 rows affected (0.01 sec)
MariaDB [kuaiwei]> insert into innodb_tbl values(1,'jojo');
Query OK, 1 row affected (0.00 sec)
MariaDB [kuaiwei]> insert into innodb_tbl values(1,'nulige');
Query OK, 1 row affected (0.01 sec)
MariaDB [kuaiwei]> show tables;
+-------------------+
| Tables_in_kuaiwei |
+-------------------+
| innodb_tbl |
+-------------------+
1 row in set (0.00 sec)
MariaDB [kuaiwei]> select * from innodb_tbl;
+------+--------+
| id | name |
+------+--------+
| 1 | jojo |
| 1 | nulige |
+------+--------+
2 rows in set (0.01 sec)
MariaDB [kuaiwei]> exit
Bye
# Log in to the other nodes and check the database and the table contents (192.168.1.121 and 192.168.1.122)
[root@mariadb-node2 ~]# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 15
Server version: 10.1.20-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| crm |
| information_schema |
| kuaiwei |
| mysql |
| performance_schema |
+--------------------+
5 rows in set (0.00 sec)
MariaDB [(none)]> use kuaiwei;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MariaDB [kuaiwei]> show tables;
+-------------------+
| Tables_in_kuaiwei |
+-------------------+
| innodb_tbl |
+-------------------+
1 row in set (0.00 sec)
MariaDB [kuaiwei]> select * from innodb_tbl;
+------+--------+
| id | name |
+------+--------+
| 1 | jojo |
| 1 | nulige |
+------+--------+
2 rows in set (0.00 sec)
MariaDB [kuaiwei]> exit
Bye
4. Simulate a failure:
Stop MariaDB on server 192.168.1.120
[root@mariadb-node1 ~]# mysqladmin -uroot -p "shutdown"
Enter password:    # enter the database password
# Check whether the database is still running
[root@mariadb-node1 ~]# netstat -lntup
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1080/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1179/master
tcp6 0 0 :::22 :::* LISTEN 1080/sshd
tcp6 0 0 ::1:25 :::* LISTEN 1179/master
Then run the following on the other nodes (192.168.1.121 and 192.168.1.122):
[root@mariadb-node2 ~]# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 16
Server version: 10.1.20-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show global status like 'wsrep%';
+------------------------------+---------------------------------------+
| Variable_name | Value |
+------------------------------+---------------------------------------+
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 1.000000 |
| wsrep_causal_reads | 24 |
| wsrep_cert_deps_distance | 1.166667 |
| wsrep_cert_index_size | 6 |
| wsrep_cert_interval | 0.000000 |
| wsrep_cluster_conf_id | 3 |
| wsrep_cluster_size | 2 |
| wsrep_cluster_state_uuid | 8d1b5d98-e819-11e7-96c9-9a239fd041bf |
| wsrep_cluster_status | Primary |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 1.000000 |
| wsrep_connected | ON |
| wsrep_desync_count | 0 |
| wsrep_evs_delayed | |
| wsrep_evs_evict_list | |
| wsrep_evs_repl_latency | 0/0/0/0/0 |
| wsrep_evs_state | OPERATIONAL |
| wsrep_flow_control_paused | 0.000000 |
| wsrep_flow_control_paused_ns | 0 |
| wsrep_flow_control_recv | 0 |
| wsrep_flow_control_sent | 0 |
| wsrep_gcomm_uuid | 71aa0335-e81b-11e7-a8bc-dbee4f046e45 |
| wsrep_incoming_addresses | 192.168.1.122:3306,192.168.1.121:3306 |
| wsrep_last_committed | 9 |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_cached_downto | 4 |
| wsrep_local_cert_failures | 0 |
| wsrep_local_commits | 0 |
| wsrep_local_index | 1 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_avg | 0.000000 |
| wsrep_local_recv_queue_max | 1 |
| wsrep_local_recv_queue_min | 0 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_avg | 0.000000 |
| wsrep_local_send_queue_max | 1 |
| wsrep_local_send_queue_min | 0 |
| wsrep_local_state | 4 |
| wsrep_local_state_comment | Synced |
| wsrep_local_state_uuid | 8d1b5d98-e819-11e7-96c9-9a239fd041bf |
| wsrep_protocol_version | 7 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <info@codership.com> |
| wsrep_provider_version | 3.16(r5c765eb) |
| wsrep_ready | ON |
| wsrep_received | 10 |
| wsrep_received_bytes | 2894 |
| wsrep_repl_data_bytes | 0 |
| wsrep_repl_keys | 0 |
| wsrep_repl_keys_bytes | 0 |
| wsrep_repl_other_bytes | 0 |
| wsrep_replicated | 0 |
| wsrep_replicated_bytes | 0 |
| wsrep_thread_count | 9 |
+------------------------------+---------------------------------------+
58 rows in set (0.00 sec)
MariaDB [(none)]> exit
Bye
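The cluster has automatically evicted the failed node 192.168.1.120 and continues to serve requests normally.
Finally, bring the failed node (192.168.1.120) back:
[root@mariadb-node1 ~]# mysqld_safe --defaults-file=/etc/my.cnf.d/server.cnf --user=mysql &
Check the cluster again: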
[root@mariadb-node2 ~]# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 17
Server version: 10.1.20-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
# Back to three nodes
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
1 row in set (0.00 sec)
MariaDB [(none)]> exit
Bye
5. Handling a simulated split-brain
The following simulates packet loss caused by network jitter, which makes nodes lose contact with each other and leads to a split-brain. First, run these commands on both 192.168.1.121 and 192.168.1.122:
iptables -A INPUT -p tcp --sport 4567 -j DROP
iptables -A INPUT -p tcp --dport 4567 -j DROP
These commands block the Galera (wsrep) replication traffic on port 4567.
Then check on node 192.168.1.120:
MariaDB [(none)]> show global status like 'ws%';
The following values stand out:
wsrep_cluster_size 1
wsrep_cluster_status non-Primary
wsrep_ready OFF
MariaDB [(none)]> use test_db;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use
MariaDB [(none)]> select @@wsrep_node_name;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
The cluster has now split and lost quorum, so it refuses to execute any statements.
To resolve this, run:
set global wsrep_provider_options="pc.bootstrap=true";
This command forces the node to form a new primary component, which recovers it from the split-brain.
Let's verify:
MariaDB [(none)]> select @@wsrep_node_name;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
MariaDB [(none)]> set global wsrep_provider_options="pc.bootstrap=true";
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> select @@wsrep_node_name;
+-------------------+
| @@wsrep_node_name |
+-------------------+
| mariadb-a03 |
+-------------------+
1 row in set (0.27 sec)
MariaDB [(none)]> use test_db;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MariaDB [test_db]> show tables;
+-------------------+
| Tables_in_test_db |
+-------------------+
| innodb_tbl |
| myisam_tbl |
+-------------------+
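Before forcing pc.bootstrap on a node, it is worth checking which node has the most recent data, because the bootstrapped node becomes the reference for the others. A minimal sketch of that comparison, assuming the wsrep_last_committed value has already been collected from each node by hand; the helper name pick_bootstrap_node is made up for this example.

```shell
# pick_bootstrap_node
# Reads "node last_committed" lines on stdin and prints the node with the
# highest wsrep_last_committed value. On a tie, the first node listed wins.
pick_bootstrap_node() {
  awk 'NF == 2 && (best == "" || $2 + 0 > max) { best = $1; max = $2 + 0 }
       END { if (best != "") print best }'
}

# The value for each node comes from running, on that node:
#   SHOW STATUS LIKE 'wsrep_last_committed';
printf '%s\n' \
  "192.168.1.120 9" \
  "192.168.1.121 11" \
  "192.168.1.122 10" | pick_bootstrap_node
```

Bootstrapping a node that is behind the others risks discarding the transactions that only the more advanced nodes had committed.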
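Finally, restore nodes 192.168.1.121 and 192.168.1.122 by clearing the iptables rules. Flushing the whole table is acceptable here only because this is a test environment; in production, delete just the two rules added above.
iptables -F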
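For the production case, where only the rules added for the test should go, iptables can delete a rule by repeating its specification with -D in place of -A. The sketch below just rewrites the command text so it can be reviewed before running; undo_rule is a hypothetical helper name.

```shell
# undo_rule RULE
# Turns an "iptables -A ..." command line into the matching "iptables -D ..."
# command line, which removes exactly that rule and nothing else.
undo_rule() {
  printf '%s\n' "$1" | sed 's/^iptables -A /iptables -D /'
}

undo_rule "iptables -A INPUT -p tcp --sport 4567 -j DROP"
undo_rule "iptables -A INPUT -p tcp --dport 4567 -j DROP"
```

Unlike iptables -F, this leaves any other rules in the INPUT chain in place.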
Verify on each node:
192.168.1.120:
MariaDB [test_db]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
1 row in set (0.00 sec)
192.168.1.121:
MariaDB [(none)]> select @@wsrep_node_name;
+-------------------+
| @@wsrep_node_name |
+-------------------+
| mariadb-node2 |
+-------------------+
6. Avoiding dirty (stale) reads
Galera Cluster is not fully synchronous in the strict sense: write sets are applied on the other nodes with some delay. To observe this, take a global read lock on one node with FLUSH TABLES WITH READ LOCK;
then write on another node and watch the delay.
For example, take the global read lock on node 192.168.1.122:
[root@mariadb-node3 ~]# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 13
Server version: 10.1.20-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| crm |
| information_schema |
| kuaiwei |
| mysql |
| performance_schema |
+--------------------+
5 rows in set (0.00 sec)
MariaDB [(none)]> use kuaiwei;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
# Take the global read lock
MariaDB [kuaiwei]> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)
MariaDB [kuaiwei]> select * from innodb_tbl;
+------+--------+
| id | name |
+------+--------+
| 1 | jojo |
| 1 | nulige |
+------+--------+
2 rows in set (0.00 sec)
Then insert rows on node 192.168.1.120:
[root@mariadb-node1 ~]# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 11
Server version: 10.1.20-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| crm |
| information_schema |
| kuaiwei |
| mysql |
| performance_schema |
+--------------------+
5 rows in set (0.00 sec)
MariaDB [(none)]> use kuaiwei;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MariaDB [kuaiwei]> select @@wsrep_node_name;
+-------------------+
| @@wsrep_node_name |
+-------------------+
| mariadb-a04 |
+-------------------+
1 row in set (0.00 sec)
MariaDB [kuaiwei]> insert into innodb_tbl values(2,'li men');
Query OK, 1 row affected (0.00 sec)
MariaDB [kuaiwei]> select * from innodb_tb1;
ERROR 1146 (42S02): Table 'kuaiwei.innodb_tb1' doesn't exist
MariaDB [kuaiwei]> insert into innodb_tbl values(2,'hbase');
Query OK, 1 row affected (0.00 sec)
MariaDB [kuaiwei]> select * from innodb_tbl;
+------+--------+
| id | name |
+------+--------+
| 1 | jojo |
| 1 | nulige |
| 2 | li men |
| 2 | hbase |
+------+--------+
4 rows in set (0.00 sec)
Now run a query on node 192.168.1.122:
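MariaDB [kuaiwei]> select * from innodb_tbl;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
No stale data was read here because wsrep_causal_reads=ON is set in the MariaDB configuration file: the query waits for the node to catch up with the cluster, which the read lock prevents, so it times out instead of returning old rows.
Now set wsrep_causal_reads to 0 (OFF) and look at the effect: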
MariaDB [kuaiwei]> set wsrep_causal_reads=0;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 16
Current database: kuaiwei
Query OK, 0 rows affected, 1 warning (13.43 sec)
MariaDB [kuaiwei]> select * from innodb_tbl;
+------+--------+
| id | name |
+------+--------+
| 1 | jojo |
| 1 | nulige |
| 2 | li men |
| 2 | hbase |
+------+--------+
4 rows in set (0.00 sec)
MariaDB [kuaiwei]> exit
Bye
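The warning in the session above is consistent with wsrep_causal_reads being deprecated; its replacement is the wsrep_sync_wait variable, where the value 1 requests the same wait-before-read behaviour for SELECT statements. Enabling it per session, only for the reads that need it, avoids paying the wait on every query. The sketch below only builds the statement text; causal_read is a hypothetical helper name.

```shell
# causal_read QUERY
# Prefixes QUERY with a session-level wsrep_sync_wait setting so that this
# one read waits until the node has applied everything the cluster has
# committed. Other sessions are unaffected.
causal_read() {
  printf 'SET SESSION wsrep_sync_wait = 1; %s\n' "$1"
}

causal_read "SELECT * FROM kuaiwei.innodb_tbl;"

# Possible use against a live node:
#   causal_read "SELECT * FROM kuaiwei.innodb_tbl;" | mysql -uroot -p
```

The trade-off is the one demonstrated above: a read that insists on being current has to wait whenever the node is behind.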
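To sum up the tests above:
1. Avoid large transactions in production. Galera Cluster is not recommended for highly concurrent write workloads: they trigger flow control, which can stall the entire cluster and cause a production outage. For such workloads, consider master/slave replication with read/write splitting instead.
2. Galera is worth considering when strong data consistency is required, writes are infrequent, the database is modest in size (around 50 GB), and the network is reliable.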