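In high-availability MariaDB cluster deployments, the mainstream approach is still Galera Cluster integrated with HAProxy and Pacemaker.
Galera Cluster handles data synchronization and consistency between database nodes, Pacemaker starts, stops, and monitors the MariaDB resource, and HAProxy load-balances client-facing database traffic.
The first step in deploying MariaDB Galera Cluster is installing the software. On RHEL and CentOS, deploying from Red Hat's RDO repository is recommended. Unlike a standalone MariaDB installation, a Galera deployment from that repository installs the MariaDB-Galera-server and Galera packages instead of MariaDB-server; once the RDO repository is configured, both packages appear in the yum repositories. (The walkthrough below uses the MariaDB repository instead, where MariaDB-server pulls in galera as a dependency.)
Galera is synchronous multi-master cluster software for MySQL (MariaDB and Percona are also supported). It currently supports only the InnoDB engine.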

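Main features:
Synchronous replication
True multi-master: every node can read from and write to the database at the same time
Automatic membership control: failed nodes are removed from the cluster automatically
Data is copied automatically to newly joined nodes
True parallel replication, at row level
Clients connect to the cluster directly, and it behaves just like a standalone MySQL server
Advantages:
No slave lag, because every node is a master
No lost transactions
Both read and write scalability
Lower client latency
Data is synchronized between nodes, whereas master/slave replication is asynchronous and the binlogs on different slaves may differ
Technology:
Galera's replication is implemented by the Galera library; the wsrep API was developed so that MySQL can communicate with the Galera library.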




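About MariaDB
MariaDB is a fork of MySQL, led by MySQL founder Michael Widenius and licensed under the GPL.
One reason for the fork was the risk that MySQL might be closed-sourced after Oracle acquired it; the community forked to avoid that risk.
MariaDB aims for full compatibility with MySQL, including the API and the command line, so that it can serve as a drop-in replacement.
For the storage engine, MariaDB up to 10.1 uses XtraDB in place of MySQL's InnoDB.
Solution overview
HAProxy sits in front of the MariaDB Galera Cluster
Two HAProxy hosts use keepalived to avoid a single point of failure
Three MariaDB nodes plus one garbd arbitrator node form the cluster; the arbitrator holds no data
Galera's SST uses XtraBackup from Percona (non-blocking, avoids table locks)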


Environment
Operating system: CentOS 7.4
Cluster size: 3 nodes
Hosts:
192.168.99.2 node1 selinux=disabled firewalld stopped
192.168.99.4 node2 selinux=disabled firewalld stopped
192.168.99.5 node3 selinux=disabled firewalld stopped
Disable SELinux and firewalld on every node:
setenforce 0
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
systemctl disable firewalld.service
systemctl stop firewalld.service
--------------------------------------------------------------------------------
Add the MariaDB repository directly
cat >/etc/yum.repos.d/mariadb.repo<<EOF
[mariadb]
name = MariaDB
baseurl=http://yum.mariadb.org/10.1.40/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=0
enabled=1
EOF
yum clean all
yum install MariaDB-server -y
Installation pulls in MariaDB-client and galera as dependencies:
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
MariaDB-server x86_64 10.1.40-1.el7.centos mariadb 24 M
Installing for dependencies:
MariaDB-client x86_64 10.1.40-1.el7.centos mariadb 10 M
galera x86_64 25.3.26-1.rhel7.el7.centos mariadb 8.1 M
Transaction Summary
================================================================================
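Official download page for MariaDB Galera Cluster releases:
https://downloads.mariadb.org/mariadb-galera/+releases/
Official getting-started guide for MariaDB Galera Cluster:
https://mariadb.com/kb/en/library/getting-started-with-mariadb-galera-cluster/
On an isolated internal network, consider downloading the RPM packages and installing them locally.
As shown, yum resolves and installs the dependencies automatically. After installation, each Galera node has a number of new Galera-related files and directories.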
# rpm -ql galera
/usr/bin/garb-systemd
/usr/bin/garbd
/usr/lib/systemd/system/garb.service
/usr/lib64/galera
/usr/lib64/galera/libgalera_smm.so
/usr/share/doc/galera
/usr/share/doc/galera/COPYING
/usr/share/doc/galera/LICENSE.asio
/usr/share/doc/galera/LICENSE.chromium
/usr/share/doc/galera/LICENSE.crc32c
/usr/share/doc/galera/README
/usr/share/doc/galera/README-MySQL
/usr/share/man/man8/garbd.8.gz
# rpm -qf /bin/galera_new_cluster
MariaDB-server-10.1.14-1.el7.centos.x86_64
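Run the following on node1 (once the cluster is started, the passwords set here are replicated to the other nodes)
Start the database
systemctl start mariadb
Set the root password
mysqladmin -u root password 12345678
Add the cluster authentication user (the Galera SST user; optional with the rsync SST method)
MariaDB [(none)]> GRANT ALL PRIVILEGES ON *.* TO sst@'%' IDENTIFIED BY 'sstpass123' WITH GRANT OPTION;
MariaDB [(none)]> FLUSH PRIVILEGES;
Stop the database
systemctl stop mariadb
Write the following configuration on all three test servers, adjusting the node-specific values on each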
# cat /etc/my.cnf.d/galera.cnf | grep -v '^#\|^$'
[galera]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=192.168.99.2
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name="galera_cluster"
wsrep_cluster_address="gcomm://192.168.99.2:4567,192.168.99.4:4567,192.168.99.5:4567"
wsrep_node_name=node1
wsrep_node_address=192.168.99.2
wsrep_sst_auth=sst:sstpass123
# Galera SST user:password (optional)
wsrep_sst_method=rsync
wsrep_causal_reads=ON
wsrep_slave_threads=1
wsrep_certify_nonPK=1
wsrep_max_ws_rows=131072
wsrep_max_ws_size=1073741824
wsrep_debug=0
wsrep_convert_LOCK_to_trx=0
wsrep_retry_autocommit=1
wsrep_auto_increment_control=1
wsrep_drupal_282555_workaround=0
wsrep_causal_reads=0
wsrep_notify_cmd=
[mariadb]
log-error=/var/log/mariadb/mariadb.log
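Note: if Galera was installed from the galera-4 package, set wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
Note: wsrep_causal_reads is set twice above with conflicting values (ON and 0) and is deprecated in favor of wsrep_sync_wait; keep a single setting.
Copy this file to node2 and node3, changing wsrep_node_name, wsrep_node_address, and bind-address to each node's hostname and IP. Example for node2: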
[galera]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=192.168.99.4
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name="galera_cluster"
wsrep_cluster_address="gcomm://192.168.99.2:4567,192.168.99.4:4567,192.168.99.5:4567"
wsrep_node_name=node2
wsrep_node_address=192.168.99.4
wsrep_sst_auth=sst:sstpass123
wsrep_sst_method=rsync
wsrep_causal_reads=ON
wsrep_slave_threads=1
wsrep_certify_nonPK=1
wsrep_max_ws_rows=131072
wsrep_max_ws_size=1073741824
wsrep_debug=0
wsrep_convert_LOCK_to_trx=0
wsrep_retry_autocommit=1
wsrep_auto_increment_control=1
wsrep_drupal_282555_workaround=0
wsrep_causal_reads=0
wsrep_notify_cmd=
[mariadb]
log-error=/var/log/mariadb/mariadb.log
The key settings, annotated (node1):
[galera]
query_cache_size=0 # disable the query cache
binlog_format=ROW # row-based binlog format
default_storage_engine=innodb # default storage engine
innodb_autoinc_lock_mode=2 # interleaved auto-increment lock mode
wsrep_provider=/usr/lib64/galera/libgalera_smm.so # Galera library
wsrep_cluster_address="gcomm://192.168.99.2:4567,192.168.99.4:4567,192.168.99.5:4567" # Galera cluster URL
wsrep_cluster_name='galera_cluster' # Galera cluster name
wsrep_node_address='192.168.99.2' # this node's address
wsrep_node_name='node1' # this node's hostname
wsrep_sst_method=xtrabackup-v2 # SST (state snapshot transfer) method
wsrep_sst_auth=sst:sstpass123 # Galera SST user:password (optional with rsync)
If the other nodes fail to start, try switching to rsync; xtrabackup-v2 is still the method we recommend in the end.
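Since only the node name and IP differ between nodes, the per-node file can be generated rather than edited by hand. The following is a minimal sketch, not the author's procedure: the function name render_galera_cnf and its argument order are made up for illustration, and it emits only a subset of the settings shown above.

```shell
#!/bin/sh
# Print a galera.cnf for one node.
# Arguments: node name, node IP, comma-separated list of all member IPs.
# render_galera_cnf is a hypothetical helper, not part of MariaDB or Galera.
render_galera_cnf() {
    node_name=$1
    node_ip=$2
    members=$3
    # Append the Galera replication port (4567) to every member address.
    cluster_addr=$(printf '%s\n' "$members" | sed 's/,/:4567,/g; s/$/:4567/')
    cat <<EOF
[galera]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=${node_ip}
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name="galera_cluster"
wsrep_cluster_address="gcomm://${cluster_addr}"
wsrep_node_name=${node_name}
wsrep_node_address=${node_ip}
wsrep_sst_method=rsync
EOF
}

# Example: print the file for node2.
render_galera_cnf node2 192.168.99.4 192.168.99.2,192.168.99.4,192.168.99.5
```

On the target host the output would be redirected to /etc/my.cnf.d/galera.cnf; review it before restarting the service.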

Start the MariaDB Galera Cluster service on the first node (this bootstraps a new cluster):
# /bin/galera_new_cluster
# systemctl status mariadb
● mariadb.service - MariaDB 10.3 database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2019-08-16 16:39:33 CST; 5min ago
Docs: man:mysqld(8)
https://mariadb.com/kb/en/library/systemd/
Process: 19118 ExecStartPost=/usr/libexec/mysql-check-upgrade (code=exited, status=0/SUCCESS)
Process: 19042 ExecStartPre=/usr/libexec/mysql-prepare-db-dir %n (code=exited, status=0/SUCCESS)
Process: 19017 ExecStartPre=/usr/libexec/mysql-check-socket (code=exited, status=0/SUCCESS)
Main PID: 19081 (mysqld)
Status: "Taking your SQL requests now..."
CGroup: /system.slice/mariadb.service
└─19081 /usr/libexec/mysqld --basedir=/usr --wsrep-new-cluster
Aug 16 16:39:33 45-113-32-200 systemd[1]: Starting MariaDB 10.3 database server...
Aug 16 16:39:33 45-113-32-200 mysql-prepare-db-dir[19042]: Database MariaDB is probably initialized in /var/lib/mysql already, nothing is done.
Aug 16 16:39:33 45-113-32-200 mysql-prepare-db-dir[19042]: If this is not the case, make sure the /var/lib/mysql is empty before running mysql-prepare-db-dir.
Aug 16 16:39:33 45-113-32-200 mysqld[19081]: 2019-08-16 16:39:33 0 [Warning] WSREP: option --wsrep-causal-reads is deprecated
Aug 16 16:39:33 45-113-32-200 mysqld[19081]: 2019-08-16 16:39:33 0 [Warning] WSREP: --wsrep-causal-reads=ON takes precedence over --wsrep-sync-wait=0. WSREP_SYNC_WAIT_BEFORE_READ is on
Aug 16 16:39:33 45-113-32-200 mysqld[19081]: 2019-08-16 16:39:33 0 [Warning] WSREP: --wsrep-sync-wait=1 takes precedence over --wsrep-causal-reads=OFF. WSREP_SYNC_WAIT_BEFORE_READ is on
Aug 16 16:39:33 45-113-32-200 mysqld[19081]: 2019-08-16 16:39:33 0 [Note] /usr/libexec/mysqld (mysqld 10.3.10-MariaDB) starting as process 19081 ...
Aug 16 16:39:33 45-113-32-200 mysqld[19081]: 2019-08-16 16:39:33 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 4182)
Aug 16 16:39:33 45-113-32-200 mysqld[19081]: 2019-08-16 16:39:33 0 [Warning] Changed limits: max_open_files: 1024 max_connections: 151 (was 151) table_cache: 421 (was 2000)
Aug 16 16:39:33 45-113-32-200 systemd[1]: Started MariaDB 10.3 database server.
Start the remaining two nodes with:
systemctl start mariadb
systemctl enable mariadb
Check that the cluster ports are listening (the cluster service uses 4567 and 3306):
[root@node1 ~]# netstat -tnlp | grep -e 4567 -e 3306
tcp 0 0 0.0.0.0:4567 0.0.0.0:* LISTEN 17908/mysqld
tcp 0 0 192.168.99.2:3306 0.0.0.0:* LISTEN 17908/mysqld
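Aside: the remaining MariaDB nodes fail to start
Error: Database MariaDB is not initialized, but the directory /var/lib/mysql is not empty, so initialization cannot be done.
Fix (this wipes the node's local data, so use it only on a joining node that will receive a full copy via SST):
rm -rf /var/lib/mysql
mkdir -pv /var/lib/mysql
chown mysql:mysql -R /var/lib/mysql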
After starting the database service, log in to MySQL and check wsrep_cluster_size to confirm that the cluster started successfully.
SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
1 row in set (0.001 sec)
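The same check can be scripted. The sketch below assumes the status rows arrive as tab-separated name/value pairs, which is what the mysql client prints with -N -B; the function name cluster_size_ok is illustrative.

```shell
# Read "name<TAB>value" rows on stdin and succeed only if
# wsrep_cluster_size equals the expected node count.
# cluster_size_ok is a hypothetical helper name.
cluster_size_ok() {
    expected=$1
    awk -F '\t' -v want="$expected" '
        $1 == "wsrep_cluster_size" { found = 1; size = $2 }
        END { exit !(found && size + 0 == want + 0) }
    '
}

# Usage on a live node (needs the mysql client and valid credentials):
#   mysql -N -B -e "SHOW STATUS LIKE 'wsrep_cluster_size'" | cluster_size_ok 3 \
#       && echo "all 3 nodes joined"
```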
Show the full Galera status:
show status like '%wsrep%';
+------------------------------+-------------------------------------------------------+
| Variable_name | Value |
+------------------------------+-------------------------------------------------------+
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 0.000000 |
| wsrep_causal_reads | 1 |
| wsrep_cert_deps_distance | 0.000000 |
| wsrep_cert_index_size | 0 |
| wsrep_cert_interval | 0.000000 |
| wsrep_cluster_conf_id | 3 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | 6c8283d0-bff8-11e9-ad9e-8a56edae0f44 |
| wsrep_cluster_status | Primary |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 0.000000 |
| wsrep_connected | ON |
| wsrep_desync_count | 0 |
| wsrep_evs_delayed | |
| wsrep_evs_evict_list | |
| wsrep_evs_repl_latency | 0/0/0/0/0 |
| wsrep_evs_state | OPERATIONAL |
| wsrep_flow_control_paused | 0.000000 |
| wsrep_flow_control_paused_ns | 0 |
| wsrep_flow_control_recv | 0 |
| wsrep_flow_control_sent | 0 |
| wsrep_gcomm_uuid | 5efbadba-c001-11e9-9b8e-ab70e12329ca |
| wsrep_incoming_addresses | 192.168.99.2:3306,192.168.99.5:3306,192.168.99.4:3306 |
| wsrep_last_committed | 0 |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_cached_downto | 18446744073709551615 |
| wsrep_local_cert_failures | 0 |
| wsrep_local_commits | 0 |
| wsrep_local_index | 0 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_avg | 0.000000 |
| wsrep_local_recv_queue_max | 1 |
| wsrep_local_recv_queue_min | 0 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_avg | 0.000000 |
| wsrep_local_send_queue_max | 1 |
| wsrep_local_send_queue_min | 0 |
| wsrep_local_state | 4 |
| wsrep_local_state_comment | Synced |
| wsrep_local_state_uuid | 6c8283d0-bff8-11e9-ad9e-8a56edae0f44 |
| wsrep_protocol_version | 8 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <info@codership.com> |
| wsrep_provider_version | 3.23(rXXXX) |
| wsrep_ready | ON |
| wsrep_received | 10 |
| wsrep_received_bytes | 742 |
| wsrep_repl_data_bytes | 0 |
| wsrep_repl_keys | 0 |
| wsrep_repl_keys_bytes | 0 |
| wsrep_repl_other_bytes | 0 |
| wsrep_replicated | 0 |
| wsrep_replicated_bytes | 0 |
| wsrep_thread_count | 2 |
+------------------------------+-------------------------------------------------------+
58 rows in set (0.001 sec)
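Common operations commands
Show cluster status
show status like 'wsrep%';
Notes:
wsrep_cluster_status is Primary: the node belongs to the primary component (the partition that holds quorum) and can read and write normally. It does not mean the node is a "master".
wsrep_ready is ON: the node is ready to accept queries.
wsrep_connected: if this is OFF and wsrep_ready is also OFF, the node is not connected to the cluster.
wsrep_cluster_size is 3: the cluster has three nodes.
wsrep_cluster_state_uuid: should be identical on every node; a node with a different value is not connected to the cluster.
wsrep_cluster_conf_id: normally identical on every node. A different value means the node has been temporarily partitioned; it should converge again once the network between nodes recovers.
wsrep_flow_control_paused: the fraction of time replication was paused by flow control, i.e. how much slow nodes are holding the cluster back. It ranges from 0 to 1; closer to 0 is better, and 1 means replication is completely stalled. Tuning wsrep_slave_threads can help.
wsrep_flow_control_sent: the number of flow-control pause events this node has sent.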
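The checks in these notes can be combined into one small health probe. This is a sketch under the same assumption as before (tab-separated name/value rows on stdin); the function name galera_health and the wording of its messages are illustrative.

```shell
# Read "name<TAB>value" rows on stdin, print one line per failed check,
# and return 0 only when every check passes.
# galera_health is a hypothetical helper name.
galera_health() {
    awk -F '\t' '
        { v[$1] = $2 }
        END {
            bad = 0
            if (v["wsrep_cluster_status"] != "Primary")      { print "not in primary component"; bad = 1 }
            if (v["wsrep_ready"] != "ON")                    { print "node not ready"; bad = 1 }
            if (v["wsrep_connected"] != "ON")                { print "node not connected"; bad = 1 }
            if (v["wsrep_local_state_comment"] != "Synced")  { print "node not synced"; bad = 1 }
            exit bad
        }
    '
}

# Usage on a live node:
#   mysql -N -B -e "SHOW STATUS LIKE 'wsrep%'" | galera_health || echo "node unhealthy"
```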
Show the SST method
show variables like "wsrep_sst_method";
wsrep_local_state_comment reference table
wsrep_local_state    wsrep_local_state_comment
1                    Joining
2                    Donor/Desynced
3                    Joined
4                    Synced


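Understanding the state files
grastate.dat
Default path: /var/lib/mysql/grastate.dat
This file records the node's uuid and seqno. When a node leaves the Galera cluster cleanly, it writes its current GTID (uuid:seqno) to this file.
While the database service is running on the node, seqno is -1.
After a power failure, seqno may be the same on every node; in that case use gvwstate.dat to decide which node to start first.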
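To read these values without opening the file by hand, a short helper is enough. The sketch assumes grastate.dat uses the usual "key: value" line layout (uuid, seqno, and on newer Galera versions safe_to_bootstrap); the function name grastate_field is illustrative.

```shell
# Print the value of one field from a grastate.dat file.
# Assumes "key: value" lines; grastate_field is a hypothetical helper name.
grastate_field() {
    field=$1
    file=${2:-/var/lib/mysql/grastate.dat}
    awk -F ':[ \t]*' -v key="$field" '$1 == key { print $2 }' "$file"
}

# Usage:
#   grastate_field seqno
#   grastate_field uuid /var/lib/mysql/grastate.dat
```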
gvwstate.dat
Default path: /var/lib/mysql/gvwstate.dat
This file stores the cluster view state. It is deleted automatically when the cluster shuts down cleanly.
Node state
wsrep_local_state:
The node's state. The six numbered steps below are the state transitions a node goes through, as described in the Galera documentation; the status variable itself normally reports 1 (Joining), 2 (Donor/Desynced), 3 (Joined) or 4 (Synced).
Step 1: The node starts and establishes a connection to the Primary Component.
Step 2: When the node succeeds with a state transfer request, it begins to cache write-sets.
Step 3: The node receives a State Snapshot Transfer. It now has all cluster data and begins to apply the cached write-sets. Here the node enables Flow Control to ensure an eventual decrease in the slave queue.
Step 4: The node finishes catching up with the cluster. Its slave queue is now empty and it enables Flow Control to keep it empty. The node sets the MySQL status variable wsrep_ready to the value 1. The node is now allowed to process transactions.
Step 5: The node receives a state transfer request. Flow Control relaxes to DONOR. The node caches all write-sets it cannot apply.
Step 6: The node completes the state transfer to joiner node.
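Troubleshooting
1. Unexpected power loss
If the machine room loses power and every Galera host shuts down uncleanly, the Galera cluster will not start normally after power returns. How to recover?
Step 1: start the mariadb service on the cluster's bootstrap node (the first node).
Step 2: start the mariadb service on the member nodes.
If the mysql service will not start on either the bootstrap node or the member nodes:
Method 1, step 1: delete /var/lib/mysql/grastate.dat on the bootstrap node,
then start the service with /bin/galera_new_cluster. Once it starts, log in and check the wsrep status.
Method 1, step 2: delete /var/lib/mysql/grastate.dat on each member node,
then restart with systemctl restart mariadb. Once it starts, log in and check the wsrep status.
Method 2, step 1: in /var/lib/mysql/grastate.dat on the bootstrap node, change safe_to_bootstrap from 0 to 1,
then start the service with /bin/galera_new_cluster. Once it starts, log in and check the wsrep status.
Method 2, step 2: make the same 0-to-1 change in /var/lib/mysql/grastate.dat on each member node,
then restart with systemctl restart mariadb. Once it starts, log in and check the wsrep status.
Caution: bootstrap from the node that holds the most recent data. A member whose grastate.dat was deleted has no saved position and takes a full state transfer (SST) from the bootstrap node.
Note: if /var/lib/mysql/grastate.dat was deleted and you recreate it by hand, fix its ownership with chown mysql:mysql /var/lib/mysql/grastate.dat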
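Choosing which node to bootstrap comes down to comparing the seqno recorded on each node and taking the highest. The sketch below assumes you have already collected one "host seqno" pair per node (for example from each node's grastate.dat); the function name pick_bootstrap_node is illustrative. After an unclean shutdown seqno may be -1 on every node, in which case the position has to be recovered first (for example with mysqld_safe --wsrep-recover) before the comparison is meaningful.

```shell
# Read "host seqno" pairs on stdin and print the host with the highest seqno.
# pick_bootstrap_node is a hypothetical helper name.
pick_bootstrap_node() {
    awk '
        NF >= 2 && (best == "" || $2 + 0 > max) { best = $1; max = $2 + 0 }
        END { if (best != "") print best }
    '
}

# Usage:
#   printf 'node1 1523\nnode2 1530\nnode3 1498\n' | pick_bootstrap_node
```

On a tie the first host listed is printed, so list the preferred node first.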
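2. Split brain
set global wsrep_provider_options="pc.bootstrap=true";
This command forces a node caught in a split brain to recover by forming a new primary component.
Aside:
Problem: the bootstrap node lost power and will not come back up; only the member nodes are healthy.
Fix: on one member node, delete /var/lib/mysql/grastate.dat and start the service with /bin/galera_new_cluster, which makes that node the new bootstrap node.
Then delete /var/lib/mysql/grastate.dat on the other member nodes and run systemctl restart mariadb. At this point the mariadb service on those members failed to start.
The cause was a split brain. Log in to the new bootstrap node and run set global wsrep_provider_options="pc.bootstrap=true"; then restart mariadb on the other members, and they start normally. Problem solved.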
3. Error: no permission on /var/run/mariadb/mariadb.pid
Check the error log /var/log/mariadb/mariadb.log
[ERROR] mysqld: Can't create/write to file '/var/run/mariadb/mariadb.pid' (Errcode: 13 "Permission denied")
[ERROR] Can't start server: can't create PID file: Permission denied:
Fix:
mkdir -pv /var/run/mariadb/
chown -R mysql:mysql /var/run/mariadb
4. Error: WSREP: failed to open gcomm backend connection: 131: invalid UUID
[ERROR] WSREP: failed to open gcomm backend connection: 131: invalid UUID: 00000000 (FATAL)
at gcomm/src/pc.cpp:PC():271
[ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -131 (State not recoverable)
[ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1379: Failed to open channel 'wsrep_cluster' at 'gcomm://': -131 (State not recoverable)
[ERROR] WSREP: gcs connect failed: State not recoverable
[ERROR] WSREP: wsrep::connect(gcomm://) failed: 7
[ERROR] Aborting
Fix: on that node, go to /var/lib/mysql/ and move gvwstate.dat out of the way (mv), then restart mariadb.
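Moving the file aside rather than deleting it keeps a copy for later inspection. A minimal sketch of that step follows; the function name set_aside_gvwstate and the .bak naming are illustrative, and the service should be stopped before running it.

```shell
# Move gvwstate.dat out of the way, keeping a timestamped copy next to it.
# set_aside_gvwstate is a hypothetical helper name.
set_aside_gvwstate() {
    datadir=${1:-/var/lib/mysql}
    src="$datadir/gvwstate.dat"
    [ -f "$src" ] || { echo "no gvwstate.dat in $datadir"; return 0; }
    dest="$src.$(date +%Y%m%d%H%M%S).bak"
    mv "$src" "$dest" && echo "moved to $dest"
}

# Usage (as root, with mariadb stopped):
#   set_aside_gvwstate && systemctl start mariadb
```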
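5. After the power failure the cluster services had been running for a while when we found that data was not being replicated: all writes were going to a single machine. What to do? (A real production incident. An operator mistake when restarting the cluster let a database that had received no new data for several days overwrite the others, so the last few days of data were lost.)
Fix: after backing up the data, stop every database service. Start the node that has been receiving writes first, as the bootstrap node, with /bin/galera_new_cluster
Then start the other nodes with rm -f /var/lib/mysql/grastate.dat && systemctl start mariadb (startup may fail because the data transfer takes too long. If leftover mysql processes remain, kill them and retry until the transfer completes and the node starts; you can also raise the relevant mysql startup timeout. The state-change messages in the log can help here.)
Typical log sequence for a node that starts up healthy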
[Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
[Note] WSREP: Restored state OPEN -> JOINED (0)
[Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
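Checking after startup which node is the primary one: compare the 8-character node ID in the log with the first segment of the node's wsrep_gcomm_uuid
$ sudo grep -i node /var/log/mariadb/mariadb.log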
2019-12-10 11:03:50 139673594214592 [Note] WSREP: Node e56ffd23 state prim
2019-12-10 11:03:50 139673593899776 [Note] WSREP: New cluster view: global state: dcf89fd8-1a5b-11ea-9e67-46b64f83070d:0, view# 4: Primary, number of nodes: 2, my index: 0, protocol version 3
2019-12-10 11:07:02 139752511609024 [Note] WSREP: Node 232e34f2 state prim
2019-12-10 11:07:02 139752511294208 [Note] WSREP: New cluster view: global state: dcf89fd8-1a5b-11ea-9e67-46b64f83070d:0, view# 1: Primary, number of nodes: 1, my index: 0, protocol version 3
2019-12-10 11:09:04 139752200206080 [Note] WSREP: Node 232e34f2 state prim
2019-12-10 11:09:05 139752511294208 [Note] WSREP: New cluster view: global state: dcf89fd8-1a5b-11ea-9e67-46b64f83070d:0, view# 2: Primary, number of nodes: 2, my index: 0, protocol version 3
MariaDB [(none)]> show status like 'wsrep_gcomm_uuid';
+------------------+--------------------------------------+
| Variable_name | Value |
+------------------+--------------------------------------+
| wsrep_gcomm_uuid | 232e34f2-1afa-11ea-9759-6e2b142ae042 |
+------------------+--------------------------------------+
1 row in set (0.00 sec)
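The comparison between the log and wsrep_gcomm_uuid can be automated. The sketch below assumes the log line format shown in the grep output above ("WSREP: Node <8 hex chars> state prim"); the function name node_is_prim is illustrative.

```shell
# Succeed if the log contains a "state prim" line for the node whose
# wsrep_gcomm_uuid is given. Only the first 8 characters are compared,
# matching the short ID printed in the log.
# node_is_prim is a hypothetical helper name.
node_is_prim() {
    uuid=$1
    log=${2:-/var/log/mariadb/mariadb.log}
    short=$(printf '%s\n' "$uuid" | cut -c1-8)
    grep -q "WSREP: Node $short state prim" "$log"
}

# Usage on a live node:
#   uuid=$(mysql -N -B -e "SHOW STATUS LIKE 'wsrep_gcomm_uuid'" | cut -f2)
#   node_is_prim "$uuid" && echo "this node reported state prim"
```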
-------------------------------------------------------------------------------------------------------------------------------------------
Production configuration example
/etc/my.cnf
[client]
port=3306
socket=/var/lib/mysql/mysql.sock
[mysqld]
character-set-server=utf8
default-storage-engine=InnoDB
port=3306
socket=/var/lib/mysql/mysql.sock
skip-external-locking
key_buffer_size = 256M
max_allowed_packet = 1M
table_open_cache = 256
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size= 16M
thread_concurrency = 8
max_connections = 800
log-bin=mysql-bin
binlog_format=mixed
server-id = 1
skip-name-resolve
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
[myisamchk]
key_buffer_size = 128M
sort_buffer_size = 128M
read_buffer = 2M
write_buffer = 2M
[mysqlhotcopy]
interactive-timeout
!includedir /etc/my.cnf.d
High-availability cluster configuration file /etc/my.cnf.d/galera.cnf
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.30.126:4567,192.168.30.127:4567"
wsrep_cluster_name="wsrep_cluster"
wsrep_node_name=host-127
wsrep_node_address=192.168.30.127
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=192.168.30.127
wsrep_slave_threads=1
innodb_flush_log_at_trx_commit=0
[embedded]
[mariadb]
log-error=/var/log/mariadb/mariadb.log # log file, useful for analysis and troubleshooting
pid-file=/var/run/mariadb/mariadb.pid
[mariadb-10.1]
In production, adding log-error and pid-file to the configuration kept the service from starting:
[root@node1 ~]# systemctl status mariadb.service
● mariadb.service - MariaDB 10.1.40 database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/mariadb.service.d
└─migrated-from-my.cnf-settings.conf
Active: failed (Result: exit-code) since Fri 2019-12-27 16:16:03 CST; 7h left
Docs: man:mysqld(8)
https://mariadb.com/kb/en/library/systemd/
Process: 14595 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
Process: 14386 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
Process: 14384 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Main PID: 14595 (code=exited, status=1/FAILURE)
Dec 27 16:16:03 node1 mysqld[14595]: 2019-12-27 16:16:03 140593965365504 [Note] Plugin 'FEEDBACK' is disabled.
Dec 27 16:16:03 node1 mysqld[14595]: 2019-12-27 16:16:03 140592789059328 [Note] InnoDB: Dumping buffer pool(s) not yet started
Dec 27 16:16:03 node1 mysqld[14595]: 2019-12-27 16:16:03 140593648891648 [Note] WSREP: (1f745993, 'tcp://0.0.0.0:4567') connection established to 74e6b5d2 tcp://10.30.1.202:4567
Dec 27 16:16:03 node1 mysqld[14595]: 2019-12-27 16:16:03 140593648891648 [Note] WSREP: (1f745993, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
Dec 27 16:16:03 node1 mysqld[14595]: 2019-12-27 16:16:03 140593965365504 [Note] Server socket created on IP: '10.30.1.201'.
Dec 27 16:16:03 node1 mysqld[14595]: 2019-12-27 16:16:03 140593964320512 [ERROR] mysqld: Can't create/write to file '/var/run/mariadb/mariadb.pid' (Errcode: 2 "No such file or directory")
Dec 27 16:16:03 node1 systemd[1]: mariadb.service: main process exited, code=exited, status=1/FAILURE
Dec 27 16:16:03 node1 systemd[1]: Failed to start MariaDB 10.1.40 database server.
Dec 27 16:16:03 node1 systemd[1]: Unit mariadb.service entered failed state.
Dec 27 16:16:03 node1 systemd[1]: mariadb.service failed.
Fix (create the directories the configuration refers to)
mkdir -pv /var/run/mariadb/
chown -R mysql:mysql /var/run/mariadb/
mkdir -pv /var/log/mariadb/
chown -R mysql:mysql /var/log/mariadb/
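One caveat, assuming /var/run is a tmpfs as it is by default on CentOS 7: a directory created there by hand disappears at the next reboot. A systemd-tmpfiles rule can recreate it at boot. The sketch below only prints such a rule; the function name tmpfiles_rule and the file name mariadb.conf are illustrative, and your packages may already ship an equivalent rule.

```shell
# Print a systemd-tmpfiles rule that recreates a runtime directory at boot.
# Arguments: directory, owner, group. tmpfiles_rule is a hypothetical helper name.
tmpfiles_rule() {
    dir=$1
    owner=$2
    group=$3
    printf 'd %s 0755 %s %s -\n' "$dir" "$owner" "$group"
}

# Usage (as root):
#   tmpfiles_rule /run/mariadb mysql mysql > /etc/tmpfiles.d/mariadb.conf
#   systemd-tmpfiles --create /etc/tmpfiles.d/mariadb.conf
```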
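A former colleague's notes on operating the HA database
Start the first node
# service mysql bootstrap
or
# service mysql start --wsrep-new-cluster --wsrep_cluster_address="gcomm://"
Once the first node has been shut down, restarting it requires the IP of any other node in the cluster
# service mysql start --wsrep_cluster_address=gcomm://192.168.1.102
Start the remaining nodes
# service mysql start
If a node will not start after an unclean shutdown, run the following and try again
mysqld_safe --wsrep-recover
Show cluster status
mysql> SHOW STATUS LIKE 'wsrep%';
Startup fails with the error
WSREP: failed to open gcomm backend connection: 131: invalid UUID
On that node, go to /var/lib/mysql/ and move gvwstate.dat out of the way (mv), then restart mariadb.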
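Aside: installing MariaDB and Galera on an internal network
Galera does not need to be installed separately. With the MariaDB repository configured, installing the MariaDB-server package with yum automatically installs several dependencies, galera among them.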


This is how we install it in production


# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.88.66.5 yum.mariadb.org
# cat /etc/yum.repos.d/mariadb.repo
[mariadb]
name=MariaDB
gpgcheck=0
enabled=1
Loaded plugins: fastestmirror
MariaDB-10.1.14-centos7-x86_64-server.rpm | 99 MB 00:01
Examining /var/tmp/yum-root-yhH8bv/MariaDB-10.1.14-centos7-x86_64-server.rpm: MariaDB-server-10.1.14-1.el7.centos.x86_64
Marking /var/tmp/yum-root-yhH8bv/MariaDB-10.1.14-centos7-x86_64-server.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package MariaDB-server.x86_64 0:10.1.14-1.el7.centos will be installed
--> Processing Dependency: MariaDB-client for package: MariaDB-server-10.1.14-1.el7.centos.x86_64
Loading mirror speeds from cached hostfile
* base: mirrors.tuna.tsinghua.edu.cn
* epel: mirrors.tuna.tsinghua.edu.cn
* extras: mirrors.tuna.tsinghua.edu.cn
* updates: mirrors.neusoft.edu.cn
--> Processing Dependency: galera for package: MariaDB-server-10.1.14-1.el7.centos.x86_64
--> Processing Dependency: libjemalloc.so.1()(64bit) for package: MariaDB-server-10.1.14-1.el7.centos.x86_64
--> Running transaction check
---> Package MariaDB-client.x86_64 0:10.1.14-1.el7.centos will be installed
---> Package galera.x86_64 0:25.3.15-1.rhel7.el7.centos will be installed
---> Package jemalloc.x86_64 0:3.6.0-1.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
MariaDB-server x86_64 10.1.14-1.el7.centos /MariaDB-10.1.14-centos7-x86_64-server 423 M
Installing for dependencies:
MariaDB-client x86_64 10.1.14-1.el7.centos mariadb 39 M
galera x86_64 25.3.15-1.rhel7.el7.centos mariadb 7.7 M
jemalloc x86_64 3.6.0-1.el7 epel 105 k
Transaction Summary
================================================================================
Install 1 Package (+3 Dependent packages)
Total size: 469 M
Installed size: 624 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : MariaDB-client-10.1.14-1.el7.centos.x86_64 1/4
Installing : jemalloc-3.6.0-1.el7.x86_64 2/4
Installing : galera-25.3.15-1.rhel7.el7.centos.x86_64 3/4
Installing : MariaDB-server-10.1.14-1.el7.centos.x86_64 4/4
Verifying : MariaDB-server-10.1.14-1.el7.centos.x86_64 1/4
Verifying : galera-25.3.15-1.rhel7.el7.centos.x86_64 2/4
Verifying : jemalloc-3.6.0-1.el7.x86_64 3/4
Verifying : MariaDB-client-10.1.14-1.el7.centos.x86_64 4/4
Installed:
MariaDB-server.x86_64 0:10.1.14-1.el7.centos
Dependency Installed:
MariaDB-client.x86_64 0:10.1.14-1.el7.centos
galera.x86_64 0:25.3.15-1.rhel7.el7.centos
jemalloc.x86_64 0:3.6.0-1.el7
Complete!
Author: Dexter_Wang. Position: senior Linux architect at an internet company. Contact: 993852246@qq.com