Environment
vip 192.168.1.101
slave 192.168.1.16 5.7.17 3306
master 192.168.1.135 5.7.17 3306
proxysql 192.168.1.16 (ProxySQL runs on node 16 for convenience)
I. Setting up MHA
1. Install the MHA software. First install the EPEL repository (on both machines).
rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
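2. Install the Perl dependencies (on both machines)
yum install perl-DBD-MySQL
yum install perl-Config-Tiny
yum install perl-Log-Dispatch
yum install perl-Parallel-ForkManager
3. Install the MHA packages (installing both node and manager on both machines is recommended, since it makes switching easier)
rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
4. Set up SSH trust between the machines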
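The original does not list the commands for this step. The following is a minimal sketch, assuming root SSH between the two hosts from the environment section and the standard OpenSSH tools ssh-keygen and ssh-copy-id. The function name print_ssh_trust_cmds is hypothetical. It only prints the commands so they can be reviewed and then run on each machine.

```shell
#!/bin/bash
# Hypothetical helper: prints (does not run) the commands that set up
# passwordless root SSH between the MHA hosts. The host list comes from
# the environment section; adjust it for your own setup.
HOSTS="192.168.1.16 192.168.1.135"

print_ssh_trust_cmds() {
  local h
  # Generate a key pair once per machine (no passphrase, default path).
  echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
  # Copy the public key to every host, the local one included, because
  # the manager may also connect over SSH to the node it runs on.
  for h in $HOSTS; do
    echo "ssh-copy-id root@$h"
  done
}

print_ssh_trust_cmds
```

Run the printed commands on both machines, then confirm the result with masterha_check_ssh as shown in step 8.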
5. Grant privileges
GRANT ALL PRIVILEGES ON *.* TO 'zhuch'@'%' IDENTIFIED BY "zhuch";
GRANT REPLICATION SLAVE ON *.* TO 'slave'@'%' IDENTIFIED BY "oracle";
6. Create the application directory
mkdir /etc/masterha
Copy the following files into /etc/masterha
[root@mysql3 masterha]# ls -l
total 32
-rw-r--r--. 1 root root   509 Feb 10 02:29 app1.conf
-rw-r--r--. 1 root root    55 Feb 10 03:15 drop_vip.sh
-rw-r--r--. 1 root root    57 Feb 10 03:15 init_vip.sh
-rw-r--r--. 1 root root   354 Feb 10 02:25 masterha_default.conf
-rwxr-xr-x. 1 root root  3978 Feb 10 03:16 master_ip_failover
-rwxr-xr-x. 1 root root 10390 Feb 10 03:17 master_ip_online_change
app1.conf: the MHA application configuration file (the unpacked software package contains sample configuration files, but here we simply create a new one and edit it)
[root@mysql3 masterha]# cat app1.conf
[server default]
# MHA manager working directory
manager_workdir = /var/log/masterha/app1
manager_log = /var/log/masterha/app1/app1.log
remote_workdir = /var/log/masterha/app1
[server1]
hostname=192.168.1.16
master_binlog_dir = /data/mysql/mysql3306/logs
candidate_master = 1
# Prevents the switchover from getting stuck when the slave is lagging at the time the master fails.
check_repl_delay = 0
[server3]
hostname=192.168.1.135
master_binlog_dir=/data/mysql/mysql3306/logs
candidate_master = 1
check_repl_delay = 0
drop_vip.sh: unbinds the VIP
[root@mysql3 masterha]# cat drop_vip.sh
vip="192.168.1.101/24"
/sbin/ip addr del $vip dev eth0
init_vip.sh: binds the VIP
[root@mysql3 masterha]# cat init_vip.sh
vip="192.168.1.101/24"
/sbin/ip addr add $vip dev eth0
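The two VIP scripts above run a single ip command each. As a hedged sketch, not the author's script, the bind step could also skip the add when the address is already present and announce the address with a gratuitous ARP. This assumes the iputils arping tool is installed. The names bind_vip, run and DRY_RUN are illustrative; with DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/bash
# Sketch: idempotent variant of init_vip.sh. With DRY_RUN=1 (the default
# here) it only prints the commands it would run.
VIP="192.168.1.101/24"
DEV="eth0"
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command in dry-run mode, execute it otherwise.
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

bind_vip() {
  # When really running, skip the add if the address is already bound.
  if [ "$DRY_RUN" != "1" ] && /sbin/ip -o addr show dev "$DEV" | grep -qF " ${VIP} "; then
    echo "VIP already bound"
    return 0
  fi
  run /sbin/ip addr add "$VIP" dev "$DEV"
  # Announce the VIP's new location to neighbours (gratuitous ARP);
  # assumes the iputils arping binary is available.
  run arping -c 3 -U -I "$DEV" "${VIP%/*}"
}

bind_vip
```

Setting DRY_RUN=0 in the environment makes the same script execute the commands instead of printing them.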
masterha_default.conf: global configuration file
[root@mysql3 masterha]# cat masterha_default.conf
[server default]
# MySQL user and password
user=zhuch
password=zhuch
# System SSH user
ssh_user=root
# Replication user
repl_user=slave
repl_password=oracle
# Monitoring
ping_interval=1
#shutdown_script=""
# Scripts called during a switchover
master_ip_failover_script= /etc/masterha/master_ip_failover
master_ip_online_change_script= /etc/masterha/master_ip_online_change
master_ip_failover: automatic failover script

[root@mysql3 masterha]# cat master_ip_failover
#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

# VIP for this group of machines (customize as needed)
my $vip = "192.168.1.101";
my $if  = "eth0";

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);
GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

sub add_vip {
  my $output1 =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $orig_master_host /sbin/ip addr del $vip/24 dev $if`;
  my $output2 =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $new_master_host /sbin/ip addr add $vip/24 dev $if`;
}

exit &main();

sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {

      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      #print "Creating app user on the new master..\n";
      #FIXME_xxx_create_user( $new_master_handler->{dbh} );
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      &add_vip();
      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
master_ip_online_change: manual (online) switchover script

[root@mysql3 masterha]# cat master_ip_online_change
#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;

# VIP definition (added)
my $vip = "192.168.1.101";
my $if  = "eth0";

my (
  $command,              $orig_master_is_new_slave, $orig_master_host,
  $orig_master_ip,       $orig_master_port,         $orig_master_user,
  $orig_master_password, $orig_master_ssh_user,     $new_master_host,
  $new_master_ip,        $new_master_port,          $new_master_user,
  $new_master_password,  $new_master_ssh_user,
);
GetOptions(
  'command=s'                => \$command,
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  'orig_master_host=s'       => \$orig_master_host,
  'orig_master_ip=s'         => \$orig_master_ip,
  'orig_master_port=i'       => \$orig_master_port,
  'orig_master_user=s'       => \$orig_master_user,
  'orig_master_password=s'   => \$orig_master_password,
  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,
  'new_master_host=s'        => \$new_master_host,
  'new_master_ip=s'          => \$new_master_ip,
  'new_master_port=i'        => \$new_master_port,
  'new_master_user=s'        => \$new_master_user,
  'new_master_password=s'    => \$new_master_password,
  'new_master_ssh_user=s'    => \$new_master_ssh_user,
);

exit &main();

sub drop_vip {
  my $output =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $orig_master_host /sbin/ip addr del $vip/24 dev $if`;

  # kill all connections in MySQL
  #FIXME
}

sub add_vip {
  my $output =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $new_master_host /sbin/ip addr add $vip/24 dev $if`;
}

sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command)    && $command eq "Binlog Dump" );
    next if ( defined($user)       && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );

    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }

    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }

    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
      $orig_master_handler->disable_log_bin_local();

      # print current_time_us() . " Drpping app user on the orig master..\n";
      print current_time_us() . " drop vip $vip..\n";

      #drop_app_user($orig_master_handler);
      &drop_vip();

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database

# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
# If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      #print current_time_us() . " Creating app user on the new master..\n";
      print current_time_us() . " Add vip $vip on $if..\n";

      # create_app_user($new_master_handler);
      &add_vip();
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}
7. Bind the VIP on the master (run the script)
sh init_vip.sh
8. Check that SSH is OK
[root@mysql2 opt]# masterha_check_ssh --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf
Sat Feb 10 22:00:34 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Sat Feb 10 22:00:34 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:00:34 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:00:34 2018 - [info] Starting SSH connection tests..
Sat Feb 10 22:00:36 2018 - [debug]
Sat Feb 10 22:00:35 2018 - [debug]  Connecting via SSH from root@192.168.1.135(192.168.1.135:22) to root@192.168.1.16(192.168.1.16:22)..
Sat Feb 10 22:00:36 2018 - [debug]   ok.
Sat Feb 10 22:00:41 2018 - [debug]
Sat Feb 10 22:00:34 2018 - [debug]  Connecting via SSH from root@192.168.1.16(192.168.1.16:22) to root@192.168.1.135(192.168.1.135:22)..
Sat Feb 10 22:00:41 2018 - [debug]   ok.
Sat Feb 10 22:00:41 2018 - [info] All SSH connection tests passed successfully.
9. Check that master/slave replication is OK
[root@mysql2 opt]# masterha_check_repl --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf
Sat Feb 10 22:26:50 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Sat Feb 10 22:26:50 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:26:50 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:26:50 2018 - [info] MHA::MasterMonitor version 0.56.
Sat Feb 10 22:26:50 2018 - [info] GTID failover mode = 1
Sat Feb 10 22:26:50 2018 - [info] Dead Servers:
Sat Feb 10 22:26:50 2018 - [info] Alive Servers:
Sat Feb 10 22:26:50 2018 - [info]   192.168.1.16(192.168.1.16:3306)
Sat Feb 10 22:26:50 2018 - [info]   192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info] Alive Slaves:
Sat Feb 10 22:26:50 2018 - [info]   192.168.1.16(192.168.1.16:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Sat Feb 10 22:26:50 2018 - [info]     GTID ON
Sat Feb 10 22:26:50 2018 - [info]     Replicating from 192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Feb 10 22:26:50 2018 - [info] Current Alive Master: 192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info] Checking slave configurations..
Sat Feb 10 22:26:50 2018 - [info]  read_only=1 is not set on slave 192.168.1.16(192.168.1.16:3306).
Sat Feb 10 22:26:50 2018 - [info] Checking replication filtering settings..
Sat Feb 10 22:26:50 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Sat Feb 10 22:26:50 2018 - [info]  Replication filtering check ok.
Sat Feb 10 22:26:50 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sat Feb 10 22:26:50 2018 - [info] Checking SSH publickey authentication settings on the current master..
Sat Feb 10 22:26:51 2018 - [info] HealthCheck: SSH to 192.168.1.135 is reachable.
Sat Feb 10 22:26:51 2018 - [info] 192.168.1.135(192.168.1.135:3306) (current master)
 +--192.168.1.16(192.168.1.16:3306)
Sat Feb 10 22:26:51 2018 - [info] Checking replication health on 192.168.1.16..
Sat Feb 10 22:26:51 2018 - [info]  ok.
Sat Feb 10 22:26:51 2018 - [info] Checking master_ip_failover_script status:
Sat Feb 10 22:26:51 2018 - [info]   /etc/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.135 --orig_master_ip=192.168.1.135 --orig_master_port=3306
Sat Feb 10 22:26:51 2018 - [info]  OK.
Sat Feb 10 22:26:51 2018 - [warning] shutdown_script is not defined.
Sat Feb 10 22:26:51 2018 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
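10. Set relay_log_purge=0 and read_only=1 (read-only) on the slave
set global relay_log_purge=0;
set global read_only=1;
Keeping relay logs may be needed when differential relay log events have to be applied to other slaves. With one master and one slave, as here, the setting is not strictly necessary. If relay_log_purge=0 is set, relay logs can pile up on the slave; in that case the purge_relay_logs command that ships with MHA can be run on a schedule to delete them.
It can be wrapped in a script and run periodically, as follows:
#!/bin/bash
user=zhuch
passwd=zhuch
port=3306
log_dir='/etc/masterha/log'
work_dir='/etc/masterha/relay_log_node'
purge='/usr/bin/purge_relay_logs'
if [ ! -d $log_dir ]
then
   mkdir $log_dir -p
fi
if [ ! -d $work_dir ]
then
   mkdir $work_dir -p
fi
$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1
At this point MHA is basically set up: when the master goes down, service switches to the slave and the VIP moves to the slave as well.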
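Automatic failover only happens while the masterha_manager process is running, a step the walkthrough above does not show (the test in part III uses a manual masterha_master_switch instead). The following sketch builds the command lines for starting the manager in the background and checking its status. The configuration paths are the ones used above; the log path MANAGER_LOG and the function names are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: build the command lines for running and checking the MHA manager.
GLOBAL_CONF=/etc/masterha/masterha_default.conf
APP_CONF=/etc/masterha/app1.conf
MANAGER_LOG=/var/log/masterha/app1/manager.out   # hypothetical log path

manager_start_cmd() {
  # masterha_manager stays in the foreground, so it is usually
  # started with nohup and sent to the background.
  echo "nohup masterha_manager --global_conf=$GLOBAL_CONF --conf=$APP_CONF > $MANAGER_LOG 2>&1 &"
}

manager_status_cmd() {
  # Reports whether the manager for this application is running.
  echo "masterha_check_status --global_conf=$GLOBAL_CONF --conf=$APP_CONF"
}

manager_start_cmd
manager_status_cmd
```

The printed commands can be pasted into a shell on the manager host once masterha_check_ssh and masterha_check_repl both pass.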
II. Installing and configuring ProxySQL
1. Installation
Download: https://www.percona.com/downloads/proxysql/
rpm -ivh proxysql-1.4.3-1-centos67.x86_64.rpm
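2. Configuration. Log in to the ProxySQL admin interface and add the MySQL master and slave. The master goes into the writer group, i.e. hostgroup_id 100; the slave serves reads and goes into hostgroup 1000.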
mysql -uadmin -padmin -P6032 -h127.0.0.1
Note, however, that here I set the writer node directly to the VIP 192.168.1.101
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(100,'192.168.1.101',3306,1,1000,10,'vip');
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(1000,'192.168.1.16',3306,1,1000,10,'slave');
admin@ 23:16: [(none)]> select * from mysql_servers;
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
| hostgroup_id | hostname      | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment       |
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      | 0           | 1000            | 10                  | 0       | 0              | test proxysql |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      | 0           | 1000            | 10                  | 0       | 0              | test proxysql |
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
3. Configure the MySQL users that ProxySQL uses for the backends. They must already exist on the backend MySQL servers (135 and 16). One is the monitoring account, the other is the application account:
GRANT ALL PRIVILEGES ON *.* TO 'proxysql'@'192.168.1.16' identified by 'proxysql';
GRANT ALL PRIVILEGES ON *.* TO 'sbuser'@'%' identified by 'sbuser';
After adding them on the backend MySQL servers, configure ProxySQL. Note that default_hostgroup must match the writer hostgroup defined above.
insert into mysql_users(username,password,active,default_hostgroup,transaction_persistent) values('sbuser','sbuser',1,100,1);
admin@ 23:37: [(none)]> select * from mysql_users;
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| username | password                                  | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| sbuser   | sbuser                                    | 1      | 0       | 100               |                | 0             | 1                      | 0            | 1       | 1        | 10000           |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
4. Set up the health-check (monitor) account
admin@ 23:37: [(none)]>set mysql-monitor_username='proxysql';
admin@ 23:37: [(none)]>set mysql-monitor_password='proxysql';
-- apply to runtime
load mysql servers to runtime;
load mysql users to runtime;
load mysql variables to runtime;
-- persist to disk
save mysql servers to disk;
save mysql users to disk;
save mysql variables to disk;
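Each configuration module follows the same pattern: load it to runtime so it takes effect, then save it to disk so it survives a restart. As a small sketch of that pattern, the hypothetical helper apply_and_persist below prints the statement pair for every module name passed to it. The statements are standard ProxySQL admin commands, while the helper itself is only an illustration.

```shell
#!/bin/bash
# Sketch: generate the LOAD ... TO RUNTIME / SAVE ... TO DISK pair for
# each ProxySQL configuration module given as an argument.
apply_and_persist() {
  local module
  for module in "$@"; do
    echo "LOAD MYSQL ${module} TO RUNTIME;"
    echo "SAVE MYSQL ${module} TO DISK;"
  done
}

apply_and_persist SERVERS USERS VARIABLES
```

The output can be piped into the admin interface, for example: apply_and_persist "QUERY RULES" | mysql -uadmin -padmin -P6032 -h127.0.0.1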
If mysql_users was populated with plaintext passwords, the save command below replaces them with their hashed values:
save mysql users to mem;
admin@ 23:39: [(none)]> select * from mysql_users;
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| username | password                                  | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| sbuser   | *CA96E56547F43610DDE9EB7B12B4EF4C51CDDFFC | 1      | 0       | 100               |                | 0             | 1                      | 0            | 1       | 1        | 10000           |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
5. Configure routing
-- route to the master (M)
admin@127.0.0.1 : (none) 04:58:11>INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT.*FOR UPDATE$',100,1);
Query OK, 1 row affected (0.00 sec)
-- route to the slave (S)
admin@127.0.0.1 : (none) 05:08:17>INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT',1000,1);
Query OK, 1 row affected (0.00 sec)
admin@127.0.0.1 : (none) 05:09:37>load mysql query rules to runtime;
Query OK, 0 rows affected (0.00 sec)
admin@127.0.0.1 : (none) 05:09:57>save mysql query rules to disk;
Query OK, 0 rows affected (0.00 sec)
6. Connect to the database through port 6033 and test read/write splitting
[root@mysql2 sysbench]# mysql -usbuser -psbuser -P6033 -h192.168.1.16
sbuser@ 23:59: [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| z1_email           |
| z1_exchange        |
| z1_relation        |
+--------------------+
7 rows in set (0.03 sec)
sbuser@ 00:02: [(none)]> use z1_email;
Database changed, 2 warnings
sbuser@ 00:02: [z1_email]> insert into a1 values(134);
Query OK, 1 row affected (0.01 sec)
sbuser@ 00:03: [z1_email]> insert into a1 values(146);
Query OK, 1 row affected (0.01 sec)
sbuser@ 00:03: [z1_email]> insert into a1 values(157);
Query OK, 1 row affected (0.02 sec)
sbuser@ 00:03: [z1_email]> selet * from a1;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'selet * from a1' at line 1
sbuser@ 00:03: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
+------+
36 rows in set (0.00 sec)
Log in to the admin interface on port 6032 and check the statistics; they show that read/write splitting is indeed working
admin@ 00:10: [(none)]> select * from stats_mysql_query_digest;
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
| hostgroup | schemaname         | username | digest             | digest_text              | count_star | first_seen | last_seen  | sum_time | min_time | max_time |
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
| 1000      | z1_email           | sbuser   | 0xB17CC7AAA7E39A4A | select * from a1         | 1          | 1518278606 | 1518278606 | 2123     | 2123     | 2123     |
| 100       | z1_email           | sbuser   | 0x496C8B86BBC0D398 | insert into a1 values(?) | 3          | 1518278580 | 1518278588 | 30478    | 6373     | 16671    |
| 1000      | information_schema | sbuser   | 0x620B328FE9D6D71A | SELECT DATABASE()        | 1          | 1518278568 | 1518278568 | 508      | 508      | 508      |
| 100       | information_schema | sbuser   | 0x02033E45904D3DF0 | show databases           | 1          | 1518278563 | 1518278563 | 30233    | 30233    | 30233    |
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
4 rows in set (0.00 sec)
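The hostgroup column in the digest table follows from the two query rules and the user's default_hostgroup. The sketch below emulates that decision with grep, to make the rule order easier to reason about. It is an approximation rather than ProxySQL's own matching code, and it assumes case-insensitive matching, which is consistent with the lowercase "select * from a1" landing in hostgroup 1000 above. The function name route_query is hypothetical.

```shell
#!/bin/bash
# Sketch: emulate the two query rules with grep to see which hostgroup a
# statement would be sent to. Approximation only, not ProxySQL's code.
DEFAULT_HOSTGROUP=100   # default_hostgroup of user sbuser

route_query() {
  local q="$1"
  if printf '%s' "$q" | grep -Eiq '^SELECT.*FOR UPDATE$'; then
    # Rule 1: SELECT ... FOR UPDATE goes to the writer hostgroup.
    echo 100
  elif printf '%s' "$q" | grep -Eiq '^SELECT'; then
    # Rule 2: any other SELECT goes to the reader hostgroup.
    echo 1000
  else
    # No rule matched: fall back to the user's default hostgroup.
    echo "$DEFAULT_HOSTGROUP"
  fi
}

route_query 'select * from a1 for update'
route_query 'select * from a1'
route_query 'insert into a1 values(1)'
```

This also accounts for "show databases" being counted under hostgroup 100: it matches neither rule, so it falls through to the default hostgroup.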
III. Testing
1. Simulate a master outage
Analysis: how writes through ProxySQL behave after the master goes down
Plan: when the master fails, use MHA to fail over manually and move the VIP to the slave 192.168.1.16, so that 192.168.1.16 then carries the VIP 192.168.1.101
admin@ 00:17: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
2 rows in set (0.00 sec)
As shown above, in mysql_servers the writer hostname is 192.168.1.101 and the reader is 192.168.1.16. Does that mean writes work right away once the master has died and a manual switchover has been done? Let's test.
Simulate the master going down on the master node
[root@mysql3 masterha]# ps -ef |grep mysql
mysql      2020  65360  0 Feb10 pts/1    00:00:58 mysqld --defaults-file=/etc/my.cnf
root       5356  65360  0 00:43 pts/1    00:00:00 grep mysql
[root@mysql3 masterha]# kill -9 2020
Then check on the application port 6033 whether writes work: they fail with a timeout
sbuser@ 00:57: [z1_email]> insert into a1 values(158);
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 100 after 10001ms
Then check on the application port 6033 whether reads work: they also fail with a timeout (this is odd; in principle reads should still work)
sbuser@ 18:59: [z1_email]> select * from a1;
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 1000 after 10000ms
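The original does not investigate why reads time out as well. One way to look into it is through ProxySQL's own view of the backends. The sketch below prints three admin-interface queries: the runtime server status, the most recent ping checks and the most recent replication-lag checks. The tables are ProxySQL's built-in runtime and monitor tables; the function name diag_sql and the LIMIT values are illustrative choices.

```shell
#!/bin/bash
# Sketch: print admin-interface queries that help explain a 9001 timeout.
# The function name and LIMIT values are illustrative.
diag_sql() {
  cat <<'EOF'
SELECT hostgroup_id, hostname, port, status FROM runtime_mysql_servers;
SELECT * FROM monitor.mysql_server_ping_log ORDER BY time_start_us DESC LIMIT 5;
SELECT * FROM monitor.mysql_server_replication_lag_log ORDER BY time_start_us DESC LIMIT 5;
EOF
}

diag_sql
```

The statements can be piped into the admin port, for example: diag_sql | mysql -uadmin -padmin -P6032 -h127.0.0.1. A reader whose status is not ONLINE, or whose recent checks report errors, would point to the monitor rather than the application connection as the place to look.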
Now perform the manual switchover
masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.1.135 --master_state=dead --new_master_host=192.168.1.16 --ignore_last_failover
The switchover is now complete and the VIP has moved to 192.168.1.16
[root@mysql2 masterha]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:92:bf:e3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.16/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.101/24 scope global secondary eth0
    inet6 fe80::20c:29ff:fe92:bfe3/64 scope link
       valid_lft forever preferred_lft forever
Going back to the application port 6033 and repeating the insert and select, both reads and writes now work
sbuser@ 19:08: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
+------+
36 rows in set (0.00 sec)
sbuser@ 19:08: [z1_email]> insert into a1 values(1590);
Query OK, 1 row affected (0.00 sec)
Once the old master has recovered, point it at the new master with CHANGE MASTER
root@ 19:21: [(none)]> change master to master_host='192.168.1.16',
    -> master_user='slave',
    -> master_password='oracle',
    -> master_auto_position=1;
After replication is started (start slave), the status shows that master/slave replication is OK
root@ 19:51: [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.16
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000018
          Read_Master_Log_Pos: 2231
               Relay_Log_File: mysql3-relay-bin.000002
                Relay_Log_Pos: 675
        Relay_Master_Log_File: mysql-bin.000018
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 2231
              Relay_Log_Space: 883
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 330616
                  Master_UUID: 25aa2017-083b-11e8-b78a-000c2992bfe3
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 25aa2017-083b-11e8-b78a-000c2992bfe3:59554
            Executed_Gtid_Set: 25aa2017-083b-11e8-b78a-000c2992bfe3:1-59554,
7af79590-0840-11e8-ac17-000c29459399:1-10
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)
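The rejoin step above can be scripted so the same statement is produced for whichever host ends up as the new master. This is a minimal sketch that assumes GTID auto-positioning, as in the walkthrough. The function name build_rejoin_sql is hypothetical, and the START SLAVE line is included because CHANGE MASTER TO on its own does not start replication.

```shell
#!/bin/bash
# Sketch: print the SQL that turns a recovered old master into a slave
# of the new master, using GTID auto-positioning.
build_rejoin_sql() {
  local host="$1" user="$2" pass="$3"
  cat <<EOF
CHANGE MASTER TO
  MASTER_HOST='$host',
  MASTER_USER='$user',
  MASTER_PASSWORD='$pass',
  MASTER_AUTO_POSITION=1;
START SLAVE;
EOF
}

build_rejoin_sql 192.168.1.16 slave oracle
```

The output is meant to be reviewed and then run on the recovered host (192.168.1.135 here) with the mysql client.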
Now look at the ProxySQL admin port again. It lists only 192.168.1.16 and the VIP 192.168.1.101, and the VIP has already moved to the 192.168.1.16 machine.
[root@mysql2 opt]# mysql -uadmin -padmin -P6032 -h127.0.0.1
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 19
Server version: 5.5.30 (ProxySQL Admin Module)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

admin@ 19:53: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
2 rows in set (0.00 sec)
Next, add 192.168.1.135 to the read hostgroup (1000). Here I give it a weight of 9.
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(1000,'192.168.1.135',3306,9,1000,10,'test proxysql');
admin@ 19:59: [(none)]> load mysql servers to runtime;
Query OK, 0 rows affected (0.01 sec)
admin@ 19:59: [(none)]> save mysql servers to disk;
Query OK, 0 rows affected (0.05 sec)
Check runtime_mysql_servers. With weights 9 and 1 in hostgroup 1000, roughly nine out of ten reads should go to 192.168.1.135 and one out of ten to 192.168.1.16, while all writes go to 192.168.1.16 (because the VIP 192.168.1.101 is on .16).
admin@ 19:59: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
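The nine-in-ten figure follows from the weights: a server's expected share of reads is its weight divided by the sum of the weights in the hostgroup. The small sketch below does only that arithmetic. It assumes ProxySQL picks among the ONLINE servers of a hostgroup in proportion to weight, so the result is a long-run average and not a per-query guarantee. The function name read_shares is made up for this example.

```python
from fractions import Fraction


def read_shares(weights):
    """Expected share of traffic per host: weight / sum of weights in the hostgroup."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one server needs a positive weight")
    return {host: Fraction(weight, total) for host, weight in weights.items()}


# Weights of hostgroup 1000 as shown in runtime_mysql_servers above.
reader_weights = {"192.168.1.135": 9, "192.168.1.16": 1}

for host, share in read_shares(reader_weights).items():
    print(f"{host}: {float(share):.0%} of reads")
```

With these weights the sketch reports 90% for 192.168.1.135 and 10% for 192.168.1.16.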
The master is now 192.168.1.16. What happens if it goes down at this point? Will reads and writes through ProxySQL still be affected?
Let's simulate a master failure again, this time with 192.168.1.16 as the master.
[root@mysql2 opt]# ps -ef |grep mysql
root      2976 21612  0 19:50 pts/4    00:00:00 mysql -uroot -px xxxx
root      2983  6583  0 19:53 pts/3    00:00:00 mysql -uadmin -px xxx -P6032 -h127.0.0.1
root      3369 15620  0 22:09 pts/1    00:00:00 grep mysql
mysql    28714 15620  0 Feb10 pts/1    00:01:51 mysqld --defaults-file=/etc/my.cnf
root     31851 21524  0 Feb10 pts/0    00:00:00 mysql -usbuser -px xxxx -P6033 -h192.168.1.16
[root@mysql2 opt]#
[root@mysql2 opt]#
[root@mysql2 opt]# kill -9 28714
A read through ProxySQL's application port 6033 now times out:
sbuser@ 20:35: [z1_email]> select * from a1;
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 1000 after 10000ms
A write through port 6033 times out as well:
sbuser@ 22:17: [z1_email]> insert into a1 values(1591);
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 100 after 10001ms
At this point we run a manual failover with MHA:
masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.1.16 --master_state=dead --new_master_host=192.168.1.135 --ignore_last_failover
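When the same failover has to be run in either direction, it helps to build the command from two host names so that the dead and new master are never mixed up. The sketch below only assembles and prints the command line. It uses exactly the options shown above and does not execute anything. The helper name build_failover_command is hypothetical, and the default paths are the ones used in this article.

```python
import shlex


def build_failover_command(dead_master, new_master,
                           global_conf="/etc/masterha/masterha_default.conf",
                           app_conf="/etc/masterha/app1.conf"):
    """Assemble the manual-failover argv used in this article (nothing is executed)."""
    if dead_master == new_master:
        raise ValueError("dead master and new master must be different hosts")
    return [
        "masterha_master_switch",
        f"--global_conf={global_conf}",
        f"--conf={app_conf}",
        f"--dead_master_host={dead_master}",
        "--master_state=dead",
        f"--new_master_host={new_master}",
        "--ignore_last_failover",
    ]


# Same direction as the failover in this test: .16 is dead, .135 takes over.
cmd = build_failover_command("192.168.1.16", "192.168.1.135")
print(shlex.join(cmd))
```

Printing the command first gives a chance to review it before pasting it into the shell on the manager node.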
The VIP has now moved to 192.168.1.135. Let's look at the ProxySQL admin port 6032:
admin@ 20:35: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | SHUNNED | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
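Since this test keeps re-reading the same table to see which backends are not ONLINE, here is a small sketch that pulls that answer out of the client's table output. It is a plain text parser with no ProxySQL connection. The names parse_mysql_table and not_online are made up for this example, and the sample is the table shown above.

```python
def parse_mysql_table(text):
    """Parse the mysql client's boxed table output into a list of row dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders, prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first | line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def not_online(rows):
    """Return (hostgroup_id, hostname) for every backend whose status is not ONLINE."""
    return [(r["hostgroup_id"], r["hostname"]) for r in rows if r["status"] != "ONLINE"]


sample = """
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | SHUNNED | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
"""

for hostgroup, host in not_online(parse_mysql_table(sample)):
    print(f"hostgroup {hostgroup}: {host} is not ONLINE")
```

For the table above it flags the VIP in hostgroup 100 and 192.168.1.16 in hostgroup 1000.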
Go back to ProxySQL's port 6033 and check whether reads work, since 192.168.1.135 is still ONLINE:
sbuser@ 22:22: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
| 1590 |
+------+
37 rows in set (0.00 sec)
Reads work. The VIP has already moved to 192.168.1.135, so can we write as well?
sbuser@ 22:23: [z1_email]> insert into a1 values(1591);
Query OK, 1 row affected (0.14 sec)
Writes work too. Back on the admin port 6032, the VIP 192.168.1.101 has, somewhat surprisingly, gone back to ONLINE:
admin@ 22:21: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE  | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
3 rows in set (0.01 sec)
My reading is that ProxySQL had not yet picked up the fact that the VIP had moved, so it still showed SHUNNED. The status seems to be refreshed only after traffic reaches the hostgroup again. It is a display lag and does not affect use.
Finally, bring 192.168.1.16 back up and point it at the new master 192.168.1.135 with change master:
[root@mysql2 masterha]# mysqld --defaults-file=/etc/my.cnf &
[2] 3490
[root@mysql2 masterha]# mysql -uroot -poracle
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

root@ 22:28: [(none)]> change master to master_host='192.168.1.135',
    -> master_user='slave',
    -> master_password='oracle',
    -> master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.07 sec)

root@ 22:35: [(none)]> start slave;
Query OK, 0 rows affected (0.02 sec)

root@ 22:36: [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.135
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000016
Read_Master_Log_Pos: 743
Relay_Log_File: mysql2-relay-bin.000002
Relay_Log_Pos: 675
Relay_Master_Log_File: mysql-bin.000016
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
........
Check the ProxySQL admin port 6032 again: 192.168.1.16 is still shown as SHUNNED.
admin@ 22:24: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE  | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
3 rows in set (0.01 sec)
Run one query through ProxySQL's application port 6033:
sbuser@ 22:23: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
| 1590 |
| 1591 |
+------+
38 rows in set (0.00 sec)
Check the ProxySQL admin port 6032 once more. Everything now shows ONLINE:
admin@ 22:37: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
3 rows in set (0.01 sec)
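To sum up:
MHA + ProxySQL provides both high availability and read/write splitting. When the master dies, MHA switches to the slave and moves the VIP with it. Because the write node in ProxySQL is configured as the VIP,
writes always land on the current master: whichever machine holds the VIP is the master.
With a ProxySQL layout like the one below, it does not matter which machine goes down: once the switch has been performed, reads and writes keep working. Before the switch, both reads and writes timed out in the test above.
admin@ 22:37: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
3 rows in set (0.01 sec)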