Environment
vip 192.168.1.101
slave 192.168.1.16 5.7.17 3306
master 192.168.1.135 5.7.17 3306
proxysql 192.168.1.16 (for convenience, ProxySQL runs on the .16 node)
I. Setting up MHA
1. Install MHA. First install the EPEL repository (on both machines).
rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
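2. Install the required Perl modules (on both machines)
yum install perl-DBD-MySQL
yum install perl-Config-Tiny
yum install perl-Log-Dispatch
yum install perl-Parallel-ForkManager
3. Install the MHA packages (on both machines; installing both the node and the manager package on each is recommended, since it makes switching over easier)
rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
4. Set up SSH trust between the machines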
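The post does not list the commands for this step. The following is a minimal sketch, assuming root logins and the two hosts from the environment section; the function name `ssh_trust` and the `RUN` dry-run switch are hypothetical. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: passwordless SSH for root between the two MHA hosts.
# RUN=echo (the default here) only prints the commands; set RUN= to execute them.
RUN=${RUN-echo}
HOSTS="192.168.1.16 192.168.1.135"

ssh_trust() {
    # Create a key pair without a passphrase if none exists yet.
    if [ ! -f "$HOME/.ssh/id_rsa" ]; then
        $RUN ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    fi
    # Install the public key on every host, including this one.
    for h in $HOSTS; do
        $RUN ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@$h"
    done
}

ssh_trust
```

This would have to be run on each of the two machines so that trust works in both directions, which is what masterha_check_ssh verifies later.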
5. Grant privileges
GRANT ALL PRIVILEGES ON *.* TO 'zhuch'@'%' IDENTIFIED BY "zhuch";
GRANT REPLICATION SLAVE ON *.* TO 'slave'@'%' IDENTIFIED BY "oracle";
6. Create the application directory
mkdir /etc/masterha
Copy the following files into /etc/masterha
[root@mysql3 masterha]# ls -l
total 32
-rw-r--r--. 1 root root   509 Feb 10 02:29 app1.conf
-rw-r--r--. 1 root root    55 Feb 10 03:15 drop_vip.sh
-rw-r--r--. 1 root root    57 Feb 10 03:15 init_vip.sh
-rw-r--r--. 1 root root   354 Feb 10 02:25 masterha_default.conf
-rwxr-xr-x. 1 root root  3978 Feb 10 03:16 master_ip_failover
-rwxr-xr-x. 1 root root 10390 Feb 10 03:17 master_ip_online_change
app1.conf: the MHA application configuration file (the unpacked software package contains sample configuration files; here we simply create a new one and edit it)
[root@mysql3 masterha]# cat app1.conf
[server default]
# MHA manager working directory
manager_workdir = /var/log/masterha/app1
manager_log = /var/log/masterha/app1/app1.log
remote_workdir = /var/log/masterha/app1
[server1]
hostname=192.168.1.16
master_binlog_dir = /data/mysql/mysql3306/logs
candidate_master = 1
# Ignore replication delay when choosing the new master, so that a failover does not stall on a lagging slave when the master fails.
check_repl_delay = 0
[server3]
hostname=192.168.1.135
master_binlog_dir=/data/mysql/mysql3306/logs
candidate_master = 1
check_repl_delay = 0
drop_vip.sh: unbinds the VIP
[root@mysql3 masterha]# cat drop_vip.sh
vip="192.168.1.101/24"
/sbin/ip addr del $vip dev eth0
init_vip.sh: binds the VIP
[root@mysql3 masterha]# cat init_vip.sh
vip="192.168.1.101/24"
/sbin/ip addr add $vip dev eth0
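Since these two scripts move the VIP by hand, it helps to be able to check where it currently lives. A minimal sketch, assuming the `ip addr show` output format of this environment; `has_vip` is a hypothetical helper and not part of MHA.

```shell
#!/bin/sh
# Sketch: report whether a VIP is bound, given `ip addr show` output on stdin.
# has_vip is a hypothetical helper, not part of MHA.
has_vip() {
    # Match "inet <vip>/" literally so 192.168.1.10 does not match 192.168.1.101.
    grep -qF "inet $1/"
}

# Typical use on a database host:
#   ip addr show dev eth0 | has_vip 192.168.1.101 && echo "VIP is bound here"
```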
masterha_default.conf: global configuration file
[root@mysql3 masterha]# cat masterha_default.conf
[server default]
# MySQL user and password
user=zhuch
password=zhuch
# System SSH user
ssh_user=root
# Replication user
repl_user=slave
repl_password=oracle
# Monitoring
ping_interval=1
#shutdown_script=""
# Scripts invoked on switchover
master_ip_failover_script= /etc/masterha/master_ip_failover
master_ip_online_change_script= /etc/masterha/master_ip_online_change
master_ip_failover: automatic failover script

[root@mysql3 masterha]# cat master_ip_failover
#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

# VIP for this group of servers (customize for your environment)
my $vip = "192.168.1.101";
my $if  = "eth0";

my (
  $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
  $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port,
  $new_master_user,  $new_master_password
);
GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

# Remove the VIP from the old master over SSH, then add it on the new master.
sub add_vip {
  my $output1 =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $orig_master_host /sbin/ip addr del $vip/24 dev $if`;
  my $output2 =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $new_master_host /sbin/ip addr add $vip/24 dev $if`;
}

exit &main();

sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {

      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      #print "Creating app user on the new master..\n";
      #FIXME_xxx_create_user( $new_master_handler->{dbh} );
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      &add_vip();
      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
master_ip_online_change: script for manual (online) switchover

[root@mysql3 masterha]# cat master_ip_online_change
#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;

# VIP definition (added for this environment)
my $vip = "192.168.1.101";
my $if  = "eth0";

my (
  $command,              $orig_master_is_new_slave, $orig_master_host,
  $orig_master_ip,       $orig_master_port,         $orig_master_user,
  $orig_master_password, $orig_master_ssh_user,     $new_master_host,
  $new_master_ip,        $new_master_port,          $new_master_user,
  $new_master_password,  $new_master_ssh_user,
);
GetOptions(
  'command=s'                => \$command,
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  'orig_master_host=s'       => \$orig_master_host,
  'orig_master_ip=s'         => \$orig_master_ip,
  'orig_master_port=i'       => \$orig_master_port,
  'orig_master_user=s'       => \$orig_master_user,
  'orig_master_password=s'   => \$orig_master_password,
  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,
  'new_master_host=s'        => \$new_master_host,
  'new_master_ip=s'          => \$new_master_ip,
  'new_master_port=i'        => \$new_master_port,
  'new_master_user=s'        => \$new_master_user,
  'new_master_password=s'    => \$new_master_password,
  'new_master_ssh_user=s'    => \$new_master_ssh_user,
);

exit &main();

# Remove the VIP from the original master over SSH.
sub drop_vip {
  my $output =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $orig_master_host /sbin/ip addr del $vip/24 dev $if`;

  # kill all existing MySQL connections
  #FIXME
}

# Add the VIP on the new master over SSH.
sub add_vip {
  my $output =
`ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $new_master_host /sbin/ip addr add $vip/24 dev $if`;
}

sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command)    && $command eq "Binlog Dump" );
    next if ( defined($user)       && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );

    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }

    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }

    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
      $orig_master_handler->disable_log_bin_local();

      # print current_time_us() . " Drpping app user on the orig master..\n";
      print current_time_us() . " drop vip $vip..\n";

      #drop_app_user($orig_master_handler);
      &drop_vip();

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database

# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
# If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      #print current_time_us() . " Creating app user on the new master..\n";
      print current_time_us() . "Add vip $vip on $if..\n";

      # create_app_user($new_master_handler);
      &add_vip();
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}
7. Bind the VIP on the master (run the script)
sh init_vip.sh
8. Check that SSH works
[root@mysql2 opt]# masterha_check_ssh --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf
Sat Feb 10 22:00:34 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Sat Feb 10 22:00:34 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:00:34 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:00:34 2018 - [info] Starting SSH connection tests..
Sat Feb 10 22:00:36 2018 - [debug]
Sat Feb 10 22:00:35 2018 - [debug]  Connecting via SSH from root@192.168.1.135(192.168.1.135:22) to root@192.168.1.16(192.168.1.16:22)..
Sat Feb 10 22:00:36 2018 - [debug]   ok.
Sat Feb 10 22:00:41 2018 - [debug]
Sat Feb 10 22:00:34 2018 - [debug]  Connecting via SSH from root@192.168.1.16(192.168.1.16:22) to root@192.168.1.135(192.168.1.135:22)..
Sat Feb 10 22:00:41 2018 - [debug]   ok.
Sat Feb 10 22:00:41 2018 - [info] All SSH connection tests passed successfully.
9. Check that master-slave replication is healthy
[root@mysql2 opt]# masterha_check_repl --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf
Sat Feb 10 22:26:50 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Sat Feb 10 22:26:50 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:26:50 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:26:50 2018 - [info] MHA::MasterMonitor version 0.56.
Sat Feb 10 22:26:50 2018 - [info] GTID failover mode = 1
Sat Feb 10 22:26:50 2018 - [info] Dead Servers:
Sat Feb 10 22:26:50 2018 - [info] Alive Servers:
Sat Feb 10 22:26:50 2018 - [info]   192.168.1.16(192.168.1.16:3306)
Sat Feb 10 22:26:50 2018 - [info]   192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info] Alive Slaves:
Sat Feb 10 22:26:50 2018 - [info]   192.168.1.16(192.168.1.16:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Sat Feb 10 22:26:50 2018 - [info]     GTID ON
Sat Feb 10 22:26:50 2018 - [info]     Replicating from 192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Feb 10 22:26:50 2018 - [info] Current Alive Master: 192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info] Checking slave configurations..
Sat Feb 10 22:26:50 2018 - [info]  read_only=1 is not set on slave 192.168.1.16(192.168.1.16:3306).
Sat Feb 10 22:26:50 2018 - [info] Checking replication filtering settings..
Sat Feb 10 22:26:50 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Sat Feb 10 22:26:50 2018 - [info]  Replication filtering check ok.
Sat Feb 10 22:26:50 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sat Feb 10 22:26:50 2018 - [info] Checking SSH publickey authentication settings on the current master..
Sat Feb 10 22:26:51 2018 - [info] HealthCheck: SSH to 192.168.1.135 is reachable.
Sat Feb 10 22:26:51 2018 - [info]
192.168.1.135(192.168.1.135:3306) (current master)
 +--192.168.1.16(192.168.1.16:3306)
Sat Feb 10 22:26:51 2018 - [info] Checking replication health on 192.168.1.16..
Sat Feb 10 22:26:51 2018 - [info]  ok.
Sat Feb 10 22:26:51 2018 - [info] Checking master_ip_failover_script status:
Sat Feb 10 22:26:51 2018 - [info]   /etc/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.135 --orig_master_ip=192.168.1.135 --orig_master_port=3306
Sat Feb 10 22:26:51 2018 - [info]  OK.
Sat Feb 10 22:26:51 2018 - [warning] shutdown_script is not defined.
Sat Feb 10 22:26:51 2018 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
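The last line of that output is the one that matters. As a rough sketch, a wrapper could test for it instead of reading the whole log; `repl_health_ok` is a hypothetical helper name, and it only looks for the exact summary string shown above.

```shell
#!/bin/sh
# Sketch: succeed only if masterha_check_repl reported a healthy topology.
# repl_health_ok is a hypothetical helper; it reads the tool's output on stdin.
repl_health_ok() {
    grep -q "MySQL Replication Health is OK"
}

# Typical use:
#   masterha_check_repl --global_conf=/etc/masterha/masterha_default.conf \
#       --conf=/etc/masterha/app1.conf 2>&1 | repl_health_ok || echo "replication check failed"
```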
10. On the slave, set relay_log_purge=0 and read_only=1 (read-only)
set global relay_log_purge=0;
set global read_only=1;
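Relay logs may be needed when MHA applies differential relay log events to other slaves. With a single master and a single slave, as here, this setting is not strictly necessary. If you do set relay_log_purge=0 and are worried about relay logs piling up on the slave, you can delete them on a schedule with the purge_relay_logs command, which ships with MHA.
It can be wrapped in a script and run periodically, for example:
#!/bin/bash
user=zhuch
passwd=zhuch
port=3306
log_dir='/etc/masterha/log'
work_dir='/etc/masterha/relay_log_node'
purge='/usr/bin/purge_relay_logs'

if [ ! -d "$log_dir" ]
then
    mkdir -p "$log_dir"
fi

if [ ! -d "$work_dir" ]
then
    mkdir -p "$work_dir"
fi

$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1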
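The scheduling itself is not shown in the post. One way to do it is a cron entry; the sketch below only prints such an entry. The script path `/etc/masterha/purge_relay_log.sh` and the 04:00 schedule are assumptions, not values from the original setup.

```shell
#!/bin/sh
# Sketch: build a crontab entry that runs the purge script daily at 04:00.
# The script path is an assumption; adjust it to where the script is saved.
PURGE_SCRIPT=/etc/masterha/purge_relay_log.sh

purge_cron_line() {
    printf '%s\n' "0 4 * * * /bin/bash $PURGE_SCRIPT"
}

purge_cron_line
# To install it, append the line to root's crontab:
#   ( crontab -l 2>/dev/null; purge_cron_line ) | crontab -
```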
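At this point MHA is basically set up: when the master goes down, MHA switches over to the slave, and the VIP moves to the slave as well.
II. Installing and configuring ProxySQL
1. Install
Download: https://www.percona.com/downloads/proxysql/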
rpm -ivh proxysql-1.4.3-1-centos67.x86_64.rpm
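2. Configure. Log in to the ProxySQL admin interface and add the MySQL master and slave: the master goes into the writer hostgroup (hostgroup_id 100), and the slave, which serves reads, goes into hostgroup 1000.
mysql -uadmin -padmin -P6032 -h127.0.0.1
Note, however, that here I set the writer node directly to the VIP 192.168.1.101.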
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(100,'192.168.1.101',3306,1,1000,10,'vip');
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(1000,'192.168.1.16',3306,1,1000,10,'slave');
admin@ 23:16: [(none)]> select * from mysql_servers;
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
| hostgroup_id | hostname      | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment       |
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      | 0           | 1000            | 10                  | 0       | 0              | test proxysql |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      | 0           | 1000            | 10                  | 0       | 0              | test proxysql |
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
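3. Configure the MySQL users used against the backends. They must already exist on the backend MySQL servers (135 and 16). One is the monitoring account, the other the application account:
GRANT ALL PRIVILEGES ON *.* TO 'proxysql'@'192.168.1.16' identified by 'proxysql';
GRANT ALL PRIVILEGES ON *.* TO 'sbuser'@'%' identified by 'sbuser';
After adding them on the backend MySQL servers, configure ProxySQL. Note that default_hostgroup must correspond to the hostgroups defined above.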
insert into mysql_users(username,password,active,default_hostgroup,transaction_persistent) values('sbuser','sbuser',1,100,1);
admin@ 23:37: [(none)]> select * from mysql_users;
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| username | password                                  | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| sbuser   | sbuser                                    | 1      | 0       | 100               |                | 0             | 1                      | 0            | 1       | 1        | 10000           |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
4. Set the health-check (monitor) account
admin@ 23:37: [(none)]>set mysql-monitor_username='proxysql';
admin@ 23:37: [(none)]>set mysql-monitor_password='proxysql';
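-- Apply to runtime
load mysql servers to runtime;
load mysql users to runtime;
load mysql variables to runtime;
-- Persist to disk
save mysql servers to disk;
save mysql users to disk;
save mysql variables to disk;
If mysql_users was populated with plaintext passwords, the following save command can be used at this point to replace them with their hashed form: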
save mysql users to mem;
admin@ 23:39: [(none)]> select * from mysql_users;
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| username | password                                  | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| sbuser   | *CA96E56547F43610DDE9EB7B12B4EF4C51CDDFFC | 1      | 0       | 100               |                | 0             | 1                      | 0            | 1       | 1        | 10000           |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
5. Configure routing
-- Route to the master (M)
admin@127.0.0.1 : (none) 04:58:11>INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT.*FOR UPDATE$',100,1);
Query OK, 1 row affected (0.00 sec)
-- Route to the slave (S)
admin@127.0.0.1 : (none) 05:08:17>INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT',1000,1);
Query OK, 1 row affected (0.00 sec)
admin@127.0.0.1 : (none) 05:09:37>load mysql query rules to runtime;
Query OK, 0 rows affected (0.00 sec)
admin@127.0.0.1 : (none) 05:09:57>save mysql query rules to disk;
Query OK, 0 rows affected (0.00 sec)
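To make the effect of the two rules concrete, here is a small sketch that imitates the decision for a single statement. `route_query` is a hypothetical helper; ProxySQL does this matching internally. Case-insensitive matching is assumed, which agrees with the lowercase `select` being routed to hostgroup 1000 in the statistics later in this post, and the fallback of 100 is the `default_hostgroup` configured for `sbuser`.

```shell
#!/bin/sh
# Sketch: mimic how the two rules above pick a hostgroup for a statement.
# route_query is a hypothetical helper; ProxySQL does this matching internally.
route_query() {
    # Rule 1: SELECT ... FOR UPDATE goes to the writer hostgroup.
    if printf '%s\n' "$1" | grep -Eiq '^SELECT.*FOR UPDATE$'; then
        echo 100
    # Rule 2: every other SELECT goes to the reader hostgroup.
    elif printf '%s\n' "$1" | grep -Eiq '^SELECT'; then
        echo 1000
    # No rule matched: fall back to the user's default_hostgroup (100).
    else
        echo 100
    fi
}

route_query "select * from a1"
route_query "insert into a1 values(134)"
route_query "SELECT id FROM a1 WHERE id = 1 FOR UPDATE"
```

The order matters: the more specific `FOR UPDATE` rule has to be evaluated before the general `^SELECT` rule, which is why it was inserted first.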
6. Connect to port 6033 and test read/write splitting
[root@mysql2 sysbench]# mysql -usbuser -psbuser -P6033 -h192.168.1.16
sbuser@ 23:59: [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| z1_email           |
| z1_exchange        |
| z1_relation        |
+--------------------+
7 rows in set (0.03 sec)
sbuser@ 00:02: [(none)]> use z1_email;
Database changed, 2 warnings
sbuser@ 00:02: [z1_email]> insert into a1 values(134);
Query OK, 1 row affected (0.01 sec)
sbuser@ 00:03: [z1_email]> insert into a1 values(146);
Query OK, 1 row affected (0.01 sec)
sbuser@ 00:03: [z1_email]> insert into a1 values(157);
Query OK, 1 row affected (0.02 sec)
sbuser@ 00:03: [z1_email]> selet * from a1;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'selet * from a1' at line 1
sbuser@ 00:03: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
+------+
36 rows in set (0.00 sec)
Log in to the admin interface on port 6032 and check: read/write splitting is indeed working.
admin@ 00:10: [(none)]> select * from stats_mysql_query_digest;
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
| hostgroup | schemaname         | username | digest             | digest_text              | count_star | first_seen | last_seen  | sum_time | min_time | max_time |
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
| 1000      | z1_email           | sbuser   | 0xB17CC7AAA7E39A4A | select * from a1         | 1          | 1518278606 | 1518278606 | 2123     | 2123     | 2123     |
| 100       | z1_email           | sbuser   | 0x496C8B86BBC0D398 | insert into a1 values(?) | 3          | 1518278580 | 1518278588 | 30478    | 6373     | 16671    |
| 1000      | information_schema | sbuser   | 0x620B328FE9D6D71A | SELECT DATABASE()        | 1          | 1518278568 | 1518278568 | 508      | 508      | 508      |
| 100       | information_schema | sbuser   | 0x02033E45904D3DF0 | show databases           | 1          | 1518278563 | 1518278563 | 30233    | 30233    | 30233    |
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
4 rows in set (0.00 sec)
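With more traffic the digest table gets long, so a per-hostgroup total is easier to read. A minimal sketch, assuming tab-separated `hostgroup` and `count_star` columns on stdin; `sum_by_hostgroup` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: total query count per hostgroup from "hostgroup<TAB>count_star" rows on stdin.
# sum_by_hostgroup is a hypothetical helper.
sum_by_hostgroup() {
    awk -F'\t' '{ total[$1] += $2 } END { for (hg in total) print hg, total[hg] }' | sort -n
}

# Typical use (batch mode prints tab-separated rows, -N drops the header):
#   mysql -uadmin -padmin -P6032 -h127.0.0.1 -N -B \
#       -e "select hostgroup, count_star from stats_mysql_query_digest" | sum_by_hostgroup
```

For the four rows above this gives 4 statements on hostgroup 100 and 2 on hostgroup 1000.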
III. Testing
1. Simulate a master crash
Analysis: how writes through ProxySQL behave after the master goes down
When the master fails, MHA manual failover moves the VIP to the slave 192.168.1.16, so that 192.168.1.16 then carries the VIP 192.168.1.101.
admin@ 00:17: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
2 rows in set (0.00 sec)
As shown above, in mysql_servers the writer hostname is 192.168.1.101 and the reader is 192.168.1.16. Does that mean writes work right away once the master has died and we have switched over manually? Let's test.
Simulate the crash on the master node:
[root@mysql3 masterha]# ps -ef |grep mysql
mysql      2020  65360  0 Feb10 pts/1    00:00:58 mysqld --defaults-file=/etc/my.cnf
root       5356  65360  0 00:43 pts/1    00:00:00 grep mysql
[root@mysql3 masterha]# kill -9 2020
Then check on the application port 6033 whether writes work. The insert fails with a timeout:
sbuser@ 00:57: [z1_email]> insert into a1 values(158);
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 100 after 10001ms
Then check on port 6033 whether reads work. They also fail with a timeout (this is odd; in principle reads should still work):
sbuser@ 18:59: [z1_email]> select * from a1;
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 1000 after 10000ms
Now perform the manual switchover:
masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.1.135 --master_state=dead --new_master_host=192.168.1.16 --ignore_last_failover
The switchover is now complete, and the VIP has moved to 192.168.1.16:
[root@mysql2 masterha]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:92:bf:e3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.16/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.101/24 scope global secondary eth0
    inet6 fe80::20c:29ff:fe92:bfe3/64 scope link
       valid_lft forever preferred_lft forever
Now run a select and an insert on the application port 6033 again. Both reads and writes work:
sbuser@ 19:08: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
+------+
36 rows in set (0.00 sec)
sbuser@ 19:08: [z1_email]> insert into a1 values(1590);
Query OK, 1 row affected (0.00 sec)
Once the old master has been brought back up, point it at the new master with CHANGE MASTER:
root@ 19:21: [(none)]> change master to master_host='192.168.1.16',
    -> master_user='slave',
    -> master_password='oracle',
    -> master_auto_position=1;
Replication status is OK:
root@ 19:51: [(none)]> show slave status\G;
*************************** 1. row ***************************
  Slave_IO_State: Waiting for master to send event
  Master_Host: 192.168.1.16
  Master_User: slave
  Master_Port: 3306
  Connect_Retry: 60
  Master_Log_File: mysql-bin.000018
  Read_Master_Log_Pos: 2231
  Relay_Log_File: mysql3-relay-bin.000002
  Relay_Log_Pos: 675
  Relay_Master_Log_File: mysql-bin.000018
  Slave_IO_Running: Yes
  Slave_SQL_Running: Yes
  Replicate_Do_DB:
  Replicate_Ignore_DB:
  Replicate_Do_Table:
  Replicate_Ignore_Table:
  Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
  Last_Errno: 0
  Last_Error:
  Skip_Counter: 0
  Exec_Master_Log_Pos: 2231
  Relay_Log_Space: 883
  Until_Condition: None
  Until_Log_File:
  Until_Log_Pos: 0
  Master_SSL_Allowed: No
  Master_SSL_CA_File:
  Master_SSL_CA_Path:
  Master_SSL_Cert:
  Master_SSL_Cipher:
  Master_SSL_Key:
  Seconds_Behind_Master: 0
  Master_SSL_Verify_Server_Cert: No
  Last_IO_Errno: 0
  Last_IO_Error:
  Last_SQL_Errno: 0
  Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
  Master_Server_Id: 330616
  Master_UUID: 25aa2017-083b-11e8-b78a-000c2992bfe3
  Master_Info_File: mysql.slave_master_info
  SQL_Delay: 0
  SQL_Remaining_Delay: NULL
  Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
  Master_Retry_Count: 86400
  Master_Bind:
  Last_IO_Error_Timestamp:
  Last_SQL_Error_Timestamp:
  Master_SSL_Crl:
  Master_SSL_Crlpath:
  Retrieved_Gtid_Set: 25aa2017-083b-11e8-b78a-000c2992bfe3:59554
  Executed_Gtid_Set: 25aa2017-083b-11e8-b78a-000c2992bfe3:1-59554,
7af79590-0840-11e8-ac17-000c29459399:1-10
  Auto_Position: 1
  Replicate_Rewrite_DB:
  Channel_Name:
  Master_TLS_Version:
1 row in set (0.00 sec)
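The two fields to look at in that output are Slave_IO_Running and Slave_SQL_Running. As a sketch, the check can be scripted; `replication_ok` is a hypothetical helper that reads the `show slave status\G` output on stdin.

```shell
#!/bin/sh
# Sketch: succeed only if both replication threads report "Yes".
# replication_ok is a hypothetical helper; it reads `show slave status\G` output on stdin.
replication_ok() {
    status=$(cat)
    printf '%s\n' "$status" | grep -q "Slave_IO_Running: Yes" &&
        printf '%s\n' "$status" | grep -q "Slave_SQL_Running: Yes"
}

# Typical use on the rejoined server (credentials omitted):
#   mysql -e 'show slave status\G' | replication_ok && echo "replication is running"
```

The colon directly after the field name keeps the second pattern from matching the separate `Slave_SQL_Running_State` line.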
Now check the ProxySQL admin port again. It lists only 192.168.1.16 and the VIP 192.168.1.101, and the VIP has already moved to the 192.168.1.16 machine.
[root@mysql2 opt]# mysql -uadmin -padmin -P6032 -h127.0.0.1
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 19
Server version: 5.5.30 (ProxySQL Admin Module)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

admin@ 19:53: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
2 rows in set (0.00 sec)
Next, add 192.168.1.135 back to the read hostgroup. Here I give it a weight of 9.
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(1000,'192.168.1.135',3306,9,1000,10,'test proxysql');
admin@ 19:59: [(none)]> load mysql servers to runtime;
Query OK, 0 rows affected (0.01 sec)
admin@ 19:59: [(none)]> save mysql servers to disk;
Query OK, 0 rows affected (0.05 sec)
Check runtime_mysql_servers. Roughly nine tenths of reads will now go to 192.168.1.135 and one tenth to 192.168.1.16, while all writes go to 192.168.1.16 (because the VIP 192.168.1.101 is on .16).
admin@ 19:59: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
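The nine-to-one split follows from the weights in hostgroup 1000: each server's expected share is its weight divided by the sum of the weights of the ONLINE servers in the same hostgroup. A small Python sketch of that arithmetic is below. The helper name read_share and the dict layout are mine, and it only models the expected long-run share, not how ProxySQL actually picks a backend for any single connection.

```python
def read_share(servers, hostgroup_id):
    """Expected share of traffic per ONLINE server in one hostgroup."""
    online = [
        s for s in servers
        if s["hostgroup_id"] == hostgroup_id and s["status"] == "ONLINE"
    ]
    total = sum(s["weight"] for s in online)
    if total == 0:
        return {}  # nothing available in this hostgroup
    return {s["hostname"]: s["weight"] / total for s in online}


# Rows copied from the runtime_mysql_servers output above.
servers = [
    {"hostgroup_id": 100, "hostname": "192.168.1.101", "status": "ONLINE", "weight": 1},
    {"hostgroup_id": 1000, "hostname": "192.168.1.135", "status": "ONLINE", "weight": 9},
    {"hostgroup_id": 1000, "hostname": "192.168.1.16", "status": "ONLINE", "weight": 1},
]

shares = read_share(servers, 1000)
for host, share in shares.items():
    print(host, f"{share:.0%}")
```

The write hostgroup 100 contains only the VIP, so its single entry always gets the full share regardless of its weight.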
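The master is now 192.168.1.16. What happens if it goes down? Will reads and writes through ProxySQL still be affected?
Let's simulate a master failure again. This time the master is 192.168.1.16.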
[root@mysql2 opt]# ps -ef |grep mysql
root      2976 21612  0 19:50 pts/4    00:00:00 mysql -uroot -px xxxx
root      2983  6583  0 19:53 pts/3    00:00:00 mysql -uadmin -px xxx -P6032 -h127.0.0.1
root      3369 15620  0 22:09 pts/1    00:00:00 grep mysql
mysql    28714 15620  0 Feb10 pts/1    00:01:51 mysqld --defaults-file=/etc/my.cnf
root     31851 21524  0 Feb10 pts/0    00:00:00 mysql -usbuser -px xxxx -P6033 -h192.168.1.16
[root@mysql2 opt]#
[root@mysql2 opt]#
[root@mysql2 opt]# kill -9 28714
A read through ProxySQL's application port 6033 now times out:
sbuser@ 20:35: [z1_email]> select * from a1;
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 1000 after 10000ms
A write through port 6033 times out as well:
sbuser@ 22:17: [z1_email]> insert into a1 values(1591);
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 100 after 10001ms
At this point we perform a manual failover with MHA:
masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.1.16 --master_state=dead --new_master_host=192.168.1.135 --ignore_last_failover
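If you run this failover in both directions, as this walkthrough does, it can help to assemble the command from its parts so that only the dead and new master hosts change. The Python sketch below does only that: it builds the argument list and prints it, and does not execute anything. The function name build_failover_command is my own; the flags and config paths are exactly the ones used in the command above.

```python
import shlex


def build_failover_command(dead_master, new_master,
                           global_conf="/etc/masterha/masterha_default.conf",
                           app_conf="/etc/masterha/app1.conf"):
    """Return the masterha_master_switch argv for a dead-master failover."""
    if dead_master == new_master:
        raise ValueError("dead master and new master must be different hosts")
    return [
        "masterha_master_switch",
        f"--global_conf={global_conf}",
        f"--conf={app_conf}",
        f"--dead_master_host={dead_master}",
        "--master_state=dead",
        f"--new_master_host={new_master}",
        "--ignore_last_failover",
    ]


cmd = build_failover_command("192.168.1.16", "192.168.1.135")
# Print the shell-quoted command for review before running it by hand.
print(shlex.join(cmd))
```

Swapping the two arguments gives the command for the opposite direction.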
The VIP has now moved to 192.168.1.135. Let's log in to the ProxySQL admin port 6032 and take a look:
admin@ 20:35: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | SHUNNED | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
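Since we keep running the same query and scanning the status column by eye, here is a rough Python sketch that parses the ASCII table printed by the mysql client and lists every backend that is not ONLINE. The helpers parse_table and not_online are hypothetical names of my own, and the parser assumes that no cell value contains a "|" character.

```python
def parse_table(text):
    """Parse a mysql-client ASCII table into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # borders, prompts and "N rows in set" lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first "|" line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def not_online(rows):
    """Return (hostgroup_id, hostname, status) for every backend that is not ONLINE."""
    return [
        (r["hostgroup_id"], r["hostname"], r["status"])
        for r in rows
        if r["status"] != "ONLINE"
    ]


# The table shown above, right after the failover.
output = """
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | SHUNNED | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
"""

problems = not_online(parse_table(output))
for hostgroup, host, status in problems:
    print(hostgroup, host, status)
```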
Now connect to ProxySQL on port 6033 again and see whether reads work, since 192.168.1.135 is still ONLINE:
sbuser@ 22:22: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
| 1590 |
+------+
37 rows in set (0.00 sec)
Reads work. The VIP has already moved to 192.168.1.135, so can we write as well?
sbuser@ 22:23: [z1_email]> insert into a1 values(1591);
Query OK, 1 row affected (0.14 sec)
Writes work too. Back on the admin port 6032, the VIP 192.168.1.101 has gone back to ONLINE (hmm...):
admin@ 22:21: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE  | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
3 rows in set (0.01 sec)
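So my guess is that ProxySQL did not immediately pick up that the VIP had moved and kept showing SHUNNED. It does not affect usage; only the displayed status lags behind.
Finally, bring 192.168.1.16 back up and CHANGE MASTER it to the new master, 192.168.1.135.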
[root@mysql2 masterha]# mysqld --defaults-file=/etc/my.cnf &
[2] 3490
[root@mysql2 masterha]# mysql -uroot -poracle
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

root@ 22:28: [(none)]> change master to master_host='192.168.1.135',
    -> master_user='slave',
    -> master_password='oracle',
    -> master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.07 sec)

root@ 22:35: [(none)]> start slave;
Query OK, 0 rows affected (0.02 sec)

root@ 22:36: [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.135
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000016
          Read_Master_Log_Pos: 743
               Relay_Log_File: mysql2-relay-bin.000002
                Relay_Log_Pos: 675
        Relay_Master_Log_File: mysql-bin.000016
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
........
Check the ProxySQL admin port 6032 again: 192.168.1.16 is still shown as SHUNNED.
admin@ 22:24: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE  | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
3 rows in set (0.01 sec)
Run one query through ProxySQL's application port 6033:
sbuser@ 22:23: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
| 1590 |
| 1591 |
+------+
38 rows in set (0.00 sec)
Look at the ProxySQL admin port 6032 once more: every server now shows ONLINE.
admin@ 22:37: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
3 rows in set (0.01 sec)
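To summarize:
MHA + ProxySQL gives you both high availability and read/write splitting. When the master dies, MHA fails over to the slave and the VIP moves with it, so configuring the VIP as the write node in ProxySQL
means that writes always land on the master: whichever machine holds the VIP is the master.
With a ProxySQL layout like the one below, it does not matter which machine goes down; once the switchover has been done, neither reads nor writes are affected.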
admin@ 22:37: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
3 rows in set (0.01 sec)