This article comes from my GitHub Pages blog, http://galengao.github.io/ (also reachable at www.gaohuirong.cn).
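Summary:
- This is a write-up of a MySQL MHA setup I built myself.
- The installation steps in the first part are essentially fixed; later parts, such as the keepalived configuration file, can be done in several ways.
- The real goal is a keepalived + LVS + Atlas (MyCat) + MHA + MySQL master-slave replication architecture; MyCat is covered in a separate article.
Disable the firewall and SELinux on every node.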
1. Install the EPEL yum repository

    wget http://mirrors.hustunique.com/epel/6/x86_64/epel-release-6-8.noarch.rpm
    wget http://mirrors.hustunique.com/epel/RPM-GPG-KEY-EPEL-6
    rpm --import RPM-GPG-KEY-EPEL-6
    rpm -ivh epel-release-6-8.noarch.rpm
2. Install the Perl modules required by MHA node (DBD::mysql) on all nodes

    yum -y install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes perl-devel
3. Set up master-slave replication
- On the node1 master (run the same statements on node2 as well)

    grant replication slave on *.* to 'backup'@'192.168.10.%' identified by 'backup';
    grant all privileges on *.* to 'root'@'192.168.10.%' identified by '123456';
    show master status;

- On the slaves (node2 and node3)

    change master to master_host='192.168.10.120', master_user='backup', master_password='backup', master_port=3306, master_log_file='mysql-bin.000001', master_log_pos=120, master_connect_retry=1;
    start slave;
    show slave status\G

Both Slave_IO_Running and Slave_SQL_Running must be Yes; only then is replication healthy.
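The check on the two replication threads can be scripted. The following is a minimal sketch, not part of MHA or MySQL: check_slave_threads is a hypothetical helper name, and it assumes the input is the text produced by show slave status\G, with one "Name: value" pair per line.

```shell
#!/bin/bash
# Reads the output of `show slave status\G` on stdin and succeeds only
# when both replication threads report "Yes".
# check_slave_threads is a hypothetical helper name (an assumption for
# illustration), not a MySQL or MHA command.
check_slave_threads() {
    awk -F': *' '
        # Match the exact field names; Slave_SQL_Running_State must not match.
        $1 ~ /^[ \t]*Slave_IO_Running$/  { io  = $2 }
        $1 ~ /^[ \t]*Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

On a slave it could be used, for example, as mysql -uroot -p -e 'show slave status\G' | check_slave_threads && echo "replication OK". Matching the exact field names keeps the Slave_SQL_Running_State line from being mistaken for the thread status.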
4. Use ssh-keygen to set up passwordless SSH login between the three machines. Run the following on every node:

    ssh-keygen -t rsa
    ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.10.120
    ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.10.121
    ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.10.122
5. Install mha4mysql-node-0.56 on all three nodes, and mha4mysql-manager-0.56 on node3
- Install mha4mysql-node on node1, node2, and node3

    wget https://googledrive.com/host/0B1lu97m8-haWeHdGWXp0YVVUSlk/mha4mysql-node-0.56.tar.gz
    tar xf mha4mysql-node-0.56.tar.gz
    cd mha4mysql-node-0.56
    perl Makefile.PL
    make && make install
Note: if perl Makefile.PL fails with "Can't locate CPAN.pm in @INC", the CPAN module is not installed. Install it as follows:

    wget http://www.cpan.org/authors/id/A/AN/ANDK/CPAN-2.10.tar.gz
    tar -zxvf CPAN-2.10.tar.gz
    cd CPAN-2.10
    perl Makefile.PL
    make
    make install
- Install mha4mysql-manager on node3

    # Install the dependencies first
    yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Config-IniFiles perl-Time-HiRes
    wget https://googledrive.com/host/0B1lu97m8-haWeHdGWXp0YVVUSlk/mha4mysql-manager-0.56.tar.gz
    tar xf mha4mysql-manager-0.56.tar.gz
    cd mha4mysql-manager-0.56
    perl Makefile.PL
    make && make install
- Set up the MHA configuration files on node3

    mkdir -p /etc/mha/{app1,scripts}
    cp -r mha4mysql-manager-0.56/samples/conf/* /etc/mha/
    cp -r mha4mysql-manager-0.56/samples/scripts/* /etc/mha/scripts/
    mv /etc/mha/app1.cnf /etc/mha/app1/
    mv /etc/mha/masterha_default.cnf /etc/masterha_default.cnf
6. Set the global configuration on the manager:

    vim /etc/masterha_default.cnf

    [server default]
    user=root
    password=123456
    ssh_user=root
    repl_user=backup
    repl_password=backup
    ping_interval=1
    #shutdown_script=""
    secondary_check_script=masterha_secondary_check -s node1 -s node2 -s node3 --user=root --master_host=node1 --master_ip=192.168.10.120 --master_port=3306
    #master_ip_failover_script="/etc/mha/scripts/master_ip_failover"
    #master_ip_online_change_script="/etc/mha/scripts/master_ip_online_change"
    # shutdown_script= /script/masterha/power_manager
    #report_script=""

    # Create the log directory
    mkdir -p /var/log/mha/app1

    vim /etc/mha/app1/app1.cnf

    # master_binlog_dir must be MySQL's binlog directory, otherwise the checks report errors
    [server default]
    manager_workdir=/var/log/mha/app1
    manager_log=/var/log/mha/app1/manager.log
    [server1]
    hostname=node1
    master_binlog_dir="/usr/local/mysql/logs"
    candidate_master=1
    [server2]
    hostname=node2
    master_binlog_dir="/usr/local/mysql/logs"
    candidate_master=1
    [server3]
    hostname=node3
    master_binlog_dir="/usr/local/mysql/logs"
    no_master=1
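Notes:
candidate_master=1: marks the host as a preferred candidate for the new master. When several [serverX] blocks set this parameter, priority follows the order in which the [serverX] blocks are configured.
secondary_check_script: MHA strongly recommends checking the availability of the MySQL master over two or more network routes. By default the MHA Manager checks over a single route only (from the Manager to the master), which is not advisable. By calling the external script named in the secondary_check_script parameter, MHA can check over two or more routes.
master_ip_failover_script: called when a MySQL slave is promoted to the new master, so the VIP handling can be written into this script.
master_ip_online_change_script: called when the MySQL master is switched manually with the masterha_master_switch command. Its parameters are similar to those of master_ip_failover_script, and the two scripts can share code.
shutdown_script: this script (by default the one in samples) uses the server's remote management controller (iDRAC and the like) through ipmitool to force a power-off, which prevents the split-brain that could follow if a fencing device restarted the master.
report_script: sends an email report once the switch to the new master has completed. See http://caspian.dotconf.net/menu/Software/SendEmail/sendEmail-v1.56.tar.gz
The scripts mentioned above can be copied from mha4mysql-manager-0.56/samples/scripts/* and adapted.
For the other manager parameters in detail, see https://code.google.com/p/mysql-master-ha/wiki/Parameters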
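Since the candidate_master note says priority follows the order of the [serverX] blocks, it can help to list the candidates in that order before testing a failover. The following is a minimal sketch under stated assumptions: list_candidate_masters is a hypothetical helper (not an MHA tool), and it assumes a plain key=value config like the app1.cnf above, with one hostname line per block.

```shell
#!/bin/bash
# Prints the hostnames of the blocks that set candidate_master=1,
# in the order the blocks appear in the given MHA config file.
# list_candidate_masters is a hypothetical helper name (an assumption
# for illustration), not part of MHA.
list_candidate_masters() {
    awk '
        # Print the host of the block just finished, if it was a candidate.
        function emit() { if (cand && host != "") print host; host = ""; cand = 0 }
        /^[ \t]*\[/ { emit(); next }
        /^[ \t]*hostname[ \t]*=/ {
            line = $0
            sub(/^[^=]*=[ \t]*/, "", line)   # drop "hostname="
            sub(/[ \t]+$/, "", line)         # drop trailing blanks
            host = line
        }
        /^[ \t]*candidate_master[ \t]*=[ \t]*1[ \t]*$/ { cand = 1 }
        END { emit() }
    ' "$1"
}
```

With the app1.cnf shown earlier, list_candidate_masters /etc/mha/app1/app1.cnf would print node1 and then node2; node3 is skipped because it sets no_master=1 rather than candidate_master=1.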
7. Use masterha_check_ssh to verify passwordless SSH login, and masterha_check_repl to verify MySQL replication
Verify SSH trust:

    masterha_check_ssh --conf=/etc/mha/app1/app1.cnf

Verify master-slave replication:

    masterha_check_repl --conf=/etc/mha/app1/app1.cnf
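8. Start the MHA manager and watch the log file
Start the manager service on node3, then run killall mysqld on node1 to simulate a master failure:

    masterha_manager --conf=/etc/mha/app1/app1.cnf

Then watch /var/log/mha/app1/manager.log on node3. node1 is reported as dead, the master role switches automatically to node2, and replication on node3 is repointed to node2. After a failover, the file /var/log/mha/app1/app1.failover.complete is created.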
To restore node1 manually:

    rm -rf /var/log/mha/app1/app1.failover.complete

Start MySQL on node1:

    service mysql start

Log in and run show master status;
Then repoint replication on node2 and node3 to node1 (change master to):

    stop slave;
    change master to master_host='192.168.10.120', master_user='backup', master_password='backup', master_port=3306, master_log_file='mysql-bin.000002', master_log_pos=120, master_connect_retry=1;
    start slave;
    show slave status\G
Running MHA Manager in the background:
(1) Check the Manager's status

    masterha_check_status --conf=/etc/mha/app1/app1.cnf

(2) Start it in the background

    nohup masterha_manager --conf=/etc/mha/app1/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
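Startup options:
- --remove_dead_master_conf: after a failover, the old master's entry is removed from the configuration file.
- --manager_log: where the log is written.
- --ignore_last_failover: by default, if MHA detects consecutive failures less than 8 hours apart, it does not fail over again; the limit exists to avoid a ping-pong effect. This option tells MHA to ignore the file left by the previous failover. By default, after a failover MHA creates app1.failover.complete in the manager working directory (/var/log/mha/app1 as configured above). If that file is still there when the next failover is needed, the failover is refused unless the file was deleted manually after the first one. For convenience, --ignore_last_failover is set here.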
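Before restarting the manager without --ignore_last_failover, it is worth checking whether a recent marker file would block the next failover. The sketch below is only an approximation of the rule described above: failover_recently_completed is a hypothetical helper, and it assumes that the marker's modification time is what counts and that the window is 8 hours (480 minutes).

```shell
#!/bin/bash
# Succeeds (exit 0) when the failover marker exists and was modified less
# than 8 hours ago, i.e. when another automatic failover would be refused.
# failover_recently_completed is a hypothetical helper name, and using the
# file mtime as the reference time is an assumption, not documented MHA
# behaviour.
failover_recently_completed() {
    local marker="$1"
    [ -f "$marker" ] || return 1
    # find prints the path only if it was modified within the last 480 minutes
    [ -n "$(find "$marker" -mmin -480 2>/dev/null)" ]
}
```

For example, failover_recently_completed /var/log/mha/app1/app1.failover.complete && echo "remove the marker or pass --ignore_last_failover" gives a quick warning before the manager is started again.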
Stop MHA Manager monitoring:

    masterha_stop --conf=/etc/mha/app1/app1.cnf

For running the manager as a daemon, see https://code.google.com/p/mysql-master-ha/wiki/Runnning_Background and ftp://ftp.pbone.net/mirror/ftp5.gwdg.de/pub/opensuse/repositories/home:/weberho:/qmailtoaster/openSUSE_Tumbleweed/x86_64/daemontools-0.76-5.3.x86_64.rpm
9. Configure the VIP
The VIP can be managed in two ways: keepalived can manage the floating of the virtual IP, or scripts can bring the virtual IP up directly (no keepalived, heartbeat, or similar software needed). This article uses keepalived.
keepalived is configured as follows:
- Install keepalived

    # Download and install on both masters (strictly speaking one is the master and the other is the candidate master, which is a slave until a switch happens)
    # First install the kernel packages
    yum install kernel-headers kernel-devel
    # Then install the dependencies
    yum install popt-static kernel-devel make gcc openssl-devel lftp libnl* popt*
    # Finally create the symlink
    ln -s /usr/src/kernels/2.6.32-279.el6.i686/ /usr/src/linux
    # Install keepalived
    wget http://www.keepalived.org/software/keepalived-1.2.4.tar.gz
    tar zxvf keepalived-1.2.4.tar.gz
    cd keepalived-1.2.4
    ./configure --prefix=/usr/local/keepalived
    make
    make install
    # Register keepalived as a service for easier management
    cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/
    cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
    mkdir /etc/keepalived/
    cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
    cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
    # Start the keepalived service
    service keepalived start
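- Configure the keepalived configuration file
keepalived has two modes, master->backup and backup->backup, and they behave very differently. In master->backup mode, once the master goes down the virtual IP floats to the slave; when the master is repaired and keepalived starts again, it takes the virtual IP back, and this happens even if non-preempt mode (nopreempt) is set. In backup->backup mode, the virtual IP also floats to the slave when the master goes down, but when the original master recovers and its keepalived service starts, it does not take the virtual IP back from the new master, even if its priority is higher than the slave's. To reduce the number of VIP moves, the repaired master is usually used as the new standby. The two configurations follow: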
a. master->backup mode

    # keepalived configuration file on the master
    [root@192.168.3.110~]# cat /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived
    global_defs {
        notification_email {
            343838596@qq.com
        }
        notification_email_from 343838596@qq.com
        smtp_server 127.0.0.1
        smtp_connect_timeout 30
        router_id LVS_DEVEL
    }
    # Script that checks whether MySQL has stopped; if it has, keepalived is shut down so the virtual IP moves to the other machine
    vrrp_script check_mysql {
        script "/etc/keepalived/check_mysql.sh"
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 51
        priority 80
        advert_int 1
        nopreempt
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            192.168.10.219
        }
    }

Here router_id LVS_DEVEL sets the name of the keepalived group, and the virtual IP 192.168.10.219 is bound to the host's eth0 interface. The state is set to BACKUP, keepalived runs in non-preempt mode (nopreempt), and priority 80 sets the priority to 80. The configuration below differs slightly (in its priority) but means the same thing.

    # keepalived configuration file on the candidate master
    cat /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived
    global_defs {
        notification_email {
            343838596@qq.com
        }
        notification_email_from 343838596@qq.com
        smtp_server 127.0.0.1
        smtp_connect_timeout 30
        router_id LVS_DEVEL
    }
    # Script that checks whether MySQL has stopped; if it has, keepalived is shut down so the virtual IP moves to the other machine
    vrrp_script check_mysql {
        script "/etc/keepalived/check_mysql.sh"
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 51
        priority 60
        advert_int 1
        nopreempt
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            192.168.10.219
        }
    }
b. backup->backup mode

    # keepalived configuration file on the master (node1)
    vim /etc/keepalived/keepalived.conf

    ! Configuration File for keepalived
    global_defs {
        router_id MHA
        notification_email {
            root@localhost    # recipient addresses; several are allowed, one per line
        }
        # send a mail notification when the master/backup roles change
        notification_email_from m@localhost
        # outgoing mail server
        smtp_server 127.0.0.1
        # mail connection timeout
        smtp_connect_timeout 30
    }
    vrrp_script check_mysql {
        script "/etc/keepalived/check_mysql.sh"
    }
    vrrp_sync_group VG1 {
        group {
            VI_1
        }
        notify_master "/etc/keepalived/master.sh"
    }
    vrrp_instance VI_1 {
        # nopreempt only takes effect when the initial state is BACKUP
        state BACKUP
        interface eth0
        virtual_router_id 110
        priority 100
        advert_int 1
        nopreempt    # do not preempt: after recovering, this node does not take the master role back
        authentication {
            # authentication type, either PASS or AH
            auth_type PASS
            # authentication password
            auth_pass 111111
        }
        track_script {
            check_mysql
        }
        virtual_ipaddress {
            192.168.10.219
        }
    }

    # keepalived configuration file on the slave (node2)
    vim /etc/keepalived/keepalived.conf

    ! Configuration File for keepalived
    global_defs {
        router_id MHA
        notification_email {
            root@localhost    # recipient addresses; several are allowed, one per line
        }
        # send a mail notification when the master/backup roles change
        notification_email_from m@localhost
        # outgoing mail server
        smtp_server 127.0.0.1
        # mail connection timeout
        smtp_connect_timeout 30
    }
    vrrp_script check_mysql {
        script "/etc/keepalived/check_mysql.sh"
    }
    vrrp_sync_group VG1 {
        group {
            VI_1
        }
        notify_master "/etc/keepalived/master.sh"
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 110
        priority 99
        advert_int 1
        authentication {
            # authentication type, either PASS or AH
            auth_type PASS
            # authentication password
            auth_pass 111111
        }
        track_script {
            check_mysql
        }
        virtual_ipaddress {
            192.168.10.219
        }
    }
The check_mysql.sh script

    vi /etc/keepalived/check_mysql.sh

    #!/bin/bash
    MYSQL=/usr/local/mysql/bin/mysql
    MYSQL_HOST=127.0.0.1
    MYSQL_USER=root
    MYSQL_PASSWORD=123456
    CHECK_TIME=3
    # MYSQL_OK is 1 while mysql is working and 0 when mysql is down
    MYSQL_OK=1
    function check_mysql_health() {
        $MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p$MYSQL_PASSWORD -e "show status;" >/dev/null 2>&1
        if [ $? = 0 ]; then
            MYSQL_OK=1
        else
            MYSQL_OK=0
        fi
        return $MYSQL_OK
    }
    # Try up to CHECK_TIME times, one second apart
    while [ $CHECK_TIME -ne 0 ]
    do
        let "CHECK_TIME -= 1"
        check_mysql_health
        if [ $MYSQL_OK = 1 ]; then
            CHECK_TIME=0
            exit 0
        fi
        # Every attempt failed: stop keepalived so the VIP moves away
        if [ $MYSQL_OK -eq 0 ] && [ $CHECK_TIME -eq 0 ]
        then
            pkill keepalived
            exit 1
        fi
        sleep 1
    done
The master.sh script

    vi /etc/keepalived/master.sh

    #!/bin/bash
    VIP=192.168.10.219
    GATEWAY=1.1
    /sbin/arping -I eth0 -c 5 -s $VIP $GATEWAY &>/dev/null

    chmod +x /etc/keepalived/check_mysql.sh
    chmod +x /etc/keepalived/master.sh
- Start the keepalived service on the master and check the log

    [root@]# /etc/init.d/keepalived start
    Starting keepalived: [ OK ]
    [root@]# tail -f /var/log/messages

- Check the VIP binding

    [root@]# ip addr | grep eth0
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        inet 192.168.3.110/24 brd 192.168.3.255 scope global eth0
        inet 192.168.3.250/32 scope global eth0
    [root@]#
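- Bringing keepalived into MHA
To integrate keepalived with MHA, we only need to modify master_ip_failover, the script triggered on failover, and add handling of keepalived for the case where the master goes down. In fact master_ip_failover is not strictly required, because the keepalived configuration above already hooks in the check_mysql script; without that hook, this script is needed.
Edit /usr/local/bin/master_ip_failover (192.168.3.123). The complete modified script is shown here.
On the MHA Manager, the modified script reads: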
    #!/usr/bin/env perl
    use strict;
    use warnings FATAL => 'all';
    use Getopt::Long;

    my (
        $command, $ssh_user, $orig_master_host, $orig_master_ip,
        $orig_master_port, $new_master_host, $new_master_ip, $new_master_port
    );
    my $vip = '192.168.10.219';
    my $ssh_start_vip = "/etc/init.d/keepalived start";
    my $ssh_stop_vip = "/etc/init.d/keepalived stop";

    GetOptions(
        'command=s' => \$command,
        'ssh_user=s' => \$ssh_user,
        'orig_master_host=s' => \$orig_master_host,
        'orig_master_ip=s' => \$orig_master_ip,
        'orig_master_port=i' => \$orig_master_port,
        'new_master_host=s' => \$new_master_host,
        'new_master_ip=s' => \$new_master_ip,
        'new_master_port=i' => \$new_master_port,
    );

    exit &main();

    sub main {
        print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
        if ( $command eq "stop" || $command eq "stopssh" ) {
            my $exit_code = 1;
            eval {
                print "Disabling the VIP on old master: $orig_master_host \n";
                &stop_vip();
                $exit_code = 0;
            };
            if ($@) {
                warn "Got Error: $@\n";
                exit $exit_code;
            }
            exit $exit_code;
        }
        elsif ( $command eq "start" ) {
            my $exit_code = 10;
            eval {
                print "Enabling the VIP - $vip on the new master - $new_master_host \n";
                &start_vip();
                $exit_code = 0;
            };
            if ($@) {
                warn $@;
                exit $exit_code;
            }
            exit $exit_code;
        }
        elsif ( $command eq "status" ) {
            print "Checking the Status of the script.. OK \n";
            #`ssh $ssh_user\@cluster1 \" $ssh_start_vip \"`;
            exit 0;
        }
        else {
            &usage();
            exit 1;
        }
    }

    # A simple system call that enables the VIP on the new master
    sub start_vip() {
        `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
    }
    # A simple system call that disables the VIP on the old master
    sub stop_vip() {
        return 0 unless ($ssh_user);
        `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
    }

    sub usage {
        print "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
    }
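With the script modified, enable the parameter mentioned earlier and check the cluster state again to see whether any errors are reported.

    # On node3, uncomment the parameter
    [root@192.168.3.123~]# grep 'master_ip_failover_script' /etc/masterha_default.cnf
    master_ip_failover_script=/usr/local/bin/master_ip_failover
    [root@192.168.3.123~]#
    # Verify master-slave replication on node3
    masterha_check_repl --conf=/etc/mha/app1/app1.cnf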
10. Common MHA commands

    # Check the manager status
    masterha_check_status --conf=/etc/mha/app1/app1.cnf
    # Check that passwordless SSH works
    masterha_check_ssh --conf=/etc/mha/app1/app1.cnf
    # Check that master-slave replication works
    masterha_check_repl --conf=/etc/mha/app1/app1.cnf
    # Add a new node, server4, to the configuration file
    masterha_conf_host --command=add --conf=/etc/mha/app1/app1.cnf --hostname=geekwolf --block=server4 --params="no_master=1;ignore_fail=1"
    # Delete the server4 node
    masterha_conf_host --command=delete --conf=/etc/mha/app1/app1.cnf --block=server4
    # Note: block is the name of the node's section; the default is [server_$hostname], and block=100 gives [server100].
    # params is a list of parameters separated by semicolons (see https://code.google.com/p/mysql-master-ha/wiki/Parameters)
    # Stop the manager service
    masterha_stop --conf=/etc/mha/app1/app1.cnf
    # Manual master switch (the masterha_manager service must not be running) while the master node1 is alive
    # Interactive mode:
    masterha_master_switch --master_state=alive --conf=/etc/mha/app1/app1.cnf --new_master_host=node2
    # Non-interactive mode:
    masterha_master_switch --master_state=alive --conf=/etc/mha/app1/app1.cnf --new_master_host=node2 --interactive=0
    # Switch while the master node1 is down
    masterha_master_switch --master_state=dead --conf=/etc/mha/app1/app1.cnf --dead_master_host=node1 --dead_master_ip=192.168.10.120 --dead_master_port=3306 --new_master_host=192.168.10.121

For details, see https://code.google.com/p/mysql-master-ha/wiki/TableOfContents?tm=6
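11. Things to watch out for
A. Of the two VIP switching approaches above, the first is recommended.
B. After a master failover, the manager service stops automatically and creates app1.failover.complete under /var/log/mha/app1. That file must be deleted before another failover can take place.
C. Testing showed a problem with a one-master, two-slave architecture in which both slaves set candidate_master=1. After the old master failed over to the standby master, app1.failover.complete was deleted and the manager restarted; stopping the new master then did not produce a normal failover. (Fix: remove the old master node1's entry from /etc/mha/app1/app1.cnf, after which failover works again.)
D. The ARP cache can leave the VIP unusable after it has moved.
E. Using semi-synchronous replication protects the data as far as possible.
F. The purge_relay_logs script deletes relay logs without blocking the SQL thread. Schedule it on every slave to purge relay logs periodically:

    0 5 * * * root /usr/bin/purge_relay_logs --user=root --password=geekwolf --disable_relay_log_purge >> /var/log/mha/purge_relay_logs.log 2>&1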
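
Note C above is resolved by removing the old master's entry from /etc/mha/app1/app1.cnf by hand. The following is a minimal sketch of that edit, under stated assumptions: remove_server_block is a hypothetical helper (not an MHA tool), and it assumes each block starts with a [section] header and identifies its host with a hostname= line. It writes the result to stdout instead of editing the file in place, so the output can be reviewed first.

```shell
#!/bin/bash
# Prints the given MHA config with the block whose hostname matches
# the second argument removed; every other block is passed through unchanged.
# remove_server_block is a hypothetical helper name (an assumption for
# illustration), not part of MHA.
remove_server_block() {
    local cnf="$1" dead_host="$2"
    awk -v dead="$dead_host" '
        # Print the buffered block unless it belongs to the dead host.
        function emit() { if (!drop) printf "%s", block; block = ""; drop = 0 }
        /^[ \t]*\[/ { emit() }               # a new section header closes the previous block
        {
            block = block $0 "\n"
            line = $0
            if (line ~ /^[ \t]*hostname[ \t]*=/) {
                sub(/^[^=]*=[ \t]*/, "", line)
                sub(/[ \t]+$/, "", line)
                if (line == dead) drop = 1
            }
        }
        END { emit() }
    ' "$cnf"
}
```

A possible use is remove_server_block /etc/mha/app1/app1.cnf node1 > /tmp/app1.cnf.new, followed by a diff against the original before moving the new file into place.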
12. Problems encountered during deployment
Problem 1:

    [root@node1 mha4mysql-node-0.56]# perl Makefile.PL
    Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Makefile.pm line 4.
    BEGIN failed--compilation aborted at inc/Module/Install/Makefile.pm line 4.
    Compilation failed in require at inc/Module/Install.pm line 283.
    Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Can.pm line 6.
    BEGIN failed--compilation aborted at inc/Module/Install/Can.pm line 6.
    Compilation failed in require at inc/Module/Install.pm line 283.
    Can't locate ExtUtils/MM_Unix.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share