Introduction to MMM
MMM (Master-Master Replication Manager for MySQL) is a set of scripts, written in Perl, that supports failover and day-to-day management for MySQL master-master (dual-master) replication — in short, a MySQL dual-master replication manager. Although the topology is called master-master, the business only ever writes to one master at a time; the standby master serves part of the read traffic, which keeps it warm and speeds up the switchover when roles change. MMM thus provides failover on one hand, while on the other hand its bundled tool scripts can load-balance reads across any number of slaves in a standard replication setup (only one node is writable at any time). It can manage the virtual IPs of a group of replicated servers, and it also ships with scripts for data backup and for resynchronizing data between nodes.
MMM can remove the virtual IP of a server whose replication lag grows too high, either automatically or manually; it can also back up data and resynchronize data between two nodes. Because MMM cannot fully guarantee data consistency, it suits scenarios where consistency requirements are modest but business availability must nonetheless be maximized. MySQL itself provides no replication-failover solution; the MMM scheme supplies server failover and thereby MySQL high availability. For businesses with strict data-consistency requirements, this HA architecture is strongly discouraged.
A MySQL-MMM internal architecture diagram (shared from the web):
Pros and Cons of MySQL-MMM
Pros: high availability, good scalability, automatic failover on failure; with master-master replication, only one database accepts writes at any moment, which preserves data consistency.
Cons: the monitor node is a single point of failure (it can itself be made highly available with Keepalived).
How MySQL-MMM Works
MMM (Master-Master Replication Manager for MySQL) is a flexible, Perl-based set of scripts that monitors MySQL replication, performs failover, and manages the master-master replication configuration (only one node is writable at any time). It has three main components:
mmm_mond: the monitor process. It does all the monitoring and decides on and carries out all node role changes. It runs on the monitoring host.
mmm_agentd: the agent process running on every MySQL server (master and slave). It answers the monitor's probes and performs simple remote service changes. It runs on each monitored host.
mmm_control: a simple script that provides commands for managing the mmm_mond process.
The monitor side provides several virtual IPs (VIPs): one writable VIP and several readable VIPs. Under the monitor's control these IPs are bound to available MySQL servers; when one MySQL server goes down, the monitor migrates its VIPs to another server. For this management to work, the relevant accounts must be granted in MySQL: an mmm_monitor user and an mmm_agent user, plus an mmm_tools user if you want to use MMM's backup tools.
MySQL-MMM High-Availability Deployment Record (automatic failover, read/write splitting)
0) Machine configuration
Role         IP address        Hostname      server-id
monitoring   182.48.115.233    mmm-monit     -
master1      182.48.115.236    db-master1    1
master2      182.48.115.237    db-master2    2
slave1       182.48.115.238    db-slave      3

The service (virtual) IPs used by the business are:

IP address        Role     Description
182.48.115.234    write    the application connects to this IP to send writes to the master
182.48.115.235    read     the application connects to this IP for reads
182.48.115.239    read     the application connects to this IP for reads
1) Configure /etc/hosts (on all machines)
[root@mmm-monit ~]# cat /etc/hosts
.......
182.48.115.233    mmm-monit
182.48.115.236    db-master1
182.48.115.237    db-master2
182.48.115.238    db-slave
2) Install MySQL on the three hosts and set up the replication environment
182.48.115.236 and 182.48.115.237 are masters of each other; 182.48.115.238 is a slave of 182.48.115.236.
........................................................................
MySQL installation reference: http://www.cnblogs.com/kevingrace/p/6109679.html
MySQL master-slave/master-master configuration reference: http://www.cnblogs.com/kevingrace/p/6256603.html
........................................................................
---------my.cnf additions on 182.48.115.236---------
server-id = 1
log-bin = mysql-bin
log_slave_updates = 1
auto-increment-increment = 2
auto-increment-offset = 1
---------my.cnf additions on 182.48.115.237---------
server-id = 2
log-bin = mysql-bin
log_slave_updates = 1
auto-increment-increment = 2
auto-increment-offset = 2
---------my.cnf additions on 182.48.115.238---------
server-id = 3
log-bin = mysql-bin
log_slave_updates = 1
Note: the server-id values do not have to be sequential; they only have to be unique.
Then grant replication access between 182.48.115.236 and 182.48.115.237 in both directions, and from 182.48.115.236 to 182.48.115.238.
Finally, set up the master-master and master-slave replication with the appropriate "change master ..." statements. The detailed steps are omitted here; see the references above and the rough sketch that follows.
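As a hedged sketch of the omitted step (the replication account slave / slave@123 matches the one referenced later in mmm_common.conf; the MASTER_LOG_FILE/MASTER_LOG_POS values are placeholders that must be read from SHOW MASTER STATUS on the peer):

mysql> GRANT REPLICATION SLAVE ON *.* TO 'slave'@'182.48.115.%' IDENTIFIED BY 'slave@123';    -- run on .236 and .237
mysql> FLUSH PRIVILEGES;
-- on 182.48.115.237, pointing at .236; run the mirror-image statement on .236,
-- and the same statement as below on 182.48.115.238:
mysql> CHANGE MASTER TO MASTER_HOST='182.48.115.236', MASTER_USER='slave',
    -> MASTER_PASSWORD='slave@123', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=154;    -- placeholders from SHOW MASTER STATUS
mysql> START SLAVE;
mysql> SHOW SLAVE STATUS\G    -- Slave_IO_Running and Slave_SQL_Running should both be Yes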
3) Install MMM (on all machines)
.......First install the Perl modules MMM needs.......
[root@db-master1 ~]# vim install.sh    //run this install script on every machine
#!/bin/bash
wget http://xrl.us/cpanm --no-check-certificate
mv cpanm /usr/bin
chmod 755 /usr/bin/cpanm
cat > /root/list << EOF
Algorithm::Diff
Class::Singleton
DBI
DBD::mysql
File::Basename
File::stat
File::Temp
Log::Dispatch
Log::Log4perl
Mail::Send
Net::ARP
Net::Ping
Proc::Daemon
Thread::Queue
Time::HiRes
EOF
for package in `cat /root/list`
do
    cpanm $package
done
[root@db-master1 ~]# chmod 755 install.sh
[root@db-master1 ~]# ./install.sh

.........Download the mysql-mmm software and install it on every server............
[root@db-master1 ~]# wget http://mysql-mmm.org/_media/:mmm2:mysql-mmm-2.2.1.tar.gz
[root@db-master1 ~]# mv :mmm2:mysql-mmm-2.2.1.tar.gz mysql-mmm-2.2.1.tar.gz
[root@db-master1 ~]# tar -zvxf mysql-mmm-2.2.1.tar.gz
[root@db-master1 ~]# cd mysql-mmm-2.2.1
[root@db-master1 mysql-mmm-2.2.1]# make install

The main layout after installing mysql-mmm is as follows (note: yum-installed and source-installed paths differ somewhat):
/usr/lib/perl5/vendor_perl/5.8.8/MMM    the main Perl modules used by MMM
/usr/lib/mysql-mmm                      the main scripts used by MMM
/usr/sbin                               path of the main MMM commands
/etc/init.d/                            init scripts for the MMM agent and monitor services
/etc/mysql-mmm                          path of the MMM configuration files (by default all configuration files live here)
/var/log/mysql-mmm                      default location of the MMM logs
This completes the basic MMM installation. Next come the concrete configuration files: mmm_common.conf and mmm_agent.conf are the agent-side configuration files, and mmm_mon.conf is the monitor-side configuration file.
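Since the troubleshooting later in this write-up revolves around missing Perl modules, it can save time to confirm now that the modules actually load; a minimal hedged check (list abbreviated — extend it with the other modules as needed):

[root@db-master1 ~]# for m in Algorithm::Diff Class::Singleton DBD::mysql Log::Dispatch Log::Log4perl Net::ARP Proc::Daemon; do perl -M$m -e1 2>/dev/null && echo "$m ok" || echo "$m MISSING"; done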
4) Configure the agent-side files on db-master1, db-master2 and db-slave (mmm_common.conf is identical on all hosts; mmm_agent.conf differs only in its "this" line)
First configure the agent's mmm_common.conf on db-master1 (this file must be configured on every machine, including the monitor):
[root@db-master1 ~]# cd /etc/mysql-mmm/
[root@db-master1 mysql-mmm]# cp mmm_common.conf mmm_common.conf.bak
[root@db-master1 mysql-mmm]# vim mmm_common.conf
active_master_role      writer

<host default>
    cluster_interface       eth0
    pid_path                /var/run/mmm_agentd.pid
    bin_path                /usr/lib/mysql-mmm/
    replication_user        slave        //this account and the password on the next line are the replication account created earlier for the master-master/master-slave setup
    replication_password    slave@123
    agent_user              mmm_agent
    agent_password          mmm_agent
</host>

<host db-master1>
    ip      182.48.115.236
    mode    master
    peer    db-master2
</host>

<host db-master2>
    ip      182.48.115.237
    mode    master
    peer    db-master1
</host>

<host db-slave>
    ip      182.48.115.238
    mode    slave
</host>

<role writer>
    hosts   db-master1, db-master2
    ips     182.48.115.234
    mode    exclusive
</role>

<role reader>
    hosts   db-master2, db-slave
    ips     182.48.115.235, 182.48.115.239
    mode    balanced
</role>

Configuration notes:
replication_user    the user used to check replication
agent_user          the agent's user
mode                marks the host as a master, a standby master, or a slave
mode exclusive      the writer role is exclusive: only one master can hold it at a time
In <role writer>, hosts lists the real host IPs/hostnames of the current master and the standby master, and ips is the virtual IP exposed to applications.
In <role reader>, hosts lists the real IPs/hostnames of the read servers, and ips their virtual IPs.

mmm_common.conf can simply be copied from db-master1 to /etc/mysql-mmm on db-master2, db-slave and mmm-monit:
[root@db-master1 ~]# scp /etc/mysql-mmm/mmm_common.conf db-master2:/etc/mysql-mmm/
[root@db-master1 ~]# scp /etc/mysql-mmm/mmm_common.conf db-slave:/etc/mysql-mmm/
[root@db-master1 ~]# scp /etc/mysql-mmm/mmm_common.conf mmm-monit:/etc/mysql-mmm/

Next configure mmm_agent.conf under /etc/mysql-mmm on db-master1, db-master2 and db-slave. The files differ only in the "this" line, which must carry each host's own name: in this environment db-master1 gets "this db-master1", db-master2 gets "this db-master2", and db-slave gets "this db-slave".

On db-master1 (182.48.115.236):
[root@db-master1 ~]# vim /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf
this db-master1

On db-master2 (182.48.115.237):
[root@db-master2 ~]# vim /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf
this db-master2

On db-slave (182.48.115.238):
[root@db-slave ~]# vim /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf
this db-slave
------------------------------------------------------------------------------------------------------
Then configure the monitor's configuration file on mmm-monit (182.48.115.233):
[root@mmm-monit ~]# cp /etc/mysql-mmm/mmm_mon.conf /etc/mysql-mmm/mmm_mon.conf.bak
[root@mmm-monit ~]# vim /etc/mysql-mmm/mmm_mon.conf
include mmm_common.conf

<monitor>
    ip                  182.48.115.233
    pid_path            /var/run/mysql-mmm/mmm_mond.pid
    bin_path            /usr/libexec/mysql-mmm
    status_path         /var/lib/mysql-mmm/mmm_mond.status
    ping_ips            182.48.115.238,182.48.115.237,182.48.115.236
    auto_set_online     10    //a host found in AWAITING_RECOVERY is automatically set ONLINE after 10 seconds
</monitor>

<host default>
    monitor_user        mmm_monitor
    monitor_password    mmm_monitor
</host>

debug 0

The only changes from the stock monitor file are ping_ips, which lists the IPs of all hosts monitored in this architecture, and the monitoring user configured in <host default>.
5) Create the monitoring users; three users are needed
Details:
User            Description                                                              Privileges
monitor user    used by the MMM monitor to check the health of all MySQL databases      REPLICATION CLIENT
agent user      used by the MMM agent to change the master's read_only state, etc.      SUPER, REPLICATION CLIENT, PROCESS
repl            used for replication                                                    REPLICATION SLAVE

Grant these on the three servers (db-master1, db-master2, db-slave). Since the master-master and master-slave replication set up earlier already works, running the grants on one server is enough — the privileges replicate to the other two automatically. The replication account already exists, so only the two remaining accounts need grants here.

On db-master1:
mysql> GRANT SUPER, REPLICATION CLIENT, PROCESS ON *.* TO 'mmm_agent'@'182.48.115.%' IDENTIFIED BY 'mmm_agent';
Query OK, 0 rows affected (0.00 sec)

mysql> GRANT REPLICATION CLIENT ON *.* TO 'mmm_monitor'@'182.48.115.%' IDENTIFIED BY 'mmm_monitor';
Query OK, 0 rows affected (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

Then check on db-master2 and db-slave: the accounts granted on db-master1 have already replicated over (a quick check is sketched below).
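A minimal hedged way to confirm the grants replicated (the query itself is mine, not from the original write-up; any query against mysql.user works):

[root@db-master2 ~]# mysql -e "SELECT user,host FROM mysql.user WHERE user LIKE 'mmm%';"    //should list mmm_agent and mmm_monitor
[root@db-slave ~]# mysql -e "SELECT user,host FROM mysql.user WHERE user LIKE 'mmm%';"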
6) Start the agent and monitor services
Finally, start the agent on each of db-master1, db-master2 and db-slave:
[root@db-master1 ~]# /etc/init.d/mysql-mmm-agent start    //replace start with status to check whether the agent process is up
Daemon bin: '/usr/sbin/mmm_agentd'
Daemon pid: '/var/run/mmm_agentd.pid'
Starting MMM Agent daemon... Ok
[root@db-master2 ~]# /etc/init.d/mysql-mmm-agent start
Daemon bin: '/usr/sbin/mmm_agentd'
Daemon pid: '/var/run/mmm_agentd.pid'
Starting MMM Agent daemon... Ok
[root@db-slave ~]# /etc/init.d/mysql-mmm-agent start
Daemon bin: '/usr/sbin/mmm_agentd'
Daemon pid: '/var/run/mmm_agentd.pid'
Starting MMM Agent daemon... Ok

Then start the monitor program on mmm-monit:
[root@mmm-monit ~]# mkdir /var/run/mysql-mmm
[root@mmm-monit ~]# /etc/init.d/mysql-mmm-monitor start
Daemon bin: '/usr/sbin/mmm_mond'
Daemon pid: '/var/run/mmm_mond.pid'
Starting MMM Monitor daemon: Ok
........................................................................................................
If the monitor fails to start with the following error:
Daemon bin: '/usr/sbin/mmm_mond'
Daemon pid: '/var/run/mmm_mond.pid'
Starting MMM Monitor daemon: Base class package "Class::Singleton" is empty.
    (Perhaps you need to 'use' the module which defines that package first,
    or make that module available in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .).
 at /usr/share/perl5/vendor_perl/MMM/Monitor/Agents.pm line 2
BEGIN failed--compilation aborted at /usr/share/perl5/vendor_perl/MMM/Monitor/Agents.pm line 2.
Compilation failed in require at /usr/share/perl5/vendor_perl/MMM/Monitor/Monitor.pm line 15.
BEGIN failed--compilation aborted at /usr/share/perl5/vendor_perl/MMM/Monitor/Monitor.pm line 15.
Compilation failed in require at /usr/sbin/mmm_mond line 28.
BEGIN failed--compilation aborted at /usr/sbin/mmm_mond line 28.
 failed

The fix:
[root@mmm-monit ~]# perl -MCPAN -e shell
...............................................
If that command itself fails with:
Can't locate CPAN.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .).
BEGIN failed--compilation aborted.
then install the CPAN module first:
[root@mmm-monit ~]# rpm -q perl-CPAN
package perl-CPAN is not installed
[root@mmm-monit ~]# yum install perl-CPAN
...............................................
After "perl -MCPAN -e shell" succeeds, install the modules at the cpan prompt:
......
cpan[1]> install MIME::Entity    //enter these install commands one by one
cpan[2]> install MIME::Parser
cpan[3]> install Crypt::PasswdMD5
cpan[4]> install Term::ReadPassword
cpan[5]> install Crypt::CBC
cpan[6]> install Crypt::Blowfish
cpan[7]> install Daemon::Generic
cpan[8]> install DateTime
cpan[9]> install SOAP::Lite

Or run the installs directly from the shell instead:
[root@mmm-monit ~]# perl -MCPAN -e 'install HTML::Template'
[root@mmm-monit ~]# perl -MCPAN -e 'install MIME::Entity'
[root@mmm-monit ~]# perl -MCPAN -e 'install Crypt::PasswdMD5'
[root@mmm-monit ~]# perl -MCPAN -e 'install Term::ReadPassword'
[root@mmm-monit ~]# perl -MCPAN -e 'install Crypt::CBC'
[root@mmm-monit ~]# perl -MCPAN -e 'install Crypt::Blowfish'
[root@mmm-monit ~]# perl -MCPAN -e 'install Daemon::Generic'
[root@mmm-monit ~]# perl -MCPAN -e 'install DateTime'
[root@mmm-monit ~]# perl -MCPAN -e 'install SOAP::Lite'
............................................................................................................
Even after the monitor "starts", checking it shows the process is not actually running:
[root@mmm-monit ~]# /etc/init.d/mysql-mmm-monitor status
Daemon bin: '/usr/sbin/mmm_mond'
Daemon pid: '/var/run/mmm_mond.pid'
Checking MMM Monitor process: not running.
To debug this, set debug to 1 in mmm_mon.conf (i.e., enable debug mode), then run:
[root@mmm-monit ~]# /etc/init.d/mysql-mmm-monitor start
.......
open2: exec of /usr/libexec/mysql-mmm/monitor/checker ping_ip failed at /usr/share/perl5/vendor_perl/MMM/Monitor/Checker.pm line 143.
2017/06/01 20:16:02 WARN Checker 'ping_ip' is dead!
2017/06/01 20:16:02 INFO Spawning checker 'ping_ip'...
2017/06/01 20:16:02 DEBUG Core: reaped child 17439 with exit 65280

The cause is a wrong checker bin_path in mmm_mon.conf:
[root@mmm-monit ~]# cat /etc/mysql-mmm/mmm_mon.conf|grep bin_path
    bin_path /usr/libexec/mysql-mmm
Changing bin_path to /usr/lib/mysql-mmm fixes it:
[root@mmm-monit ~]# cat /etc/mysql-mmm/mmm_mon.conf|grep bin_path
    bin_path /usr/lib/mysql-mmm

Start the monitor again:
[root@mmm-monit ~]# /etc/init.d/mysql-mmm-monitor start
.......
FATAL Couldn't open status file '/var/lib/mysql-mmm/mmm_mond.status': Starting up without status inf
.......
Error in tempfile() using template /var/lib/mysql-mmm/mmm_mond.statusXXXXXXXXXX: Parent directory (/var/lib/mysql-mmm/) does not exist at /usr/share/perl5/vendor_perl/MMM/Monitor/Agents.pm line 158.
Perl exited with active threads:
    6 running and unjoined
    0 finished and unjoined
    0 running and detached

The cause is a wrong status_path in mmm_mon.conf:
[root@mmm-monit ~]# cat /etc/mysql-mmm/mmm_mon.conf |grep status_path
    status_path /var/lib/mysql-mmm/mmm_mond.status
Changing status_path to /var/lib/misc/mmm_mond.status fixes it:
[root@mmm-monit ~]# cat /etc/mysql-mmm/mmm_mon.conf|grep status_path
    status_path /var/lib/misc/mmm_mond.status

Then start the monitor once more:
[root@mmm-monit ~]# /etc/init.d/mysql-mmm-monitor restart
........
2017/06/01 20:57:14 DEBUG Sending command 'SET_STATUS(ONLINE, reader(182.48.115.235), db-master1)' to db-master2 (182.48.115.237:9989)
2017/06/01 20:57:14 DEBUG Received Answer: OK: Status applied successfully!|UP:885492.82
2017/06/01 20:57:14 DEBUG Sending command 'SET_STATUS(ONLINE, writer(182.48.115.234), db-master1)' to db-master1 (182.48.115.236:9989)
2017/06/01 20:57:14 DEBUG Received Answer: OK: Status applied successfully!|UP:65356.14
2017/06/01 20:57:14 DEBUG Sending command 'SET_STATUS(ONLINE, reader(182.48.115.239), db-master1)' to db-slave (182.48.115.238:9989)
2017/06/01 20:57:14 DEBUG Received Answer: OK: Status applied successfully!|UP:945625.05
2017/06/01 20:57:15 DEBUG Listener: Waiting for connection...
2017/06/01 20:57:17 DEBUG Sending command 'SET_STATUS(ONLINE, reader(182.48.115.235), db-master1)' to db-master2 (182.48.115.237:9989)
2017/06/01 20:57:17 DEBUG Received Answer: OK: Status applied successfully!|UP:885495.95
2017/06/01 20:57:17 DEBUG Sending command 'SET_STATUS(ONLINE, writer(182.48.115.234), db-master1)' to db-master1 (182.48.115.236:9989)
2017/06/01 20:57:17 DEBUG Received Answer: OK: Status applied successfully!|UP:65359.27
2017/06/01 20:57:17 DEBUG Sending command 'SET_STATUS(ONLINE, reader(182.48.115.239), db-master1)' to db-slave (182.48.115.238:9989)
2017/06/01 20:57:17 DEBUG Received Answer: OK: Status applied successfully!|UP:945628.17
2017/06/01 20:57:18 DEBUG Listener: Waiting for connection...
.........
As long as the startup checks report no errors and the "successfully" messages appear, the monitor process is healthy:
[root@mmm-monit ~]# ps -ef|grep monitor
root  30651 30540  0 20:59 ?  00:00:00 perl /usr/lib/mysql-mmm/monitor/checker ping_ip
root  30654 30540  0 20:59 ?  00:00:00 perl /usr/lib/mysql-mmm/monitor/checker mysql
root  30656 30540  0 20:59 ?  00:00:00 perl /usr/lib/mysql-mmm/monitor/checker ping
root  30658 30540  0 20:59 ?  00:00:00 perl /usr/lib/mysql-mmm/monitor/checker rep_backlog
root  30660 30540  0 20:59 ?  00:00:00 perl /usr/lib/mysql-mmm/monitor/checker rep_threads

The final mmm_mon.conf thus ends up as:
[root@mmm-monit ~]# cat /etc/mysql-mmm/mmm_mon.conf
include mmm_common.conf

<monitor>
    ip                  182.48.115.233
    pid_path            /var/run/mysql-mmm/mmm_mond.pid
    bin_path            /usr/lib/mysql-mmm
    status_path         /var/lib/misc/mmm_mond.status
    ping_ips            182.48.115.238,182.48.115.237,182.48.115.236
    auto_set_online     10
</monitor>

<host default>
    monitor_user        mmm_monitor
    monitor_password    mmm_monitor
</host>

debug 1

[root@mmm-monit ~]# ll /var/lib/misc/mmm_mond.status
-rw-------. 1 root root 121 Jun  1 21:06 /var/lib/misc/mmm_mond.status
[root@mmm-monit ~]# ll /var/run/mysql-mmm/mmm_mond.pid
-rw-r--r--. 1 root root 5 Jun  1 20:59 /var/run/mysql-mmm/mmm_mond.pid
-----------------------------------------------------------
The agent log lives at /var/log/mysql-mmm/mmm_agentd.log and the monitor log at /var/log/mysql-mmm/mmm_mond.log; whatever goes wrong during startup is usually recorded there in detail.
7) Check the status of the cluster hosts from the monitor host
[root@mmm-monit ~]# mmm_control checks all
db-master2  ping         [last change: 2017/06/01 20:59:39]  OK
db-master2  mysql        [last change: 2017/06/01 20:59:39]  OK
db-master2  rep_threads  [last change: 2017/06/01 20:59:39]  OK
db-master2  rep_backlog  [last change: 2017/06/01 20:59:39]  OK: Backlog is null
db-master1  ping         [last change: 2017/06/01 20:59:39]  OK
db-master1  mysql        [last change: 2017/06/01 20:59:39]  OK
db-master1  rep_threads  [last change: 2017/06/01 20:59:39]  OK
db-master1  rep_backlog  [last change: 2017/06/01 20:59:39]  OK: Backlog is null
db-slave    ping         [last change: 2017/06/01 20:59:39]  OK
db-slave    mysql        [last change: 2017/06/01 20:59:39]  OK
db-slave    rep_threads  [last change: 2017/06/01 20:59:39]  OK
db-slave    rep_backlog  [last change: 2017/06/01 20:59:39]  OK: Backlog is null
8) Check the cluster's online status from the monitor host
[root@mmm-monit ~]# mmm_control show
  db-master1(182.48.115.236) master/ONLINE. Roles: writer(182.48.115.234)
  db-master2(182.48.115.237) master/ONLINE. Roles: reader(182.48.115.235)
  db-slave(182.48.115.238) slave/ONLINE. Roles: reader(182.48.115.239)

Then look on the MMM agent machines — the VIPs have been bound:
[root@db-master1 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:5f:58:dc brd ff:ff:ff:ff:ff:ff
    inet 182.48.115.236/27 brd 182.48.115.255 scope global eth0
    inet 182.48.115.234/32 scope global eth0
    inet6 fe80::5054:ff:fe5f:58dc/64 scope link
       valid_lft forever preferred_lft forever
[root@db-master2 mysql-mmm]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:1b:6e:53 brd ff:ff:ff:ff:ff:ff
    inet 182.48.115.237/27 brd 182.48.115.255 scope global eth0
    inet 182.48.115.235/32 scope global eth0
    inet6 fe80::5054:ff:fe1b:6e53/64 scope link
       valid_lft forever preferred_lft forever
[root@db-slave ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:ca:d5:f8 brd ff:ff:ff:ff:ff:ff
    inet 182.48.115.238/27 brd 182.48.115.255 scope global eth0
    inet 182.48.115.239/27 brd 182.48.115.255 scope global secondary eth0:1
    inet6 fe80::5054:ff:feca:d5f8/64 scope link
       valid_lft forever preferred_lft forever

The output above shows the virtual IPs bound on the agents:
182.48.115.234 has been added to 182.48.115.236, which serves writes as the active master;
182.48.115.235 has been added to 182.48.115.237, which serves reads;
182.48.115.239 has been added to 182.48.115.238, which serves reads.
9) Bring all hosts online
The hosts are already online here; if any were not, the following commands would bring them online:
[root@mmm-monit ~]# mmm_control set_online db-master1
OK: This host is already ONLINE. Skipping command.
[root@mmm-monit ~]# mmm_control set_online db-master2
OK: This host is already ONLINE. Skipping command.
[root@mmm-monit ~]# mmm_control set_online db-slave
OK: This host is already ONLINE. Skipping command.
The output says the hosts are already ONLINE and the command was skipped. The whole cluster is now configured.
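For completeness, mmm_control can also drive switches by hand; a hedged sketch (these subcommands come from the MMM 2.2.x toolset — verify them with "mmm_control help" on your installed version):

[root@mmm-monit ~]# mmm_control help                          //list the supported subcommands
[root@mmm-monit ~]# mmm_control set_offline db-master1       //force a host offline; its roles (and VIPs) migrate away
[root@mmm-monit ~]# mmm_control move_role writer db-master2  //manually move the writer role and its VIP to another master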
--------------------------------------------------MMM high-availability tests-------------------------------------------------------
The high-availability environment is now complete, so we can move on to MMM failover (HA) testing.
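Before breaking anything, it is worth confirming that traffic through the VIPs really replicates end to end. A minimal hedged sketch — the mmm_test database/table and the app_user account are hypothetical, not part of the original setup (create such an account, or reuse an existing one):

[root@mmm-monit ~]# mysql -h 182.48.115.234 -u app_user -p -e "CREATE DATABASE IF NOT EXISTS mmm_test; CREATE TABLE IF NOT EXISTS mmm_test.t1 (id INT); INSERT INTO mmm_test.t1 VALUES (1);"    //write through the write VIP
[root@mmm-monit ~]# mysql -h 182.48.115.235 -u app_user -p -e "SELECT * FROM mmm_test.t1;"    //read through a read VIP; the row should be visible
[root@mmm-monit ~]# mysql -h 182.48.115.239 -u app_user -p -e "SELECT * FROM mmm_test.t1;"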
First check the state of the whole cluster — everything is normal:
[root@mmm-monit ~]# mmm_control show
  db-master1(182.48.115.236) master/ONLINE. Roles: writer(182.48.115.234)
  db-master2(182.48.115.237) master/ONLINE. Roles: reader(182.48.115.235)
  db-slave(182.48.115.238) slave/ONLINE. Roles: reader(182.48.115.239)

1) Simulate a db-master2 (182.48.115.237) outage by manually stopping its mysql service:
[root@db-master2 ~]# /etc/init.d/mysql stop
Shutting down MySQL.... SUCCESS!

Watch the monitor log on mmm-monit:
[root@mmm-monit ~]# tail -f /var/log/mysql-mmm/mmm_mond.log
.........
2017/06/01 21:28:17 FATAL State of host 'db-master2' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)

Re-check the latest cluster state:
[root@mmm-monit ~]# mmm_control show
  db-master1(182.48.115.236) master/ONLINE. Roles: writer(182.48.115.234)
  db-master2(182.48.115.237) master/HARD_OFFLINE. Roles:
  db-slave(182.48.115.238) slave/ONLINE. Roles: reader(182.48.115.235), reader(182.48.115.239)

The read VIP previously bound to db-master2, 182.48.115.235, has floated over to db-slave:
[root@db-slave ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:ca:d5:f8 brd ff:ff:ff:ff:ff:ff
    inet 182.48.115.238/27 brd 182.48.115.255 scope global eth0
    inet 182.48.115.235/32 scope global eth0
    inet 182.48.115.239/27 brd 182.48.115.255 scope global secondary eth0:1
    inet6 fe80::5054:ff:feca:d5f8/64 scope link
       valid_lft forever preferred_lft forever

Testing data synchronization: although mysql on db-master2 is down, its VIP has floated to db-slave, and db-master1 and db-slave remain in a master-slave relationship, so data updated in the db-master1 database still replicates automatically to the db-slave database.
------------------
Now restart mysql on db-master2. db-master2 goes from HARD_OFFLINE to AWAITING_RECOVERY, and then takes its read role back:
[root@db-master2 ~]# /etc/init.d/mysql start
Starting MySQL.. SUCCESS!

Watch the monitor log on mmm-monit:
[root@mmm-monit ~]# tail -f /var/log/mysql-mmm/mmm_mond.log
.........
2017/06/01 21:36:00 FATAL State of host 'db-master2' changed from HARD_OFFLINE to AWAITING_RECOVERY
2017/06/01 21:36:12 FATAL State of host 'db-master2' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(10 seconds). It was in state AWAITING_RECOVERY for 12 seconds

[root@mmm-monit ~]# mmm_control show
  db-master1(182.48.115.236) master/ONLINE. Roles: writer(182.48.115.234)
  db-master2(182.48.115.237) master/ONLINE. Roles: reader(182.48.115.235)
  db-slave(182.48.115.238) slave/ONLINE. Roles: reader(182.48.115.239)

[root@db-master2 mysql-mmm]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:1b:6e:53 brd ff:ff:ff:ff:ff:ff
    inet 182.48.115.237/27 brd 182.48.115.255 scope global eth0
    inet 182.48.115.235/32 scope global eth0
    inet6 fe80::5054:ff:fe1b:6e53/64 scope link
       valid_lft forever preferred_lft forever

The VIP resource has returned to db-master2, which has taken over its service again. After db-master2 recovers, the data updated during the outage also syncs over automatically from the other two machines!
---------------------------------------------------------------------------------------------------
2) Simulate a db-master1 (primary master) outage by manually stopping its mysql service:
[root@db-master1 ~]# /etc/init.d/mysql stop
Shutting down MySQL.... SUCCESS!

Watch the monitor log on mmm-monit:
[root@mmm-monit ~]# tail -f /var/log/mysql-mmm/mmm_mond.log
.........
2017/06/01 21:43:36 FATAL State of host 'db-master1' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)

Check the cluster state:
[root@mmm-monit ~]# mmm_control show
  db-master1(182.48.115.236) master/HARD_OFFLINE. Roles:
  db-master2(182.48.115.237) master/ONLINE. Roles: reader(182.48.115.235), writer(182.48.115.234)
  db-slave(182.48.115.238) slave/ONLINE. Roles: reader(182.48.115.239)

db-master1 has gone from ONLINE to HARD_OFFLINE and lost the writer role. Because db-master2 is the standby master, it took over the writer role, and db-slave was repointed to the new master db-master2 — in effect db-slave located db-master2's current binlog position (what SHOW MASTER STATUS returns on db-master2) and then ran CHANGE MASTER TO against db-master2.

On db-master2 you can see that db-master1's write-service VIP has floated over:
[root@db-master2 mysql-mmm]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:1b:6e:53 brd ff:ff:ff:ff:ff:ff
    inet 182.48.115.237/27 brd 182.48.115.255 scope global eth0
    inet 182.48.115.235/32 scope global eth0
    inet 182.48.115.234/32 scope global eth0
    inet6 fe80::5054:ff:fe1b:6e53/64 scope link
       valid_lft forever preferred_lft forever

At this point, data updated in the db-master2 database replicates automatically to the db-slave database.
------------------------
Now restart mysql on db-master1:
[root@db-master1 ~]# /etc/init.d/mysql start
Starting MySQL.. SUCCESS!

Watch the monitor log on mmm-monit:
[root@mmm-monit ~]# tail -f /var/log/mysql-mmm/mmm_mond.log
.........
2017/06/01 21:52:14 FATAL State of host 'db-master1' changed from HARD_OFFLINE to AWAITING_RECOVERY

Check the cluster state again (the write-service VIP stays on db-master2):
[root@mmm-monit ~]# mmm_control show
  db-master1(182.48.115.236) master/ONLINE. Roles:
  db-master2(182.48.115.237) master/ONLINE. Roles: reader(182.48.115.235), writer(182.48.115.234)
  db-slave(182.48.115.238) slave/ONLINE. Roles: reader(182.48.115.239)

Although db-master1 has recovered and is back online in the cluster, the write-service VIP that used to be bound to it does not move back from db-master2 — that is, db-master1 does not retake the service after recovery. Only when db-master2 fails will the write VIP 182.48.115.234 move back to db-master1, with the read VIP 182.48.115.235 moving to db-slave at the same time (and once db-master2 recovers, it takes the 182.48.115.235 read VIP back from db-slave).
---------------------------------------------------------------------------------------------------
3) Next, simulate a db-slave outage by manually stopping its mysql service:
[root@db-slave ~]# /etc/init.d/mysql stop
Shutting down MySQL.. [OK]

Watch the monitor log on mmm-monit:
[root@mmm-monit ~]# tail -f /var/log/mysql-mmm/mmm_mond.log
.........
2017/06/01 22:42:24 FATAL State of host 'db-slave' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)

Check the latest cluster state:
[root@mmm-monit ~]# mmm_control show
  db-master1(182.48.115.236) master/ONLINE. Roles: writer(182.48.115.234)
  db-master2(182.48.115.237) master/ONLINE. Roles: reader(182.48.115.235), reader(182.48.115.239)
  db-slave(182.48.115.238) slave/HARD_OFFLINE. Roles:

After db-slave fails, its read-service VIP 182.48.115.239 moves to db-master2:
[root@db-master2 mysql-mmm]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:1b:6e:53 brd ff:ff:ff:ff:ff:ff
    inet 182.48.115.237/27 brd 182.48.115.255 scope global eth0
    inet 182.48.115.235/32 scope global eth0
    inet 182.48.115.239/32 scope global eth0
    inet6 fe80::5054:ff:fe1b:6e53/64 scope link

When db-slave recovers, its read VIP moves back again — db-slave retakes its service — and the data updated during the outage syncs back automatically.

Note: db-master1, db-master2 and db-slave form a one-master, two-slaves replication relationship. If db-master2 and db-slave are lagging behind db-master1 at the moment db-master1's mysql dies, db-slave waits until it has caught up with db-master1's data before repointing to the new master db-master2 (CHANGE MASTER TO db-master2). But if db-master2 itself lags behind db-master1 when the switch happens, db-master2 becomes writable anyway, and data consistency can no longer be guaranteed.
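Since that caveat hinges on replication lag, it helps to watch the lag explicitly before relying on a failover; a hedged sketch using plain MySQL (nothing MMM-specific), run on db-master2 and db-slave:

[root@db-slave ~]# mysql -e "SHOW SLAVE STATUS\G" | egrep 'Master_Host|Seconds_Behind_Master'
                  Master_Host: 182.48.115.236
        Seconds_Behind_Master: 0    //0 means caught up; a large value or NULL means lag or a broken SQL thread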
Summary: MMM is not suitable for environments with strict data-consistency requirements, but it does fully deliver high availability.