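An NFS service (for example, one backing an image store) is often deployed as a primary/standby pair, with data pushed from the primary to the standby by rsync (optionally combined with inotify for near-real-time sync). NFS itself remains a single point of failure, so to protect both service availability and data safety, a "DRBD+NFS+Keepalived" architecture can be used for a high-availability deployment. DRBD and its configuration were covered in detail in an earlier post; building on the machines configured there, the deployment steps are recorded below:
Approach:
1) Install keepalived on both machines; the VIP is 192.168.1.200.
2) Use the DRBD mount point /data as the NFS export directory. Remote clients mount NFS through the VIP.
3) When the Primary host goes down or its NFS service fails, the Secondary host is promoted to DRBD primary and the VIP moves over to it.
When the Primary host recovers, it becomes the DRBD primary again and takes the VIP back. This is how failover is achieved.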
-----------------------------------------------------------------------------------------------------------
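For the DRBD setup on the Primary and Secondary hosts, see http://www.cnblogs.com/kevingrace/p/5740940.html
The Primary host (192.168.1.151) is the DRBD primary node by default; the DRBD mount point is /data
The Secondary host (192.168.1.152) is the DRBD standby node
Check the DRBD status on the Primary host; the output below shows that it is the DRBD primary node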
[root@Primary ~]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
m:res cs ro ds p mounted fstype
0:r0 Connected Primary/Secondary UpToDate/UpToDate C /data ext4
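The role column in that status line can also be read by a script, which is handy when checking a node before or after a switchover. The following is a minimal sketch, assuming the DRBD 8.3 status layout shown above; the function name drbd_local_role is hypothetical and not part of DRBD.

```shell
# Print the local DRBD role ("Primary" or "Secondary") for one resource.
# Reads status text on stdin and assumes the DRBD 8.3 layout shown above:
#   <minor>:<resource> <connection-state> <local-role>/<peer-role> ...
# Returns non-zero if the resource does not appear in the input.
drbd_local_role() {
    local res="$1"
    awk -v res="$res" '
        { split($1, id, ":") }                  # first field is "<minor>:<resource>"
        id[2] == res {
            split($3, ro, "/")                  # third field is "<local>/<peer>"
            print ro[1]
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}
```

On a node it could be fed from the init script, for example: /etc/init.d/drbd status | drbd_local_role r0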
As shown below, the DRBD device is mounted on /data
[root@Primary ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
156G 36G 112G 25% /
tmpfs 2.9G 0 2.9G 0% /dev/shm
/dev/vda1 190M 98M 83M 55% /boot
/dev/drbd0 9.8G 23M 9.2G 1% /data
The data on the DRBD device:
[root@Primary ~]# cd /data
[root@Primary data]# ll
total 16
-rw-r--r--. 1 root root 9 May 25 09:33 test3
-rw-r--r--. 1 root root 5 May 25 09:34 wangshibo
-rw-r--r--. 1 root root 5 May 25 09:34 wangshibo1
-rw-r--r--. 1 root root 5 May 25 09:34 wangshibo2
-----------------------------------------------------------------------------------------------------------
Install NFS on both the Primary and Secondary hosts (see also: http://www.cnblogs.com/kevingrace/p/6084604.html)
[root@Primary ~]# yum install rpcbind nfs-utils
[root@Primary ~]# vim /etc/exports
/data 192.168.1.0/24(rw,sync,no_root_squash)
[root@Primary ~]# /etc/init.d/rpcbind start
[root@Primary ~]# /etc/init.d/nfs start
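Because either node may end up serving NFS, both need the same export entry. The sketch below only checks that an exports file contains an entry for a given directory; the helper name has_export is hypothetical, and it does not validate the client list or options.

```shell
# Succeed if the given exports file has an entry whose first field is the directory.
# Comment lines never match because their first field starts with "#".
has_export() {
    local file="$1" dir="$2"
    awk -v dir="$dir" '$1 == dir { found = 1 } END { exit !found }' "$file"
}
```

Running has_export /etc/exports /data on each host gives a quick yes/no answer before keepalived is configured.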
---------------------------------------------------------------------------------------------------------
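Turn off the iptables firewall on both hosts
It is simplest to turn the firewall off; otherwise NFS mounts from the clients may fail!
If the firewall stays on, open the NFS-related ports and allow the VRRP multicast address in iptables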
[root@Primary ~]# /etc/init.d/iptables stop
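If the firewall has to stay on, the rules could look roughly like the output of the sketch below. It only prints iptables commands and does not run them. VRRP is IP protocol 112 and keepalived advertises to the multicast address 224.0.0.18 by default. The function name print_ha_fw_rules is hypothetical, and the port list is an assumption: 111 (rpcbind) and 2049 (nfs) are the standard ports, but mountd and the other NFSv3 helper daemons use dynamic ports unless they are pinned in /etc/sysconfig/nfs, so any pinned ports would have to be added to the list.

```shell
# Print (do not run) iptables rules for VRRP plus a list of NFS-related ports.
# Usage: print_ha_fw_rules <client-subnet> <port>...
print_ha_fw_rules() {
    local subnet="$1"; shift
    local port
    # VRRP advertisements: IP protocol 112, multicast group 224.0.0.18
    echo "iptables -I INPUT -p 112 -d 224.0.0.18 -j ACCEPT"
    for port in "$@"; do
        # NFS-related services may be reached over both TCP and UDP
        echo "iptables -I INPUT -s $subnet -p tcp --dport $port -j ACCEPT"
        echo "iptables -I INPUT -s $subnet -p udp --dport $port -j ACCEPT"
    done
}
```

After reviewing the printed rules, they can be applied as root by piping the output to sh, for example: print_ha_fw_rules 192.168.1.0/24 111 2049 | sh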
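SELinux must not be enforcing on either machine!
Otherwise notify_master.sh and the other scripts configured in keepalived.conf below fail to run. This is a pitfall I have hit before.
[root@Primary ~]# setenforce 0 #temporary: switches to permissive mode until the next reboot. To make it permanent, also set SELINUX to disabled in /etc/sysconfig/selinux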
[root@Primary ~]# getenforce
Permissive
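The permanent change can be scripted as well. This is a minimal sketch assuming GNU sed; the function name set_selinux_mode is hypothetical. On CentOS 6, /etc/sysconfig/selinux is a symlink to /etc/selinux/config, and sed -i replaces a symlink with a regular file, so the sketch is meant to be pointed at /etc/selinux/config directly.

```shell
# Rewrite the SELINUX= line of an SELinux config file in place.
# The pattern requires "=" right after SELINUX, so SELINUXTYPE= is left alone.
# Usage: set_selinux_mode /etc/selinux/config disabled
set_selinux_mode() {
    local file="$1" mode="$2"
    sed -i "s/^SELINUX=.*/SELINUX=$mode/" "$file"
}
```

The new mode takes effect only after a reboot; setenforce 0 covers the time until then.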
-----------------------------------------------------------------------------------------------------------
Install Keepalived on both hosts; it provides the automatic fail-over
Install Keepalived
[root@Primary ~]# yum install -y openssl-devel popt-devel
[root@Primary ~]# cd /usr/local/src/
[root@Primary src]# wget http://www.keepalived.org/software/keepalived-1.3.5.tar.gz
[root@Primary src]# tar -zvxf keepalived-1.3.5.tar.gz
[root@Primary src]# cd keepalived-1.3.5
[root@Primary keepalived-1.3.5]# ./configure --prefix=/usr/local/keepalived
[root@Primary keepalived-1.3.5]# make && make install
[root@Primary keepalived-1.3.5]# cp /usr/local/src/keepalived-1.3.5/keepalived/etc/init.d/keepalived /etc/rc.d/init.d/
[root@Primary keepalived-1.3.5]# cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
[root@Primary keepalived-1.3.5]# mkdir /etc/keepalived/
[root@Primary keepalived-1.3.5]# cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
[root@Primary keepalived-1.3.5]# cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
[root@Primary keepalived-1.3.5]# echo "/etc/init.d/keepalived start" >> /etc/rc.local
[root@Primary keepalived-1.3.5]# chmod +x /etc/rc.d/init.d/keepalived #make the init script executable
[root@Primary keepalived-1.3.5]# chkconfig keepalived on #start at boot
[root@Primary keepalived-1.3.5]# service keepalived start #start
[root@Primary keepalived-1.3.5]# service keepalived stop #stop
[root@Primary keepalived-1.3.5]# service keepalived restart #restart
-----------keepalived.conf on the Primary host
[root@Primary ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf-bak
[root@Primary ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id DRBD_HA_MASTER
}
vrrp_script chk_nfs {
script "/etc/keepalived/check_nfs.sh"
interval 5
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
track_script {
chk_nfs
}
notify_stop /etc/keepalived/notify_stop.sh
notify_master /etc/keepalived/notify_master.sh
virtual_ipaddress {
192.168.1.200
}
}
Start the keepalived service
[root@Primary data]# /etc/init.d/keepalived start
Starting keepalived: [ OK ]
[root@Primary data]# ps -ef|grep keepalived
root 30937 1 0 11:49 ? 00:00:00 keepalived -D
root 30939 30937 0 11:49 ? 00:00:00 keepalived -D
root 30940 30937 0 11:49 ? 00:00:00 keepalived -D
root 31123 10364 0 11:50 pts/1 00:00:00 grep --color keepalived
Check the VIP
[root@Primary data]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether fa:16:3e:35:d1:d6 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.151/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.200/32 scope global eth0
inet6 fe80::f816:3eff:fe35:d1d6/64 scope link
valid_lft forever preferred_lft forever
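Spotting the VIP by eye in that output is easy to get wrong during a switchover, so a small check can help. The sketch below reads ip addr output on stdin and succeeds only if the given address is configured; the function name has_vip is hypothetical.

```shell
# Succeed if the given IPv4 address appears on an "inet" line of `ip addr` output (stdin).
# The prefix length (for example /32) is stripped before comparing.
has_vip() {
    local vip="$1"
    awk -v vip="$vip" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }
    '
}
```

For example: ip addr show eth0 | has_vip 192.168.1.200 && echo "VIP is on this node"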
-----------keepalived.conf on the Secondary host
[root@Secondary ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf-bak
[root@Secondary ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id DRBD_HA_BACKUP
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
# run this script when this machine becomes the keepalived master
notify_master /etc/keepalived/notify_master.sh
# run this script when this machine becomes the keepalived backup
notify_backup /etc/keepalived/notify_backup.sh
virtual_ipaddress {
192.168.1.200
}
}
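For the two nodes to form one VRRP group, virtual_router_id, the authentication settings and the VIP must match on both sides, while priority must differ (100 on the Primary and 90 on the Secondary here). A small helper makes it easy to compare the two files; this is a sketch, and the function name conf_value is hypothetical.

```shell
# Print the first value of a simple "key value" directive in a keepalived.conf.
# Usage: conf_value /etc/keepalived/keepalived.conf virtual_router_id
conf_value() {
    local file="$1" key="$2"
    awk -v key="$key" '$1 == key { print $2; exit }' "$file"
}
```

Running it for virtual_router_id and priority on each host and comparing the results catches copy-and-paste mistakes before the first failover test.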
Start the keepalived service
[root@Secondary ~]# /etc/init.d/keepalived start
Starting keepalived: [ OK ]
[root@Secondary ~]# ps -ef|grep keepalived
root 17128 1 0 11:50 ? 00:00:00 keepalived -D
root 17129 17128 0 11:50 ? 00:00:00 keepalived -D
root 17131 17128 0 11:50 ? 00:00:00 keepalived -D
root 17219 29939 0 11:50 pts/1 00:00:00 grep --color keepalived
-------------The four scripts---------------
1) This script is configured on the Primary machine only
[root@Primary ~]# vim /etc/keepalived/check_nfs.sh
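#!/bin/sh
### Check NFS availability via the service status
/sbin/service nfs status >/dev/null 2>&1
if [ $? -ne 0 ]; then
    ### The service is unhealthy: try restarting it first
    /sbin/service nfs restart
    /sbin/service nfs status >/dev/null 2>&1
    if [ $? -ne 0 ]; then
        ### NFS is still unhealthy after the restart
        ### Unmount the DRBD device
        umount /dev/drbd0
        ### Demote this node from DRBD primary to secondary
        drbdadm secondary r0
        ### Stop keepalived so that the VIP moves to the peer
        /sbin/service keepalived stop
    fi
fi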
[root@Primary ~]# chmod 755 /etc/keepalived/check_nfs.sh
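The decision made by check_nfs.sh comes down to two exit codes from service nfs status: one before and one after the restart attempt. The sketch below restates that decision as a pure function so the three outcomes are explicit; the function name nfs_check_action and the outcome labels are hypothetical and only for illustration.

```shell
# Map the two `service nfs status` exit codes to the action check_nfs.sh takes.
#   $1: status before the restart attempt
#   $2: status after the restart attempt (ignored when $1 is 0)
nfs_check_action() {
    local before="$1" after_restart="$2"
    if [ "$before" -eq 0 ]; then
        echo healthy        # nothing to do
    elif [ "$after_restart" -eq 0 ]; then
        echo recovered      # the restart fixed it, no failover
    else
        echo failover       # umount, drbdadm secondary, stop keepalived
    fi
}
```

Only the last case hands the service to the Secondary host; a successful restart keeps everything on the Primary.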
2) This script is configured on the Primary machine only
[root@Primary ~]# mkdir /etc/keepalived/logs
[root@Primary ~]# vim /etc/keepalived/notify_stop.sh
#!/bin/bash
time=`date "+%F %H:%M:%S"`
echo -e "$time ------notify_stop------\n" >> /etc/keepalived/logs/notify_stop.log
/sbin/service nfs stop &>> /etc/keepalived/logs/notify_stop.log
/bin/umount /data &>> /etc/keepalived/logs/notify_stop.log
/sbin/drbdadm secondary r0 &>> /etc/keepalived/logs/notify_stop.log
echo -e "\n" >> /etc/keepalived/logs/notify_stop.log
[root@Primary ~]# chmod 755 /etc/keepalived/notify_stop.sh
3) This script is configured on both machines
[root@Primary ~]# vim /etc/keepalived/notify_master.sh
#!/bin/bash
time=`date "+%F %H:%M:%S"`
echo -e "$time ------notify_master------\n" >> /etc/keepalived/logs/notify_master.log
/sbin/drbdadm primary r0 &>> /etc/keepalived/logs/notify_master.log
/bin/mount /dev/drbd0 /data &>> /etc/keepalived/logs/notify_master.log
/sbin/service nfs restart &>> /etc/keepalived/logs/notify_master.log
echo -e "\n" >> /etc/keepalived/logs/notify_master.log
[root@Primary ~]# chmod 755 /etc/keepalived/notify_master.sh
4) This script is configured on the Secondary machine only
[root@Secondary ~]# mkdir /etc/keepalived/logs
[root@Secondary ~]# vim /etc/keepalived/notify_backup.sh
#!/bin/bash
time=`date "+%F %H:%M:%S"`
echo -e "$time ------notify_backup------\n" >> /etc/keepalived/logs/notify_backup.log
/sbin/service nfs stop &>> /etc/keepalived/logs/notify_backup.log
/bin/umount /dev/drbd0 &>> /etc/keepalived/logs/notify_backup.log
/sbin/drbdadm secondary r0 &>> /etc/keepalived/logs/notify_backup.log
echo -e "\n" >> /etc/keepalived/logs/notify_backup.log
[root@Secondary ~]# chmod 755 /etc/keepalived/notify_backup.sh
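The three notify scripts repeat the same pattern: write a timestamp, then run a command with its output appended to a log. They do not record whether each step succeeded, which makes a failed umount or drbdadm call hard to spot afterwards. One possible refinement is a small wrapper that also logs the exit status. This is a sketch, not the scripts used above, and the function name run_logged is hypothetical.

```shell
# Run a command, appending a timestamped header, its output and its exit status to a log.
# Usage: run_logged <logfile> <command> [args...]
run_logged() {
    local log="$1"; shift
    echo "$(date '+%F %H:%M:%S') running: $*" >> "$log"
    "$@" >> "$log" 2>&1
    local rc=$?              # exit status of the wrapped command
    echo "exit status: $rc" >> "$log"
    return "$rc"
}
```

Inside notify_master.sh, for instance, a line such as run_logged /etc/keepalived/logs/notify_master.log /sbin/drbdadm primary r0 would leave a record of whether the promotion worked.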
-----------------------------------------------------------------------------------------------------------
Mount NFS on a remote client
The client needs the rpcbind and nfs-utils packages; make sure the rpcbind service is running
[root@huanqiu ~]# yum install rpcbind nfs-utils
[root@huanqiu ~]# /etc/init.d/rpcbind start
Mount NFS
[root@huanqiu ~]# mount -t nfs 192.168.1.200:/data /web
As shown below, the NFS share is mounted
[root@huanqiu ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
107G 15G 87G 14% /
tmpfs 2.9G 0 2.9G 0% /dev/shm
/dev/vda1 190M 67M 113M 38% /boot
192.168.1.200:/data 9.8G 23M 9.2G 1% /web
[root@huanqiu ~]# cd /web/
[root@huanqiu web]# ll
total 16
-rw-r--r--. 1 root root 9 May 25 09:33 test3
-rw-r--r--. 1 root root 5 May 25 09:34 wangshibo
-rw-r--r--. 1 root root 5 May 25 09:34 wangshibo1
-rw-r--r--. 1 root root 5 May 25 09:34 wangshibo2
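For failover to be transparent, the client must mount through the VIP and not through either node's own address. The sketch below looks up which server a mount point is using by reading a mounts table in the /proc/mounts format; the function name nfs_source_for is hypothetical.

```shell
# Print the NFS source (server:/path) mounted on a mount point.
# Reads a table in /proc/mounts format; fails if the mount point is not an NFS mount.
# Usage: nfs_source_for /web [mounts-file]
nfs_source_for() {
    local mountpoint="$1" mounts="${2:-/proc/mounts}"
    awk -v mp="$mountpoint" '
        $2 == mp && $3 ~ /^nfs/ { print $1; found = 1 }
        END { exit !found }
    ' "$mounts"
}
```

On the client above, nfs_source_for /web should print 192.168.1.200:/data.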
-----------------------------------------------------------------------------------------------------------
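Next, test automatic fail-over:
1)
First stop the keepalived service on the Primary host. The VIP moves to the Secondary host.
At the same time NFS on the Primary host is stopped (by notify_stop.sh) and the Secondary host is promoted to DRBD primary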
[root@Primary ~]# /etc/init.d/keepalived stop
Stopping keepalived: [ OK ]
[root@Primary ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether fa:16:3e:35:d1:d6 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.151/24 brd 192.168.1.255 scope global eth0
inet6 fe80::f816:3eff:fe35:d1d6/64 scope link
valid_lft forever preferred_lft forever
The system log also shows the VIP being released
[root@Primary ~]# tail -1000 /var/log/messages
........
May 25 11:50:03 localhost Keepalived_vrrp[30940]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:50:03 localhost Keepalived_vrrp[30940]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:50:03 localhost Keepalived_vrrp[30940]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:50:03 localhost Keepalived_vrrp[30940]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:51 localhost Keepalived[30937]: Stopping
May 25 11:58:51 localhost Keepalived_vrrp[30940]: VRRP_Instance(VI_1) sent 0 priority
May 25 11:58:51 localhost Keepalived_vrrp[30940]: VRRP_Instance(VI_1) removing protocol VIPs.
[root@Primary ~]# ps -ef|grep nfs
root 588 10364 0 12:13 pts/1 00:00:00 grep --color nfs
[root@Primary ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
156G 36G 112G 25% /
tmpfs 2.9G 0 2.9G 0% /dev/shm
/dev/vda1 190M 98M 83M 55% /boot
[root@Primary ~]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
m:res cs ro ds p mounted fstype
0:r0 Connected Secondary/Secondary UpToDate/UpToDate C
Log in to the Secondary machine: the VIP has moved over
[root@Secondary ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether fa:16:3e:4c:7e:88 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.152/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.200/32 scope global eth0
inet6 fe80::f816:3eff:fe4c:7e88/64 scope link
valid_lft forever preferred_lft forever
[root@Secondary ~]# tail -1000 /var/log/messages
........
May 25 11:58:53 localhost Keepalived_vrrp[17131]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:53 localhost Keepalived_vrrp[17131]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:53 localhost Keepalived_vrrp[17131]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:53 localhost Keepalived_vrrp[17131]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:58 localhost Keepalived_vrrp[17131]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:58 localhost Keepalived_vrrp[17131]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.1.200
[root@Secondary ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
156G 13G 135G 9% /
tmpfs 2.9G 0 2.9G 0% /dev/shm
/dev/vda1 190M 89M 92M 50% /boot
/dev/drbd0 9.8G 23M 9.2G 1% /data
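Once keepalived is started again on the Primary machine, it takes the VIP back (see the system log /var/log/messages)
and the Primary becomes the DRBD primary node again
2)
Stop the NFS service on the Primary host. The check script first tries to start NFS again; only if that restart fails does it demote the node from DRBD primary to secondary and stop keepalived.
Failover then proceeds exactly as in the flow above
Conclusion:
During the primary/standby switchover above, the NFS mount on the client stays usable; there is only a short delay.
This also demonstrates the data consistency that DRBD provides (including files that are open or being modified). From the client's point of view, the whole switchover looks like a single NFS restart (NFS stops on the primary and starts on the standby).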
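The length of that delay can be measured on the client rather than estimated. One approach, sketched below under stated assumptions, is to record a timestamp after each successful access to the NFS mount and then look for the largest gap between consecutive timestamps. The function name max_gap, the probe file /web/.probe and the log path /tmp/probe.log are all hypothetical.

```shell
# Print the largest gap, in seconds, between consecutive epoch timestamps on stdin.
# Prints 0 when there are fewer than two timestamps.
#
# Possible way to collect the input on the client during a failover test:
#   while touch /web/.probe; do date +%s; sleep 1; done > /tmp/probe.log
max_gap() {
    awk '
        NR > 1 { gap = $1 - prev; if (gap > max) max = gap }
        { prev = $1 }
        END { print max + 0 }
    '
}
```

With the default hard mount, the touch blocks while the VIP and the DRBD role move, so max_gap < /tmp/probe.log gives a rough figure for how long clients were stalled.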
