Redis Sentinel High-Availability Architecture


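There are more and more high-availability architectures for Redis, which shows how quickly Redis has developed. Many companies now run Redis, so Redis high availability matters a great deal. Current options include keepalived+redis, redis cluster, twemproxy, and codis. Below we focus on the Redis Sentinel high-availability architecture.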

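Redis Sentinel provides the following main functions:

  • Monitoring: it periodically checks whether the redis instances are running as expected;

  • Notification: if a redis node runs into trouble, it can notify another process (for example, its clients);

  • Automatic failover: when a master node becomes unavailable, it promotes one of the master's slaves (if there is more than one) to be the new master, and the other slave nodes are reconfigured to follow the address of the promoted slave.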

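Sentinel is a monitor: it decides which action to take based on the role and state of the instances it watches. How does a Sentinel discover other Sentinels? Through its command connection, a Sentinel sends HELLO messages to the monitored master and slave servers. Each message contains the Sentinel's IP, port, ID, and so on, and announces the Sentinel's existence to the others. At the same time, the Sentinel receives the HELLO messages of other Sentinels through its subscription connection, which is how it discovers the other Sentinels monitoring the same master.

Sentinels create command connections to one another for communication. Because the master and slave servers already act as intermediaries for sending and receiving HELLO messages, Sentinels do not create subscription connections to each other.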
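
To make the HELLO exchange more concrete, here is a minimal Python sketch that parses one such message. It assumes the payload is the comma-separated string that Sentinel publishes on the `__sentinel__:hello` channel, with fields in the order shown in the code. The sample string is built by hand from this article's addresses and is not a captured message; the `Hello` tuple and `parse_hello` function are illustrative names only.

```python
from collections import namedtuple

# Assumed field order of a payload published on __sentinel__:hello.
Hello = namedtuple(
    "Hello",
    "sentinel_ip sentinel_port sentinel_runid current_epoch "
    "master_name master_ip master_port master_config_epoch",
)


def parse_hello(payload):
    """Split one comma-separated hello payload into named fields."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError("expected 8 fields, got %d" % len(parts))
    ip, port, runid, epoch, name, m_ip, m_port, m_epoch = parts
    return Hello(ip, int(port), runid, int(epoch),
                 name, m_ip, int(m_port), int(m_epoch))


# Hand-written sample, not captured from a real Sentinel.
sample = ("192.168.10.128,26379,21e629e6d2b26682e660258787d5fb995010e6c8,0,"
          "master-6379,192.168.10.131,6379,0")
hello = parse_hello(sample)
print(hello.sentinel_ip, hello.sentinel_port, hello.master_name)
```

A Sentinel that receives such a message learns both who the sender is and which master the sender believes is current.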

 

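The Redis Sentinel architecture diagram goes here [diagram not included]. The number of Sentinel nodes should preferably be odd; for the reasons, see the following references: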

http://segmentfault.com/a/1190000002680804

http://segmentfault.com/a/1190000002685515
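
As a rough illustration of the odd-number advice: a failover has to be authorized by a majority of the Sentinels, so an even count adds a node without adding fault tolerance. The short Python sketch below only does that arithmetic; the function names are made up for this example.

```python
def majority(sentinels):
    """Smallest number of Sentinels that forms a strict majority."""
    return sentinels // 2 + 1


def tolerated_failures(sentinels):
    """How many Sentinels may fail while a majority is still reachable."""
    return sentinels - majority(sentinels)


for n in (2, 3, 4, 5):
    print("sentinels=%d majority=%d tolerated_failures=%d"
          % (n, majority(n), tolerated_failures(n)))
```

With 3 Sentinels one may fail; with 4 still only one may fail, which is why 3 or 5 is the usual choice.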

 

Next we deploy and test Redis Sentinel. This experiment uses redis-3.0.7 in the following environment:

 192.168.10.128  Sentinel_1
 192.168.10.129  Sentinel_2
 192.168.10.130  Sentinel_3
 192.168.10.131  Redis_Master
 192.168.10.132  Redis_Slave

I. Install redis-3.0.7 on each of the five servers, using the Sentinel_1 server as the example:

[root@Sentinel_1 ~]# wget http://download.redis.io/releases/redis-3.0.7.tar.gz
[root@Sentinel_1 ~]# tar xf redis-3.0.7.tar.gz 
[root@Sentinel_1 ~]# cd redis-3.0.7/src/
[root@Sentinel_1 src]# make PREFIX=/data/service/redis install

After the installation finishes, a bin directory is created under /data/service/redis:

[root@Sentinel_1 ~]# ll /data/service/redis/
total 12
drwxr-xr-x. 2 root root 4096 Mar  7 19:19 bin
[root@Sentinel_1 ~]# 

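Optionally, add the redis bin directory to PATH on all five servers so the commands are easier to use. Edit /etc/profile.d/redis.sh (vim /etc/profile.d/redis.sh) and add the following: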

export PATH=/data/service/redis/bin:$PATH

Run source /etc/profile.d/redis.sh to make the environment variable take effect:

[root@Sentinel_1 ~]# source /etc/profile.d/redis.sh

 

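II. Set up the Redis master/slave environment. Deploying master/slave replication is simple, so the setup steps are not shown here. Redis_Master: 192.168.10.131  Redis_Slave: 192.168.10.132

Startup log on Redis_Master: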

[root@Redis_Master redis]# tail -f logs/redis_6379.log 
1974:M 07 Mar 22:03:05.381 * DB loaded from disk: 0.001 seconds
1974:M 07 Mar 22:03:05.381 * The server is now ready to accept connections on port 6379
1974:M 07 Mar 22:03:44.592 * Slave 192.168.10.132:6379 asks for synchronization
1974:M 07 Mar 22:03:44.593 * Full resync requested by slave 192.168.10.132:6379
1974:M 07 Mar 22:03:44.593 * Starting BGSAVE for SYNC with target: disk
1974:M 07 Mar 22:03:44.594 * Background saving started by pid 1977
1977:C 07 Mar 22:03:44.632 * DB saved on disk
1977:C 07 Mar 22:03:44.632 * RDB: 4 MB of memory used by copy-on-write
1974:M 07 Mar 22:03:44.649 * Background saving terminated with success
1974:M 07 Mar 22:03:44.650 * Synchronization with slave 192.168.10.132:6379 succeeded

Startup log on Redis_Slave:

[root@Redis_Slave redis]# tail -f logs/redis_6379.log 
2437:S 07 Mar 22:03:44.246 * Connecting to MASTER 192.168.10.131:6379
2437:S 07 Mar 22:03:44.247 * MASTER <-> SLAVE sync started
2437:S 07 Mar 22:03:44.262 * Non blocking connect for SYNC fired the event.
2437:S 07 Mar 22:03:44.268 * Master replied to PING, replication can continue...
2437:S 07 Mar 22:03:44.269 * Partial resynchronization not possible (no cached master)
2437:S 07 Mar 22:03:44.270 * Full resync from master: 5d1fbf46ddd1eb0a7728abbbad61e78908dd7963:1
2437:S 07 Mar 22:03:44.326 * MASTER <-> SLAVE sync: receiving 34 bytes from master
2437:S 07 Mar 22:03:44.326 * MASTER <-> SLAVE sync: Flushing old data
2437:S 07 Mar 22:03:44.328 * MASTER <-> SLAVE sync: Loading DB in memory
2437:S 07 Mar 22:03:44.329 * MASTER <-> SLAVE sync: Finished with success

The logs show that master/slave replication is working normally.

 

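III. Configure Sentinel, with an explanation of the configuration file.

On the three Sentinel servers, create a conf directory and a log directory to hold the configuration file and the log: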

[root@Sentinel_1 ~]# mkdir -p /data/service/redis/sentinel/conf

[root@Sentinel_1 ~]# mkdir -p /data/service/redis/sentinel/log

Go into the conf directory and edit the file 26379.conf. The configuration is identical on all three Sentinel servers:

[root@Sentinel_1 conf]# pwd
/data/service/redis/sentinel/conf
[root@Sentinel_1 conf]# cat 26379.conf 
port 26379
dir "/data/service/redis/sentinel"
daemonize yes
logfile "/data/service/redis/sentinel/log/sentinel.log"

# 6379
sentinel monitor master-6379 192.168.10.131 6379 2
sentinel down-after-milliseconds master-6379 15000
sentinel parallel-syncs master-6379 1
sentinel failover-timeout master-6379 180000
sentinel auth-pass master-6379 123456
sentinel client-reconfig-script master-6379 /data/script/python/notify.py
[root@Sentinel_1 conf]# 
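
As a quick way to double-check what a file like this declares, the following Python sketch reads the `sentinel ...` directives into a dictionary keyed by master name. It is only a reading aid under simple assumptions (whitespace-separated tokens, no quoting), not how Sentinel itself parses its configuration; `parse_sentinel_conf` is a hypothetical helper name.

```python
def parse_sentinel_conf(text):
    """Collect 'sentinel <directive> <master-name> <args...>' lines per master."""
    masters = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        tokens = line.split()
        if tokens[0] != "sentinel" or len(tokens) < 4:
            continue  # port, dir, daemonize, logfile, ...
        directive, name, args = tokens[1], tokens[2], tokens[3:]
        entry = masters.setdefault(name, {})
        if directive == "monitor" and len(args) >= 3:
            entry["ip"] = args[0]
            entry["port"] = int(args[1])
            entry["quorum"] = int(args[2])
        else:
            entry[directive] = args[0]
    return masters


CONF = """\
port 26379
# 6379
sentinel monitor master-6379 192.168.10.131 6379 2
sentinel down-after-milliseconds master-6379 15000
sentinel parallel-syncs master-6379 1
sentinel failover-timeout master-6379 180000
"""

conf = parse_sentinel_conf(CONF)
print(conf["master-6379"])
```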

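Explanation of the 26379.conf configuration file:
1. The first four lines define basic Sentinel settings (port, working directory, daemonize, log file). They are very similar to a regular redis configuration, so they are not explained further.

2. sentinel monitor master-6379 192.168.10.131 6379 2 (this line says that the master monitored by Sentinel is named master-6379 and its address is 192.168.10.131:6379. The final 2 means that the master is only truly considered unavailable once 2 Sentinels in the cluster agree that it is down.)

3. down-after-milliseconds (Sentinel sends heartbeat PINGs to the master to check whether it is alive. If the master does not answer with PONG within "a certain time window", or replies with an error, this Sentinel subjectively (unilaterally) considers the master unavailable. down-after-milliseconds specifies that time window, in milliseconds.)

4. parallel-syncs (during a failover, this option sets the maximum number of slaves that may synchronize with the new master at the same time. The smaller the number, the longer the failover takes to complete; the larger the number, the more slaves are unavailable at once because of replication. Setting the value to 1 ensures that only one slave at a time is unable to serve command requests.)

5. failover-timeout (all Sentinels follow one rule: if Sentinel A votes for Sentinel B to perform the failover, A waits for a period of time before trying to fail over the same master itself. That waiting period is controlled by failover-timeout. It follows that Sentinels in the cluster do not fail over the same master concurrently: if the first Sentinel to attempt the failover does not succeed, another one retries after the wait, and so on.)

6. auth-pass (this option applies when the redis master/slave setup uses password authentication. If no password was set when configuring replication, the option is not needed; if there is a password, specify it here.)

7. client-reconfig-script (this parameter defines the failover script, which runs after a master failover to do things such as sending an SMS or switching an IP.)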
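
The interplay of down-after-milliseconds and the quorum can be sketched as a toy model in Python. This is a simplification for illustration, not Sentinel's real implementation: each Sentinel marks the master subjectively down (sdown) on its own, and the master counts as objectively down (odown) once the number of such reports reaches the quorum. The timestamps below are invented.

```python
def is_subjectively_down(last_valid_reply_ms, now_ms, down_after_ms):
    """One Sentinel's private view: no valid reply for longer than the limit."""
    return (now_ms - last_valid_reply_ms) > down_after_ms


def is_objectively_down(sdown_flags, quorum):
    """Cluster view: at least `quorum` Sentinels report the master as down."""
    return sum(1 for flag in sdown_flags if flag) >= quorum


DOWN_AFTER_MS = 15000  # sentinel down-after-milliseconds master-6379 15000
QUORUM = 2             # last argument of "sentinel monitor"

# Hypothetical timestamps (ms) of the last valid PING reply seen by each Sentinel.
last_reply = {"Sentinel_1": 1000, "Sentinel_2": 1200, "Sentinel_3": 9000}
now = 16500

flags = [is_subjectively_down(t, now, DOWN_AFTER_MS) for t in last_reply.values()]
print(flags, is_objectively_down(flags, QUORUM))
```

Here two of the three Sentinels have waited longer than 15 seconds, which meets the quorum of 2 even though the third has not yet given up on the master.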

 

The notify.py script, which sends an e-mail after a failover, is based on this blog post: http://www.cnblogs.com/gomysql/p/5040847.html

#!/usr/bin/python
#coding:utf8

import sys
import time
import smtplib
import logging
from email.mime.text import MIMEText
from email.header import Header


alarm_mail =['1111111111@qq.com']

def main():
  
    failover_time=time.strftime("%Y-%m-%d %H:%M:%S")

    logging.basicConfig(level=logging.DEBUG,
                format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                datefmt='%Y-%m-%d %H:%M:%S',
                filename='/data/service/redis/failover.log',
                filemode='a')

    console = logging.StreamHandler()
    console.setLevel(logging.INFO)
    formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
    console.setFormatter(formatter)
    logging.getLogger('').addHandler(console)

    mail_host='smtp.163.com'
    mail_port=25
    mail_user=''
    mail_pass=''
    mail_send_from = ''

    def send_mail(to_list,sub,content):
        me=mail_send_from
        msg = MIMEText(content, _subtype='html', _charset='utf-8')
        msg['Subject'] = Header(sub,'utf-8')
        msg['From'] = Header(me,'utf-8')
        msg['To'] = ", ".join(to_list)
        try:
            smtp = smtplib.SMTP()
            smtp.connect(mail_host,mail_port)
            smtp.login(mail_user,mail_pass)
            smtp.sendmail(me,to_list, msg.as_string())
            smtp.close()
            return True
        except Exception as error:
            logging.error("Failed to send mail: %s" % (error))
            return False

    try:
        master_name = sys.argv[1]
        role = sys.argv[2]
        from_ip = sys.argv[4]
        from_port = sys.argv[5]
        to_ip = sys.argv[6]
        to_port = sys.argv[7]
    except Exception as error:
        logging.error('Failed to read arguments from Sentinel: %s ' % (error))
        sys.exit(1)

    sub='redis %s failover' % (master_name)
    notify_message = "%s %s failover finished. sentinel found redis master %s:%s down. failed over to slave %s:%s" % (failover_time,master_name,from_ip,from_port,to_ip,to_port)
    
    # Only the Sentinel that led the failover logs and sends the alert
    if role == 'leader':
        logging.info(notify_message)
        send_mail(alarm_mail,sub,notify_message)

if __name__ == "__main__":
    main()
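
To try the argument handling of such a script without triggering a real failover, the positional arguments can be parsed by a small helper and fed a hand-made argument list. The sketch below assumes the order `<master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>`, which matches the indexes used in notify.py. The `ReconfigEvent` type, the `parse_reconfig_args` helper, and the sample values (in particular the state string) are made up for illustration.

```python
from collections import namedtuple

ReconfigEvent = namedtuple(
    "ReconfigEvent",
    "master_name role state from_ip from_port to_ip to_port",
)


def parse_reconfig_args(argv):
    """Turn the script's argv (including argv[0]) into a named record."""
    if len(argv) != 8:
        raise ValueError("expected 7 arguments, got %d" % (len(argv) - 1))
    name, role, state, from_ip, from_port, to_ip, to_port = argv[1:8]
    return ReconfigEvent(name, role, state,
                         from_ip, int(from_port), to_ip, int(to_port))


# Hand-made argument list; the state value is an assumed placeholder.
sample_argv = ["notify.py", "master-6379", "leader", "start",
               "192.168.10.131", "6379", "192.168.10.132", "6379"]
event = parse_reconfig_args(sample_argv)
print("%s: %s:%d -> %s:%d (role=%s)" % (
    event.master_name, event.from_ip, event.from_port,
    event.to_ip, event.to_port, event.role))
```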

 

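IV. Start the Sentinel service. There are two ways to start it:

Method 1:

redis-sentinel /path/to/sentinel.conf

Method 2:

redis-server /path/to/sentinel.conf --sentinel

I usually use the first method. Start Sentinel on each of the three Sentinel servers: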

Startup log on the first server, Sentinel_1:

[root@Sentinel_1 sentinel]# redis-sentinel /data/service/redis/sentinel/conf/26379.conf 
[root@Sentinel_1 sentinel]# tail -f log/sentinel.log 
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

5153:X 07 Mar 22:37:16.290 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
5153:X 07 Mar 22:37:16.290 # Sentinel runid is 21e629e6d2b26682e660258787d5fb995010e6c8
5153:X 07 Mar 22:37:16.290 # +monitor master master-6379 192.168.10.131 6379 quorum 2
5153:X 07 Mar 22:37:17.330 * +slave slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379
5153:X 07 Mar 22:38:29.406 * +sentinel sentinel 192.168.10.129:26379 192.168.10.129 26379 @ master-6379 192.168.10.131 6379
5153:X 07 Mar 22:38:45.024 * +sentinel sentinel 192.168.10.130:26379 192.168.10.130 26379 @ master-6379 192.168.10.131 6379

Startup log on the second server, Sentinel_2:

[root@Sentinel_2 sentinel]# redis-sentinel /data/service/redis/sentinel/conf/26379.conf
[root@Sentinel_2 sentinel]# tail -f log/sentinel.log 
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

4647:X 07 Mar 22:38:27.570 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
4647:X 07 Mar 22:38:27.570 # Sentinel runid is f391228f430177d881464e908c683bfc73d61c24
4647:X 07 Mar 22:38:27.571 # +monitor master master-6379 192.168.10.131 6379 quorum 2
4647:X 07 Mar 22:38:28.582 * +slave slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379
4647:X 07 Mar 22:38:29.218 * +sentinel sentinel 192.168.10.128:26379 192.168.10.128 26379 @ master-6379 192.168.10.131 6379
4647:X 07 Mar 22:38:45.200 * +sentinel sentinel 192.168.10.130:26379 192.168.10.130 26379 @ master-6379 192.168.10.131 6379

Startup log on the third server, Sentinel_3:

[root@Sentinel_3 sentinel]# redis-sentinel /data/service/redis/sentinel/conf/26379.conf
[root@Sentinel_3 sentinel]# tail -f log/sentinel.log 
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

2115:X 07 Mar 22:38:43.161 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
2115:X 07 Mar 22:38:43.161 # Sentinel runid is 7fbee9138d4e5c1e2def7bbc4f888cef04d95677
2115:X 07 Mar 22:38:43.161 # +monitor master master-6379 192.168.10.131 6379 quorum 2
2115:X 07 Mar 22:38:44.167 * +slave slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379
2115:X 07 Mar 22:38:44.818 * +sentinel sentinel 192.168.10.129:26379 192.168.10.129 26379 @ master-6379 192.168.10.131 6379
2115:X 07 Mar 22:38:44.851 * +sentinel sentinel 192.168.10.128:26379 192.168.10.128 26379 @ master-6379 192.168.10.131 6379

The whole Sentinel cluster is now up and running. We can log in to any Sentinel to check the current monitoring state:

[root@Sentinel_1 sentinel]# redis-cli -p 26379
127.0.0.1:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=master-6379,status=ok,address=192.168.10.131:6379,slaves=1,sentinels=3
127.0.0.1:26379> 

The output shows status=ok, and slaves=1 means there is one slave node.
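
For scripted checks, the `master0:` line of the INFO sentinel output can be split into fields. The following Python sketch parses the exact line shown above; `parse_master_line` is an illustrative helper, not part of any Redis client library.

```python
def parse_master_line(line):
    """Parse 'master0:name=...,status=...,address=ip:port,...' into a dict."""
    label, _, body = line.partition(":")  # split at the first colon only
    fields = dict(item.split("=", 1) for item in body.split(","))
    ip, _, port = fields["address"].rpartition(":")
    return {
        "label": label,
        "name": fields["name"],
        "status": fields["status"],
        "ip": ip,
        "port": int(port),
        "slaves": int(fields["slaves"]),
        "sentinels": int(fields["sentinels"]),
    }


line = ("master0:name=master-6379,status=ok,"
        "address=192.168.10.131:6379,slaves=1,sentinels=3")
info = parse_master_line(line)
print(info["name"], info["status"], info["ip"], info["port"])
```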

 

V. Redis outage tests

Test 1: Stop Redis_Master and see whether Sentinel promotes the surviving slave node to master

[root@Redis_Master redis]# sh redis stop
Stopping ...
Waiting for Redis to shutdown ...
Redis stopped
[root@Redis_Master redis]# 

1. Check the log on any Sentinel with tail -f log/sentinel.log:

5153:X 07 Mar 22:48:20.986 # +sdown master master-6379 192.168.10.131 6379
5153:X 07 Mar 22:48:21.047 # +odown master master-6379 192.168.10.131 6379 #quorum 2/2
5153:X 07 Mar 22:48:21.049 # +new-epoch 1
5153:X 07 Mar 22:48:21.050 # +try-failover master master-6379 192.168.10.131 6379
5153:X 07 Mar 22:48:21.053 # +vote-for-leader 21e629e6d2b26682e660258787d5fb995010e6c8 1
5153:X 07 Mar 22:48:21.057 # 192.168.10.130:26379 voted for 7fbee9138d4e5c1e2def7bbc4f888cef04d95677 1
5153:X 07 Mar 22:48:21.062 # 192.168.10.129:26379 voted for 7fbee9138d4e5c1e2def7bbc4f888cef04d95677 1
5153:X 07 Mar 22:48:22.441 # +config-update-from sentinel 192.168.10.130:26379 192.168.10.130 26379 @ master-6379 192.168.10.131 6379
5153:X 07 Mar 22:48:22.442 # +switch-master master-6379 192.168.10.131 6379 192.168.10.132 6379
5153:X 07 Mar 22:48:22.443 * +slave slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
5153:X 07 Mar 22:48:37.496 # +sdown slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379

2. Then check the Redis_Slave log:

2437:S 07 Mar 22:48:18.023 * Connecting to MASTER 192.168.10.131:6379
2437:S 07 Mar 22:48:18.026 * MASTER <-> SLAVE sync started
2437:S 07 Mar 22:48:18.029 # Error condition on socket for SYNC: Connection refused
2437:S 07 Mar 22:48:19.050 * Connecting to MASTER 192.168.10.131:6379
2437:S 07 Mar 22:48:19.053 * MASTER <-> SLAVE sync started
2437:S 07 Mar 22:48:19.055 # Error condition on socket for SYNC: Connection refused
2437:S 07 Mar 22:48:20.074 * Connecting to MASTER 192.168.10.131:6379
2437:S 07 Mar 22:48:20.077 * MASTER <-> SLAVE sync started
2437:S 07 Mar 22:48:20.079 # Error condition on socket for SYNC: Connection refused
2437:M 07 Mar 22:48:20.724 * Discarding previously cached master state.
2437:M 07 Mar 22:48:20.725 * MASTER MODE enabled (user request from 'id=7 addr=192.168.10.130:60991 fd=11 name=sentinel-7fbee913-cmd age=577 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=rw cmd=exec')
2437:M 07 Mar 22:48:20.745 # CONFIG REWRITE executed with success.
2437:M 07 Mar 22:48:20.796 * 1 changes in 900 seconds. Saving...
2437:M 07 Mar 22:48:20.870 * Background saving started by pid 2442
2442:C 07 Mar 22:48:20.915 * DB saved on disk
2442:C 07 Mar 22:48:20.915 * RDB: 4 MB of memory used by copy-on-write
2437:M 07 Mar 22:48:20.974 * Background saving terminated with success

3. Now log in to Sentinel again to see which node is the current master:

[root@Sentinel_1 sentinel]# redis-cli -p 26379       
127.0.0.1:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=master-6379,status=ok,address=192.168.10.132:6379,slaves=1,sentinels=3
127.0.0.1:26379> 

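The new master is now 192.168.10.132. The e-mail notification sent after the switch: [screenshot not included]

4. After the stopped redis instance is started again, it is automatically added back in the slave role: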

[root@Redis_Master redis]# sh redis start
Starting Redis server...
[root@Redis_Master redis]# tail -f logs/redis_6379.log 
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

2050:M 07 Mar 22:55:21.357 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
2050:M 07 Mar 22:55:21.357 # Server started, Redis version 3.0.7
2050:M 07 Mar 22:55:21.357 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
2050:M 07 Mar 22:55:21.357 * DB loaded from disk: 0.000 seconds
2050:M 07 Mar 22:55:21.357 * The server is now ready to accept connections on port 6379
2050:S 07 Mar 22:55:31.393 * SLAVE OF 192.168.10.132:6379 enabled (user request from 'id=4 addr=192.168.10.129:50326 fd=8 name=sentinel-f391228f-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=rw cmd=exec')
2050:S 07 Mar 22:55:31.397 # CONFIG REWRITE executed with success.
2050:S 07 Mar 22:55:31.596 * Connecting to MASTER 192.168.10.132:6379
2050:S 07 Mar 22:55:31.597 * MASTER <-> SLAVE sync started
2050:S 07 Mar 22:55:31.597 * Non blocking connect for SYNC fired the event.
2050:S 07 Mar 22:55:31.598 * Master replied to PING, replication can continue...
2050:S 07 Mar 22:55:31.600 * Partial resynchronization not possible (no cached master)
2050:S 07 Mar 22:55:31.634 * Full resync from master: 234202729a196fd6523e41bcb7e29d9866c905c6:1
2050:S 07 Mar 22:55:31.648 * MASTER <-> SLAVE sync: receiving 34 bytes from master
2050:S 07 Mar 22:55:31.649 * MASTER <-> SLAVE sync: Flushing old data
2050:S 07 Mar 22:55:31.649 * MASTER <-> SLAVE sync: Loading DB in memory
2050:S 07 Mar 22:55:31.649 * MASTER <-> SLAVE sync: Finished with success

5. Check the Sentinel log; the instance has been added back and converted to the slave role:

4647:X 07 Mar 22:55:31.787 * +convert-to-slave slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379

 

Test 2: Stop the new Redis_Master (192.168.10.132, originally the slave) and see whether the new slave (192.168.10.131, originally the master) is promoted to master:

1. Run the redis stop operation

[root@Redis_Slave redis]# sh redis stop
Stopping ...
Waiting for Redis to shutdown ...
Redis stopped

2. Check the Sentinel log:

5153:X 07 Mar 23:01:54.895 # +try-failover master master-6379 192.168.10.132 6379
5153:X 07 Mar 23:01:54.898 # +vote-for-leader 21e629e6d2b26682e660258787d5fb995010e6c8 2
5153:X 07 Mar 23:01:54.908 # 192.168.10.129:26379 voted for f391228f430177d881464e908c683bfc73d61c24 2
5153:X 07 Mar 23:01:54.913 # 192.168.10.130:26379 voted for 21e629e6d2b26682e660258787d5fb995010e6c8 2
5153:X 07 Mar 23:01:54.968 # +elected-leader master master-6379 192.168.10.132 6379
5153:X 07 Mar 23:01:54.968 # +failover-state-select-slave master master-6379 192.168.10.132 6379
5153:X 07 Mar 23:01:55.027 # +selected-slave slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
5153:X 07 Mar 23:01:55.027 * +failover-state-send-slaveof-noone slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
5153:X 07 Mar 23:01:55.085 * +failover-state-wait-promotion slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
5153:X 07 Mar 23:01:55.912 # +promoted-slave slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
5153:X 07 Mar 23:01:55.915 # +failover-state-reconf-slaves master master-6379 192.168.10.132 6379
5153:X 07 Mar 23:01:56.009 # +failover-end master master-6379 192.168.10.132 6379
5153:X 07 Mar 23:01:56.010 # +switch-master master-6379 192.168.10.132 6379 192.168.10.131 6379
5153:X 07 Mar 23:01:56.010 * +slave slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379
5153:X 07 Mar 23:02:11.066 # +sdown slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379
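
The vote lines in this log can be tallied to see why this Sentinel became the failover leader. The Python sketch below counts the votes for epoch 2 using the three lines copied from the log above, assuming a candidate needs a majority of the 3 Sentinels; the regular expression and function names are illustrative only.

```python
import re
from collections import Counter

# Matches both this Sentinel's own vote and the votes reported by its peers.
VOTE = re.compile(r"(?:\+vote-for-leader|voted for) ([0-9a-f]+) (\d+)")


def tally_votes(lines, epoch):
    """Count leader votes per Sentinel run ID for the given epoch."""
    votes = Counter()
    for line in lines:
        match = VOTE.search(line)
        if match and int(match.group(2)) == epoch:
            votes[match.group(1)] += 1
    return votes


log_lines = [
    "5153:X 07 Mar 23:01:54.898 # +vote-for-leader 21e629e6d2b26682e660258787d5fb995010e6c8 2",
    "5153:X 07 Mar 23:01:54.908 # 192.168.10.129:26379 voted for f391228f430177d881464e908c683bfc73d61c24 2",
    "5153:X 07 Mar 23:01:54.913 # 192.168.10.130:26379 voted for 21e629e6d2b26682e660258787d5fb995010e6c8 2",
]

votes = tally_votes(log_lines, epoch=2)
leader, count = votes.most_common(1)[0]
needed = 3 // 2 + 1  # majority of the three Sentinels
print(leader[:8], count, count >= needed)
```

Run ID 21e629e6... belongs to Sentinel_1 (see its startup log), which matches the +elected-leader line that follows the votes.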

3. Then check the log of the new Redis_Master; its role has switched from slave back to master:

[root@Redis_Master redis]# tail -f logs/redis_6379.log 
2050:S 07 Mar 23:01:52.367 # Error condition on socket for SYNC: Connection refused
2050:S 07 Mar 23:01:53.382 * Connecting to MASTER 192.168.10.132:6379
2050:S 07 Mar 23:01:53.384 * MASTER <-> SLAVE sync started
2050:S 07 Mar 23:01:53.384 # Error condition on socket for SYNC: Connection refused
2050:S 07 Mar 23:01:54.404 * Connecting to MASTER 192.168.10.132:6379
2050:S 07 Mar 23:01:54.405 * MASTER <-> SLAVE sync started
2050:S 07 Mar 23:01:54.406 # Error condition on socket for SYNC: Connection refused
2050:M 07 Mar 23:01:54.868 * Discarding previously cached master state.
2050:M 07 Mar 23:01:54.868 * MASTER MODE enabled (user request from 'id=8 addr=192.168.10.128:37585 fd=6 name=sentinel-21e629e6-cmd age=383 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=rw cmd=exec')
2050:M 07 Mar 23:01:54.870 # CONFIG REWRITE executed with success.

4. Check the Sentinel monitoring information again; the new Redis_Master is now 192.168.10.131:

[root@Sentinel_1 sentinel]# redis-cli -p 26379       
127.0.0.1:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=master-6379,status=ok,address=192.168.10.131:6379,slaves=1,sentinels=3
127.0.0.1:26379> 

The e-mail alert sent after the failover: [screenshot not included]

5. Start the stopped Redis instance; Sentinel adds it back in the slave role:

[root@Redis_Slave redis]# sh redis start
Starting Redis server...
[root@Redis_Slave redis]#

Check the Sentinel log:

2115:X 07 Mar 23:18:49.494 * +convert-to-slave slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379

Then check the Redis_Master log:

[root@Redis_Master redis]# tail -f logs/redis_6379.log 
2050:S 07 Mar 23:01:52.367 # Error condition on socket for SYNC: Connection refused
2050:S 07 Mar 23:01:53.382 * Connecting to MASTER 192.168.10.132:6379
2050:S 07 Mar 23:01:53.384 * MASTER <-> SLAVE sync started
2050:S 07 Mar 23:01:53.384 # Error condition on socket for SYNC: Connection refused
2050:S 07 Mar 23:01:54.404 * Connecting to MASTER 192.168.10.132:6379
2050:S 07 Mar 23:01:54.405 * MASTER <-> SLAVE sync started
2050:S 07 Mar 23:01:54.406 # Error condition on socket for SYNC: Connection refused
2050:M 07 Mar 23:01:54.868 * Discarding previously cached master state.
2050:M 07 Mar 23:01:54.868 * MASTER MODE enabled (user request from 'id=8 addr=192.168.10.128:37585 fd=6 name=sentinel-21e629e6-cmd age=383 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=rw cmd=exec')
2050:M 07 Mar 23:01:54.870 # CONFIG REWRITE executed with success.
2050:M 07 Mar 23:10:22.009 * 1 changes in 900 seconds. Saving...
2050:M 07 Mar 23:10:22.126 * Background saving started by pid 2084
2084:C 07 Mar 23:10:22.167 * DB saved on disk
2084:C 07 Mar 23:10:22.167 * RDB: 4 MB of memory used by copy-on-write
2050:M 07 Mar 23:10:22.229 * Background saving terminated with success
2050:M 07 Mar 23:18:49.389 * Slave 192.168.10.132:6379 asks for synchronization
2050:M 07 Mar 23:18:49.389 * Full resync requested by slave 192.168.10.132:6379
2050:M 07 Mar 23:18:49.389 * Starting BGSAVE for SYNC with target: disk
2050:M 07 Mar 23:18:49.417 * Background saving started by pid 2085
2085:C 07 Mar 23:18:49.428 * DB saved on disk
2085:C 07 Mar 23:18:49.429 * RDB: 4 MB of memory used by copy-on-write
2050:M 07 Mar 23:18:49.479 * Background saving terminated with success
2050:M 07 Mar 23:18:49.479 * Synchronization with slave 192.168.10.132:6379 succeeded

Then check the Redis_Slave log:

2514:S 07 Mar 23:18:48.859 # CONFIG REWRITE executed with success.
2514:S 07 Mar 23:18:49.049 * Connecting to MASTER 192.168.10.131:6379
2514:S 07 Mar 23:18:49.053 * MASTER <-> SLAVE sync started
2514:S 07 Mar 23:18:49.055 * Non blocking connect for SYNC fired the event.
2514:S 07 Mar 23:18:49.059 * Master replied to PING, replication can continue...
2514:S 07 Mar 23:18:49.065 * Partial resynchronization not possible (no cached master)
2514:S 07 Mar 23:18:49.099 * Full resync from master: 3e1fbd2ec6f57b3362687051ab1bb6edf1d2ee27:1
2514:S 07 Mar 23:18:49.157 * MASTER <-> SLAVE sync: receiving 34 bytes from master
2514:S 07 Mar 23:18:49.157 * MASTER <-> SLAVE sync: Flushing old data
2514:S 07 Mar 23:18:49.157 * MASTER <-> SLAVE sync: Loading DB in memory
2514:S 07 Mar 23:18:49.157 * MASTER <-> SLAVE sync: Finished with success

Redis master/slave replication is still working normally. Further tests are left to the reader ^o^
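
For further tests from the client side, an application should ask the Sentinels for the current master instead of hard-coding an address. The following is a minimal sketch using the third-party redis-py package (`pip install redis`) and its `redis.sentinel.Sentinel` class, with the addresses, master name, and password taken from this article's setup. The probe key name and timeouts are arbitrary choices, and the functions only work against a running deployment.

```python
SENTINELS = [
    ("192.168.10.128", 26379),
    ("192.168.10.129", 26379),
    ("192.168.10.130", 26379),
]
MASTER_NAME = "master-6379"
PASSWORD = "123456"  # same value as "sentinel auth-pass" in 26379.conf


def current_master(sentinels=SENTINELS, name=MASTER_NAME):
    """Ask the Sentinels for the current master; returns an (ip, port) tuple."""
    from redis.sentinel import Sentinel  # third-party package: redis-py
    sentinel = Sentinel(sentinels, socket_timeout=0.5)
    return sentinel.discover_master(name)


def write_probe(key="failover:probe", value="ok"):
    """Write and read back one key through whichever node is master now."""
    from redis.sentinel import Sentinel
    sentinel = Sentinel(SENTINELS, socket_timeout=0.5)
    master = sentinel.master_for(MASTER_NAME, socket_timeout=0.5,
                                 password=PASSWORD)
    master.set(key, value)
    return master.get(key)


# Example use against a live deployment (not executed here):
#   print(current_master())
#   print(write_probe())
```

Calling `current_master()` before and after stopping the master is one simple way to observe a failover from the application's point of view.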

 

 

 

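Summary:

     1. Redis-Sentinel is the high-availability (HA) solution officially recommended by Redis. It is fairly reliable and is recommended for production deployments.

     2. Redis-Sentinel supports custom failover scripts, which is a convenient feature; they can be written as shell or Python scripts.

     3. There are many Redis high-availability architectures today, each with its own strengths and weaknesses. Whichever one you adopt, test it repeatedly before going live.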

 

References:

http://segmentfault.com/a/1190000002680804

http://segmentfault.com/a/1190000002685515

http://redis.io/topics/sentinel-clients

https://pypi.python.org/pypi/redis/

http://www.cnblogs.com/gomysql/p/5040847.html

 

 

 

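Author: 陸炫志

Source: xuanzhi's blog, http://www.cnblogs.com/xuanzhi201111

Your support is the greatest encouragement to the author; thank you for reading carefully. Copyright of this article belongs to the author. Reposting is welcome, but please keep this notice.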

 

