Building a Redis Cluster with Docker


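Abstract: Since I started using Docker, I have gotten into the habit of wanting to run every application in Docker. Today I want to try building a Redis cluster with Docker.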

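First, some theory. Redis Cluster is Redis's distributed solution; it addresses the limits of a single, centralized Redis instance. The first problem any distributed database has to solve is how to map the whole data set onto multiple nodes according to a partitioning rule.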

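This is where the partitioning rule comes in. Redis Cluster uses hash partitioning, specifically virtual slot partitioning. Every key is mapped by a hash function to a slot in the range 0 to 16383, using the formula slot = CRC16(key) & 16383. Each node is responsible for a subset of the slots and for the key-value data mapped to those slots.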
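The slot formula can be checked by hand. The following is a minimal sketch, assuming the CRC16 variant described in the Redis Cluster specification (XMODEM: polynomial 0x1021, initial value 0) and ignoring hash tags (the {...} syntax). The function names crc16 and keyslot are made up for this illustration.

```shell
#!/bin/bash
# CRC16 (XMODEM variant): polynomial 0x1021, initial value 0, no bit reflection.
# The key is fed byte by byte; od prints each byte as an unsigned decimal number.
crc16() {
  local crc=0 byte j
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( crc ^ (byte << 8) ))
    j=0
    while [ "$j" -lt 8 ]; do
      if [ $(( crc & 0x8000 )) -ne 0 ]; then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
      j=$(( j + 1 ))
    done
  done
  echo "$crc"
}

# slot = CRC16(key) & 16383 (hash tags are NOT handled in this sketch)
keyslot() {
  echo $(( $(crc16 "$1") & 16383 ))
}

keyslot hello
```

On a running cluster node, CLUSTER KEYSLOT hello should report the same slot as keyslot hello.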

I. Create the Redis Docker base image

  1. Download the Redis source package (version 4.0.1)
    [root@etcd1 tmp]# mkdir docker_redis_cluster
    [root@etcd1 tmp]# cd docker_redis_cluster/
    [root@etcd2 docker_redis_cluster]# wget http://download.redis.io/releases/redis-4.0.1.tar.gz
  2. Extract and compile Redis
    [root@etcd1 docker_redis_cluster]# tar zxvf redis-4.0.1.tar.gz
    [root@etcd1 docker_redis_cluster]# cd redis-4.0.1/
    [root@etcd1 redis-4.0.1]# make
  3. Edit the Redis configuration
    [root@etcd3 redis-4.0.1]# vi /tmp/docker_redis_cluster/redis-4.0.1/redis.conf
    

      
    Change the bind IP address

    # ~~~ WARNING ~~~ If the computer running Redis is directly exposed to the
    # internet, binding to all the interfaces is dangerous and will expose the
    # instance to everybody on the internet. So by default we uncomment the
    # following bind directive, that will force Redis to listen only into
    # the IPv4 lookback interface address (this means Redis will be able to
    # accept connections only from clients running into the same computer it
    # is running).
    #
    # IF YOU ARE SURE YOU WANT YOUR INSTANCE TO LISTEN TO ALL THE INTERFACES
    # JUST COMMENT THE FOLLOWING LINE.
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #bind 127.0.0.1
    bind 0.0.0.0
    

      

    Set daemonize to no (change it if it is set to yes), so that Redis stays in the foreground inside the container

    # By default Redis does not run as a daemon. Use 'yes' if you need it.
    # Note that Redis will write a pid file in /var/run/redis.pid when daemonized.
    daemonize no
    

      
    Uncomment the password directive and set a new password

    # Warning: since Redis is pretty fast an outside user can try up to
    # 150k passwords per second against a good box. This means that you should
    # use a very strong password otherwise it will be very easy to break.
    #
    # requirepass foobared
    

      Change it to

    # Warning: since Redis is pretty fast an outside user can try up to
    # 150k passwords per second against a good box. This means that you should
    # use a very strong password otherwise it will be very easy to break.
    #
    requirepass 123456
    


    Because a password is now set, the master-replica connection setting elsewhere in the configuration needs the same password

    # If the master is password protected (using the "requirepass" configuration
    # directive below) it is possible to tell the slave to authenticate before
    # starting the replication synchronization process, otherwise the master will
    # refuse the slave request.
    #
    # masterauth <master-password>
    

      Change it to

    # If the master is password protected (using the "requirepass" configuration
    # directive below) it is possible to tell the slave to authenticate before
    # starting the replication synchronization process, otherwise the master will
    # refuse the slave request.
    #
    # masterauth <master-password>
    masterauth 123456
    

      

      

    Set the log file path

    # Specify the log file name. Also the empty string can be used to force
    # Redis to log on the standard output. Note that if you use standard
    # output for logging but daemonize, logs will be sent to /dev/null
    logfile "/var/log/redis/redis-server.log"
    

      
    Configure the cluster settings by uncommenting the following directives

    # Normal Redis instances can't be part of a Redis Cluster; only nodes that are
    # started as cluster nodes can. In order to start a Redis instance as a
    # cluster node enable the cluster support uncommenting the following:
    #
    cluster-enabled yes
    
    # Every cluster node has a cluster configuration file. This file is not
    # intended to be edited by hand. It is created and updated by Redis nodes.
    # Every Redis Cluster node requires a different cluster configuration file.
    # Make sure that instances running in the same system do not have
    # overlapping cluster configuration file names.
    #
    cluster-config-file nodes-6379.conf
    
    # Cluster node timeout is the amount of milliseconds a node must be unreachable
    # for it to be considered in failure state.
    # Most other internal time limits are multiple of the node timeout.
    #
    cluster-node-timeout 15000
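
The edits in this step could also be scripted instead of made by hand in vi. The sketch below is one way to do that with sed, assuming the stock redis.conf contains the directives exactly as quoted above. patch_redis_conf is a name made up for this example, and it replaces the commented masterauth line rather than adding a new line below it.

```shell
#!/bin/bash
# Print a patched copy of a stock redis.conf to stdout; the input file is not modified.
# Assumption: the directives appear exactly as in the excerpts quoted in this step.
patch_redis_conf() {
  sed \
    -e 's/^bind 127\.0\.0\.1/bind 0.0.0.0/' \
    -e 's/^daemonize yes/daemonize no/' \
    -e 's/^# requirepass foobared/requirepass 123456/' \
    -e 's/^# masterauth <master-password>/masterauth 123456/' \
    -e 's|^logfile .*|logfile "/var/log/redis/redis-server.log"|' \
    -e 's/^# cluster-enabled yes/cluster-enabled yes/' \
    -e 's/^# cluster-config-file nodes-6379\.conf/cluster-config-file nodes-6379.conf/' \
    -e 's/^# cluster-node-timeout 15000/cluster-node-timeout 15000/' \
    "$1"
}
```

A possible use is patch_redis_conf redis.conf > redis.conf.new, followed by a diff of the two files before replacing the original.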
  4. Write the Dockerfile for the image
    [root@etcd3 docker_redis_cluster]# cd /tmp/docker_redis_cluster
    [root@etcd3 docker_redis_cluster]# vi Dockerfile 
    # Redis
    # Version 4.0.1
    
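    FROM centos:7
    ENV REDIS_HOME /usr/local
    # Copy the local Redis source archive to the image root; ADD unpacks the archive automatically.
    # The source must sit in the same directory as the Dockerfile and be given as a relative path.
    ADD redis-4.0.1.tar.gz /
    # Create the installation directory
    RUN mkdir -p $REDIS_HOME/redis
    # Copy the configuration file edited earlier into the installation directory
    ADD redis-4.0.1/redis.conf $REDIS_HOME/redis/
    # Update the installed packages
    RUN yum -y update
    # Install the tools needed for compiling
    RUN yum install -y gcc make
    WORKDIR /redis-4.0.1
    RUN make
    # After compiling, the container only needs the redis-server executable
    RUN mv /redis-4.0.1/src/redis-server $REDIS_HOME/redis/
    WORKDIR /
    # Remove the unpacked source tree
    RUN rm -rf /redis-4.0.1
    # gcc and make are no longer needed once the build is done
    RUN yum remove -y gcc make
    # Add a data volume
    VOLUME ["/var/log/redis"]
    # Expose port 6379; several ports could be exposed, but that is not needed here
    EXPOSE 6379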

      
    PS. This image is not meant to be run directly, so it contains no ENTRYPOINT or CMD instruction.

  5. Build the image
    # Switch to a registry mirror in China
    [root@etcd3 docker_redis_cluster]# vi /etc/docker/daemon.json
    {
      "registry-mirrors": ["https://registry.docker-cn.com"]
    }
    
    # Build (tagged 4.0.1, the tag the node image refers to later)
    [root@etcd3 docker_redis_cluster]# docker build -t hakimdstx/cluster-redis:4.0.1 .
    ...
    
    Complete!
     ---> 546cb1d34f35
    Removing intermediate container 6b6556c5f28d
    Step 14/15 : VOLUME /var/log/redis
     ---> Running in 05a6642e4046
     ---> e7e2fb8676b2
    Removing intermediate container 05a6642e4046
    Step 15/15 : EXPOSE 6379
     ---> Running in 5d7abe1709e2
     ---> 2d1322475f79
    Removing intermediate container 5d7abe1709e2
    Successfully built 2d1322475f79
    

      
    The image is now built. During the build you may hit the error: Public key for glibc-headers-2.17-222.el7.x86_64.rpm is not installed. If so, add one instruction to the Dockerfile:

    ...
    RUN rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
    # Update the installed packages
    RUN yum -y update
    # Install the tools needed for compiling
    RUN yum install -y gcc make

    List the images:
    [root@etcd3 docker_redis_cluster]# docker images
    REPOSITORY                                  TAG                 IMAGE ID            CREATED             SIZE
    hakimdstx/cluster-redis                     4.0.1               1fca5a08a4c7        14 seconds ago      435 MB
    centos                                      7                   49f7960eb7e4        2 days ago          200 MB
    

     

    With that, the Redis base image is done.

II. Create the Redis node image

  1. Create a Redis node image based on the Redis base image built earlier
    [root@etcd3 tmp]# mkdir docker_redis_nodes
    [root@etcd3 tmp]# cd docker_redis_nodes
    [root@etcd3 docker_redis_nodes]# vi Dockerfile
    # Redis Node
    # Version 4.0.1
    FROM hakimdstx/cluster-redis:4.0.1
    # MAINTAINER_INFO
    MAINTAINER hakim 1194842583@qq.com
    ENTRYPOINT ["/usr/local/redis/redis-server", "/usr/local/redis/redis.conf"]

      

  2. Build the Redis node image
    [root@etcd3 docker_redis_nodes]# docker build -t hakimdstx/nodes-redis:4.0.1 .       
    Sending build context to Docker daemon 2.048 kB
    Step 1/3 : FROM hakimdstx/cluster-redis:4.0.1
     ---> 1fca5a08a4c7
    Step 2/3 : MAINTAINER hakim 1194842583@qq.com
     ---> Running in cc6e07eb2c36
     ---> 55769d3bfacb
    Removing intermediate container cc6e07eb2c36
    Step 3/3 : ENTRYPOINT /usr/local/redis/redis-server /usr/local/redis/redis.conf
     ---> Running in f5dedf88f6f6
     ---> da64da483559
    Removing intermediate container f5dedf88f6f6
    Successfully built da64da483559
    

      

  3. List the images
    [root@etcd3 docker_redis_nodes]# docker images
    REPOSITORY                                  TAG                 IMAGE ID            CREATED             SIZE
    hakimdstx/nodes-redis                       4.0.1               da64da483559        51 seconds ago      435 MB
    hakimdstx/cluster-redis                     4.0.1               1fca5a08a4c7        9 minutes ago       435 MB
    centos                                      7                   49f7960eb7e4        2 days ago          200 MB
    

      

III. Run the Redis cluster

  1. Start the Redis containers
    [root@etcd3 docker_redis_nodes]# docker run -d --name redis-6379 -p 6379:6379 hakimdstx/nodes-redis:4.0.1   
    1673a7d859ea83257d5bf14d82ebf717fb31405c185ce96a05f597d8f855aa7d
    [root@etcd3 docker_redis_nodes]# docker run -d --name redis-6380 -p 6380:6379 hakimdstx/nodes-redis:4.0.1    
    df6ebce6f12a6f3620d5a29adcfbfa7024e906c3af48f21fa7e1fa524a361362
    [root@etcd3 docker_redis_nodes]# docker run -d --name redis-6381 -p 6381:6379 hakimdstx/nodes-redis:4.0.1   
    396e174a1d9235228b3c5f0266785a12fb1ea49efc7ac755c9e7590e17aa1a79
    [root@etcd3 docker_redis_nodes]# docker run -d --name redis-6382 -p 6382:6379 hakimdstx/nodes-redis:4.0.1 
    d9a71dd3f969094205ffa7596c4a04255575cdd3acca2d47fe8ef7171a3be528
    [root@etcd3 docker_redis_nodes]# docker run -d --name redis-6383 -p 6383:6379 hakimdstx/nodes-redis:4.0.1 
    73e4f843d8cb28595456e21b04f97d18ce1cdf8dc56d1150844ba258a3781933
    [root@etcd3 docker_redis_nodes]# docker run -d --name redis-6384 -p 6384:6379 hakimdstx/nodes-redis:4.0.1 
    10c62aafa4dac47220daf5bf3cec84406f086d5261599b54ec6c56bb7da97d6d
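
The six docker run commands differ only in the host port, so they can be generated in a loop. This is a small sketch; node_run_cmds is a hypothetical helper that only prints the commands, so they can be reviewed before anything is started.

```shell
#!/bin/bash
# Print one `docker run` command per host port in [first, last].
# Nothing is executed here; pipe the output to `sh` to actually start the containers.
node_run_cmds() {
  local image=$1 port=$2 last=$3
  while [ "$port" -le "$last" ]; do
    echo "docker run -d --name redis-$port -p $port:6379 $image"
    port=$(( port + 1 ))
  done
}

node_run_cmds hakimdstx/nodes-redis:4.0.1 6379 6384
```

Piping the output to sh (node_run_cmds hakimdstx/nodes-redis:4.0.1 6379 6384 | sh) would start the same six containers as above.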
  2. Check the container information
    [root@etcd3 redis]# docker ps
    CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS              PORTS                    NAMES
    10c62aafa4da        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   3 seconds ago       Up 2 seconds        0.0.0.0:6384->6379/tcp   redis-6384
    73e4f843d8cb        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   12 seconds ago      Up 10 seconds       0.0.0.0:6383->6379/tcp   redis-6383
    d9a71dd3f969        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   20 seconds ago      Up 18 seconds       0.0.0.0:6382->6379/tcp   redis-6382
    396e174a1d92        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   3 days ago          Up 3 days           0.0.0.0:6381->6379/tcp   redis-6381
    df6ebce6f12a        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   3 days ago          Up 3 days           0.0.0.0:6380->6379/tcp   redis-6380
    1673a7d859ea        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   3 days ago          Up 3 days           0.0.0.0:6379->6379/tcp   redis-6379
    

      

  3. Run the Redis cluster containers
    1. Connect remotely and check the Redis info replication output
      [root@etcd2 ~]#  redis-cli -h 192.168.10.52 -p 6379
      192.168.10.52:6379> info replication
      NOAUTH Authentication required.
      192.168.10.52:6379> auth 123456
      OK
      192.168.10.52:6379> info replication
      # Replication
      role:master
      connected_slaves:0
      master_replid:2f0a7b50aed699fa50a79f3f7f9751a070c50ee9
      master_replid2:0000000000000000000000000000000000000000
      master_repl_offset:0
      second_repl_offset:-1
      repl_backlog_active:0
      repl_backlog_size:1048576
      repl_backlog_first_byte_offset:0
      repl_backlog_histlen:0
      192.168.10.52:6379>
      # The other nodes report essentially the same information
      

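        As shown, once a client connects it must authenticate first because a password was set earlier; otherwise commands are rejected. The output also tells us that every Redis instance currently has the master role (role:master), which is clearly not what we want.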

    2. Before configuring the cluster, look up the current IP address of every container
      [root@etcd3 redis]# docker ps
      CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS              PORTS                    NAMES
      10c62aafa4da        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   3 seconds ago       Up 2 seconds        0.0.0.0:6384->6379/tcp   redis-6384
      73e4f843d8cb        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   12 seconds ago      Up 10 seconds       0.0.0.0:6383->6379/tcp   redis-6383
      d9a71dd3f969        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   20 seconds ago      Up 18 seconds       0.0.0.0:6382->6379/tcp   redis-6382
      396e174a1d92        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   3 days ago          Up 3 days           0.0.0.0:6381->6379/tcp   redis-6381
      df6ebce6f12a        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   3 days ago          Up 3 days           0.0.0.0:6380->6379/tcp   redis-6380
      1673a7d859ea        hakimdstx/nodes-redis:4.0.1   "/usr/local/redis/..."   3 days ago          Up 3 days           0.0.0.0:6379->6379/tcp   redis-6379
      [root@etcd3 redis]# 
      [root@etcd3 redis]# docker inspect 10c62aafa4da 73e4f843d8cb d9a71dd3f969 396e174a1d92 df6ebce6f12a 1673a7d859ea | grep IPA
                  "SecondaryIPAddresses": null,
                  "IPAddress": "172.17.0.7",
                          "IPAMConfig": null,
                          "IPAddress": "172.17.0.7",
                  "SecondaryIPAddresses": null,
                  "IPAddress": "172.17.0.6",
                          "IPAMConfig": null,
                          "IPAddress": "172.17.0.6",
                  "SecondaryIPAddresses": null,
                  "IPAddress": "172.17.0.5",
                          "IPAMConfig": null,
                          "IPAddress": "172.17.0.5",
                  "SecondaryIPAddresses": null,
                  "IPAddress": "172.17.0.4",
                          "IPAMConfig": null,
                          "IPAddress": "172.17.0.4",
                  "SecondaryIPAddresses": null,
                  "IPAddress": "172.17.0.3",
                          "IPAMConfig": null,
                          "IPAddress": "172.17.0.3",
                  "SecondaryIPAddresses": null,
                  "IPAddress": "172.17.0.2",
                          "IPAMConfig": null,
                          "IPAddress": "172.17.0.2",

      This gives: redis-6379: 172.17.0.2, redis-6380: 172.17.0.3, redis-6381: 172.17.0.4, redis-6382: 172.17.0.5, redis-6383: 172.17.0.6, redis-6384: 172.17.0.7
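
The grep output above lists every address twice. As a small convenience, the sketch below filters that output down to one address per container, in the order they appear. unique_ips is a name made up for this example, and it assumes the JSON layout shown above.

```shell
#!/bin/bash
# Read `docker inspect ... | grep IPA` style output on stdin and print each
# container IP address once, preserving the order of first appearance.
unique_ips() {
  grep -o '"IPAddress": "[0-9.][0-9.]*"' | cut -d'"' -f4 | awk '!seen[$0]++'
}
```

It would be used as docker inspect redis-6379 redis-6380 | unique_ips, with whatever container names or IDs are of interest.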

    3. Configure the cluster, as described in the following steps.
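  4. Redis Cluster node discovery commands
    // Cluster
    CLUSTER INFO  Print information about the cluster.
    CLUSTER NODES  List all nodes currently known to the cluster, along with information about them.

    // Node
    CLUSTER MEET <ip> <port>  Add the node at ip and port to the cluster, making it a member.
    CLUSTER FORGET <node_id>  Remove the node identified by node_id from the cluster.
    CLUSTER REPLICATE <node_id>  Make the current node a replica of the node identified by node_id.
    CLUSTER SAVECONFIG  Save the node's cluster configuration file to disk.

    // Slot
    CLUSTER ADDSLOTS <slot> [slot ...]  Assign one or more slots to the current node.
    CLUSTER DELSLOTS <slot> [slot ...]  Remove the assignment of one or more slots from the current node.
    CLUSTER FLUSHSLOTS  Remove all slots assigned to the current node, leaving it with no slots.
    CLUSTER SETSLOT <slot> NODE <node_id>  Assign slot to the node identified by node_id; if the slot is already assigned to another node, that node drops it first and then the assignment is made.
    CLUSTER SETSLOT <slot> MIGRATING <node_id>  Migrate slot from this node to the node identified by node_id.
    CLUSTER SETSLOT <slot> IMPORTING <node_id>  Import slot into this node from the node identified by node_id.
    CLUSTER SETSLOT <slot> STABLE  Cancel the import or migration of slot.

    // Key
    CLUSTER KEYSLOT <key>  Compute which slot the key belongs to.
    CLUSTER COUNTKEYSINSLOT <slot>  Return the number of key-value pairs currently in slot.
    CLUSTER GETKEYSINSLOT <slot> <count>  Return count keys from slot.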
    

      
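    Redis cluster discovery (the node handshake) is the process by which a set of nodes running in cluster mode talk to one another over the Gossip protocol until each is aware of the others.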
     

    192.168.10.52:6379> CLUSTER MEET 172.17.0.3 6379
    OK
    192.168.10.52:6379> CLUSTER MEET 172.17.0.4 6379
    OK
    192.168.10.52:6379> CLUSTER MEET 172.17.0.5 6379
    OK
    192.168.10.52:6379> CLUSTER MEET 172.17.0.6 6379
    OK
    192.168.10.52:6379> CLUSTER MEET 172.17.0.7 6379
    OK
    192.168.10.52:6379>  CLUSTER NODES
    54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c 172.17.0.3:6379@16379 master - 0 1528697195600 1 connected
    f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 172.17.0.4:6379@16379 master - 0 1528697195600 0 connected
    ae86224a3bc29c4854719c83979cb7506f37787a 172.17.0.7:6379@16379 master - 0 1528697195600 5 connected
    98aebcfe42d8aaa8a3375e4a16707107dc9da683 172.17.0.6:6379@16379 master - 0 1528697194000 4 connected
    0bbdc4176884ef0e3bb9b2e7d03d91b0e7e11f44 172.17.0.5:6379@16379 master - 0 1528697194995 3 connected
    760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 172.17.0.2:6379@16379 myself,master - 0 1528697195000 2 connected
    

      
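    The six nodes now form a cluster, but it cannot serve requests yet because no slots have been assigned to the nodes.

    1. Check the slot assignment
      Check how many slots 172.17.0.2:6379 has
      192.168.10.52:6379> CLUSTER INFO
      cluster_state:fail
      cluster_slots_assigned:0    # number of assigned slots is 0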
      cluster_slots_ok:0
      cluster_slots_pfail:0
      cluster_slots_fail:0
      cluster_known_nodes:6
      cluster_size:0
      cluster_current_epoch:5
      cluster_my_epoch:2
      cluster_stats_messages_ping_sent:260418
      cluster_stats_messages_pong_sent:260087
      cluster_stats_messages_meet_sent:10
      cluster_stats_messages_sent:520515
      cluster_stats_messages_ping_received:260086
      cluster_stats_messages_pong_received:260328
      cluster_stats_messages_meet_received:1
      cluster_stats_messages_received:520415
      

        
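      The cluster state above is fail because no slots have been assigned. Every one of the 16384 slots must be assigned before the cluster becomes usable.

    2. Assign the slots

      Slots are assigned with CLUSTER ADDSLOTS <slot>. A slot can belong to only one node, all 16384 slots must be assigned, and the assignments on different nodes must not overlap.
      So a script, addslots.sh, does the assignment: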
      #!/bin/bash
      # node1 192.168.10.52   172.17.0.2
      n=0
      for ((i=n;i<=5461;i++))
      do
         /usr/local/bin/redis-cli -h 192.168.10.52 -p 6379 -a 123456  CLUSTER ADDSLOTS $i
      done
      
      
      # node2 192.168.10.52    172.17.0.3
      n=5462
      for ((i=n;i<=10922;i++))
      do
         /usr/local/bin/redis-cli -h 192.168.10.52 -p 6380 -a 123456 CLUSTER ADDSLOTS $i
      done
      
      
      # node3 192.168.10.52    172.17.0.4
      n=10923
      for ((i=n;i<=16383;i++))
      do
         /usr/local/bin/redis-cli -h 192.168.10.52 -p 6381 -a 123456 CLUSTER ADDSLOTS $i
      done
      

        
      Here, -a 123456 supplies the password.
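
The three ranges in addslots.sh come from splitting 16384 slots as evenly as possible across three masters. The sketch below shows that arithmetic for any number of masters; slot_ranges is a hypothetical helper, and giving the leftover slots to the first nodes is a choice made here because it reproduces the ranges used in the script.

```shell
#!/bin/bash
# Print "first last" slot ranges that split 16384 slots across n masters.
# Any remainder is handed out one slot at a time to the first nodes.
slot_ranges() {
  local n=$1 total=16384 start=0 i=0 size
  local base=$(( total / n ))
  local rem=$(( total % n ))
  while [ "$i" -lt "$n" ]; do
    size=$(( base + (i < rem ? 1 : 0) ))
    echo "$start $(( start + size - 1 ))"
    start=$(( start + size ))
    i=$(( i + 1 ))
  done
}

slot_ranges 3
```

Because CLUSTER ADDSLOTS accepts several slots in one call, a whole range could also be sent at once, for example redis-cli -h 192.168.10.52 -p 6379 -a 123456 CLUSTER ADDSLOTS $(seq 0 5461), rather than one redis-cli invocation per slot.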

      192.168.10.52:6379> CLUSTER INFO
      cluster_state:fail           # cluster state is fail
      cluster_slots_assigned:16101    # assignment has not finished yet
      cluster_slots_ok:16101
      cluster_slots_pfail:0
      cluster_slots_fail:0
      cluster_known_nodes:6
      cluster_size:3
      cluster_current_epoch:5
      cluster_my_epoch:2
      cluster_stats_messages_ping_sent:266756
      cluster_stats_messages_pong_sent:266528
      cluster_stats_messages_meet_sent:10
      cluster_stats_messages_sent:533294
      cluster_stats_messages_ping_received:266527
      cluster_stats_messages_pong_received:266666
      cluster_stats_messages_meet_received:1
      cluster_stats_messages_received:533194
      192.168.10.52:6379> CLUSTER INFO
      cluster_state:ok                # cluster state is ok
      cluster_slots_assigned:16384    # all slots are now assigned
      cluster_slots_ok:16384
      cluster_slots_pfail:0
      cluster_slots_fail:0
      cluster_known_nodes:6
      cluster_size:3
      cluster_current_epoch:5
      cluster_my_epoch:2
      cluster_stats_messages_ping_sent:266757
      cluster_stats_messages_pong_sent:266531
      cluster_stats_messages_meet_sent:10
      cluster_stats_messages_sent:533298
      cluster_stats_messages_ping_received:266530
      cluster_stats_messages_pong_received:266667
      cluster_stats_messages_meet_received:1
      cluster_stats_messages_received:533198

        

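      In short, once all slots are assigned the cluster works. If we carelessly remove even one slot, the cluster stops working immediately. Try it yourself with CLUSTER DELSLOTS 0.

  5. Making the cluster highly available
    We now have a complete, working Redis cluster, but every node is a single point of failure: if one node goes down, the whole cluster fails because its slots are no longer fully covered. So each node needs a standby replica.
    We created six nodes in advance and used three of them to build the cluster, so the remaining three can be used directly as replicas.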
    192.168.10.52:6379> CLUSTER INFO
    cluster_state:ok
    cluster_slots_assigned:16384
    cluster_slots_ok:16384
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:6   # 6 nodes in total
    cluster_size:3          # 3 nodes hold slots
    cluster_current_epoch:5
    cluster_my_epoch:2
    cluster_stats_messages_ping_sent:270127
    cluster_stats_messages_pong_sent:269893
    cluster_stats_messages_meet_sent:10
    cluster_stats_messages_sent:540030
    cluster_stats_messages_ping_received:269892
    cluster_stats_messages_pong_received:270037
    cluster_stats_messages_meet_received:1
    cluster_stats_messages_received:539930
    

      
    List the IDs of all nodes

    192.168.10.52:6379> CLUSTER NODES
    54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c 172.17.0.3:6379@16379 master - 0 1528704114535 1 connected 5462-10922
    f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 172.17.0.4:6379@16379 master - 0 1528704114000 0 connected 10923-16383
    ae86224a3bc29c4854719c83979cb7506f37787a 172.17.0.7:6379@16379 master - 0 1528704114023 5 connected
    98aebcfe42d8aaa8a3375e4a16707107dc9da683 172.17.0.6:6379@16379 master - 0 1528704115544 4 connected
    0bbdc4176884ef0e3bb9b2e7d03d91b0e7e11f44 172.17.0.5:6379@16379 master - 0 1528704114836 3 connected
    760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 172.17.0.2:6379@16379 myself,master - 0 1528704115000 2 connected 0-5461
    

      
    Write a script to add the replica nodes

    [root@etcd2 tmp]# vi addSlaveNodes.sh 
    #!/bin/bash
    
    /usr/local/bin/redis-cli -h 192.168.10.52 -p 6382 -a 123456 CLUSTER REPLICATE 760e4d0039c5ac13d04aa4791c9e6dc28544d7c7
    
    /usr/local/bin/redis-cli -h 192.168.10.52 -p 6383 -a 123456 CLUSTER REPLICATE 54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c
    
    /usr/local/bin/redis-cli -h 192.168.10.52 -p 6384 -a 123456 CLUSTER REPLICATE f45f9109f2297a83b1ac36f9e1db5e70bbc174ab
    

      
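    Notes: 1. A node used as a replica must have no slots assigned, otherwise the operation fails with (error) ERR To set a master the node must be empty and without assigned slots.
               2. The command must be run on the node that is to become the replica: CLUSTER REPLICATE [node_id] makes the current node a replica of node_id.
               3. Adding a replica (cluster replication): replication works the same way as in standalone Redis. The difference is that a replica in a cluster must also run in cluster mode and must join the cluster before replication is set up.
    Check all node information: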

    192.168.10.52:6379> CLUSTER NODES
    54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c 172.17.0.3:6379@16379 master - 0 1528705604149 1 connected 5462-10922
    f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 172.17.0.4:6379@16379 master - 0 1528705603545 0 connected 10923-16383
    ae86224a3bc29c4854719c83979cb7506f37787a 172.17.0.7:6379@16379 slave f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 0 1528705603144 5 connected
    98aebcfe42d8aaa8a3375e4a16707107dc9da683 172.17.0.6:6379@16379 slave 54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c 0 1528705603000 4 connected
    0bbdc4176884ef0e3bb9b2e7d03d91b0e7e11f44 172.17.0.5:6379@16379 slave 760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 0 1528705603000 3 connected
    760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 172.17.0.2:6379@16379 myself,master - 0 1528705602000 2 connected 0-5461
    

      
    We now have a highly available cluster with three masters and three replicas.
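
Matching 40-character node IDs by eye is error-prone. The sketch below turns CLUSTER NODES output into replica -> master address pairs; replica_map is a hypothetical helper and assumes the field layout shown in the output above (ID, address@bus-port, flags, master ID).

```shell
#!/bin/bash
# Read CLUSTER NODES output on stdin and print "replica_addr -> master_addr" lines.
replica_map() {
  awk '
    { split($2, a, "@"); addr[$1] = a[1] }      # remember ip:port for every node ID
    $3 ~ /slave/ { master[$1] = $4 }            # field 4 is the master ID of a replica
    END { for (id in master) print addr[id], "->", addr[master[id]] }
  ' | sort
}
```

It would be fed from the CLI, for example redis-cli -h 192.168.10.52 -p 6379 -a 123456 CLUSTER NODES | replica_map.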

  6. High-availability test: failover
    Check the current state:
    192.168.10.52:6379> CLUSTER NODES
    54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c 172.17.0.3:6379@16379 master - 0 1528705604149 1 connected 5462-10922
    f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 172.17.0.4:6379@16379 master - 0 1528705603545 0 connected 10923-16383
    ae86224a3bc29c4854719c83979cb7506f37787a 172.17.0.7:6379@16379 slave f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 0 1528705603144 5 connected
    98aebcfe42d8aaa8a3375e4a16707107dc9da683 172.17.0.6:6379@16379 slave 54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c 0 1528705603000 4 connected
    0bbdc4176884ef0e3bb9b2e7d03d91b0e7e11f44 172.17.0.5:6379@16379 slave 760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 0 1528705603000 3 connected
    760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 172.17.0.2:6379@16379 myself,master - 0 1528705602000 2 connected 0-5461
    

      Everything above is running normally.

    Now try shutting down a master. We pick the container mapped to port 6380; after stopping it:

    192.168.10.52:6379> CLUSTER NODES
    54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c 172.17.0.3:6379@16379 master,fail - 1528706408935 1528706408000 1 connected 5462-10922
    f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 172.17.0.4:6379@16379 master - 0 1528706463000 0 connected 10923-16383
    ae86224a3bc29c4854719c83979cb7506f37787a 172.17.0.7:6379@16379 slave f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 0 1528706462980 5 connected
    98aebcfe42d8aaa8a3375e4a16707107dc9da683 172.17.0.6:6379@16379 slave 54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c 0 1528706463000 4 connected
    0bbdc4176884ef0e3bb9b2e7d03d91b0e7e11f44 172.17.0.5:6379@16379 slave 760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 0 1528706463985 3 connected
    760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 172.17.0.2:6379@16379 myself,master - 0 1528706462000 2 connected 0-5461
    192.168.10.52:6379> 
    192.168.10.52:6379> CLUSTER INFO
    cluster_state:fail
    cluster_slots_assigned:16384
    cluster_slots_ok:10923
    cluster_slots_pfail:0
    cluster_slots_fail:5461
    cluster_known_nodes:6
    cluster_size:3
    cluster_current_epoch:5
    cluster_my_epoch:2
    cluster_stats_messages_ping_sent:275112
    cluster_stats_messages_pong_sent:274819
    cluster_stats_messages_meet_sent:10
    cluster_stats_messages_fail_sent:5
    cluster_stats_messages_sent:549946
    cluster_stats_messages_ping_received:274818
    cluster_stats_messages_pong_received:275004
    cluster_stats_messages_meet_received:1
    cluster_stats_messages_fail_received:1
    cluster_stats_messages_received:549824
    

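      The whole cluster has failed, and the replica was not promoted to master automatically. What happened?
    After restarting the stopped container, the log was checked with [root@df6ebce6f12a /]# tail -f /var/log/redis/redis-server.log :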

    1:S 11 Jun 09:57:46.712 # Cluster state changed: ok
    1:S 11 Jun 09:57:46.718 * (Non critical) Master does not understand REPLCONF listening-port: -NOAUTH Authentication required.
    1:S 11 Jun 09:57:46.718 * (Non critical) Master does not understand REPLCONF capa: -NOAUTH Authentication required.
    1:S 11 Jun 09:57:46.719 * Partial resynchronization not possible (no cached master)
    1:S 11 Jun 09:57:46.719 # Unexpected reply to PSYNC from master: -NOAUTH Authentication required.
    1:S 11 Jun 09:57:46.719 * Retrying with SYNC...
    1:S 11 Jun 09:57:46.719 # MASTER aborted replication with an error: NOAUTH Authentication required.
    1:S 11 Jun 09:57:46.782 * Connecting to MASTER 172.17.0.6:6379
    1:S 11 Jun 09:57:46.782 * MASTER <-> SLAVE sync started
    1:S 11 Jun 09:57:46.782 * Non blocking connect for SYNC fired the event.


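    The log shows that master-replica traffic requires auth. Earlier I had forgotten to set masterauth (the # masterauth <master-password> line) in redis.conf, so master and replica could not communicate. After fixing the configuration, automatic failover works.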
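
This mistake can be caught before the image is built. The sketch below checks that requirepass and masterauth are both set to the same value in a redis.conf; auth_consistent is a name made up for this example.

```shell
#!/bin/bash
# Return 0 if requirepass and masterauth are both set and identical in the given
# redis.conf. Commented-out directives (lines starting with "#") are ignored.
auth_consistent() {
  local req mas
  req=$(awk '$1 == "requirepass" { v = $2 } END { print v }' "$1")
  mas=$(awk '$1 == "masterauth" { v = $2 } END { print v }' "$1")
  [ -n "$req" ] && [ "$req" = "$mas" ]
}
```

For example: auth_consistent redis-4.0.1/redis.conf || echo "masterauth does not match requirepass".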


    Sometimes a manual failover is needed:

    Log in to 6383, the replica of the node on port 6380, and run the CLUSTER FAILOVER command:
    192.168.10.52:6383> CLUSTER  FAILOVER
    (error) ERR Master is down or failed, please use CLUSTER FAILOVER FORCE
    

      
    Because the master is already down, a forced failover is required

    192.168.10.52:6383> CLUSTER FAILOVER FORCE
    OK
    

      
    Check the current cluster nodes:

    192.168.10.52:6383>  CLUSTER NODES
    0bbdc4176884ef0e3bb9b2e7d03d91b0e7e11f44 172.17.0.5:6379@16379 slave 760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 0 1528707535332 3 connected
    ae86224a3bc29c4854719c83979cb7506f37787a 172.17.0.7:6379@16379 slave f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 0 1528707534829 5 connected
    f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 172.17.0.4:6379@16379 master - 0 1528707534527 0 connected 10923-16383
    98aebcfe42d8aaa8a3375e4a16707107dc9da683 172.17.0.6:6379@16379 myself,master - 0 1528707535000 6 connected 5462-10922
    760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 172.17.0.2:6379@16379 master - 0 1528707535834 2 connected 0-5461
    54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c 172.17.0.3:6379@16379 master,fail - 1528707472833 1528707472000 1 connected
    

      
    The replica has been promoted to master. Next, we restart Redis on the 6380 node (that is, we restart the stopped container):

    192.168.10.52:6383>  CLUSTER NODES
    0bbdc4176884ef0e3bb9b2e7d03d91b0e7e11f44 172.17.0.5:6379@16379 slave 760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 0 1528707556044 3 connected
    ae86224a3bc29c4854719c83979cb7506f37787a 172.17.0.7:6379@16379 slave f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 0 1528707555000 5 connected
    f45f9109f2297a83b1ac36f9e1db5e70bbc174ab 172.17.0.4:6379@16379 master - 0 1528707556000 0 connected 10923-16383
    98aebcfe42d8aaa8a3375e4a16707107dc9da683 172.17.0.6:6379@16379 myself,master - 0 1528707556000 6 connected 5462-10922
    760e4d0039c5ac13d04aa4791c9e6dc28544d7c7 172.17.0.2:6379@16379 master - 0 1528707556000 2 connected 0-5461
    54cb5c2eb8e5f5aed2d2f7843f75a9284ef6785c 172.17.0.3:6379@16379 slave 98aebcfe42d8aaa8a3375e4a16707107dc9da683 0 1528707556547 6 connected
    

      

    The 6380 node has now become a replica of the 6383 node.

    The cluster should be complete again, so its state should have recovered. Let's check:
    192.168.10.52:6383> CLUSTER INFO
    cluster_state:ok
    cluster_slots_assigned:16384
    cluster_slots_ok:16384
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:6
    cluster_size:3
    cluster_current_epoch:6
    cluster_my_epoch:6
    cluster_stats_messages_ping_sent:19419
    cluster_stats_messages_pong_sent:19443
    cluster_stats_messages_meet_sent:1
    cluster_stats_messages_auth-req_sent:5
    cluster_stats_messages_update_sent:1
    cluster_stats_messages_sent:38869
    cluster_stats_messages_ping_received:19433
    cluster_stats_messages_pong_received:19187
    cluster_stats_messages_meet_received:5
    cluster_stats_messages_fail_received:4
    cluster_stats_messages_auth-ack_received:2
    cluster_stats_messages_received:38631
    

      

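    OK, no problems.

  7. Accessing the cluster
    A client only needs to know the address of one node when it initializes. It first tries to run its command, for example get key, on that node. If the slot holding the key lives on that node, the command succeeds directly. If it does not, the node returns a MOVED error that names the node owning the slot, and the client can then run the command on that node.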
    192.168.10.52:6383> get hello
    (error) MOVED 866 172.17.0.2:6379
    
    192.168.10.52:6379> set number 20004
    (error) MOVED 7743 172.17.0.3:6379
    

      Also, a Redis cluster uses only db0. The select command accepts select 0, but selecting any other db returns an error.

    192.168.10.52:6383> select 0
    OK
    192.168.10.52:6383> select 1
    (error) ERR SELECT is not allowed in cluster mode
    

     

 

  1. Recently a reader asked about an error when connecting to the Docker Redis cluster. The error was as follows:

     

     

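    My first guess was that not all nodes had been added, but the problem persisted after adding them. Since this was cross-host access, the likely cause was that the node addresses could not be routed. When I wrote the tutorial above, Docker was running in its default bridge network mode; the goal at the time was learning and documentation, mostly with single-host access. Real deployments, however, usually involve cross-host access over a public network. With the cause clear, it seemed best for the cluster to share the host's public IP, so the fix was:
    1. Run the containers with the network mode set to host.
    2. Resolve port conflicts. In host mode the containers occupy the host's ports, so we handle this in the configuration: generate redis-60001.conf, redis-60002.conf, redis-60003.conf, ... on the host, one file per port, and for each container mount one of them to override the configuration inside the container.
      The final run command looks like this: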
       docker run -d --name redis-6380 --net host -v /tmp/redis.conf:/usr/local/redis/redis.conf  hakimdstx/nodes-redis:4.0.1
      


      That resolves the network problem.
      PS. In production, watch out for firewall rules, otherwise errors will still occur.
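
The text does not show how the per-port configuration files are produced. One possible approach is sketched below, assuming /tmp/redis.conf is the edited base configuration and that only the port and cluster-config-file directives need to differ per node. make_port_conf and the redis-<port>.conf naming are illustrative choices. Keep in mind that each node also uses a cluster bus port at the data port plus 10000 (6379@16379 in the CLUSTER NODES output above), so the chosen ports have to leave room for it.

```shell
#!/bin/bash
# Print a copy of a base redis.conf with the port and cluster-config-file
# directives rewritten for the given port; the base file is not modified.
make_port_conf() {
  local base=$1 port=$2
  sed -e "s/^port .*/port $port/" \
      -e "s/^cluster-config-file .*/cluster-config-file nodes-$port.conf/" \
      "$base"
}
```

A possible use: make_port_conf /tmp/redis.conf 6380 > /tmp/redis-6380.conf, then mount that file with -v /tmp/redis-6380.conf:/usr/local/redis/redis.conf in the docker run command above.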

That's all.


