Redis Cluster 4.0 High-Availability Cluster: Installation and Online Migration Notes


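A previous post covered the structure of Redis Cluster and how to deploy a high-availability cluster; this one briefly walks through migrating a Redis cluster. The servers hosting the existing Redis Cluster were underpowered, so the cluster had to move to higher-spec machines. Because this is a live production environment, the migration was done online, without interrupting service. The procedure is as follows: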

1. Machine environment

Environment before migration
-------------------------------------------------------------------------------
Hostname           IP address        Node ports
redis-node01       172.16.60.207     7000,7001
redis-node02       172.16.60.208     7002,7003
redis-node03       172.16.60.209     7004,7005
 
Environment after migration
-------------------------------------------------------------------------------
Hostname           IP address        Node ports
redis-new01        172.16.60.202     7000,7001
redis-new02        172.16.60.204     7002,7003
redis-new03        172.16.60.205     7004,7005

2. Deploying the pre-migration Redis Cluster high-availability environment (using a "three masters, three slaves" layout)

The installation steps are identical on all three node machines
[root@redis-node01 ~] # yum install -y gcc gcc-c++ make kernel-devel automake autoconf libtool wget tcl vim ruby rubygems unzip git
[root@redis-node01 ~] # /etc/init.d/iptables stop
[root@redis-node01 ~] # setenforce 0
[root@redis-node01 ~] # vim /etc/sysconfig/selinux
SELINUX=disabled
    
Download, compile and install redis
[root@redis-node01 ~] # mkdir -p /data/software/
[root@redis-node01 ~] # cd /data/software/
[root@redis-node01 software] # wget http://download.redis.io/releases/redis-4.0.6.tar.gz
[root@redis-node01 software] # tar -zvxf redis-4.0.6.tar.gz
[root@redis-node01 software] # mv redis-4.0.6 /data/
[root@redis-node01 software] # cd /data/redis-4.0.6/
[root@redis-node01 redis-4.0.6] # make
    
-------------------------------------------------------------------------------
Create and configure the nodes on each machine
Node 1 configuration
[root@redis-node01 ~] # mkdir /data/redis-4.0.6/redis-cluster
[root@redis-node01 ~] # cd /data/redis-4.0.6/redis-cluster
[root@redis-node01 redis-cluster] # mkdir 7000 7001
[root@redis-node01 redis-cluster] # vim 7000/redis.conf
port 7000
bind 172.16.60.207
daemonize yes
pidfile /var/run/redis_7000.pid
cluster-enabled yes
cluster-config-file nodes_7000.conf
cluster-node-timeout 10100
appendonly yes
    
[root@redis-node01 redis-cluster] # vim 7001/redis.conf
port 7001
bind 172.16.60.207
daemonize yes
pidfile /var/run/redis_7001.pid
cluster-enabled yes
cluster-config-file nodes_7001.conf
cluster-node-timeout 10100
appendonly yes
    
Node 2 configuration
[root@redis-node02 ~] # mkdir /data/redis-4.0.6/redis-cluster
[root@redis-node02 ~] # cd /data/redis-4.0.6/redis-cluster
[root@redis-node02 redis-cluster] # mkdir 7002 7003
[root@redis-node02 redis-cluster] # vim 7002/redis.conf
port 7002
bind 172.16.60.208
daemonize yes
pidfile /var/run/redis_7002.pid
cluster-enabled yes
cluster-config-file nodes_7002.conf
cluster-node-timeout 10100
appendonly yes
    
[root@redis-node02 redis-cluster] # vim 7003/redis.conf
port 7003
bind 172.16.60.208
daemonize yes
pidfile /var/run/redis_7003.pid
cluster-enabled yes
cluster-config-file nodes_7003.conf
cluster-node-timeout 10100
appendonly yes
    
Node 3 configuration
[root@redis-node03 ~] # mkdir /data/redis-4.0.6/redis-cluster
[root@redis-node03 ~] # cd /data/redis-4.0.6/redis-cluster
[root@redis-node03 redis-cluster] # mkdir 7004 7005
[root@redis-node03 redis-cluster] # vim 7004/redis.conf
port 7004
bind 172.16.60.209
daemonize yes
pidfile /var/run/redis_7004.pid
cluster-enabled yes
cluster-config-file nodes_7004.conf
cluster-node-timeout 10100
appendonly yes
    
[root@redis-node03 redis-cluster] # vim 7005/redis.conf
port 7005
bind 172.16.60.209
daemonize yes
pidfile /var/run/redis_7005.pid
cluster-enabled yes
cluster-config-file nodes_7005.conf
cluster-node-timeout 10100
appendonly yes
    
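The per-port configuration files above differ only in the port number and the bind address, so they could be generated from one template instead of being edited by hand. The following is a minimal sketch, not part of the original procedure: the function name write_node_conf and its argument order are assumptions made for illustration, and the settings simply mirror the ones listed above.

```shell
# Sketch: write <base_dir>/<port>/redis.conf for one cluster node.
# write_node_conf is a hypothetical helper, not a Redis tool.
write_node_conf() {
    base_dir=$1   # e.g. /data/redis-4.0.6/redis-cluster
    bind_ip=$2    # e.g. 172.16.60.207
    port=$3       # e.g. 7000

    mkdir -p "$base_dir/$port" || return 1
    # Same eight settings as in the hand-written files above
    cat > "$base_dir/$port/redis.conf" <<EOF
port ${port}
bind ${bind_ip}
daemonize yes
pidfile /var/run/redis_${port}.pid
cluster-enabled yes
cluster-config-file nodes_${port}.conf
cluster-node-timeout 10100
appendonly yes
EOF
}
```

On redis-node01, for example, this could be called as: for p in 7000 7001; do write_node_conf /data/redis-4.0.6/redis-cluster 172.16.60.207 "$p"; done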
-------------------------------------------------------------------------------
Start the redis services on each machine (they are all started from /data/redis-4.0.6/redis-cluster, so files such as nodes_*.conf are also created under this path)
Node 1
[root@redis-node01 redis-cluster] # for((i=0;i<=1;i++)); do /data/redis-4.0.6/src/redis-server /data/redis-4.0.6/redis-cluster/700$i/redis.conf; done
[root@redis-node01 redis-cluster] # ps -ef|grep redis
root      1103     1  0 15:19 ?        00:00:03 /data/redis-4.0.6/src/redis-server 172.16.60.207:7000 [cluster]
root      1105     1  0 15:19 ?        00:00:03 /data/redis-4.0.6/src/redis-server 172.16.60.207:7001 [cluster]
root      1315 32360  0 16:16 pts/1    00:00:00 grep redis
    
Node 2
[root@redis-node02 redis-cluster] # for((i=2;i<=3;i++)); do /data/redis-4.0.6/src/redis-server /data/redis-4.0.6/redis-cluster/700$i/redis.conf; done
[root@redis-node02 redis-cluster] # ps -ef|grep redis
root      9446     1  0 15:19 ?        00:00:03 /data/redis-4.0.6/src/redis-server 172.16.60.208:7002 [cluster]
root      9448     1  0 15:19 ?        00:00:03 /data/redis-4.0.6/src/redis-server 172.16.60.208:7003 [cluster]
root      9644  8540  0 16:17 pts/0    00:00:00 grep redis
    
Node 3
[root@redis-node03 redis-cluster] # for((i=4;i<=5;i++)); do /data/redis-4.0.6/src/redis-server /data/redis-4.0.6/redis-cluster/700$i/redis.conf; done
[root@redis-node03 ~] # ps -ef|grep redis
root      9486     1  0 15:19 ?        00:00:03 /data/redis-4.0.6/src/redis-server 172.16.60.209:7004 [cluster]
root      9488     1  0 15:19 ?        00:00:03 /data/redis-4.0.6/src/redis-server 172.16.60.209:7005 [cluster]
root      9686  9555  0 16:17 pts/0    00:00:00 grep redis
    
-------------------------------------------------------------------------------
Next, install Ruby on node 1 (it only needs to be installed on one of the nodes)
[root@redis-node01 ~] # yum -y install ruby ruby-devel rubygems rpm-build
[root@redis-node01 ~] # gem install redis
    
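Tip:
On CentOS 6.x the "gem install redis" step above may fail, and there are many pitfalls!
The Ruby version installed by yum by default is 1.8.7, which is too old; it must be upgraded to Ruby 2.2 or later, otherwise the installation above fails!
    
First install rvm (or download the certificate directly: https://pan.baidu.com/s/1slTyJ7n   password: 7uan   after downloading and extracting it, simply run "curl -L get.rvm.io | bash -s stable")
[root@redis-node01 ~] # curl -L get.rvm.io | bash -s stable          // may fail; if so, follow the prompt and run the next step
[root@redis-node01 ~] # curl -sSL https://rvm.io/mpapis.asc | gpg2 --import -      // then run again: curl -L get.rvm.io | bash -s stable
[root@redis-node01 ~] # find / -name rvm.sh
/etc/profile.d/rvm.sh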
[root@redis-node01 ~] # source /etc/profile.d/rvm.sh
[root@redis-node01 ~] # rvm requirements
      
Then upgrade Ruby to 2.3
[root@redis-node01 ~] # rvm install ruby 2.3.1
[root@redis-node01 ~] # ruby -v
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]
      
List all Ruby versions
[root@redis-node01 ~] # rvm list
      
Set the default version
[root@redis-node01 ~] # rvm --default use 2.3.1
      
Update the gem source
[root@redis-node01 ~] # gem sources --add https://gems.ruby-china.org/ --remove https://rubygems.org
https://gems.ruby-china.org/ added to sources
source https://rubygems.org not present in cache
      
[root@redis-node01 ~] # gem sources
*** CURRENT SOURCES ***
      
https://rubygems.org/
https://gems.ruby-china.org/
      
Now the installation succeeds
[root@redis-node01 ~] # gem install redis
Successfully installed redis-4.0.6
Parsing documentation for redis-4.0.6
Done installing documentation for redis after 1 seconds
1 gem installed
    
-------------------------------------------------------------------------------
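Next, create the Redis Cluster (run this on the node 1 machine only)
    
First specify the three master nodes manually. The masters should preferably be spread across the three machines
[root@redis-node01 ~] # /data/redis-4.0.6/src/redis-trib.rb create 172.16.60.207:7000 172.16.60.208:7002 172.16.60.209:7004
    
Then add a slave for each of the three masters above. The slaves should also preferably be spread across the three machines. (The second argument only identifies an existing cluster node; without --master-id <node-id>, redis-trib.rb attaches the new slave to a master that has the fewest replicas.)
[root@redis-node01 ~] # /data/redis-4.0.6/src/redis-trib.rb add-node --slave 172.16.60.208:7003 172.16.60.207:7000
[root@redis-node01 ~] # /data/redis-4.0.6/src/redis-trib.rb add-node --slave 172.16.60.209:7005 172.16.60.208:7002
[root@redis-node01 ~] # /data/redis-4.0.6/src/redis-trib.rb add-node --slave 172.16.60.207:7001 172.16.60.209:7004
    
Then check the state of the Redis Cluster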
[root@redis-node01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb check 172.16.60.207:7000
>>> Performing Cluster Check (using node 172.16.60.207:7000)
M: 971d05cd7b9bb3634ad024e6aac3dff158c52eee 172.16.60.207:7000
    slots:0-5460 (5461 slots) master
    1 additional replica(s)
S: e7592314869c29375599d781721ad76675645c4c 172.16.60.209:7005
    slots: (0 slots) slave
    replicates 0060012d749167d3f72833d916e53b3445b66c62
S: 52b8d27838244657d9b01a233578f24d287979fe 172.16.60.208:7003
    slots: (0 slots) slave
    replicates 971d05cd7b9bb3634ad024e6aac3dff158c52eee
S: 213bde6296c36b5f31b958c7730ff1629125a204 172.16.60.207:7001
    slots: (0 slots) slave
    replicates e936d5b4c95b6cae57f994e95805aef87ea4a7a5
M: e936d5b4c95b6cae57f994e95805aef87ea4a7a5 172.16.60.209:7004
    slots:10923-16383 (5461 slots) master
    1 additional replica(s)
M: 0060012d749167d3f72833d916e53b3445b66c62 172.16.60.208:7002
    slots:5461-10922 (5462 slots) master
    1 additional replica(s)
[OK] All nodes agree about slots configuration.
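>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
  
As shown above, only master nodes hold slots; slave nodes all have 0 slots. In other words, the keys live on the master nodes.
The three masters split the 16384 slots between them: 0-5460, 5461-10922 and 10923-16383.
If a master and its slave both go down, the 16384 slots are no longer fully covered and the whole cluster stops serving; the cluster recovers only after that master-slave pair is brought back.
A newly added master starts with 0 slots and needs a reshard to be given slots (the tool asks how many hash slots to move to the node), as covered later.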
 
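Redis Cluster maps each key to a slot by taking the CRC16 (XMODEM variant) of the key modulo 16384, which is why different keys end up on different masters. The sketch below computes that value in plain shell. The function names crc16_xmodem and key_slot are illustrative, and the sketch deliberately ignores hash tags (the {...} syntax), so it only applies to plain keys without braces.

```shell
# Sketch: compute the Redis Cluster hash slot of a plain key (no hash tags).
# crc16_xmodem and key_slot are illustrative names, not Redis commands.
crc16_xmodem() {
    crc=0
    # od prints every byte of the key as an unsigned decimal number
    for byte in $(printf '%s' "$1" | od -An -v -tu1); do
        crc=$(( crc ^ (byte << 8) ))
        bit=0
        while [ "$bit" -lt 8 ]; do
            if [ $(( crc & 0x8000 )) -ne 0 ]; then
                # top bit set: shift and apply the polynomial 0x1021
                crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
            else
                crc=$(( (crc << 1) & 0xFFFF ))
            fi
            bit=$(( bit + 1 ))
        done
    done
    echo "$crc"
}

key_slot() {
    # 16384 is the fixed number of slots in a Redis Cluster
    echo $(( $(crc16_xmodem "$1") % 16384 ))
}
```

On a running cluster, the CLUSTER KEYSLOT command in redis-cli returns the authoritative slot for a key, which is a convenient way to cross-check a helper like this.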
Write a few test entries
Log in to the three master nodes and write data (when writing while logged in to a slave node, the write is also redirected automatically to the master)
[root@redis-node01 redis-cluster] # /data/redis-4.0.6/src/redis-cli -h 172.16.60.207 -c -p 7000
172.16.60.207:7000> set test1 test-207
OK
172.16.60.207:7000> set test11 test-207-207
-> Redirected to slot [13313] located at 172.16.60.209:7004
OK
 
[root@redis-node01 redis-cluster] # /data/redis-4.0.6/src/redis-cli -h 172.16.60.208 -c -p 7002
172.16.60.208:7002> set test2 test-208
OK
172.16.60.208:7002> set test22 test-208-208
-> Redirected to slot [4401] located at 172.16.60.207:7000
OK
 
[root@redis-node01 redis-cluster] # /data/redis-4.0.6/src/redis-cli -h 172.16.60.209 -c -p 7004
172.16.60.209:7004> set test3 test-209
OK
172.16.60.209:7004> set test33 test-209-209
OK
 
Read the data
[root@redis-node01 redis-cluster] # /data/redis-4.0.6/src/redis-cli -h 172.16.60.207 -c -p 7000
172.16.60.207:7000> get test1
"test-207"
172.16.60.207:7000> get test11
-> Redirected to slot [13313] located at 172.16.60.209:7004
"test-207-207"
172.16.60.209:7004> get test2
-> Redirected to slot [8899] located at 172.16.60.208:7002
"test-208"
172.16.60.208:7002> get test22
-> Redirected to slot [4401] located at 172.16.60.207:7000
"test-208-208"
172.16.60.207:7000> get test3
-> Redirected to slot [13026] located at 172.16.60.209:7004
"test-209"
172.16.60.209:7004> get test33
"test-209-209"
172.16.60.209:7004>

3. Online migration

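The redis installation steps on the three new machines are omitted; they are the same as above.
The node configuration on the three new machines is the same as on the three pre-migration machines; only the IP addresses change. Paths and ports stay the same.
Start the redis node services on the three new machines.
Install Ruby on the new node redis-new01; the steps are omitted, as they are the same as above.
 
Add the nodes on the three new machines to the existing cluster.
=====================
First add the master nodes
Command format: "redis-trib.rb add-node <new node> <existing cluster node>"
The first argument is the IP and master port of the new node; the second is the IP and master port of any existing node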
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb add-node 172.16.60.202:7000 172.16.60.207:7000
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb add-node 172.16.60.204:7002 172.16.60.207:7000
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb add-node 172.16.60.205:7004 172.16.60.207:7000
 
=====================
Then add the slave nodes on the new machines
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb add-node --slave 172.16.60.204:7003  172.16.60.202:7000
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb add-node --slave 172.16.60.205:7005  172.16.60.204:7002
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb add-node --slave 172.16.60.202:7001  172.16.60.205:7004
 
Check the cluster state at this point
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb check 172.16.60.202:7000
 
Check the cluster's hash slot allocation
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb info 172.16.60.202:7000
172.16.60.202:7000 (a0169bec...) -> 0 keys | 0 slots | 1 slaves.
172.16.60.209:7004 (47cde5c7...) -> 3 keys | 5461 slots | 1 slaves.
172.16.60.208:7002 (656fc84a...) -> 1 keys | 5462 slots | 1 slaves.
172.16.60.205:7004 (48cbab90...) -> 0 keys | 0 slots | 1 slaves.
172.16.60.207:7000 (a8fe2d6e...) -> 2 keys | 5461 slots | 1 slaves.
172.16.60.204:7002 (c6a78cfb...) -> 0 keys | 0 slots | 1 slaves.
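[OK] 6 keys in 6 masters.
0.00 keys per slot on average.
 
A newly added master has 0 slots by default, and a master without slots is never chosen for reads or writes!
Data is stored only on master nodes!
So the newly added masters must be given slots, i.e. a reshard is needed.
 
From the slot information printed after the last new master was added, the existing masters hold the following slots: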
172.16.60.207:7000   -->  slots:0-5460 (5461 slots) master
172.16.60.208:7002   -->  slots:5461-10922 (5462 slots) master
172.16.60.209:7004   -->  slots:10923-16383 (5461 slots) master
 
Now start assigning slots to the three newly added masters
a) Move all slots (5461) of 172.16.60.207:7000 to 172.16.60.202:7000
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb reshard 172.16.60.202:7000
........
How many slots do you want to move (from 1 to 16384)? 5461           # number of slots to move (here, every slot held by 172.16.60.207:7000)
What is the receiving node ID? a0169becd97ccca732d905fd762b4d615674f7bd       # node that receives the slots: the ID of 172.16.60.202:7000
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:971d05cd7b9bb3634ad024e6aac3dff158c52eee          # node the slots are taken from: the ID of 172.16.60.207:7000. Entering "all" instead draws the requested number of slots from all existing masters.
Source node #2:done                       # enter done
.......
Do you want to proceed with the proposed reshard plan (yes/no)? yes      # enter yes to confirm
 
==================================================================
A problem may occur here: the resharding is interrupted partway, leaving both nodes holding slots.
Moving slot 4396 from 172.16.60.207:7000 to 172.16.60.202:7000:
Moving slot 4397 from 172.16.60.207:7000 to 172.16.60.202:7000:
Moving slot 4398 from 172.16.60.207:7000 to 172.16.60.202:7000:
Moving slot 4399 from 172.16.60.207:7000 to 172.16.60.202:7000:
Moving slot 4400 from 172.16.60.207:7000 to 172.16.60.202:7000:
Moving slot 4401 from 172.16.60.207:7000 to 172.16.60.202:7000:
[ERR] Calling MIGRATE: ERR Syntax error, try CLIENT (LIST | KILL | GETNAME | SETNAME | PAUSE | REPLY)
 
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb check 172.16.60.202:7000  
>>> Performing Cluster Check (using node 172.16.60.202:7000)
M: a0169becd97ccca732d905fd762b4d615674f7bd 172.16.60.202:7000
    slots:0-4400 (4401 slots) master
    1 additional replica(s)
.......
M: 971d05cd7b9bb3634ad024e6aac3dff158c52eee 172.16.60.207:7000
    slots:4401-5460 (1060 slots) master
    1 additional replica(s)
 
Analysis of the cause:
The error reported while resharding slots is: Syntax error, try CLIENT (LIST|KILL|GETNAME|SETNAME|PAUSE|REPLY)
However, migrating slots that contain no key-value pairs succeeds. So the problem depends on whether the slot contains keys!
 
Tracing the reshard code path shows that the actual migration is done by the move_slot function (in redis-trib.rb).
Open move_slot and locate the migration code.
[root@redis-new01 redis-cluster] # cp /data/redis-4.0.6/src/redis-trib.rb /tmp/
[root@redis-new01 redis-cluster] # cat /data/redis-4.0.6/src/redis-trib.rb|grep source.r.client.call
                source.r.client.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:keys,*keys])
                    source.r.client.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys])
 
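The source.r.client.call lines matched by grep above are where redis-trib.rb tells the client to run the command that migrates slots containing key-value pairs.
 
When this call is actually issued, it is equivalent to
"client migrate target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys"
 
So how does the server execute this command?
It first runs clientCommand(client *c) in networking.c
 
That function compares the arguments one by one (if statements). This is where the bug appears: clientCommand has no migrate branch.
So it returns    Syntax error, try CLIENT (LIST|KILL|GETNAME|SETNAME|PAUSE|REPLY);
The error message is telling you that CLIENT only has the LIST|KILL|GETNAME|SETNAME|PAUSE|REPLY branches.
 
So how can this be changed so that slots containing keys are really migrated?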
 
Reading the source further, cluster.c contains migrateCommand(client *c), which is the handler that should run. So the fix is to edit the migrate statements in redis-trib.rb so that they bypass clientCommand and reach migrateCommand directly. That is, change the original
                source.r.client.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:keys,*keys])
                    source.r.client.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys])
to
                source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,"replace",:keys,*keys])
                    source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys])
 
The essential change is dropping ".client" from both calls; this edit also adds "replace" to the first call.
 
With that, the problem is solved!
 
[root@redis-new01 redis-cluster] # cat /data/redis-4.0.6/src/redis-trib.rb |grep source.r.call
                source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,"replace",:keys,*keys])
                    source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys])
 
This bug is caused by a difference in the version of the Ruby redis gem. From version 5.0 on, redis-trib.rb is dropped and cluster management is done directly with the redis-cli client!
==================================================================
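The edit described above can also be applied with sed rather than by hand. This is only a sketch, under the assumption that dropping ".client" from the two source.r.client.call lines is the only change required (it does not add the extra "replace" argument to the first call). The function name patch_redis_trib is illustrative, and a .bak copy of the script is kept next to it.

```shell
# Sketch: rewrite "source.r.client.call(" to "source.r.call(" in redis-trib.rb.
# patch_redis_trib is a hypothetical helper; the original is saved as <file>.bak.
patch_redis_trib() {
    script=$1
    cp "$script" "$script.bak" || return 1
    # Read from the backup and overwrite the script in place,
    # which keeps the script's existing permissions.
    sed 's/source\.r\.client\.call(/source.r.call(/' "$script.bak" > "$script"
}
```

It would be invoked as patch_redis_trib /data/redis-4.0.6/src/redis-trib.rb, after which the grep shown above should list only source.r.call lines.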
 
With redis-trib.rb modified, continue moving the remaining slots of 172.16.60.207:7000 to 172.16.60.202:7000
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb reshard 172.16.60.202:7000
........
>>> Check for open slots...
[WARNING] Node 172.16.60.202:7000 has slots in importing state (4401).
[WARNING] Node 172.16.60.207:7000 has slots in migrating state (4401).
[WARNING] The following slots are open: 4401
>>> Check slots coverage...
[OK] All 16384 slots covered.
*** Please fix your cluster problems before resharding
 
Solution: clear the importing/migrating state of slot 4401 on both nodes with "cluster setslot 4401 stable", then run redis-trib.rb fix:
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-cli -h 172.16.60.202 -c -p 7000
172.16.60.202:7000> cluster setslot 4401 stable
OK
172.16.60.202:7000>
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-cli -h 172.16.60.207 -c -p 7000
172.16.60.207:7000> cluster setslot 4401 stable
OK
172.16.60.207:7000>
 
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb fix 172.16.60.202:7000 
.......
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
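
Because redis-trib.rb refuses to reshard while any slot is still open, it can help to look for open slots before each reshard. Below is a minimal sketch that extracts the open slot numbers from redis-trib.rb check output read on standard input. The function name open_slots is illustrative, and the parsing assumes the warning line has the "[WARNING] The following slots are open: ..." form seen in the output above.

```shell
# Sketch: print one open slot number per line, given "check" output on stdin.
# open_slots is a hypothetical helper; it prints nothing when no slot is open.
open_slots() {
    sed -n 's/^\[WARNING\] The following slots are open: *//p' |
        tr ', ' '\n\n' |
        grep -E '^[0-9]+$' || true
}
```

For example, piping redis-trib.rb check 172.16.60.202:7000 into open_slots would have printed 4401 in the situation above; each slot it prints is a candidate for "cluster setslot <slot> stable" followed by redis-trib.rb fix.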
 
 
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb reshard 172.16.60.202:7000
......
How many slots do you want to move (from 1 to 16384)? 1060
What is the receiving node ID? a0169becd97ccca732d905fd762b4d615674f7bd
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:971d05cd7b9bb3634ad024e6aac3dff158c52eee
Source node #2:done
.......
Do you want to proceed with the proposed reshard plan (yes/no)? yes
 
Then run check again to inspect the cluster state.
All 5461 slots of 172.16.60.207:7000 have now moved to 172.16.60.202:7000.
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb check 172.16.60.202:7000
>>> Performing Cluster Check (using node 172.16.60.202:7000)
M: a0169becd97ccca732d905fd762b4d615674f7bd 172.16.60.202:7000
    slots:0-5460 (5461 slots) master
    2 additional replica(s)
........
M: 971d05cd7b9bb3634ad024e6aac3dff158c52eee 172.16.60.207:7000
    slots: (0 slots) master
    0 additional replica(s)
 
b) Move all slots (5462) of 172.16.60.208:7002 to 172.16.60.204:7002
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb reshard 172.16.60.204:7002
.......
How many slots do you want to move (from 1 to 16384)? 5462
What is the receiving node ID? c6a78cfbb77804c4837963b5f589064b6111457a
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:0060012d749167d3f72833d916e53b3445b66c62
Source node #2:done
.......
Do you want to proceed with the proposed reshard plan (yes/no)? yes
 
c) Move all slots (5461) of 172.16.60.209:7004 to 172.16.60.205:7004
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb reshard 172.16.60.205:7004
.........
How many slots do you want to move (from 1 to 16384)? 5461
What is the receiving node ID? 48cbab906141dd26241ccdbc38bee406586a8d03
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:e936d5b4c95b6cae57f994e95805aef87ea4a7a5
Source node #2:done
.........
Do you want to proceed with the proposed reshard plan (yes/no)? yes
 
Once all three new masters have been given their hash slots, check the cluster state again.
The three pre-migration masters now hold 0 slots each; their slots have moved to the three masters on the new machines.
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb check 172.16.60.202:7000
>>> Performing Cluster Check (using node 172.16.60.202:7000)
M: a0169becd97ccca732d905fd762b4d615674f7bd 172.16.60.202:7000
    slots:0-5460 (5461 slots) master
    2 additional replica(s)
S: d9671ca6b4235931a2a215cc327a400ad4f9a399 172.16.60.205:7005
    slots: (0 slots) slave
    replicates c6a78cfbb77804c4837963b5f589064b6111457a
M: e936d5b4c95b6cae57f994e95805aef87ea4a7a5 172.16.60.209:7004
    slots: (0 slots) master
    0 additional replica(s)
S: 213bde6296c36b5f31b958c7730ff1629125a204 172.16.60.207:7001
    slots: (0 slots) slave
    replicates 48cbab906141dd26241ccdbc38bee406586a8d03
M: 0060012d749167d3f72833d916e53b3445b66c62 172.16.60.208:7002
    slots: (0 slots) master
    0 additional replica(s)
S: 52b8d27838244657d9b01a233578f24d287979fe 172.16.60.208:7003
    slots: (0 slots) slave
    replicates a0169becd97ccca732d905fd762b4d615674f7bd
M: 48cbab906141dd26241ccdbc38bee406586a8d03 172.16.60.205:7004
    slots:10923-16383 (5461 slots) master
    2 additional replica(s)
S: e7592314869c29375599d781721ad76675645c4c 172.16.60.209:7005
    slots: (0 slots) slave
    replicates c6a78cfbb77804c4837963b5f589064b6111457a
S: 2950f2cb6d960cd48e792f7c82d62d2cd07d20f9 172.16.60.204:7003
    slots: (0 slots) slave
    replicates a0169becd97ccca732d905fd762b4d615674f7bd
M: 971d05cd7b9bb3634ad024e6aac3dff158c52eee 172.16.60.207:7000
    slots: (0 slots) master
    0 additional replica(s)
M: c6a78cfbb77804c4837963b5f589064b6111457a 172.16.60.204:7002
    slots:5461-10922 (5462 slots) master
    2 additional replica(s)
S: 6e663a1bcc3d241ed4d1a9667a0cc92fbe554740 172.16.60.202:7001
    slots: (0 slots) slave
    replicates 48cbab906141dd26241ccdbc38bee406586a8d03
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
 
Check the cluster's slot allocation
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb info 172.16.60.202:7000
172.16.60.202:7000 (a0169bec...) -> 2 keys | 5461 slots | 2 slaves.
172.16.60.209:7004 (47cde5c7...) -> 0 keys | 0 slots | 0 slaves.
172.16.60.208:7002 (656fc84a...) -> 0 keys | 0 slots | 0 slaves.
172.16.60.205:7004 (48cbab90...) -> 3 keys | 5461 slots | 2 slaves.
172.16.60.207:7000 (a8fe2d6e...) -> 0 keys | 0 slots | 0 slaves.
172.16.60.204:7002 (c6a78cfb...) -> 1 keys | 5462 slots | 2 slaves.
[OK] 6 keys in 6 masters.
0.00 keys per slot on average.
 
Checking the data shows that the test entries have also moved to the new master nodes
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-cli -h 172.16.60.202 -c -p 7000
172.16.60.202:7000> get test1
"test-207"
172.16.60.202:7000> get test2
-> Redirected to slot [8899] located at 172.16.60.204:7002
"test-208"
172.16.60.204:7002> get test3
-> Redirected to slot [13026] located at 172.16.60.205:7004
"test-209"
172.16.60.205:7004> get test11
"test-207-207"
172.16.60.205:7004> get test22
-> Redirected to slot [4401] located at 172.16.60.202:7000
"test-208-208"
172.16.60.202:7000> get test33
-> Redirected to slot [12833] located at 172.16.60.205:7004
"test-209-209"
172.16.60.205:7004>

Besides the interactive procedure above, hash slots can also be resharded directly with the following command:

# redis-trib.rb reshard --from <node-id> --to <node-id> --slots <number of slots> --yes <host>:<port>
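
With that non-interactive form, the three reshard steps of this migration could be scripted. The sketch below only prints the commands (a dry run) so that they can be reviewed before anything is executed. The function name print_reshard_plan and the variables TRIB and ENTRY are illustrative; the node IDs, slot counts and paths are the ones that appear in the transcripts above.

```shell
# Sketch: print one non-interactive reshard command per old/new master pair.
# print_reshard_plan is a hypothetical helper; it executes nothing itself.
TRIB=/data/redis-4.0.6/src/redis-trib.rb
ENTRY=172.16.60.202:7000   # any reachable node of the cluster

print_reshard_plan() {
    # columns: source (old master) ID, target (new master) ID, slot count
    while read -r from to slots; do
        echo "$TRIB reshard --from $from --to $to --slots $slots --yes $ENTRY"
    done <<EOF
971d05cd7b9bb3634ad024e6aac3dff158c52eee a0169becd97ccca732d905fd762b4d615674f7bd 5461
0060012d749167d3f72833d916e53b3445b66c62 c6a78cfbb77804c4837963b5f589064b6111457a 5462
e936d5b4c95b6cae57f994e95805aef87ea4a7a5 48cbab906141dd26241ccdbc38bee406586a8d03 5461
EOF
}
```

Printing first and executing later is a deliberate choice here: a reshard moves live data, so it seems safer to read the exact commands, confirm the cluster check is clean, and only then run them one at a time.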

4. After the migration, remove the original nodes from the cluster

a) Remove the pre-migration slave nodes from the cluster
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb del-node 172.16.60.207:7001 213bde6296c36b5f31b958c7730ff1629125a204
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb del-node 172.16.60.208:7003 52b8d27838244657d9b01a233578f24d287979fe
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb del-node 172.16.60.209:7005 e7592314869c29375599d781721ad76675645c4c
   
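b) Remove the pre-migration master nodes from the cluster.
When deleting a master node, note the following:
-  If the master has slaves, move them to another master or delete them beforehand.
-  If the master holds slots, remove its slot assignment first, then delete the master.
  
Before a master is deleted, its slot count must be 0, i.e. it must be empty! Otherwise the whole Redis Cluster may stop working!
If the master to be removed is not empty, first use resharding to move its data to other nodes.
Another way to remove a master is to perform a manual failover, wait until its slave has been elected as the new master and the old master has rejoined the cluster as a slave, and then remove it.
Obviously this does not help if the goal is to reduce the number of masters in the cluster. In that case you still need to reshard the data away and then remove the node.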
  
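The rule above (a master must hold 0 slots before it is deleted) can be checked mechanically before running del-node. Below is a minimal sketch that reads redis-trib.rb check output on standard input and prints the slot count reported for one node. The function name node_slot_count is illustrative, and the parsing assumes the "M: <id> <host:port>" line followed by a "slots:... (N slots) ..." line, as in the check output shown in this article.

```shell
# Sketch: print the slot count that "check" output (on stdin) reports for a node.
# node_slot_count is a hypothetical helper; $1 is the node's host:port.
node_slot_count() {
    awk -v addr="$1" '
        # A node header line: remember whether it is the node we want
        ($1 == "M:" || $1 == "S:") { found = ($3 == addr) }
        # The slots line that follows the matching header
        found && /slots:/ {
            if (match($0, /\([0-9]+ slots\)/)) {
                # Skip the "(" and let awk keep the leading number
                print substr($0, RSTART + 1, RLENGTH - 1) + 0
            }
            exit
        }
    '
}
```

A guard could then capture the value, for example count=$(redis-trib.rb check 172.16.60.202:7000 | node_slot_count 172.16.60.207:7000), and run del-node only when the result is exactly 0.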
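Since all slots have already been drained from the three original masters (each now holds 0 slots) and their slaves were deleted above,
the three original masters can now be deleted from the cluster directly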
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb del-node 172.16.60.207:7000 971d05cd7b9bb3634ad024e6aac3dff158c52eee
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb del-node 172.16.60.208:7002 0060012d749167d3f72833d916e53b3445b66c62
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb del-node 172.16.60.209:7004 e936d5b4c95b6cae57f994e95805aef87ea4a7a5
   
Finally, check the state of the new Redis Cluster once more
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb check 172.16.60.202:7000                                         
>>> Performing Cluster Check (using node 172.16.60.202:7000)
M: a0169becd97ccca732d905fd762b4d615674f7bd 172.16.60.202:7000
    slots:0-5460 (5461 slots) master
    1 additional replica(s)
S: d9671ca6b4235931a2a215cc327a400ad4f9a399 172.16.60.205:7005
    slots: (0 slots) slave
    replicates c6a78cfbb77804c4837963b5f589064b6111457a
M: 48cbab906141dd26241ccdbc38bee406586a8d03 172.16.60.205:7004
    slots:10923-16383 (5461 slots) master
    1 additional replica(s)
S: 2950f2cb6d960cd48e792f7c82d62d2cd07d20f9 172.16.60.204:7003
    slots: (0 slots) slave
    replicates a0169becd97ccca732d905fd762b4d615674f7bd
M: c6a78cfbb77804c4837963b5f589064b6111457a 172.16.60.204:7002
    slots:5461-10922 (5462 slots) master
    1 additional replica(s)
S: 6e663a1bcc3d241ed4d1a9667a0cc92fbe554740 172.16.60.202:7001
    slots: (0 slots) slave
    replicates 48cbab906141dd26241ccdbc38bee406586a8d03
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
   
   
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb info 172.16.60.202:7000                                        
172.16.60.202:7000 (a0169bec...) -> 2 keys | 5461 slots | 1 slaves.
172.16.60.205:7004 (48cbab90...) -> 3 keys | 5461 slots | 1 slaves.
172.16.60.204:7002 (c6a78cfb...) -> 1 keys | 5462 slots | 1 slaves.
[OK] 6 keys in 3 masters.
0.00 keys per slot on average.
   
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-cli -h 172.16.60.202 -c -p 7000
172.16.60.202:7000> get test1
"test-207"
172.16.60.202:7000> get test11
-> Redirected to slot [13313] located at 172.16.60.205:7004
"test-207-207"
172.16.60.205:7004> get test2
-> Redirected to slot [8899] located at 172.16.60.204:7002
"test-208"
172.16.60.204:7002> get test22
-> Redirected to slot [4401] located at 172.16.60.202:7000
"test-208-208"
172.16.60.202:7000> get test3
-> Redirected to slot [13026] located at 172.16.60.205:7004
"test-209"
172.16.60.205:7004> get test33
"test-209-209"
172.16.60.205:7004>
   
=====================================================
Tip:
If the master being deleted still holds slots, drain them first, i.e. remove its slot assignment!
   
Suppose master 172.16.60.207:7000 still holds 2550 slots; these 2550 slots must then be moved from 172.16.60.207:7000 to 172.16.60.202:7000
   
[root@redis-new01 redis-cluster] # /data/redis-4.0.6/src/redis-trib.rb reshard 172.16.60.207:7000
.......
How many slots do you want to move (from 1 to 16384)? 2550                // total number of slots on the master being deleted
What is the receiving node ID? a0169becd97ccca732d905fd762b4d615674f7bd        // ID of the master that receives the 2550 slots, i.e. 172.16.60.202:7000
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:971d05cd7b9bb3634ad024e6aac3dff158c52eee        // ID of the master being deleted, i.e. 172.16.60.207:7000
Source node #2:done                                            // enter done
.......
Do you want to proceed with the proposed reshard plan (yes/no)? yes            // confirm
   
Once the master's slot assignment has been removed as above (0 slots), the node can be deleted!
   
Tips:
1) After adding a master node, a reshard is also needed, targeting the new node: "redis-trib.rb reshard <new node>". This assigns slots!
2) Before deleting a master that still holds slots, a reshard is also needed, targeting the node to be deleted: "redis-trib.rb reshard <node to delete>". This removes slots!

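In testing, the application was not affected at all during the Redis Cluster migration described above. Note, however, that after the migration the application's Redis connection settings must be updated to the new Redis addresses and ports.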

Reposted from: https://www.cnblogs.com/kevingrace/p/9844310.html

