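Conclusion:
In this situation the replica (that is, the slave node) cannot be promoted to master. It keeps trying to connect to the master until it succeeds. After the master recovers, the replica remains a replica and does not become a master.
The replica cannot be promoted because it never starts an election to become master.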
Replica log:
14304:S 26 Mar 2019 15:42:01.158 * Connecting to MASTER 10.49.126.98:4076
14304:S 26 Mar 2019 15:42:01.158 * MASTER <-> REPLICA sync started
14304:S 26 Mar 2019 15:42:01.158 # Error condition on socket for SYNC: Connection refused
14304:S 26 Mar 2019 15:42:02.161 * Connecting to MASTER 10.49.126.98:4076
14304:S 26 Mar 2019 15:42:02.161 * MASTER <-> REPLICA sync started
14304:S 26 Mar 2019 15:42:02.161 # Error condition on socket for SYNC: Connection refused
14304:S 26 Mar 2019 15:42:03.167 * Connecting to MASTER 10.49.126.98:4076
14304:S 26 Mar 2019 15:42:03.167 * MASTER <-> REPLICA sync started
14304:S 26 Mar 2019 15:42:03.167 * Non blocking connect for SYNC fired the event.
[The master is loading its dataset from disk into memory (-LOADING)]
14304:S 26 Mar 2019 15:42:03.173 # Error reply to PING from master: '-LOADING Redis is loading the dataset in memory'
14304:S 26 Mar 2019 15:42:03.770 * Clear FAIL state for node c67dc9e02e25f2e6321df8ac2eb4d99789917783: is reachable again and nobody is serving its slots after some time.
[The cluster state returns to normal (it had changed to fail because one of the masters was down)]
14304:S 26 Mar 2019 15:42:03.770 # Cluster state changed: ok
14304:S 26 Mar 2019 15:42:04.169 * Connecting to MASTER 10.49.126.98:4076
14304:S 26 Mar 2019 15:42:04.169 * MASTER <-> REPLICA sync started
14304:S 26 Mar 2019 15:42:04.169 * Non blocking connect for SYNC fired the event.
14304:S 26 Mar 2019 15:42:04.169 * Master replied to PING, replication can continue...
14304:S 26 Mar 2019 15:42:04.169 * Trying a partial resynchronization (request 725b1fbcfc073eec81837cb0f1fd786c995f4d46:1).
[The replica performs a full resync of the master's data]
14304:S 26 Mar 2019 15:42:04.174 * Full resync from master: 68ef812d5b3dc70adca8c6ed0f306249725df91f:0
[Because this is a full resync, the previously cached state is no longer useful (Discarding)]
14304:S 26 Mar 2019 15:42:04.174 * Discarding previously cached master state.
14304:S 26 Mar 2019 15:42:04.275 * MASTER <-> REPLICA sync: receiving 106404 bytes from master
14304:S 26 Mar 2019 15:42:04.275 * MASTER <-> REPLICA sync: Flushing old data
14304:S 26 Mar 2019 15:42:04.275 * MASTER <-> REPLICA sync: Loading DB in memory
14304:S 26 Mar 2019 15:42:04.292 * MASTER <-> REPLICA sync: Finished with success
[The replica starts rewriting its AOF file]
14304:S 26 Mar 2019 15:42:04.293 * Background append only file rewriting started by pid 21172
14304:S 26 Mar 2019 15:42:04.325 * AOF rewrite child asks to stop sending diffs.
21172:C 26 Mar 2019 15:42:04.326 * Parent agreed to stop sending diffs. Finalizing AOF...
21172:C 26 Mar 2019 15:42:04.326 * Concatenating 0.00 MB of AOF diff received from parent.
21172:C 26 Mar 2019 15:42:04.326 * SYNC append only file rewrite performed
21172:C 26 Mar 2019 15:42:04.326 * AOF rewrite: 0 MB of memory used by copy-on-write
14304:S 26 Mar 2019 15:42:04.370 * Background AOF rewrite terminated with success
14304:S 26 Mar 2019 15:42:04.370 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
14304:S 26 Mar 2019 15:42:04.370 * Background AOF rewrite finished successfully
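To work with logs like the one above programmatically, each line can be split into its fields. The following is a minimal Python sketch, not part of the original analysis. The names LOG_LINE, LogEntry and parse_line are made up for illustration, and the sketch assumes the standard Redis log layout of pid:role, timestamp, one-character level, message.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed layout: "pid:role DD Mon YYYY HH:MM:SS.mmm level message".
LOG_LINE = re.compile(
    r"^(?P<pid>\d+):(?P<role>[XCSM]) "
    r"(?P<ts>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[.\-*#]) (?P<msg>.*)$"
)

# Role and level characters as used in Redis log output.
ROLES = {"X": "sentinel", "C": "child", "S": "replica", "M": "master"}
LEVELS = {".": "debug", "-": "verbose", "*": "notice", "#": "warning"}


class LogEntry(NamedTuple):
    pid: int
    role: str
    time: datetime
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None for annotations and blank lines."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return LogEntry(
        pid=int(m["pid"]),
        role=ROLES[m["role"]],
        # %b parses English month abbreviations under the default C locale.
        time=datetime.strptime(m["ts"], "%d %b %Y %H:%M:%S.%f"),
        level=LEVELS[m["level"]],
        message=m["msg"],
    )


entry = parse_line(
    "14304:S 26 Mar 2019 15:42:01.158 # Error condition on socket for SYNC: Connection refused"
)
print(entry.role, entry.level, entry.message)
```

Filtering on level == "warning" is a quick way to pull out the "#" lines, which carry the connection errors and cluster state changes in the log above.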
Until the master recovers, the replica cannot serve reads or writes, even after READONLY has been issued:
127.0.0.1:4071> get k4156
(error) CLUSTERDOWN The cluster is down
127.0.0.1:4071> readonly
OK
127.0.0.1:4071> get k4156
(error) CLUSTERDOWN The cluster is down
SCAN, however, still shows the data:
127.0.0.1:4071> scan 0
1) "10752"
2)  1) "k5948"
    2) "k4156"
    3) "k12819"
    4) "k24497"
    5) "k5926"
    6) "k10947"
    7) "k7653"
    8) "k21631"
    9) "k6672"
   10) "k2687"
   11) "k29036"
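A client that hits the CLUSTERDOWN reply shown above may want to tell it apart from other errors. The first word of a Redis error reply is its error code, so a small helper can extract it. This is only a sketch: split_error, UNAVAILABLE and is_unavailable are hypothetical names, and treating CLUSTERDOWN and LOADING as "node cannot serve right now" is an assumption of this example, not a rule taken from Redis.

```python
from typing import Tuple


def split_error(reply: str) -> Tuple[str, str]:
    """Split a Redis error reply into (code, message).

    A leading '-' (the wire-protocol error marker, as seen in the
    '-LOADING ...' log line) is dropped before splitting.
    """
    text = reply.strip().lstrip("-")
    code, _, message = text.partition(" ")
    return code, message


# Assumption for this sketch: codes that mean the node cannot serve
# keyed commands at the moment.
UNAVAILABLE = {"CLUSTERDOWN", "LOADING"}


def is_unavailable(reply: str) -> bool:
    return split_error(reply)[0] in UNAVAILABLE


print(split_error("CLUSTERDOWN The cluster is down"))
print(is_unavailable("-LOADING Redis is loading the dataset in memory"))
```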
If the master can never be recovered, how can the cluster be restored?
127.0.0.1:4071> CLUSTER FAILOVER
(error) ERR Master is down or failed, please use CLUSTER FAILOVER FORCE
In other words, the only option in this situation is a forced failover with CLUSTER FAILOVER FORCE, at the risk of data loss and data inconsistency. The replica log then changes as follows:
1021:S 26 Mar 2019 16:02:04.994 * Connecting to MASTER 10.49.126.98:4076
1021:S 26 Mar 2019 16:02:04.994 * MASTER <-> REPLICA sync started
[Before the forced failover]
1021:S 26 Mar 2019 16:02:04.995 # Error condition on socket for SYNC: Connection refused
[Forced failover]
1021:S 26 Mar 2019 16:02:05.842 # Forced failover user request accepted.
[Preparing to start an election (started after a random delay)]
1021:S 26 Mar 2019 16:02:05.896 # Start of election delayed for 0 milliseconds (rank #0, offset 0).
1021:S 26 Mar 2019 16:02:05.996 * Connecting to MASTER 10.49.126.98:4076
1021:S 26 Mar 2019 16:02:05.996 * MASTER <-> REPLICA sync started
[The election formally starts (epoch 34)]
1021:S 26 Mar 2019 16:02:05.996 # Starting a failover election for epoch 34.
1021:S 26 Mar 2019 16:02:06.004 # Error condition on socket for SYNC: Connection refused
[The election is won, as expected]
1021:S 26 Mar 2019 16:02:06.020 # Failover election won: I'm the new master.
1021:S 26 Mar 2019 16:02:06.020 # configEpoch set to 34 after successful failover
1021:M 26 Mar 2019 16:02:06.020 # Setting secondary replication ID to 7b83d297fa53f119c79021661fff533eafabc222, valid up to offset: 1. New replication ID is c5011813ad8fda9ef68da648f2fdfc27eae2afd3
[The node is now a master itself, so the "cached master" is no longer needed]
1021:M 26 Mar 2019 16:02:06.020 * Discarding previously cached master state.
[The cluster state returns to normal again]
1021:M 26 Mar 2019 16:02:06.021 # Cluster state changed: ok
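The escalation from CLUSTER FAILOVER to CLUSTER FAILOVER FORCE described above could be scripted roughly as follows. This is a sketch under assumptions: execute is a hypothetical callable that sends one command to the replica and returns the reply text, and fake_replica is a stand-in that imitates the session shown earlier so the example runs without a server. Forcing is opt-in here because of the data-loss risk noted above.

```python
from typing import Callable

# Hint text taken from the error reply shown in the session above.
FORCE_HINT = "please use CLUSTER FAILOVER FORCE"


def failover(execute: Callable[[str], str], allow_force: bool = False) -> str:
    """Try a normal manual failover; escalate to FORCE only if allowed.

    `execute` is hypothetical: it sends one command to the replica and
    returns the reply text, error replies included.
    """
    reply = execute("CLUSTER FAILOVER")
    if FORCE_HINT in reply and allow_force:
        return execute("CLUSTER FAILOVER FORCE")
    return reply


def fake_replica(command: str) -> str:
    """Stand-in for a replica whose master is down (illustration only)."""
    if command == "CLUSTER FAILOVER":
        return "ERR Master is down or failed, please use CLUSTER FAILOVER FORCE"
    return "OK"


print(failover(fake_replica))                    # the ERR reply is returned as-is
print(failover(fake_replica, allow_force=True))  # escalates to FORCE
```

In real use, execute would wrap whatever client is at hand, and the caller would still confirm the promotion from the replica log or the cluster state afterwards.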
Log of another master in the cluster during the same period:
30651:M 26 Mar 2019 15:31:45.438 * Marking node c67dc9e02e25f2e6321df8ac2eb4d99789917783 as failing (quorum reached).
[The cluster state is marked as fail]
30651:M 26 Mar 2019 15:31:45.438 # Cluster state changed: fail
30651:M 26 Mar 2019 15:34:03.022 * Clear FAIL state for node f805e652ff8abe151393430cb3bcbf514b8a7399: replica is reachable again.
30651:M 26 Mar 2019 15:35:45.005 * 10 changes in 300 seconds. Saving...
30651:M 26 Mar 2019 15:35:45.006 * Background saving started by pid 28683
28683:C 26 Mar 2019 15:35:45.016 * DB saved on disk
28683:C 26 Mar 2019 15:35:45.018 * RDB: 0 MB of memory used by copy-on-write
30651:M 26 Mar 2019 15:35:45.106 * Background saving terminated with success
30651:M 26 Mar 2019 15:42:03.769 * Clear FAIL state for node c67dc9e02e25f2e6321df8ac2eb4d99789917783: is reachable again and nobody is serving its slots after some time.
[The cluster state returns to normal]
30651:M 26 Mar 2019 15:42:03.769 # Cluster state changed: ok
Log of another replica in the cluster during the same period:
31463:S 26 Mar 2019 15:31:45.438 * FAIL message received from 29fcce29837d3e5266b6178a15aecfa938ff241a about c67dc9e02e25f2e6321df8ac2eb4d99789917783
[The cluster state is marked as fail]
31463:S 26 Mar 2019 15:31:45.439 # Cluster state changed: fail
31463:S 26 Mar 2019 15:34:03.023 * Clear FAIL state for node f805e652ff8abe151393430cb3bcbf514b8a7399: replica is reachable again.
31463:S 26 Mar 2019 15:35:45.100 * 10 changes in 300 seconds. Saving...
31463:S 26 Mar 2019 15:35:45.101 * Background saving started by pid 28695
28695:C 26 Mar 2019 15:35:45.116 * DB saved on disk
28695:C 26 Mar 2019 15:35:45.118 * RDB: 0 MB of memory used by copy-on-write
31463:S 26 Mar 2019 15:35:45.201 * Background saving terminated with success
31463:S 26 Mar 2019 15:42:03.769 * Clear FAIL state for node c67dc9e02e25f2e6321df8ac2eb4d99789917783: is reachable again and nobody is serving its slots after some time.
[The cluster state returns to normal]
31463:S 26 Mar 2019 15:42:03.769 # Cluster state changed: ok
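The "Cluster state changed" lines in these logs are enough to estimate how long the cluster was unavailable. The sketch below pairs each fail transition with the next ok transition and sums the gaps. The function names are made up for this example, and the sample text is the pair of state-change lines from the master log above.

```python
import re
from datetime import datetime, timedelta
from typing import List, Tuple

# Timestamp followed by the state-change message, as printed in the logs.
STATE_CHANGE = re.compile(
    r"(\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) # Cluster state changed: (ok|fail)"
)


def state_changes(log_text: str) -> List[Tuple[datetime, str]]:
    """Return (time, state) for every state-change entry in the text."""
    return [
        (datetime.strptime(ts, "%d %b %Y %H:%M:%S.%f"), state)
        for ts, state in STATE_CHANGE.findall(log_text)
    ]


def outage(log_text: str) -> timedelta:
    """Sum the time spent between each 'fail' and the following 'ok'."""
    total = timedelta()
    down_since = None
    for when, state in state_changes(log_text):
        if state == "fail" and down_since is None:
            down_since = when
        elif state == "ok" and down_since is not None:
            total += when - down_since
            down_since = None
    return total


sample = (
    "30651:M 26 Mar 2019 15:31:45.438 # Cluster state changed: fail\n"
    "30651:M 26 Mar 2019 15:42:03.769 # Cluster state changed: ok\n"
)
print(outage(sample))
```

For the sample, the result is a little over ten minutes, the interval during which the master was down and its replica did not take over.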