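This is an original article. Reposting in any form is welcome, but please credit the source: 冷冷 https://lltx.github.io.
This is the third post in the Spring Boot v2.3 series. It covers the failover improvements for Spring Data Redis in v2.3.
Background
In production we usually deploy Redis as a highly available Redis Cluster, which gives us both data sharding and automatic node failover. The basic deployment topology is as follows:
Creating a test cluster
- With the pig4cloud/redis-cluster:4.0 image I packaged, you can spin up a 6-node Redis Cluster test environment.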
docker run --name redis-cluster -d -e CLUSTER_ANNOUNCE_IP=<host-IP> \
  -p 7000-7005:7000-7005 -p 17000-17005:17000-17005 pig4cloud/redis-cluster:4.0
- View the cluster node information
⋊> ./redis-cli -h 172.17.0.111 -p 7000 -c
172.17.0.111:7000> cluster nodes
3d882206d40935beef84ff564b538d57369e4fd9 172.17.0.111:7003@17003 slave b8d24150df4a221c1045cd9a0696bd1972912d52 0 1591344590000 4 connected
b8d24150df4a221c1045cd9a0696bd1972912d52 172.17.0.111:7001@17001 master - 0 1591344590513 2 connected 5461-10922
c21167a6da7f8af31d2dd612d449cdf92ad2e7e9 172.17.0.111:7005@17005 slave 810baa140db6e008a137708f09d4335f5207ede3 0 1591344591000 6 connected
810baa140db6e008a137708f09d4335f5207ede3 172.17.0.111:7000@17000 myself,master - 0 1591344590000 1 connected 0-5460
05d2f9884d350a50ac9e38f575b57f19e864e74c 172.17.0.111:7004@17004 slave b3cf24a918d96a1949f49a1d7b3a965ff9dc858c 0 1591344590011 5 connected
b3cf24a918d96a1949f49a1d7b3a965ff9dc858c 172.17.0.111:7002@17002 master - 0 1591344591617 3 connected 10923-16383
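To read this output programmatically (for example, to check which nodes are masters before and after a failover), here is a minimal sketch of a parser. ClusterNodesParser and Master are hypothetical names of my own, not part of redis-cli or Lettuce, and the sketch simply treats everything after the eighth field as a slot entry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of redis-cli or Lettuce.
class ClusterNodesParser {

    // A master's "host:port" address plus the slot ranges it serves.
    static final class Master {
        final String address;
        final List<String> slotRanges;

        Master(String address, List<String> slotRanges) {
            this.address = address;
            this.slotRanges = slotRanges;
        }
    }

    // Returns the masters found in the text printed by CLUSTER NODES.
    static List<Master> masters(String clusterNodesOutput) {
        List<Master> result = new ArrayList<>();
        for (String line : clusterNodesOutput.split("\n")) {
            // Fields: id, ip:port@cport, flags, master-id, ping-sent,
            // pong-recv, config-epoch, link-state, then the slot ranges.
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 8) {
                continue; // not a node line
            }
            boolean isMaster = false;
            for (String flag : fields[2].split(",")) {
                if (flag.equals("master")) {
                    isMaster = true;
                }
            }
            if (!isMaster) {
                continue;
            }
            String address = fields[1].split("@")[0];
            List<String> slotRanges = new ArrayList<>();
            for (int i = 8; i < fields.length; i++) {
                slotRanges.add(fields[i]);
            }
            result.add(new Master(address, slotRanges));
        }
        return result;
    }

    public static void main(String[] args) {
        // Three lines copied from the output above.
        String output = String.join("\n",
            "3d882206d40935beef84ff564b538d57369e4fd9 172.17.0.111:7003@17003 slave b8d24150df4a221c1045cd9a0696bd1972912d52 0 1591344590000 4 connected",
            "b8d24150df4a221c1045cd9a0696bd1972912d52 172.17.0.111:7001@17001 master - 0 1591344590513 2 connected 5461-10922",
            "810baa140db6e008a137708f09d4335f5207ede3 172.17.0.111:7000@17000 myself,master - 0 1591344590000 1 connected 0-5460");
        for (Master m : masters(output)) {
            System.out.println(m.address + " serves " + m.slotRanges);
        }
    }
}
```

Running the same parser on the output captured after a failover is a quick way to see which replica was promoted.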
Connecting the application to the cluster
- This demo uses Spring Boot 2.2, where the default Redis client is Lettuce
spring:
  redis:
    cluster:
      nodes:
        - 172.17.0.111:7000
        - 172.17.0.111:7001
        - 172.17.0.111:7002
        - 172.17.0.111:7003
        - 172.17.0.111:7004
        - 172.17.0.111:7005
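- A simple redisTemplate call against the cluster
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class DemoController {
    // The RedisTemplate<Object, Object> bean auto-configured by Spring Boot
    @Autowired
    private RedisTemplate<Object, Object> redisTemplate;

    @GetMapping("/add")
    public String redis() {
        redisTemplate.opsForValue().set("k1", "v1");
        return "ok";
    }
}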
- Call the endpoint and check the logs
⋊> curl http://localhost:8080/add
ok⏎
We can see that the write for k1 is executed on node 7000
[channel=0x5ff7aa8f, /172.17.0.156:50783 -> /172.17.0.111:7000, epid=0x8] write() writeAndFlush command ClusterCommand [command=AsyncCommand [type=SET, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.Command], redirections=0, maxRedirections=5]
[channel=0x5ff7aa8f, /172.17.0.156:50783 -> /172.17.0.111:7000, epid=0x8] write() done
Simulating a single-node failure
- Shut down node 7000
./redis-cli -h 172.17.0.111 -p 7000 -c
172.17.0.111:7000> SHUTDOWN
- Check the Redis Cluster logs with docker logs -f redis-cluster
We can see that the cluster has finished the election and completed the failover
23:S 05 Jun 08:24:49.387 # Starting a failover election for epoch 7.
29:M 05 Jun 08:24:49.388 # Failover auth granted to c21167a6da7f8af31d2dd612d449cdf92ad2e7e9 for epoch 7
26:M 05 Jun 08:24:49.388 # Failover auth granted to c21167a6da7f8af31d2dd612d449cdf92ad2e7e9 for epoch 7
23:S 05 Jun 08:24:49.389 # Failover election won: I'm the new master.
23:S 05 Jun 08:24:49.389 # configEpoch set to 7 after successful failover
23:M 05 Jun 08:24:49.389 # Setting secondary replication ID to 5253748ecf5bd7ab3536058fba8cad62d2d5e825, valid up to offset: 1622. New replication ID is 21d6a0b199a1ba655c0279d9c78f9682477ac9a3
23:M 05 Jun 08:24:49.389 * Discarding previously cached master state.
23:M 05 Jun 08:24:49.390 # Cluster state changed: ok
- Cluster state at this point: 7005 has been promoted from slave to master
Application-layer logs
- A flood of connection errors for node 7000
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /172.17.0.111:7000
Caused by: java.net.ConnectException: Connection refused
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /172.17.0.111:7000
Caused by: java.net.ConnectException: Connection refused
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /172.17.0.111:7000
Caused by: java.net.ConnectException: Connection refused
- Calling redisTemplate again hangs while waiting for a result
⋊> curl http://localhost:8080/add
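- Root cause analysis
The request still operates on k1, whose slot maps to node 7000. That node is no longer reachable, so the client keeps retrying the connection indefinitely. The Lettuce client did not refresh its view of the Redis Cluster state, so it never removed the dead node and never completed the failover on its side.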
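To make the slot-to-node mapping concrete, here is a minimal sketch of how a key is routed, assuming the key distribution described in the Redis Cluster specification (CRC16 of the key modulo 16384, with hash tag support). The class and method names are my own, and the routing table is copied by hand from the cluster nodes output earlier in this post. Keep in mind that the default RedisTemplate serializes keys with JDK serialization, so the bytes hashed for the request above are not the plain string k1.

```java
import java.nio.charset.StandardCharsets;

// Sketch of Redis Cluster key-to-slot routing; all names here are made up.
class SlotRouting {

    static final int SLOT_COUNT = 16384;

    // CRC16, XMODEM variant: polynomial 0x1021, initial value 0, no reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                if ((crc & 0x8000) != 0) {
                    crc = (crc << 1) ^ 0x1021;
                } else {
                    crc = crc << 1;
                }
                crc &= 0xFFFF; // keep the value within 16 bits
            }
        }
        return crc;
    }

    // Hash only the text between the first '{' and the next '}' when that
    // text is non-empty (hash tag); otherwise hash the whole key.
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % SLOT_COUNT;
    }

    // Static routing table taken from the test cluster's initial state.
    static String ownerOf(int slot) {
        if (slot <= 5460) {
            return "172.17.0.111:7000";
        }
        if (slot <= 10922) {
            return "172.17.0.111:7001";
        }
        return "172.17.0.111:7002";
    }

    public static void main(String[] args) {
        int s = slot("k1");
        System.out.println("k1 -> slot " + s + " -> " + ownerOf(s));
    }
}
```

A client that never updates a table like ownerOf keeps sending commands for slots 0-5460 to 7000 even after 7005 has taken them over, which is the behaviour seen above.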
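Dynamic cluster topology awareness
Dynamic topology awareness means the client follows changes in the Redis Cluster and updates its own view of the nodes accordingly, which completes the failover on the client side.
In Spring Boot 2.3.0 we only need to turn this feature on.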
spring:
  redis:
    lettuce:
      cluster:
        refresh:
          adaptive: true
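- Lettuce itself has supported this for a long time, but Spring Data Redis had not followed up. For details, see the section anchored at user-content-refreshing-the-cluster-topology-view in the Lettuce Redis Cluster documentation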
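The idea behind adaptive refresh can be illustrated with a toy model. This is only a sketch under heavy simplification: FakeCluster and FakeClient are invented for illustration, there is no networking, and no Lettuce classes are involved. In real Lettuce the refresh is driven by triggers such as MOVED redirects or persistent reconnects rather than by the simple check used here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: no real networking and no Lettuce classes involved.
class TopologyRefreshDemo {

    // Stand-in for the cluster: current owner per slot range, plus dead nodes.
    static final class FakeCluster {
        final Map<String, String> ownerByRange = new HashMap<>();
        final Set<String> down = new HashSet<>();

        // The old master dies and its replica takes over the slot range.
        void failover(String range, String deadMaster, String promotedReplica) {
            down.add(deadMaster);
            ownerByRange.put(range, promotedReplica);
        }
    }

    // Stand-in for a client that caches the topology it saw at startup.
    static final class FakeClient {
        final FakeCluster cluster;
        final boolean adaptiveRefresh;
        Map<String, String> cachedView;

        FakeClient(FakeCluster cluster, boolean adaptiveRefresh) {
            this.cluster = cluster;
            this.adaptiveRefresh = adaptiveRefresh;
            this.cachedView = new HashMap<>(cluster.ownerByRange);
        }

        // Returns the node that served the write, or null if it was unreachable.
        String write(String range) {
            String target = cachedView.get(range);
            if (cluster.down.contains(target) && adaptiveRefresh) {
                // Reload the view from the cluster and route again.
                cachedView = new HashMap<>(cluster.ownerByRange);
                target = cachedView.get(range);
            }
            return cluster.down.contains(target) ? null : target;
        }
    }

    public static void main(String[] args) {
        FakeCluster cluster = new FakeCluster();
        cluster.ownerByRange.put("0-5460", "7000");

        FakeClient stale = new FakeClient(cluster, false);
        FakeClient adaptive = new FakeClient(cluster, true);

        cluster.failover("0-5460", "7000", "7005");

        System.out.println("stale client:    " + stale.write("0-5460"));
        System.out.println("adaptive client: " + adaptive.write("0-5460"));
    }
}
```

The client without refresh keeps targeting 7000 and gets nothing back, while the one with refresh reloads its view and reaches 7005.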
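Compatibility with older versions
We only need to look at what the adaptive switch does once it is enabled, then configure the same topology refresh in our own project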
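import java.time.Duration;
import io.lettuce.core.ReadFrom;
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import org.springframework.boot.autoconfigure.data.redis.RedisProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

// Illustrative wrapper class; the bean can live in any @Configuration class.
@Configuration
public class RedisClusterConfig {
    @Bean
    public LettuceConnectionFactory redisConnectionFactory(RedisProperties redisProperties) {
        RedisClusterConfiguration redisClusterConfiguration = new RedisClusterConfiguration(redisProperties.getCluster().getNodes());
        // Refresh the topology every 5 seconds and on all adaptive triggers, see
        // https://github.com/lettuce-io/lettuce-core/wiki/Redis-Cluster#user-content-refreshing-the-cluster-topology-view
        ClusterTopologyRefreshOptions clusterTopologyRefreshOptions = ClusterTopologyRefreshOptions.builder()
                .enablePeriodicRefresh()
                .enableAllAdaptiveRefreshTriggers()
                .refreshPeriod(Duration.ofSeconds(5))
                .build();
        ClusterClientOptions clusterClientOptions = ClusterClientOptions.builder()
                .topologyRefreshOptions(clusterTopologyRefreshOptions).build();
        // https://github.com/lettuce-io/lettuce-core/wiki/ReadFrom-Settings
        LettuceClientConfiguration lettuceClientConfiguration = LettuceClientConfiguration.builder()
                .readFrom(ReadFrom.REPLICA_PREFERRED)
                .clientOptions(clusterClientOptions).build();
        return new LettuceConnectionFactory(redisClusterConfiguration, lettuceClientConfiguration);
    }
}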