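A production issue with Spring Cloud: when config-server cannot reach git, the microservice cluster gradually goes down, one service after another.
To track the problem down, logging was added at the entry points: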
org.springframework.cloud.config.server.environment.EnvironmentController.java
@RequestMapping("/{name}/{profiles}/{label:.*}")
public Environment labelled(@PathVariable String name, @PathVariable String profiles,
        @PathVariable String label) {
    if (name != null && name.contains("(_)")) {
        // "(_)" is uncommon in a git repo name, but "/" cannot be matched
        // by Spring MVC
        name = name.replace("(_)", "/");
    }
    if (label != null && label.contains("(_)")) {
        // "(_)" is uncommon in a git branch name, but "/" cannot be matched
        // by Spring MVC
        label = label.replace("(_)", "/");
    }
    // Added for diagnosis: time the repository lookup.
    StopWatch sw = new StopWatch("labelled");
    sw.start();
    logger.info("EnvironmentController.labelled() start, name={}, profiles={}, label={}",
            name, profiles, label);
    Environment environment = this.repository.findOne(name, profiles, label);
    sw.stop();
    logger.info("EnvironmentController.labelled() end, name={}, profiles={}, label={}, elapsed={} ms, elapsed={} s",
            name, profiles, label, sw.getTotalTimeMillis(), sw.getTotalTimeSeconds());
    return environment;
}
Logging was also added to ConfigServerHealthIndicator.java, the entry point of the health check:
@Override
protected void doHealthCheck(Health.Builder builder) throws Exception {
    // Added for diagnosis: time the whole health check.
    StopWatch sw = new StopWatch("doHealthCheck");
    sw.start();
    logger.info("ConfigServerHealthIndicator.doHealthCheck() start, builder={}", builder);
    builder.up();
    List<Map<String, Object>> details = new ArrayList<>();
    for (String name : this.repositories.keySet()) {
        Repository repository = this.repositories.get(name);
        String application = (repository.getName() == null) ? name : repository.getName();
        String profiles = repository.getProfiles();
        try {
            Environment environment = this.environmentRepository.findOne(application, profiles,
                    repository.getLabel());
            HashMap<String, Object> detail = new HashMap<>();
            detail.put("name", environment.getName());
            detail.put("label", environment.getLabel());
            if (environment.getProfiles() != null && environment.getProfiles().length > 0) {
                detail.put("profiles", Arrays.asList(environment.getProfiles()));
            }
            if (!CollectionUtils.isEmpty(environment.getPropertySources())) {
                List<String> sources = new ArrayList<>();
                for (PropertySource source : environment.getPropertySources()) {
                    sources.add(source.getName());
                }
                detail.put("sources", sources);
            }
            details.add(detail);
        } catch (Exception e) {
            HashMap<String, String> map = new HashMap<>();
            map.put("application", application);
            map.put("profiles", profiles);
            builder.withDetail("repository", map);
            builder.down(e);
            // Note: returning here skips sw.stop() and the closing log line below.
            return;
        }
    }
    builder.withDetail("repositories", details);
    sw.stop();
    logger.info("ConfigServerHealthIndicator.doHealthCheck() end, elapsed={} ms, elapsed={} s, builder={}",
            sw.getTotalTimeMillis(), sw.getTotalTimeSeconds(), builder);
}
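One detail of the instrumentation above: the catch branch in doHealthCheck returns before the closing log statement, so the failing case (the one that matters when git is unreachable) never reports its duration. A minimal sketch of a timing wrapper that logs in a finally block is shown here. TimedCall and timed are hypothetical names invented for this illustration, it uses only the JDK rather than Spring's StopWatch, and it prints to standard output instead of a logger.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of Spring Cloud Config.
class TimedCall {

    // Last measured duration, kept only so the example can be inspected.
    static long lastElapsedMillis = -1;

    // Runs the task and reports the elapsed time whether it returns or throws.
    static <T> T timed(String label, Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            lastElapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println(label + " finished, elapsed=" + lastElapsedMillis + " ms");
        }
    }

    public static void main(String[] args) {
        // Successful call: the result is passed through unchanged.
        String value = timed("ok", () -> "environment");
        System.out.println(value);

        // Failing call: the duration is still reported before the exception propagates.
        try {
            timed("failing", () -> {
                throw new IllegalStateException("git unreachable");
            });
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The same effect could be had in the original methods by moving sw.stop() and the closing log call into a finally block.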
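Analysis of the timing logs showed that EnvironmentController and ConfigServerHealthIndicator were being called far too often. Both ultimately call JGitEnvironmentRepository.fetch(), which sends a request to git and has a timeout of roughly 5 seconds.
With so many requests arriving, the server could not keep up, and threads stayed blocked for a long time.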
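Why a 5-second timeout hurts so much can be estimated with Little's law: the average number of requests in flight equals the arrival rate multiplied by the time each request takes. The sketch below is a back-of-the-envelope calculation only. The client count, the polling interval and the healthy latency are assumed numbers chosen for illustration, not measurements from this incident; only the roughly 5-second timeout comes from the analysis above.

```java
// Rough capacity estimate; ThreadDemandEstimate is a hypothetical name.
class ThreadDemandEstimate {

    // Little's law: requests in flight = arrival rate x time per request.
    static double busyThreads(int clients, double pollIntervalSeconds, double secondsPerRequest) {
        double requestsPerSecond = clients / pollIntervalSeconds;
        return requestsPerSecond * secondsPerRequest;
    }

    public static void main(String[] args) {
        int clients = 100;            // assumption: number of client instances
        double pollInterval = 20.0;   // assumption: seconds between calls per client
        double healthyLatency = 0.1;  // assumption: seconds per request while git is reachable
        double timeoutLatency = 5.0;  // approximate fetch timeout when git is unreachable

        System.out.println("git reachable:   "
                + busyThreads(clients, pollInterval, healthyLatency) + " threads busy on average");
        System.out.println("git unreachable: "
                + busyThreads(clients, pollInterval, timeoutLatency) + " threads busy on average");
    }
}
```

With these assumed numbers the average demand rises from about 0.5 busy threads to about 25, a fifty-fold increase caused by the timeout alone. More clients or shorter intervals scale the figure linearly.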
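Analysis:
1. The EnvironmentController calls are initiated by each microservice module. Why?
2. The ConfigServerHealthIndicator calls come from config-server's own health check; lengthening the check interval mitigates the problem.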
consul:
  host: 10.200.110.100
  port: 8500
  enabled: true
  discovery:
    enabled: true
    hostname: 10.200.110.100
    healthCheckInterval: 30s
    queryPassing: true
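The EnvironmentController requests turn out to be issued by the health check on the config-server client side. See the source:
After each client connects to the registry and obtains a config-server instance, it runs the code logic above to fetch the Environment data from the config server. After going live we hit a problem: the logs showed this logic being invoked continuously, once every 20-odd seconds, with the application name app. The official Spring Cloud Config documentation explains that Config Server uses a health indicator to check whether the configured EnvironmentRepository is working. By default it asks the EnvironmentRepository for the configuration of an application named app, and the EnvironmentRepository instance responds with the default configuration. In other words, while the health indicator is enabled (the default), findOne is called again and again to check whether the configuration is available and whether an exception occurs.
The following code, from the class org.springframework.cloud.config.server.config.ConfigServerHealthIndicator, is where the entry with the application name app is initialized: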
@ConfigurationProperties("spring.cloud.config.server.health")
public class ConfigServerHealthIndicator extends AbstractHealthIndicator {

    private EnvironmentRepository environmentRepository;

    private Map<String, Repository> repositories = new LinkedHashMap<>();

    public ConfigServerHealthIndicator(EnvironmentRepository environmentRepository) {
        this.environmentRepository = environmentRepository;
    }

    @PostConstruct
    public void init() {
        // With no repositories configured, fall back to a single entry named "app".
        if (this.repositories.isEmpty()) {
            this.repositories.put("app", new Repository());
        }
    }

    //...
}
To stop this check, set health.config.enabled=false on the client to turn the feature off.
See the source: org.springframework.cloud.config.client.ConfigClientAutoConfiguration.java
@Configuration
public class ConfigClientAutoConfiguration {

    @Configuration
    @ConditionalOnClass(HealthIndicator.class)
    @ConditionalOnBean(ConfigServicePropertySourceLocator.class)
    @ConditionalOnProperty(value = "health.config.enabled", matchIfMissing = true)
    protected static class ConfigServerHealthIndicatorConfiguration {

        @Bean
        public ConfigServerHealthIndicator configServerHealthIndicator(
                ConfigServicePropertySourceLocator locator,
                ConfigClientHealthProperties properties, Environment environment) {
            return new ConfigServerHealthIndicator(locator, environment, properties);
        }
    }

    //...
}
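The matchIfMissing = true attribute on @ConditionalOnProperty is why the client-side indicator is active unless it is switched off explicitly. The sketch below is a simplified plain-Java model of that rule for a condition with no havingValue. It is not Spring's implementation, and HealthToggleModel and indicatorEnabled are hypothetical names used only for this illustration.

```java
import java.util.Map;

// Simplified model of the property condition; not Spring code.
class HealthToggleModel {

    // A missing property matches (matchIfMissing = true); a present property
    // matches unless its value equals "false", ignoring case.
    static boolean indicatorEnabled(Map<String, String> properties) {
        String value = properties.get("health.config.enabled");
        if (value == null) {
            return true;
        }
        return !"false".equalsIgnoreCase(value);
    }

    public static void main(String[] args) {
        // No property set: the indicator bean is registered.
        System.out.println(indicatorEnabled(Map.of()));
        // Explicitly disabled: the indicator bean is skipped.
        System.out.println(indicatorEnabled(Map.of("health.config.enabled", "false")));
        // Explicitly enabled: the indicator bean is registered.
        System.out.println(indicatorEnabled(Map.of("health.config.enabled", "true")));
    }
}
```

In this model only an explicit false value removes the indicator, so leaving the property out of the client configuration keeps the periodic calls to config-server running.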