References:
http://ginobefunny.com/post/learning_distributed_systems_tracing/
http://www.cnblogs.com/zhengyun_ustc/p/55solution2.html
Dapper, a Large-Scale Distributed Systems Tracing Infrastructure (Chinese translation): http://bigbully.github.io/Dapper-translation/
http://blog.csdn.net/liaokailin/article/details/52077620
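Distributed tracing systems go by different names at different companies: Google has Dapper, Taobao has EagleEye, Twitter has Zipkin, JD.com has Hydra, eBay has Centralized Activity Logging (CAL), and Dianping has CAT.
When a request reaches a server, the application container runs instrumentation logic before any business processing. That logic assigns a globally unique call-chain ID (TraceId) and places it in a call-context object stored in a ThreadLocal. A second value, the RpcId, distinguishes the order and nesting level of the network calls that belong to the same call chain. When the application makes an RPC call, the instrumentation first reads the context from the current thread's ThreadLocal and derives the next RpcId from it; the RpcId can be a multi-level sequence number (for example 0, 0.1, 0.1.2). Before the response object is returned, the details of the call, together with the TraceId and RpcId, are written to the access log, and the call context is removed from the ThreadLocal.
In short: assign a TraceId and RpcId to every call, keep them in a ThreadLocal call context, and write them to the access log when the call ends.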
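The ThreadLocal flow described above can be sketched roughly as follows. This is only an illustration: the class `TraceContext` and its methods `begin`, `nextChildRpcId` and `end` are hypothetical names invented here, and the dotted RpcId format is one possible way to encode the multi-level sequence number.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-thread call context; not taken from any real tracing library. */
final class TraceContext {
    private static final ThreadLocal<TraceContext> CURRENT = new ThreadLocal<>();

    final String traceId; // shared by every call in the same call chain
    final String rpcId;   // position of this call in the chain, e.g. "0", "0.2", "0.2.1"
    private final AtomicInteger childSeq = new AtomicInteger(0);

    private TraceContext(String traceId, String rpcId) {
        this.traceId = traceId;
        this.rpcId = rpcId;
    }

    /** Runs before business logic. Reuses the upstream IDs when the caller sent them. */
    static TraceContext begin(String upstreamTraceId, String upstreamRpcId) {
        String traceId = upstreamTraceId != null
                ? upstreamTraceId
                : UUID.randomUUID().toString().replace("-", "");
        String rpcId = upstreamRpcId != null ? upstreamRpcId : "0";
        TraceContext ctx = new TraceContext(traceId, rpcId);
        CURRENT.set(ctx);
        return ctx;
    }

    static TraceContext current() {
        return CURRENT.get();
    }

    /** Runs before each outgoing RPC: the returned RpcId is sent downstream. */
    String nextChildRpcId() {
        return rpcId + "." + childSeq.incrementAndGet();
    }

    /** Runs before the response is returned: builds the log line and clears the ThreadLocal. */
    static String end() {
        TraceContext ctx = CURRENT.get();
        CURRENT.remove();
        return ctx == null ? "" : "traceId=" + ctx.traceId + " rpcId=" + ctx.rpcId;
    }
}
```

A container would call `TraceContext.begin(...)` on entry, `TraceContext.current().nextChildRpcId()` before each outgoing call, and write the string returned by `TraceContext.end()` to the access log. Clearing the ThreadLocal matters because pooled threads are reused across requests.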
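What Zipkin is for:
Tracing service calls, collecting statistics, and troubleshooting.
How Zipkin works:
It creates a set of tracing identifiers (traceId, spanId, parentId) from which the call tree of a request can eventually be reconstructed. When business systems call one another, they send the corresponding trace data to Zipkin. Zipkin aggregates, stores, and displays the collected data, and users can inspect network latency, call paths, and system dependencies through the web UI.
Transport: collects spans from the traced services, converts them into Zipkin's common Span format, and passes them to the storage layer.
Collector: validates each incoming piece of trace data (span), stores it, and indexes it (Cassandra / Elasticsearch / in-memory).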
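Zipkin core data structures:
Annotation (marks the start and end of a request; cs/sr/ss/cr carry extra information such as a timestamp):
cs: Client Send, the client issues the request; this is the start of a span
sr: Server Receive, the server receives the request
ss: Server Send, the server finishes processing and sends the result back to the client
cr: Client Receive, the client receives the server's response; this is the end of the span, and once this annotation is recorded the RPC is considered complete
Client call time = cr - cs
Server processing time = ss - sr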
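As a small worked example of this arithmetic, the sketch below computes both durations from the four annotation timestamps. The class `SpanTiming`, its method names, and the sample timestamps are made up for illustration; the server-side duration is taken as ss minus sr (send minus receive) so that it comes out positive.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative helper; timestamps are assumed to be microseconds on a common clock. */
final class SpanTiming {
    /** Total time seen by the client: cr - cs. */
    static long clientDuration(Map<String, Long> annotations) {
        return annotations.get("cr") - annotations.get("cs");
    }

    /** Time spent inside the server: ss - sr. */
    static long serverDuration(Map<String, Long> annotations) {
        return annotations.get("ss") - annotations.get("sr");
    }

    /** Remainder of the client time: network transfer and queueing in both directions. */
    static long networkOverhead(Map<String, Long> annotations) {
        return clientDuration(annotations) - serverDuration(annotations);
    }

    /** Sample values chosen only for the example. */
    static Map<String, Long> sample() {
        Map<String, Long> a = new HashMap<>();
        a.put("cs", 1_000L);
        a.put("sr", 1_200L);
        a.put("ss", 1_700L);
        a.put("cr", 2_000L);
        return a;
    }
}
```

With the sample values, the client sees 1000 microseconds, the server accounts for 500 of them, and the remaining 500 are spent outside the server. In practice the client and server clocks are rarely perfectly synchronized, so differences across machines (such as sr - cs) are less trustworthy than differences taken on one machine.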
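Span: one request (it contains a set of Annotations and BinaryAnnotations). It is the basic unit of work: each call in the chain (an RPC, a database query, and so on, with no particular restriction) creates one span, identified by a 64-bit ID.
A span usually carries other data as well, such as a description, timestamps, key-value tag information (BinaryAnnotation), and a parent-id. The parent-id identifies where in the call chain the span came from. Put simply, a span is the record of one request.
Trace: a tree-like collection of spans that represents one call chain and has a unique identifier.
The collected spans are assembled into a tree by means of the traceId (the global trace ID; it marks the entry point of the trace, and where it is generated depends on your needs), the spanId (the ID of one traced request, such as a single RPC), and the parentId (the ID of the preceding request, used to link requests together). The tree shows the overall flow of a request.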
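To make the tree assembly concrete, here is a minimal sketch that links spans through their parentId. The `Span` class below is a simplified stand-in invented for this example, not Zipkin's actual model class, and the input is assumed to contain the spans of a single trace.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative tree builder for the spans of one trace. */
final class SpanTree {
    /** Simplified span record; parentId is null for the root span. */
    static final class Span {
        final long traceId;
        final long id;
        final Long parentId;
        final String name;
        final List<Span> children = new ArrayList<>();

        Span(long traceId, long id, Long parentId, String name) {
            this.traceId = traceId;
            this.id = id;
            this.parentId = parentId;
            this.name = name;
        }
    }

    /** Attaches every span to its parent and returns the root (null if there is none). */
    static Span build(List<Span> spans) {
        Map<Long, Span> byId = new HashMap<>();
        for (Span s : spans) {
            byId.put(s.id, s);
        }
        Span root = null;
        for (Span s : spans) {
            if (s.parentId == null) {
                root = s;
                continue;
            }
            Span parent = byId.get(s.parentId);
            if (parent != null) {
                parent.children.add(s); // spans whose parent is missing are skipped
            }
        }
        return root;
    }

    /** Renders the tree with two spaces of indentation per nesting level. */
    static String render(Span span, int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(span.name).append('\n');
        for (Span child : span.children) {
            sb.append(render(child, depth + 1));
        }
        return sb.toString();
    }
}
```

Spans can arrive in any order, which is why the sketch indexes them by ID first and links them in a second pass.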
Zipkin with Spring Boot, an experiment:
Install Zipkin (default port 9411):
wget -O zipkin.jar 'https://search.maven.org/remote_content?g=io.zipkin.java&a=zipkin-server&v=LATEST&c=exec'
nohup java -jar zipkin.jar &
Create a Spring Boot project with these dependencies:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-aop</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.zipkin.brave</groupId>
    <artifactId>brave-core</artifactId>
    <version>3.9.0</version>
</dependency>
<dependency>
    <groupId>io.zipkin.brave</groupId>
    <artifactId>brave-http</artifactId>
    <version>3.9.0</version>
</dependency>
<dependency>
    <groupId>io.zipkin.brave</groupId>
    <artifactId>brave-spancollector-http</artifactId>
    <version>3.9.0</version>
</dependency>
<dependency>
    <groupId>io.zipkin.brave</groupId>
    <artifactId>brave-web-servlet-filter</artifactId>
    <version>3.9.0</version>
</dependency>
<dependency>
    <groupId>io.zipkin.brave</groupId>
    <artifactId>brave-okhttp</artifactId>
    <version>3.9.0</version>
</dependency>
Create the Spring Boot startup class:
@SpringBootApplication
public class Application {
    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(Application.class);
        app.run(args);
    }
}
Create the controller, which calls the service on port 9090 at /foo:
@RestController
public class HomeController {
    @Autowired
    private OkHttpClient client;

    private Random random = new Random();

    @RequestMapping(value = "/start")
    public String start() throws InterruptedException, IOException {
        // Simulate some work before calling the downstream service.
        int sleep = random.nextInt(100);
        TimeUnit.MILLISECONDS.sleep(sleep);
        Request request = new Request.Builder().url("http://localhost:9090/foo").get().build();
        Response response = client.newCall(request).execute();
        // string() reads the response body; toString() would only print the object reference.
        return " [service1 sleep " + sleep + " ms]" + response.body().string();
    }
}
Create application.properties to set the port the service listens on, the service name reported to Zipkin, the address of the Zipkin server, and the other settings:
com.zipkin.serviceName=service-start
com.zipkin.url=http://******:9411
com.zipkin.connectTimeout=6000
com.zipkin.readTimeout=6000
com.zipkin.flushInterval=1
com.zipkin.compressionEnabled=true
server.port=8080
Create the configuration properties class:
@Configuration
@ConfigurationProperties(prefix = "com.zipkin")
public class ZipkinProperties {
    private String serviceName;
    private String url;
    private int connectTimeout;
    private int readTimeout;
    private int flushInterval;
    private boolean compressionEnabled;

    public String getUrl() { return url; }
    public void setUrl(String url) { this.url = url; }

    public int getConnectTimeout() { return connectTimeout; }
    public void setConnectTimeout(int connectTimeout) { this.connectTimeout = connectTimeout; }

    public int getReadTimeout() { return readTimeout; }
    public void setReadTimeout(int readTimeout) { this.readTimeout = readTimeout; }

    public int getFlushInterval() { return flushInterval; }
    public void setFlushInterval(int flushInterval) { this.flushInterval = flushInterval; }

    public boolean isCompressionEnabled() { return compressionEnabled; }
    public void setCompressionEnabled(boolean compressionEnabled) { this.compressionEnabled = compressionEnabled; }

    public String getServiceName() { return serviceName; }
    public void setServiceName(String serviceName) { this.serviceName = serviceName; }
}
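Create the Brave configuration class:
import com.github.kristofa.brave.Brave;
import com.github.kristofa.brave.EmptySpanCollectorMetricsHandler;
import com.github.kristofa.brave.Sampler;
import com.github.kristofa.brave.SpanCollector;
import com.github.kristofa.brave.http.DefaultSpanNameProvider;
import com.github.kristofa.brave.http.HttpSpanCollector;
import com.github.kristofa.brave.okhttp.BraveOkHttpRequestResponseInterceptor;
import com.github.kristofa.brave.servlet.BraveServletFilter;
import okhttp3.OkHttpClient;

@Configuration
public class ZipkinConfig {
    @Autowired
    private ZipkinProperties properties;

    // Sends collected spans to the Zipkin server over HTTP.
    @Bean
    public SpanCollector spanCollector() {
        HttpSpanCollector.Config config = HttpSpanCollector.Config.builder()
                .connectTimeout(properties.getConnectTimeout())
                .readTimeout(properties.getReadTimeout())
                .compressionEnabled(properties.isCompressionEnabled())
                .flushInterval(properties.getFlushInterval())
                .build();
        return HttpSpanCollector.create(properties.getUrl(), config, new EmptySpanCollectorMetricsHandler());
    }

    @Bean
    public Brave brave(SpanCollector spanCollector) {
        // The constructor argument is the service name under which spans are reported.
        Brave.Builder builder = new Brave.Builder(properties.getServiceName());
        builder.spanCollector(spanCollector);
        // Sample every request; suitable for an experiment, not necessarily for production.
        builder.traceSampler(Sampler.ALWAYS_SAMPLE);
        return builder.build();
    }

    // Server side: records sr/ss for incoming HTTP requests.
    @Bean
    public BraveServletFilter braveServletFilter(Brave brave) {
        return new BraveServletFilter(brave.serverRequestInterceptor(),
                brave.serverResponseInterceptor(), new DefaultSpanNameProvider());
    }

    // Client side: records cs/cr for outgoing OkHttp calls and propagates the trace IDs.
    @Bean
    public OkHttpClient okHttpClient(Brave brave) {
        return new OkHttpClient.Builder()
                .addInterceptor(new BraveOkHttpRequestResponseInterceptor(
                        brave.clientRequestInterceptor(),
                        brave.clientResponseInterceptor(),
                        new DefaultSpanNameProvider()))
                .build();
    }
}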
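Start the service. Once it is running, it reports its spans to Zipkin under the configured service name.
The other service, on port 9090:
Create another Spring Boot project and change application.properties to set the port, the service name reported to Zipkin, the address of the Zipkin server, and the other settings:
com.zipkin.serviceName=service-foo
com.zipkin.url=http://******:9411
com.zipkin.connectTimeout=6000
com.zipkin.readTimeout=6000
com.zipkin.flushInterval=1
com.zipkin.compressionEnabled=true
server.port=9090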
Change the controller to handle /foo:
@RestController
public class HomeController {
    @Autowired
    private OkHttpClient client;

    private Random random = new Random();

    @RequestMapping(value = "/foo")
    public String foo() throws InterruptedException, IOException {
        int sleep = random.nextInt(100);
        TimeUnit.MILLISECONDS.sleep(sleep);

        // Call service3.
        Request request = new Request.Builder().url("http://localhost:9091/bar").get().build();
        Response response = client.newCall(request).execute();
        String result = response.body().string();

        // Call service4.
        request = new Request.Builder().url("http://localhost:9092/tar").get().build();
        response = client.newCall(request).execute();
        result += response.body().string();

        return " [service2 sleep " + sleep + " ms]" + result;
    }
}
For the remaining services on ports 9091 and 9092, only the port in application.properties changes, and each controller gets the matching handler method:
@RequestMapping(value = "/bar")
public String bar() throws InterruptedException, IOException {
    Random random = new Random();
    int sleep = random.nextInt(100);
    TimeUnit.MILLISECONDS.sleep(sleep);
    return " [service3 sleep " + sleep + " ms]";
}

@RequestMapping(value = "/tar")
public String tar() throws InterruptedException, IOException {
    Random random = new Random();
    int sleep = random.nextInt(1000);
    TimeUnit.MILLISECONDS.sleep(sleep);
    return " [service4 sleep " + sleep + " ms]";
}
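Start each Spring Boot project, then visit http://localhost:8080/start to trigger the call chain and look at the resulting trace in the Zipkin UI.
Click a trace to see the call relationships between the services.
Open the details of a span to see the exact times at which the request arrived and was processed.
At this point the relationships within each call chain are clearly visible.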
Zipkin-dubbo
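Bringing Zipkin into Dubbo is straightforward: it comes down to writing a filter that sends log data before and after each request is handled, so that Zipkin can build the call-chain data.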
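The shape of such a filter can be sketched without any Dubbo dependency. Everything below is a stand-in written for this note: `TracingFilterSketch`, `invokeTraced`, and the in-memory `reported` list are hypothetical, and a real implementation would implement Dubbo's own filter interface and hand the records to a Zipkin reporter instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Illustrative "record before, invoke, record after" wrapper; not Dubbo's API. */
final class TracingFilterSketch {
    /** Stand-in for the component that would ship records to Zipkin. */
    static final List<String> reported = new ArrayList<>();

    /**
     * Wraps the next step of the call chain.
     * Records a server-receive event first and a server-send event afterwards,
     * even when the wrapped call throws.
     */
    static <T> T invokeTraced(String spanName, Supplier<T> next) {
        long start = System.nanoTime();
        reported.add(spanName + " sr");
        try {
            return next.get();
        } finally {
            long elapsedMicros = (System.nanoTime() - start) / 1_000;
            reported.add(spanName + " ss elapsedMicros=" + elapsedMicros);
        }
    }
}
```

A provider-side filter would wrap the service invocation this way, while a consumer-side filter would do the same with cs and cr and also pass the trace identifiers along with the request.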