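Applications usually write logs for later analysis. In most cases the logs are consulted only after a problem has occurred, which makes this a static, after-the-fact kind of analysis. Often, though, we need to know how the whole system is running right now, or how it was running at a particular moment. For a backend service, for example, we may want real-time monitoring data such as:
1. How many requests are handled per second (TPS)?
2. What is the average processing time per request?
3. What is the longest processing time of a request?
4. What does the histogram of response times look like?
5. What percentage of requests get a correct response?
6. How long is the queue of requests waiting to be processed?
7. What are the CPU usage, memory footprint, and JVM status of the whole system, and what is the system's error rate?
The simplest way to collect this kind of real-time data is to place instrumentation points at the system's entry points, exit points, and other key locations, and then send the collected data to a real-time monitoring platform or store it in a cache or database for further analysis and display.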
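The idea of an instrumentation point can be sketched without any library. The class below is a hypothetical helper (RequestProbe and all of its method names are made up for illustration); it wraps a request handler and records the request count, the failure count, and the mean and maximum latency. It is only a minimal sketch of the idea, not how Metrics implements it.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical helper, not part of any library: wraps a request handler and
// records how many requests ran, how many failed, and how long they took.
class RequestProbe {
    private final AtomicLong total = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong maxNanos = new AtomicLong();

    // Instrumentation point around the entry and exit of one request.
    // Only RuntimeExceptions are counted as failures in this sketch.
    <T> T observe(Supplier<T> handler) {
        long start = System.nanoTime();
        total.incrementAndGet();
        try {
            return handler.get();
        } catch (RuntimeException e) {
            failed.incrementAndGet();
            throw e;
        } finally {
            long elapsed = System.nanoTime() - start;
            totalNanos.addAndGet(elapsed);
            maxNanos.accumulateAndGet(elapsed, Math::max);
        }
    }

    long total() { return total.get(); }

    long failed() { return failed.get(); }

    // Fraction of requests that completed without an exception; NaN before any request.
    double successRatio() {
        long n = total.get();
        return n == 0 ? Double.NaN : (n - failed.get()) / (double) n;
    }

    double meanMillis() {
        long n = total.get();
        return n == 0 ? 0.0 : totalNanos.get() / 1e6 / n;
    }

    double maxMillis() { return maxNanos.get() / 1e6; }

    public static void main(String[] args) {
        RequestProbe probe = new RequestProbe();
        for (int i = 0; i < 5; i++) {
            probe.observe(() -> "ok");
        }
        System.out.printf("total=%d failed=%d mean=%.3fms max=%.3fms%n",
                probe.total(), probe.failed(), probe.meanMillis(), probe.maxMillis());
    }
}
```

Hand-written probes like this quickly grow repetitive, which is the gap a metrics library is meant to fill.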
Metrics is a library for measuring monitoring indicators, and it gives developers a set of tools for collecting each of these kinds of data.
See the official documentation for details: https://metrics.dropwizard.io/3.1.0/manual/core/
I. Introduction to the Metrics library
Metrics provides five basic metric types: Meters, Gauges, Counters, Histograms, and Timers.
1. Setting up the Maven dependencies
<dependencies>
    <dependency>
        <groupId>io.dropwizard.metrics</groupId>
        <artifactId>metrics-core</artifactId>
        <version>3.2.6</version>
    </dependency>
    <dependency>
        <groupId>io.dropwizard.metrics</groupId>
        <artifactId>metrics-healthchecks</artifactId>
        <version>3.2.6</version>
    </dependency>
</dependencies>
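2. Meters: introduction and usage
// A Meter is a counter that can only be incremented. It is typically used to measure the rate at which a series of events occurs.
// It reports the mean rate as well as exponentially weighted moving average rates over the last 1, 5, and 15 minutes.
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

public class MetricsExample {
    // Create the registry
    private static final MetricRegistry registry = new MetricRegistry();
    // Meter for request throughput
    private static final Meter requestMeter = registry.meter("tps");
    // Meter for failed requests
    private static final Meter errorMeter = registry.meter("err_request");

    public static void main(String[] args) {
        // Build the report (rates are expressed per minute)
        ConsoleReporter report = ConsoleReporter.forRegistry(registry)
                .convertRatesTo(TimeUnit.MINUTES)
                .convertDurationsTo(TimeUnit.MINUTES)
                .build();
        report.start(10, TimeUnit.SECONDS); // print to the console every 10 seconds
        for (;;) {          // simulate a continuous stream of requests
            getAsk();       // send a request
            randomSleep();  // pause between requests
        }
    }

    // Handle one request
    public static void getAsk() {
        try {
            requestMeter.mark();
            randomSleep();
            // nextInt(6) can return 0, so this occasionally throws ArithmeticException to simulate a failed request
            int x = 10 / ThreadLocalRandom.current().nextInt(6);
        } catch (Exception e) {
            System.out.println("Error");
            errorMeter.mark();
        }
    }

    // Simulate the time spent handling a request
    public static void randomSleep() {
        try {
            TimeUnit.SECONDS.sleep(ThreadLocalRandom.current().nextInt(10)); // sleep for a random 0 to 9 seconds
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
// Output: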
19-6-4 16:38:47 ================================================================
-- Meters ----------------------------------------------------------------------
err_request
count = 1
mean rate = 1.50 events/minute
1-minute rate = 0.75 events/minute
5-minute rate = 0.19 events/minute
15-minute rate = 0.07 events/minute
tps
count = 4
mean rate = 5.99 events/minute
1-minute rate = 8.85 events/minute
5-minute rate = 11.24 events/minute
15-minute rate = 11.74 events/minute
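The 1-, 5-, and 15-minute rates in the output above are exponentially weighted moving averages, which is why they can differ from the mean rate. The following is a simplified sketch of that idea in plain Java. EwmaSketch and its method names are hypothetical, and the 60-second window and 5-second tick used in main are assumptions chosen for illustration rather than values taken from this article.

```java
// Simplified sketch of an exponentially weighted moving average (EWMA) rate.
// Not thread-safe; a real implementation would need atomic counters.
class EwmaSketch {
    private final double alpha;        // smoothing factor derived from the window length
    private final double tickSeconds;  // how often tick() is expected to be called
    private long uncounted;            // events marked since the last tick
    private double ratePerSecond;
    private boolean initialized;

    EwmaSketch(double windowSeconds, double tickSeconds) {
        this.tickSeconds = tickSeconds;
        this.alpha = 1.0 - Math.exp(-tickSeconds / windowSeconds);
    }

    // Record n events.
    void mark(long n) {
        uncounted += n;
    }

    // Fold the events seen since the previous tick into the moving average.
    void tick() {
        double instantRate = uncounted / tickSeconds;
        uncounted = 0;
        if (initialized) {
            ratePerSecond += alpha * (instantRate - ratePerSecond);
        } else {
            ratePerSecond = instantRate;  // the first tick seeds the average
            initialized = true;
        }
    }

    double ratePerSecond() {
        return ratePerSecond;
    }

    public static void main(String[] args) {
        EwmaSketch oneMinute = new EwmaSketch(60.0, 5.0);
        oneMinute.mark(10);
        oneMinute.tick();
        System.out.println("after a busy tick:  " + oneMinute.ratePerSecond());
        oneMinute.tick();  // no events in this interval, so the rate decays
        System.out.println("after a quiet tick: " + oneMinute.ratePerSecond());
    }
}
```

Older events are never dropped abruptly; their weight simply shrinks with every tick, so a longer window reacts more slowly to a change in traffic.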
3. Gauge: introduction and usage
3.1 Using Gauge
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Gauge;
import com.codahale.metrics.MetricRegistry;

/**
 * Using a Gauge.
 * A Gauge is the simplest kind of metric and is generally used to report an instantaneous value,
 * for example the size of a collection at a given moment.
 *
 * @author zhao
 * @date 2019-06-14 00:08:02
 */
public class GaugeExample {
    // Metric registry
    private static final MetricRegistry registry = new MetricRegistry();
    // Queue whose size is measured
    private static final Queue<Integer> queue = new LinkedBlockingQueue<>();

    public static void main(String[] args) throws InterruptedException {
        // Print the metrics to the console every 3 seconds
        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry).build();
        reporter.start(3, TimeUnit.SECONDS);

        Gauge<Integer> gauge = new Gauge<Integer>() {
            @Override
            public Integer getValue() {
                return queue.size();
            }
        };
        // Register the gauge with the registry
        registry.register(MetricRegistry.name(GaugeExample.class, "queue-size"), gauge);

        // Simulate data arriving in the queue
        for (int i = 0; i < 100; i++) {
            queue.add(i);
            TimeUnit.MILLISECONDS.sleep(100);
        }
        // Block forever so the reporter keeps running
        Thread.currentThread().join();
    }
}
// Output:
19-6-14 0:39:17 ================================================================
-- Gauges ----------------------------------------------------------------------
com.zpb.gauge.GaugeExample.queue-size
value = 31
19-6-14 0:39:20 ================================================================
-- Gauges ----------------------------------------------------------------------
com.zpb.gauge.GaugeExample.queue-size
value = 60
19-6-14 0:39:23 ================================================================
-- Gauges ----------------------------------------------------------------------
com.zpb.gauge.GaugeExample.queue-size
value = 90
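3.2 Using RatioGauge
Purpose: measures the ratio between two values, such as the success rate of an event. Examples: cache hit ratio, API call success ratio, and so on.
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.RatioGauge;

public class RatioGaugeExample {
    private static final MetricRegistry registry = new MetricRegistry();
    private static final Meter totalMeter = registry.meter("totalCount");
    private static final Meter succMeter = registry.meter("succCount");

    public static void main(String[] args) {
        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry).build();
        reporter.start(5, TimeUnit.SECONDS); // print to the console every 5 seconds

        registry.gauge("succ-ratio", () -> new RatioGauge() {
            @Override
            protected Ratio getRatio() {
                // first argument: numerator; second argument: denominator
                return Ratio.of(succMeter.getCount(), totalMeter.getCount());
            }
        });

        // Keep handling requests
        for (;;) {
            processHandle();
        }
    }

    public static void processHandle() {
        // total count
        totalMeter.mark();
        try {
            // nextInt(10) can return 0, which throws ArithmeticException and simulates a failure
            int x = 10 / ThreadLocalRandom.current().nextInt(10);
            TimeUnit.MILLISECONDS.sleep(100);
            // success count
            succMeter.mark();
        } catch (Exception e) {
            System.out.println("================ err");
        }
    }
}
// Output: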
19-6-17 9:28:13 ================================================================
-- Gauges ----------------------------------------------------------------------
succ-ratio
value = 0.9607843137254902
-- Meters ----------------------------------------------------------------------
succCount
count = 49
mean rate = 9.52 events/second
1-minute rate = 9.60 events/second
5-minute rate = 9.60 events/second
15-minute rate = 9.60 events/second
totalCount
count = 51
mean rate = 9.90 events/second
1-minute rate = 10.00 events/second
5-minute rate = 10.00 events/second
15-minute rate = 10.00 events/second
19-6-17 9:28:18 ================================================================
-- Gauges ----------------------------------------------------------------------
succ-ratio
value = 0.9423076923076923
-- Meters ----------------------------------------------------------------------
succCount
count = 98
mean rate = 9.71 events/second
1-minute rate = 9.63 events/second
5-minute rate = 9.61 events/second
15-minute rate = 9.60 events/second
totalCount
count = 104
mean rate = 10.31 events/second
1-minute rate = 10.06 events/second
5-minute rate = 10.01 events/second
15-minute rate = 10.00 events/second
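The succ-ratio value in each report above is just succCount divided by totalCount (49/51 in the first report, 98/104 in the second). The arithmetic can be sketched in plain Java as follows. RatioSketch is a hypothetical name, and returning NaN for an unusable denominator is a design choice of this sketch that avoids reporting a misleading 0 or an infinite ratio.

```java
// Sketch of the arithmetic behind a ratio metric.
class RatioSketch {
    // Returns numerator / denominator, or NaN when the denominator is 0, NaN, or infinite.
    static double ratio(double numerator, double denominator) {
        if (denominator == 0 || Double.isNaN(denominator) || Double.isInfinite(denominator)) {
            return Double.NaN;
        }
        return numerator / denominator;
    }

    public static void main(String[] args) {
        System.out.println(ratio(49, 51));   // counts from the first report above
        System.out.println(ratio(98, 104));  // counts from the second report above
        System.out.println(ratio(0, 0));     // no requests yet
    }
}
```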
4. Using Counter
Purpose: a Counter is a special case of a Gauge. It maintains a count that can be changed with the inc() and dec() methods. It is used in much the same way as a Gauge, and MetricRegistry provides a counter() method that creates and registers a Counter directly. It can be used to measure the relationship between a producer and a consumer.
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;

public class CounterExample {
    // Metric registry
    private static final MetricRegistry registry = new MetricRegistry();
    // Counter (the empty name part is skipped, so the metric is named after the class)
    private static final Counter counter = registry.counter(MetricRegistry.name(CounterExample.class, ""));
    private static final ConsoleReporter report = ConsoleReporter.forRegistry(registry)
            .convertRatesTo(TimeUnit.MINUTES)
            .convertDurationsTo(TimeUnit.MINUTES)
            .build();
    // Thread-safe queue, because the producer and the consumer run on different threads
    private static final Queue<String> queue = new ConcurrentLinkedQueue<>();

    public static void main(String[] args) throws Exception {
        report.start(5, TimeUnit.SECONDS); // print to the console every 5 seconds

        new Thread(() -> {
            try {
                production("abc");
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }).start();

        new Thread(() -> {
            try {
                consume();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }).start();

        Thread.currentThread().join();
    }

    public static void production(String s) throws InterruptedException {
        for (int i = 0; i < 100; i++) {
            counter.inc();
            queue.offer(s);
        }
    }

    // Note: the consumer stops as soon as it finds the queue empty
    public static void consume() throws InterruptedException {
        while (!queue.isEmpty()) {
            queue.poll(); // remove the head element
            counter.dec();
        }
    }
}
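The producer and consumer pattern in the example above boils down to incrementing on enqueue and decrementing on dequeue, so that the counter always reflects the backlog. Here is a library-free sketch of that bookkeeping; PendingJobs and its methods are hypothetical names, and java.util.concurrent.atomic.LongAdder stands in for the metric.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.LongAdder;

// Sketch: a counter that tracks how many items are waiting between a producer and a consumer.
class PendingJobs {
    private final Queue<String> queue = new ConcurrentLinkedQueue<>();
    private final LongAdder pending = new LongAdder();

    // Producer side: enqueue, then count the item.
    void produce(String job) {
        queue.offer(job);
        pending.increment();
    }

    // Consumer side: decrement only when an item was actually removed,
    // so polling an empty queue cannot push the count below zero.
    String consume() {
        String job = queue.poll();
        if (job != null) {
            pending.decrement();
        }
        return job;
    }

    long pending() {
        return pending.sum();
    }

    public static void main(String[] args) {
        PendingJobs jobs = new PendingJobs();
        for (int i = 0; i < 100; i++) {
            jobs.produce("abc");
        }
        for (int i = 0; i < 40; i++) {
            jobs.consume();
        }
        System.out.println("pending = " + jobs.pending());
    }
}
```

Tying the decrement to a successful poll keeps the count consistent even when several consumers race for the last item.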
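5. Histograms
Purpose: a Histogram measures the distribution of a stream of values: the minimum, maximum, mean, median, and percentiles (75%, 95%, 98%, 99%, and 99.9%).
For example, it can be used to measure the response times of requests to a page or to an API method.
import java.util.Random;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Histogram;
import com.codahale.metrics.MetricRegistry;

public class HistogramsExample {
    private static final MetricRegistry registry = new MetricRegistry();
    private static final ConsoleReporter reporter = ConsoleReporter.forRegistry(registry).build();
    // Create a Histogram
    private static final Histogram histogram =
            registry.histogram(MetricRegistry.name(HistogramsExample.class, "histogram"));

    public static void main(String[] args) throws InterruptedException {
        reporter.start(5, TimeUnit.SECONDS);
        Random r = new Random();
        while (true) {
            processHandle(r.nextDouble());
            Thread.sleep(100);
        }
    }

    private static void processHandle(double d) {
        // In an application, call update() at the point where the value to be measured is known
        histogram.update((int) (d * 100));
    }
}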
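To make the statistics named above concrete, here is a small plain-Java sketch that computes min, max, mean, and percentiles from a complete set of samples. SummarySketch is a hypothetical class, and it uses the simple nearest-rank definition of a percentile over all samples, which is an assumption of this sketch and may differ from the sampling and interpolation a metrics library applies.

```java
import java.util.Arrays;

// Sketch: summary statistics over a complete, in-memory set of samples.
class SummarySketch {
    private final long[] sorted;

    SummarySketch(long... samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        sorted = samples.clone();
        Arrays.sort(sorted);
    }

    long min() { return sorted[0]; }

    long max() { return sorted[sorted.length - 1]; }

    double mean() { return Arrays.stream(sorted).average().orElse(0.0); }

    // Nearest-rank percentile: the smallest sample such that at least q of the samples are <= it.
    long percentile(double q) {
        int rank = (int) Math.ceil(q * sorted.length);
        rank = Math.min(Math.max(rank, 1), sorted.length);  // clamp to a valid rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        SummarySketch s = new SummarySketch(80, 10, 30, 20, 50, 40, 70, 60);
        System.out.println("min=" + s.min() + " max=" + s.max() + " mean=" + s.mean());
        System.out.println("median=" + s.percentile(0.5) + " p75=" + s.percentile(0.75));
    }
}
```

Keeping every sample is only practical for small data sets; a long-running service needs some form of sampling to bound memory.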
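6. Using Timer
Purpose: a Timer measures both the rate of requests and how long they take to process.
For example: the total number of requests to an API within a period of time, and the average processing time.
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;
import com.codahale.metrics.Timer.Context;

public class TimerExample {
    // Metric registry
    private static final MetricRegistry registry = new MetricRegistry();
    // Print to the console
    private static final ConsoleReporter report = ConsoleReporter.forRegistry(registry).build();
    // Create the timer
    private static final Timer timer = registry.timer("request");

    public static void main(String[] args) {
        report.start(5, TimeUnit.SECONDS);
        while (true) {
            handleRequest();
        }
    }

    private static void handleRequest() {
        Context time = timer.time();
        try {
            Thread.sleep(500); // simulate the time spent handling a request
        } catch (Exception e) {
            System.out.println("err");
        } finally {
            time.stop(); // always stop the timing context, whether or not the request succeeded
            System.out.println("==== timer stopped");
        }
    }
}
// Output (note: this is the report produced by HistogramsExample in section 5):
19-6-17 11:25:27 ===============================================================
-- Histograms ------------------------------------------------------------------
com.zpb.histograms.HistogramsExample.histogram
count = 50 # total number of requests
min = 0
max = 98
mean = 53.14 # mean
stddev = 27.04 # standard deviation
median = 50.00 # median
75% <= 78.00
95% <= 92.00
98% <= 94.00
99% <= 98.00
99.9% <= 98.00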
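The start/stop pattern in handleRequest() can also be expressed with try-with-resources, so the stop can never be forgotten. Here is a library-free sketch of that pattern. TimerSketch and Sample are hypothetical names, the clock is injected so the behaviour is easy to verify, and the class is not thread-safe.

```java
import java.util.function.LongSupplier;

// Sketch: a minimal timer whose samples record their duration when closed.
class TimerSketch {
    private final LongSupplier nanoClock;
    private long count;
    private long totalNanos;

    TimerSketch(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    // Handle for one timed operation; closing it records the elapsed time.
    final class Sample implements AutoCloseable {
        private final long start = nanoClock.getAsLong();

        @Override
        public void close() {
            totalNanos += nanoClock.getAsLong() - start;
            count++;
        }
    }

    Sample time() {
        return new Sample();
    }

    long count() {
        return count;
    }

    double meanNanos() {
        return count == 0 ? 0.0 : (double) totalNanos / count;
    }

    public static void main(String[] args) throws InterruptedException {
        TimerSketch timer = new TimerSketch(System::nanoTime);
        for (int i = 0; i < 3; i++) {
            try (TimerSketch.Sample ignored = timer.time()) {
                Thread.sleep(50);  // simulate request handling
            }
        }
        System.out.printf("count=%d mean=%.1f ms%n", timer.count(), timer.meanNanos() / 1e6);
    }
}
```

Because close() runs even when the body throws, failed requests are timed as well, which matches the intent of the finally block in the example above.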
7. HealthChecks
Purpose: health checks verify that the application, its submodules, and the modules it depends on are running normally.
How it is implemented:
Class A: extends HealthCheck and overrides the check() method; check() calls the method under test in class B.
Class B: defines a method that returns a boolean. (Class B can also be a class in another system.)
import java.util.Map.Entry;
import java.util.Random;
import java.util.Set;

import com.codahale.metrics.health.HealthCheck;
import com.codahale.metrics.health.HealthCheckRegistry;

public class HealthChecksExample extends HealthCheck {
    private final DataBase database;

    public HealthChecksExample(DataBase database) {
        this.database = database;
    }

    @Override
    protected Result check() throws Exception {
        if (database.ping()) {
            return Result.healthy();
        }
        return Result.unhealthy("Can't ping database.");
    }

    static class DataBase {
        // Simulated ping method
        public boolean ping() {
            Random r = new Random();
            return r.nextBoolean();
        }
    }

    public static void main(String[] args) {
        // Create the health check registry
        HealthCheckRegistry registry = new HealthCheckRegistry();
        // Register the checks
        registry.register("database1", new HealthChecksExample(new DataBase()));
        registry.register("database2", new HealthChecksExample(new DataBase()));
        // Run the health checks and collect the results.
        // The checks run only once here, so the loop below prints the same results every time;
        // move this call inside the loop to re-run the checks on each iteration.
        Set<Entry<String, Result>> entrySet = registry.runHealthChecks().entrySet();
        while (true) {
            for (Entry<String, Result> entry : entrySet) {
                if (entry.getValue().isHealthy()) {
                    System.out.println(entry.getKey() + ": OK");
                } else {
                    System.err.println(entry.getKey() + "FAIL:error message: " + entry.getValue().getMessage());
                    final Throwable e = entry.getValue().getError();
                    if (e != null) {
                        e.printStackTrace();
                    }
                }
            }
            try {
                Thread.sleep(1000);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
// Output:
database1FAIL:error message: Can't ping database.
database2: OK
database1FAIL:error message: Can't ping database.
database2: OK
database1FAIL:error message: Can't ping database.
database2: OK
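The structure used above (named checks in a registry, a run that collects one result per check) can be sketched without the library as follows. HealthSketch is a hypothetical class and the string results are a simplification. In this sketch a check that throws is reported as a failure instead of aborting the whole run, so one broken dependency cannot hide the status of the others.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

// Sketch: a tiny registry of named health checks.
class HealthSketch {
    private final Map<String, Callable<Boolean>> checks = new LinkedHashMap<>();

    void register(String name, Callable<Boolean> check) {
        checks.put(name, check);
    }

    // Runs every registered check in registration order and returns one status per name.
    Map<String, String> runAll() {
        Map<String, String> results = new LinkedHashMap<>();
        for (Map.Entry<String, Callable<Boolean>> entry : checks.entrySet()) {
            String status;
            try {
                status = entry.getValue().call() ? "OK" : "FAIL";
            } catch (Exception e) {
                // An exception counts as unhealthy; keep its message for diagnosis.
                status = "FAIL: " + e.getMessage();
            }
            results.put(entry.getKey(), status);
        }
        return results;
    }

    public static void main(String[] args) {
        HealthSketch health = new HealthSketch();
        health.register("database1", () -> true);
        health.register("database2", () -> { throw new IllegalStateException("connection refused"); });
        health.runAll().forEach((name, status) -> System.out.println(name + ": " + status));
    }
}
```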
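II. Reporters
As the examples above show, we can collect many kinds of data, but in a real system we cannot just print it to the console. The data has to be exported in a form that can be stored and displayed. The official site lists many types of reporters; only the ones commonly used at work are covered here.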
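Conceptually, a reporter is a scheduled task that reads the current value of every registered metric and hands it to some sink, such as the console, a log, or a monitoring system. The sketch below shows that shape in plain Java. ReporterSketch and its methods are hypothetical names, and metrics are simplified to named long values.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

// Sketch: periodically pushes "name = value" lines for every registered metric to a sink.
class ReporterSketch {
    private final Map<String, LongSupplier> metrics = new ConcurrentSkipListMap<>();
    private final Consumer<String> sink;

    ReporterSketch(Consumer<String> sink) {
        this.sink = sink;
    }

    void register(String name, LongSupplier value) {
        metrics.put(name, value);
    }

    // One reporting pass, in alphabetical order of metric names.
    void report() {
        metrics.forEach((name, value) -> sink.accept(name + " = " + value.getAsLong()));
    }

    // Starts reporting at a fixed period; the caller shuts the scheduler down when done.
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::report, period, period, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong requests = new AtomicLong();
        ReporterSketch reporter = new ReporterSketch(System.out::println);
        reporter.register("requests", requests::get);
        ScheduledExecutorService scheduler = reporter.start(1, TimeUnit.SECONDS);
        for (int i = 0; i < 30; i++) {
            requests.incrementAndGet();
            Thread.sleep(100);
        }
        scheduler.shutdown();
    }
}
```

Swapping the sink is all it takes to redirect the output, which is the same separation that lets a metrics library offer console, log, and other reporters over one registry.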
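1. Writing data to the log
The metrics are written to the log through logback. For the configuration details, see: "Introduction to and configuration of logback (plain and easy to understand)".
import java.util.concurrent.TimeUnit;

import org.slf4j.LoggerFactory;

import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Slf4jReporter;
import com.codahale.metrics.Timer;
import com.codahale.metrics.Timer.Context;

public class TimerExample {
    // Metric registry
    private static final MetricRegistry registry = new MetricRegistry();
    // Write to the log file
    private static final Slf4jReporter report = Slf4jReporter.forRegistry(registry)
            // Logger name for the metrics output; any name works as long as it matches the name of a logger in logback.xml
            .outputTo(LoggerFactory.getLogger("com.metrics.timer"))
            .convertRatesTo(TimeUnit.SECONDS)
            .convertDurationsTo(TimeUnit.SECONDS)
            .build();
    // Create the timer
    private static final Timer timer = registry.timer("request");

    public static void main(String[] args) {
        report.start(5, TimeUnit.SECONDS);
        while (true) {
            handleRequest();
        }
    }

    private static void handleRequest() {
        Context time = timer.time();
        try {
            Thread.sleep(500); // simulate the time spent handling a request
        } catch (Exception e) {
            System.out.println("err =" + e);
        } finally {
            time.stop(); // stop in finally so the timing context is closed after every request
            System.out.println("==== timer stopped");
        }
    }
}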
2. Writing Counter data to the log
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;

import org.slf4j.LoggerFactory;

import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Slf4jReporter;

public class CounterExample {
    // Metric registry
    private static final MetricRegistry registry = new MetricRegistry();
    // Counter (the empty name part is skipped, so the metric is named after the class)
    private static final Counter counter = registry.counter(MetricRegistry.name(CounterExample.class, ""));
    // Write to the log file through logback
    private static final Slf4jReporter reporter = Slf4jReporter.forRegistry(registry)
            .outputTo(LoggerFactory.getLogger("com.metrics"))
            .convertRatesTo(TimeUnit.SECONDS)
            .convertDurationsTo(TimeUnit.SECONDS)
            .build();
    // Thread-safe queue, because the producer and the consumer run on different threads
    private static final Queue<String> queue = new ConcurrentLinkedQueue<>();

    public static void main(String[] args) throws Exception {
        reporter.start(5, TimeUnit.SECONDS); // write to the log every 5 seconds

        new Thread(() -> {
            try {
                production("abc");
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }).start();

        new Thread(() -> {
            try {
                consume();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }).start();

        Thread.currentThread().join();
    }

    public static void production(String s) throws InterruptedException {
        for (int i = 0; i < 100; i++) {
            counter.inc();
            queue.offer(s);
            System.out.println("------- produced ----------->" + queue.size());
        }
    }

    // Note: the consumer stops as soon as it finds the queue empty
    public static void consume() throws InterruptedException {
        while (!queue.isEmpty()) {
            queue.poll(); // remove the head element
            counter.dec();
            System.err.println("<------- consumed ----------- " + queue.size());
        }
    }
}
metrics for spring https://blog.csdn.net/woshibigsail/article/details/95341831