1. Monitor the status of key business URLs (whether they return 200, response time, and so on)
Prometheus already has a ready-made exporter for this; see
https://github.com/prometheus/blackbox_exporter
For details on usage and what the results look like, there is also a blog post:
https://medium.com/the-telegraph-engineering/how-prometheus-and-the-blackbox-exporter-makes-monitoring-microservice-endpoints-easy-and-free-of-a986078912ee
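To make concrete what such a probe checks, here is a minimal Python sketch of the idea. It is not how blackbox_exporter is implemented. The function name `probe`, the `opener` parameter, and the returned field names are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0, opener=urllib.request.urlopen):
    """Send one GET request and report the status code and elapsed time.

    `opener` is injectable (an assumption of this sketch) so the function
    can be exercised without a real network connection.
    """
    start = time.monotonic()
    status = None
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status code.
        status = exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar errors.
        status = None
    duration = time.monotonic() - start
    return {
        "url": url,
        "status": status,
        "success": status == 200,
        "duration_seconds": duration,
    }
```

Calling `probe("http://localhost:8080/")` against a running service would return a dictionary whose `success` field is true only for a 200 response. A real setup would leave this work to blackbox_exporter and let Prometheus scrape the result.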
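2. Monitor specific business metrics
Taking Tomcat as an example: create a web application named metrics, add an index.jsp file to it, and write every metric that should be exposed into that file.
For example: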
[root@master metrics]# cat index.jsp
# HELP helloworld_ordernumber Number of Order.
# TYPE helloworld_ordernumber gauge
helloworld_ordernumber 10
# HELP helloworld_orderamount Amount of Order.
# TYPE helloworld_orderamount gauge
helloworld_orderamount 100
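Notes:
- The whole file is plain text; no html, body, or other markup is needed.
- Each custom metric is preceded by a HELP line and a TYPE line. The gauge type means the value can go up or down, unlike the counter type, which only accumulates.
- Later on, the actual values could be obtained by calling an application interface or querying a database; here they are hard-coded to keep things simple.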
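As a sketch of how the hard-coded file could later be generated from live values, the Python snippet below renders gauges in the same text format as the index.jsp above. The names `render_gauges` and `load_order_stats` are hypothetical, and `load_order_stats` only stands in for a real interface call or database query.

```python
def load_order_stats():
    """Hypothetical data source; a real one would call an API or a database."""
    return [
        ("helloworld_ordernumber", "Number of Order.", 10),
        ("helloworld_orderamount", "Amount of Order.", 100),
    ]


def render_gauges(metrics):
    """Render (name, help_text, value) tuples as plain-text gauge metrics."""
    lines = []
    for name, help_text, value in metrics:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The exposition format is line based, so end with a trailing newline.
    return "\n".join(lines) + "\n"
```

`print(render_gauges(load_order_stats()), end="")` reproduces the six metric lines of the index.jsp shown above.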
Modify the Prometheus configuration file
Add the targets to be monitored
[root@master prometheus-2.7.1.linux-amd64]# cat prometheus.yml
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9100']
    - targets: ['localhost:8080']
Start Prometheus, then open http://192.168.56.108:9090/targets
Retrieve and view the metrics
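To illustrate what a scraper reads from the metrics page, here is a minimal Python sketch that turns the plain-text output into name/value pairs. It is not Prometheus's own parser. The function name `parse_samples` is hypothetical, and the sketch assumes sample lines without timestamps, as in the index.jsp example.

```python
def parse_samples(text):
    """Parse metric sample lines into a {name: value} dictionary.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    Assumes each sample line is "<name> <value>" with no timestamp.
    """
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split off the last whitespace-separated token as the value.
        name, value = line.rsplit(None, 1)
        samples[name] = float(value)
    return samples
```

Feeding it the body of the metrics page would yield the two helloworld gauges with their current values, which is the same data the Prometheus web UI shows when the metric names are queried.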
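3. Monitoring in an OpenShift container cloud environment
- The index.jsp file, or a similar metrics endpoint, is bundled with the business application, so it runs in the same Pod as the business service.
- To monitor the business metrics exposed by each microservice, Prometheus needs to be deployed inside the OpenShift cluster.
- If Prometheus is deployed outside the cluster, the services to be monitored must be exposed through a route.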
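4. Monitoring categories and approaches
- Business monitoring means collecting business metrics, for example values stored in Redis or a database. If the application runs multiple instances, querying any one of them is enough.
- To check whether each instance is working properly, use the readiness probe and liveness probe that OpenShift provides; Kubernetes then takes care of keeping the instances healthy.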