Common Prometheus Monitoring Examples

Reposted article, 2020-04-17 11:00. Category: monitor

I. Examples

1 Memory usage

Usage = used memory / total memory, where used memory = total - free - buffers - cached
Buffer memory
node_memory_Buffers_bytes
Available memory
node_memory_MemAvailable_bytes

((node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Buffers_bytes - node_memory_Cached_bytes) / (node_memory_MemTotal_bytes )) * 100
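As a worked example of the expression above, the following Python sketch applies the same arithmetic to made-up values. The function name and the sample numbers are assumptions for illustration and do not come from node_exporter.

```python
def memory_usage_percent(total, free, buffers, cached):
    """Mirror the PromQL expression: (total - free - buffers - cached) / total * 100."""
    used = total - free - buffers - cached
    return used / total * 100


# Hypothetical values in bytes for a host with 8 GiB of RAM.
GIB = 1024 ** 3
usage = memory_usage_percent(
    total=8 * GIB, free=2 * GIB, buffers=1 * GIB, cached=1 * GIB
)
print(f"memory usage: {usage:.1f}%")  # prints "memory usage: 50.0%"
```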

2 Disk I/O

((rate(node_disk_read_bytes_total[1m] )+ rate(node_disk_written_bytes_total[1m])) / 1024 /1024) > 0 

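Disk I/O throughput is the sum of reads and writes, since both consume I/O. Dividing by 1024 twice converts bytes/s to MiB/s; the trailing > 0 keeps only devices with activity.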
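The unit conversion can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name and the input rates are hypothetical.

```python
def io_throughput_mib_per_s(read_bytes_per_s, written_bytes_per_s):
    """Sum read and write rates (bytes/s) and divide by 1024 twice to get MiB/s."""
    return (read_bytes_per_s + written_bytes_per_s) / 1024 / 1024


# Hypothetical per-second rates: 3 MiB/s of reads and 1 MiB/s of writes.
print(io_throughput_mib_per_s(3 * 1024 * 1024, 1 * 1024 * 1024))  # prints 4.0
```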

3 Disk usage

● node_filesystem_size_bytes

(total size - free size) / total size = disk usage

(node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_free_bytes{mountpoint="/"}) /
node_filesystem_size_bytes{mountpoint="/"} * 100
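To sanity-check the value on a single machine, the same ratio can be computed with Python's standard library. This is a minimal sketch: the function names are made up for illustration, and the local result may differ slightly from the PromQL value because filesystems can reserve blocks that the two sources count differently.

```python
import shutil


def disk_usage_percent(size_bytes, free_bytes):
    """(total size - free size) / total size * 100, as in the PromQL expression."""
    return (size_bytes - free_bytes) / size_bytes * 100


def local_usage_percent(path="/"):
    """Read total and free space for a local path and apply the same formula."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    return disk_usage_percent(usage.total, usage.free)
```

Calling local_usage_percent("/") on the monitored host gives a figure to compare against the graph.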

4 Network interface traffic

● node_network_transmit_bytes_total

 irate(node_network_transmit_bytes_total{device!~"lo"}[1m]) / 1000

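Both irate and rate compute how fast a counter changes over a time range, but they do so differently: irate uses only the last two data points in the range, giving an instantaneous rate, while rate considers the whole range and returns the average per-second rate of increase across it.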
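The difference shows up clearly on a counter that bursts at the end of the window. The Python sketch below is a simplification under stated assumptions: the function names and samples are hypothetical, and the real rate() also handles counter resets and extrapolates to the window boundaries, which is ignored here.

```python
def simple_irate(samples):
    """Per-second rate from the last two (timestamp, value) samples."""
    (t1, v1), (t2, v2) = samples[-2], samples[-1]
    return (v2 - v1) / (t2 - t1)


def simple_rate(samples):
    """Average per-second rate across the window (first vs. last sample)."""
    (t1, v1), (t2, v2) = samples[0], samples[-1]
    return (v2 - v1) / (t2 - t1)


# Hypothetical counter samples over one minute: steady at first, then a burst.
samples = [(0, 0), (15, 150), (30, 300), (45, 450), (60, 3450)]
print(simple_rate(samples))   # prints 57.5 (average over the whole minute)
print(simple_irate(samples))  # prints 200.0 (only the final 15 seconds)
```

The averaged value suits alerting and slow-moving graphs, while the two-point value reacts faster but is noisier.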

5 Monitoring TCP connection counts

Data source: Pushgateway + a script
The script is as follows:

#!/bin/bash
# Short host name, used below as the instance label
instance_name=$(hostname -f | cut -d'.' -f1)
if [ "$instance_name" = "localhost" ]; then
    echo "Hostname must be an FQDN, not localhost"
    exit 1
fi
# Count connections whose state contains "wait" (TIME_WAIT, CLOSE_WAIT, ...)
count_netstat_wait_connections=$(netstat -an | grep -i wait | wc -l)
echo "count_netstat_wait_connections $count_netstat_wait_connections" | curl --data-binary @- "http://192.168.1.211:9091/metrics/job/pushgateway/instance/$instance_name"

Then add a crontab entry so the script runs every minute: */1 * * * * bash /prometheus/pushgateway.sh
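For readers who prefer to push from Python, the request that the curl command sends can be sketched with the standard library alone. The function names and the host name "web01" are hypothetical; the URL layout and metric name are taken from the script above.

```python
import urllib.request


def build_push(base_url, job, instance, metric, value):
    """Return the (url, body) pair equivalent to the curl call in the script."""
    url = f"{base_url}/metrics/job/{job}/instance/{instance}"
    # The text exposition format expects each sample line to end with a newline.
    body = f"{metric} {value}\n".encode()
    return url, body


def push(base_url, job, instance, metric, value):
    """POST one sample to the Pushgateway (needs a reachable Pushgateway)."""
    url, body = build_push(base_url, job, instance, metric, value)
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

For example, push("http://192.168.1.211:9091", "pushgateway", "web01", "count_netstat_wait_connections", 12) would send the same payload as the shell script does for a host named web01.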

The predict_linear() function fits a straight line to a time series over a given range and uses that trend to predict the value at a future time.
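The idea behind predict_linear() can be illustrated with an ordinary least-squares fit. The Python sketch below is an illustration of the technique, not Prometheus's implementation; the function signature and the sample data are assumptions.

```python
def predict_linear(samples, eval_time, seconds_ahead):
    """Fit a least-squares line through (timestamp, value) samples and
    evaluate it seconds_ahead after eval_time.

    Needs at least two samples with distinct timestamps.
    """
    xs = [t - eval_time for t, _ in samples]
    ys = [v for _, v in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x  # fitted value at eval_time
    return intercept + slope * seconds_ahead


# Hypothetical free-space samples (seconds, GiB): losing 1 GiB every 10 minutes.
samples = [(0, 50.0), (600, 49.0), (1200, 48.0), (1800, 47.0)]
# PromQL analogue: predict_linear(node_filesystem_free_bytes{mountpoint="/"}[1h], 3600)
print(predict_linear(samples, eval_time=1800, seconds_ahead=3600))  # about 41 GiB
```

A typical use is alerting when the predicted value of a free-space metric drops below zero within the next few hours.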

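6 Monitoring Docker containers

First install cAdvisor, Google's tool for monitoring resource usage on a single node. Docker does provide some CLI commands, but in an age of graphs those basic features can hardly meet users' ever-growing needs; cAdvisor offers at-a-glance resource monitoring for multiple containers on a single node. Google's Kubernetes also uses it by default as the per-node resource monitor, so cAdvisor is installed on every node by default. As a free tool, cAdvisor is a solid choice and is drawing more and more attention.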

6.1 Pull the image

docker pull docker.io/google/cadvisor

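6.2 Run and configure a cAdvisor container

docker run -d --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=8080:8080 --name=cadvisor1 --net=host -v "/etc/localtime:/etc/localtime" google/cadvisor:latest

Here we use --net=host so that the Prometheus server can communicate directly with the exporter and Grafana. With host networking the container shares the host's network stack, so port 8080 is exposed directly and the --publish mapping has no effect.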

6.3 Open the web page

  localhost:8080/containers/

6.4 Update the Prometheus configuration file

  - job_name: 'docker'
    static_configs:
      - targets: ['192.168.1.211:8080']

Then restart the Prometheus service.
The container metrics should now show up in the graphs.

7 Monitoring CPU

● node_cpu_seconds_total
CPU status
Check how much of the CPU is in use versus idle:

(1-((sum(increase(node_cpu_seconds_total{mode="idle"}[1m])) by (instance))/(sum(increase(node_cpu_seconds_total[1m])) by (instance)))) * 100
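To make the expression concrete, the following Python sketch applies the same formula to one instance's counter increases. The function name and the numbers are hypothetical; the mode names follow the labels of node_cpu_seconds_total.

```python
def cpu_usage_percent(increase_by_mode):
    """(1 - idle increase / total increase) * 100 for a single instance."""
    total = sum(increase_by_mode.values())
    idle = increase_by_mode.get("idle", 0.0)
    return (1 - idle / total) * 100


# Hypothetical 1-minute increases on a 4-core host, summed over all cores:
# 4 cores * 60 s = 240 CPU-seconds in total.
increase = {"idle": 180.0, "user": 40.0, "system": 15.0, "iowait": 5.0}
print(cpu_usage_percent(increase))  # prints 25.0
```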

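Load average per CPU core over 1, 5, and 15 minutes (node_exporter exposes node_load1, node_load5, and node_load15). The inner count by(job, instance, cpu) collapses the per-mode series to one per core, and the outer count gives the number of cores:


node_load1 / count by(job, instance)(count by(job, instance, cpu)(node_cpu_seconds_total))

node_load5 / count by(job, instance)(count by(job, instance, cpu)(node_cpu_seconds_total))

node_load15 / count by(job, instance)(count by(job, instance, cpu)(node_cpu_seconds_total))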
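Dividing the load average by the core count can also be reproduced locally with Python's standard library, which is handy for cross-checking a dashboard. This is a sketch under assumptions: the function names are made up, and os.getloadavg() is only available on Unix-like systems, where it returns the 1-, 5-, and 15-minute load averages.

```python
import os


def load_per_core(load_average, cpu_count):
    """Divide a load average by the number of CPU cores."""
    return load_average / cpu_count


def local_load_per_core():
    """Return the (1 min, 5 min, 15 min) load per core for this machine."""
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return tuple(load_per_core(load, cores) for load in os.getloadavg())
```

A value near or above 1.0 means that, on average, every core had runnable work queued during that period.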

II. Dashboard templates

https://grafana.com/grafana/dashboards/8919 Linux system monitoring
https://grafana.com/grafana/dashboards/9276 Linux host monitoring (Chinese)

https://grafana.com/grafana/dashboards/193 Docker container template
https://grafana.com/grafana/dashboards/8588 Kubernetes monitoring template
