Writing Prometheus Alerting Rules

Reposted article. 2020-09-07 17:25. Tag: prometheus

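I. Alert when the message rate (messages per second) of a Kafka topic exceeds a given value

1. Use the Grafana dashboard to identify the specific metric to monitor.

2. Find the live data scraped for that metric in the Prometheus web UI.

3. Write the alerting rule file based on the data Prometheus scrapes.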

# pwd
/usr/local/prometheus-2.6.1.linux-amd64

# mkdir rules
# cat  rules/kafka.yml

groups:
- name: kafka.rules
  rules:
  - alert: KafkaTopicMessageRateHigh
    # The offset only ever grows, so compare its per-second rate, not its raw value, with the threshold.
    expr: rate(kafka_topic_partition_current_offset{topic="superman"}[1m]) > 2000
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }}: topic {{ $labels.topic }} message rate is too high"
      description: "{{ $labels.instance }}: {{ $labels.job }}: partition {{ $labels.partition }} of topic {{ $labels.topic }} exceeds 2000 messages per second (current value: {{ $value }})"
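The metric kafka_topic_partition_current_offset is a cumulative offset, so a per-second figure has to be derived from how much it changes between scrapes. This is what PromQL's rate() function does. The sketch below shows only the basic arithmetic. The sample values and the 15-second scrape spacing are hypothetical, and the function is a simplification that ignores the extrapolation and counter-reset handling of the real rate().

```python
def per_second_rate(samples):
    """Approximate messages/second from (timestamp_seconds, offset) samples.

    Simplified model: uses only the first and last sample in the window.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must increase")
    return (v1 - v0) / (t1 - t0)


# Hypothetical scrapes of one partition over a 1-minute window, 15 s apart.
samples = [
    (0, 1_000_000),
    (15, 1_036_000),
    (30, 1_069_000),
    (45, 1_105_000),
    (60, 1_138_000),
]

rate = per_second_rate(samples)
print(rate)          # 2300.0
print(rate > 2000)   # True: the threshold of 2000 messages/second is exceeded
```

Note that the raw offset here is above 2000 in every sample, so comparing the offset itself against 2000 would say nothing about throughput.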

4. Edit the prometheus.yml configuration file

# cat prometheus.yml

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/*.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"
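Because the entry is a glob pattern, any file ending in .yml in the rules directory is picked up, including the kafka_lag.yml file created in part II, without further changes to prometheus.yml. The sketch below illustrates the matching with Python's glob module in a temporary directory. It assumes the pattern behaves like an ordinary shell-style glob, and notes.txt is a hypothetical file added only to show a non-match.

```python
import glob
import os
import tempfile

# Build a throwaway directory that mimics the rules/ folder.
with tempfile.TemporaryDirectory() as base:
    rules_dir = os.path.join(base, "rules")
    os.mkdir(rules_dir)
    for name in ("kafka.yml", "kafka_lag.yml", "notes.txt"):
        open(os.path.join(rules_dir, name), "w").close()

    # Expand the same pattern that rule_files uses.
    matched = sorted(
        os.path.basename(path)
        for path in glob.glob(os.path.join(base, "rules", "*.yml"))
    )

print(matched)  # ['kafka.yml', 'kafka_lag.yml']
```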

5. Restart Prometheus so that it loads the new rule file.

6. Check the Alerts page in the Prometheus web UI.
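The same information can also be read programmatically. The sketch below assumes a Prometheus version that exposes the /api/v1/alerts HTTP endpoint and a server listening on the default http://localhost:9090. Both are assumptions, as are the helper names and the alert name KafkaLag in the hand-written sample payload.

```python
import json
from urllib.request import urlopen


def summarize_alerts(payload):
    """Return (alertname, state) pairs from an /api/v1/alerts response body."""
    doc = json.loads(payload)
    if doc.get("status") != "success":
        raise RuntimeError("Prometheus API returned status %r" % doc.get("status"))
    return [
        (alert["labels"].get("alertname"), alert["state"])
        for alert in doc["data"]["alerts"]
    ]


def fetch_alerts(base_url="http://localhost:9090"):
    # base_url is an assumption: Prometheus on its default port.
    with urlopen(base_url + "/api/v1/alerts", timeout=5) as resp:
        return summarize_alerts(resp.read().decode("utf-8"))


# Offline example: a hand-written payload in the shape the endpoint returns.
sample = """
{"status": "success",
 "data": {"alerts": [
   {"labels": {"alertname": "KafkaLag", "severity": "warning"},
    "annotations": {},
    "state": "pending",
    "activeAt": "2020-09-07T09:25:00Z",
    "value": "25"}
 ]}}
"""

print(summarize_alerts(sample))  # [('KafkaLag', 'pending')]
```

Calling fetch_alerts() against a running server returns the same kind of list. An alert shows as pending until its condition has held for the configured for duration, and as firing afterwards.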

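II. Alert when the lag of a topic in a given Kafka consumer group exceeds a given value (same steps as above)

Write the alerting rule based on the information gathered as above: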

# cd /usr/local/prometheus-2.6.1.linux-amd64/rules/
# cat kafka_lag.yml

groups:
- name: kafka_rules
  rules:
  - alert: KafkaConsumerGroupLagHigh
    expr: kafka_consumergroup_lag{consumergroup="mygroup"} > 20
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Consumer group {{ $labels.consumergroup }}: topic {{ $labels.topic }} is lagging"
      description: "{{ $labels.consumergroup }}: {{ $labels.job }}: partition {{ $labels.partition }} of topic {{ $labels.topic }} is lagging (current value: {{ $value }})"
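The for: 1m clause means the alert does not fire the first time the lag exceeds 20. It stays pending until the condition has held continuously for one minute, and any evaluation below the threshold resets the timer. The sketch below is a simplified model of that behaviour. The 15-second evaluation interval and the lag values are hypothetical, and the function name is made up for illustration.

```python
def alert_states(values, threshold, for_seconds, interval_seconds):
    """Return the alert state after each evaluation: inactive, pending or firing."""
    states = []
    active_since = None  # evaluation time at which the condition first became true
    for i, value in enumerate(values):
        now = i * interval_seconds
        if value > threshold:
            if active_since is None:
                active_since = now
            held = now - active_since
            states.append("firing" if held >= for_seconds else "pending")
        else:
            # Condition no longer true: the pending timer starts over.
            active_since = None
            states.append("inactive")
    return states


# Hypothetical lag readings, one per evaluation, 15 s apart.
lag = [5, 25, 30, 10, 25, 40, 50, 60, 70]
states = alert_states(lag, threshold=20, for_seconds=60, interval_seconds=15)
for value, state in zip(lag, states):
    print(value, state)
# Only the last reading (70) is reported as firing: the dip to 10 reset the
# timer, and the condition then held from t=60 s to t=120 s.
```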

Restart Prometheus
