- The task here is to send an alert email with Alertmanager.
- The environment uses the pre-compiled binary releases of the Prometheus components.
- We will monitor one node's memory and send an email alert when usage exceeds 2% (a deliberately low threshold, for testing).
Environment preparation
Download the binaries from https://prometheus.io/download/:
https://github.com/prometheus/prometheus/releases/download/v2.0.0/prometheus-2.0.0.linux-amd64.tar.gz
https://github.com/prometheus/alertmanager/releases/download/v0.12.0/alertmanager-0.12.0.linux-amd64.tar.gz
https://github.com/prometheus/node_exporter/releases/download/v0.15.2/node_exporter-0.15.2.linux-amd64.tar.gz
Extract them:
/root/
├── alertmanager -> alertmanager-0.12.0.linux-amd64
├── alertmanager-0.12.0.linux-amd64
├── alertmanager-0.12.0.linux-amd64.tar.gz
├── node_exporter-0.15.2.linux-amd64
├── node_exporter-0.15.2.linux-amd64.tar.gz
├── prometheus -> prometheus-2.0.0.linux-amd64
├── prometheus-2.0.0.linux-amd64
└── prometheus-2.0.0.linux-amd64.tar.gz
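The tree above can be reproduced roughly as follows (a sketch; the symlinks are only for convenience so the rest of the article can use short paths):

cd /root
wget https://github.com/prometheus/prometheus/releases/download/v2.0.0/prometheus-2.0.0.linux-amd64.tar.gz
wget https://github.com/prometheus/alertmanager/releases/download/v0.12.0/alertmanager-0.12.0.linux-amd64.tar.gz
wget https://github.com/prometheus/node_exporter/releases/download/v0.15.2/node_exporter-0.15.2.linux-amd64.tar.gz
tar xzf prometheus-2.0.0.linux-amd64.tar.gz
tar xzf alertmanager-0.12.0.linux-amd64.tar.gz
tar xzf node_exporter-0.15.2.linux-amd64.tar.gz
# convenience symlinks
ln -s prometheus-2.0.0.linux-amd64 prometheus
ln -s alertmanager-0.12.0.linux-amd64 alertmanager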
Experiment architecture
Configure Alertmanager
Create alert.yml:
[root@n1 alertmanager]# ls
alertmanager alert.yml amtool data LICENSE NOTICE simple.yml
alert.yml defines who sends the notification, for what event, to whom, and how it is delivered.
cat alert.yml
global:
  smtp_smarthost: 'smtp.163.com:25'
  smtp_from: 'maotai@163.com'
  smtp_auth_username: 'maotai@163.com'
  smtp_auth_password: '123456'

templates:
  - '/root/alertmanager/template/*.tmpl'

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 10m
  receiver: default-receiver

receivers:
  - name: 'default-receiver'
    email_configs:
      - to: 'maotai@foxmail.com'
- Once configured, start it:
./alertmanager -config.file=./alert.yml
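A quick sanity check that Alertmanager is up, assuming it listens on its default port 9093 and exposes the v1 status endpoint:

# should return JSON with config and version info
curl -s http://localhost:9093/api/v1/status
# or simply open http://<host>:9093 in a browser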
Configure Prometheus
Alert rule configuration: rule.yml (it will be referenced from prometheus.yml)
Send an email alert when memory usage is above 2% (for testing):
$ cat rule.yml
groups:
- name: test-rule
  rules:
  - alert: NodeMemoryUsage
    expr: (node_memory_MemTotal - (node_memory_MemFree+node_memory_Buffers+node_memory_Cached)) / node_memory_MemTotal * 100 > 2
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: High memory usage detected"
      description: "{{ $labels.instance }}: Memory usage is above 2% (current value: {{ $value }})"
The key is this expression:
(node_memory_MemTotal - (node_memory_MemFree+node_memory_Buffers+node_memory_Cached )) / node_memory_MemTotal * 100 > 2
labels attaches extra labels to this rule.
annotations (the alert description) becomes the body of the notification.
Where do the metric keys (node_memory_MemTotal / node_memory_Buffers / node_memory_Cached) come from? This is covered further below; the rule file can also be validated right away, as in the sketch that follows.
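Before wiring the rule into Prometheus, it can be checked with promtool, which ships in the Prometheus tarball (a sketch, assuming rule.yml sits in /root/prometheus; the exact success message may vary by version):

cd /root/prometheus
./promtool check rules rule.yml
# on success it reports the number of rules found, e.g. "SUCCESS: 1 rules found"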
prometheus.yml configuration
- Add a job for node_exporter.
- Add the alerting rules under rule_files; the rule_files section references rule.yml.
$ cat prometheus.yml
global:
  scrape_interval: 15s     # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]

rule_files:
  - /root/prometheus/rule.yml

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['192.168.14.11:9090']
  - job_name: linux
    static_configs:
      - targets: ['192.168.14.11:9100']
        labels:
          instance: db1
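The configuration can be validated and Prometheus started roughly like this (a sketch; Prometheus 2.x uses double-dash flags):

cd /root/prometheus
./promtool check config prometheus.yml
./prometheus --config.file=prometheus.yml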
After configuration, start Prometheus and open its web UI; the node target should now be visible.
View the metrics exposed by node_exporter.
View the Alerts page to see the state of the alerting rule.
The keys used in the expressions can be found there (provided the corresponding exporter is installed); write the alerting expressions against these keys.
Check the received email.
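The metric keys can also be pulled straight from node_exporter's /metrics endpoint instead of the web UI (a sketch, using the node from this setup):

curl -s http://192.168.14.11:9100/metrics | grep '^node_memory'
# e.g. node_memory_MemTotal, node_memory_MemFree, node_memory_Buffers, node_memory_Cached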
WeChat alert configuration
global:
  # The smarthost and SMTP sender used for mail notifications.
  resolve_timeout: 6m
  smtp_smarthost: 'x.x.x.x:25'
  smtp_from: 'maomao@qq.com'
  smtp_auth_username: 'maomao'
  smtp_auth_password: 'maomao@qq.com'
  smtp_require_tls: false
  # The auth token for Hipchat.
  hipchat_auth_token: '1234556789'
  # Alternative host for Hipchat.
  hipchat_api_url: 'https://123'
  wechat_api_url: "https://123"
  wechat_api_secret: "123"
  wechat_api_corp_id: "123"

# The directory from which notification templates are read.
templates:
  - 'templates/*.tmpl'
# The root route on which each incoming alert enters.
route:
  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  group_by: ['alertname']
  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This way ensures that you get multiple alerts for the same group that start
  # firing shortly after another are batched together on the first
  # notification.
  group_wait: 3s
  # When the first notification was sent, wait 'group_interval' to send a batch
  # of new alerts that started firing for that group.
  group_interval: 5m
  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend them.
  repeat_interval: 1h
  # A default receiver
  receiver: maotai
  routes:
    - match:
        job: "11"
        #service: "node_exporter"
      routes:
        - match:
            status: yellow
          receiver: maotai
        - match:
            status: orange
          receiver: berlin
# Inhibition rules allow to mute a set of alerts given that another alert is
# firing.
# We use this to mute any warning-level notifications if the same alert is
# already critical.
inhibit_rules:
  - source_match:
      service: 'up'
    target_match:
      service: 'mysql'
    # Apply inhibition if the instance label is the same.
    equal: ["instance"]
  - source_match:
      service: "mysql"
    target_match:
      service: "mysql-query"
    equal: ['instance']
  - source_match:
      service: "A"
    target_match:
      service: "B"
    equal: ["instance"]
  - source_match:
      service: "B"
    target_match:
      service: "C"
    equal: ["instance"]
receivers:
  - name: 'maotai'
    email_configs:
      - to: 'maotai@qq.com'
        send_resolved: true
        html: '{{ template "email.default.html" . }}'
        headers: { Subject: "[mail] Tech department monitoring alert email (test)" }
  - name: "berlin"
    wechat_configs:
      - send_resolved: true
        to_user: "@all"
        to_party: ""
        to_tag: ""
        agent_id: "1"
        corp_id: "xxxxxxx"
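To exercise the routing and the email/WeChat receivers without waiting for a real rule to fire, an alert can be pushed to Alertmanager's v1 API by hand (a sketch; the labels here are hypothetical and only need to match one of the routes above):

curl -XPOST -H "Content-Type: application/json" http://localhost:9093/api/v1/alerts -d '[
  {
    "labels": {"alertname": "TestAlert", "job": "11", "status": "orange"},
    "annotations": {"summary": "manual test alert"}
  }
]'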