Prometheus (5): Configuring WeChat Work Alerts with Prometheus + Alertmanager

Reposted article. Originally published 2019-11-29 16:16. Tag: Prometheus

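This guide assumes Prometheus is already installed and running at 192.168.56.200.

1. Set up WeChat Work

1.1 Register for WeChat Work (skip this step if you already have an account)

WeChat Work registration page: https://work.weixin.qq.com/

Fill in the required information to register your organization.

1.2 Create a custom app

After registering, log in and click [App Management] (應用管理) in the top navigation bar to open the app management page.

Then choose [Create App] (創建應用) to create a custom app that will receive Alertmanager notifications.

Fill in the app name and other details, then create the app. Once it is created, note down the app's AgentId and Secret for later use.

Next, click [My Company] (我的企業) in the top navigation bar and note down the Corp ID (企業ID) shown at the bottom of the page for later use.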
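
Before moving on, it can be useful to confirm that the Corp ID and Secret recorded above are accepted by WeChat Work. The sketch below is one way to do that, assuming `curl` is installed and the machine can reach qyapi.weixin.qq.com. The function names `wecom_token_url` and `wecom_check_credentials` are made up for this example, and the placeholder values in the usage comment must be replaced with your own.

```shell
#!/bin/sh
# Hypothetical helpers for checking WeChat Work credentials.
# The gettoken endpoint belongs to the WeChat Work server API.

wecom_token_url() {
    # $1 = Corp ID, $2 = Secret of the custom app
    printf 'https://qyapi.weixin.qq.com/cgi-bin/gettoken?corpid=%s&corpsecret=%s\n' "$1" "$2"
}

wecom_check_credentials() {
    # Prints the raw JSON response returned by the token endpoint.
    curl -sS "$(wecom_token_url "$1" "$2")"
}

# Usage (placeholders, not real values):
#   wecom_check_credentials 'YOUR_CORP_ID' 'YOUR_APP_SECRET'
```

A response containing `"errcode":0` together with an `access_token` suggests the pair is valid; any other errcode usually points at a wrong Corp ID or Secret.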

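2. Install Alertmanager

Alertmanager is installed here from the prebuilt binary release, so no compilation is needed. First download the Alertmanager package from: https://github.com/prometheus/alertmanager/releases/download/v0.19.0/alertmanager-0.19.0.linux-amd64.tar.gz

Once downloaded, upload the package to the /usr/local directory on the machine running Prometheus (192.168.56.200).

Extract the Alertmanager package:

#   cd /usr/local
#   tar -zvxf alertmanager-0.19.0.linux-amd64.tar.gz
#   mv alertmanager-0.19.0.linux-amd64/ alertmanager

Go into the extracted alertmanager directory and edit alertmanager.yml to configure alerting. The contents of alertmanager.yml are as follows: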

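global:
  resolve_timeout: 5m

templates:  # alert templates
  - './template/test.tmpl'

route:  # alert routing policy
  group_by: ['alertname']  # labels used to group alerts
  group_wait: 10s          # initial wait: after an alert fires, wait 10s so alerts in the same group are sent together
  group_interval: 10s      # wait before notifying about new alerts added to a group that was already notified
  repeat_interval: 1m      # interval for repeating an alert, to limit how often the same notification is sent; set to 1 minute here for testing
  receiver: 'wechat'       # default receiver

receivers:
  - name: 'wechat'
    wechat_configs:
      - send_resolved: true
        agent_id: '1000002'   # AgentId of the custom app
        to_user: '*****'      # ID of the user who receives the alerts
        api_secret: '******'  # Secret of the custom app
        corp_id: '******'     # Corp ID

#inhibit_rules:
#  - source_match:
#      severity: 'critical'
#    target_match:
#      severity: 'warning'
#    equal: ['alertname', 'dev', 'instance']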

Create the alert template

Go to the Alertmanager installation directory and create the template file:

#   cd /usr/local/alertmanager
#   mkdir template
#   cd template/
#   vim test.tmpl

Write the following into the file:

{{ define "wechat.default.message" }}
{{ range .Alerts }}
========Monitoring Alert==========
Alert status: {{ .Status }}
Alert severity: {{ .Labels.severity }}
Alert type: {{ .Labels.alertname }}
Application: {{ .Annotations.summary }}
Host: {{ .Labels.instance }}
Details: {{ .Annotations.description }}
Trigger value: {{ .Annotations.value }}
Alert time: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
========end=============
{{ end }}
{{ end }}

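Check that alertmanager.yml is valid:

#   cd /usr/local/alertmanager
#   ./amtool check-config alertmanager.yml

The check reports that the configuration is valid and that the template file was found.

Start Alertmanager:

#  ./alertmanager

The output shows that the Alertmanager service is up and listening on port 9093.

Open http://192.168.56.200:9093 (IP:9093) in a browser.

Alertmanager has started successfully.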

3. Configure Prometheus

Press Ctrl+C to stop the Alertmanager process, then go to the Prometheus installation directory and edit the Prometheus configuration.

#  cd /usr/local/prometheus
#  vim prometheus.yml

In prometheus.yml, edit the alerting and rule_files sections:

alerting:
  alertmanagers:
  - static_configs:
    - targets: ['localhost:9093']

rule_files: # alert rule files
- "rule.yml"

Save and exit.

The complete prometheus.yml is shown below:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets: ['localhost:9093']
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rule.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'Linux'
    static_configs:
    - targets: ['192.168.56.201:9100']
      labels:
        instance: Linux

  - job_name: 'Windows'
    static_configs:
    - targets: ['192.168.56.1:9182']
      labels:
        instance: Windows

  - job_name: 'snmp'
    scrape_interval: 10s
    static_configs: 
     - targets: 
       - 172.20.2.83  # IP address of the switch
    metrics_path: /snmp
    # params:
     # module: [if_mib]
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.56.100:9116 # address of the snmp_exporter service

Create the alert rule file rule.yml:

#  vim rule.yml

Write the following into the file. (For testing purposes, the rule fires when memory usage exceeds 10%.)

groups:
- name: mem-rule
  rules:
  - alert: "MemoryAlert"
    expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes+node_memory_Buffers_bytes+node_memory_Cached_bytes )) / node_memory_MemTotal_bytes * 100 > 10
    for: 30s
    labels:
      severity: warning
    annotations:
      summary: "Service: {{$labels.alertname}} memory alert"
      description: "{{ $labels.alertname }} memory utilization is above 10%"
      value: "{{ $value }}"

Save and exit.
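
To see what the rule expression evaluates to, the arithmetic can be reproduced by hand. The following is a small sketch and not part of the original setup: `mem_used_percent` is a hypothetical helper, and the byte counts in the usage comment are invented sample values rather than measurements from a real host.

```shell
#!/bin/sh
# Hypothetical helper that mirrors the rule expression:
#   (MemTotal - (MemFree + Buffers + Cached)) / MemTotal * 100

mem_used_percent() {
    # $1 = MemTotal, $2 = MemFree, $3 = Buffers, $4 = Cached (all in bytes)
    LC_ALL=C awk -v t="$1" -v f="$2" -v b="$3" -v c="$4" \
        'BEGIN { printf "%.1f\n", (t - (f + b + c)) / t * 100 }'
}

# Usage with made-up sample values:
#   mem_used_percent 8000000000 2000000000 500000000 1500000000
#   prints 50.0
```

With these sample values the result is well above the 10 threshold, so the rule would fire once the condition has held for the 30s `for` period.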

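4. Verify alerting

Restart Prometheus so that the alert rule takes effect:

#  systemctl restart prometheus

Go to the Alertmanager installation directory and start Alertmanager:

#  cd /usr/local/alertmanager
#  ./alertmanager

After a short wait, log in to WeChat Work; the alert message should have arrived.

The alert is also visible in a browser at http://192.168.56.200:9093/#/alerts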
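
If no alert shows up, it may help to separate the Alertmanager-to-WeChat path from the Prometheus rule by pushing a synthetic alert straight into Alertmanager. The sketch below assumes `curl` is available and uses Alertmanager's v2 HTTP API (`/api/v2/alerts`). The function names, the `ManualTest` alert name, and the label and annotation values are placeholders chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for sending a hand-made alert to Alertmanager.

build_test_alert() {
    # $1 = alertname, $2 = severity
    # Emits a JSON array with one alert, as accepted by POST /api/v2/alerts.
    printf '[{"labels":{"alertname":"%s","severity":"%s","instance":"manual-test"},"annotations":{"summary":"manual test alert","description":"sent by hand to exercise the receiver","value":"n/a"}}]\n' "$1" "$2"
}

send_test_alert() {
    # $1 = Alertmanager base URL, $2 = alertname, $3 = severity
    build_test_alert "$2" "$3" |
        curl -sS -X POST -H 'Content-Type: application/json' --data @- "$1/api/v2/alerts"
}

# Usage:
#   send_test_alert http://192.168.56.200:9093 ManualTest warning
```

If the synthetic alert reaches WeChat Work but the memory alert does not, the problem is more likely in prometheus.yml or rule.yml than in alertmanager.yml.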

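5. Start Alertmanager automatically at boot

Press Ctrl+C to stop the Alertmanager process, then create an alertmanager service so that Alertmanager runs as a systemd service and starts at boot.

Add the systemd unit:

#  vim /etc/systemd/system/alertmanager.service

Write the following into the file: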

[Unit]
Description=alertmanager
After=network.target

[Service]
WorkingDirectory=/usr/local/alertmanager
ExecStart=/usr/local/alertmanager/alertmanager --config.file=alertmanager.yml --log.level=debug --log.format=json
Restart=on-failure

[Install]
WantedBy=multi-user.target

Save and exit.

Reload systemd, enable the service at boot, and start it:

#  systemctl daemon-reload
#  systemctl enable alertmanager
#  systemctl start alertmanager
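
Once the unit is running, a quick check can confirm that systemd started it and that the HTTP endpoint answers. This is a minimal sketch assuming `curl` is installed; `alertmanager_is_healthy` is a hypothetical helper name, while `/-/healthy` is Alertmanager's built-in health endpoint.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the health endpoint answers with a 2xx status.

alertmanager_is_healthy() {
    # $1 = Alertmanager base URL (defaults to the local instance)
    # -f: treat HTTP errors as failure; --max-time: do not hang on a dead port
    curl -fsS --max-time 5 -o /dev/null "${1:-http://localhost:9093}/-/healthy"
}

# Usage:
#   systemctl is-active alertmanager
#   alertmanager_is_healthy && echo "Alertmanager is healthy"
```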

This completes the Prometheus + Alertmanager setup for WeChat Work alerts.
