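1. Specify the alert service and rule files

Tell Prometheus which alert management service (Alertmanager) to send alerts to and which alerting rule files to load. Here the alert service is deployed in Kubernetes and exposed under the service name alertmanager on port 9093. The alerting rules are all the .yml files in the "/etc/prometheus/rules/" directory.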

global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alert service (Alertmanager) that alerts are sent to
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

# Alerting rule files to load
rule_files:
  - "/etc/prometheus/rules/*.yml"
  # - "second_rules.yml"

# Scrape configurations, one job per monitored service.
# The first job scrapes Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter-np:9121']
  - job_name: 'node'
    static_configs:
      - targets: ['prometheus-prometheus-node-exporter:9100']
  - job_name: 'windows-node-001'
    static_configs:
      - targets: ['10.0.32.148:9182']
  - job_name: 'windows-node-002'
    static_configs:
      - targets: ['10.0.34.4:9182']
  - job_name: 'rabbit'
    static_configs:
      - targets: ['prom-rabbit-prometheus-rabbitmq-exporter:9419']
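Each statically configured target ends up as one `up` time series labelled with its job name and target address, and those are the series the alerting rule later evaluates. The following is a minimal sketch of that mapping, assuming no relabeling is configured. The helper name `expected_up_series` is hypothetical and only a subset of the jobs is listed.

```python
# Hypothetical helper: lists the label sets that the `up` series would carry
# for statically configured targets. Assumes no relabeling is in place, so
# `instance` is simply the target address.

def expected_up_series(scrape_configs):
    series = []
    for job in scrape_configs:
        for static in job.get("static_configs", []):
            for target in static.get("targets", []):
                series.append({"job": job["job_name"], "instance": target})
    return series


# A subset of the jobs from the configuration, written as Python data.
scrape_configs = [
    {"job_name": "prometheus", "static_configs": [{"targets": ["localhost:9090"]}]},
    {"job_name": "redis", "static_configs": [{"targets": ["redis-exporter-np:9121"]}]},
    {"job_name": "windows-node-002", "static_configs": [{"targets": ["10.0.34.4:9182"]}]},
]

for labels in expected_up_series(scrape_configs):
    print('up{{job="{job}", instance="{instance}"}}'.format(**labels))
```

The `job` and `instance` labels are what the `{{$labels.job}}` and `{{$labels.instance}}` placeholders in an alert annotation resolve to.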

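2. Define the alerting rules

Define the alerting rules. Prometheus evaluates these rules and sends the resulting alerts to the alert service. The rule below reports every instance whose `up` value is 0, so the alert service learns which instances are not running properly.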

# rules
groups:
  - name: node-rules
    rules:
      - alert: InstanceDown # alert name
        expr: up == 0 # condition that triggers the alert
        for: 3s # how long the condition must hold before the alert is sent
        labels: # labels attached to the alert
          team: k8s
        annotations: # alert message
          summary: "{{$labels.instance}}: has been down"
          description: "{{$labels.instance}}: job {{$labels.job}} has been down"
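The interaction between `for` and `evaluation_interval` is easy to misread, so here is a simplified model of it in Python. It is a sketch of the pending/firing behaviour, not Prometheus's actual implementation: it assumes one sample per evaluation and ignores details such as staleness handling. The function name `alert_states` is hypothetical.

```python
# Simplified model of how a rule with a `for` clause moves between states.
# Assumption: the rule is evaluated every `eval_interval_s` seconds and sees
# exactly one `up` sample per evaluation.

def alert_states(up_samples, eval_interval_s, for_s):
    """Return the alert state at each evaluation: inactive, pending or firing."""
    states = []
    active_since = None  # time at which `up == 0` first became true
    for i, up in enumerate(up_samples):
        now = i * eval_interval_s
        if up != 0:
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        held = now - active_since
        states.append("firing" if held >= for_s else "pending")
    return states


# evaluation_interval: 15s and for: 3s, as in the configuration shown here.
print(alert_states([1, 0, 0, 0, 1], eval_interval_s=15, for_s=3))
# prints ['inactive', 'pending', 'firing', 'firing', 'inactive']
```

Under this model, a `for` value shorter than the evaluation interval cannot be honoured exactly: the alert turns pending at the first failing evaluation and fires at the next one, about 15 seconds later.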

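3. Configure alert routing and receivers

Alerts are delivered by email in this setup: when the alert service receives an alert, it emails it to the configured recipient.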

global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:25' # SMTP server (host:port) used to send the mail
  smtp_from: 'xxx@163.com' # sender address
  smtp_auth_username: 'xxx' # mailbox username
  smtp_auth_password: 'SYNUNQBZMIWUQXGZ' # mailbox password or authorization code

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'email'
receivers:
  - name: 'email'
    email_configs:
      - to: 'xxxxxx@aliyun.com' # address that receives the alerts
        headers: { Subject: "[WARN] Alert email" } # subject of the alert email

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
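The `inhibit_rules` entry mutes a `warning` alert while a `critical` alert with the same `alertname`, `dev` and `instance` labels is firing. The sketch below models that matching logic in Python to make the behaviour concrete. It is a simplified illustration, not Alertmanager's code: the function names and the `HighLoad` alert are hypothetical, and a missing label is treated as an empty value.

```python
# Simplified model of an Alertmanager inhibit rule.
# Assumption: a label that is absent counts as an empty string.

def matches(labels, matcher):
    """True if every label in `matcher` has the required value."""
    return all(labels.get(name, "") == value for name, value in matcher.items())


def is_inhibited(target, firing_alerts, rule):
    """True if some other firing alert mutes `target` under `rule`."""
    if not matches(target, rule["target_match"]):
        return False
    for source in firing_alerts:
        if source is target or not matches(source, rule["source_match"]):
            continue
        if all(source.get(l, "") == target.get(l, "") for l in rule["equal"]):
            return True
    return False


rule = {
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
    "equal": ["alertname", "dev", "instance"],
}

# Hypothetical alerts used only to exercise the rule.
critical = {"alertname": "HighLoad", "instance": "10.0.34.4:9182", "severity": "critical"}
warning = {"alertname": "HighLoad", "instance": "10.0.34.4:9182", "severity": "warning"}
# Labels an InstanceDown alert would carry: no severity label is set on it.
instance_down = {"alertname": "InstanceDown", "instance": "10.0.34.4:9182",
                 "job": "windows-node-002", "team": "k8s"}

firing = [critical, warning, instance_down]
print(is_inhibited(warning, firing, rule))        # prints True
print(is_inhibited(instance_down, firing, rule))  # prints False
```

Because the `InstanceDown` rule only sets a `team` label and no `severity`, this inhibit rule would not affect it under the model above. It only matters once rules with `severity: critical` and `severity: warning` labels are added.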

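4. Verification

Among the instances monitored by Prometheus in this setup, redis and windows-node-002 were not running, so under the alerting rule above an alert for each of them should be sent to the recipient's mailbox.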
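Before checking the mailbox, the same condition can be confirmed against Prometheus directly by running the rule expression `up == 0` through its HTTP query API (`/api/v1/query`). The sketch below assumes Prometheus is reachable at `localhost:9090`. The function names are hypothetical, and the sample payload is hand-written to illustrate the response shape, not captured output.

```python
import json
import urllib.parse
import urllib.request


def fetch_down_query(base_url="http://localhost:9090"):
    """Run `up == 0` as an instant query. The base URL is an assumption."""
    url = base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": "up == 0"})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def down_instances(response_text):
    """Extract sorted (job, instance) pairs from an instant-query response."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("query failed: %s" % payload.get("error", "unknown error"))
    return sorted(
        (s["metric"].get("job", ""), s["metric"].get("instance", ""))
        for s in payload["data"]["result"]
    )


# Hand-written sample in the shape of an instant-query response (illustrative).
sample = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "windows-node-002",
                        "instance": "10.0.34.4:9182"}, "value": [0, "0"]},
            {"metric": {"__name__": "up", "job": "redis",
                        "instance": "redis-exporter-np:9121"}, "value": [0, "0"]},
        ],
    },
})

for job, instance in down_instances(sample):
    print(job, instance)
```

Against a live server, `down_instances(fetch_down_query())` would list the targets the `InstanceDown` rule is currently matching.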

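[Figure: K8s series: Prometheus email-based alerting]

The alert message received in the recipient's mailbox is shown below.

[Figure: K8s series: Prometheus email-based alerting]

Thanks to the author for sharing: http://bjbsair.com/2020-04-07/tech-info/30650.html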


