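A while ago a colleague in our IT department asked me to help configure smokeping's alerting. The earlier setup was wrong and the alerts did not work properly. It is now debugged and working, so here is a brief record of the key configuration points.
The key configuration items are:
- Define the alert rules and pipe the alert information to a custom alert script
- Reference the defined alert rules in the host definitions
- Have the custom alert script parse and process the alert content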
Define the alert rules and pipe the alert information to a custom alert script
This is configured in the Alerts section of the config file.
    # /usr/local/smokeping/etc/config
    *** Alerts ***
    # Pipe alert information to our own alert script
    to = |/usr/local/smokeping/bin/send_alert.sh
    from = a@b.com

    # Alert rule definitions
    +hostdown
    type = loss
    # in percent
    pattern = ==0%,==0%,==0%,==U
    comment = Peer not responding

    +bigloss
    type = loss
    # in percent
    pattern = ==0%,==0%,==0%,==0%,>20%,>20%,>20%
    comment = Packet loss above 20% in 3 consecutive samples

    +lossdetect
    type = loss
    # in percent
    pattern = ==0%,==0%,==0%,==0%,>0%,>0%,>0%
    comment = Packet loss in 3 consecutive samples

    +someloss
    type = loss
    # in percent
    pattern = >0%,*12*,>0%,*12*,>0%
    comment = Intermittent packet loss

    +rttdetect
    type = rtt
    # in milliseconds
    pattern = <100,<100,<100,<100,<100,<150,>150,>150,>150
    comment = Latency rose above 150 ms in 3 consecutive samples
The Alert section lets you set up loss and RTT pattern detectors. After each round of polling, SmokePing will examine its data and determine which detectors match. Detectors are enabled per target and get inherited by the target's children.
Detectors are not just simple thresholds which go off at first sight of a problem. They are configurable to detect special loss or RTT patterns. They let you look at a number of past readings to make a more educated decision on what kind of alert should be sent, or if an alert should be sent at all.
The patterns are numbers prefixed with an operator indicating the type of comparison required for a match.
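To make the reading of a pattern concrete, here is a minimal sketch that hand-codes the `bigloss` rule `==0%,==0%,==0%,==0%,>20%,>20%,>20%` as a shell check. It is an illustration only, not how SmokePing evaluates patterns internally; `matches_bigloss` is a hypothetical helper, and it assumes seven whole-number loss percentages given oldest first.

```bash
#!/bin/bash
# matches_bigloss: hypothetical helper, not part of SmokePing.
# Takes 7 integer loss percentages, oldest reading first.
# Returns 0 when the first 4 readings are 0 and the last 3 are above 20.
matches_bigloss() {
    [ "$#" -eq 7 ] || return 2   # wrong number of readings
    local i=0 v
    for v in "$@"; do
        i=$((i + 1))
        if [ "$i" -le 4 ]; then
            [ "$v" -eq 0 ] || return 1    # corresponds to ==0%
        else
            [ "$v" -gt 20 ] || return 1   # corresponds to >20%
        fi
    done
    return 0
}
```

For example, `matches_bigloss 0 0 0 0 30 25 40` succeeds, while `matches_bigloss 0 0 0 0 30 10 40` fails because the sixth reading is not above 20.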
For the alert rule reference, see the Alerts section of the official configuration documentation:
http://oss.oetiker.ch/smokeping/doc/smokeping_config.en.html
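Reference the alert rules in the host definitions
Syntax:
    alerts = rule1,rule2,rule3
As you probably know, the smokeping config file expresses hierarchy by the number of "+" signs, so alert rules can be referenced at different levels. A setting at an upper level is inherited by the levels below it and can be overridden there (the inner level takes precedence).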
    + xxoo
    menu = xxoo-top
    title = xxoo-All-Network-Monitors
    host = /xxoo/net-A /xxoo/net-B /xxoo/net-C
    # These rules apply to everything under /xxoo
    alerts = hostdown,bigloss,lossdetect,someloss,rttdetect

    ++net-A
    menu = Menu-Name-A
    title = Title-Name-A
    host = 10.10.10.101
    # These rules apply to /xxoo/net-A only
    alerts = hostdown,bigloss,lossdetect

    ++net-B
    menu = Google-DNS
    title = To-Google-DNS
    host = 8.8.8.8
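Have the custom alert script parse and process the alert content
When an alert fires, smokeping passes 5 or 6 arguments to the alert receiver (here, our custom alert script), in this order: name-of-alert, target, loss-pattern, rtt-pattern, hostname, [raise].
So all the alert script has to do is parse and process these arguments.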
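As a minimal sketch of that parsing step, the positional arguments can be mapped to named variables first. `parse_alert_args` and `alert_summary` are hypothetical helpers, not part of smokeping. The sixth argument is treated as optional here, on the assumption that it is only passed in some configurations.

```bash
#!/bin/bash
# parse_alert_args: hypothetical helper that names the positional arguments
# in the order smokeping passes them.
parse_alert_args() {
    alertname=$1
    target=$2
    losspattern=$3
    rttpattern=$4
    hostname=$5
    # Optional sixth argument; fall back to "unknown" when it is absent.
    raise=${6:-unknown}
}

# alert_summary: hypothetical helper that builds a one-line summary,
# e.g. for a log file or a short notification body.
alert_summary() {
    parse_alert_args "$@"
    printf '%s on %s (%s) loss=[%s] rtt=[%s] raise=%s\n' \
        "$alertname" "$target" "$hostname" "$losspattern" "$rttpattern" "$raise"
}
```

Quoting `"$@"` matters here, because the loss and RTT patterns contain spaces and would otherwise be split into several arguments.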
Sample alert script:
    [root@smokeping ~]# cat /usr/local/smokeping/bin/send_alert.sh
    #!/bin/bash
    #########################################################
    # Script to email a ping report on alert from Smokeping #
    #########################################################

    # Parse the arguments passed by smokeping
    alertname=$1
    target=$2
    losspattern=$3
    rtt=$4
    hostname=$5

    # Custom variables
    email="xxx@yyy.com"
    phone="12345678901"
    smokename="AlertName"
    smokeping_mail_content=/tmp/smokeping_mail_content
    #smokeping_sms_content=/tmp/smokeping_sms_content

    # Log every invocation with all its arguments, for statistics and troubleshooting.
    # invoke.log is a relative path, so it lands in the directory the script is run from.
    echo "$(date +%F-%T)" >> invoke.log
    echo "$@" >> invoke.log

    # Decide whether the network has recovered
    if [ "$losspattern" = "loss: 0%" ]; then
        subject="Clear-${smokename}-Alert: $target host: ${hostname}"
    else
        subject="${smokename}Alert: ${target} - ${hostname}"
    fi

    # generate mail content
    # Empty the file, then regenerate the mail content
    > "${smokeping_mail_content}"
    echo "Name of Alert: $alertname" | tee -a "${smokeping_mail_content}"
    echo "Target: $target" | tee -a "${smokeping_mail_content}"
    echo "Loss Pattern: $losspattern" | tee -a "${smokeping_mail_content}"
    echo "RTT Pattern: $rtt" | tee -a "${smokeping_mail_content}"
    echo "Hostname: $hostname" | tee -a "${smokeping_mail_content}"
    echo "" | tee -a "${smokeping_mail_content}"
    echo "Ping Report:" | tee -a "${smokeping_mail_content}"
    ping -c 4 -i 0.5 "${hostname}" | tee -a "${smokeping_mail_content}"

    # send mail
    # The if check below is effectively redundant: whenever the script is invoked,
    # ${smokeping_mail_content} always has content.
    # --data-urlencode keeps spaces, newlines and "&" in the values from breaking the form body.
    if [ -s "${smokeping_mail_content}" ]; then
        content=$(cat "${smokeping_mail_content}")
        curl http://notice.api.ourcompany.com/send_mail \
            --data-urlencode "receiver=${email}" \
            --data-urlencode "subject=${subject}" \
            --data-urlencode "content=${content}"
    fi

    # send sms
    # If alertname is one of the more severe rules (hostdown, bigloss, rttdetect),
    # also call the SMS API.
    # Note: keep the SMS text short, since every message costs money.
    judge_alert_type=$(echo "${alertname}" | grep -Ec "hostdown|bigloss|rttdetect")
    if [ "${judge_alert_type}" -eq 1 ]; then
        curl http://notice.api.ourcompany.com/send_sms \
            --data-urlencode "receiver=${phone}" \
            --data-urlencode "subject=${subject}" \
            --data-urlencode "content=${alertname} on ${hostname}"
    fi
The script above sends the alerts through our company's notification API. Adjust this part to suit your own environment:
http://notice.api.ourcompany.com/send_mail
http://notice.api.ourcompany.com/send_sms
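The script only sends an SMS for the more severe rules and is meant to keep the SMS text short. A minimal sketch of those two decisions as standalone functions might look like the following. `is_severe_alert` and `truncate_sms` are hypothetical helpers, and the default limit of 70 is an assumed value, not one taken from any SMS provider.

```bash
#!/bin/bash
# is_severe_alert: hypothetical helper; succeeds only for the rule names
# that the script treats as severe. Unlike a grep on the name, this is an
# exact match, so a rule called "mybigloss" would not count.
is_severe_alert() {
    case "$1" in
        hostdown|bigloss|rttdetect) return 0 ;;
        *) return 1 ;;
    esac
}

# truncate_sms: hypothetical helper; prints at most $2 characters of $1.
# The default of 70 is an assumption. Depending on shell and locale, the
# limit may be counted in bytes rather than characters for multibyte text.
truncate_sms() {
    printf "%.${2:-70}s" "$1"
}
```

A caller could then write `is_severe_alert "$alertname" && truncate_sms "${alertname} on ${hostname}"` to produce the SMS body before handing it to the notification API.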
Alert results
Email
SMS