References: https://www.bbsmax.com/A/gGdXbgXmJ4/
https://www.deathearth.com/333.html
https://www.cnblogs.com/amyzhu/p/10193557.html
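Once an ELK stack is running, how can the collected data be used for alerting? One option is the Sentinl plugin for Kibana.
I. Installation environment
1. System environment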

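2. Software versions
Java 1.8.0_171, Elasticsearch 6.2.4, Kibana 6.2.4
II. Installation
1. Install ELK
Omitted.
2. Install the Sentinl plugin
Download the plugin release that matches your ELK version. This walkthrough uses 6.2.4.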
https://github.com/sirensolutions/sentinl/releases/
/usr/share/kibana/bin/kibana-plugin install file:///nas/nas/softs/elk/6.2.4/sentinl-v6.2.4-1.zip
After installing, check that the plugin is present.
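The check above can also be scripted. The following is a minimal sketch, assuming that `kibana-plugin list` prints one `name@version` entry per line; the helper names `parse_plugin_list` and `installed_plugins` are hypothetical, and the CLI path is the one used in the install command.

```python
import subprocess

# Path taken from the install command above.
KIBANA_PLUGIN = "/usr/share/kibana/bin/kibana-plugin"


def parse_plugin_list(output: str) -> dict:
    """Parse plugin CLI output, assumed to be one `name@version` per line."""
    plugins = {}
    for line in output.splitlines():
        line = line.strip()
        if "@" not in line:
            continue  # skip blank lines and anything that is not name@version
        name, _, version = line.partition("@")
        plugins[name] = version
    return plugins


def installed_plugins() -> dict:
    """Run `kibana-plugin list` and return {plugin name: version}."""
    result = subprocess.run(
        [KIBANA_PLUGIN, "list"], capture_output=True, text=True, check=True
    )
    return parse_plugin_list(result.stdout)
```

Calling `installed_plugins().get("sentinl")` on the Kibana host should then return the installed version, or `None` if the plugin is missing.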

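Configure email: edit the Kibana configuration file /etc/kibana/kibana.yml and append the following at the end.
sentinl:
  settings:
    email:
      active: true
      user: xxx@xxx.com # mailbox address
      password: xxxx # mailbox password or authorization code
      host: smtp.exmail.qq.com # outgoing mail server
      ssl: true # adjust as needed; if set to false, change port to 25. Alibaba Cloud blocks port 25, so SSL is required there
      port: 465
    report:
      active: true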
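Before restarting Kibana, it can help to confirm that the mailbox accepts these SMTP settings. The following is a rough sketch using Python's standard `smtplib`, not part of Sentinl itself; the function names are hypothetical, and the port rule (465 with SSL, 25 without) simply mirrors the comment in the configuration.

```python
import smtplib
from email.message import EmailMessage


def default_port(use_ssl: bool) -> int:
    """Port choice described in the config comment: 465 with SSL, 25 without."""
    return 465 if use_ssl else 25


def build_test_message(sender: str, recipient: str) -> EmailMessage:
    """Build a small plain-text message for the connectivity test."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Sentinl SMTP test"
    msg.set_content("If you can read this, the SMTP settings work.")
    return msg


def send_test_mail(host, user, password, recipient, use_ssl=True, timeout=10):
    """Log in with the kibana.yml credentials and send one test message."""
    client_cls = smtplib.SMTP_SSL if use_ssl else smtplib.SMTP
    with client_cls(host, default_port(use_ssl), timeout=timeout) as client:
        client.login(user, password)
        client.send_message(build_test_message(user, recipient))
```

For example, `send_test_mail("smtp.exmail.qq.com", "xxx@xxx.com", "xxxx", "xxx@xxx.com")` uses the same placeholder values as the configuration; an authentication or connection error here points to the mail settings rather than to Sentinl.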

Restart Kibana:
systemctl restart kibana
Open elasticsearch-head; a new index named watcher_alarms should now exist.
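Without the head plugin, the same check can be made through the Elasticsearch `_cat/indices` API. This is a minimal sketch, assuming Elasticsearch listens on `http://localhost:9200` without authentication and that the alarm index name starts with `watcher_alarms`; both are assumptions, as are the helper names.

```python
import json
import urllib.request


def alarm_indices(cat_rows, prefix="watcher_alarms"):
    """Return the names of indices whose name starts with the given prefix."""
    return sorted(row["index"] for row in cat_rows if row["index"].startswith(prefix))


def fetch_cat_indices(es_url="http://localhost:9200"):
    """Fetch the index listing as parsed JSON (a list of dicts with an 'index' key)."""
    url = f"{es_url}/_cat/indices?format=json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

`alarm_indices(fetch_cat_indices())` returning an empty list after the restart would suggest the plugin did not initialize.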

The Kibana menu now shows a Sentinl entry.
Create a new watcher.




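Once saved, the watcher can be edited or tested.

Click run to test it.

View the alarm records.

Use the Advanced tab to define the query and the alert condition. A fairly complete configuration looks like this: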
{
  "actions": {
    "Email_alarm_773206d5-2977-465e-882d-762a7d69fe68": {
      "name": "Email alarm",
      "throttle_period": "15m",
      "email": {
        "priority": "low",
        "stateless": false,
        "body": "Find error log {{payload.hits.total}}", # email body; reports how many log entries matched the keyword
        "to": "xxx@xxx.com", # recipient, set as needed
        "from": "xxx@xxx.com" # sender; the mailbox configured in the Kibana configuration file
      }
    }
  },
  "input": {
    "search": {
      "request": {
        "index": [
          "system-log-*" # index name
        ],
        "body": {
          "query": {
            "bool": {
              "must": [
                {
                  "range": {
                    "@timestamp": { # time window
                      "gte": "now-5m/m", # from five minutes ago (inclusive)
                      "lte": "now/m", # up to now (inclusive)
                      "format": "epoch_millis"
                    }
                  }
                }
              ],
              "filter": [
                {
                  "multi_match": {
                    "type": "best_fields",
                    "query": "error", # match log entries containing the keyword error
                    "lenient": true
                  }
                }
              ]
            }
          },
          "size": 0,
          "aggs": {
            "dateAgg": {
              "date_histogram": {
                "field": "@timestamp",
                "time_zone": "Asia/Shanghai",
                "interval": "1m",
                "min_doc_count": 1
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "script": "payload.hits.total>1" # fire the alert action when more than one entry matched
    }
  },
  "trigger": {
    "schedule": {
      "later": "every 5 minutes" # run every five minutes
    }
  },
  "disable": false,
  "report": false,
  "title": "system-log error log monitoring alert",
  "wizard": {},
  "save_payload": false,
  "spy": false,
  "impersonate": false
}
PS: the comments were added to make the listing easier to follow. The actual configuration must not contain comments.
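Since the annotated listing is not valid JSON, the comments have to be removed before pasting it into the watcher editor. One way to do that is sketched below; `strip_hash_comments` is a hypothetical helper, and it assumes each comment starts with `#` outside of any string literal and runs to the end of the line.

```python
import json


def strip_hash_comments(text: str) -> str:
    """Remove `#` comments that sit outside JSON string literals."""
    cleaned = []
    for line in text.splitlines():
        in_string = False  # are we currently inside a "..." literal?
        escaped = False    # was the previous character a backslash in a string?
        cut = len(line)
        for i, ch in enumerate(line):
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = in_string
            elif ch == '"':
                in_string = not in_string
            elif ch == "#" and not in_string:
                cut = i  # comment starts here; drop the rest of the line
                break
        cleaned.append(line[:cut].rstrip())
    return "\n".join(cleaned)


def load_annotated_json(text: str):
    """Strip comments, then parse; raises ValueError if the result is not valid JSON."""
    return json.loads(strip_hash_comments(text))
```

Parsing the result with `json.loads` doubles as a syntax check, so a stray comma or missing brace shows up before the watcher is saved.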
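This watcher checks whether the keyword error appeared in the monitored log during the last five minutes. If it did, and the match count is greater than 1, an email alert is sent.
Redirecting the word error into the monitored log a few times is enough to trigger the alert.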
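The query and condition can be mirrored outside Kibana to reason about when the alert fires. This is only an illustrative sketch: the function names are hypothetical, the aggregation is left out, and `payload` stands for the Elasticsearch 6.x search response, where `hits.total` is an integer.

```python
def error_query_body(keyword="error", window="5m"):
    """Search body equivalent to the watcher input (aggregation omitted)."""
    return {
        "size": 0,
        "query": {
            "bool": {
                "must": [
                    {"range": {"@timestamp": {"gte": f"now-{window}/m", "lte": "now/m"}}}
                ],
                "filter": [
                    {"multi_match": {"type": "best_fields", "query": keyword, "lenient": True}}
                ],
            }
        },
    }


def should_alert(payload, threshold=1):
    """Mirror of the watcher condition `payload.hits.total>1`."""
    return payload["hits"]["total"] > threshold
```

Note that the comparison is strict: a single matching entry in the window does not trigger the action, two or more do.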
The email looks like this:

Next, a watcher configuration that alerts on CPU usage:
{
  "actions": {
    "HTML_email_alarm_5fbf1925-81fc-4d73-a37e-b6ac8b9bfc06": {
      "name": "HTML email alarm",
      "throttle_period": "1m",
      "email_html": {
        "html": "CPU usage exceeded 10% {{ payload.hits.total }} time(s) in the last five minutes",
        "priority": "low",
        "stateless": false,
        "to": "xxx@xxx.com",
        "from": "xxx@xxx.com"
      }
    }
  },
  "input": {
    "search": {
      "request": {
        "index": [
          "metricbeat-*"
        ],
        "body": {
          "query": {
            "bool": {
              "filter": [
                {
                  "range": {
                    "system.cpu.total.pct": {
                      "gt": 0.1
                    }
                  }
                }
              ],
              "must": [
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-5m/m",
                      "lte": "now/m",
                      "format": "epoch_millis"
                    }
                  }
                }
              ]
            }
          },
          "size": 0,
          "aggs": {
            "dateAgg": {
              "date_histogram": {
                "field": "@timestamp",
                "time_zone": "Europe/Amsterdam",
                "interval": "1m",
                "min_doc_count": 1
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "script": "payload.hits.total >=1"
    }
  },
  "trigger": {
    "schedule": {
      "later": "every 5 minutes"
    }
  },
  "disable": false,
  "report": false,
  "title": "metricber",
  "wizard": {},
  "save_payload": true,
  "spy": false,
  "impersonate": false
}
This watcher alerts whenever CPU usage exceeds 10%. system.cpu.total.pct is a floating-point value, so comparing it against 0.1 means "greater than 10%".
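The percent-to-fraction conversion is easy to get wrong when changing the threshold, so here is a small sketch that derives the range filter from a percentage. The helper names are hypothetical; the field name is the one used in the watcher.

```python
def cpu_range_filter(percent: float, field: str = "system.cpu.total.pct") -> dict:
    """Build the range filter for 'CPU usage above <percent>%'."""
    return {"range": {field: {"gt": percent / 100}}}


def exceeds(pct_value: float, percent: float) -> bool:
    """True if a stored fraction such as 0.25 is above the given percentage."""
    return pct_value > percent / 100
```

For instance, `cpu_range_filter(10)` reproduces the filter in the watcher above, and `cpu_range_filter(80)` would raise the threshold to 80%.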
