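1. Preface
First, a screenshot of the overall result:
The screenshot above shows data obtained by analyzing nginx logs with ELK and displayed through Kibana. With the logs parsed this way, the data you care about is visible at a glance. What follows is a record of how it was built.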
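2. Implementation
The previous article, "ELK 部署文檔" (ELK deployment guide), covered in detail how to configure ELK + filebeat to collect nginx logs, so installing ELK is not the focus here. The content below concentrates on writing the logstash configuration file and configuring the Kibana web interface.
Host information, restated here, is the same as in the previous article:
2.1 Defining the nginx log format
Before writing the logstash file, we need a standard input/output format. The common choice is JSON.
First, consider how to get logs in JSON format. The most direct way is to change the nginx log format itself, so that is where we start. If the log format cannot be changed to JSON, the log lines can be matched with regular expressions instead.
Add the following log format to the nginx configuration file: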
http {
    …
    log_format main_json '{"domain":"$server_name",'
    '"http_x_forwarded_for":"$http_x_forwarded_for",'
    '"time_local":"$time_iso8601",'
    '"request":"$request",'
    '"request_body":"$request_body",'
    '"status":$status,'
    '"body_bytes_sent":"$body_bytes_sent",'
    '"http_referer":"$http_referer",'
    '"upstream_response_time":"$upstream_response_time",'
    '"request_time":"$request_time",'
    '"http_user_agent":"$http_user_agent",'
    '"upstream_addr":"$upstream_addr",'
    '"upstream_status":"$upstream_status"}';
    ….
}
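This nginx log format is named main_json, and any later configuration file can reference it. Besides the built-in nginx log variables, you can also add custom variables through a configuration file, for example one that captures the user's real IP.
So we write a configuration file that defines a custom variable: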
[root@192.168.118.16 ~]#vim /etc/nginx/location.conf
#set $real_ip $remote_addr;
if ( $http_x_forwarded_for ~ "^(\d+\.\d+\.\d+\.\d+)" ) {
    set $real_ip $1;
}
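This configuration file exists only to capture the user's real IP in a variable named real_ip. Because the first set line is commented out, real_ip stays empty for requests that carry no X-Forwarded-For header. The file has to be included from the nginx configuration, and the variable is also added to the log format defined above. The complete log format is as follows: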
log_format main_json '{"domain":"$server_name",'
'"real_ip":"$real_ip",'
'"http_x_forwarded_for":"$http_x_forwarded_for",'
'"time_local":"$time_iso8601",'
'"request":"$request",'
'"request_body":"$request_body",'
'"status":$status,'
'"body_bytes_sent":"$body_bytes_sent",'
'"http_referer":"$http_referer",'
'"upstream_response_time":"$upstream_response_time",'
'"request_time":"$request_time",'
'"http_user_agent":"$http_user_agent",'
'"upstream_addr":"$upstream_addr",'
'"upstream_status":"$upstream_status"}';
Comment out this line:
#access_log /var/log/nginx/access.log main;
Next, write an nginx configuration file listening on port 9527 for testing:
[root@192.168.118.16 ~]#vim /etc/nginx/conf.d/server_9527.conf
server {
    listen 9527;
    server_name localhost;
    include location.conf;
    location / {
        root /www/9527/;
        index index.html;
        access_log /www/log/access.log main_json;
        error_log /www/log/error.log;
    }
    location /shop {
        root /www/9527;
        access_log /www/log/shop_access.log main_json;
        error_log /www/log/shop_error.log;
    }
}
[root@192.168.118.16 ~]#mkdir -p /www/{9527,log}
[root@192.168.118.16 ~]#cd /www/9527/
[root@192.168.118.16 /www/9527]#vim index.html
hello, 9527
[root@192.168.118.16 /www/9527]#mkdir -pv /www/9527/shop
[root@192.168.118.16 /www/9527]#vim /www/9527/shop/index.html
出售9527
[root@192.168.118.16 /www/9527]#nginx -t
[root@192.168.118.16 /www/9527]#nginx -s reload
With nginx configured and reloaded, test access:
[root@192.168.118.16 ~]#curl http://192.168.118.16:9527/index.html
hello, 9527
[root@192.168.118.16 ~]#curl http://192.168.118.16:9527/shop/index.html
出售9527
The pages load correctly. Check the logs:
[root@192.168.118.16 ~]#ll -tsh /www/log/
total 8.0K
4.0K -rw-r--r-- 1 root root 346 Sep 14 14:35 shop_access.log
4.0K -rw-r--r--. 1 root root 341 Sep 14 14:35 access.log
0 -rw-r--r--. 1 root root 0 Sep 14 14:35 error.log
0 -rw-r--r-- 1 root root 0 Sep 14 14:34 shop_error.log
The log files have been created. Check the log format:
[root@192.168.118.16 ~]#cat /www/log/access.log
{"domain":"localhost","real_ip":"","http_x_forwarded_for":"-","time_local":"2019-09-14T14:35:11+08:00","request":"GET /index.html HTTP/1.1","request_body":"-","status":200,"body_bytes_sent":"12","http_referer":"-","upstream_response_time":"-","request_time":"0.000","http_user_agent":"curl/7.29.0","upstream_addr":"-","upstream_status":"-"}
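The JSON format we defined is in effect, so the nginx log format is done. The next step is to ship the nginx logs to logstash with filebeat.
2.2 Shipping nginx logs with filebeat
This builds on the previous article. Edit the filebeat configuration file directly: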
[root@192.168.118.16 ~]#vim /etc/filebeat/modules.d/nginx.yml
Restart the filebeat service:
[root@192.168.118.16 ~]#systemctl restart filebeat
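2.3 Writing the logstash configuration file
After the steps above, filebeat is already shipping the nginx logs. The next question is how logstash should receive the log data. We build the configuration step by step.
First, print the log data to the screen to make sure it is correct.
Start logstash with the pipeline file nginx.conf, then visit nginx on port 9527 in a browser to generate log data.
When starting logstash you can enable automatic config reloading, so that you do not have to stop and restart it every time nginx.conf changes.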
[root@192.168.118.15 /etc/logstash/conf.d]#logstash -f nginx.conf --config.reload.automatic
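The first-pass pipeline file is not reproduced in the text. A minimal sketch, assuming it contains only the input and output sections of the nginx.conf listed later in this article (a beats input on port 5044 and rubydebug output to stdout), could look like this:

```
# Sketch of a first-pass nginx.conf (assumed, not the author's exact file):
# receive events from filebeat and print them unmodified.
input {
    beats {
        port => "5044"
    }
}

output {
    stdout {
        codec => "rubydebug"
    }
}
```

With no filter section, each event is printed exactly as filebeat sent it, which makes it easy to see which fields exist before deciding what to parse or drop.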
Take one captured event and analyze it:

{
    "@timestamp" => 2019-09-14T06:52:16.056Z,
    "@version" => "1",
    "source" => "/www/log/access.log",
    "input" => { "type" => "log" },
    "beat" => { "name" => "web-node1", "version" => "6.8.2", "hostname" => "web-node1" },
    "host" => {
        "name" => "web-node1",
        "architecture" => "x86_64",
        "id" => "4b3b32a1db0343458c4942a10c79acef",
        "os" => { "name" => "CentOS Linux", "codename" => "Core", "family" => "redhat", "platform" => "centos", "version" => "7 (Core)" },
        "containerized" => false
    },
    "log" => { "file" => { "path" => "/www/log/access.log" } },
    "tags" => [ [0] "beats_input_codec_plain_applied" ],
    "prospector" => { "type" => "log" },
    "fileset" => { "module" => "nginx", "name" => "access" },
    "offset" => 9350,
    "event" => { "dataset" => "nginx.access" },
    "message" => "{\"domain\":\"localhost\",\"real_ip\":\"\",\"http_x_forwarded_for\":\"-\",\"time_local\":\"2019-09-14T14:52:15+08:00\",\"request\":\"GET / HTTP/1.1\",\"request_body\":\"-\",\"status\":304,\"body_bytes_sent\":\"0\",\"http_referer\":\"-\",\"upstream_response_time\":\"-\",\"request_time\":\"0.000\",\"http_user_agent\":\"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36\",\"upstream_addr\":\"-\",\"upstream_status\":\"-\"}"
}
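There is a lot of data here, and some of it is unnecessary. We should keep what we need and drop the rest so the JSON is more compact.
Looking at this event, the actual nginx log data all sits inside message, while everything else is host and service metadata. But message is an escaped string that is nearly unreadable. Since the log is JSON, it can be parsed into fields.
Modify the configuration file as follows: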
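The modified file for this step is not shown. As a sketch, assuming it corresponds to the json filter in the final nginx.conf listed later (without the @timestamp removal, since @timestamp still carries logstash's own time in the output of this step), the filter section might be:

```
filter {
    # Parse the JSON string held in "message" into top-level fields.
    # remove_field only takes effect when parsing succeeds,
    # so unparseable events keep their raw message.
    json {
        source => "message"
        remove_field => ["message"]
    }
}
```

The input and output sections stay as they were, so the parsed event is still printed to the screen.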

{
    "@version" => "1",
    "host" => {
        "os" => { "name" => "CentOS Linux", "version" => "7 (Core)", "family" => "redhat", "platform" => "centos", "codename" => "Core" },
        "name" => "web-node1",
        "id" => "4b3b32a1db0343458c4942a10c79acef",
        "architecture" => "x86_64",
        "containerized" => false
    },
    "upstream_response_time" => "-",
    "beat" => { "name" => "web-node1", "version" => "6.8.2", "hostname" => "web-node1" },
    "domain" => "localhost",
    "request_body" => "-",
    "log" => { "file" => { "path" => "/www/log/access.log" } },
    "http_user_agent" => "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36",
    "prospector" => { "type" => "log" },
    "http_referer" => "-",
    "real_ip" => "",
    "fileset" => { "module" => "nginx", "name" => "access" },
    "upstream_status" => "-",
    "body_bytes_sent" => "0",
    "@timestamp" => 2019-09-14T07:03:36.087Z,
    "http_x_forwarded_for" => "-",
    "status" => 304,
    "source" => "/www/log/access.log",
    "input" => { "type" => "log" },
    "time_local" => "2019-09-14T15:03:28+08:00",
    "request_time" => "0.000",
    "upstream_addr" => "-",
    "tags" => [ [0] "beats_input_codec_plain_applied" ],
    "offset" => 11066,
    "event" => { "dataset" => "nginx.access" },
    "request" => "GET / HTTP/1.1"
}
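Compare the two captured events. In the second one, message has been removed, and every field that was inside message now stands on its own. The benefits:
(1) The log information is clearer, and any single field can be located precisely;
(2) The data is ready for querying and filtering once it is stored in elasticsearch.
In effect, this step splits the original message into separate columns.
The event above contains two timestamps:
@timestamp - UTC (Greenwich) time - the time logstash received the log
time_local - UTC+8 time - the time nginx recorded the log
The minutes and seconds of the two do not match. Later filtering uses @timestamp, that is, logstash's time, which would make the nginx log times inaccurate. The two therefore need to be made consistent.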
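The change made at this step is not shown. A sketch, assuming it matches the json and date filters in the final nginx.conf listed later, where time_local is parsed and written into @timestamp:

```
filter {
    json {
        source => "message"
        remove_field => ["message", "@timestamp"]
    }
    # Parse nginx's ISO 8601 time (e.g. 2019-09-14T15:14:46+08:00)
    # and store it as the event's @timestamp.
    date {
        match => ["time_local", "ISO8601"]
        target => "@timestamp"
    }
}
```

Because @timestamp is stored in UTC, a time_local of 15:14:46+08:00 appears as 07:14:46Z: the hour differs by the eight-hour offset while the minutes and seconds agree.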

{
    "@version" => "1",
    "host" => {
        "name" => "web-node1",
        "os" => { "name" => "CentOS Linux", "version" => "7 (Core)", "family" => "redhat", "platform" => "centos", "codename" => "Core" },
        "id" => "4b3b32a1db0343458c4942a10c79acef",
        "architecture" => "x86_64",
        "containerized" => false
    },
    "upstream_response_time" => "-",
    "beat" => { "name" => "web-node1", "version" => "6.8.2", "hostname" => "web-node1" },
    "domain" => "localhost",
    "request_body" => "-",
    "log" => { "file" => { "path" => "/www/log/access.log" } },
    "http_user_agent" => "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36",
    "prospector" => { "type" => "log" },
    "http_referer" => "-",
    "real_ip" => "",
    "fileset" => { "module" => "nginx", "name" => "access" },
    "upstream_status" => "-",
    "body_bytes_sent" => "0",
    "status" => 304,
    "http_x_forwarded_for" => "-",
    "@timestamp" => 2019-09-14T07:14:46.000Z,
    "source" => "/www/log/access.log",
    "input" => { "type" => "log" },
    "time_local" => "2019-09-14T15:14:46+08:00",
    "request_time" => "0.000",
    "upstream_addr" => "-",
    "tags" => [ [0] "beats_input_codec_plain_applied" ],
    "offset" => 11495,
    "event" => { "dataset" => "nginx.access" },
    "request" => "GET / HTTP/1.1"
}
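Now the minutes and seconds of the two timestamps match exactly. Next, remove some unnecessary fields and rename a few others. Modify the configuration file as follows: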
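The configuration for this step is not shown. The sketch below is based on the mutate block of the final nginx.conf listed later. The exact field list used at this intermediate step is an assumption; "request" is left out of remove_field here because the sample output of this step still contains it.

```
filter {
    # ... json and date filters as in the previous step ...
    mutate {
        # Drop filebeat and host metadata that is not needed for analysis.
        remove_field => ["host","event","input","offset","prospector","source","type","tags","beat"]
        # Shorten some field names.
        rename => {
            "http_user_agent" => "agent"
            "upstream_response_time" => "response_time"
            "http_x_forwarded_for" => "x_forwarded_for"
        }
        # Turn comma-separated values into arrays.
        split => {
            "x_forwarded_for" => ", "
            "response_time" => ", "
        }
    }
}
```

Within a single mutate block, rename runs before split, so the split targets use the new field names.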

{
    "@version" => "1",
    "agent" => "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36",
    "domain" => "localhost",
    "request_body" => "-",
    "log" => { "file" => { "path" => "/www/log/access.log" } },
    "http_referer" => "-",
    "response_time" => [ [0] "-" ],
    "real_ip" => "",
    "fileset" => { "module" => "nginx", "name" => "access" },
    "upstream_status" => "-",
    "body_bytes_sent" => "0",
    "status" => 304,
    "@timestamp" => 2019-09-14T07:22:14.000Z,
    "time_local" => "2019-09-14T15:22:14+08:00",
    "request_time" => "0.000",
    "upstream_addr" => "-",
    "x_forwarded_for" => [ [0] "-" ],
    "request" => "GET / HTTP/1.1"
}
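After renaming fields and removing the useless ones, the JSON is much more compact, and storing it in elasticsearch takes correspondingly less space.
The data could now be written to elasticsearch. So far, however, everything has dealt with access.log and ignored the format of error.log, because the nginx error.log format cannot be customized.
Request a nonexistent URI and look at the data that comes through: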

[WARN ] 2019-09-14 15:25:34.300 [[main]>worker3] json - Error parsing json {:source=>"message", :raw=>"2019/09/14 15:25:29 [error] 2122#0: *33 open() \"/www/9527/123.html\" failed (2: No such file or directory), client: 192.168.118.41, server: localhost, request: \"GET /123.html HTTP/1.1\", host: \"192.168.118.16:9527\"", :exception=>#<LogStash::Json::ParserError: Unexpected character ('/' (code 47)): Expected space separating root-level values at [Source: (byte[])"2019/09/14 15:25:29 [error] 2122#0: *33 open() "/www/9527/123.html" failed (2: No such file or directory), client: 192.168.118.41, server: localhost, request: "GET /123.html HTTP/1.1", host: "192.168.118.16:9527""; line: 1, column: 6]>}
{
    "@timestamp" => 2019-09-14T07:25:33.173Z,
    "@version" => "1",
    "log" => { "file" => { "path" => "/www/log/error.log" } },
    "fileset" => { "module" => "nginx", "name" => "error" },
    "message" => "2019/09/14 15:25:29 [error] 2122#0: *33 open() \"/www/9527/123.html\" failed (2: No such file or directory), client: 192.168.118.41, server: localhost, request: \"GET /123.html HTTP/1.1\", host: \"192.168.118.16:9527\""
}
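This is what the data from error.log looks like. That is another problem: the point of ELK is first to analyze data and second to troubleshoot quickly, and if it cannot handle error logs it is of limited use.
The format above is messy again, with the whole nginx error entry sitting in message. Although the nginx error log format cannot be customized, logstash can use regular expressions (grok) to break it into structured fields. Before doing that, note that access.log and error.log are two different formats and cannot be matched the same way. So how do we tell whether an event came from access.log or error.log?
The syntax that comes to mind is:
if … {
    # handle access.log
} else if … {
    # handle error.log
}
Right, the syntax is fine, but what condition should be tested? Looking at the events above, each one carries a field like this:
access.log event:
"fileset" => { "module" => "nginx", "name" => "access" }
error.log event:
"fileset" => { "module" => "nginx", "name" => "error" }
That gives us a condition to test. Write it according to logstash configuration syntax: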
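The snippet written at this point is not shown. A sketch of the conditional, assuming the same two tests on [fileset][name] that appear in the full nginx.conf listed in this article (the branch bodies here are placeholder comments only):

```
filter {
    if [fileset][name] == "access" {
        # JSON-formatted access.log events: json, date, mutate, ...
    } else if [fileset][name] == "error" {
        # Free-form error.log events: grok, date, mutate, ...
    }
}
```

The full file uses two independent if blocks rather than else if. Since an event's fileset name can only be one of the two values, both forms route events the same way.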
So far, the logstash configuration for collecting and filtering nginx logs is as follows:
Configuration file name: nginx.conf

input {
    beats {
        port => "5044"
    }
}

filter {
    if [fileset][name] == "access" {
        json {
            source => "message"
            remove_field => "message"
            remove_field => "@timestamp"
        }
        date {
            match => ["time_local", "ISO8601"]
            target => "@timestamp"
        }
        grok {
            match => { "request" => "%{WORD:method} (?<url>.* )" }
        }
        mutate {
            remove_field => ["host","event","input","request","offset","prospector","source","type","tags","beat"]
            rename => {"http_user_agent" => "agent"}
            rename => {"upstream_response_time" => "response_time"}
            rename => {"http_x_forwarded_for" => "x_forwarded_for"}
            split => {"x_forwarded_for" => ", "}
            split => {"response_time" => ", "}
        }
        geoip {
            source => "real_ip"
        }
    }
    if [fileset][name] == "error" {
        mutate {
            remove_field => ["@timestamp"]
        }
        grok {
            match => {"message" => "(?<datetime>%{YEAR}[./-]%{MONTHNUM}[./-]%{MONTHDAY}[- ]%{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER}: %{GREEDYDATA:errormessage}(?:, client: (?<real_ip>%{IP}|%{HOSTNAME}))(?:, server: %{IPORHOST:domain}?)(?:, request: %{QS:request})?(?:, upstream: (?<upstream>\"%{URI}\"|%{QS}))?(?:, host: %{QS:request_host})?(?:, referrer: \"%{URI:referrer}\")?"}
        }
        date {
            match => ["datetime", "yyyy/MM/dd HH:mm:ss"]
            target => "@timestamp"
        }
        mutate {
            remove_field => ["message","request","http_referer","host","event","input","offset","prospector","source","type","tags","beat"]
        }
    }
}

output {
    stdout {
        codec => "rubydebug"
    }
}
Test with access.log data:

{
    "@version" => "1",
    "agent" => "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36",
    "domain" => "localhost",
    "request_body" => "-",
    "log" => { "file" => { "path" => "/www/log/access.log" } },
    "http_referer" => "-",
    "response_time" => [ [0] "-" ],
    "real_ip" => "",
    "fileset" => { "module" => "nginx", "name" => "access" },
    "upstream_status" => "-",
    "body_bytes_sent" => "0",
    "status" => 304,
    "@timestamp" => 2019-09-14T07:39:50.000Z,
    "time_local" => "2019-09-14T15:39:50+08:00",
    "request_time" => "0.000",
    "upstream_addr" => "-",
    "x_forwarded_for" => [ [0] "-" ],
    "request" => "GET / HTTP/1.1"
}
Test with a request that triggers an error. Note that the event captured below is the access.log record of the failing request (fileset name "access", status 404), not the parsed error.log event itself:

{
    "@version" => "1",
    "agent" => "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36",
    "domain" => "localhost",
    "request_body" => "-",
    "log" => { "file" => { "path" => "/www/log/access.log" } },
    "http_referer" => "-",
    "response_time" => [ [0] "-" ],
    "real_ip" => "",
    "fileset" => { "module" => "nginx", "name" => "access" },
    "upstream_status" => "-",
    "body_bytes_sent" => "571",
    "status" => 404,
    "@timestamp" => 2019-09-14T07:41:48.000Z,
    "time_local" => "2019-09-14T15:41:48+08:00",
    "request_time" => "0.000",
    "upstream_addr" => "-",
    "x_forwarded_for" => [ [0] "-" ],
    "request" => "GET /123.html HTTP/1.1"
}
That works: data in both formats is now coming through. Next, write the data to elasticsearch.
At this point the logstash configuration file nginx.conf is as follows:

input {
    beats {
        port => "5044"
    }
}

filter {
    if [fileset][name] == "access" {
        json {
            source => "message"
            remove_field => "message"
            remove_field => "@timestamp"
        }
        date {
            match => ["time_local", "ISO8601"]
            target => "@timestamp"
        }
        grok {
            match => { "request" => "%{WORD:method} (?<url>.* )" }
        }
        mutate {
            remove_field => ["host","event","input","request","offset","prospector","source","type","tags","beat"]
            rename => {"http_user_agent" => "agent"}
            rename => {"upstream_response_time" => "response_time"}
            rename => {"http_x_forwarded_for" => "x_forwarded_for"}
            split => {"x_forwarded_for" => ", "}
            split => {"response_time" => ", "}
        }
        geoip {
            source => "real_ip"
        }
    }
    if [fileset][name] == "error" {
        mutate {
            remove_field => ["@timestamp"]
        }
        grok {
            match => {"message" => "(?<datetime>%{YEAR}[./-]%{MONTHNUM}[./-]%{MONTHDAY}[- ]%{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER}: %{GREEDYDATA:errormessage}(?:, client: (?<real_ip>%{IP}|%{HOSTNAME}))(?:, server: %{IPORHOST:domain}?)(?:, request: %{QS:request})?(?:, upstream: (?<upstream>\"%{URI}\"|%{QS}))?(?:, host: %{QS:request_host})?(?:, referrer: \"%{URI:referrer}\")?"}
        }
        date {
            match => ["datetime", "yyyy/MM/dd HH:mm:ss"]
            target => "@timestamp"
        }
        mutate {
            remove_field => ["message","request","http_referer","host","event","input","offset","prospector","source","type","tags","beat"]
        }
    }
}

#output {
#    stdout {
#        codec => "rubydebug"
#    }
#}

output {
    elasticsearch {
        hosts => ["192.168.118.14"]
        index => "logstash-nginx-%{+YYYY.MM.dd}"
    }
}
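This is the final version of the nginx configuration for this article.
Visit nginx on port 9527 several times in a browser, then switch to elasticsearch-head to check whether the index was created.
OK, today's index has been created. Check the data.
The data looks fine too. Switch to Kibana and add the index pattern.
OK, the data is now stored in elasticsearch and displayed through Kibana, but analyzing and viewing it more clearly takes some more work in Kibana.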
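2.4 Displaying the data in Kibana
Start with Discover. To get a clear view of the logs every time you open it, make the following settings:
With these two settings in place, each time you log in you only need to click Open and pick the saved view to see clean log data.
Next, build the dashboard shown at the top of this article.
Charts need data behind them, and because this is a test environment there are no real user visits. We therefore need to generate a batch of fake traffic.
The method is to copy a line from access.log and change real_ip to a public IP.
Once the fake data has been added, configure the charts. Click Visualize.
First: top 5 provincial capital cities by visits (pie chart)
Choose the pie chart, then select the logstash-nginx-* index.
When finished, click Save.
Second: visit distribution map (coordinate map)
When finished, click Save.
Third: top 5 domains (data table)
When finished, click Save.
Fourth: top 5 backend services (data table)
When finished, click Save.
Fifth: top 5 URIs (data table)
When finished, click Save.
Sixth: top 5 real_ip values (horizontal bar chart)
When finished, click Save.
Seventh: top 5 HTTP status codes (pie chart)
When finished, click Save.
That makes seven visualizations under Visualize. Open Dashboard and add them.
Then arrange the charts, and the job is done.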