Filebeat

Reposted article, 2020-04-20 00:38, ELK

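Introduction to Filebeat

Filebeat is a lightweight shipper for local log files. It monitors log directories or specific log files (tail file) and forwards them to Elasticsearch or Logstash for indexing, or to other outputs such as Kafka. It ships with built-in modules (auditd, Apache, Nginx, System, and MySQL) that simplify collecting, parsing, and visualizing common log formats with a single command.

Official documentation: https://www.elastic.co/guide/en/beats/filebeat/current/index.html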

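Deployment and running
Download from https://www.elastic.co/downloads/beats (or use the installation package provided with the course materials, version filebeat-6.5.4). The transcripts below were recorded with filebeat-6.2.3.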

tar -zvxf filebeat-6.2.3-linux-x86_64.tar.gz
cd filebeat-6.2.3-linux-x86_64
./filebeat -e -c wgr.yml

[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]# cat wgr.yml
filebeat.prospectors:
- type: stdin
  enabled: true
setup.template.settings:
  index.number_of_shards: 3
output.console:
  pretty: true
  enabled: true
[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]#

Reading files

[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]# cat wgr.yml
filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /root/*.log
setup.template.settings:
  index.number_of_shards: 3
output.console:
  pretty: true
  enabled: true
[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]#

Custom fields

[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]# cat wgr.yml
filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /root/*.log
  tags: ["web"]
  fields:
    from: topcheer
  fields_under_root: true
setup.template.settings:
  index.number_of_shards: 3
output.console:
  pretty: true
  enabled: true
[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]#

Output to Elasticsearch

2020-04-19T22:33:43.424+0800    INFO    crawler/crawler.go:48   Loading Prospectors: 1
2020-04-19T22:33:43.425+0800    INFO    log/prospector.go:111   Configured paths: [/root/*.log]
2020-04-19T22:33:43.425+0800    INFO    crawler/crawler.go:82   Loading and starting Prospectors completed. Enabled prospectors: 1
2020-04-19T22:33:43.425+0800    INFO    log/harvester.go:216    Harvester started for file: /root/a.log
2020-04-19T22:33:44.547+0800    INFO    elasticsearch/client.go:690     Connected to Elasticsearch version 5.6.12
2020-04-19T22:33:44.559+0800    INFO    template/load.go:55     Loading template for Elasticsearch version: 5.6.12
2020-04-19T22:33:44.833+0800    INFO    template/load.go:89     Elasticsearch template with name 'filebeat-6.2.3' loaded
2020-04-19T22:34:13.425+0800    INFO    [monitoring]    log/log.go:124  Non-zero metrics in the last 30s        {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":10,"time":12},"total":{"ticks":50,"time":54,"value":50},"user":{"ticks":40,"time":42}},"info":{"ephemeral_id":"8a0c59b6-3c04-4217-af16-87450daa8965","uptime":{"ms":30010}},"memstats":{"gc_next":4194304,"memory_alloc":1611744,"memory_total":7392648,"rss":16904192}},"filebeat":{"events":{"added":7,"done":7},"harvester":{"open_files":1,"running":1,"started":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":5,"batches":2,"total":5},"read":{"bytes":1268},"type":"elasticsearch","write":{"bytes":14868}},"pipeline":{"clients":1,"events":{"active":0,"filtered":2,"published":5,"retry":2,"total":7},"queue":{"acked":5}}},"registrar":{"states":{"curren

Collecting nginx logs

[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]# cat wgr.yml
filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /data/nginx/logs/*.log
  tags: ["nginx"]
setup.template.settings:
  index.number_of_shards: 3
output.elasticsearch:
  hosts: ["47.131.231.241:9200"]
[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]#

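Module
So far, reading and processing the log data has been configured by hand. Filebeat also ships with a large number of modules that simplify the configuration and can be used directly, as shown here: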

[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]# ./filebeat modules list
Enabled:
nginx

Disabled:
apache2
auditd
icinga
kafka
logstash
mysql
osquery
postgresql
redis
system
traefik
[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]# ./filebeat modules disable nginx
Disabled nginx
[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]# ./filebeat modules enable nginx
Enabled nginx
[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]#

As the output shows, the nginx module is now enabled.
nginx module configuration

[root@iZ1la3d1xbmukrZ modules.d]# cat nginx.yml
- module: nginx
  # Access logs
  access:
    enabled: true
    var.paths: ["/data/nginx/logs/access.log*"]
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

  # Error logs
  error:
    enabled: true
    var.paths: ["/data/nginx/logs/error.log*"]
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:
[root@iZ1la3d1xbmukrZ modules.d]#

Configuring Filebeat

[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]# cat wgr.yml
filebeat.prospectors:
setup.template.settings:
  index.number_of_shards: 3
output.elasticsearch:
  hosts: ["47.111.251.239:9200"]
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
[root@iZ1la3d1xbmukrZ filebeat-6.2.3-linux-x86_64]#

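Testing

# Fix: the ingest-user-agent and ingest-geoip plugins must be installed in Elasticsearch.
# The course materials include three files: ingest-user-agent.tar, ingest-geoip.tar, and ingest-geoip-conf.tar.
# Extract ingest-user-agent.tar and ingest-geoip.tar into plugins.
# Extract ingest-geoip-conf.tar into config.
# This resolves the problem.

Copying the plugins into the container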

[root@iZ1la3d1xbmukrZ ~]# docker cp /root/ingest-geoip 47919e4d2ecc:/usr/share/elasticsearch/plugins/
[root@iZ1la3d1xbmukrZ ~]# docker cp /root/ingest-user-agent 47919e4d2ecc:/usr/share/elasticsearch/plugins/
[root@iZ1la3d1xbmukrZ ~]# docker cp /root/ingest-geoip  47919e4d2ecc:/usr/share/elasticsearch/config/

The test shows that the data has been written to Elasticsearch, and that the resulting documents are now more structured.

Field reference

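paths: the log files to monitor, currently expanded with Go's glob function. Directories are not searched recursively. For example, with the pattern

/var/log/*/*.log

Filebeat looks for files ending in ".log" only in the immediate subdirectories of /var/log; it does not pick up ".log" files located directly in /var/log.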
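The non-recursive behaviour described above can be checked with a small experiment. This is a minimal sketch that uses Python's glob module as a stand-in for Go's glob function (an assumption; the two agree for this simple pattern). The directory layout and the helper names match_logs and demo are made up for illustration.

```python
import glob
import os
import tempfile


def match_logs(root):
    """Return the paths (relative to root) matched by <root>/*/*.log."""
    pattern = os.path.join(root, "*", "*.log")
    matches = glob.glob(pattern)
    # Normalise separators so the result looks the same on every platform.
    return sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in matches)


def demo():
    # Build a throwaway tree that stands in for /var/log.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "nginx", "old"))
        for rel in ("top.log", "nginx/access.log", "nginx/old/archived.log"):
            with open(os.path.join(root, *rel.split("/")), "w") as fh:
                fh.write("line\n")
        return match_logs(root)


if __name__ == "__main__":
    print(demo())
```

Only the file exactly one directory below the root is matched; the file directly in the root and the one two levels down are both skipped.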

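encoding: the character encoding of the monitored file. Both plain and utf-8 can handle logs that contain Chinese text.

input_type: the input type, either log (the default) or stdin.

exclude_lines: a list of regular expressions; input lines that match any of them are dropped.

include_lines: a list of regular expressions; only input lines that match one of them are kept (by default all lines are kept). include_lines is applied first, then exclude_lines.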
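The evaluation order of the two options can be illustrated with a short sketch. It uses Python's re module, whose syntax differs in places from the regular expressions Filebeat accepts, and the function name filter_lines and the sample lines are invented for this example; it is not Filebeat's implementation.

```python
import re


def filter_lines(lines, include_patterns=None, exclude_patterns=None):
    """Keep lines matching any include pattern, then drop lines matching any exclude pattern."""
    include = [re.compile(p) for p in (include_patterns or [])]
    exclude = [re.compile(p) for p in (exclude_patterns or [])]
    kept = []
    for line in lines:
        # An empty include list means "keep everything".
        if include and not any(rx.search(line) for rx in include):
            continue
        # Exclusion runs second, so it can still veto an included line.
        if any(rx.search(line) for rx in exclude):
            continue
        kept.append(line)
    return kept


if __name__ == "__main__":
    sample = ["ERR disk full", "WARN slow query", "DBG cache hit", "ERR DBG trace"]
    print(filter_lines(sample, ["^ERR", "^WARN"], ["DBG"]))
```

In the sample, "ERR DBG trace" passes the include step but is then removed by the exclude step, which is the consequence of applying include_lines first.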

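exclude_files: a list of regular expressions; files that match any of them are ignored (by default a harvester is created for every file that matches paths).

fields: adds extra information to every event, for example "level: debug", which makes later grouping and aggregation easier. By default the custom fields are placed in a fields sub-dictionary of the output event.

fields_under_root: when set to true, the custom fields are stored at the top level of the event instead of under fields. A custom field overrides a Filebeat default field of the same name.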
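The difference between the two placements is easiest to see on a concrete event. The sketch below models an event as a plain Python dictionary; the function name add_fields and the sample event are assumptions made for illustration, and real Filebeat events carry more keys than shown here.

```python
def add_fields(event, fields, fields_under_root=False):
    """Return a copy of event with the custom fields attached."""
    out = dict(event)
    if fields_under_root:
        # Top-level placement: a custom field replaces an existing key of the same name.
        out.update(fields)
    else:
        # Default placement: custom fields are grouped under the "fields" key.
        out["fields"] = dict(fields)
    return out


if __name__ == "__main__":
    event = {"message": "GET /index.html", "source": "/root/a.log"}
    print(add_fields(event, {"from": "topcheer"}))
    print(add_fields(event, {"from": "topcheer"}, fields_under_root=True))
```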

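ignore_older: tells Filebeat to ignore files whose last modification is older than the given time span, for example 2h (two hours) or 5m (five minutes).

close_older: closes the file handle if the file has not been updated within the given period. The default is 1h.

force_close_files: Filebeat keeps a file's handle open until close_older is reached, which causes problems if the file is deleted within that window. With force_close_files set to true, Filebeat closes the handle as soon as it detects that the file name has changed.

scan_frequency: how often Filebeat checks the paths configured for the prospector for changes (such as new files). With 0s, Filebeat picks up changes as quickly as possible, at the cost of higher CPU usage. The default is 10s.

document_type: sets the type field of the document in the Elasticsearch output; it can also be used to categorize logs.

harvester_buffer_size: the size of the buffer each harvester uses while reading a file.

max_bytes: each line appended to a log file counts as one log event. max_bytes caps the number of bytes sent per event, and any bytes beyond the limit are discarded. The default is 10 MB.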

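multiline: for logs in which a single entry spans several lines, such as the stack traces printed by many languages. It has the following sub-options:

pattern: the regular expression to match, typically against the line that starts a multi-line entry.

negate: whether to negate the pattern. The default is false (the pattern is used as is); set it to true to negate it.

match: whether lines are merged with the content before them (before) or after them (after) to form a single event.

max_lines: the maximum number of lines merged into one event (including the line that matches the pattern). The default is 500.

timeout: once the timeout expires, the event collected so far is sent even if no new pattern match (that is, no new event) has occurred.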
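How these sub-options interact can be sketched in a few lines. The code below covers only the match: after case and ignores timeout; the function name merge_multiline is hypothetical, Python's re syntax is used in place of Filebeat's, and dropping lines beyond max_lines follows the documented behaviour as I understand it.

```python
import re


def merge_multiline(lines, pattern, negate=True, max_lines=500):
    """Group lines into events for the match='after' case (illustrative only)."""
    rx = re.compile(pattern)
    events = []
    current = []
    for line in lines:
        matched = rx.search(line) is not None
        # negate=True: lines that do NOT match continue the current event.
        # negate=False: lines that DO match continue the current event.
        continues = matched != negate
        if continues and current:
            # Lines beyond max_lines are dropped instead of starting a new event.
            if len(current) < max_lines:
                current.append(line)
        else:
            if current:
                events.append("\n".join(current))
            current = [line]
    if current:
        events.append("\n".join(current))
    return events


if __name__ == "__main__":
    log = [
        "2020-04-19 ERROR boom",
        "  at a()",
        "  at b()",
        "2020-04-19 INFO ok",
    ]
    for event in merge_multiline(log, r"^\d{4}-\d{2}-\d{2}"):
        print(repr(event))
```

With a date pattern and negate set to true, every line that does not start with a date is appended to the preceding dated line, which is the usual way to keep a stack trace together with its error message.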

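tail_files: when set to true, Filebeat starts reading at the end of each file and sends every newly appended line as one event, instead of resending the whole file from the beginning.

backoff: how long Filebeat waits before checking a file again for updates after it has reached EOF. The default is 1s.

max_backoff: the maximum time Filebeat waits before checking a file again after it has reached EOF. The default is 10s.

backoff_factor: controls how quickly max_backoff is reached. The default factor is 2. Once max_backoff is reached, Filebeat waits max_backoff between checks until the file is updated, at which point the wait resets to backoff. Setting the factor to 1 disables the backoff algorithm: Filebeat then checks again after every backoff interval.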
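The interplay of backoff, max_backoff and backoff_factor comes down to simple arithmetic. The following sketch lists the successive waits while a file stays unchanged, assuming the wait is multiplied by the factor after each check and capped at max_backoff; the function name backoff_waits and the checks parameter are invented for this example.

```python
def backoff_waits(backoff=1.0, max_backoff=10.0, backoff_factor=2.0, checks=6):
    """Return the wait in seconds before each of `checks` consecutive EOF re-checks."""
    waits = []
    wait = backoff
    for _ in range(checks):
        waits.append(wait)
        # Grow the wait by the factor, but never beyond max_backoff.
        wait = min(wait * backoff_factor, max_backoff)
    return waits


if __name__ == "__main__":
    print(backoff_waits())                    # defaults from the text
    print(backoff_waits(backoff_factor=1.0))  # factor 1: constant wait
```

With the defaults the waits are 1, 2, 4, 8, 10, 10 seconds; with a factor of 1 every wait equals backoff. As soon as the file is updated, the sequence would start again from backoff.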

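spool_size: the event-count threshold of the spooler. When the number of events in the spooler exceeds this threshold, the spooler is flushed (whether or not the timeout has been reached). The default is 2048 events.

idle_timeout: the timeout of the spooler. When the timeout expires, the spooler is flushed (whether or not the size threshold has been reached). The default is 5s.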
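The two flush conditions can be pictured as a small buffer with a size trigger and a time trigger. This is a simplified model rather than Filebeat's code: the class name Spooler, its methods, and the choice to flush as soon as the buffer reaches the threshold are assumptions, and time is passed in explicitly so the behaviour is easy to follow.

```python
class Spooler:
    """Buffers events and flushes on size or idle timeout (illustrative only)."""

    def __init__(self, spool_size, idle_timeout):
        self.spool_size = spool_size      # number of events that forces a flush
        self.idle_timeout = idle_timeout  # seconds after which a flush is forced
        self.buffer = []
        self.last_flush = 0.0
        self.flushed = []                 # batches handed to the output

    def add(self, event, now):
        self.buffer.append(event)
        if len(self.buffer) >= self.spool_size:
            self._flush(now)              # size trigger, regardless of the timeout

    def tick(self, now):
        if self.buffer and now - self.last_flush >= self.idle_timeout:
            self._flush(now)              # time trigger, regardless of the size

    def _flush(self, now):
        self.flushed.append(self.buffer)
        self.buffer = []
        self.last_flush = now


if __name__ == "__main__":
    spooler = Spooler(spool_size=3, idle_timeout=5.0)
    spooler.add("a", now=0.0)
    spooler.add("b", now=1.0)
    spooler.add("c", now=3.0)   # third event: flushed by size
    spooler.add("d", now=4.0)
    spooler.tick(now=8.0)       # 5 s since the last flush: flushed by timeout
    print(spooler.flushed)
```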

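registry_file: the file in which Filebeat records how far it has read in each log file.

config_dir: a directory of additional configuration files to load from this configuration file (the full path is required). Only the prospector sections of those files are processed.

publish_async: whether to publish events asynchronously (experimental).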

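How it works

Filebeat consists of two components, prospectors and harvesters, which work together to read (tail) files and send the event data to the configured output.

When Filebeat starts, it launches one or more prospectors that look in the local paths you specified for log files. For each log file a prospector locates, it starts a harvester. Each harvester reads a single log file for new content and sends the new log data to libbeat, which aggregates the events and sends the aggregated data to the output you configured for Filebeat.

When sending data to Logstash or Elasticsearch, Filebeat uses a backpressure-sensitive protocol to account for higher volumes of data. When Logstash is busy processing data, Filebeat slows down its reading. Once the congestion is resolved, Filebeat returns to its original pace and keeps shipping data.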

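1.1 Harvester

A harvester is responsible for reading the content of a single file. It reads the file and sends the content to the output. One harvester is started per file, and the harvester opens and closes the file, which means the file descriptor stays open while the harvester is running.

If a file is deleted or renamed while it is being read, Filebeat keeps reading it. The side effect is that the disk space remains reserved until the harvester closes. By default, Filebeat keeps the file open until close_inactive is reached.

Closing a harvester has the following consequences:

1) The file handle is closed, releasing the underlying resources if the file was deleted while the harvester was still reading it.

2) Harvesting of the file starts again only after scan_frequency has elapsed.

3) If the file is moved or removed while the harvester is closed, harvesting of the file does not continue.

To control when a harvester is closed, use the close_* configuration options.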

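1.2 Prospector

A prospector manages the harvesters and finds all sources to read from. If the input type is log, the prospector finds all files that match the configured paths and starts a harvester for each one. Each prospector runs in its own Go routine.

Filebeat currently supports two prospector types: log and stdin. Each prospector type can be defined multiple times. The log prospector checks each file to see whether a harvester needs to be started, whether one is already running, or whether the file can be ignored (see ignore_older).

New lines are picked up only if the size of the file has changed since the harvester was closed.

Note: Filebeat prospectors can read only local files. There is no functionality to connect to remote hosts to read stored files or logs.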
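The size check that decides whether a file is harvested again can be sketched as follows. This is a toy model under several assumptions: the class name MiniProspector and its in-memory offsets dictionary are invented (Filebeat persists this state in its registry file), harvesting happens synchronously inside scan, and truncation, rotation and partially written lines are ignored.

```python
import os
import tempfile


class MiniProspector:
    """Remembers a read offset per file and harvests only files that have grown."""

    def __init__(self):
        self.offsets = {}  # path -> number of bytes already read

    def scan(self, paths):
        new_lines = []
        for path in paths:
            offset = self.offsets.get(path, 0)
            if os.path.getsize(path) <= offset:
                continue  # size unchanged since the last harvest: nothing to do
            new_lines.extend(self._harvest(path, offset))
        return new_lines

    def _harvest(self, path, offset):
        # Stands in for a harvester: open, read from the stored offset, close.
        with open(path, "rb") as fh:
            fh.seek(offset)
            data = fh.read()
        self.offsets[path] = offset + len(data)
        return data.decode("utf-8").splitlines()


def demo():
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "a.log")
        with open(path, "w", encoding="utf-8", newline="\n") as fh:
            fh.write("first\nsecond\n")
        prospector = MiniProspector()
        initial = prospector.scan([path])     # reads both lines
        unchanged = prospector.scan([path])   # size unchanged: nothing new
        with open(path, "a", encoding="utf-8", newline="\n") as fh:
            fh.write("third\n")
        appended = prospector.scan([path])    # only the appended line
        return initial, unchanged, appended


if __name__ == "__main__":
    print(demo())
```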

Starting and stopping

Starting Filebeat

cd FILEBEAT_HOME

nohup ./bin/filebeat -c config/test.conf >>/FILEBEAT_HOME/logs/filebeat.log 2>&1 &

This starts Filebeat in the background; adjust the parameters to your setup.

To start Filebeat with several configuration files, create a directory (conf) that holds them:

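#nohup ./bin/filebeat -c conf/one.yml -c conf/two.yml >>/FILEBEAT_HOME/logs/filebeat.log 2>&1 &   # one.yml and two.yml are placeholder names; -c is repeated once per file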

Note: run only one Filebeat process per server.

Stopping Filebeat

ps -ef |grep filebeat

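kill $pid

Note: unless it is an emergency, stop the process gracefully (a plain kill sends SIGTERM); reserve kill -9 for a process that does not respond.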
