Filebeat Configuration Explained


Filebeat reads events from an input, parses and processes them, and ships them from an output to the target store (Elasticsearch or another sink). Inputs include Log, Syslog, Stdin, Redis, UDP, Docker, TCP, and NetFlow; outputs include Elasticsearch, Logstash, Kafka, Redis, File, Console, and Cloud.

 

For the full reference, see the official documentation: https://www.elastic.co/guide/en/beats/filebeat/current/configuring-howto-filebeat.html

The format of filebeat.yml is shown below; this article focuses on the options of the log input.

filebeat.inputs:
- input_type: log
  paths:
    - /var/log/apache/httpd-*.log
  document_type: apache
- input_type: log
  paths:
    - /var/log/messages
    - /var/log/*.log

Filebeat Options
input_type: log
Specifies the input type.
paths
Glob-based paths to crawl. Everything Go's glob supports is accepted (these are globs, not full regular expressions), e.g. /var/log/*/*.log.
encoding
The file encoding to use when reading (see the sketch below), for example:

plain, latin1, utf-8, utf-16be-bom, utf-16be, utf-16le, big5, gb18030, gbk, hz-gb-2312,
euc-kr, euc-jp, iso-2022-jp, shift-jis, and so on
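A minimal sketch of an input reading GBK-encoded logs (the path is illustrative):

filebeat.inputs:
- paths: ["/var/log/app/gbk-*.log"]
  encoding: gbk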
exclude_lines
Regular expressions; matching lines are dropped. If multiline is enabled, each multiline message is merged into a single line before the filter is applied.
include_lines
Regular expressions; only matching lines are exported. exclude_lines runs after include_lines has been applied, as in the sketch below.
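A sketch keeping only error and warning lines while dropping heartbeat noise (patterns and path are illustrative):

filebeat.inputs:
- paths: ["/var/log/app/*.log"]
  include_lines: ['^ERR', '^WARN']
  exclude_lines: ['heartbeat']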
exclude_files
Regular expressions; matching files are excluded, e.g.:
exclude_files: ['.gz$']
tags
A list of tags added to each event, useful for filtering downstream.
filebeat.inputs:
- paths: ["/var/log/app/*.json"]
  tags: ["json"]
fields
Optional fields that add extra information to the output.
Values may be scalars, lists, dictionaries, or any nested combination.
By default they are placed under a fields sub-dictionary.
filebeat.inputs:
- paths: ["/var/log/app/*.log"]
  fields:
    app_id: query_engine_12
fields_under_root
If true, the custom fields are stored at the top level of the output document instead of under fields.
If a custom field name conflicts with a field added by Filebeat, the custom field overwrites the other field.

fields_under_root: true
fields:
  instance_id: i-10a64379
  region: us-east-1
ignore_older

Tells Filebeat to ignore log content modified before the given time span.
Before a file can be ignored it must no longer be read, so always set ignore_older to a value greater than close_inactive.
If a file is being read when it falls under ignore_older, the harvester finishes, closes after close_inactive, and the file is then ignored.
close_*
The close_* options close the harvester after a given criterion or time. Closing the harvester means closing the file handler. If a file is updated after the harvester is closed, it is picked up again after scan_frequency has elapsed. However, if the file is moved or deleted while the harvester is closed, Filebeat cannot pick it up again, and any data the harvester has not read is lost.
close_inactive
When enabled, the file handle is closed if the file has not been harvested for the specified duration.
The countdown starts from the last log line read, not from the file's modification time.
If a closed file changes again, a new harvester is started on the next scan_frequency run.
Set a value comfortably larger than how often your log files are updated; configure multiple prospectors for files with different update rates.
Filebeat uses an internal timestamp reflecting the last read; the countdown restarts each time the last line is read.
Use duration strings such as 2h or 5m, as in the sketch below.
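A sketch combining close_inactive with ignore_older, which must be larger (values and path are illustrative):

filebeat.inputs:
- paths: ["/var/log/app/*.log"]
  close_inactive: 5m
  ignore_older: 24h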

recursive_glob.enabled
Enables recursive glob matching of log files (expanding **); default false.
close_renamed
When enabled, Filebeat closes the file handler when a file is renamed or moved.
close_removed
When enabled, Filebeat closes the harvester when a file is removed.
If you enable this option, keep clean_removed enabled as well.
close_eof
Suitable for files that are written only once: Filebeat closes the file as soon as the end of file is reached.
close_timeout
When enabled, Filebeat gives every harvester a predefined lifetime; once it elapses the file is closed, regardless of whether it is still being read.
Do not set close_timeout equal to ignore_older, or updates to the file may never be read.
If the output is blocked and no event has been published yet, the timeout does not fire; at least one event must be sent before the harvester is closed.
Set it to 0 to disable the timeout. The sketch below combines the close_* options on one input.
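A sketch enabling several close_* options together (values and path are illustrative):

filebeat.inputs:
- paths: ["/var/log/batch/*.log"]
  close_renamed: true
  close_removed: true
  close_eof: true
  close_timeout: 30m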
clean_inactive
Removes the state of previously harvested files from the registry.
Must be greater than ignore_older + scan_frequency to make sure no state is removed while a file is still being collected.
This option helps keep the registry file small, especially when large numbers of new files are generated every day.
It is also useful for avoiding Filebeat problems caused by inode reuse on Linux.
clean_removed
When enabled, Filebeat cleans a file's state from the registry if the file can no longer be found on disk.
If you disable close_removed you must also disable clean_removed. See the sketch below.
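A registry-cleanup sketch where clean_inactive (72h) exceeds ignore_older (70h) plus scan_frequency, as required (values and path are illustrative):

filebeat.inputs:
- paths: ["/var/log/rotated/*.log"]
  ignore_older: 70h
  clean_inactive: 72h
  clean_removed: true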
scan_frequency
How often the prospector checks the configured paths for new files to harvest; default 10s.
document_type
The event type, used to set the type field of the output document; default log.
harvester_buffer_size
The buffer size in bytes each harvester uses when reading a file; default 16384.
max_bytes
The maximum number of bytes a single log message may have; particularly useful for multiline log messages. See the sketch below.
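A sketch setting both buffer and message-size limits (the values shown are the documented defaults; the path is illustrative):

filebeat.inputs:
- paths: ["/var/log/app/*.log"]
  harvester_buffer_size: 16384
  max_bytes: 10485760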
json
These options let Filebeat decode logs structured as JSON messages.
Decoding happens line by line.
keys_under_root
Stores the decoded keys at the top level of the output document.
overwrite_keys
In case of conflicts, decoded JSON values overwrite the fields Filebeat would otherwise add.
add_error_key
Adds a json_error key to the event when JSON decoding fails.
message_key
Specifies the JSON key to use for filtering and multiline; the associated value must be a string. The sketch below combines the json options.
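A sketch decoding one JSON object per line (the path and the message_key value are illustrative):

filebeat.inputs:
- paths: ["/var/log/app/*.json"]
  json.keys_under_root: true
  json.overwrite_keys: true
  json.add_error_key: true
  json.message_key: log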
multiline
Options controlling how Filebeat stitches logs spanning multiple lines into one event; multiline logs are common with Java stack traces.
multiline.pattern: '^\['
multiline.negate: true
multiline.match: after
The configuration above appends every line that does not start with [ to the preceding line, so a multiline log such as the following is merged into a single event:

Exception in thread "main" java.lang.NullPointerException
    at com.example.myproject.Book.getTitle(Book.java:16)
    at com.example.myproject.Author.getBookTitles(Author.java:25)
    at com.example.myproject.Bootstrap.main(Bootstrap.java:14)
multiline.pattern

Specifies the regular expression to match. Note that the regexp patterns supported by Filebeat differ somewhat from those supported by Logstash.
multiline.negate
Defines whether the pattern match is negated; default false.
With pattern '^b' and the default false, consecutive lines that start with b are merged into the preceding non-matching line.
With true, the meaning is inverted: consecutive lines that do not start with b are merged into the preceding line that does.
multiline.match
Specifies how Filebeat combines matching lines into an event: before or after, depending on the negate setting above.
multiline.max_lines
The maximum number of lines that can be combined into one event; additional lines are discarded. Default 500.
multiline.timeout
If no matching new line appears within the timeout after an event was started, the accumulated lines are sent anyway. Default 5s.
tail_files
If true, Filebeat starts reading new files at their end instead of their beginning.
This option only applies to files Filebeat has not processed before.
symlinks
Lets Filebeat harvest symlinks in addition to regular files. When harvesting a symlink, Filebeat opens and reads the original file, even though it reports the symlink's path. See the sketch below.
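A sketch enabling both options; use tail_files with care, since existing content is skipped the first time a file is seen (the path is illustrative):

filebeat.inputs:
- paths: ["/var/log/linked/*.log"]
  tail_files: true
  symlinks: true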
backoff
Specifies how aggressively Filebeat crawls files for updates; default 1s.
It defines how long Filebeat waits before checking a file again once EOF has been reached.
max_backoff
The maximum time Filebeat waits before checking a file again after reaching EOF.
backoff_factor
The factor by which the backoff interval is multiplied on each retry, up to max_backoff; default 2. See the sketch below.
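With the defaults below, the wait after EOF grows 1s, 2s, 4s, 8s, then stays capped at max_backoff (these are the documented defaults; the path is illustrative):

filebeat.inputs:
- paths: ["/var/log/app/*.log"]
  backoff: 1s
  max_backoff: 10s
  backoff_factor: 2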
harvester_limit
Limits the number of harvesters started in parallel for one prospector, which directly limits the number of open file handles.
enabled
Enables or disables the prospector, as in the sketch below.
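A sketch capping open file handles for a directory containing many files (the limit and path are illustrative):

filebeat.inputs:
- paths: ["/var/log/many/*.log"]
  harvester_limit: 100
  enabled: true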
Filebeat Global
spool_size
The event-count threshold of the spooler; once exceeded, events are flushed to the network.
filebeat.spool_size: 2048
publish_async
Publishes events asynchronously; an experimental feature.
idle_timeout
The timeout after which spooled events are flushed even if spool_size has not been reached.
filebeat.idle_timeout: 5s
registry_file
The name of the registry file. A relative path is interpreted as relative to the data path;
see the directory layout section for details. Default ${path.data}/registry.
filebeat.registry_file: registry
config_dir
The full path of a directory containing additional prospector configuration files.
Every configuration file must end with .yml.
Every configuration file must also specify the full Filebeat config hierarchy, even if only the prospector part is used.
All global options (such as spool_size) are ignored in these files.
The path must be absolute.
filebeat.config_dir: path/to/configs
shutdown_timeout
How long Filebeat waits for the publisher to finish sending events before Filebeat shuts down. The sketch below gathers these global options in one place.
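A sketch of the global options above combined (values are illustrative; these are the 5.x-era global options described in this section):

filebeat.spool_size: 2048
filebeat.idle_timeout: 5s
filebeat.registry_file: registry
filebeat.config_dir: /etc/filebeat/configs
filebeat.shutdown_timeout: 5s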
Filebeat General
name
Sets the shipper's name; if left empty, the server's hostname is used.
name: "my-shipper"
queue_size
The length of the internal queue for single events; default 1000.
bulk_queue_size
The length of the internal queue for bulk events.
max_procs
Sets the maximum number of CPUs that can be used.
geoip.paths
This option is currently used only by Packetbeat and will be removed in version 6.0.
The GeoLite City database is required for GeoIP support to work.

geoip:
  paths:
    - "/usr/share/GeoIP/GeoLiteCity.dat"
    - "/usr/local/var/GeoIP/GeoLiteCity.dat"

 

Filebeat reload
A beta feature.

path

Defines the configuration paths to check.

reload.enabled

When set to true, enables dynamic config reloading.

reload.period

Defines the interval at which the paths are checked for changes.

filebeat.config.inputs:
  path: configs/*.yml
  reload.enabled: true
  reload.period: 10s

A typical full configuration:

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

# Input types include log (files at specific paths), stdin, redis, udp, docker,
# tcp, and syslog; several inputs (including multiple of the same type) can be
# configured at once. Per-type details:
# https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  # The logs to monitor; concrete files or directories may be given.
  paths:
    #- /var/log/*.log  (the default; change as needed)
    - /usr/local/tomcat/logs/catalina.out

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  # (All lines are included by default; exclude_lines runs after include_lines.)
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering,
  # e.g. "level: debug" to group logs in later analysis. By default the new
  # fields are placed under a fields sub-dictionary (fields.level), so each
  # event in Elasticsearch gains "fields": {"level": "debug"}.
  #fields:
  #  level: debug
  #  review: 1
  #  module: mock

  ### Multiline options

  ### Logs often contain entries that logically span multiple lines,
  ### so the multiline parameters below describe how to stitch them together.

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  # Typical patterns: lines starting with whitespace (^[[:space:]]) or with a bracket (^\[).
  # For the supported regexp syntax see:
  # https://www.elastic.co/guide/en/beats/filebeat/current/regexp-support.html
  multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
  # The value works together with pattern and negate:
  # ------------------------------------------------------------------------------------------
  # | pattern matches | negate | match  | resulting event                                     |
  # |-----------------|--------|--------|-----------------------------------------------------|
  # | yes             | true   | before | matching line is the end; preceding non-matching    |
  # |                 |        |        | lines are merged into it                            |
  # | yes             | true   | after  | matching line is the start; following non-matching  |
  # |                 |        |        | lines are merged into it                            |
  # | yes             | false  | before | matching lines are merged into the following        |
  # |                 |        |        | non-matching line                                   |
  # | yes             | false  | after  | matching lines are merged into the preceding        |
  # |                 |        |        | non-matching line                                   |
  # ------------------------------------------------------------------------------------------
  multiline.match: after

  # Specifies a regular expression, in which the current multiline will be flushed from memory, ending the multiline-message.
  #multiline.flush_pattern

  # The maximum number of lines that can be combined into one event.
  # If the multiline message contains more than max_lines, any additional lines are discarded. The default is 500.
  #multiline.max_lines: 500

  # After the specified timeout, Filebeat sends the multiline event even if no new pattern is found to start a new event. The default is 5s.
  #multiline.timeout: 5s

#============================= Filebeat modules ===============================

# Load Filebeat module configurations
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  # Number of primary shards
  index.number_of_shards: 3
  # Number of replicas per shard
  #index.number_of_replicas: 1
  #index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
# If left empty, the server's hostname is used.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

#============================== Dashboards =====================================

# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here, or by using the `-setup` CLI flag or the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#============================= Elastic Cloud ==================================

# These settings simplify using filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]

  # Index to write to.
  #index: "filebeat-%{[beat.version]}-%{+yyyy.MM.dd}"

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------

output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.
processors:
  # Host metadata
  - add_host_metadata: ~
  # Cloud provider metadata (Alibaba Cloud ECS, Tencent QCloud, AWS EC2, etc.)
  - add_cloud_metadata: ~
  # Kubernetes metadata
  #- add_kubernetes_metadata: ~
  # Docker metadata
  #- add_docker_metadata: ~
  # Metadata about the process that generated the event
  #- add_process_metadata: ~

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#============================== Xpack Monitoring ===============================

# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#xpack.monitoring.enabled: false

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well. Any setting that is not set is
# automatically inherited from the Elasticsearch output configuration, so if you
# have the Elasticsearch output configured, you can simply uncomment the
# following line.
#xpack.monitoring.elasticsearch:

 

