The JDK version used is 11.
Two virtual machines were used for testing: one hosts ELK, the other runs Filebeat.
My test Java program is on Gitee; the relevant code is in its collector folder.
https://gitee.com/guoanhao/Architect-Stage-Kafka.git
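If you want to build the collector jar yourself, something along these lines should work (a rough sketch only; it assumes the collector folder is an ordinary Maven module, which I have not verified):
git clone https://gitee.com/guoanhao/Architect-Stage-Kafka.git
cd Architect-Stage-Kafka/collector
# assuming a standard Maven layout, the packaged jar ends up under target/
mvn clean package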
Step 1: Install ELK 7.6.0 (first machine)
docker pull sebp/elk:760
docker images
Run ELK
1. Before running it, two files need to be edited.
vi /etc/sysctl.conf
# Add the following line
vm.max_map_count=262144
# Reload the settings, otherwise the change does not take effect
sysctl -p
The second file:
vim /etc/security/limits.conf
Add the following content. Note that it must go before the "End of file" line, otherwise an error is reported.
* soft nofile 65536
* hard nofile 131072
* soft nproc 2048
* hard nproc 4096
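To double-check that both changes are in effect (the limits.conf values only apply to new login sessions), a quick sanity check:
sysctl vm.max_map_count   # should print 262144
ulimit -n                 # open-files soft limit, should be 65536
ulimit -u                 # max-processes soft limit, should be 2048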
2. Run it. I set ES_MIN_MEM and ES_MAX_MEM because my virtual machine has relatively little memory.
docker run -p 5601:5601 -p 9200:9200 -p 5044:5044 -itd -e ES_MIN_MEM=128m -e ES_MAX_MEM=128m --name elk sebp/elk:760
# Watch the ELK logs. A lot of output is printed, so it is hard to tell from the log alone whether startup succeeded.
# Once the log output stops growing, startup has finished.
docker logs -f -t elk
# Checking with docker ps is also not enough to prove everything is fine.
So how do you confirm that it started successfully? Visit the addresses below; if the pages load, startup was successful.
# Elasticsearch address
http://192.168.186.135:9200/
# Kibana address
http://192.168.186.135:5601/
# Or check Elasticsearch like this
curl 127.0.0.1:9200
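For a slightly more detailed check, the standard Elasticsearch cat APIs also work:
# Cluster health should be green or yellow
curl 127.0.0.1:9200/_cat/health?v
# List the nodes that joined the cluster
curl 127.0.0.1:9200/_cat/nodes?v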
Errors may show up after startup; adjusting a few configuration files fixes them.
# Enter the container
docker exec -it elk bash
# Go to the Logstash config directory
cd /etc/logstash/conf.d
# Edit the file and remove the SSL-related settings
vim 02-beats-input.conf
# Exit the container and restart ELK
exit
docker restart elk
Step 2: Set up Filebeat (second machine)
Download Filebeat 7.6.0, the same version as the ELK stack. I downloaded it from the Huawei open-source mirror site and uploaded it to the second server (I put it under /home/software).
# It can also be downloaded directly with wget
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.6.0-linux-x86_64.tar.gz
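Optionally, verify the download against the checksum that Elastic publishes next to each artifact (same URL with a .sha512 suffix):
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.6.0-linux-x86_64.tar.gz.sha512
sha512sum -c filebeat-7.6.0-linux-x86_64.tar.gz.sha512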
1. Extract it into /usr/local
tar -zxvf filebeat-7.6.0-linux-x86_64.tar.gz -C /usr/local/
2. Configure Filebeat (filebeat.yml)
cd /usr/local/filebeat-7.6.0-linux-x86_64
vim filebeat.yml
The configuration is as follows:
- Configure the log collection paths.
- Disable the Filebeat output to Elasticsearch.
- Enable the Filebeat output to Logstash (the output to Elasticsearch is configured in Logstash).
- Mind the indentation in the configuration file.
- I am using Filebeat 7.6 here. Unlike versions before 6.0, there is no document_type anymore; we use fields: service: instead, and the Logstash configuration file has to be adjusted accordingly.
Pitfalls encountered because of the version change: https://blog.51cto.com/kexiaoke/2092029
filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    # - /var/log/*.log
    - /home/software/logs/app-collector.log
    #- c:\programdata\elasticsearch\logs\*
  fields:
    service: app-log

- type: log
  enabled: true
  paths:
    - /home/software/logs/error-collector.log
    #- c:\programdata\elasticsearch\logs\*
  fields:
    service: error-log

# output.elasticsearch:
  # Array of hosts to connect to.
  # hosts: ["localhost:9200"]

output.logstash:
  # The Logstash hosts
  hosts: ["192.168.186.135:5044"]
The file defines two - type: log inputs, which ship the two different log files to Logstash. In Logstash, fields: service: app-log or fields: service: error-log tells us which log an event came from, so Elasticsearch can write them to different indices. See the Logstash configuration file below for details.
My full filebeat.yml is as follows:

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    # - /var/log/*.log
    - /home/software/logs/app-collector.log
    #- c:\programdata\elasticsearch\logs\*
  multiline:
    #pattern: '^\s*(\d{4}|\d{2})\-(\d{2}|[a-zA-Z]{3})\-(\d{2}|\d{4})'   # pattern matching lines that start with a timestamp such as 2017-11-15 08:04:23:889
    pattern: '^\['       # pattern matching lines that start with [
    negate: true         # negate the pattern match
    match: after         # append non-matching lines to the end of the previous line
    max_lines: 2000      # maximum number of lines to merge
    timeout: 2s          # stop waiting for further lines after this timeout if no new log event arrives
  fields:
    service: app-log

- type: log
  enabled: true
  paths:
    - /home/software/logs/error-collector.log
    #- c:\programdata\elasticsearch\logs\*
  multiline:
    #pattern: '^\s*(\d{4}|\d{2})\-(\d{2}|[a-zA-Z]{3})\-(\d{2}|\d{4})'   # pattern matching lines that start with a timestamp such as 2017-11-15 08:04:23:889
    pattern: '^\['       # pattern matching lines that start with [
    negate: true         # negate the pattern match
    match: after         # append non-matching lines to the end of the previous line
    max_lines: 2000      # maximum number of lines to merge
    timeout: 2s          # stop waiting for further lines after this timeout if no new log event arrives
  fields:
    service: error-log

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
  #multiline.match: after

#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#============================= Elastic Cloud ==================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
# output.elasticsearch:
  # Array of hosts to connect to.
  # hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.186.135:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#============================== X-Pack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

#================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
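Before starting Filebeat, its built-in test subcommands can be used to validate the configuration and to check the connection to Logstash (the output test only succeeds once the ELK container is up and listening on 5044):
cd /usr/local/filebeat-7.6.0-linux-x86_64
# Validate filebeat.yml
./filebeat test config -c filebeat.yml
# Test the connection to the configured Logstash output
./filebeat test output -c filebeat.yml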
3. Start Filebeat
# Enter the Filebeat directory
cd /usr/local/filebeat-7.6.0-linux-x86_64
# Run Filebeat
nohup ./filebeat -e -c filebeat.yml > filebeat.log &
# Check whether it is running
ps -ef | grep filebeat
# Stop Filebeat
kill -9 pid
Step 3: Configure Logstash
Since ELK was started with Docker, we have to enter the ELK container before changing the configuration.
# Enter the container
docker exec -it elk bash
# Go to the Logstash config directory
cd /etc/logstash/conf.d
# Edit the 02-beats-input.conf file
vim 02-beats-input.conf
1. Modify the configuration file. Mine looks like the following (mind the indentation).
The index => "app-log-%{[fields][service]}-%{+YYYY.MM.dd}" setting in the configuration file
can be changed to index => "%{[fields][service]}-%{+YYYY.MM.dd}" or to index => "app-log-%{+YYYY.MM.dd}".
input {
  beats {
    port => 5044
  }
}

filter {
  ## Time zone conversion
  ruby {
    code => "event.set('index_time',event.timestamp.time.localtime.strftime('%Y.%m.%d'))"
  }
  if "app-log" in [fields][service] {
    grok {
      ## Grok expression
      match => ["message", "\[%{NOTSPACE:currentDateTime}\] \[%{NOTSPACE:level}\] \[%{NOTSPACE:thread-id}\] \[%{NOTSPACE:class}\] \[%{DATA:hostName}\] \[%{DATA:ip}\] \[%{DATA:applicationName}\] \[%{DATA:location}\] \[%{DATA:messageInfo}\] ## (\'\'|%{QUOTEDSTRING:throwable})"]
    }
  }
  if "error-log" in [fields][service] {
    grok {
      ## Grok expression
      match => ["message", "\[%{NOTSPACE:currentDateTime}\] \[%{NOTSPACE:level}\] \[%{NOTSPACE:thread-id}\] \[%{NOTSPACE:class}\] \[%{DATA:hostName}\] \[%{DATA:ip}\] \[%{DATA:applicationName}\] \[%{DATA:location}\] \[%{DATA:messageInfo}\] ## (\'\'|%{QUOTEDSTRING:throwable})"]
    }
  }
}

# Output to Elasticsearch
output {
  if [fields][service] == "app-log" {
    elasticsearch {
      hosts => ["192.168.186.135:9200"]
      index => "%{[fields][service]}-%{index_time}"
    }
  } else if [fields][service] == "error-log" {
    elasticsearch {
      hosts => ["192.168.186.135:9200"]
      index => "error-log-%{[fields][service]}-%{+YYYY.MM.dd}"
    }
  }
}
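If you want to check the pipeline syntax before restarting, Logstash can be asked to parse the file and exit (the /opt/logstash path is where the sebp/elk image installs Logstash; adjust if yours differs):
docker exec -it elk /opt/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/02-beats-input.conf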
2. Restart ELK
docker restart elk
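After the restart, it is worth confirming that Logstash loaded the new pipeline without errors and that port 5044 is listening again on the host:
# Watch the container log for pipeline errors
docker logs --tail 100 -f elk
# Port 5044 is published by docker, so it should show up as listening
ss -lntp | grep 5044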
3. Test log collection. My test program is on Gitee; the jar has to be placed under /home/software and started from there, because Filebeat is configured to collect logs from /home/software/logs.
Starting it creates the logs folder:
java -jar collector.jar
The folder contains two log files: one for normal logs and one for error logs.
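A quick way to confirm the two files exist and keep growing (the paths are the ones configured in filebeat.yml):
ls -l /home/software/logs/
tail -f /home/software/logs/app-collector.log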
Visit the URLs below in a browser; the corresponding entries are written to the log files, and Filebeat picks them up.
http://192.168.186.137:8001/index
http://192.168.186.137:8001/err
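Once a few requests have gone through, you can check that the indices actually show up in Elasticsearch (the exact names depend on the index settings in the Logstash output above):
# app-log-* and error-log-* entries should appear in the list
curl http://192.168.186.135:9200/_cat/indices?v
# Peek at a single stored document
curl "http://192.168.186.135:9200/app-log-*/_search?size=1&pretty"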
The log output that gets printed:
Log visualization
Open Kibana in a browser and click Management => Index Patterns => Create index pattern.
http://192.168.186.135:5601/
Create the index pattern; its name has to match the index configured in the Logstash config file.
After entering the name, click Next.
Here I chose "I don't want to use the Time Filter". You can also pick the other option and experiment.
After choosing, click Create index pattern, and the index pattern is created.
Open the index pattern to view its log entries; the log data is shown on the right.
You can also search through it.
For the other index pattern, err-log-*, I selected @timestamp when creating it.
This adds a time picker; after selecting a range, only logs within that time range are queried.
You can also choose currentDateTime.
I usually pick this one; I have forgotten the exact difference.
That completes the setup.