EFK (Elasticsearch + Filebeat + Kibana): Collecting Kubernetes Container Logs


1. Why Log Collection Is Hard in Kubernetes

  In Kubernetes, log collection is much more complex than on traditional virtual machines or physical servers. The root cause is that Kubernetes hides the underlying infrastructure, provides finer-grained resource scheduling, and presents a stable yet dynamic environment to the workloads above. Log collection therefore faces a richer, more dynamic environment, with many more points to consider:

  1. For short-lived Job-type applications that run for only a few seconds from start to stop, how do you keep collection real-time enough that no data is lost?
  2. Kubernetes generally favors large nodes, each running 10 to 100+ containers. How do you collect from 100+ containers while keeping resource consumption as low as possible?
  3. In Kubernetes, applications are deployed as YAML, while log collection is still mostly driven by hand-written configuration files. How can log collection itself be deployed the Kubernetes way?

2. Log Collection Approaches in Kubernetes

  1. Direct write from the application: integrate a logging SDK into the application and send logs straight to the backend. This skips the write-to-disk-then-collect step and needs no extra agent, so its resource overhead is the lowest. But because the application is tightly coupled to the logging SDK, overall flexibility is poor; it is generally used only where log volume is extremely large;
  2. DaemonSet: run a single log agent on each node that collects all logs on that node. Resource usage is much lower, but scalability and tenant isolation are limited; it suits single-purpose clusters or clusters with few applications;
  3. Sidecar: deploy a dedicated log agent for each Pod, responsible for that one application's logs only. Resource usage is higher, but flexibility and multi-tenant isolation are strong; recommended for large Kubernetes clusters, or clusters serving multiple teams as a PaaS platform.

Summary:

  1. Direct write is recommended when log volume is extremely large

  2. DaemonSet is the usual choice for small and medium clusters

  3. Sidecar is recommended for very large clusters

  In practice, most deployments use DaemonSet alone or a mix of DaemonSet and Sidecar. The DaemonSet's strength is high resource utilization, but all Logtail instances in a DaemonSet share one global configuration, and a single Logtail supports only a limited number of configurations, so it cannot serve clusters with a very large number of applications.

  • Have each configuration collect as much data of the same kind as possible, reducing the number of configurations and the pressure on the DaemonSet;
  • Give core applications ample collection resources; the Sidecar approach can be used for them;
  • Prefer CRD-based configuration wherever possible;
  • With Sidecar, each Logtail has its own configuration, so there is no limit on the number of configurations; this makes it a good fit for very large clusters.
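The Sidecar pattern described above can be sketched as a Deployment in which the business container and the log agent share an emptyDir volume. This is only an illustration of the pattern: the application name, image, and log path below are hypothetical and not part of this article's deployment.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app                          # hypothetical application
spec:
  replicas: 1
  selector:
    matchLabels: { app: demo-app }
  template:
    metadata:
      labels: { app: demo-app }
    spec:
      containers:
      - name: app
        image: demo-app:latest            # business container writes logs to /app/log
        volumeMounts:
        - { name: log-vol, mountPath: /app/log }
      - name: filebeat-sidecar            # one agent per Pod, reads the shared volume
        image: elastic/filebeat:7.10.1
        volumeMounts:
        - { name: log-vol, mountPath: /app/log, readOnly: true }
      volumes:
      - name: log-vol
        emptyDir: {}                      # shared between app and sidecar, gone with the Pod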

3. YAML for a Single-Node Elasticsearch StatefulSet

# cat elasticsearch.yaml 

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: kube-system
  labels:
    k8s-app: elasticsearch
spec:
  serviceName: elasticsearch
  selector:
    matchLabels:
      k8s-app: elasticsearch
  template:
    metadata:
      labels:
        k8s-app: elasticsearch
    spec:
      initContainers:
      - name: busybox
        imagePullPolicy: IfNotPresent
        image: busybox:latest
        securityContext:
          privileged: true
        command: ["sh", "-c", "mkdir -p /usr/share/elasticsearch/data/logs;chown -R 1000:1000 /usr/share/elasticsearch/data"]
        volumeMounts:
        - name: elasticsearch-data
          mountPath: "/usr/share/elasticsearch/data"

      containers:
      - image: elasticsearch:7.10.1
        name: elasticsearch
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 0.5 
            memory: 500Mi
        env:
          - name: "discovery.type"
            value: "single-node"
          - name: ES_JAVA_OPTS
            value: "-Xms512m -Xmx2g" 
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-data
          mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data
    spec:
      storageClassName: "disk-sc"
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1000Gi

---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: kube-system
spec:
  clusterIP: None
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch

4. YAML for a Kibana Deployment

# cat kibana.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: kube-system
  labels:
    k8s-app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana
  template:
    metadata:
      labels:
        k8s-app: kibana
    spec:
      containers:
      - name: kibana
        image: kibana:7.10.1
        resources:
          limits:
            cpu: 1
            memory: 500Mi
          requests:
            cpu: 0.5 
            memory: 200Mi
        env:
          - name: ELASTICSEARCH_HOSTS
            value: http://elasticsearch-0.elasticsearch.kube-system:9200
          - name: I18N_LOCALE
            value: zh-CN
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP

---  # use the cloud provider's load balancer
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: kube-system
  annotations:
    service.kubernetes.io/qcloud-loadbalancer-internal-subnetid: subnet-cadefsb
spec:
  ports:
  - name: kibana-pvc
    protocol: TCP
    port: 5601
    targetPort: 5601
  selector:
    k8s-app: kibana
  type: LoadBalancer
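The LoadBalancer Service above relies on a cloud provider. If no managed load balancer is available, a NodePort Service is a common alternative for exposing Kibana; this sketch would replace the Service above, and the nodePort value is an arbitrary example.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 5601
    targetPort: 5601
    nodePort: 30601        # any free port in the default 30000-32767 NodePort range
  selector:
    k8s-app: kibana
```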

5. YAML for a Filebeat DaemonSet

# cat filebeat-kubernetes.yaml 
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.config:
      inputs:
        # Mounted `filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false
    filebeat.autodiscover:       # use Filebeat autodiscover
      providers:
        - type: kubernetes
          templates:
            - condition:
                equals:
                  kubernetes.namespace: prod      # collect logs from the prod namespace
              config:
                - type: log      # input type is log, not docker or container, because our logs are not JSON-formatted.
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  paths:
                  paths:
                  # Log collection path. A glob like "/var/lib/kubelet/pods/**/*info.log" would
                  # pick up unwanted logs. This is a managed (cloud) K8s cluster, so the logs are
                  # not under /var/lib/docker/containers/.
                    - "/var/lib/kubelet/pods/${data.kubernetes.pod.uid}/volumes/kubernetes.io~empty-dir/data-vol/log/java/*/*info.log"
                  encoding: utf-8
                  scan_frequency: 1s        # interval for scanning for new files; default is 10s
                  tail_files: true          # read new files from the end, not the beginning; ignores old
                                            # logs on first start. Disable this after the first run.
                  fields_under_root: true   # store the custom fields at the top level of the output document
                  fields:
                    type: "prod-info"
        - type: kubernetes
          templates:
            - condition:
                equals:
                  kubernetes.namespace: prod
              config:
                - type: log
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  paths:
                    - "/var/lib/kubelet/pods/${data.kubernetes.pod.uid}/volumes/kubernetes.io~empty-dir/data-vol/log/java/*/*error.log"
                  encoding: utf-8
                  scan_frequency: 1s
                  tail_files: true
                  fields_under_root: true
                  fields:
                    type: "prod-error"
                  multiline.type: pattern
                  multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
                  multiline.negate: false
                  multiline.match: after
        - type: kubernetes
          templates:
            - condition:
                equals:
                  kubernetes.namespace: test
              config:
                - type: log
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  paths:
                    - "/var/lib/kubelet/pods/${data.kubernetes.pod.uid}/volumes/kubernetes.io~empty-dir/data-vol/log/java/*/*info.log"
                  encoding: utf-8
                  scan_frequency: 1s
                  tail_files: true
                  fields_under_root: true
                  fields:
                    type: "test-info"
        - type: kubernetes
          templates:
            - condition:
                equals:
                  kubernetes.namespace: test
              config:
                - type: log
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  paths:
                    - "/var/lib/kubelet/pods/${data.kubernetes.pod.uid}/volumes/kubernetes.io~empty-dir/data-vol/log/java/*/*error.log"
                  encoding: utf-8
                  scan_frequency: 1s
                  tail_files: true
                  fields_under_root: true
                  fields:
                    type: "test-error"
                  multiline.type: pattern
                  multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'   # standard Java stack-trace multiline match
                  multiline.negate: false
                  multiline.match: after
    setup.ilm.enabled: false
    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      indices:
        - index: "k8s-test-info-%{+yyyy.MM.dd}"
          when.contains:
            type: "test-info"
        - index: "k8s-test-error-%{+yyyy.MM.dd}"
          when.contains:
            type: "test-error"
        - index: "k8s-prod-info-%{+yyyy.MM.dd}"
          when.contains:
            type: "prod-info"
        - index: "k8s-prod-error-%{+yyyy.MM.dd}"
          when.contains:
            type: "prod-error"
    # output.redis:            # alternative: send logs to Redis, then ship them to ES with Logstash
    #   hosts: ["192.168.5.99:6379"]
    #   db: 0
    #   password: "password"
    #   key: "default_list"
    #   keys:
    #     - key: "k8s-prod-error-%{+yyyy.MM.dd}"
    #       when.contains:
    #         type: "prod-error"
    #     - key: "k8s-prod-info-%{+yyyy.MM.dd}"
    #       when.contains:
    #         type: "prod-info"

---
# This ConfigMap is unused; you can paste the configuration above into it, or delete it.
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-inputs
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  kubernetes.yml: |-
    #- type: log
    #  paths:
    #    - "/var/lib/kubelet/pods/**/*info.log"
    #processors:
    #  - add_kubernetes_metadata:
    #      default_indexers.enabled: false
    #      default_matchers.enabled: false
    #      in_cluster: true
    #      indexers:
    #        - ip_port:
    #      matchers:
    #        - field_format:
    #            format: '%{[destination.ip]}:%{[destination.port]}'
    #      matchers:
    #        - logs_path:
    #            logs_path: '/var/log/pods'
    #            resource_type: 'pod'

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
      - name: filebeat
        image: elastic/filebeat:7.10.1
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch-0.elasticsearch.kube-system
        - name: ELASTICSEARCH_PORT
          value: "9200"
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 6144Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: inputs
          mountPath: /usr/share/filebeat/inputs.d
          readOnly: true
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlibkubeletpods
          mountPath: /var/lib/kubelet/pods
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: varlibkubeletpods
        hostPath:
          path: /var/lib/kubelet/pods
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: inputs
        configMap:
          defaultMode: 0600
          name: filebeat-inputs
      # The data folder stores the registry of file read state, so Filebeat does not
      # resend all files after the pod restarts.
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---
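The multiline.pattern used in the error-log inputs above is the standard Java stack-trace continuation match. A quick way to sanity-check it outside Filebeat is to port it to Python (POSIX `[[:space:]]` becomes `\s`); this is an illustrative sketch, not part of the deployment:

```python
import re

# Python port of the Filebeat pattern '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
JAVA_CONTINUATION = re.compile(r'^\s+(at|\.{3})\s+\b|^Caused by:')

def is_continuation(line: str) -> bool:
    """True if the line continues a Java stack trace.

    With multiline.negate: false and multiline.match: after, Filebeat appends
    matching lines to the preceding event instead of starting a new one.
    """
    return JAVA_CONTINUATION.search(line) is not None

if __name__ == "__main__":
    trace = [
        "java.lang.RuntimeException: boom",              # new event
        "\tat com.example.Main.run(Main.java:42)",       # continuation
        "\t... 23 more",                                 # continuation
        "Caused by: java.lang.NullPointerException",     # continuation
        "2021-02-16 12:23:33 INFO starting up",          # new event
    ]
    for line in trace:
        print(is_continuation(line), line)
```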

6. References and Errors

Reference: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover.html

The following errors occur because the log format is not JSON:

2021-02-16T12:23:33.920Z	ERROR	[reader_docker_json]	readjson/docker_json.go:204	Parse line error: invalid CRI log format
2021-02-16T12:23:33.920Z	INFO	log/harvester.go:335	Skipping unparsable line in file: /var/lib/kubelet/pods/694538b6-4629-4903-804e-8e9fc36ced4a/volumes/kubernetes.io~empty-dir/data-vol/log/java/daemon/daemon-info.log

The configuration at the time was:

  filebeat.yml: |-
    filebeat.inputs:
    - type: container      # this input only parses JSON-formatted container logs; logs K8s captures from stdout collect fine.
      paths:
        - "/var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/data-vol/log/java/*/*.log" 

7. Excluding Unneeded Fields and Specific Containers' Logs in Filebeat

    processors:
      - drop_fields:
          fields: ["agent.ephemeral_id","agent.id","agent.name","agent.type","container.runtime","ecs.version","host.hostname","host.name","kubernetes.labels.pod-template-hash","host.os.version","agent.version","container.image.name","container.id","kubernetes.pod.uid","kubernetes.replicaset.name"]
          ignore_missing: false
      - add_kubernetes_metadata:
          in_cluster: true
      - drop_event.when:
          or:
          - equals:
              kubernetes.labels.app: "tapd-page"
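The processors above first drop noisy metadata fields, then discard whole events whose app label is tapd-page. The filtering logic can be sketched in Python; this only illustrates the effect on a flattened event, it is not how Filebeat is implemented:

```python
# Subset of the dropped fields from the config above, kept short for illustration
DROP_FIELDS = {"agent.ephemeral_id", "agent.id", "container.id", "kubernetes.pod.uid"}

def process(event: dict):
    """Emulate drop_fields + drop_event.when; returns the filtered event, or None if dropped."""
    # drop_event.when -> or -> equals: kubernetes.labels.app == "tapd-page"
    if event.get("kubernetes.labels.app") == "tapd-page":
        return None
    # drop_fields: remove the listed keys, keep everything else
    return {k: v for k, v in event.items() if k not in DROP_FIELDS}

print(process({"message": "hello", "agent.id": "abc", "kubernetes.labels.app": "shop-api"}))
print(process({"message": "noise", "kubernetes.labels.app": "tapd-page"}))
```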

Results: (screenshot omitted)

