Prometheus Monitoring Tricks: Auto-Discovery Configuration


I. Auto-Discovery Types

The previous article left an open issue.

 

When monitoring a StatefulSet service, I defined an Endpoints (EP) object in the Service manifest and hard-coded the pod IPs in it. As soon as a pod restarts and its IP changes, no data can be scraped any more, which is clearly not acceptable.

And if our Kubernetes cluster contains many Services/Pods, do we really have to create a separate ServiceMonitor object for every one of them? That would be just as tedious.

To solve this, Prometheus Operator provides an additional scrape configuration mechanism, through which we can discover and monitor Kubernetes resources (pods, services, nodes, and so on).

Prometheus supports many kinds of service discovery.

 

Among them, kubernetes_sd_configs achieves exactly what we want: discovering and monitoring the various Kubernetes resources. Kubernetes SD configurations retrieve scrape targets from the Kubernetes REST API and always stay synchronized with the cluster state. Any of the following role types can be configured to discover the objects we are interested in (translated from the official documentation).

1. Node

The node role discovers one target per cluster node, with the address defaulting to the kubelet's HTTP port. The target address defaults to the first existing address of the node object, in the order NodeInternalIP, NodeExternalIP, NodeLegacyHostIP, NodeHostName.

Available meta labels:

__meta_kubernetes_node_name: The name of the node object.
__meta_kubernetes_node_label_<labelname>: Each label from the node object.
__meta_kubernetes_node_labelpresent_<labelname>: true for each label from the node object.
__meta_kubernetes_node_annotation_<annotationname>: Each annotation from the node object.
__meta_kubernetes_node_annotationpresent_<annotationname>: true for each annotation from the node object.
__meta_kubernetes_node_address_<address_type>: The first address for each node address type, if it exists.

In addition, the instance label of each node target is set to the node name as retrieved from the API server.
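For reference, a minimal scrape job using the node role might look like the following (a sketch only; the TLS and token file paths assume Prometheus runs in-cluster with the default ServiceAccount mount):

- job_name: 'kubernetes-nodes'
  kubernetes_sd_configs:
  - role: node
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  # copy every node label (e.g. kubernetes.io/hostname) onto the target
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)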

2. Service

The service role discovers a target for each service port of every service. This is generally useful for blackbox monitoring of a service. The address is set to the Kubernetes DNS name of the service together with the respective service port.

Available meta labels:

__meta_kubernetes_namespace: The namespace of the service object.
__meta_kubernetes_service_annotation_<annotationname>: Each annotation from the service object.
__meta_kubernetes_service_annotationpresent_<annotationname>: "true" for each annotation of the service object.
__meta_kubernetes_service_cluster_ip: The cluster IP address of the service. (Does not apply to services of type ExternalName)
__meta_kubernetes_service_external_name: The DNS name of the service. (Applies to services of type ExternalName)
__meta_kubernetes_service_label_<labelname>: Each label from the service object.
__meta_kubernetes_service_labelpresent_<labelname>: true for each label of the service object.
__meta_kubernetes_service_name: The name of the service object.
__meta_kubernetes_service_port_name: Name of the service port for the target.
__meta_kubernetes_service_port_protocol: Protocol of the service port for the target.
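As an illustration, a minimal service-role job could keep only annotated services and map their labels onto the targets. Note that the prometheus.io/scrape annotation used here is a common community convention assumed for this sketch, not something Prometheus enforces:

- job_name: 'kubernetes-services'
  kubernetes_sd_configs:
  - role: service
  relabel_configs:
  # keep only services annotated prometheus.io/scrape: "true" (assumed convention)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    regex: "true"
    action: keep
  # map every Kubernetes service label onto the target
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)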

  

3. Pod

The pod role discovers all pods and exposes their containers as targets. For each declared port of a container, a single target is generated. If a container has no specified ports, a port-free target per container is created, so that a port can be set manually via relabeling.

Available meta labels:

__meta_kubernetes_namespace: The namespace of the pod object.
__meta_kubernetes_pod_name: The name of the pod object.
__meta_kubernetes_pod_ip: The pod IP of the pod object.
__meta_kubernetes_pod_label_<labelname>: Each label from the pod object.
__meta_kubernetes_pod_labelpresent_<labelname>: true for each label from the pod object.
__meta_kubernetes_pod_annotation_<annotationname>: Each annotation from the pod object.
__meta_kubernetes_pod_annotationpresent_<annotationname>: true for each annotation from the pod object.
__meta_kubernetes_pod_container_init: true if the container is an InitContainer
__meta_kubernetes_pod_container_name: Name of the container the target address points to.
__meta_kubernetes_pod_container_port_name: Name of the container port.
__meta_kubernetes_pod_container_port_number: Number of the container port.
__meta_kubernetes_pod_container_port_protocol: Protocol of the container port.
__meta_kubernetes_pod_ready: Set to true or false for the pod's ready state.
__meta_kubernetes_pod_phase: Set to Pending, Running, Succeeded, Failed or Unknown in the lifecycle.
__meta_kubernetes_pod_node_name: The name of the node the pod is scheduled onto.
__meta_kubernetes_pod_host_ip: The current host IP of the pod object.
__meta_kubernetes_pod_uid: The UID of the pod object.
__meta_kubernetes_pod_controller_kind: Object kind of the pod controller.
__meta_kubernetes_pod_controller_name: Name of the pod controller.
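As a quick sketch (Part II below walks through a full, label-based example), a pod-role job is often gated on an annotation; again, prometheus.io/scrape is a convention assumed here:

- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # keep only pods annotated prometheus.io/scrape: "true" (assumed convention)
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    regex: "true"
    action: keep
  # carry the namespace and pod name over as regular labels
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    target_label: kubernetes_pod_name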

  

4. Endpoints

The endpoints role discovers targets from the listed endpoints of a service. For each endpoint address, one target is discovered per port. If the endpoint is backed by a pod, any additional container ports of that pod which are not bound to an endpoint port are discovered as targets as well.

Available meta labels:

__meta_kubernetes_namespace: The namespace of the endpoints object.
__meta_kubernetes_endpoints_name: The name of the endpoints object.
For all targets discovered directly from the endpoints list (those not additionally inferred from underlying pods), the following labels are attached:
__meta_kubernetes_endpoint_hostname: Hostname of the endpoint.
__meta_kubernetes_endpoint_node_name: Name of the node hosting the endpoint.
__meta_kubernetes_endpoint_ready: Set to true or false for the endpoint's ready state.
__meta_kubernetes_endpoint_port_name: Name of the endpoint port.
__meta_kubernetes_endpoint_port_protocol: Protocol of the endpoint port.
__meta_kubernetes_endpoint_address_target_kind: Kind of the endpoint address target.
__meta_kubernetes_endpoint_address_target_name: Name of the endpoint address target.
If the endpoints belong to a service, all labels of the role: service discovery are attached.
For all targets backed by a pod, all labels of the role: pod discovery are attached.
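A minimal endpoints-role job, again gated on the (assumed) prometheus.io/scrape annotation of the backing service, might look like:

- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # keep only endpoints whose service is annotated prometheus.io/scrape: "true"
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    regex: "true"
    action: keep
  # record the namespace and service name as regular labels
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    target_label: kubernetes_service_name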

  

5. Ingress

The ingress role discovers a target for each path of every ingress. This is generally useful for blackbox monitoring. The address is set to the host specified in the ingress spec.

Available meta labels:

__meta_kubernetes_namespace: The namespace of the ingress object.
__meta_kubernetes_ingress_name: The name of the ingress object.
__meta_kubernetes_ingress_label_<labelname>: Each label from the ingress object.
__meta_kubernetes_ingress_labelpresent_<labelname>: true for each label from the ingress object.
__meta_kubernetes_ingress_annotation_<annotationname>: Each annotation from the ingress object.
__meta_kubernetes_ingress_annotationpresent_<annotationname>: true for each annotation from the ingress object.
__meta_kubernetes_ingress_scheme: Protocol scheme of ingress, https if TLS config is set. Defaults to http.
__meta_kubernetes_ingress_path: Path from ingress spec. Defaults to /.
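Because the ingress role is mainly useful for blackbox probing, a typical job rewrites the discovered scheme/host/path into a probe target and points __address__ at a blackbox exporter. The exporter address below is an assumption; the relabeling follows the standard Prometheus example configuration:

- job_name: 'kubernetes-ingresses'
  metrics_path: /probe
  params:
    module: [http_2xx]
  kubernetes_sd_configs:
  - role: ingress
  relabel_configs:
  # build the probe target URL from the discovered scheme, host and path
  - source_labels: [__meta_kubernetes_ingress_scheme, __address__, __meta_kubernetes_ingress_path]
    regex: (.+);(.+);(.+)
    replacement: ${1}://${2}${3}
    target_label: __param_target
  # send the probe through a blackbox exporter (address is an assumption)
  - target_label: __address__
    replacement: blackbox-exporter.monitoring:9115
  # keep the probed URL visible as the instance label
  - source_labels: [__param_target]
    target_label: instance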

  

II. Pod Auto-Discovery Configuration

For example, suppose a business microservice runs as a StatefulSet with 2 pod replicas, and each pod exposes metrics at http://pod_ip:7000/metrics. Since the pod IP changes on every restart, auto-discovery is the only practical way to keep collecting its data.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    run: jx3recipe
  name: jx3recipe
  annotations:
    prometheus.io/scrape: "true"
spec:
  selector:
    matchLabels:
      app: jx3recipe
  serviceName: jx3recipe-service
  replicas: 2
  template:
    metadata:
      labels:
        app: jx3recipe
        appCluster: jx3recipe-cluster
    spec:
      terminationGracePeriodSeconds: 20
      containers:
      - image: hub.kce.ooo.com/jx3pvp/jx3recipe:qa-latest
        imagePullPolicy: Always
        securityContext:
          runAsUser: 1000
        name: jx3recipe
        lifecycle:
          preStop:
            exec:
              command: ["kill","-s","SIGINT","1"]
        volumeMounts:
        - name: config-volume
          mountPath: /data/conf.yml
          subPath: conf.yml
        resources:
          requests:
            cpu: "100m"
            memory: "500Mi"
        env:
        - name: JX3PVP_ENV
          value: "qa"
        - name: JX3PVP_RUN_MODE
          value: "k8s"
        - name: JX3PVP_SERVICE_ID
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: JX3PVP_LOCAL_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: JX3PVP_CONSUL_IP
          value: $(CONSUL_AGENT_SERVICE_HOST)
        ports:
            - name: biz
              containerPort: 8000
              protocol: "TCP"
            - name: admin
              containerPort: 7000
              protocol: "TCP"
      volumes:
        - name: config-volume
          configMap:
            name: app-configure-file-jx3recipe
            items:
            - key: jx3recipe.yml
              path: conf.yml
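The StatefulSet references serviceName: jx3recipe-service, which is not shown in the original manifest. A minimal headless Service sketch, inferred from the labels and container ports above (treat it as an assumption, not the author's actual manifest):

apiVersion: v1
kind: Service
metadata:
  name: jx3recipe-service
  labels:
    app: jx3recipe
spec:
  clusterIP: None          # headless, as required for a StatefulSet's governing service
  selector:
    app: jx3recipe
  ports:
  - name: biz
    port: 8000
  - name: admin
    port: 7000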

  

1. Create the Discovery Rules

Define the pod discovery rules in a file named prometheus-additional.yaml:

  • the pod's container name is copied into a label named jx3recipe
  • the pod's appCluster label must match jx3recipe-cluster
  • the pod's address must match the http://.*:7000/metrics pattern

The resulting additional scrape configuration:
- job_name: 'kubernetes-service-pod'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_container_name]
    action: replace
    target_label: jx3recipe
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels:  ["__meta_kubernetes_pod_label_appCluster"]
    regex: "jx3recipe-cluster"
    action: keep
  - source_labels: [__address__]
    action: keep
    regex: '(.*):7000'

  

2. Create the Corresponding Secret Object

kubectl create secret generic additional-configs --from-file=prometheus-additional.yaml -n monitoring

 

Once created, the configuration above is base64-encoded and stored as the value of the prometheus-additional.yaml key.
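You can dump the Secret back out with a standard kubectl command to confirm the encoded content:

kubectl get secret additional-configs -n monitoring -o yaml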

apiVersion: v1
data:
  prometheus-additional.yaml: LSBqb2JfbmFtZTogJ2t1YmVybmV0ZXMtc2VydmljZS1wb2QnCiAga3ViZXJuZXRlc19zZF9jb25maWdzOgogIC0gcm9sZTogcG9kCiAgcmVsYWJlbF9jb25maWdzOgogIC0gc291cmNlX2xhYmVsczogW19fbWV0YV9rdWJlcm5ldGVzX3BvZF9jb250YWluZXJfbmFtZV0KICAgIGFjdGlvbjogcmVwbGFjZQogICAgdGFyZ2V0X2xhYmVsOiBqeDNyZWNpcGUKICAtIGFjdGlvbjogbGFiZWxtYXAKICAgIHJlZ2V4OiBfX21ldGFfa3ViZXJuZXRlc19wb2RfbGFiZWxfKC4rKQogIC0gc291cmNlX2xhYmVsczogIFsiX19tZXRhX2t1YmVybmV0ZXNfcG9kX2xhYmVsX2FwcENsdXN0ZXIiXQogICAgcmVnZXg6ICJqeDNyZWNpcGUtY2x1c3RlciIKICAgIGFjdGlvbjoga2VlcAogIC0gc291cmNlX2xhYmVsczogW19fYWRkcmVzc19fXQogICAgYWN0aW9uOiBrZWVwCiAgICByZWdleDogJyguKik6NzAwMCcK
kind: Secret
metadata:
  creationTimestamp: "2019-09-10T09:32:22Z"
  name: additional-configs
  namespace: monitoring
  resourceVersion: "1004681"
  selfLink: /api/v1/namespaces/monitoring/secrets/additional-configs
  uid: e455d657-d3ad-11e9-95b4-fa163e3c10ff
type: Opaque

Then we just need to add this extra configuration to the manifest that declares the Prometheus resource object (prometheus-prometheus.yaml).

3. Add the Configuration to the Prometheus Resource Object

Edit the prometheus-prometheus.yaml file:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: k8s
  name: k8s
  namespace: monitoring
spec:
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  baseImage: quay.io/prometheus/prometheus
  nodeSelector:
    beta.kubernetes.io/os: linux
  replicas: 2
  secrets:
  - etcd-certs
  resources:
    requests:
      memory: 400Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  additionalScrapeConfigs:
    name: additional-configs
    key: prometheus-additional.yaml
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: v2.5.0

The section added is:

additionalScrapeConfigs:
    name: additional-configs
    key: prometheus-additional.yaml

  

4. Apply the Configuration

kubectl apply -f prometheus-prometheus.yaml

After a short while, refresh the configuration page in the Prometheus UI and the newly added kubernetes-service-pod scrape configuration will appear.
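If the Prometheus UI is not exposed externally, a simple way to reach it locally is a port-forward (prometheus-k8s is the default Service name created by kube-prometheus; adjust it if yours differs), then open http://localhost:9090/config:

kubectl port-forward -n monitoring svc/prometheus-k8s 9090:9090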

 

 

5. Add RBAC Permissions

On the Prometheus dashboard's configuration page we can now see the new configuration, but switching to the targets page shows no corresponding scrape job. Check the Prometheus pod logs:
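For example (the pod and container names follow the Prometheus Operator's default naming for a Prometheus object called k8s; adjust them if yours differ):

kubectl logs -n monitoring prometheus-k8s-0 -c prometheus --tail=100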

 

Many error messages appear, all of the form 'xxx is forbidden', which indicates an RBAC permission problem. From the Prometheus resource object we know that Prometheus is bound to a ServiceAccount named prometheus-k8s, which in turn is bound to a ClusterRole named prometheus-k8s (prometheus-clusterRole.yaml).

Modify it as follows:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get

Update the ClusterRole above and recreate all of the Prometheus pods; the kubernetes-service-pod job should then appear on the targets page.
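A hedged sketch of the commands (the pod names assume the operator's default prometheus-k8s-N naming with replicas: 2):

kubectl apply -f prometheus-clusterRole.yaml
# recreate the Prometheus pods so the new permissions take effect
kubectl delete pod -n monitoring prometheus-k8s-0 prometheus-k8s-1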

At this point, the pod auto-discovery configuration is complete. Other resources (service, endpoints, ingress, node) can be monitored through auto-discovery in the same way.

