Prometheus Monitoring on Kubernetes


1. Architecture Diagram

2. Installing Prometheus

2.1 Available installation methods

  • Binary installation         # generally used when installing on physical machines
  • Container installation
  • helm installation           # these last three are all for k8s (a helm sketch follows this list)
  • prometheus operator
  • kube-prometheus stack       # a project stack bundling: prometheus operator, highly available prometheus, highly available alertmanager, node exporter for host monitoring, grafana, and more
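For reference, a minimal helm-based install looks roughly like the following (the repo URL and chart name are those published by the prometheus-community project; the release name "monitoring" is just a placeholder):

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack -n monitoring --create-namespace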

2.2 Install with the kube-prometheus stack. The project's README lists which k8s versions each release supports; if your k8s version is fairly new, downloading the latest release will generally work.

https://github.com/prometheus-operator/kube-prometheus/

 

2.3 Download the matching release

git clone -b release-0.7 https://github.com/prometheus-operator/kube-prometheus.git

2.4 Install the CRDs (custom resource definitions)

# cd kube-prometheus/manifests/
# kubectl create -f setup/
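Before continuing, you can confirm the CRDs registered; ServiceMonitor, Prometheus, Alertmanager, and the other operator resources all live under the monitoring.coreos.com API group:

# kubectl get crd | grep monitoring.coreos.com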

2.5 Check the operator status

# kubectl get pod -n monitoring | grep operator
prometheus-operator-7649c7454f-pkbbl        2/2     Running   0          3m

2.6 Adjust the Alertmanager replica count as needed (it defaults to 3 for high availability)

# vim alertmanager-alertmanager.yaml 
  replicas: 1

2.7 Adjust the Prometheus replica count as needed (it defaults to 2)

# vim prometheus-prometheus.yaml 
  replicas: 1

2.8 Replace the images if necessary; the defaults cannot always be pulled directly, and substitutes can be found on Docker Hub

# cat kube-state-metrics-deployment.yaml | grep image
        image: quay.io/coreos/kube-state-metrics:v1.9.7
        image: quay.io/brancz/kube-rbac-proxy:v0.8.0
        image: quay.io/brancz/kube-rbac-proxy:v0.8.0

2.9 Create the Prometheus cluster

# kubectl create -f .

2.10 Change the prometheus and grafana web Services to NodePort for access. Because no PVC is configured, the data is not persistent; production environments should configure persistent storage for both.

# kubectl edit svc -n monitoring prometheus-k8s 
  ports: # add the type field at the very bottom
  type: NodePort

# kubectl edit svc -n monitoring grafana
  type: NodePort
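The same change can be made non-interactively with kubectl patch, equivalent to the edits above:

# kubectl patch svc prometheus-k8s -n monitoring -p '{"spec":{"type":"NodePort"}}'
# kubectl patch svc grafana -n monitoring -p '{"spec":{"type":"NodePort"}}'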

2.11 Once this is configured, the UIs can be accessed via any node IP plus the NodePort

# kubectl get svc -n monitoring | egrep "grafana|prometheus-k8s"
grafana                 NodePort    10.107.73.70     <none>        3000:32351/TCP               1d
prometheus-k8s          NodePort    10.101.129.206   <none>        9090:32021/TCP               1d

3. What Is a ServiceMonitor?

Binary, container, and helm installations load their configuration from prometheus.yml.

prometheus operator and the kube-prometheus stack instead discover scrape targets through ServiceMonitors. A ServiceMonitor is a way of collecting metrics via a Service:

  • prometheus-operator uses ServiceMonitors to automatically identify Services carrying certain labels and scrape metrics from those Services.
  • ServiceMonitors themselves are also discovered automatically by prometheus-operator.
# kubectl get servicemonitor -n monitoring node-exporter -o yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
...
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: node-exporter

# kubectl get svc -n monitoring -l app.kubernetes.io/name=node-exporter 
NAME            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
node-exporter   ClusterIP   None         <none>        9100/TCP   8d
# kubectl get ep -n monitoring node-exporter -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/version: v1.0.1
    service.kubernetes.io/headless: ""
  name: node-exporter
  namespace: monitoring
  selfLink: /api/v1/namespaces/monitoring/endpoints/node-exporter
subsets:
- addresses:
  - ip: 192.168.0.21
    nodeName: k8s-master
    targetRef:
      kind: Pod
      name: node-exporter-96jmq
      namespace: monitoring
      resourceVersion: "8821390"
      uid: ba368321-1c3f-483a-a747-1e1c7b709b65
  - ip: 192.168.0.25
    nodeName: k8s-node1
    targetRef:
      kind: Pod
      name: node-exporter-qqzl2
      namespace: monitoring
      resourceVersion: "8821365"
      uid: 5daf9ff7-c120-4fcc-8412-da243c1224ce
  ports:
  - name: https
    port: 9100
    protocol: TCP

Configuration walkthrough

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: etcd-k8s
  namespace: monitoring
  labels:
    app: etcd-k8s
spec:
  jobLabel: etcd-k8s
  endpoints:
    - interval: 30s
      port: etcd-port  # metrics port, i.e. Service.spec.ports.name
      scheme: https    # scheme of the metrics endpoint, http or https
      tlsConfig:
        caFile: /etc/prometheus/secrets/etcd-ssl/etcd-ca.pem   # certificate paths (as seen inside the prometheus pod)
        certFile: /etc/prometheus/secrets/etcd-ssl/etcd.pem
        keyFile: /etc/prometheus/secrets/etcd-ssl/etcd-key.pem
        insecureSkipVerify: true  # disable certificate verification
  selector:
    matchLabels:
      app: etcd-k8s  # labels of the target svc
  namespaceSelector:
    matchNames:
    - kube-system    # namespace of the target svc
This matches Services in the kube-system namespace that carry the label app=etcd-k8s; jobLabel names the label from which the job name is taken. Because the serverName may not match the certificate etcd was issued, insecureSkipVerify: true is added so the server-side certificate is no longer verified.
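For intuition, the operator renders a ServiceMonitor like this into an ordinary prometheus scrape job. Below is a simplified sketch of what the generated configuration roughly looks like (the real output also contains many relabel_configs; the job_name format namespace/name/endpoint-index follows the operator's convention):

- job_name: kube-system/etcd-k8s/0
  scrape_interval: 30s
  scheme: https
  tls_config:
    ca_file: /etc/prometheus/secrets/etcd-ssl/etcd-ca.pem
    cert_file: /etc/prometheus/secrets/etcd-ssl/etcd.pem
    key_file: /etc/prometheus/secrets/etcd-ssl/etcd-key.pem
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system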

The Prometheus monitoring flow

4. Monitoring a Cloud-Native Application: etcd

4.1 Test the etcd metrics endpoint locally (this cluster was installed with kubeadm)

# grep -E "key-file|cert-file" /etc/kubernetes/manifests/etcd.yaml 
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --key-file=/etc/kubernetes/pki/etcd/server.key
# curl -s --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key https://192.168.0.21:2379/metrics -k | tail -3
promhttp_metric_handler_requests_total{code="200"} 2
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0

4.2 Create a Service and Endpoints for etcd

# cat etcd.yaml 
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    app: etcd-k8s
  name: etcd-k8s
  namespace: kube-system
subsets:
- addresses:     # host IPs of the etcd nodes; add one entry per node
  - ip: 192.168.0.21
  ports:
  - name: etcd-port
    port: 2379   # etcd port
    protocol: TCP
---
apiVersion: v1
kind: Service 
metadata:
  labels:
    app: etcd-k8s
  name: etcd-k8s
  namespace: kube-system
spec:
  ports:
  - name: etcd-port
    port: 2379
    protocol: TCP
    targetPort: 2379
  type: ClusterIP

# kubectl create -f etcd.yaml

# kubectl get svc -n kube-system -l app=etcd-k8s     # find the svc IP, then rerun the earlier test against the svc address
NAME       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
etcd-k8s   ClusterIP   10.110.151.13   <none>        2379/TCP   74s
# curl -s --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key https://10.110.151.13:2379/metrics -k | tail -3
promhttp_metric_handler_requests_total{code="200"} 5
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0

4.3 Store the etcd certificates in a Secret so prometheus can mount them. Since prometheus is the one calling etcd, the Secret must live in the same namespace as prometheus.

kubectl create secret generic etcd-ssl --from-file=/etc/kubernetes/pki/etcd/server.crt --from-file=/etc/kubernetes/pki/etcd/server.key --from-file=/etc/kubernetes/pki/etcd/ca.crt -n monitoring
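You can check that all three files made it into the Secret before mounting it:

# kubectl describe secret etcd-ssl -n monitoring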

4.4 Mount the Secret into the prometheus pods

# kubectl edit prometheus k8s -n monitoring
  replicas: 2
  secrets:
  - etcd-ssl # add the Secret name; after saving and exiting, the prometheus pods will restart
# kubectl get pod -n monitoring | grep prometheus-k8s
prometheus-k8s-0                       2/2     Running   1          46s
prometheus-k8s-1                       2/2     Running   1          54s

# kubectl exec -it prometheus-k8s-0 -n monitoring -- sh      # check that the mount succeeded
/prometheus $ ls /etc/prometheus/secrets/etcd-ssl/
ca.crt  server.crt  server.key

4.5 Create a ServiceMonitor to load the Service's scrape configuration into Prometheus

# cat etcd-servicemonitor.yaml 
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: etcd-k8s
  namespace: monitoring
  labels:
    app: etcd-k8s
spec:
  jobLabel: app
  endpoints:
    - interval: 30s
      port: etcd-port  # the port name defined in the svc (see kubectl get svc -n kube-system etcd-k8s -o yaml)
      scheme: https
      tlsConfig:
        caFile: /etc/prometheus/secrets/etcd-ssl/ca.crt
        certFile: /etc/prometheus/secrets/etcd-ssl/server.crt
        keyFile: /etc/prometheus/secrets/etcd-ssl/server.key
        insecureSkipVerify: true  # disable certificate verification
  selector:
    matchLabels:
      app: etcd-k8s  # must match the labels of the target svc
  namespaceSelector:
    matchNames:
    - kube-system    # must match the namespace of the svc
# kubectl create -f etcd-servicemonitor.yaml


4.6 Check the result in the web UI
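Under Status -> Targets, the etcd-k8s job should show as UP, and standard etcd metrics should return data, for example:

etcd_server_has_leader
etcd_server_leader_changes_seen_total
etcd_disk_wal_fsync_duration_seconds_bucket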

4.7 Import a Grafana dashboard

https://grafana.com/grafana/dashboards/3070

5. Monitoring Non-Cloud-Native Applications with an Exporter

Our MySQL instance is not deployed inside k8s; here prometheus is used to monitor a MySQL server outside the k8s cluster.
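Before deploying the exporter, MySQL needs an account the exporter can log in as. A minimal sketch, run on the MySQL server itself (the user name and password are assumptions chosen to match the DATA_SOURCE_NAME in the Deployment below; mysqld-exporter needs at least PROCESS, REPLICATION CLIENT, and SELECT):

mysql -uroot -p -e "CREATE USER 'exporter'@'%' IDENTIFIED BY 'childe12#'; GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'%'; FLUSH PRIVILEGES;"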

5.1 Create a mysql-exporter Deployment to expose the MySQL metrics

# cat mysql-exporter.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-exporter-deployment
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql-exporter
  template:
    metadata:
      labels:
        app: mysql-exporter
    spec:
      containers:
      - name: mysql-exporter
        imagePullPolicy: IfNotPresent
        image: prom/mysqld-exporter
        env:
        - name: DATA_SOURCE_NAME    # DSN format: user:password@(host:port)/
          value: "exporter:childe12#@(192.168.0.247:3306)/"
        ports:
        - containerPort: 9104
        resources:
          requests:
            cpu: 500m
            memory: 1024Mi
          limits:
            cpu: 1000m
            memory: 2048Mi
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-exporter
  namespace: monitoring
  labels:
    app: mysql-exporter
spec:
  type: ClusterIP
  selector:
    app: mysql-exporter
  ports:
  - name: mysql
    port: 9104
    targetPort: 9104
    protocol: TCP

# kubectl create -f mysql-exporter.yaml
# kubectl get svc -n monitoring | grep mysql-exporter
mysql-exporter          ClusterIP   10.102.205.21    <none>        9104/TCP                     98m
# curl -s 10.102.205.21:9104/metrics | tail -1            # the MySQL metrics must be reachable via the svc address before continuing
promhttp_metric_handler_requests_total{code="503"} 0

5.2 Create the ServiceMonitor

# cat mysql-servicemonitor.yaml 
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mysql-exporter 
  namespace: monitoring
  labels:
    app: mysql-exporter
spec:
  jobLabel: mysql-monitoring
  endpoints:
    - interval: 30s
      port: mysql          # the port name defined in the svc
      scheme: http
  selector:
    matchLabels:
      app: mysql-exporter  # must match the labels of the svc
  namespaceSelector:
    matchNames:
    - monitoring           # must match the namespace of the svc

# kubectl create -f mysql-servicemonitor.yaml

5.3 View the data in prometheus
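If scraping works, mysqld-exporter's standard metrics should return data, for example:

mysql_up
mysql_global_status_threads_connected
mysql_global_status_queries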

 

5.4 Troubleshooting when monitoring fails

  1. Confirm the ServiceMonitor was created successfully
  2. Confirm the ServiceMonitor's labels are matched correctly
  3. Confirm the corresponding configuration was generated in Prometheus (see the sketch after this list)
  4. Confirm the ServiceMonitor can actually match the Service (my labels didn't match once and it took a long time to track down)
  5. Confirm the /metrics endpoint is reachable through the Service
  6. Confirm the Service's port and scheme are consistent with the ServiceMonitor's
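For step 3, the configuration Prometheus actually loaded can be inspected through its HTTP API (a sketch using port-forward; the NodePort from section 2.11 works just as well):

# kubectl port-forward -n monitoring svc/prometheus-k8s 9090:9090 &
# curl -s http://localhost:9090/api/v1/status/config | grep etcd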

6. Configuring via an Additional Static Configuration File

touch prometheus-additional.yaml

kubectl create secret generic additional-configs --from-file=prometheus-additional.yaml -n monitoring

# kubectl describe secret additional-configs -n monitoring
Name:         additional-configs
Namespace:    monitoring
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
prometheus-additional.yaml:  0 bytes

# kubectl edit prometheus -n monitoring k8s
spec:
  additionalScrapeConfigs:
    key: prometheus-additional.yaml
    name: additional-configs
    optional: true
# Now write the actual scrape configuration into the file

# cat prometheus-additional.yaml
- job_name: 'node'
  static_configs:
  - targets: ['192.168.0.26:9100']

# Hot-reload by replacing the Secret
# kubectl create secret generic additional-configs --from-file=prometheus-additional.yaml --dry-run=client -o yaml | kubectl replace -f - -n monitoring

# Verify the configuration
# kubectl get secret -n monitoring additional-configs -oyaml
apiVersion: v1
data:
  prometheus-additional.yaml: LSBqb2JfbmFtZTogJ25vZGUnCiAgc3RhdGljX2NvbmZpZ3M6CiAgLSB0YXJnZXRzOiBbJzE5Mi4xNjguMC4yNjo5MTAwJ10K
kind: Secret

# echo "LSBqb2JfbmFtZTogJ25vZGUnCiAgc3RhdGljX2NvbmZpZ3M6CiAgLSB0YXJnZXRzOiBbJzE5Mi4xNjguMC4yNjo5MTAwJ10K" | base64 -d
- job_name: 'node'
  static_configs:
  - targets: ['192.168.0.26:9100']
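Once the config-reloader sidecar picks up the replaced Secret (usually within a minute), the new node job should appear under Status -> Targets; it can also be checked through the targets API (a sketch, reusing the NodePort from section 2.11):

# curl -s http://192.168.0.21:32021/api/v1/targets | grep -o '"scrapePool":"[^"]*"'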

 

