1. Monitoring services that expose their own metrics endpoint
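Some applications expose a metrics endpoint themselves. With Prometheus Operator, we can create a ServiceMonitor that matches the application's Service, and the application is then brought under monitoring automatically. Other applications have no Service of their own, or run outside the Kubernetes cluster; for those, we first need to create a Service and an Endpoints object.
This walkthrough uses Prometheus to monitor an etcd cluster from within Kubernetes. In a Kubernetes cluster installed from binaries, etcd does not run inside the cluster, so the first step is to create a Service and an Endpoints object for the etcd cluster.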
etcd-endpoint.yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-etcd-monitoring
  namespace: kube-system
  labels:
    k8s-app: kube-etcd
subsets:
- addresses:
  - ip: 192.168.10.240
  - ip: 192.168.10.241
  - ip: 192.168.10.242
  ports:
  - name: https-metrics
    port: 2379
    protocol: TCP
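The member IPs are the only part of this manifest that changes from one etcd cluster to another. As a minimal sketch, the hypothetical helper below (`gen_etcd_endpoints` is not part of any tool) prints the same Endpoints manifest for whatever IPs are passed to it; the object name, namespace, label, and port are copied from the manifest above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an Endpoints manifest for the given etcd member IPs.
# Name, namespace, label, and port are taken from etcd-endpoint.yaml above.
gen_etcd_endpoints() {
  cat <<'EOF'
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-etcd-monitoring
  namespace: kube-system
  labels:
    k8s-app: kube-etcd
subsets:
- addresses:
EOF
  # One address entry per IP given on the command line.
  local ip
  for ip in "$@"; do
    printf '  - ip: %s\n' "$ip"
  done
  cat <<'EOF'
  ports:
  - name: https-metrics
    port: 2379
    protocol: TCP
EOF
}
```

For example, `gen_etcd_endpoints 192.168.10.240 192.168.10.241 192.168.10.242 > etcd-endpoint.yaml` would reproduce the file used in this walkthrough.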
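Next, create the Service for etcd. Its name must match the name of the Endpoints object: the Service has no selector, so the shared name (in the same namespace) is what links it to those endpoints.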
etcd-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kube-etcd-monitoring
  namespace: kube-system
  labels:
    k8s-app: kube-etcd
spec:
  ports:
  - port: 2379
    name: https-metrics
    protocol: TCP
  type: ClusterIP
Create both objects:
kubectl create -f .
Verify:
# kubectl get svc,ep -n kube-system -l k8s-app=kube-etcd
NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/kube-etcd-monitoring   ClusterIP   10.111.191.23   <none>        2379/TCP   4m53s

NAME                             ENDPOINTS                                                     AGE
endpoints/kube-etcd-monitoring   192.168.10.240:2379,192.168.10.241:2379,192.168.10.242:2379   4m53s
Test the endpoint through the Service's cluster IP. Connecting to etcd requires a client certificate:
# curl --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem https://10.111.191.23:2379/metrics -k |more
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
# HELP etcd_cluster_version Which version is running. 1 for 'cluster_version' label with current cluster version
# TYPE etcd_cluster_version gauge
etcd_cluster_version{cluster_version="3.4"} 1
# HELP etcd_debugging_auth_revision The current revision of auth store.
# TYPE etcd_debugging_auth_revision gauge
etcd_debugging_auth_revision 1
# HELP etcd_debugging_disk_backend_commit_rebalance_duration_seconds The latency distributions of commit.rebalance called by bboltdb backend.
# TYPE etcd_debugging_disk_backend_commit_rebalance_duration_seconds histogram
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_bucket{le="0.001"} 152605
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_bucket{le="0.002"} 152609
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_bucket{le="0.004"} 152610
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_bucket{le="0.008"} 152610
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_bucket{le="0.016"} 152612
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_bucket{le="0.032"
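The full metrics output is long, so it can be easier to check a single metric than to page through everything. The sketch below defines a hypothetical filter, `etcd_metric` (a name introduced here for illustration), that reads Prometheus text-format output on stdin and keeps only the sample lines of one metric, dropping the `# HELP` and `# TYPE` comments.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the sample lines of the metric named in $1.
# A sample line starts with the metric name followed by "{" (labels) or a space (no labels),
# so "# HELP"/"# TYPE" lines and longer names such as <name>_bucket are not matched.
etcd_metric() {
  local name="$1"
  grep -E "^${name}(\{| )"
}
```

It could be combined with the curl command above, adding `-s` to suppress curl's progress meter: `curl -s --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem -k https://10.111.191.23:2379/metrics | etcd_metric etcd_cluster_version`.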
2. Modify Prometheus
Because etcd requires a client certificate, the etcd certificates must be mounted into Prometheus so that it is allowed to scrape etcd.
Create a secret:
# kubectl create secret generic etcd-cert -n monitoring --from-file=/etc/etcd/ssl/etcd-ca.pem --from-file=/etc/etcd/ssl/etcd.pem --from-file=/etc/etcd/ssl/etcd-key.pem
secret/etcd-cert created
Edit the Prometheus manifest:
$path/kube-prometheus/manifests/prometheus-prometheus.yaml
...
  podMonitorSelector: {}
  probeNamespaceSelector: {}
  probeSelector: {}
  secrets:
  - etcd-cert
  replicas: 1
  resources:
    requests:
      memory: 300Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
...
- New setting: secrets
- By default, the etcd certificates are mounted at /etc/prometheus/secrets/etcd-cert
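Once the modified manifest has been applied and the Prometheus pod has restarted, it may be worth confirming that the certificates really are present at that path. The sketch below wraps the check in a hypothetical function, `check_etcd_cert_mount`. The pod name `prometheus-k8s-0` and the container name `prometheus` are assumptions based on kube-prometheus defaults; they are not stated in this walkthrough, so adjust them to your deployment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the files of the etcd-cert secret inside a Prometheus pod.
# Assumptions: pod "prometheus-k8s-0" (override with $1) and container "prometheus".
check_etcd_cert_mount() {
  local pod="${1:-prometheus-k8s-0}"
  kubectl exec -n monitoring "$pod" -c prometheus -- \
    ls /etc/prometheus/secrets/etcd-cert
}
```

Since the secret was created with `--from-file` and no explicit key names, the listing should contain etcd-ca.pem, etcd.pem, and etcd-key.pem.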
3. Create the ServiceMonitor
prometheus-serviceMonitorEtcd.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-etcd
  name: kube-etcd
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    port: https-metrics
    scheme: https
    tlsConfig:
      caFile: /etc/prometheus/secrets/etcd-cert/etcd-ca.pem
      certFile: /etc/prometheus/secrets/etcd-cert/etcd.pem
      keyFile: /etc/prometheus/secrets/etcd-cert/etcd-key.pem
      insecureSkipVerify: true
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: kube-etcd
- k8s-app: the value under selector.matchLabels must match the label on the etcd Service
Create it:
kubectl create -f prometheus-serviceMonitorEtcd.yaml
Check the result in Prometheus
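Besides the web UI, the scrape status can also be read from the Prometheus HTTP API. The sketch below defines a hypothetical function, `etcd_up_query`, that sends the PromQL query `up{job="kube-etcd"}` to the `/api/v1/query` endpoint. The job name follows from `jobLabel: k8s-app` in the ServiceMonitor, which takes the job name from the Service's `k8s-app` label. The base URL is an assumption and must point at wherever your Prometheus is reachable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: query Prometheus for the "up" series of the etcd job.
# $1 is the Prometheus base URL (an assumption, e.g. http://localhost:9090 behind a port-forward).
etcd_up_query() {
  local base_url="$1"
  # -G sends the URL-encoded data as a query string on a GET request.
  curl -sG "${base_url}/api/v1/query" \
    --data-urlencode 'query=up{job="kube-etcd"}'
}
```

For example, if Prometheus has been made reachable locally on port 9090, `etcd_up_query http://localhost:9090` returns a JSON document with one result per etcd member; a value of 1 means the last scrape of that member succeeded.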