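Monitoring Kubernetes Nodes with Prometheus
The NodeExporter project from the Prometheus community monitors key host metrics. With a Kubernetes DaemonSet, exactly one NodeExporter instance can be deployed on each host node to collect host performance metrics. However, because of container isolation, a containerized NodeExporter cannot correctly read the host's disk information, so this course deploys NodeExporter directly on the host.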
node_exporter: a collector written in Go for monitoring *NIX systems
- Usage guide: https://prometheus.io/docs/guides/node-exporter/
- GitHub: https://github.com/prometheus/node_exporter
- Exporter list: https://prometheus.io/docs/instrumenting/exporters/
Official documentation: https://github.com/kubernetes/kube-state-metrics
The main metrics collected by node-exporter are:

node_cpu_* node_disk_* node_entropy_* node_filefd_* node_filesystem_* node_forks_* node_intr_total_* node_ipvs_* node_load_* node_memory_* node_netstat_* node_network_* node_nf_conntrack_* node_scrape_* node_sockstat_* node_time_seconds_* node_timex_* node_xfs_*
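To make the family prefixes above concrete, here is a minimal sketch that groups samples from a metrics page by their `node_<subsystem>` prefix. The sample text is a hand-written excerpt, not real output from any host, and `metric_families` is a hypothetical helper name introduced only for illustration.

```python
from collections import Counter

# Hand-written excerpt in the Prometheus text exposition format.
# The values are made up; only the metric names follow node_exporter naming.
SAMPLE = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
node_cpu_seconds_total{cpu="0",mode="user"} 56.7
node_memory_MemFree_bytes 1.0e+09
node_memory_MemTotal_bytes 4.0e+09
node_filesystem_avail_bytes{device="/dev/sda1",mountpoint="/"} 2.5e+10
"""


def metric_families(text):
    """Count samples per family prefix, e.g. node_cpu or node_memory."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the HELP/TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first '{' (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        family = "_".join(name.split("_")[:2])
        counts[family] += 1
    return dict(counts)


print(metric_families(SAMPLE))
```

Grouping by the first two name components is only a rough approximation of the families listed above; it is meant to show how the prefixes map onto individual series.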
Configuration file
The modified configuration file
- Prometheus configuration file: prometheus-configmap.yaml
# Prometheus configuration format https://prometheus.io/docs/prometheus/latest/configuration/configuration/
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  # Holds the Prometheus configuration file
  prometheus.yml: |
    # Scrape targets
    scrape_configs:
    - job_name: prometheus
      static_configs:
      - targets:
        # Scrape Prometheus itself
        - localhost:9090

    - job_name: kubernetes-nodes
      static_configs:
      - targets:
        # Scrape node_exporter on each host
        - 192.168.1.110:9100
        - 192.168.1.111:9100

    # Scrape: API server liveness metrics
    # The job is named kubernetes-apiservers
    - job_name: kubernetes-apiservers
      # Kubernetes service discovery
      kubernetes_sd_configs:
      - role: endpoints
      # Relabel the discovered targets
      relabel_configs:
      # Keep only targets whose labels match the regex
      - action: keep
        # Matches namespace;service name;port name
        regex: default;kubernetes;https
        source_labels:
        - __meta_kubernetes_namespace
        - __meta_kubernetes_service_name
        - __meta_kubernetes_endpoint_port_name
      # Use HTTPS (the default is HTTP)
      scheme: https
      tls_config:
        # CA certificate Prometheus uses to access the API server
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        # Skip HTTPS certificate verification
        insecure_skip_verify: true
      # Token Prometheus uses to authenticate to the API server
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

    # Scrape: kubelet liveness metrics
    - job_name: kubernetes-nodes-kubelet
      kubernetes_sd_configs:
      # Discover every Node in the cluster
      - role: node
      relabel_configs:
      # Copy the Node labels matched by the regex onto the target
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

    # Scrape: cAdvisor metrics from the nodes
    - job_name: kubernetes-nodes-cadvisor
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      # Rewrite the metrics path label
      - target_label: __metrics_path__
        replacement: /metrics/cadvisor
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

    # Scrape: service endpoints
    - job_name: kubernetes-service-endpoints
      # Select targets
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - action: keep
        regex: true
        # Source label
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_scrape
      - action: replace
        regex: (https?)
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_scheme
        # Label that receives the result
        target_label: __scheme__
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_path
        target_label: __metrics_path__
      - action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        source_labels:
        - __address__
        - __meta_kubernetes_service_annotation_prometheus_io_port
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: kubernetes_name

    # Scrape: Kubernetes services
    - job_name: kubernetes-services
      kubernetes_sd_configs:
      - role: service
      # Black-box probing: check whether the IP and port are reachable
      metrics_path: /probe
      params:
        module:
        - http_2xx
      relabel_configs:
      - action: keep
        regex: true
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_probe
      - source_labels:
        - __address__
        target_label: __param_target
      # Send the probe through blackbox
      - replacement: blackbox
        target_label: __address__
      - source_labels:
        - __param_target
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - source_labels:
        - __meta_kubernetes_service_name
        target_label: kubernetes_name

    # Scrape: Kubernetes pods
    - job_name: kubernetes-pods
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - action: keep
        regex: true
        source_labels:
        # Keep only pods annotated for scraping
        - __meta_kubernetes_pod_annotation_prometheus_io_scrape
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_path
        target_label: __metrics_path__
      - action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        source_labels:
        # Scrape address
        - __address__
        # Scrape port
        - __meta_kubernetes_pod_annotation_prometheus_io_port
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: kubernetes_pod_name

    # Alerting configuration
    alerting:
      alertmanagers:
      - kubernetes_sd_configs:
          # Discover Alertmanager dynamically
          - role: pod
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace]
          regex: kube-system
          action: keep
        - source_labels: [__meta_kubernetes_pod_label_k8s_app]
          regex: alertmanager
          action: keep
        - source_labels: [__meta_kubernetes_pod_container_port_number]
          regex:
          action: drop
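The `replace` rule with `regex: ([^:]+)(?::\d+)?;(\d+)` is the least obvious part of the configuration above, so here is a rough Python imitation of what it does to `__address__`. It is only a sketch: `relabel_address` is a hypothetical helper, the addresses are made up, and the real relabeling happens inside Prometheus, which joins the source labels with `;` and anchors the regex to the whole string.

```python
import re

# Same pattern as in the relabel rule: host, optional ":port", ';', new port.
ADDRESS_RULE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def relabel_address(address, port_annotation):
    """Imitate the rule that swaps the port in __address__ for the annotated one."""
    # Prometheus concatenates the source label values with ';' by default.
    joined = f"{address};{port_annotation}"
    # Relabel regexes must match the entire joined value.
    match = ADDRESS_RULE.fullmatch(joined)
    if match is None:
        # When the regex does not match, 'replace' leaves the label unchanged.
        return address
    # Equivalent of 'replacement: $1:$2'.
    return f"{match.group(1)}:{match.group(2)}"


# Made-up pod address with the prometheus.io/port annotation set to 9100.
print(relabel_address("10.244.1.7:8080", "9100"))
# The same pod without the annotation keeps its discovered address.
print(relabel_address("10.244.1.7:8080", ""))
```

The second call illustrates why pods without a port annotation are still scraped on the address found by service discovery.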
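Node deployment: node_exporter
1. Apply the configuration file
kubectl apply -f prometheus-configmap.yaml
2. Check that it has taken effect
3. Use Grafana dashboard template 9276
4. Select the group
5. The node information is displayed (if nothing is shown, adjust the dashboard to your own environment)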
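One possible way to carry out step 2 is to ask Prometheus which targets of the `kubernetes-nodes` job are unhealthy through its `/api/v1/targets` HTTP endpoint. The sketch below assumes Prometheus is reachable at a base URL that you supply; `down_targets` and `fetch_targets` are hypothetical helper names, and the sample payload is hand-written to mirror the shape of the API response rather than captured from a real cluster.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url):
    """Fetch the target list from a Prometheus server (base_url is an assumption)."""
    with urlopen(f"{base_url}/api/v1/targets") as resp:
        return json.load(resp)


def down_targets(payload, job):
    """Return the scrape URLs of active targets in `job` that are not healthy."""
    return [
        target["scrapeUrl"]
        for target in payload["data"]["activeTargets"]
        if target["labels"].get("job") == job and target["health"] != "up"
    ]


# Hand-written sample in the shape of an /api/v1/targets response.
SAMPLE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "prometheus"},
             "scrapeUrl": "http://localhost:9090/metrics", "health": "up"},
            {"labels": {"job": "kubernetes-nodes"},
             "scrapeUrl": "http://192.168.1.110:9100/metrics", "health": "up"},
            {"labels": {"job": "kubernetes-nodes"},
             "scrapeUrl": "http://192.168.1.111:9100/metrics", "health": "down"},
        ]
    },
}

print(down_targets(SAMPLE, "kubernetes-nodes"))
```

Against a live server the call would look like `down_targets(fetch_targets("http://localhost:9090"), "kubernetes-nodes")`, where the URL is a placeholder for wherever your Prometheus service is exposed. An empty list suggests that both node_exporter targets are being scraped.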