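[Problem] kube-prometheus cannot monitor resources in a custom namespace
Background: several services have JMX monitoring enabled, and a new Service was created to select the pods that expose it. The Service matches the label jmx=prometheus and lives in the custom namespace jmbymt. Its Endpoints look normal, and metrics can be fetched through it.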
    [root@ymt36 tmo]# kubectl -n jmbymt describe svc jmxprometheus
    Name:              jmxprometheus
    Namespace:         jmbymt
    Labels:            jmx=prometheus
    Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                         {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"jmx":"prometheus"},"name":"jmxprometheus","namespace":"jmbymt"...
    Selector:          jmx=prometheus
    Type:              ClusterIP
    IP:                10.87.234.23
    Port:              jmx  8013/TCP
    TargetPort:        8013/TCP
    Endpoints:         10.20.234.148:8013,10.20.234.149:8013,10.20.234.164:8013 + 7 more...
    Session Affinity:  None
    Events:            <none>
    [root@ymt36 tmo]# curl http://10.87.234.23:8013/metrics
    # HELP jmx_exporter_build_info A metric with a constant '1' value labeled with the version of the JMX exporter.
    # TYPE jmx_exporter_build_info gauge
    jmx_exporter_build_info{version="0.13.0",name="jmx_prometheus_javaagent",} 1.0
    # HELP jvm_gc_collection_seconds Time spent in a given JVM garbage collector in seconds.
    # TYPE jvm_gc_collection_seconds summary
    jvm_gc_collection_seconds_count{gc="Copy",} 8180.0
    jvm_gc_collection_seconds_sum{gc="Copy",} 70.006
    jvm_gc_collection_seconds_count{gc="MarkSweepCompact",} 5.0
    jvm_gc_collection_seconds_sum{gc="MarkSweepCompact",} 0.337
    # HELP jmx_config_reload_failure_total Number of times configuration have failed to be reloaded.
    # TYPE jmx_config_reload_failure_total counter
    jmx_config_reload_failure_total 0.0
    # HELP jvm_classes_loaded The number of classes that are currently loaded in the JVM
    # TYPE jvm_classes_loaded gauge
    jvm_classes_loaded 12033.0
    # HELP jvm_classes_loaded_total The total number of classes that have been loaded since the JVM has started execution
    # TYPE jvm_classes_loaded_total counter
    jvm_classes_loaded_total 12087.0
    # HELP jvm_classes_unloaded_total The total number of classes that have been unloaded since the JVM has started execution
    # TYPE jvm_classes_unloaded_total counter
    jvm_classes_unloaded_total 54.0
    # HELP jvm_memory_bytes_used Used bytes of a given JVM memory area.
    # TYPE jvm_memory_bytes_used gauge
    jvm_memory_bytes_used{area="heap",} 1.30044704E8
    jvm_memory_bytes_used{area="nonheap",} 1.22438832E8
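For reference, a Service consistent with the describe output above might look like the sketch below. It is reconstructed only from the fields that output shows (name, namespace, label, selector, port); it is not the author's actual manifest, since the last-applied-configuration annotation is truncated. It assumes the monitored pods carry the label jmx=prometheus and serve the JMX exporter on port 8013.

```yaml
# Sketch reconstructed from the `kubectl describe svc` output; not the original manifest.
apiVersion: v1
kind: Service
metadata:
  name: jmxprometheus
  namespace: jmbymt
  labels:
    jmx: prometheus        # label that the Prometheus keep rule matches on
spec:
  type: ClusterIP
  selector:
    jmx: prometheus        # assumed to be set on every pod that enables JMX monitoring
  ports:
  - name: jmx              # port name that the Prometheus keep rule matches on
    port: 8013
    targetPort: 8013
    protocol: TCP
```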
Add a Prometheus scrape job for it:
    - job_name: jmxprometheus
      honor_labels: false
      kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names:
          - jmbymt
      scrape_interval: 30s
      relabel_configs:
      - action: keep
        source_labels:
        - __meta_kubernetes_service_label_jmx
        regex: prometheus
      - action: keep
        source_labels:
        - __meta_kubernetes_endpoint_port_name
        regex: jmx
      - source_labels:
        - __meta_kubernetes_endpoint_address_target_kind
        - __meta_kubernetes_endpoint_address_target_name
        separator: ;
        regex: Node;(.*)
        replacement: ${1}
        target_label: node
      - source_labels:
        - __meta_kubernetes_endpoint_address_target_kind
        - __meta_kubernetes_endpoint_address_target_name
        separator: ;
        regex: Pod;(.*)
        replacement: ${1}
        target_label: pod
      - source_labels:
        - __meta_kubernetes_namespace
        target_label: namespace
      - source_labels:
        - __meta_kubernetes_service_name
        target_label: service
      - source_labels:
        - __meta_kubernetes_pod_name
        target_label: pod
      - source_labels:
        - __meta_kubernetes_service_name
        target_label: job
        replacement: ${1}
      - source_labels:
        - __meta_kubernetes_service_label_jmx
        target_label: job
        regex: (.+)
        replacement: ${1}
      - target_label: endpoint
        replacement: jmx
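The post does not say how this job was added to Prometheus. If the configuration is managed through the Prometheus Operator, as is usual with kube-prometheus, a ServiceMonitor along the following lines would be expected to produce a similar job. This is a sketch under assumptions: the object name and the choice to place it in the monitoring namespace are hypothetical, and it only takes effect if the Prometheus resource's ServiceMonitor selectors match it.

```yaml
# Sketch only: the name and namespace of this object are assumptions, not from the post.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: jmxprometheus          # hypothetical name
  namespace: monitoring        # assumed; must be a namespace the Prometheus resource selects from
spec:
  jobLabel: jmx                # take the job label from the Service label "jmx"
  namespaceSelector:
    matchNames:
    - jmbymt                   # discover Services in the custom namespace
  selector:
    matchLabels:
      jmx: prometheus          # same label the keep rule filters on
  endpoints:
  - port: jmx                  # named Service port
    interval: 30s
```

Either way, discovery runs under the prometheus-k8s service account, so the same RBAC permissions on the jmbymt namespace are needed.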
The Prometheus logs show these errors:
    [root@cicd ~]# kubectl -n monitoring logs -f -l prometheus=k8s --all-containers=true --max-log-requests=2
    error: you are attempting to follow 6 log streams, but maximum allowed concurrency is 2, use --max-log-requests to increase the limit
    [root@cicd ~]# kubectl -n monitoring logs -f -l prometheus=k8s -c prometheus --max-log-requests=2
    level=error ts=2020-12-01T08:22:28.047Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:263: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"jmbymt\""
    level=error ts=2020-12-01T08:22:29.047Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:265: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"jmbymt\""
    level=error ts=2020-12-01T08:22:29.049Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:264: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"jmbymt\""
    level=error ts=2020-12-01T08:22:29.049Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:263: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"jmbymt\""
    level=error ts=2020-12-01T08:22:30.050Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:265: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"jmbymt\""
    level=error ts=2020-12-01T08:22:30.051Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:264: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"jmbymt\""
    level=error ts=2020-12-01T08:22:30.051Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:263: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"jmbymt\""
    level=error ts=2020-12-01T08:22:31.053Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:265: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"jmbymt\""
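The "forbidden" errors can be confirmed against the API server without waiting for Prometheus to retry. One way, sketched here, is to submit a SubjectAccessReview that asks whether the service account named in the log may list endpoints in jmbymt. The user, verb, resource, and namespace are taken from the log lines above; the check itself is not something the original post performs, and it assumes the caller is allowed to create SubjectAccessReview objects (typically a cluster administrator).

```yaml
# Sketch: asks the API server whether the Prometheus service account may list endpoints in jmbymt.
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  user: system:serviceaccount:monitoring:prometheus-k8s   # user from the log message
  resourceAttributes:
    group: ""              # core API group, shown as API group "" in the log
    resource: endpoints    # repeat with "pods" and "services" to cover all three errors
    verb: list
    namespace: jmbymt
```

Creating this object with `kubectl create -f <file> -o yaml` returns it with a `status.allowed` field; `false` matches the errors in the log, and `true` is what should appear once the permissions are fixed. The one-line equivalent is `kubectl auth can-i list endpoints -n jmbymt --as=system:serviceaccount:monitoring:prometheus-k8s`.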
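[Solution] The log shows that the service account prometheus-k8s in the monitoring namespace is not allowed to list endpoints, pods, or services in jmbymt. Updating the prometheus-k8s ClusterRole so that it grants get, list, and watch on these resources is enough: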
    [root@ymt36 custom]# cat prometheus-clusterRole.yaml
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: prometheus-k8s
    rules:
    - apiGroups:
      - ""
      resources:
      - nodes
      - services
      - endpoints
      - pods
      verbs: ["get", "list", "watch"]
    - nonResourceURLs:
      - /metrics
      verbs:
      - get
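The ClusterRole above grants read access to services, endpoints, and pods in every namespace. Where that is broader than desired, a namespace-scoped alternative is a Role and RoleBinding in jmbymt only. The sketch below is not from the original post; the object names are hypothetical, and it assumes Prometheus runs as the service account prometheus-k8s in the monitoring namespace, as the log messages indicate.

```yaml
# Sketch: namespace-scoped alternative to widening the ClusterRole. Object names are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: jmbymt                 # permissions apply to this namespace only
rules:
- apiGroups: [""]
  resources: ["services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: jmbymt
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
- kind: ServiceAccount
  name: prometheus-k8s              # service account named in the log messages
  namespace: monitoring
```

The trade-off is that each additional namespace to be monitored needs its own Role and RoleBinding, whereas the ClusterRole change covers all namespaces at once.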
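GitHub issue: https://github.com/prometheus-operator/prometheus-operator/issues/2155#issuecomment-441002864
Author: Leozhanggg
Source: https://www.cnblogs.com/leozhanggg/p/14069375.html
Copyright of this article is shared by the author and cnblogs (博客園). Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a clearly visible link to the original article must be given on the page; otherwise the author reserves the right to pursue legal liability.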