1. Import the dashboard template
Download the Grafana dashboard template:
https://grafana.com/grafana/dashboards/13105
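Import it from the Grafana web UI. After the import, the basic dashboard is visible but shows no data yet.
2. Configure Prometheus
Add two new job definitions to the Prometheus configuration file.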
# my global config
global:
  scrape_interval: 30s # Set the scrape interval to every 30 seconds. Default is every 1 minute.
  evaluation_interval: 30s # Evaluate rules every 30 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
#alerting:
#  alertmanagers:
#  - static_configs:
#    - targets:
#      - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
#rule_files:
#  - "first_rules.yml"
#  - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'sbc'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['192.168.40.248:9090']
  ###################################### Everything below is newly added
  - job_name: 'k8s-cadvisor'
    metrics_path: /metrics/cadvisor
    kubernetes_sd_configs:
    - role: node
      api_server: https://192.168.60.180:6443 # API server address
      tls_config:
        insecure_skip_verify: true
    relabel_configs:
    - source_labels: [__address__]
      regex: '(.*):10250'
      replacement: '${1}:10255'
      target_label: __address__
      action: replace
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    metric_relabel_configs:
    - source_labels: [instance]
      separator: ;
      regex: (.+)
      target_label: node
      replacement: $1
      action: replace
    - source_labels: [pod_name]
      separator: ;
      regex: (.+)
      target_label: pod
      replacement: $1
      action: replace
    - source_labels: [container_name]
      separator: ;
      regex: (.+)
      target_label: container
      replacement: $1
      action: replace
  - job_name: kube-state-metrics
    kubernetes_sd_configs:
    - role: endpoints
      api_server: https://192.168.60.180:6443 # API server address
      tls_config:
        insecure_skip_verify: true
      namespaces:
        names:
        - ops-monit
    relabel_configs:
    - source_labels: [__address__]
      regex: '.*:8080'
      replacement: '192.168.60.182:31080' # any Kubernetes node IP works
      target_label: __address__
      action: replace
    - source_labels: [__address__]
      regex: '.*:8081'
      replacement: '192.168.60.182:31081' # any Kubernetes node IP works
      target_label: __address__
      action: replace
    - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
      regex: kube-state-metrics
      replacement: $1
      action: keep
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: k8s_namespace
    - source_labels: [__meta_kubernetes_service_name]
      action: replace
      target_label: k8s_sname
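To make the `replace` rules in the two new jobs easier to follow, here is a rough Python sketch of what they do to a target's labels. It is only an approximation of Prometheus' relabeling, not its actual implementation: the function name `relabel_replace` is made up for illustration, only numeric group references (`$1`, `${1}`) are handled, and the special case where an empty result removes the label is left out.

```python
import re


def relabel_replace(labels, source_labels, regex, replacement, target_label, separator=";"):
    """Approximate the Prometheus `replace` relabel action on a dict of labels.

    Hypothetical helper for illustration only; not part of Prometheus.
    """
    # Missing source labels count as empty strings and are joined with the separator.
    value = separator.join(labels.get(name, "") for name in source_labels)
    # Prometheus anchors relabel regexes at both ends, hence fullmatch.
    match = re.fullmatch(regex, value)
    if match is None:
        return dict(labels)  # no match: labels stay unchanged
    # Translate Go-style $1 / ${1} references into Python's \g<1> form.
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result


# k8s-cadvisor: move the scrape address from port 10250 to port 10255.
cadvisor = relabel_replace(
    {"__address__": "192.168.60.182:10250"},
    ["__address__"], r"(.*):10250", "${1}:10255", "__address__",
)

# kube-state-metrics: point any address ending in :8080 at the NodePort.
ksm = relabel_replace(
    {"__address__": "10.244.1.5:8080"},  # example pod address, made up
    ["__address__"], r".*:8080", "192.168.60.182:31080", "__address__",
)

# metric_relabel_configs: copy pod_name into pod when pod_name is non-empty.
metric = relabel_replace({"pod_name": "web-0"}, ["pod_name"], r"(.+)", "$1", "pod")
```

Because the regex has to match the whole value, an address on any other port is left untouched by the first rule, and a series without a `pod_name` label gets no `pod` label from the third.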
Restart the service:
systemctl restart prometheus.service && systemctl status prometheus.service
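3. Configure kube-state-metrics
Note: Prometheus must be able to scrape metrics from both cAdvisor and kube-state-metrics.
cAdvisor is built into the kubelet, so it can be used directly.
For deploying kube-state-metrics, see: https://github.com/starsliao/Prometheus/tree/master/kubernetes
The ports in this file (service.yaml) can be customized. If you change the nodePort values, update the matching replacement addresses (31080 and 31081) in the kube-state-metrics job of the Prometheus configuration.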
vim service.yaml
apiVersion: v1
kind: Service
metadata:
  # annotations:
  #   prometheus.io/scrape: 'true'
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: v1.9.7
  name: kube-state-metrics
  namespace: ops-monit
spec:
  type: NodePort
  ports:
  - name: http-metrics
    port: 8080
    targetPort: http-metrics
    nodePort: 31080
  - name: telemetry
    port: 8081
    targetPort: telemetry
    nodePort: 31081
  selector:
    app.kubernetes.io/name: kube-state-metrics
kubectl create namespace ops-monit
kubectl create -f ./
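Once the Service exists, it can be worth checking that the NodePort really serves metrics before looking at Prometheus. The following is a minimal sketch that uses only the Python standard library. The helper names `metric_names` and `fetch_metric_names` are made up for illustration, and the parsing is deliberately simple: it only extracts metric names from the text exposition format.

```python
from urllib.request import urlopen


def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # HELP / # TYPE comments
        # The metric name ends at the first '{' (labels) or space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def fetch_metric_names(url, timeout=5):
    """Download a /metrics page and return the metric names it exposes."""
    with urlopen(url, timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))
```

With the addresses used in this article, calling `fetch_metric_names("http://192.168.60.182:31080/metrics")` should return a set containing names that start with `kube_`. An empty set or a connection error suggests the Service or the pod behind it is not ready yet.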

Along the way, keep an eye on the Prometheus web UI to confirm that everything is working
and that all targets are in the UP state.
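The same check can be scripted against the Prometheus HTTP API endpoint `/api/v1/targets`. The sketch below assumes Prometheus is reachable at `http://192.168.40.248:9090` (the address of the `sbc` job above, so adjust it if yours differs). The helper names are made up for illustration.

```python
import json
from urllib.request import urlopen


def unhealthy_targets(payload):
    """Return (job, scrape URL, last error) for every active target that is not up."""
    problems = []
    for target in payload.get("data", {}).get("activeTargets", []):
        if target.get("health") != "up":
            job = target.get("labels", {}).get("job", "")
            problems.append((job, target.get("scrapeUrl", ""), target.get("lastError", "")))
    return problems


def fetch_targets(base_url, timeout=5):
    """Fetch and decode the JSON body of /api/v1/targets."""
    with urlopen(base_url.rstrip("/") + "/api/v1/targets", timeout=timeout) as resp:
        return json.load(resp)
```

For example, `unhealthy_targets(fetch_targets("http://192.168.40.248:9090"))` returns an empty list when every target is up. Otherwise each entry shows which job failed and the error Prometheus recorded for it.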
4. View the result