Grafana SQL Summary


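  Navigation: this series lays out a systematic learning path for Prometheus, and it is best read in chapter order. The documents were written over different periods, so the cloud environments shown in the text are not always the same; what matters is learning how each feature is used. The final chapters include a complete hands-on case on Tencent Cloud.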

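  1.What is Prometheus?

  2.Installing Prometheus

  3.Prometheus Exporters in detail

  4.Prometheus PromQL

  5.Prometheus alert handling

  6.Prometheus clustering and high availability

  7.Prometheus service discovery

  8.kube-state-metrics and metrics-server

  9.Ways to monitor a Kubernetes cluster

  10.prometheus operator

  11.Prometheus in practice: federation + high availability + persistence

  12.Prometheus in practice: configuration summary

  13.Basic Grafana usage

  14.Grafana SQL summary

  15.prometheus SQL summary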


  

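  Writing Prometheus SQL (PromQL queries) by hand is time-consuming, so the queries recorded here are lifted from Tencent Cloud's cloud-native monitoring and from prometheus operator.

  (The queries in prometheus operator and in cloud-native monitoring are largely the same.)

  Most of them are taken from Tencent Cloud's cloud-native monitoring. Because labels and variables differ between environments, these queries need to be adjusted before they can be used in a federated cluster environment.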
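
The adjustment mentioned above usually comes down to two things: filling in the Grafana placeholders such as `$cluster`, and renaming a label matcher when the federated Prometheus uses a different label name. The following is a minimal Python sketch of both steps. The helper names `render_query` and `rename_label`, the value `cls-demo` and the label `cluster_id` are illustrative assumptions, not part of any dashboard or library.

```python
import re


def render_query(template: str, variables: dict) -> str:
    """Replace $name placeholders with concrete values.

    Placeholders that are not in `variables` are left untouched.
    """
    def substitute(match):
        name = match.group(1)
        return variables.get(name, match.group(0))

    return re.sub(r"\$(\w+)", substitute, template)


def rename_label(query: str, old: str, new: str) -> str:
    """Rename a label matcher, e.g. cluster="..." to cluster_id="...".

    Only label names directly followed by a matcher operator are renamed,
    so the $cluster placeholder inside the quotes is not affected.
    """
    pattern = rf"\b{re.escape(old)}(\s*(?:=~|!~|!=|=))"
    return re.sub(pattern, rf"{new}\1", query)


query = 'sum(kube_pod_owner{cluster="$cluster"}) by (namespace)'

# Fill in the variable so the query can be pasted into the Prometheus UI.
print(render_query(query, {"cluster": "cls-demo"}))

# Hypothetical case: the federated Prometheus exposes cluster_id instead of cluster.
print(rename_label(query, "cluster", "cluster_id"))
```

This only handles simple `$name` placeholders; Grafana's other variable syntaxes would need their own handling.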

 

1.Compute Resources / Cluster

  • Dashboard

  • Variables

 

  • Sql

  CPU Utilisation

1 - avg(rate(node_cpu_seconds_total{mode="idle", cluster="$cluster"}[1m]))
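
As a rough illustration of what this expression computes, the sketch below reproduces the arithmetic in Python on made-up numbers. It assumes two CPUs and a 60-second window, and it ignores the extrapolation that Prometheus applies inside `rate()`. The function name `cpu_utilisation` is hypothetical.

```python
def cpu_utilisation(idle_counters, window_seconds):
    """Approximate 1 - avg(rate(node_cpu_seconds_total{mode="idle"}[window])).

    idle_counters maps a CPU name to (idle_seconds_at_start, idle_seconds_at_end).
    """
    # Per-CPU idle rate: fraction of the window the CPU spent idle.
    idle_rates = [
        (end - start) / window_seconds
        for start, end in idle_counters.values()
    ]
    # avg() over all idle series, then invert to get the busy fraction.
    return 1 - sum(idle_rates) / len(idle_rates)


# Made-up sample: cpu0 was idle for 45 s of the minute, cpu1 for 30 s.
sample = {"cpu0": (1000.0, 1045.0), "cpu1": (2000.0, 2030.0)}
print(cpu_utilisation(sample, 60))  # 0.375
```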

 

  CPU Requests Commitment

sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster"}) / sum(kube_node_status_allocatable_cpu_cores{cluster="$cluster"})

 

  CPU Limits Commitment

sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster"}) / sum(kube_node_status_allocatable_cpu_cores{cluster="$cluster"})

 

  Memory Utilisation

1 - sum(:node_memory_MemAvailable_bytes:sum{cluster="$cluster"}) / sum(kube_node_status_allocatable_memory_bytes{cluster="$cluster"})

 

  Memory Requests Commitment

sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster"}) / sum(kube_node_status_allocatable_memory_bytes{cluster="$cluster"})

 

  Memory Limits Commitment

sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster"}) / sum(kube_node_status_allocatable_memory_bytes{cluster="$cluster"})

 

  CPU Usage

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster"}) by (namespace)

 

  Memory Usage (working_set)

sum(container_memory_working_set_bytes{cluster="$cluster", container!=""}) by (namespace)

 

  Requests by Namespace

sum(kube_pod_owner{cluster="$cluster"}) by (namespace)

count(avg(namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster"}) by (workload, namespace)) by (namespace)

sum(container_memory_rss{cluster="$cluster", container!=""}) by (namespace)

sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster"}) by (namespace)

 

  Current Network Usage

sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)
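
All of the network panels rely on `irate`, which looks only at the last two samples inside the range window, and then aggregate by namespace. The Python sketch below imitates that behaviour on made-up samples. The function names and the data are assumptions for illustration only.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp, counter_value) samples."""
    if len(samples) < 2:
        return None  # not enough points in the window
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    # After a counter reset, the new value itself is taken as the increase.
    increase = v_last if v_last < v_prev else v_last - v_prev
    return increase / (t_last - t_prev)


def sum_irate_by_namespace(series):
    """series: list of (namespace, samples); mirrors sum(irate(...)) by (namespace)."""
    totals = {}
    for namespace, samples in series:
        value = irate(samples)
        if value is not None:
            totals[namespace] = totals.get(namespace, 0.0) + value
    return totals


# Made-up byte counters scraped every 15 s.
series = [
    ("prod", [(0, 1000), (15, 4000), (30, 7000), (45, 8500)]),
    ("prod", [(30, 200), (45, 950)]),
    ("dev", [(30, 900), (45, 300)]),  # counter reset between the two samples
]
print(sum_irate_by_namespace(series))  # {'prod': 150.0, 'dev': 20.0}
```

Because only the last two points count, `irate` reacts quickly to spikes but is noisier than `rate` over the same window.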

 

  Receive Bandwidth

sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Transmit Bandwidth

sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Average Container Bandwidth by Namespace: Received

avg(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Average Container Bandwidth by Namespace: Transmitted

avg(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Rate of Received Packets

sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Rate of Transmitted Packets

sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Rate of Received Packets Dropped

sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Rate of Transmitted Packets Dropped

sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

2.Compute Resources / Namespace (Pods)

  • Dashboard

 

  • Variables

 

  • Sql

  CPU Utilisation (from requests)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) / sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"})

 

  CPU Utilisation (from limits)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) / sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"})

 

  Memory Utilisation (from requests)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace",container!=""}) / sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"})

 

  Memory Utilisation (from limits)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace",container!=""}) / sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster",namespace="$namespace"})

 

  CPU Usage

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) by (pod)

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="requests.cpu"})

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="limits.cpu"})

 

  CPU Quota

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}) by (pod)

 

  Memory Usage (w/o cache)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}) by (pod)

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="requests.memory"})

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="limits.memory"})

 

 

  Memory Quota

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace",container!=""}) by (pod)

sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace",container!=""}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster",namespace="$namespace"}) by (pod)

sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace",container!=""}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(container_memory_rss{cluster="$cluster", namespace="$namespace",container!=""}) by (pod)

sum(container_memory_cache{cluster="$cluster", namespace="$namespace",container!=""}) by (pod)

sum(container_memory_swap{cluster="$cluster", namespace="$namespace",container!=""}) by (pod)

 

  Current Network Usage

sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Receive Bandwidth

sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Transmit Bandwidth

sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Rate of Received Packets

sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Rate of Transmitted Packets

sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Rate of Received Packets Dropped

sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Rate of Transmitted Packets Dropped

sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

3.Compute Resources / Namespace (Workloads)

  • Dashboard

 

  • Variables

 

  • Sql

  CPU Usage

sum(
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="requests.cpu"})

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="limits.cpu"})
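
The workload queries in this section all follow one pattern: a per-container metric is joined to `namespace_workload_pod:kube_pod_owner:relabel` with `* on(namespace,pod) group_left(workload, workload_type)`, and the result is summed by workload. The Python sketch below imitates that join on made-up data. It assumes the relabel series has the value 1, so the multiplication only attaches the `workload` and `workload_type` labels; the function name `sum_by_workload` is hypothetical.

```python
from collections import defaultdict


def sum_by_workload(usage, owners):
    """Imitate: sum(usage * on(namespace,pod) group_left(workload, workload_type) owners)
    by (workload, workload_type).

    usage:  {(namespace, pod, container): value}
    owners: {(namespace, pod): (workload, workload_type)}, each with value 1
    """
    totals = defaultdict(float)
    for (namespace, pod, _container), value in usage.items():
        owner = owners.get((namespace, pod))
        if owner is None:
            # No matching series on the right-hand side: the join drops it.
            continue
        # Multiplying by 1 keeps the value; the owner labels become the group key.
        totals[owner] += value * 1
    return dict(totals)


usage = {
    ("prod", "web-1", "app"): 0.25,
    ("prod", "web-1", "sidecar"): 0.25,
    ("prod", "web-2", "app"): 0.5,
    ("prod", "job-x", "main"): 1.0,  # pod without an owner entry
}
owners = {
    ("prod", "web-1"): ("web", "deployment"),
    ("prod", "web-2"): ("web", "deployment"),
}
print(sum_by_workload(usage, owners))  # {('web', 'deployment'): 1.0}
```

Pods that do not appear in the owner series, such as `job-x` above, are missing from the result, which is why filtering the right-hand side by `workload_type="$type"` is enough to restrict the whole panel.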

 

  CPU Quota

count(namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}) by (workload, workload_type)

sum(
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
  kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)
/sum(
  kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
  kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)
/sum(
  kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

 

  Memory Usage

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="requests.memory"})

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="limits.memory"})

 

  Memory Quota

count(namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}) by (workload, workload_type)

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
  kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)
/sum(
  kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
  kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)
/sum(
  kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

 

  Current Network Usage

(sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

(sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

(sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

(sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

(sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

(sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

 

  Receive Bandwidth

(sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Transmit Bandwidth

(sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Average Container Bandwidth by Workload: Received

(avg(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Average Container Bandwidth by Workload: Transmitted

(avg(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Received Packets

(sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Transmitted Packets

(sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Received Packets Dropped

(sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Transmitted Packets Dropped

(sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

4.Compute Resources / Node (Pods)

  • Dashboard

 

  • Variables

 

  • Sql

  CPU Usage

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", node=~"$node"}) by (pod)

 

  CPU Quota

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", node=~"$node"}) by (pod)

sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", node=~"$node"}) by (pod)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", node=~"$node"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", node=~"$node"}) by (pod)

sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", node=~"$node"}) by (pod)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", node=~"$node"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", node=~"$node"}) by (pod)

 

  Memory Usage (w/o cache)

sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster="$cluster", node=~"$node", container!=""}) by (pod)

 

  Memory Quota

sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster="$cluster", node=~"$node",container!=""}) by (pod)

sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", node=~"$node"}) by (pod)

sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster="$cluster", node=~"$node",container!=""}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", node=~"$node"}) by (pod)

sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", node=~"$node"}) by (pod)

sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster="$cluster", node=~"$node",container!=""}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", node=~"$node"}) by (pod)

sum(node_namespace_pod_container:container_memory_rss{cluster="$cluster", node=~"$node",container!=""}) by (pod)

sum(node_namespace_pod_container:container_memory_cache{cluster="$cluster", node=~"$node",container!=""}) by (pod)

sum(node_namespace_pod_container:container_memory_swap{cluster="$cluster", node=~"$node",container!=""}) by (pod)

 

5.Compute Resources / Pod

  • Dashboard

 

  • Variables

 

  • Sql

  CPU Usage

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{namespace="$namespace", pod="$pod", container!="POD", cluster="$cluster"}) by (container)

sum(
kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"})

sum(
    kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"})

 

  CPU Throttling

sum(increase(container_cpu_cfs_throttled_periods_total{namespace="$namespace", pod="$pod", container!="POD", container!="", cluster="$cluster"}[5m])) by (container) /sum(increase(container_cpu_cfs_periods_total{namespace="$namespace", pod="$pod", container!="POD", container!="", cluster="$cluster"}[5m])) by (container)
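
The throttling panel divides the number of throttled CFS periods by the total number of CFS periods over the last 5 minutes. The Python sketch below shows the same ratio on made-up counter values. The function name `throttled_ratio` and the numbers are assumptions, and counter resets and extrapolation are ignored.

```python
def throttled_ratio(periods, throttled):
    """Share of CFS periods in which the container was throttled.

    Each argument is (counter_at_window_start, counter_at_window_end).
    """
    period_increase = periods[1] - periods[0]
    throttled_increase = throttled[1] - throttled[0]
    if period_increase == 0:
        # No CFS periods elapsed, so the ratio is undefined
        # (the PromQL division would produce NaN here).
        return None
    return throttled_increase / period_increase


# Made-up counters: 300 periods elapsed, 60 of them were throttled.
print(throttled_ratio(periods=(1000, 1300), throttled=(40, 100)))  # 0.2
```

A value of 0.2 means the container hit its CPU limit in 20% of the scheduling periods in the window.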

 

  CPU Quota

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace", pod="$pod", container!="POD"}) by (container)

sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container) / sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container) / sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

 

  Memory Usage

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", pod="$pod", container!="POD", container!=""}) by (container)

sum(
kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"})

sum(
    kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"})

 

  Memory Quota

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", pod="$pod", container!="POD", container!=""}) by (container)

sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container) / sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod", container!=""}) by (container)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", pod="$pod", container!=""}) by (container) / sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(container_memory_rss{cluster="$cluster", namespace="$namespace", pod="$pod", container != "", container != "POD"}) by (container)

sum(container_memory_cache{cluster="$cluster", namespace="$namespace", pod="$pod", container != "", container != "POD"}) by (container)

sum(container_memory_swap{cluster="$cluster", namespace="$namespace", pod="$pod", container != "", container != "POD"}) by (container)

 

  Receive Bandwidth

sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

  Transmit Bandwidth

sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

  Rate of Received Packets

sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

  Rate of Transmitted Packets

sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

  Rate of Received Packets Dropped

sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

  Rate of Transmitted Packets Dropped

sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

6.Compute Resources / Workload

  • Dashboard

 

  • Variables

 

  • Sql

  CPU Usage

sum(
  node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

 

  CPU Quota

sum(
    node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)
/sum(
    kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)
/sum(
    kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

 

  Memory Usage

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

 

  Memory Quota

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)
/sum(
    kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)
/sum(
    kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

 

  Current Network Usage

(sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

(sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

(sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

(sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

(sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

(sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Receive Bandwidth

(sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Transmit Bandwidth

(sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Average Container Bandwidth by Pod: Received

(avg(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Average Container Bandwidth by Pod: Transmitted

(avg(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Received Packets

(sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Transmitted Packets

(sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Received Packets Dropped

(sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Transmitted Packets Dropped

(sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

7.Networking/Cluster

  • Dashboard

 

  • Variables

 

  • Sql

  Current Rate of Bytes Received

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))
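
`irate` uses only the last two samples inside the range, so these panels follow short bursts closely. A minimal Python sketch of that calculation, including the counter-reset case; the sample values are made up, and this is an illustration of the idea rather than Prometheus's own code:

```python
def irate(samples):
    """samples: list of (timestamp_seconds, counter_value), oldest first."""
    if len(samples) < 2:
        return None  # fewer than two samples in the window: no result
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    if t_last == t_prev:
        return None
    # A counter that went down is treated as a reset: count from zero.
    delta = v_last if v_last < v_prev else v_last - v_prev
    return delta / (t_last - t_prev)

# Hypothetical byte counter scraped every 15 seconds.
window = [(0, 1000), (15, 4000), (30, 4600)]
print(irate(window))
```

The earlier jump from 1000 to 4000 is ignored because only the final pair of samples counts.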

 

  Current Rate of Bytes Transmitted

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Current Status

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(avg(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Average Rate of Bytes Received

sort_desc(avg(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Average Rate of Bytes Transmitted

sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Receive Bandwidth

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Transmit Bandwidth

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

8.Networking / Namespace (Pods)

  • Dashboard

 

  • Variables

 

  • Sql

  Current Rate of Bytes Received

sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution]))
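
The `$cluster`, `$namespace`, `$interval` and `$resolution` placeholders are dashboard variables that Grafana fills in before the query is sent, so `[$interval:$resolution]` ends up as a subquery range such as `[4h:5m]`. The sketch below imitates that substitution with Python's `string.Template`. It is only a stand-in for Grafana's interpolation (which also handles multi-value variables), and the variable values are hypothetical.

```python
from string import Template

def render(query, **variables):
    # safe_substitute leaves unknown or malformed "$" placeholders untouched.
    return Template(query).safe_substitute(**variables)

query = (
    'sum(irate(container_network_receive_bytes_total'
    '{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution]))'
)
rendered = render(query, cluster="cls-demo", namespace="default",
                  interval="4h", resolution="5m")
print(rendered)
```

Rendering a query this way is a quick check when adapting these expressions to a federated setup where the label names or variable names differ.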

 

  Current Rate of Bytes Transmitted

sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution]))

 

  Current Status

sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Receive Bandwidth

sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Transmit Bandwidth

sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Rate of Received Packets

sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Rate of Transmitted Packets

sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Rate of Received Packets Dropped

sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Rate of Transmitted Packets Dropped

sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster", namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

9.Networking / Namespace (Workload)

  • Dashboard

 

  • Variables

 

  • Sql

  Current Rate of Bytes Received

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Current Rate of Bytes Transmitted

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Current Status

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(avg(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Average Rate of Bytes Received

sort_desc(avg(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Average Rate of Bytes Transmitted

sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Receive Bandwidth

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Transmit Bandwidth

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Received Packets

sort_desc(sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Transmitted Packets

sort_desc(sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Received Packets Dropped

sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Transmitted Packets Dropped

sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

10.Networking/Pod

  • Dashboard

 

  • Variables

 

  • Sql

  Current Rate of Bytes Received

sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution]))

 

  Current Rate of Bytes Transmitted

sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution]))

 

  Receive Bandwidth

sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

  Transmit Bandwidth

sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

  Rate of Received Packets

sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

  Rate of Transmitted Packets

sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

  Rate of Received Packets Dropped

sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

  Rate of Transmitted Packets Dropped

sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

11.Networking / Workload

  • Dashboard

 

  • Variables

 

 

  • Sql

  Current Rate of Bytes Received

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Current Rate of Bytes Transmitted

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Average Rate of Bytes Received

sort_desc(avg(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Average Rate of Bytes Transmitted

sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Receive Bandwidth

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Transmit Bandwidth

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Received Packets

sort_desc(sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Transmitted Packets

sort_desc(sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Received Packets Dropped

sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Transmitted Packets Dropped

sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

12.Node

  • Dashboard

 

  • Variables

             

  • Sql

  Server resource overview table (10 rows per page)

node_uname_info{job=~"$job", cluster=~"$cluster"} - 0

sum(time() - node_boot_time_seconds{job=~"$job",cluster=~"$cluster"})by(instance)

node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"} - 0

count(node_cpu_seconds_total{job=~"$job",mode='system',cluster=~"$cluster"}) by (instance)

node_load5{job=~"$job",cluster=~"$cluster"}

(1 - avg(irate(node_cpu_seconds_total{job=~"$job",mode="idle",cluster=~"$cluster"}[5m])) by (instance)) * 100

(1 - (node_memory_MemAvailable_bytes{job=~"$job",cluster=~"$cluster"} / (node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"})))* 100

max((node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"ext.?|xfs"}-node_filesystem_free_bytes{job=~"$job",cluster=~"$cluster",fstype=~"ext.?|xfs"}) *100/(node_filesystem_avail_bytes {job=~"$job",cluster=~"$cluster",fstype=~"ext.?|xfs"}+(node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"ext.?|xfs"}-node_filesystem_free_bytes{job=~"$job",cluster=~"$cluster",fstype=~"ext.?|xfs"})))by(instance)

max(irate(node_disk_read_bytes_total{job=~"$job",cluster=~"$cluster"}[5m])) by (instance)

max(irate(node_disk_written_bytes_total{job=~"$job",cluster=~"$cluster"}[5m])) by (instance)

max(irate(node_network_receive_bytes_total{job=~"$job",cluster=~"$cluster"}[5m])*8) by (instance)

max(irate(node_network_transmit_bytes_total{job=~"$job",cluster=~"$cluster"}[5m])*8) by (instance)

 

  $job: overall total load and overall average CPU utilisation

count(node_cpu_seconds_total{job=~"$job",cluster=~"$cluster", mode='system'})

sum(node_load5{job=~"$job",cluster=~"$cluster"})

avg(1 - avg(irate(node_cpu_seconds_total{job=~"$job",mode="idle",cluster=~"$cluster"}[5m])) by (instance)) * 100
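
The CPU expressions rely on `node_cpu_seconds_total` having one series per core and per mode: the rate of the `idle` mode is the fraction of each second a core spent idle, and counting the series of a single mode (here `system`) gives the number of cores. A small sketch of the utilisation arithmetic, with hypothetical per-core idle fractions:

```python
def cpu_utilisation_percent(idle_rates):
    """idle_rates: per-core fraction of each second spent idle (0.0 to 1.0)."""
    if not idle_rates:
        raise ValueError("need at least one core")
    avg_idle = sum(idle_rates) / len(idle_rates)
    # Whatever is not idle time counts as busy, as in (1 - avg(irate(...idle...))) * 100.
    return (1 - avg_idle) * 100

# Hypothetical four-core node.
cores = [0.5, 0.75, 1.0, 0.75]
print(len(cores), cpu_utilisation_percent(cores))
```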

 

  $job: overall total memory and overall average memory utilisation

sum(node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"})

sum(node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"} - node_memory_MemAvailable_bytes{job=~"$job",cluster=~"$cluster"})

(sum(node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"} - node_memory_MemAvailable_bytes{job=~"$job",cluster=~"$cluster"}) / sum(node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"}))*100

 

  $job: overall total disk and overall average disk utilisation

sum(avg(node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance))

sum(avg(node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance)) - sum(avg(node_filesystem_free_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance))

(sum(avg(node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance)) - sum(avg(node_filesystem_free_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance))) *100/(sum(avg(node_filesystem_avail_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance))+(sum(avg(node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance)) - sum(avg(node_filesystem_free_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance))))

 

  Uptime

avg(time() - node_boot_time_seconds{instance=~"$node",cluster=~"$cluster"})

 

  CPU cores

count(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node", mode='system'})

 

  Total memory

sum(node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"})

 

  (no title)

sum(node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"})

avg(irate(node_cpu_seconds_total{instance=~"$node",mode="iowait",cluster=~"$cluster"}[5m])) * 100

(1 - (node_memory_MemAvailable_bytes{instance=~"$node",cluster=~"$cluster"} / (node_memory_MemTotal_bytes{instance=~"$node",cluster=~"$cluster"})))* 100

(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint="$maxmount"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint="$maxmount"})*100 /(node_filesystem_avail_bytes {cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint="$maxmount"}+(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint="$maxmount"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint="$maxmount"}))

(1 - ((node_memory_SwapFree_bytes{cluster=~"$cluster",instance=~"$node"} + 1)/ (node_memory_SwapTotal_bytes{cluster=~"$cluster",instance=~"$node"} + 1))) * 100
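
The swap expression adds 1 to both `SwapFree` and `SwapTotal`. On a node without swap both metrics are 0, and the plain ratio would be 0/0; the offset turns that case into 0% while barely changing the result elsewhere. A sketch of the same arithmetic, with hypothetical byte counts:

```python
def swap_used_percent(swap_free_bytes, swap_total_bytes):
    # The +1 on both sides mirrors the query and avoids 0/0 when no swap is configured.
    return (1 - (swap_free_bytes + 1) / (swap_total_bytes + 1)) * 100

no_swap = swap_used_percent(0, 0)
half_used = swap_used_percent(2 * 1024**3, 4 * 1024**3)
print(no_swap, round(half_used, 2))
```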

 

  [$show_hostname]: available space per partition (EXT.*/XFS)

node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-0

node_filesystem_avail_bytes {cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-0

(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}) *100/(node_filesystem_avail_bytes {cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}+(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}))
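
The usage percentage divides used space by `avail + used` instead of by the total size. `node_filesystem_free_bytes` includes blocks reserved for root, while `node_filesystem_avail_bytes` counts only what unprivileged users can write, so this denominator leaves the reserved blocks out and the result should be close to what `df` reports. A sketch with hypothetical sizes:

```python
def disk_used_percent(size_bytes, free_bytes, avail_bytes):
    used = size_bytes - free_bytes            # blocks actually in use
    return used * 100 / (avail_bytes + used)  # reserved blocks are left out of the denominator

# Hypothetical 100 GiB filesystem: 40 GiB free, of which 5 GiB is reserved for root.
GiB = 1024**3
pct = disk_used_percent(100 * GiB, 40 * GiB, 35 * GiB)
naive = (100 - 40) * 100 / 100                # used / size, for comparison
print(round(pct, 2), naive)
```

With reserved blocks present the panel value is somewhat higher than the naive used/size ratio.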

 

  CPU iowait

avg(irate(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node",mode="iowait"}[5m])) * 100

 

  Free inodes: $maxmount

avg(node_filesystem_files_free{cluster=~"$cluster",instance=~"$node",mountpoint="$maxmount",fstype=~"ext.?|xfs"})

 

  Total file descriptors

avg(node_filefd_maximum{cluster=~"$cluster",instance=~"$node"})

 

  Hourly traffic $device

increase(node_network_receive_bytes_total{cluster=~"$cluster",instance=~"$node",device=~"$device"}[60m])

increase(node_network_transmit_bytes_total{cluster=~"$cluster",instance=~"$node",device=~"$device"}[60m])

 

  CPU utilisation

avg(irate(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node",mode="system"}[5m])) by (instance) *100

avg(irate(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node",mode="user"}[5m])) by (instance) *100

avg(irate(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node",mode="iowait"}[5m])) by (instance) *100

(1 - avg(irate(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node",mode="idle"}[5m])) by (instance))*100

  

  Memory information

node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"} - node_memory_MemAvailable_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_MemAvailable_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_Buffers_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_MemFree_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_Cached_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"} - (node_memory_Cached_bytes{cluster=~"$cluster",instance=~"$node"} + node_memory_Buffers_bytes{cluster=~"$cluster",instance=~"$node"} + node_memory_MemFree_bytes{cluster=~"$cluster",instance=~"$node"})

(1 - (node_memory_MemAvailable_bytes{cluster=~"$cluster",instance=~"$node"} / (node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"})))* 100

 

  Network bandwidth usage per second $device

irate(node_network_receive_bytes_total{cluster=~"$cluster",instance=~'$node',device=~"$device"}[5m])*8

irate(node_network_transmit_bytes_total{cluster=~"$cluster",instance=~'$node',device=~"$device"}[5m])*8

 

  System load average

node_load1{cluster=~"$cluster",instance=~"$node"}

node_load5{cluster=~"$cluster",instance=~"$node"}

node_load15{cluster=~"$cluster",instance=~"$node"}

sum(count(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node", mode='system'}) by (cpu,instance)) by(instance)

 

  Disk read/write throughput per second

irate(node_disk_read_bytes_total{cluster=~"$cluster",instance=~"$node"}[5m])

irate(node_disk_written_bytes_total{cluster=~"$cluster",instance=~"$node"}[5m])

 

  Disk utilisation

(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}) *100/(node_filesystem_avail_bytes {cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}+(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}))

node_filesystem_files_free{cluster=~"$cluster",instance=~'$node',fstype=~"ext.?|xfs"} / node_filesystem_files{cluster=~"$cluster",instance=~'$node',fstype=~"ext.?|xfs"}

 

  Disk read/write rate (IOPS)

irate(node_disk_reads_completed_total{cluster=~"$cluster",instance=~"$node"}[5m])

irate(node_disk_writes_completed_total{cluster=~"$cluster",instance=~"$node"}[5m])

node_disk_io_now{cluster=~"$cluster",instance=~"$node"}

 

  Share of each second spent on I/O operations

irate(node_disk_io_time_seconds_total{cluster=~"$cluster",instance=~"$node"}[5m])

 

  Time per I/O read/write (reference: under 100 ms) (beta)

irate(node_disk_read_time_seconds_total{cluster=~"$cluster",instance=~"$node"}[5m]) / irate(node_disk_reads_completed_total{cluster=~"$cluster",instance=~"$node"}[5m])

irate(node_disk_write_time_seconds_total{cluster=~"$cluster",instance=~"$node"}[5m]) / irate(node_disk_writes_completed_total{cluster=~"$cluster",instance=~"$node"}[5m])

irate(node_disk_io_time_seconds_total{cluster=~"$cluster",instance=~"$node"}[5m])

irate(node_disk_io_time_weighted_seconds_total{cluster=~"$cluster",instance=~"$node"}[5m])
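
The first two expressions divide the rate of time spent on I/O by the rate of completed operations. When both counters are scraped together the time interval cancels out, leaving the average seconds per operation between two scrapes. A sketch of that ratio, with hypothetical counter values:

```python
import math

def avg_io_latency_seconds(time_prev, time_last, ops_prev, ops_last):
    """Average seconds per completed operation between two scrapes of the counters."""
    ops = ops_last - ops_prev
    if ops == 0:
        return math.nan  # an idle disk gives 0/0, which PromQL also reports as NaN
    return (time_last - time_prev) / ops

# Hypothetical scrapes: 0.5 s of read time spread over 250 completed reads.
latency = avg_io_latency_seconds(time_prev=12.0, time_last=12.5,
                                 ops_prev=1000, ops_last=1250)
print(latency)
```

A result of a few milliseconds per operation sits comfortably under the 100 ms reference mentioned in the panel title.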

 

  Network socket connection information

node_netstat_Tcp_CurrEstab{cluster=~"$cluster",instance=~'$node'}

node_sockstat_TCP_tw{cluster=~"$cluster",instance=~'$node'}

node_sockstat_sockets_used{cluster=~"$cluster",instance=~'$node'}

node_sockstat_UDP_inuse{cluster=~"$cluster",instance=~'$node'}

node_sockstat_TCP_alloc{cluster=~"$cluster",instance=~'$node'}

irate(node_netstat_Tcp_PassiveOpens{cluster=~"$cluster",instance=~'$node'}[5m])

irate(node_netstat_Tcp_ActiveOpens{cluster=~"$cluster",instance=~'$node'}[5m])

irate(node_netstat_Tcp_InSegs{cluster=~"$cluster",instance=~'$node'}[5m])

irate(node_netstat_Tcp_OutSegs{cluster=~"$cluster",instance=~'$node'}[5m])

irate(node_netstat_Tcp_RetransSegs{cluster=~"$cluster",instance=~'$node'}[5m])

irate(node_netstat_TcpExt_ListenDrops{cluster=~"$cluster",instance=~'$node'}[5m])

 

  Open file descriptors (left) / context switches per second (right)

node_filefd_allocated{cluster=~"$cluster",instance=~"$node"}

irate(node_context_switches_total{cluster=~"$cluster",instance=~"$node"}[5m])

(node_filefd_allocated{cluster=~"$cluster",instance=~"$node"}/node_filefd_maximum{cluster=~"$cluster",instance=~"$node"}) *100

 

13.Pods

  • Dashboard

 

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

 

  • Sql

  Not viewable on Tencent Cloud.

 

14.Proxy

  • Dashboard

 

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

 

  • Sql

  Not viewable on Tencent Cloud.

 

15.Scheduler

  • Dashboard

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

  • Sql

  Not viewable on Tencent Cloud.

 

16.StatefulSets

  • Dashboard

 

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

 

  • Sql

  Not viewable on Tencent Cloud.

 

17.Persistent Volumes

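  • Dashboard

  No PVs are in use here, so the queries may return no data.

 

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

 

  • Sql

  Not viewable on Tencent Cloud.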

 

18.Kubelet

  • Dashboard

 

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

 

  • Sql

  Not viewable on Tencent Cloud.

 

 

