1. How It Works
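The monitoring architecture in recent versions of Kubernetes is in fact clearly defined, although few Chinese-language articles explain it. The official architecture description is here:
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/monitoring_architecture.md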
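The architecture has two parts:
- The core metrics pipeline that belongs to Kubernetes itself (the black part of the architecture diagram). Depending on how the cluster was installed, metrics-server may have to be deployed as an add-on.
- Third-party monitoring pipelines, which you choose yourself, for example prometheus (the blue part of the architecture diagram).
For example, the CPU/memory usage reported by kubectl top comes from the core metrics pipeline and needs no third-party support.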
2. Accessing metrics-server Through the API
At present, metrics-server registers with the API server under the group metrics.k8s.io, version v1beta1, so its RESTful resource URIs all start with the prefix /apis/metrics.k8s.io/v1beta1/.
The supported forms of the resource URI are documented in the official design proposal: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/resource-metrics-api.md
The list of supported endpoints:
- /nodes – all node metrics; type []NodeMetrics
- /nodes/{node} – metrics for a specified node; type NodeMetrics
- /namespaces/{namespace}/pods – all pod metrics within namespace with support for all-namespaces; type []PodMetrics
- /namespaces/{namespace}/pods/{pod} – metrics for a specified pod; type PodMetrics
The following query parameters are supported:
- labelSelector – restrict the list of returned objects by labels (list endpoints only)
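The endpoint list above can be turned into a small path builder. This is only a sketch: the helper names `node_metrics_path` and `pod_metrics_path` are made up for illustration, and the `app=nginx` selector in the usage note is just an example value.

```python
from urllib.parse import quote, urlencode

# Prefix derived from the API group (metrics.k8s.io) and version (v1beta1).
METRICS_PREFIX = "/apis/metrics.k8s.io/v1beta1"


def node_metrics_path(node=None):
    """Return the path for all nodes, or for one node when a name is given."""
    if node:
        return f"{METRICS_PREFIX}/nodes/{quote(node, safe='')}"
    return f"{METRICS_PREFIX}/nodes"


def pod_metrics_path(namespace, pod=None, label_selector=None):
    """Return the path for pods in a namespace.

    A pod name selects a single PodMetrics object; a label selector only
    applies to the list endpoint, so it is ignored when a pod name is given.
    """
    path = f"{METRICS_PREFIX}/namespaces/{quote(namespace, safe='')}/pods"
    if pod:
        return f"{path}/{quote(pod, safe='')}"
    if label_selector:
        # urlencode percent-encodes "=" and "," inside the selector value.
        return f"{path}?{urlencode({'labelSelector': label_selector})}"
    return path
```

For instance, `pod_metrics_path("test", label_selector="app=nginx")` yields a list URL whose selector is percent-encoded, which is what an HTTP client needs when the selector contains `=` or `,`.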
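So, to get the resource usage of the pods in a given namespace, there are two options:
- Filter by label: /apis/metrics.k8s.io/v1beta1/namespaces/{namespace}/pods?labelSelector=xxxx
- Fetch by pod name: /apis/metrics.k8s.io/v1beta1/namespaces/{namespace}/pods/
- First start a proxy. It takes care of authentication against the API server, so we only need to think about the request parameters: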
# kubectl proxy --port=8181 --address=0.0.0.0
# nodes -> curl localhost:8181/apis/metrics.k8s.io/v1beta1/nodes
# pods -> curl localhost:8181/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/redis-master-dasdhjasd
curl localhost:8181/apis/metrics.k8s.io/v1beta1/namespaces/test/pods
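The JSON returned by these endpoints reports usage as Kubernetes quantity strings (CPU often in nanocores such as `1500000n`, memory often in `Ki`). The sketch below shows one way to convert them and total the containers of each pod. The helper names and the `SAMPLE` payload are illustrative assumptions, not captured output, and the parser only handles plain integer values with the common suffixes.

```python
# Divisors that turn a suffixed CPU quantity into whole cores.
CPU_DIVISORS = {"n": 1_000_000_000, "u": 1_000_000, "m": 1_000}
# Multipliers for binary (Ki, Mi, ...) and decimal (k, M, ...) memory suffixes.
MEMORY_MULTIPLIERS = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def cpu_to_cores(quantity):
    """Convert a CPU quantity such as '250m' or '1500000n' to cores."""
    suffix = quantity[-1:]
    if suffix in CPU_DIVISORS:
        return float(quantity[:-1]) / CPU_DIVISORS[suffix]
    return float(quantity)


def memory_to_bytes(quantity):
    """Convert a memory quantity such as '64Mi' or '2048Ki' to bytes."""
    # Check two-character suffixes before one-character ones.
    for suffix in sorted(MEMORY_MULTIPLIERS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * MEMORY_MULTIPLIERS[suffix]
    return int(quantity)


def summarize_pod_metrics(pod_metrics_list):
    """Sum container usage per pod from a PodMetricsList-shaped dict."""
    summary = {}
    for item in pod_metrics_list.get("items", []):
        containers = item.get("containers", [])
        summary[item["metadata"]["name"]] = {
            "cpu_cores": sum(cpu_to_cores(c["usage"]["cpu"]) for c in containers),
            "memory_bytes": sum(memory_to_bytes(c["usage"]["memory"]) for c in containers),
        }
    return summary


# Hypothetical payload in the shape of a PodMetricsList response.
SAMPLE = {
    "kind": "PodMetricsList",
    "apiVersion": "metrics.k8s.io/v1beta1",
    "items": [{
        "metadata": {"name": "test-nginx-7d97ffc85d-nrnfn", "namespace": "test"},
        "containers": [
            {"name": "nginx", "usage": {"cpu": "250m", "memory": "64Mi"}},
            {"name": "sidecar", "usage": {"cpu": "1500000n", "memory": "2048Ki"}},
        ],
    }],
}

if __name__ == "__main__":
    print(summarize_pod_metrics(SAMPLE))
```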
- Access the resource metrics directly through the API paths
View node resource metrics
# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq . | less
View pod resource metrics
# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods" | jq . | less
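The same `kubectl get --raw` call can be driven from a script when no proxy is running. The following is a minimal sketch, assuming kubectl is on the PATH and already configured for the cluster; `kubectl_raw_args` and `kubectl_get_raw` are hypothetical helper names.

```python
import json
import subprocess


def kubectl_raw_args(path):
    """Build the argument list for `kubectl get --raw <path>`."""
    return ["kubectl", "get", "--raw", path]


def kubectl_get_raw(path, timeout=10):
    """Run kubectl and decode the JSON body it prints.

    Raises subprocess.CalledProcessError if kubectl exits with an error.
    """
    completed = subprocess.run(
        kubectl_raw_args(path),
        check=True,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return json.loads(completed.stdout)
```

Calling `kubectl_get_raw("/apis/metrics.k8s.io/v1beta1/nodes")` would return the node metrics list as a Python dict, which replaces the `jq` step in the shell pipeline.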
3. Wrapping the API with requests
# Example only, the code is incomplete!!!
import re

import requests


def eventer_alarm_cutpods(string):
    """Extract the pod name from the EventObject line (third line) of an alert."""
    # Alert text may contain literal "\n" sequences; turn them into real newlines.
    string = re.sub(r'\\n', '\n', string)
    pods = string.split('\n')[2].split()[1]
    print(pods)
    return pods


def metric_get_pods(host, pods):
    """
    :param host: address of the kubectl proxy, e.g. 127.0.0.1:8181
    :param pods: if truthy, list pods in the "test" namespace; otherwise list all namespaces
    """
    if pods:
        url = 'http://' + host + '/apis/metrics.k8s.io/v1beta1/namespaces/test/pods'
    else:
        url = 'http://' + host + '/apis/metrics.k8s.io/v1beta1/pods'
    with requests.get(url, timeout=5) as response:
        print(response.json())


def metric_get_nodes(host, node=None):
    """
    :param host: address of the kubectl proxy, e.g. 127.0.0.1:8181
    :param node: optional node name; when omitted, all nodes are listed
    """
    url = 'http://' + host + '/apis/metrics.k8s.io/v1beta1/nodes'
    if node:
        url += '/' + node
    with requests.get(url, timeout=5) as response:
        print(response.json())


if __name__ == '__main__':
    # metric_get_nodes('127.0.0.1:8181')
    string = 'EventType: Warning\nEventKind: Pod\nEventObject: test-nginx-7d97ffc85d-nrnfn\nEventReason: Failed\nEventTime: 2021-05-23 17:48:12 +0800 CST\nEventMessage: Error: ErrImagePull'
    eventer_alarm_cutpods(string)
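Picking the pod name by line position breaks if the alert gains or loses a line. A slightly sturdier sketch parses every `Key: Value` line into a dict and looks the field up by name. `parse_event_alarm` is a hypothetical helper, and the assumption that every alert follows this `Key: Value` layout is taken only from the sample string above.

```python
def parse_event_alarm(text):
    """Parse 'Key: Value' lines of an alert message into a dict."""
    # Alert text may contain literal "\n" sequences instead of real newlines.
    text = text.replace("\\n", "\n")
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    alarm = ('EventType: Warning\nEventKind: Pod\n'
             'EventObject: test-nginx-7d97ffc85d-nrnfn\nEventReason: Failed\n'
             'EventTime: 2021-05-23 17:48:12 +0800 CST\n'
             'EventMessage: Error: ErrImagePull')
    print(parse_event_alarm(alarm).get("EventObject"))
```

Looking up `EventObject` by key also makes it easy to check `EventKind` first and skip alerts that are not about pods.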