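1. Cluster monitoring

Cluster monitoring covers two things: cluster health and the cluster's running state.

Cluster health can be retrieved from the following API: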

http://ip:9200/_cluster/health?pretty

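Key metrics:

status: #Cluster status: green, yellow, or red.
number_of_nodes/number_of_data_nodes: #Number of nodes and number of data nodes in the cluster.
active_primary_shards: #Number of active primary shards in the cluster.
active_shards: #Number of active shards in the cluster, primaries and replicas together.
relocating_shards: #Number of shards currently being moved from one node to another. Usually 0; it rises when a node joins or leaves the cluster.
initializing_shards: #Number of shards that are being initialized.
unassigned_shards: #Number of shards not allocated to any node. Usually 0; it rises when replica shards are lost, for example after a node drops out.
number_of_pending_tasks: #Number of cluster-level tasks waiting to be processed, such as creating an index and allocating its shards to nodes. Only the master node can process pending tasks. If this value stays high and does not decrease, the cluster has a stability problem.
active_shards_percent_as_number: #Shard health of the cluster: active shards as a percentage of all shards.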
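The checks implied by these fields can be sketched in a few lines of Python. This is only an illustration, assuming the JSON body of `_cluster/health` has already been fetched and parsed into a dict; the function name `check_health` and the sample values are hypothetical.

```python
def check_health(health):
    """Return a list of warnings for a parsed _cluster/health response."""
    warnings = []

    # Anything other than green means some shards are not fully allocated.
    status = health.get("status")
    if status != "green":
        warnings.append(f"cluster status is {status}")

    # These counters are normally 0 on a stable cluster.
    for field in ("relocating_shards", "initializing_shards", "unassigned_shards"):
        value = health.get(field, 0)
        if value > 0:
            warnings.append(f"{field} = {value}")

    # Pending tasks are handled by the master node only.
    pending = health.get("number_of_pending_tasks", 0)
    if pending > 0:
        warnings.append(f"number_of_pending_tasks = {pending}")

    return warnings


# Hypothetical sample values, not taken from a real cluster.
sample_health = {
    "status": "yellow",
    "number_of_nodes": 3,
    "number_of_data_nodes": 3,
    "active_primary_shards": 10,
    "active_shards": 18,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 2,
    "number_of_pending_tasks": 0,
    "active_shards_percent_as_number": 90.0,
}

print(check_health(sample_health))
```

A single non-zero pending-task count is not necessarily a problem; what matters is whether it keeps failing to drain across several polls, so a real check would compare consecutive samples.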

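Cluster stats contain summary statistics for the whole cluster, such as document counts, shard counts, and resource usage.

Cluster stats can be retrieved from the following API: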

http://ip:9200/_cluster/stats?pretty

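Key metrics:

indices.count: #Total number of indices.
indices.shards.total: #Total number of shards.
indices.shards.primaries: #Number of primary shards.
indices.docs.count: #Total number of documents.
indices.store.size_in_bytes: #Total size of stored data.
indices.segments.count: #Total number of segments.
nodes.count.total: #Total number of nodes.
nodes.count.data: #Number of data nodes.
nodes.process.cpu.percent: #Node CPU usage.
nodes.fs.total_in_bytes: #Total filesystem capacity.
nodes.fs.free_in_bytes: #Total free filesystem space.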
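As a rough sketch, the fields above can be pulled out of a parsed `_cluster/stats` response and combined into a disk usage percentage. The code assumes the fields are nested under the top-level `indices` and `nodes` objects of the response; the function name `summarize_cluster_stats`, the output keys, and the sample numbers are hypothetical.

```python
def summarize_cluster_stats(stats):
    """Flatten the key fields of a parsed _cluster/stats response."""
    indices = stats["indices"]
    nodes = stats["nodes"]
    fs = nodes["fs"]

    # Used space is derived; the API reports total and free bytes.
    used_bytes = fs["total_in_bytes"] - fs["free_in_bytes"]

    return {
        "index_count": indices["count"],
        "shards_total": indices["shards"]["total"],
        "primary_shards": indices["shards"]["primaries"],
        "doc_count": indices["docs"]["count"],
        "store_bytes": indices["store"]["size_in_bytes"],
        "segment_count": indices["segments"]["count"],
        "node_count": nodes["count"]["total"],
        "data_node_count": nodes["count"]["data"],
        "cpu_percent": nodes["process"]["cpu"]["percent"],
        "disk_used_percent": round(100.0 * used_bytes / fs["total_in_bytes"], 1),
    }


# Hypothetical sample values, not taken from a real cluster.
sample_stats = {
    "indices": {
        "count": 5,
        "shards": {"total": 20, "primaries": 10},
        "docs": {"count": 123456},
        "store": {"size_in_bytes": 500},
        "segments": {"count": 80},
    },
    "nodes": {
        "count": {"total": 3, "data": 3},
        "process": {"cpu": {"percent": 12}},
        "fs": {"total_in_bytes": 1000, "free_in_bytes": 250},
    },
}

summary = summarize_cluster_stats(sample_stats)
print(summary)
```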

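2. Node monitoring

Node monitoring looks at each node individually. Many of these metrics are important for keeping an ES cluster running stably.

Node stats can be retrieved from the following API: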

http://ip:9200/_nodes/stats?pretty

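Key metrics:

name: #Node name.
roles: #Node roles.
indices.docs.count: #Number of indexed documents.
indices.segments.count: #Total number of segments.
jvm.mem.heap_used_percent: #JVM heap usage as a percentage.
thread_pool.{bulk, index, get, search}.{active, queue, rejected}: #Thread pool statistics for the bulk, index, get, and search pools. The main values are the number of active threads, the number of tasks waiting in the queue, and the number of rejected tasks.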
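One way to use these per-node fields is to scan every node for high heap usage or rejected thread pool tasks. The sketch below assumes a parsed `_nodes/stats` response in which each node sits under `nodes.<node id>` and heap usage is reported at `jvm.mem.heap_used_percent`. The function name `find_node_pressure` and the 85% heap threshold are hypothetical choices for illustration, and pools missing from a node's response are skipped.

```python
def find_node_pressure(node_stats, heap_threshold=85,
                       pools=("bulk", "index", "get", "search")):
    """Return (node name, metric, value) tuples that deserve attention."""
    findings = []
    for node in node_stats["nodes"].values():
        name = node["name"]

        # Heap usage above the (assumed) threshold.
        heap = node["jvm"]["mem"]["heap_used_percent"]
        if heap >= heap_threshold:
            findings.append((name, "heap_used_percent", heap))

        # Any rejected task means a pool's queue filled up at some point.
        for pool in pools:
            pool_stats = node.get("thread_pool", {}).get(pool)
            if pool_stats is None:
                continue  # this pool is not reported by this node
            if pool_stats["rejected"] > 0:
                findings.append(
                    (name, f"thread_pool.{pool}.rejected", pool_stats["rejected"])
                )
    return findings


# Hypothetical sample values, not taken from a real cluster.
sample_nodes = {
    "nodes": {
        "id-1": {
            "name": "node-1",
            "jvm": {"mem": {"heap_used_percent": 60}},
            "thread_pool": {
                "get": {"active": 0, "queue": 0, "rejected": 0},
                "search": {"active": 1, "queue": 0, "rejected": 0},
            },
        },
        "id-2": {
            "name": "node-2",
            "jvm": {"mem": {"heap_used_percent": 91}},
            "thread_pool": {
                "get": {"active": 0, "queue": 0, "rejected": 0},
                "search": {"active": 7, "queue": 40, "rejected": 12},
            },
        },
    }
}

print(find_node_pressure(sample_nodes))
```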

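The following metrics are cumulative counters; they reset to zero when the node restarts.

indices.indexing.index_total: #Number of documents indexed.
indices.indexing.index_time_in_millis: #Total time spent indexing.
indices.get.total: #Number of get requests.
indices.get.time_in_millis: #Total time spent on get requests.
indices.search.query_total: #Total number of search query requests.
indices.search.query_time_in_millis: #Total time spent on search query requests.
indices.search.fetch_total: #Total number of fetch operations.
indices.search.fetch_time_in_millis: #Total time spent on fetch requests.
jvm.gc.collectors.young.collection_count: #Number of young generation garbage collections.
jvm.gc.collectors.young.collection_time_in_millis: #Total time spent in young generation garbage collection.
jvm.gc.collectors.old.collection_count: #Number of old generation garbage collections.
jvm.gc.collectors.old.collection_time_in_millis: #Total time spent in old generation garbage collection.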

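Some metrics have to be computed. For node monitoring these fall into two groups, request rates and request processing latencies, described below.

index_per_min: #Index requests per minute, computed as:
#index request rate = (difference in index_total between two samples) / (difference in system time between the two samples, in ms) × 60000 (Formula 1)
indexAverage_per_min: #Index request processing latency, computed as:
#index latency = (difference in index_time_in_millis between two samples) / (difference in index_total between two samples) (Formula 2)
get_per_min: #get requests per minute; use Formula 1 with the corresponding counters.
getAverage_per_min: #get request processing latency; use Formula 2 with the corresponding counters.
merge_per_min: #merge requests per minute; use Formula 1 with the corresponding counters.
mergeAverage_per_min: #merge request processing latency; use Formula 2 with the corresponding counters.
searchQuery_per_min: #query requests per minute; use Formula 1 with the corresponding counters.
searchQueryAverage_per_min: #query request latency; use Formula 2 with the corresponding counters.
searchFetch_per_min: #fetch requests per minute; use Formula 1 with the corresponding counters.
searchFetchAverage_per_min: #fetch request latency; use Formula 2 with the corresponding counters.
youngGc_per_min: #young GC runs per minute; use Formula 1 with the corresponding counters.
youngGcAverage_per_min: #Average young GC duration; use Formula 2 with the corresponding counters.
oldGc_per_min: #old GC runs per minute; use Formula 1 with the corresponding counters.
oldGcAverage_per_min: #Average old GC duration; use Formula 2 with the corresponding counters.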
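Formula 1 and Formula 2 can be written as two small Python helpers. This is a minimal sketch: the function names and the sample numbers are hypothetical, and returning `None` when a counter goes backwards is an added assumption to cope with the counters resetting on a node restart.

```python
def rate_per_minute(count_prev, count_now, t_prev_ms, t_now_ms):
    """Formula 1: counter difference / elapsed time (ms) x 60000."""
    elapsed_ms = t_now_ms - t_prev_ms
    delta = count_now - count_prev
    if elapsed_ms <= 0 or delta < 0:
        return None  # bad timestamps, or the counter was reset by a restart
    return delta * 60000 / elapsed_ms


def average_latency_ms(time_prev_ms, time_now_ms, count_prev, count_now):
    """Formula 2: time counter difference / request counter difference."""
    delta_time = time_now_ms - time_prev_ms
    delta_count = count_now - count_prev
    if delta_count <= 0 or delta_time < 0:
        return None  # no requests in the interval, or the counter was reset
    return delta_time / delta_count


# Two hypothetical samples taken 30 seconds apart.
prev = {"ts_ms": 1_000_000, "index_total": 1000, "index_time_in_millis": 5000}
now = {"ts_ms": 1_030_000, "index_total": 1600, "index_time_in_millis": 5900}

index_per_min = rate_per_minute(
    prev["index_total"], now["index_total"], prev["ts_ms"], now["ts_ms"]
)
index_average = average_latency_ms(
    prev["index_time_in_millis"], now["index_time_in_millis"],
    prev["index_total"], now["index_total"],
)
print(index_per_min, index_average)
```

The same two helpers cover the get, merge, query, fetch, and GC metrics; only the counter pair passed in changes.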

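3. Index monitoring

Index metrics mainly target a single index, but "_all" can also be used to monitor every index in the cluster.

Index metrics can be retrieved from the following API: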

http://ip:9200/_stats?pretty

Key metrics:

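indexname.primaries.docs.count: #Number of documents in the index.

The following metrics are cumulative counters; they reset to zero when the node restarts.

indexname.primaries.indexing.index_total: #Number of documents indexed.
indexname.primaries.indexing.index_time_in_millis: #Total time spent indexing.
indexname.primaries.get.total: #Number of get requests.
indexname.primaries.get.time_in_millis: #Total time spent on get requests.
indexname.primaries.search.query_total: #Total number of search query requests.
indexname.primaries.search.query_time_in_millis: #Total time spent on search query requests.
indexname.primaries.search.fetch_total: #Total number of fetch operations.
indexname.primaries.search.fetch_time_in_millis: #Total time spent on fetch requests.
indexname.primaries.refresh.total: #Total number of refresh requests.
indexname.primaries.refresh.total_time_in_millis: #Total time spent on refresh requests.
indexname.primaries.flush.total: #Total number of flush requests.
indexname.primaries.flush.total_time_in_millis: #Total time spent on flush requests.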
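To show where these fields sit in the response, the sketch below collects a few of the `primaries` counters for each index from a parsed `_stats` response. It assumes the per-index entries are nested under a top-level `indices` object keyed by index name; the function name `primaries_counters`, the index names, and the sample values are hypothetical.

```python
def primaries_counters(index_stats):
    """Collect selected primaries counters for every index in a _stats response."""
    result = {}
    for index_name, body in index_stats["indices"].items():
        primaries = body["primaries"]
        result[index_name] = {
            "docs.count": primaries["docs"]["count"],
            "indexing.index_total": primaries["indexing"]["index_total"],
            "search.query_total": primaries["search"]["query_total"],
            "refresh.total": primaries["refresh"]["total"],
            "flush.total": primaries["flush"]["total"],
        }
    return result


# Hypothetical sample values, not taken from a real cluster.
sample_index_stats = {
    "indices": {
        "logs-a": {
            "primaries": {
                "docs": {"count": 1500},
                "indexing": {"index_total": 1700, "index_time_in_millis": 900},
                "search": {"query_total": 42, "query_time_in_millis": 130},
                "refresh": {"total": 12, "total_time_in_millis": 80},
                "flush": {"total": 3, "total_time_in_millis": 45},
            }
        },
        "logs-b": {
            "primaries": {
                "docs": {"count": 10},
                "indexing": {"index_total": 10, "index_time_in_millis": 5},
                "search": {"query_total": 0, "query_time_in_millis": 0},
                "refresh": {"total": 2, "total_time_in_millis": 4},
                "flush": {"total": 1, "total_time_in_millis": 2},
            }
        },
    }
}

counters = primaries_counters(sample_index_stats)
print(counters["logs-a"])
```

Because the counters are cumulative, two such snapshots taken at different times are needed before per-minute rates or average latencies can be derived for an index.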

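Once you understand the metrics above, you can use Prometheus and Grafana to monitor and display them.

In our test environment, Grafana displays the state of the Elasticsearch cluster.

The metrics collected by Prometheus are fairly comprehensive.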
