Kafka Monitoring (JMXTrans + InfluxDB + Grafana)


1. Introduction

  Environment

  Roles

172.16.133.82   InfluxDB
172.16.133.82   Grafana
172.16.133.82   jmxtrans
172.16.133.82   Kafka (node1)

  Software versions

influxdb-1.7.7.x86_64.rpm
grafana-6.2.5-1.x86_64.rpm
jmxtrans-266.rpm
kafka_2.12-0.10.2.1

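2. Configuration plan

  • jmxtrans can be deployed on every Kafka node or on a single machine. The latter is used here: the cluster is small, and the configuration files can then be managed in one place. For a larger cluster, consider a distributed deployment.
  • The jmxtrans configuration files are split into global metrics (per Kafka node) and topic metrics. Global metrics use one configuration file per node, named like base_172.16.133.82.json. Topic metrics use one configuration file per topic, named like falcon_monitor_us_82.json.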

3. Monitored metrics

  Global metrics

Bytes in per second

"obj" : "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec"
"attr" : [ "Count" ]
"resultAlias":"BytesInPerSec"
"tags"     : {"application" : "BytesInPerSec"}

Bytes out per second

"obj" : "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec"
"attr" : [ "Count" ]
"resultAlias":"BytesOutPerSec"
"tags"     : {"application" : "BytesOutPerSec"}

Bytes rejected per second

"obj" : "kafka.server:type=BrokerTopicMetrics,name=BytesRejectedPerSec"
"attr" : [ "Count" ]
"resultAlias":"BytesRejectedPerSec"
"tags"     : {"application" : "BytesRejectedPerSec"}

Messages written per second

"obj" : "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec"
"attr" : [ "Count" ]
"resultAlias":"MessagesInPerSec"
"tags"     : {"application" : "MessagesInPerSec"}

FetchFollower requests per second

"obj" : "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=FetchFollower"
"attr" : [ "Count" ]
"resultAlias":"RequestsPerSec"
"tags"     : {"request" : "FetchFollower"}

FetchConsumer requests per second

"obj" : "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=FetchConsumer"
"attr" : [ "Count" ]
"resultAlias":"RequestsPerSec"
"tags"     : {"request" : "FetchConsumer"}

Produce requests per second

"obj" : "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=Produce"
"attr" : [ "Count" ]
"resultAlias":"RequestsPerSec"
"tags"     : {"request" : "Produce"}

Memory usage

"obj" : "java.lang:type=Memory"
"attr" : [ "HeapMemoryUsage", "NonHeapMemoryUsage" ]
"resultAlias":"MemoryUsage"
"tags"     : {"application" : "MemoryUsage"}

GC time and count

"obj" : "java.lang:type=GarbageCollector,name=*"
"attr" : [ "CollectionCount","CollectionTime" ]
"resultAlias":"GC"
"tags"     : {"application" : "GC"}

Thread usage

"obj" : "java.lang:type=Threading"
"attr" : [ "PeakThreadCount","ThreadCount" ]
"resultAlias":"Thread"
"tags"     : {"application" : "Thread"}

Maximum number of messages by which a follower replica lags behind the leader

"obj" : "kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica"
"attr" : [ "Value" ]
"resultAlias":"ReplicaFetcherManager"
"tags"     : {"application" : "MaxLag"}

Number of partitions on this broker

"obj" : "kafka.server:type=ReplicaManager,name=PartitionCount"
"attr" : [ "Value" ]
"resultAlias":"ReplicaManager"
"tags"     : {"application" : "PartitionCount"}

Number of under-replicated partitions (partitions whose replicas are not all in sync)

"obj" : "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions"
"attr" : [ "Value" ]
"resultAlias":"ReplicaManager"
"tags"     : {"application" : "UnderReplicatedPartitions"}

Number of leader replicas on this broker

"obj" : "kafka.server:type=ReplicaManager,name=LeaderCount"
"attr" : [ "Value" ]
"resultAlias":"ReplicaManager"
"tags"     : {"application" : "LeaderCount"}

Total time spent on a FetchConsumer request

"obj" : "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchConsumer"
"attr" : [ "Count","Max" ]
"resultAlias":"TotalTimeMs"
"tags"     : {"application" : "FetchConsumer"}

Total time spent on a FetchFollower request

"obj" : "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchFollower"
"attr" : [ "Count","Max" ]
"resultAlias":"TotalTimeMs"
"tags"     : {"application" : "FetchFollower"}

Total time spent on a Produce request

"obj" : "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce"
"attr" : [ "Count","Max" ]
"resultAlias":"TotalTimeMs"
"tags"     : {"application" : "Produce"}
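Several of the metrics above are described as "per second", while the attribute collected is Count, which for Kafka's meter MBeans is a cumulative total. A per-second figure therefore has to be derived from successive samples (in Grafana this is commonly done with InfluxQL's non_negative_derivative). The following is a minimal sketch of that calculation, assuming hypothetical samples; the function name and sample values are illustrative and not part of the original setup.

```python
from typing import List, Optional, Tuple

# (unix timestamp in seconds, cumulative Count read from JMX)
Sample = Tuple[float, float]


def per_second_rates(samples: List[Sample]) -> List[Optional[float]]:
    """Turn cumulative counter samples into per-second rates.

    A counter that goes backwards (for example after a broker restart)
    yields None for that interval, similar in spirit to a
    non-negative derivative.
    """
    rates: List[Optional[float]] = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0 or c1 < c0:
            rates.append(None)
        else:
            rates.append((c1 - c0) / dt)
    return rates


# Hypothetical samples taken 60 s apart; the last one follows a restart.
samples = [(0.0, 1000.0), (60.0, 7000.0), (120.0, 7600.0), (180.0, 50.0)]
print(per_second_rates(samples))
```

With these made-up samples the first interval works out to (7000 - 1000) / 60 = 100 per second, and the interval after the counter reset is reported as None rather than as a negative rate.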

  Topic metrics

Bytes written per second to falcon_monitor_us

"obj" : "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=falcon_monitor_us"
"attr" : [ "Count" ]
"resultAlias":"falcon_monitor_us"
"tags"     : {"application" : "BytesInPerSec"}

Bytes read per second from falcon_monitor_us

"obj" : "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=falcon_monitor_us"
"attr" : [ "Count" ]
"resultAlias":"falcon_monitor_us"
"tags"     : {"application" : "BytesOutPerSec"}

Messages written per second to falcon_monitor_us

"obj" : "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=falcon_monitor_us"
"attr" : [ "Count" ]
"resultAlias":"falcon_monitor_us"
"tags"     : {"application" : "MessagesInPerSec"}

Log end offset of each partition of falcon_monitor_us

"obj" : "kafka.log:type=Log,name=LogEndOffset,topic=falcon_monitor_us,partition=*"
"attr" : [ "Value" ]
"resultAlias":"falcon_monitor_us"
"tags"     : {"application" : "LogEndOffset"}

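  Parameter description

obj is the JMX ObjectName, i.e. the metric to monitor.
attr is an attribute of that ObjectName, i.e. the value of the metric to collect.
resultAlias is the metric name; in InfluxDB it becomes the measurement name.
tags maps to InfluxDB tags, which distinguish different metrics stored in the same measurement. They are used when building graphs in Grafana, so it is recommended to tag every metric.

For global monitoring, each metric is written to the measurement named by its resultAlias, and all Kafka nodes write the same metric to the same measurement. For topic monitoring, all Kafka nodes write a topic's metrics to one measurement named after the topic.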
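To show how obj, attr, resultAlias and tags might fit together in one jmxtrans JSON file, here is a minimal sketch that generates such a file. The writer class name is the one quoted in the stack trace in this article. The writer field names (url, username, password, database, tags) and the credentials are assumptions based on the commonly documented InfluxDB writer settings, so check them against the jmxtrans version in use; the helper function names are hypothetical.

```python
import json

# Writer class name as it appears in the stack trace quoted in this article.
INFLUX_WRITER = "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory"


def build_query(obj, attrs, result_alias, tags, influx):
    """Build one jmxtrans query that writes to InfluxDB (assumed layout)."""
    return {
        "obj": obj,
        "attr": list(attrs),
        "resultAlias": result_alias,
        "outputWriters": [{
            "@class": INFLUX_WRITER,
            "url": influx["url"],
            "username": influx["username"],
            "password": influx["password"],
            "database": influx["database"],
            "tags": dict(tags),
        }],
    }


def build_server_config(host, port, queries):
    """Wrap queries for one Kafka node's JMX endpoint."""
    return {"servers": [{"host": host, "port": str(port), "queries": queries}]}


# Assumed connection settings; the credentials are placeholders.
influx = {
    "url": "http://172.16.133.82:8086/",
    "username": "admin",
    "password": "admin",
    "database": "jmxDB",
}

config = build_server_config("172.16.133.82", 9999, [
    build_query(
        "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec",
        ["Count"],
        "BytesInPerSec",
        {"application": "BytesInPerSec"},
        influx,
    ),
])
print(json.dumps(config, indent=2))
```

Generating the files from one function keeps the per-node and per-topic files consistent, which matters here because every node writes to the same measurements.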

4. Installation and configuration

  kafka

Kafka metrics are collected over JMX, so the JMX port must be enabled when Kafka starts. One way to do this:

cd /data/kafka/bin/
JMX_PORT=9999 nohup ./kafka-server-start.sh ../config/server.properties  >/dev/null 2>&1 &

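Alternatively, find the heap settings in the Kafka startup script kafka-server-start.sh and add export JMX_PORT="9999" there. Placed inside this if block, the export only runs when KAFKA_HEAP_OPTS is not already set: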

if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
    export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
    export JMX_PORT="9999"
fi
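After restarting Kafka it can be useful to confirm that something is listening on the JMX port before configuring jmxtrans. The sketch below is one way to do that with a plain TCP connect; it only shows that the port accepts connections, not that it speaks JMX. The function name is hypothetical, and the demo probes a temporary local socket so that it runs anywhere.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


# Self-contained demo: listen on a free local port, then probe it.
with socket.socket() as srv:
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    demo_port = srv.getsockname()[1]
    print(port_is_open("127.0.0.1", demo_port))
```

For the setup in this article the call would be port_is_open("172.16.133.82", 9999).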

 

  influxDb

Create the jmxDB database:

[devuser@annie thirdparties]$ influx
Connected to http://localhost:8086 version 1.6.2
InfluxDB shell version: 1.7.7
> CREATE DATABASE "jmxDB"
> create retention policy "72_hour" on jmxDB duration 72h replication 1 DEFAULT
> 
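The same statements can also be sent to InfluxDB 1.x over its HTTP /query endpoint, which is handy when the setup is scripted. The sketch below only builds the requests and prints them; the helper name is hypothetical and the base URL assumes the default local port shown in the session above.

```python
from urllib.parse import urlencode
from urllib.request import Request


def influx_query_request(base_url: str, statement: str) -> Request:
    """Build a POST request that sends one InfluxQL statement to /query."""
    body = urlencode({"q": statement}).encode("ascii")
    return Request(base_url.rstrip("/") + "/query", data=body, method="POST")


statements = [
    'CREATE DATABASE "jmxDB"',
    'CREATE RETENTION POLICY "72_hour" ON "jmxDB" DURATION 72h REPLICATION 1 DEFAULT',
]
reqs = [influx_query_request("http://localhost:8086", s) for s in statements]
for r in reqs:
    print(r.get_method(), r.full_url, r.data.decode("ascii"))

# To actually send a request: urllib.request.urlopen(r)
```

Keeping the request construction separate from the sending step makes it easy to review what will be executed before touching the database.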

  jmxtrans

# Check whether the package is already installed
rpm -qa | grep jmx
# Uninstall it
rpm -e jmxXXXXXX
# Download
wget https://github.com/downloads/jmxtrans/jmxtrans/jmxtrans-20121016.145842.6a28c97fbb-0.noarch.rpm
# Install
rpm -ivh jmxtrans-20121016.145842.6a28c97fbb-0.noarch.rpm
# Start (put the JSON configs under /var/lib/jmxtrans before starting);
# it must be started as root
/etc/init.d/jmxtrans start
# or
./jmxtrans.sh start

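Notes:

  The directories below are only the defaults. Starting with jmxtrans.sh start does not use them, and starting with /etc/init.d/jmxtrans start produced the errors described later in this section.

  jmxtrans installation directory: /usr/share/jmxtrans
  jmxtrans configuration file: /etc/sysconfig/jmxtrans
  Default directory for JSON configuration files: /var/lib/jmxtrans/

  Create json and logs directories under the installation directory: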

cd /usr/share/jmxtrans 
mkdir json 
mkdir logs

  Starting with /etc/init.d/jmxtrans start produced the following errors.

Error 1:

Caused by: java.lang.IllegalArgumentException: Invalid type id 'com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory' (for id type 'Id.class'): no such class found
        at org.codehaus.jackson.map.jsontype.impl.ClassNameIdResolver.typeFromId(ClassNameIdResolver.java:89)
        at org.codehaus.jackson.map.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:73)
        at org.codehaus.jackson.map.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:65)
        at org.codehaus.jackson.map.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:81)
        at org.codehaus.jackson.map.deser.CollectionDeserializer.deserialize(CollectionDeserializer.java:118)

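Solution:

  Find the GitHub repository from the official site, download the source, and rebuild it. Then edit jmxtrans.conf and the jmxtrans.sh script so that the rebuilt jar is used instead of the bundled one.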

git clone https://github.com/jmxtrans/jmxtrans.git
cd jmxtrans
mvn clean package -Dmaven.test.skip=true -DskipTests=true

cd /usr/share/jmxtrans

vim jmxtrans.conf
#export JAR_FILE="/usr/share/jmxtrans/jmxtrans-all.jar"
export JAR_FILE="/usr/share/jmxtrans/jmxtrans-271-all.jar"

vim jmxtrans.sh
#JAR_FILE=${JAR_FILE:-"jmxtrans-all.jar"}
JAR_FILE=${JAR_FILE:-"jmxtrans-271-all.jar"}

Comparing the two jars shows that the rebuilt one contains this class while the bundled one does not:

[devuser@annie jmxtrans]$ grep 'com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory' ./jmxtrans-271-all.jar 
Binary file ./jmxtrans-271-all.jar matches
[devuser@annie jmxtrans]$ grep 'com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory' ./jmxtrans-all.jar
[devuser@annie jmxtrans]$ 
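Because a jar is a zip archive, the same check can be made by looking for the class's entry name instead of grepping the binary file. The following is a small sketch of that idea; the function name is hypothetical, and the demo builds a throwaway archive rather than reading the real jmxtrans jars.

```python
import os
import tempfile
import zipfile

CLASS_NAME = "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory"


def jar_has_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar contains the .class entry for class_name."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Demo with a throwaway archive that contains only one empty entry.
with tempfile.TemporaryDirectory() as tmp:
    demo_jar = os.path.join(tmp, "demo-all.jar")
    with zipfile.ZipFile(demo_jar, "w") as jar:
        jar.writestr(CLASS_NAME.replace(".", "/") + ".class", b"")
    print(jar_has_class(demo_jar, CLASS_NAME))
    print(jar_has_class(demo_jar, "com.example.Missing"))
```

Against the real files the calls would be jar_has_class("/usr/share/jmxtrans/jmxtrans-271-all.jar", CLASS_NAME) and the same for jmxtrans-all.jar.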

Error 2:

Starting jmxtrans:                                         [  OK  ]
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=384m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=384m; support was removed in 8.0
MaxTenuringThreshold of 16 is invalid; must be between 0 and 15
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Solution:

# On JDK 8 the maximum value of -XX:MaxTenuringThreshold is 15; the default configuration uses 16
cd /usr/share/jmxtrans
vim jmxtrans.sh
# change -XX:MaxTenuringThreshold=16 to:
-XX:MaxTenuringThreshold=15
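If the same fix has to be applied on several machines, the edit can be scripted instead of done by hand in vim. This is a minimal sketch of the text substitution only, assuming the flag appears in the form -XX:MaxTenuringThreshold=N; the sample line is hypothetical and is not the actual content of jmxtrans.sh.

```python
import re

MAX_ALLOWED = 15  # upper bound reported in the JVM error message above


def clamp_tenuring_threshold(text: str) -> str:
    """Lower any -XX:MaxTenuringThreshold value above MAX_ALLOWED."""
    def fix(match):
        value = min(int(match.group(2)), MAX_ALLOWED)
        return match.group(1) + str(value)

    return re.sub(r"(-XX:MaxTenuringThreshold=)(\d+)", fix, text)


# Hypothetical option line for illustration only.
line = 'GC_OPTS="-XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=16"'
print(clamp_tenuring_threshold(line))
```

Values that are already within range are left unchanged, so the function can be run repeatedly on the same file content.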

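  By default jmxtrans reads its configuration files from /var/lib/jmxtrans, so all configuration files for collecting Kafka metrics must be placed in that directory. The listing below shows files named according to the convention described earlier: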

[root@annie thirdparties]# cd /var/lib/jmxtrans
[root@annie jmxtrans]# ll
total 0
[root@annie jmxtrans]# pwd
/var/lib/jmxtrans
[root@annie jmxtrans]# wget http://qu2lhckc6.hn-bkt.clouddn.com/jmxtrans-kafka/base_172.16.133.82.json
[root@annie jmxtrans]# wget http://qu2lhckc6.hn-bkt.clouddn.com/jmxtrans-kafka/falcon_monitor_us_82.json
[root@annie jmxtrans]# ll
total 16
-rw-r--r-- 1 root root 8462 Jun  2 18:41 base_172.16.133.82.json
-rw-r--r-- 1 root root 2029 Jun  2 18:41 falcon_monitor_us_82.json

Start jmxtrans again with /etc/init.d/jmxtrans start

The collected data should then be visible in InfluxDB:

[devuser@annie jmxtrans]$ influx
Connected to http://localhost:8086 version 1.6.2
InfluxDB shell version: 1.7.7
> show DATABASES
name: databases
name
----
_internal
metrics
jmxDB
> use jmxDB
Using database jmxDB
> show MEASUREMENTS
name: measurements
name
----
BytesInPerSec
BytesOutPerSec
BytesRejectedPerSec
GC
MemoryUsage
MessagesInPerSec
ReplicaFetcherManager
ReplicaManager
RequestsPerSec
Thread
TotalTimeMs
jvmMemory
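The check above can also be automated by running SHOW MEASUREMENTS through the InfluxDB 1.x HTTP API and comparing the result with the expected names. The sketch below only parses a response; the sample JSON is hand-written to follow the 1.x response layout and is not output captured from this setup, and the function names are hypothetical.

```python
import json


def measurement_names(response_text: str) -> set:
    """Extract measurement names from a SHOW MEASUREMENTS JSON response."""
    data = json.loads(response_text)
    names = set()
    for result in data.get("results", []):
        for series in result.get("series", []):
            for row in series.get("values", []):
                names.add(row[0])
    return names


def missing_measurements(response_text: str, expected) -> list:
    """Return the expected measurement names that are not present."""
    return sorted(set(expected) - measurement_names(response_text))


# Hand-written sample in the InfluxDB 1.x response layout.
sample = json.dumps({
    "results": [{
        "statement_id": 0,
        "series": [{
            "name": "measurements",
            "columns": ["name"],
            "values": [["BytesInPerSec"], ["GC"], ["ReplicaManager"]],
        }],
    }]
})
expected = ["BytesInPerSec", "GC", "ReplicaManager", "TotalTimeMs"]
print(missing_measurements(sample, expected))
```

An empty database returns a result without a series key, which this sketch treats as "no measurements" rather than as an error.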

A side note:

  If no data shows up here, drop the database and create it again; after that the data was written successfully.

5. Grafana configuration and preview

  Link: https://pan.baidu.com/s/1NGqdRYKRBCkzuAEESvnfCw (extraction code: qtrv)

  Link: https://pan.baidu.com/s/1xMMOuMwRQsEmTrrUxJf6lw

 

 

