Monitoring Hadoop metrics with Ganglia


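The state of a running Hadoop cluster can be checked through the Hadoop client and the web pages, but those tools cannot show how busy the cluster currently is, for example the number of reads and writes. Fortunately, Hadoop can export its metrics to Ganglia. This article describes how to configure Hadoop and Ganglia to work together.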

Hadoop version: 1.2.1

OS version: CentOS 6.4

JDK version: jdk1.6.0_32

Ganglia version: 3.1.7

Environment

Hostname   IP address        Roles

hadoop1    192.168.124.135   namenode, datanode, secondarynamenode, jobtracker, tasktracker

hadoop2    192.168.124.136   datanode, tasktracker

hadoop3    192.168.124.137   datanode, tasktracker

ganglia    192.168.124.140   gmetad, gmond, ganglia-web

Basic architecture

hadoop1, hadoop2, and hadoop3 send their metrics to gmond on the ganglia node. gmetad periodically polls gmond for the data, and the result is displayed through httpd.
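
The polling step can be illustrated with a short Python sketch. It assumes gmond also exposes its collected state as an XML dump on TCP port 8649 (the stock tcp_accept_channel, which the configuration in this article does not show), and that the dump nests HOST and METRIC elements as in the made-up SAMPLE string. The function names are hypothetical helpers, not part of Ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump gmond writes to a TCP client until it closes.

    Assumes gmond has a tcp_accept_channel on this port.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def metrics_by_host(xml_data):
    """Map each HOST name in a gmond XML dump to its sorted METRIC names."""
    root = ET.fromstring(xml_data)
    result = {}
    for host in root.iter("HOST"):
        names = [metric.get("NAME") for metric in host.iter("METRIC")]
        result[host.get("NAME")] = sorted(names)
    return result


# Made-up sample in the assumed shape of a gmond dump (values are illustrative).
SAMPLE = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
<CLUSTER NAME="my_cluster">
<HOST NAME="hadoop1" IP="192.168.124.135">
<METRIC NAME="jvm.metrics.memHeapUsedM" VAL="41.2" TYPE="float"/>
<METRIC NAME="jvm.metrics.gcCount" VAL="3" TYPE="int32"/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""

if __name__ == "__main__":
    for host, names in metrics_by_host(SAMPLE).items():
        print(host, names)
```

Against a live setup, metrics_by_host(fetch_gmond_xml("192.168.124.140")) would list what gmond currently holds for each Hadoop node, which is a quick way to see whether the Hadoop sinks are reaching gmond before looking at the web page.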

Installing Ganglia

The default yum repositories do not include Ganglia, so the EPEL repository has to be installed first:

rpm -Uvh http://dl.Fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

Run the following on the ganglia node, in order:

yum install ganglia-gmetad

yum install ganglia-gmond

yum install ganglia-web

Once these three commands finish, the whole Ganglia environment is ready, including httpd and PHP.

Configuring Ganglia

vi /etc/ganglia/gmetad.conf and modify data_source:

data_source "my_cluster" ganglia

 

vi /etc/ganglia/gmond.conf

Unicast mode: comment out mcast_join and point host at the ganglia node

cluster {

  name = "my_cluster"

  owner = "unspecified"

  latlong = "unspecified"

  url = "unspecified"

}

 

udp_send_channel {

  #bind_hostname = yes # Highly recommended, soon to be default.

                       # This option tells gmond to use a source address

                       # that resolves to the machine's hostname.  Without

                       # this, the metrics may appear to come from any

                       # interface and the DNS names associated with

                       # those IPs will be used to create the RRDs.

  #mcast_join = 239.2.11.71

  host = 192.168.124.140

  port = 8649

  ttl = 1

}

 

/* You can specify as many udp_recv_channels as you like as well. */

udp_recv_channel {

  #mcast_join = 239.2.11.71

  port = 8649

  #bind = 239.2.11.71

}

 

vi conf/hadoop-metrics2.properties

*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31

*.sink.ganglia.period=10

*.sink.ganglia.supportsparse=true

*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both

*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40

namenode.sink.ganglia.servers=192.168.124.140:8649

datanode.sink.ganglia.servers=192.168.124.140:8649

jobtracker.sink.ganglia.servers=192.168.124.140:8649

tasktracker.sink.ganglia.servers=192.168.124.140:8649

maptask.sink.ganglia.servers=192.168.124.140:8649

reducetask.sink.ganglia.servers=192.168.124.140:8649
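
A missing servers line for one daemon simply means that daemon sends nothing to Ganglia, which is easy to overlook. The following Python sketch checks a properties file for that. It is a simplified reader that only handles key=value lines (no line continuations or ":" separators), and the function names and the DAEMONS list are hypothetical helpers built from the prefixes used above.

```python
# Daemon prefixes taken from the configuration above.
DAEMONS = ["namenode", "datanode", "jobtracker",
           "tasktracker", "maptask", "reducetask"]


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only: values such as the slope setting
        # contain further "=" characters.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_ganglia_servers(props, daemons=DAEMONS):
    """Return the daemon prefixes that have no sink.ganglia.servers entry."""
    return [d for d in daemons
            if f"{d}.sink.ganglia.servers" not in props]


if __name__ == "__main__":
    sample = """
*.sink.ganglia.period=10
*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
namenode.sink.ganglia.servers=192.168.124.140:8649
datanode.sink.ganglia.servers=192.168.124.140:8649
"""
    print(missing_ganglia_servers(parse_properties(sample)))
```

Running the same two functions on the contents of conf/hadoop-metrics2.properties on each node should return an empty list when every daemon has a servers entry.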

 

Startup

First stop the firewall: service iptables stop

Start httpd: service httpd start

Start gmetad: service gmetad start

Start gmond: service gmond start

Start the Hadoop cluster: bin/start-all.sh
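
After the services are started, a quick reachability check can confirm that they are listening. The Python sketch below is only an illustration: the port numbers are assumed defaults (httpd on 80, the gmond XML port on TCP 8649, the gmetad XML port on TCP 8651) and are not taken from this article's configuration. It tests TCP only, so it says nothing about the UDP channel that receives the Hadoop metrics.

```python
import socket

# Assumed default ports, not confirmed by the configuration in this article.
CHECKS = [
    ("httpd", "192.168.124.140", 80),
    ("gmond", "192.168.124.140", 8649),
    ("gmetad", "192.168.124.140", 8651),
]


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(checks=CHECKS):
    """Return one 'name host:port status' line per check."""
    lines = []
    for name, host, port in checks:
        status = "open" if tcp_port_open(host, port) else "unreachable"
        lines.append(f"{name} {host}:{port} {status}")
    return lines
```

Calling print("\n".join(report())) from a machine on the same network prints one line per service. An "unreachable" result for a service that is running may point at a firewall that is still active.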

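Results

As the figure shows, information for ganglia, hadoop1, hadoop2, and hadoop3 is displayed successfully.

hadoop2 and hadoop3 both run only a datanode and a tasktracker, so they show the same set of metrics.

hadoop1 runs three more components than hadoop2 and hadoop3 (namenode, secondarynamenode, and jobtracker), so it additionally reports the dfs.FSNameSystem, dfs.namenode, mapred.Queue, and mapred.jobtracker metrics.

The graphs for all metrics on the hadoop1 node are listed below for anyone interested.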

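Conclusions

  1. Security is not enabled on this Hadoop cluster, since the ugi metrics contain no data.
  2. Ganglia exposes some of Hadoop's parameter values.
  3. Ganglia shows the current system state of Hadoop, including how busy the cluster is.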

 

