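--Date: July 21, 2020
--Author: 飛翔的小胖豬
Purpose of this document
This document uses ansible + shell + consul to push the node_exporter Linux monitoring software to hosts in bulk, add node_exporter to system startup, and register it with consul. It is preparatory work for deploying a prometheus system.
Scope
This document applies to CentOS and Red Hat family operating systems. Because ansible is used to operate on the hosts, each managed Linux host needs Python 2.6 or later. For CentOS and Red Hat this means that, by default, only release 6 and later can be controlled; earlier releases need a separate Python upgrade. Install wget on all managed hosts in advance.
Note:
For pre-6 systems that already carry production workloads, upgrading the Python packages is not recommended; distribute the software and scripts to those machines by hand.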
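Environment preparation
One ansible server: control node for bulk operations
One httpd server: stores the software and script files
One consul server: automatically registers node_exporter as a resource
ansible and consul can be deployed on the same machine or separately. This document does not cover installing or deploying ansible or consul; if you need that, see the author's other deployment notes or search online.
Steps
1. Add the managed nodes to the ansible inventory
2. Write scripts and push the software and scripts to the servers
3. Register node_exporter with consul
4. Check the registration status in consul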
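step1: Configure ansible and the ansible inventory
This document does not cover installing ansible; make sure ansible is already installed in your environment before configuring it.
Configure ansible so that you do not have to type yes the first time it logs in to a managed host over ssh.
In /etc/ansible/ansible.cfg, set host_key_checking = False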
# vim /etc/ansible/ansible.cfg
# additional paths to search for roles in, colon separated
#roles_path = /etc/ansible/roles
# uncomment this to disable SSH key host checking
host_key_checking = False
# change the default callback, you can only have one 'stdout' type enabled at a time.
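As a rough sketch, the same change can be made without opening an editor. The function name set_host_key_checking is hypothetical, the sketch assumes GNU sed, and it is demonstrated on a scratch file rather than on the real /etc/ansible/ansible.cfg. Ansible also reads the ANSIBLE_HOST_KEY_CHECKING environment variable, which can serve the same purpose for a single session.

```shell
#!/bin/bash
# Hypothetical helper: make sure "host_key_checking = False" is active in an
# ansible.cfg-style file. Assumes GNU sed (for -i together with -E).
set_host_key_checking() {
    local cfg="$1"
    if grep -Eq '^[#[:space:]]*host_key_checking[[:space:]]*=' "$cfg"; then
        # Uncomment and/or overwrite the existing setting in place.
        sed -i -E 's/^[#[:space:]]*host_key_checking[[:space:]]*=.*/host_key_checking = False/' "$cfg"
    else
        # No such line: leave the file alone and let the operator decide.
        echo "no host_key_checking line in $cfg; add it under [defaults] by hand" >&2
        return 1
    fi
}

# Demo on a scratch file, not on /etc/ansible/ansible.cfg.
demo_cfg=$(mktemp)
printf '[defaults]\n#host_key_checking = True\n' > "$demo_cfg"
set_host_key_checking "$demo_cfg"
cat "$demo_cfg"
```

Pointing the function at the real configuration file would be a matter of passing /etc/ansible/ansible.cfg as the argument, ideally after taking a backup.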
In /etc/ssh/ssh_config, change the StrictHostKeyChecking ask option to StrictHostKeyChecking no
# vim /etc/ssh/ssh_config
# CheckHostIP yes
# AddressFamily any
# ConnectTimeout 0
StrictHostKeyChecking no
# IdentityFile ~/.ssh/id_rsa
# IdentityFile ~/.ssh/id_dsa
Edit /etc/ansible/hosts and add the hosts to be controlled. This document stores the passwords in plain text.
# vim /etc/ansible/hosts
[node_exporter_group]
node_1 ansible_ssh_host=192.168.111.12 ansible_ssh_user=root ansible_ssh_pass=yinwan
node_2 ansible_ssh_host=192.168.111.124 ansible_ssh_user=root ansible_ssh_pass=yinwan
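Configuration notes:
ansible_ssh_host: IP address or domain name of the managed host
ansible_ssh_user: user account that can log in to the managed host
ansible_ssh_pass: password of the ansible_ssh_user account
To verify the ansible configuration, use ansible to run a command that prints each managed host's hostname; if everything works, the hostnames are displayed.
List how many managed hosts ansible knows about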
[root@prometheus ~]# ansible all --list
  hosts (2):
    node_1
    node_2
[root@prometheus ~]#
Show the hostname of each managed host. If the hostnames are displayed, the hosts can be controlled by ansible and were added successfully.
[root@prometheus ~]# ansible all -m shell -a 'hostname'
node_2 | CHANGED | rc=0 >>
docker_0001
node_1 | CHANGED | rc=0 >>
ole6
[root@prometheus ~]#
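With more than a handful of hosts, reading the status lines by eye becomes tedious. The sketch below filters ad-hoc output like the listing above and prints only the hosts whose status line does not report rc=0. The function name failed_hosts is hypothetical, and the FAILED sample is hand-written for illustration; it is not output captured from the author's environment.

```shell
#!/bin/bash
# Hypothetical helper: read `ansible ... -m shell` ad-hoc output on stdin and
# print every host whose status line does not report rc=0.
failed_hosts() {
    awk -F' [|] ' '
        # Status lines look like "node_1 | CHANGED | rc=0 >>".
        NF >= 2 && $2 ~ /^(CHANGED|SUCCESS|FAILED|UNREACHABLE)/ {
            if ($0 !~ /rc=0 >>/) print $1
        }'
}

# Hand-written sample for illustration only (node_1 is made to fail here).
sample='node_2 | CHANGED | rc=0 >>
docker_0001
node_1 | FAILED | rc=2 >>
ls: cannot access /nonexistent: No such file or directory'
printf '%s\n' "$sample" | failed_hosts

# Typical use (not run here; needs ansible and reachable hosts):
#   ansible all -m shell -a 'hostname' | failed_hosts
```

An empty result would suggest that every host answered with rc=0.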
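step2: Write scripts and push the software and scripts to the servers
This setup serves the node_exporter software and the scripts over HTTP, and the client hosts fetch them from the HTTP server with wget. Before pushing anything you need to set up an HTTP service yourself; the author simply installed httpd with yum and placed the software and scripts in /var/www/html/soft/.
Script overview:
Location: /var/www/html/soft/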
[root@prometheus ~]# ls -l /var/www/html/soft/
total 18300
-rw-r--r-- 1 root root     511 Jul 21 23:47 auto_consul_zc.sh
-rw-r--r-- 1 root root     426 Jul 17 12:48 auto_start_node_exporter.sh
drwxr-xr-x 4 root root      70 Jul 18 22:35 data
-rw-r--r-- 1 root root    2578 Jul 21 22:56 get_soft.sh
-rw-r--r-- 1 root root 9245080 Jul 17 11:44 node_exporter-1.0.0.linux-386.tar.gz
-rw-r--r-- 1 root root 9476268 Jul 17 11:44 node_exporter-1.0.0.linux-amd64.tar.gz
[root@prometheus ~]#
get_soft.sh: runs on the client host; downloads node_exporter and its startup script, then installs node_exporter.
auto_start_node_exporter.sh: starts node_exporter on the client
auto_consul_zc.sh: generates the consul registration commands from the /etc/ansible/hosts list
auto_prometh_server.sh: starts prometheus + consul + alertmanager
# vim get_soft.sh

#!/bin/bash
# Download node_exporter and its startup script from the HTTP server,
# install node_exporter under /usr/local and hook the startup script into rc.local.
if uname -a | grep -i _64 &> /dev/null; then
    echo "64-bit branch"
    if [ -f /root/node_exporter-1.0.0.linux-amd64.tar.gz ]; then
        echo 'node_exporter-1.0.0.linux-amd64.tar.gz file exists'
    else
        if timeout 20 wget -P /root http://192.168.111.83/soft/node_exporter-1.0.0.linux-amd64.tar.gz; then
            if [ -f /usr/local/node_exporter/node_exporter ]; then
                echo "node_exporter file exists, not executing commands."
            else
                if [ -f /root/auto_start_node_exporter.sh ]; then
                    echo 'auto_start_node_exporter.sh exists'
                else
                    if timeout 20 wget -P /root http://192.168.111.83/soft/auto_start_node_exporter.sh; then
                        # Add the startup script to rc.local only once.
                        if cat /etc/rc.d/rc.local | grep -i auto_start_node_exporter &> /dev/null; then
                            echo "startup script is already listed in rc.local"
                        else
                            echo '/root/auto_start_node_exporter.sh' >> /etc/rc.d/rc.local
                            chmod a+x /etc/rc.d/rc.local
                        fi
                    else
                        echo 'auto_start_node_exporter.sh copy failed !!'
                    fi
                fi
                mkdir /scripts/soft/ -p
                tar -zxvf /root/node_exporter-1.0.0.linux-amd64.tar.gz -C /scripts/soft/ && mv /scripts/soft/node_exporter-1.0.0.linux-amd64/ /usr/local/node_exporter
                chmod a+x /root/auto_start_node_exporter.sh
            fi
        else
            echo "node_exporter soft_file copy failed !!"
        fi
    fi
else
    echo "32-bit branch"
    if [ -f /root/node_exporter-1.0.0.linux-386.tar.gz ]; then
        echo 'node_exporter-1.0.0.linux-386.tar.gz file exists'
    else
        if timeout 20 wget -P /root http://192.168.111.83/soft/node_exporter-1.0.0.linux-386.tar.gz; then
            if [ -f /usr/local/node_exporter/node_exporter ]; then
                echo "node_exporter file exists, not executing commands."
            else
                if [ -f /root/auto_start_node_exporter.sh ]; then
                    echo 'auto_start_node_exporter.sh exists'
                else
                    if timeout 20 wget -P /root http://192.168.111.83/soft/auto_start_node_exporter.sh; then
                        # Add the startup script to rc.local only once.
                        if cat /etc/rc.d/rc.local | grep -i auto_start_node_exporter &> /dev/null; then
                            echo "startup script is already listed in rc.local"
                        else
                            echo '/root/auto_start_node_exporter.sh' >> /etc/rc.d/rc.local
                            chmod a+x /etc/rc.d/rc.local
                        fi
                    else
                        echo 'auto_start_node_exporter.sh copy failed !!'
                    fi
                fi
                mkdir /scripts/soft/ -p
                tar -zxvf /root/node_exporter-1.0.0.linux-386.tar.gz -C /scripts/soft/ && mv /scripts/soft/node_exporter-1.0.0.linux-386/ /usr/local/node_exporter
                chmod a+x /root/auto_start_node_exporter.sh
            fi
        else
            echo "node_exporter soft_file copy failed !!"
        fi
    fi
fi
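The 64-bit and 32-bit branches of get_soft.sh differ only in the tarball name. As a hedged sketch of how that choice could be isolated, the function below maps the output of uname -m to a file name. The name pick_tarball and the list of 32-bit architecture strings are assumptions for illustration, not part of the author's script.

```shell
#!/bin/bash
# Hypothetical helper: map a machine architecture string (as printed by
# `uname -m`) to the matching node_exporter 1.0.0 tarball name.
pick_tarball() {
    case "$1" in
        x86_64)
            echo "node_exporter-1.0.0.linux-amd64.tar.gz" ;;
        i386|i486|i586|i686)
            echo "node_exporter-1.0.0.linux-386.tar.gz" ;;
        *)
            echo "unsupported architecture: $1" >&2
            return 1 ;;
    esac
}

# Show what would be fetched on the current machine.
if tarball=$(pick_tarball "$(uname -m)" 2>/dev/null); then
    echo "would download: http://192.168.111.83/soft/$tarball"
else
    echo "no tarball in /var/www/html/soft/ for $(uname -m)"
fi
```

With the name in a variable, the download, extract and move steps would only need to be written once.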
# vim auto_start_node_exporter.sh

#!/bin/bash
# Start node_exporter unless something is already listening on port 9100.
# Uses netstat when available, otherwise ss.
if which netstat &> /dev/null; then
    if netstat -alntup | grep -i 9100 &> /dev/null; then
        echo "9100 port exist netstat !"
    else
        /usr/local/node_exporter/node_exporter &> /dev/null &
    fi
elif which ss &> /dev/null; then
    if ss -alntup | grep -i 9100 &> /dev/null; then
        echo "9100 port exist ss ! "
    else
        /usr/local/node_exporter/node_exporter &> /dev/null &
    fi
else
    echo "Unknown error, not starting the program"
fi
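The startup script decides whether the port is taken with a plain grep for 9100, which would also match an unrelated port such as 19100 or a process ID containing those digits. A stricter check could look like the sketch below; the function name port_is_listening is hypothetical, and it assumes the input is listener-only output such as that of ss -lnt or netstat -lnt.

```shell
#!/bin/bash
# Hypothetical helper: read `ss -lnt` / `netstat -lnt` style output on stdin
# and succeed only if some address ends in ":<port>" exactly.
port_is_listening() {
    grep -Eq ":$1([[:space:]]|\$)"
}

# Demo with a line in the format shown in this document.
if printf 'tcp LISTEN 0 128 :::9100 :::*\n' | port_is_listening 9100; then
    echo "9100 is listening"
fi

# Typical use on a client host (not run here):
#   ss -lnt | port_is_listening 9100 || /usr/local/node_exporter/node_exporter &> /dev/null &
```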
# vim auto_consul_zc.sh

#!/bin/bash
# Extract "<alias> <address>" pairs from the ansible inventory.
cat /etc/ansible/hosts | grep -v '\[' | grep -v ^$ | sed 's/=/ /' | awk '{print $1,$3}' > /prometheus/tmp_consul_list.txt
sleep 1
# Print one consul registration command per host.
cat /prometheus/tmp_consul_list.txt | while read host_name host_addr
do
    echo "curl -X PUT -d '{\"id\": \"${host_name}\",\"name\": \"${host_name}\",\"address\": \"${host_addr}\",\"port\": 9100,\"tags\": [\"test\",\"node\",\"linux\"],\"checks\": [{\"http\": \"http://${host_addr}:9100/metrics\", \"interval\": \"5s\"}]}' http://192.168.111.83:8500/v1/agent/service/register"
done
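The sed and awk pipeline in auto_consul_zc.sh relies on ansible_ssh_host being the first variable on every inventory line. A sketch that looks the variable up by name, and also skips comment lines, might look as follows. The function name inventory_pairs is hypothetical, and the demo inventory is a scratch file modelled on the one in step1.

```shell
#!/bin/bash
# Hypothetical helper: print "<alias> <address>" for every inventory line
# that carries an ansible_ssh_host=... variable, wherever it appears.
inventory_pairs() {
    awk '
        NF == 0 || $1 ~ /^#/ || $1 ~ /^\[/ { next }   # blanks, comments, group headers
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^ansible_ssh_host=/) {
                    addr = $i
                    sub(/^ansible_ssh_host=/, "", addr)
                    print $1, addr
                }
            }
        }' "$1"
}

# Demo on a scratch inventory; note the different variable order on node_2.
demo_inv=$(mktemp)
cat > "$demo_inv" <<'EOF'
[node_exporter_group]
node_1 ansible_ssh_host=192.168.111.12 ansible_ssh_user=root
# a commented-out host is ignored
node_2 ansible_ssh_user=root ansible_ssh_host=192.168.111.124
EOF
inventory_pairs "$demo_inv"
```

The resulting pairs have the same two-column shape as /prometheus/tmp_consul_list.txt, so they could feed the while read loop unchanged.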
# vim auto_prometh_server.sh

#!/bin/bash
sleep 1
prometheus --config.file="/usr/local/prometheus/prometheus.yml" --storage.tsdb.retention.time=90d &> /dev/null &
consul agent -server -bootstrap-expect 1 -data-dir=/usr/local/consul_data/data -ui -bind 192.168.111.83 -client 0.0.0.0 &> /dev/null &
alertmanager --config.file=/usr/local/alertmanager/alertmanager.yml --cluster.advertise-address=0.0.0.0:9093 &> /dev/null &
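Use ansible to push node_exporter and the startup script in bulk, install node_exporter on each host, and add the startup script to system startup.
Before pushing, confirm that the managed hosts have the wget command; if they do not, install wget first.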
[root@prometheus ~]# ansible all -m shell -a 'which wget '
node_1 | CHANGED | rc=0 >>
/usr/bin/wget
node_2 | CHANGED | rc=0 >>
/usr/bin/wget
Use ansible's script module to run /var/www/html/soft/get_soft.sh

[root@prometheus ~]# ansible all -m script -a '/var/www/html/soft/get_soft.sh'
node_2 | CHANGED => {
    "changed": true,
    "rc": 0,
    "stderr": "Shared connection to 192.168.111.124 closed.\r\n",
    "stderr_lines": [
        "Shared connection to 192.168.111.124 closed."
    ],
    "stdout": "64-bit branch\r\n--2020-07-22 15:09:35-- http://192.168.111.83/soft/node_exporter-1.0.0.linux-amd64.tar.gz\r\nConnecting to 192.168.111.83:80... connected.\r\nHTTP request sent, awaiting response... 200 OK\r\nLength: 9476268 (9.0M) [application/x-gzip]\r\nSaving to: ‘/root/node_exporter-1.0.0.linux-amd64.tar.gz’\r\n\r\n\r 0% [ ] 0 --.-K/s \r100%[======================================>] 9,476,268 --.-K/s in 0.05s \r\n\r\n2020-07-22 15:09:35 (188 MB/s) - ‘/root/node_exporter-1.0.0.linux-amd64.tar.gz’ saved [9476268/9476268]\r\n\r\n--2020-07-22 15:09:35-- http://192.168.111.83/soft/auto_start_node_exporter.sh\r\nConnecting to 192.168.111.83:80... connected.\r\nHTTP request sent, awaiting response... 200 OK\r\nLength: 426 [application/x-sh]\r\nSaving to: ‘/root/auto_start_node_exporter.sh’\r\n\r\n\r 0% [ ] 0 --.-K/s \r100%[======================================>] 426 --.-K/s in 0s \r\n\r\n2020-07-22 15:09:35 (77.3 MB/s) - ‘/root/auto_start_node_exporter.sh’ saved [426/426]\r\n\r\nnode_exporter-1.0.0.linux-amd64/\r\nnode_exporter-1.0.0.linux-amd64/node_exporter\r\nnode_exporter-1.0.0.linux-amd64/NOTICE\r\nnode_exporter-1.0.0.linux-amd64/LICENSE\r\n",
    "stdout_lines": [
        "64-bit branch",
        "--2020-07-22 15:09:35-- http://192.168.111.83/soft/node_exporter-1.0.0.linux-amd64.tar.gz",
        "Connecting to 192.168.111.83:80... connected.",
        "HTTP request sent, awaiting response... 200 OK",
        "Length: 9476268 (9.0M) [application/x-gzip]",
        "Saving to: ‘/root/node_exporter-1.0.0.linux-amd64.tar.gz’",
        "",
        "",
        " 0% [ ] 0 --.-K/s ",
        "100%[======================================>] 9,476,268 --.-K/s in 0.05s ",
        "",
        "2020-07-22 15:09:35 (188 MB/s) - ‘/root/node_exporter-1.0.0.linux-amd64.tar.gz’ saved [9476268/9476268]",
        "",
        "--2020-07-22 15:09:35-- http://192.168.111.83/soft/auto_start_node_exporter.sh",
        "Connecting to 192.168.111.83:80... connected.",
        "HTTP request sent, awaiting response... 200 OK",
        "Length: 426 [application/x-sh]",
        "Saving to: ‘/root/auto_start_node_exporter.sh’",
        "",
        "",
        " 0% [ ] 0 --.-K/s ",
        "100%[======================================>] 426 --.-K/s in 0s ",
        "",
        "2020-07-22 15:09:35 (77.3 MB/s) - ‘/root/auto_start_node_exporter.sh’ saved [426/426]",
        "",
        "node_exporter-1.0.0.linux-amd64/",
        "node_exporter-1.0.0.linux-amd64/node_exporter",
        "node_exporter-1.0.0.linux-amd64/NOTICE",
        "node_exporter-1.0.0.linux-amd64/LICENSE"
    ]
}
node_1 | CHANGED => {
    "changed": true,
    "rc": 0,
    "stderr": "Shared connection to 192.168.111.12 closed.\r\n",
    "stderr_lines": [
        "Shared connection to 192.168.111.12 closed."
    ],
    "stdout": "64-bit branch\r\n--2020-07-22 15:09:35-- http://192.168.111.83/soft/node_exporter-1.0.0.linux-amd64.tar.gz\r\nConnecting to 192.168.111.83:80... connected.\r\nHTTP request sent, awaiting response... 200 OK\r\nLength: 9476268 (9.0M) [application/x-gzip]\r\nSaving to: “/root/node_exporter-1.0.0.linux-amd64.tar.gz”\r\n\r\n\r 0% [ ] 0 --.-K/s \r68% [=========================> ] 6,524,944 31.0M/s \r100%[======================================>] 9,476,268 27.7M/s in 0.3s \r\n\r\n2020-07-22 15:09:35 (27.7 MB/s) - “/root/node_exporter-1.0.0.linux-amd64.tar.gz” saved [9476268/9476268]\r\n\r\n--2020-07-22 15:09:35-- http://192.168.111.83/soft/auto_start_node_exporter.sh\r\nConnecting to 192.168.111.83:80... connected.\r\nHTTP request sent, awaiting response... 200 OK\r\nLength: 426 [application/x-sh]\r\nSaving to: “/root/auto_start_node_exporter.sh”\r\n\r\n\r 0% [ ] 0 --.-K/s \r100%[======================================>] 426 --.-K/s in 0s \r\n\r\n2020-07-22 15:09:35 (116 MB/s) - “/root/auto_start_node_exporter.sh” saved [426/426]\r\n\r\nnode_exporter-1.0.0.linux-amd64/\r\nnode_exporter-1.0.0.linux-amd64/node_exporter\r\nnode_exporter-1.0.0.linux-amd64/NOTICE\r\nnode_exporter-1.0.0.linux-amd64/LICENSE\r\n",
    "stdout_lines": [
        "64-bit branch",
        "--2020-07-22 15:09:35-- http://192.168.111.83/soft/node_exporter-1.0.0.linux-amd64.tar.gz",
        "Connecting to 192.168.111.83:80... connected.",
        "HTTP request sent, awaiting response... 200 OK",
        "Length: 9476268 (9.0M) [application/x-gzip]",
        "Saving to: “/root/node_exporter-1.0.0.linux-amd64.tar.gz”",
        "",
        "",
        " 0% [ ] 0 --.-K/s ",
        "68% [=========================> ] 6,524,944 31.0M/s ",
        "100%[======================================>] 9,476,268 27.7M/s in 0.3s ",
        "",
        "2020-07-22 15:09:35 (27.7 MB/s) - “/root/node_exporter-1.0.0.linux-amd64.tar.gz” saved [9476268/9476268]",
        "",
        "--2020-07-22 15:09:35-- http://192.168.111.83/soft/auto_start_node_exporter.sh",
        "Connecting to 192.168.111.83:80... connected.",
        "HTTP request sent, awaiting response... 200 OK",
        "Length: 426 [application/x-sh]",
        "Saving to: “/root/auto_start_node_exporter.sh”",
        "",
        "",
        " 0% [ ] 0 --.-K/s ",
        "100%[======================================>] 426 --.-K/s in 0s ",
        "",
        "2020-07-22 15:09:35 (116 MB/s) - “/root/auto_start_node_exporter.sh” saved [426/426]",
        "",
        "node_exporter-1.0.0.linux-amd64/",
        "node_exporter-1.0.0.linux-amd64/node_exporter",
        "node_exporter-1.0.0.linux-amd64/NOTICE",
        "node_exporter-1.0.0.linux-amd64/LICENSE"
    ]
}
[root@prometheus ~]#
Confirm that node_exporter is present on the clients. If /usr/local/node_exporter/node_exporter exists, the software was installed successfully.
[root@prometheus ~]# ansible all -m shell -a ' ls -l /usr/local/node_exporter/node_exporter '
node_1 | CHANGED | rc=0 >>
-rwxr-xr-x 1 3434 3434 19572271 May 26 14:02 /usr/local/node_exporter/node_exporter
node_2 | CHANGED | rc=0 >>
-rwxr-xr-x 1 3434 3434 19572271 May 26 14:02 /usr/local/node_exporter/node_exporter
[root@prometheus ~]#
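rc=0 means that ls exited with status 0, so the file exists and the software was installed successfully.
Confirm that the auto_start_node_exporter.sh startup script is present on the clients. If /root/auto_start_node_exporter.sh exists, the startup script was fetched successfully.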
[root@prometheus ~]# ansible all -m shell -a ' ls -l /root/auto_start_node_exporter.sh '
node_1 | CHANGED | rc=0 >>
-rwxr-xr-x 1 root root 426 Jul 17 12:48 /root/auto_start_node_exporter.sh
node_2 | CHANGED | rc=0 >>
-rwxr-xr-x 1 root root 426 Jul 17 12:48 /root/auto_start_node_exporter.sh
[root@prometheus ~]#
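rc=0 means that ls exited with status 0, so the startup script exists.
Confirm that the startup script was added to system startup on the clients. If /etc/rc.d/rc.local contains the line /root/auto_start_node_exporter.sh, it was added successfully.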
[root@prometheus ~]# ansible all -m shell -a ' cat /etc/rc.d/rc.local | grep -i auto_start_node_exporter '
node_1 | CHANGED | rc=0 >>
/root/auto_start_node_exporter.sh
node_2 | CHANGED | rc=0 >>
/root/auto_start_node_exporter.sh
[root@prometheus ~]#
rc=0 means that grep found a match, so /etc/rc.d/rc.local contains the line /root/auto_start_node_exporter.sh and the script is configured to run at boot.
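The rc.local entry is only useful if it appears exactly once, which is why get_soft.sh greps before appending. That append-once pattern can be sketched as a small function. The name ensure_line is hypothetical, and the demo writes to a scratch file rather than to /etc/rc.d/rc.local.

```shell
#!/bin/bash
# Hypothetical helper: append a line to a file only if that exact line
# is not already present (grep -x matches whole lines, -F disables regex).
ensure_line() {
    local file="$1" line="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Demo on a scratch file standing in for /etc/rc.d/rc.local.
demo_rc=$(mktemp)
ensure_line "$demo_rc" '/root/auto_start_node_exporter.sh'
ensure_line "$demo_rc" '/root/auto_start_node_exporter.sh'   # second call is a no-op
cat "$demo_rc"
```

Matching the whole line is slightly stricter than the substring grep used in get_soft.sh, so a commented-out entry would not be mistaken for an active one.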
Start node_exporter on the clients with the command below, then check that port 9100 on each client is in the listening state.
[root@prometheus ~]# ansible all -m shell -a 'nohup /root/auto_start_node_exporter.sh '
node_1 | CHANGED | rc=0 >>
nohup: ignoring input
node_2 | CHANGED | rc=0 >>
nohup: ignoring input
[root@prometheus ~]#
The command above calls the client-side script to start node_exporter.
Confirm that port 9100 on the clients is being listened on by node_exporter.
[root@prometheus ~]# ansible all -m shell -a 'ss -alntup | grep -i 9100 '
node_1 | CHANGED | rc=0 >>
tcp LISTEN 0 128 :::9100 :::* users:(("node_exporter",3913,3))
node_2 | CHANGED | rc=0 >>
tcp LISTEN 0 128 :::9100 :::* users:(("node_exporter",pid=10833,fd=3))
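Port 9100 being in the listening state means node_exporter is installed and running.
step3: Register the clients' node_exporter with consul
A shell script extracts the managed host information from /etc/ansible/hosts and generates the consul registration commands automatically.
consul must be installed and configured before this step.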
[root@prometheus ~]# source /var/www/html/soft/auto_consul_zc.sh
curl -X PUT -d '{"id": "node_1","name": "node_1","address": "192.168.111.12","port": 9100,"tags": ["test","node","linux"],"checks": [{"http": "http://192.168.111.12:9100/metrics", "interval": "5s"}]}' http://192.168.111.83:8500/v1/agent/service/register
curl -X PUT -d '{"id": "node_2","name": "node_2","address": "192.168.111.124","port": 9100,"tags": ["test","node","linux"],"checks": [{"http": "http://192.168.111.124:9100/metrics", "interval": "5s"}]}' http://192.168.111.83:8500/v1/agent/service/register
Note:
The IP address in http://192.168.111.83:8500/v1/agent/service/register is hard-coded in auto_consul_zc.sh; change it to the address of your own consul server.
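One way to avoid editing the script for each environment is to read the consul address from a variable. The sketch below prints the same registration command as auto_consul_zc.sh for a single host. The variable name CONSUL_ADDR and the function name register_cmd are assumptions for illustration, with the default set to the address used in this document.

```shell
#!/bin/bash
# Consul agent address; override it in the environment for other setups.
# CONSUL_ADDR is a hypothetical variable name, not something consul itself reads.
CONSUL_ADDR="${CONSUL_ADDR:-192.168.111.83:8500}"

# Hypothetical helper: print the registration command for one host.
register_cmd() {
    local name="$1" addr="$2"
    printf "curl -X PUT -d '%s' http://%s/v1/agent/service/register\n" \
        "{\"id\": \"${name}\",\"name\": \"${name}\",\"address\": \"${addr}\",\"port\": 9100,\"tags\": [\"test\",\"node\",\"linux\"],\"checks\": [{\"http\": \"http://${addr}:9100/metrics\", \"interval\": \"5s\"}]}" \
        "$CONSUL_ADDR"
}

# Print the command for node_1 from the inventory in step1.
register_cmd node_1 192.168.111.12
```

Calling it as CONSUL_ADDR=consul.example.internal:8500 register_cmd node_1 192.168.111.12 (the host name here is a placeholder) would produce the command for a different consul server.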
Copy the commands generated by the script and run them.
[root@prometheus ~]# curl -X PUT -d '{"id": "node_1","name": "node_1","address": "192.168.111.12","port": 9100,"tags": ["test","node","linux"],"checks": [{"http": "http://192.168.111.12:9100/metrics", "interval": "5s"}]}' http://192.168.111.83:8500/v1/agent/service/register
[root@prometheus ~]#
[root@prometheus ~]# curl -X PUT -d '{"id": "node_2","name": "node_2","address": "192.168.111.124","port": 9100,"tags": ["test","node","linux"],"checks": [{"http": "http://192.168.111.124:9100/metrics", "interval": "5s"}]}' http://192.168.111.83:8500/v1/agent/service/register
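step4: Confirm the registration in consul
Open the consul server's address and port in a web browser and check whether the newly registered servers appear in the UI; if they do, registration succeeded.
Once the nodes are visible in the web UI, you can add consul as a service discovery source in prometheus to start collecting data.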
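The registration can also be checked from the command line, since the consul agent lists its registered services at GET /v1/agent/services. The sketch below pulls the service IDs out of such a response without needing a JSON tool. The function name service_ids is hypothetical, and the sample JSON is hand-written in the shape of that response for illustration; it is not output captured from the author's server.

```shell
#!/bin/bash
# Hypothetical helper: read the JSON returned by consul's
# GET /v1/agent/services endpoint on stdin and print one service ID per line.
service_ids() {
    grep -oE '"ID": *"[^"]*"' | sed -E 's/^"ID": *"//; s/"$//'
}

# Hand-written sample for illustration only.
sample='{"node_1":{"ID":"node_1","Service":"node_1","Address":"192.168.111.12","Port":9100},"node_2":{"ID":"node_2","Service":"node_2","Address":"192.168.111.124","Port":9100}}'
printf '%s\n' "$sample" | service_ids

# Typical use (not run here; needs a reachable consul agent):
#   curl -s http://192.168.111.83:8500/v1/agent/services | service_ids
```

If node_1 and node_2 both appear in the list, the two curl registrations from step3 were accepted.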