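I previously introduced a web-based monitoring tool for Erlang; what about a character terminal? Erlang ships a set of modules for monitoring system load, covering CPU, disk, and memory usage. These modules are packaged as the os_mon application, and os_mon must be running before any collected system information is available. os_mon depends on the sasl application, so sasl has to be started first. If os_mon is not running, or the operating system is not supported, the calls return invalid placeholder values. Let's try it out: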
os_mon
Eshell V5.9 (abort with ^G)
1> application:start(os_mon).
{error,{not_started,sasl}}
2> application:start(sasl).
ok
3>
...... ...... %% most of the application startup output is omitted here
=PROGRESS REPORT==== 22-Nov-2012::00:37:33 ===
application: sasl
started_at: nonode@nohost
3> application:start(os_mon).
............. %% output omitted
=PROGRESS REPORT==== 22-Nov-2012::00:37:36 ===
application: os_mon
started_at: nonode@nohost
ok
4>
=PROGRESS REPORT==== 22-Nov-2012::00:37:36 ===
supervisor: {local,kernel_safe_sup}
started: [{pid,<0.56.0>},
          {name,timer_server},
          {mfargs,{timer,start_link,[]}},
          {restart_type,permanent},
          {shutdown,1000},
          {child_type,worker}]
4> disksup:get_disk_data().
[{"/",18102140,73},
 {"/dev/shm",1962120,0},
 {"/boot",495844,11}]
5> memsup:get_memory_data().
{4018425856,739979264,{<0.26.0>,284328}}
6> memsup:get_os_wordsize().
64
7>
7> disksup:get_almost_full_threshold().
80
11> disksup:set_almost_full_threshold(0.65).
ok
12> disksup:get_almost_full_threshold().
65
13>
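The raw tuples returned above are easier to read once summarized. The following is a minimal sketch, not part of os_mon: the module name osmon_summary and both function names are made up for illustration, and the code only assumes the data shapes visible in the session ({Total, Allocated, Worst} from memsup:get_memory_data/0 and [{MountPoint, KByte, UsedPercent}] from disksup:get_disk_data/0).

```erlang
-module(osmon_summary).
-export([memory_percent/1, disks_over/2]).

%% Hypothetical helper. Takes the tuple returned by
%% memsup:get_memory_data/0: {TotalBytes, AllocatedBytes, Worst}.
%% Returns the allocated share as a float percentage, or 'undefined'
%% when no data has been collected yet (Total is 0).
memory_percent({0, _Allocated, _Worst}) ->
    undefined;
memory_percent({Total, Allocated, _Worst}) when Total > 0 ->
    100 * Allocated / Total.

%% Hypothetical helper. Takes the list returned by
%% disksup:get_disk_data/0: [{MountPoint, SizeKByte, UsedPercent}].
%% Returns {MountPoint, UsedPercent} for every entry whose usage is
%% at or above Threshold (an integer percentage).
disks_over(DiskData, Threshold) ->
    [{Id, Used} || {Id, _KByte, Used} <- DiskData, Used >= Threshold].
```

With the numbers from the session, memory_percent/1 gives roughly 18 percent, and disks_over/2 with a threshold of 65 returns only the "/" entry.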
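The session above looks at disk and memory usage. disksup and memsup offer similar interfaces: query the current state, get or set the alarm threshold, and get or set the sampling interval. Next comes CPU monitoring; note that it only supports Unix. cpu_sup lets you query the number of OS processes currently running and the system load average over the last 1, 5, and 15 minutes (look familiar?);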
14> cpu_sup:nprocs().
167
15> cpu_sup:avg1().
0
16> cpu_sup:avg5().
0
17> cpu_sup:avg15().
0
18> cpu_sup:util().
0.5439432506552185
20> cpu_sup:util([detailed]).
{[0],
 [{soft_irq,0.0020078998462259544},
  {hard_irq,1.1279168175796845e-4},
  {kernel,0.3190071022063228},
  {nice_user,1.6372986061640582e-4},
  {user,0.22264870068088854}],
 [{steal,0.0},
  {idle,99.45393492597752},
  {wait,0.0021248497466662443}],
 []}
21>
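One detail worth knowing when reading avg1/avg5/avg15: cpu_sup reports load averages as integers scaled by 256, so a return value of 256 corresponds to the 1.00 that top or uptime would print. A small sketch of the conversion follows; the module name load_avg and its functions are hypothetical, and current/0 assumes os_mon is already running on a Unix system.

```erlang
-module(load_avg).
-export([to_float/1, current/0]).

%% Convert a cpu_sup load value (scaled by 256) into the familiar
%% floating-point figure shown by tools such as top.
to_float(Scaled) when is_integer(Scaled), Scaled >= 0 ->
    Scaled / 256.

%% Collect the three load averages as floats.
%% Assumption: os_mon is started and the platform is supported; if
%% cpu_sup returns {error, Reason}, to_float/1 fails with
%% function_clause instead of hiding the problem.
current() ->
    [{avg1,  to_float(cpu_sup:avg1())},
     {avg5,  to_float(cpu_sup:avg5())},
     {avg15, to_float(cpu_sup:avg15())}].
```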
As mentioned, it is only available on Unix. On Windows you get this instead:
cpu_sup:util([detailed]).
=ERROR REPORT==== 21-Nov-2012::17:25:11 ===
OS_MON (cpu_sup) called by <0.30.0>, unavailable
{all,0,0,[]}
[os_mon official documentation] http://www.erlang.org/doc/apps/os_mon/index.html
etop
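The tool used most often day to day is probably etop. There are two ways to run it:
[1] From the /usr/local/lib/erlang/lib/observer-1.0/priv/bin directory (the exact path depends on your installation), run: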
./etop -node t@zen.com -setcookie abc -lines 10 -sort memory -interval 10 -accumulate true -tracing on
You may run into the following problem:
[root@localhost bin]# ./etop -name m@zen.com -node t@zen.com -setcookie abc -lines 5 -sort memory -interval 10 -accumulate true -tracing on
Erlang R15B (erts-5.9) [source] [64-bit] [async-threads:0] [hipe] [kernel-poll:false]
Error Couldn't connect to node 't@zen.com'
=ERROR REPORT==== 22-Nov-2012::20:00:45 ===
** System NOT running to use fully qualified hostnames **
** Hostname zen.com is illegal **
Usage of the erlang top program
Options are set as command line parameters as in -node a@host -..
or as parameter to etop:start([{node, a@host}, {...}])
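The cause is simple: etop cannot connect to the node to be monitored. The error report says the etop node is not running with fully qualified hostnames, so it cannot reach t@zen.com. This kind of problem is familiar by now: open the etop script and adjust the sname and setcookie settings in it.
[2] The other way is to attach to the node to be monitored and run etop from there. Because etop blocks shell input while it runs, we spawn a process to do the work: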
spawn(fun() -> etop:start([{output, text}, {interval, 20}, {lines, 20}, {sort, memory}]) end).
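The official documentation, unusually, introduces etop with both text and screenshots: http://www.erlang.org/doc/apps/observer/etop_ug.html
RabbitMQ monitoring module
When using RabbitMQ we can configure a memory high watermark for Rabbit; the module that monitors memory usage is vm_memory_monitor [link]. The approach this module takes is a general one: detect the operating system type --> run an OS-specific command through os:cmd to obtain system information --> parse the result. Here is a simple example that fetches the system version and does a basic ping: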
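-module(m).
-compile(export_all).

%% Return the output of "uname -a" as a string.
uname() ->
    cmd("uname -a").

%% Ping Host five times and return the output split into lines.
ping5(Host) when is_list(Host) ->
    R = cmd("ping -c 5 " ++ Host),
    string:tokens(R, "\n").

%% Run Command only if its executable can be found on the PATH.
cmd(Command) ->
    Exec = hd(string:tokens(Command, " ")),
    case os:find_executable(Exec) of
        false -> throw({command_not_found, Exec});
        _ -> os:cmd(Command)
    end.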
Let's look at the result:
Eshell V5.9 (abort with ^G)
1> m:uname().
"Linux localhost 2.6.32-279.9.1.el6.x86_64 #1 SMP Tue Sep 25 21:43:11 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux\n"
2> m:ping5("www.baidu.com").
["PING www.a.shifen.com (61.135.169.105) 56(84) bytes of data.",
 "64 bytes from 61.135.169.105: icmp_seq=1 ttl=51 time=3.34 ms",
 "64 bytes from 61.135.169.105: icmp_seq=2 ttl=51 time=3.01 ms",
 "64 bytes from 61.135.169.105: icmp_seq=3 ttl=51 time=2.85 ms",
 "64 bytes from 61.135.169.105: icmp_seq=4 ttl=51 time=2.76 ms",
 "64 bytes from 61.135.169.105: icmp_seq=5 ttl=51 time=2.87 ms",
 "--- www.a.shifen.com ping statistics ---",
 "5 packets transmitted, 5 received, 0% packet loss, time 4009ms",
 "rtt min/avg/max/mdev = 2.760/2.970/3.348/0.205 ms"]
3>
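The "detect the OS, run a command, parse the result" pattern described above can be sketched for the first and last steps as pure functions. This is an illustration only, not code from vm_memory_monitor: the module name ping_util and both functions are hypothetical. It relies on two facts: os:type() returns tuples such as {unix, linux} or {win32, nt}, and the Windows ping command takes -n for the packet count where Unix takes -c. The parser assumes the Linux output format shown in the session above.

```erlang
-module(ping_util).
-export([ping_command/3, packet_loss/1]).

%% Step 1 and 2: pick the command line for the given os:type() result.
%% Windows ping uses -n for the count, Unix ping uses -c.
%% Note: Host is concatenated into a shell command, so it should only
%% ever come from a trusted source.
ping_command({win32, _}, Host, Count) ->
    "ping -n " ++ integer_to_list(Count) ++ " " ++ Host;
ping_command({unix, _}, Host, Count) ->
    "ping -c " ++ integer_to_list(Count) ++ " " ++ Host.

%% Step 3: scan the output lines for the Linux-style summary
%% "... N% packet loss ..." and return the percentage as a string.
packet_loss([]) ->
    not_found;
packet_loss([Line | Rest]) ->
    case re:run(Line, "([0-9.]+)% packet loss",
                [{capture, all_but_first, list}]) of
        {match, [Pct]} -> {ok, Pct};
        nomatch -> packet_loss(Rest)
    end.
```

A possible use would be packet_loss(string:tokens(os:cmd(ping_command(os:type(), Host, 5)), "\n")), keeping in mind that the parser only recognizes the Linux wording.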
entop
This Erlang monitoring tool has a clearly stated goal: "A top-like Erlang node monitoring tool". First, how to use it.
Start a node to be monitored:
erl -name t@zen.com -setcookie abc
Then start entop and connect to it:
./entop t@zen.com -name top@zen.com -setcookie abc
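The result is shown in the screenshot:
Its approach works like this: after the cookie and name parameters are configured and the monitoring node has started, entop immediately "injects" a data-collection module, entop_collector, into the node to be monitored;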
remote_load_code(Module, Node) ->
    {_, Binary, Filename} = code:get_object_code(Module),
    rpc:call(Node, code, load_binary, [Module, Filename, Binary]).
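Before and after starting entop, you can run entop_collector:get_data(). on the monitored node to see the difference. The output is long, so only a fragment is shown here;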
(t@zen.com)4> entop_collector:get_data().
{ok,[{uptime,{330312,169}},
     {local_time,{{2012,11,22},{19,7,24}}},
     {process_count,32},
     {run_queue,0},
     {reduction_count,{992425,2841}},
     {process_memory_used,2106328},
     {process_memory_total,2107083},
     {memory,[{system,6730661},
              {atom,194289},
              {atom_used,173447},
              {binary,733296},
              {code,3866052},
              {ets,248144}]}],
    [[{pid,"<0.0.0>"},
      {registered_name,init},
      {reductions,4154},
      {message_queue_len,0},
      {heap_size,1597},
      {stack_size,2},
      {total_heap_size,2207}],
     [{pid,"<0.3.0>"},
      {registered_name,erl_prim_loader},
      ... ...
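The injection step can fail in ways remote_load_code/2 does not handle: code:get_object_code/1 returns the atom error when the module's object code cannot be found, and rpc:call/4 returns {badrpc, Reason} when the call fails. The following sketch shows one way to surface both cases. It is a hypothetical variant for illustration, not entop's code; the module name remote_probe and the error tuples are made up.

```erlang
-module(remote_probe).
-export([inject/2, fetch/3]).

%% Load Module's object code on Node, as remote_load_code/2 does,
%% but report a missing beam file instead of crashing with badmatch.
inject(Module, Node) ->
    case code:get_object_code(Module) of
        {Module, Binary, Filename} ->
            rpc:call(Node, code, load_binary, [Module, Filename, Binary]);
        error ->
            {error, {no_object_code, Module}}
    end.

%% Call Module:Function() on Node and turn a failed rpc into an
%% error tuple. For entop this would be
%% fetch(Node, entop_collector, get_data).
fetch(Node, Module, Function) ->
    case rpc:call(Node, Module, Function, []) of
        {badrpc, Reason} -> {error, Reason};
        Result -> Result
    end.
```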
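With a collection module like this in place, the monitored node's data can be fetched over rpc. One problem remains: a top-like tool has to keep redrawing the character terminal and respond to user commands. How is that handled? entop relies on another open source project, cecho, whose design goal is exactly "An ncurses library for Erlang", which fits perfectly. Here are the two parts we care about most.
Refreshing the character screen: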
update_screen(Time, HeaderData, RowDataList, State) ->
    print_nodeinfo(State),
    draw_title_bar(State),
    print_showinfo(State, Time),
    {Headers, State1} = process_header_data(HeaderData, State),
    lists:foldl(fun(Header, Y) ->
                        cecho:hline($ , ?MAX_HLINE),
                        cecho:mvaddstr(Y, 0, Header),
                        Y + 1
                end, 1, Headers),
    {RowList, State2} = process_row_data(RowDataList, State1),
    SortedRowList = sort(RowList, State),
    {Y, _} = cecho:getmaxyx(),
    StartY = (Y-(Y-7)),
    lists:foreach(fun(N) ->
                          cecho:move(N, 0),
                          cecho:hline($ , ?MAX_HLINE)
                  end, lists:seq(StartY, Y)),
    update_rows(SortedRowList, State2#state.columns, StartY, Y),
    cecho:refresh(),
    State2.
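update_screen/4 sorts the rows before drawing them, but the body of sort/2 is not shown here. As a rough idea of what sorting for a top-like view involves, the sketch below orders process entries by one key and keeps the first N. It is an assumption-laden illustration, not entop's implementation: it works directly on the property lists returned by entop_collector:get_data() rather than on entop's internal row format, and the module name row_sort is hypothetical.

```erlang
-module(row_sort).
-export([top_by/3]).

%% Sort a list of per-process property lists by Key, largest value
%% first, and keep at most N entries. Entries without Key count as 0.
top_by(Key, N, Procs) ->
    Sorted = lists:sort(
               fun(A, B) ->
                       proplists:get_value(Key, A, 0) >=
                           proplists:get_value(Key, B, 0)
               end, Procs),
    lists:sublist(Sorted, N).
```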
Responding to user commands:
control(ViewPid) ->
    P = cecho:getch(),
    case P of
        N when N >= 49 andalso N =< 57 -> ViewPid ! {sort, N - 48}, control(ViewPid);
        $> -> ViewPid ! {sort, next}, control(ViewPid);
        $< -> ViewPid ! {sort, prev}, control(ViewPid);
        $r -> ViewPid ! reverse_sort, control(ViewPid);
        $q -> do_exit(ViewPid);
        3 -> do_exit(ViewPid); %Ctrl-C
        _ -> ViewPid ! force_update, control(ViewPid)
    end.
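The key handling in control/1 boils down to a mapping from key codes to actions: the codes 49 to 57 are the characters $1 to $9, and subtracting 48 ($0) yields the column number. A hypothetical way to express the same mapping as a pure function, which can be exercised without a terminal or cecho, is sketched below. The module name key_map and the {send, Msg} / quit return values are made up for illustration.

```erlang
-module(key_map).
-export([command/1]).

%% Map a key code (as returned by a getch-style call) to an action:
%% {send, Msg} means "send Msg to the view process and keep looping",
%% quit means "leave the control loop".
command(N) when N >= $1, N =< $9 -> {send, {sort, N - $0}};
command($>) -> {send, {sort, next}};
command($<) -> {send, {sort, prev}};
command($r) -> {send, reverse_sort};
command($q) -> quit;
command(3)  -> quit;                 % Ctrl-C
command(_)  -> {send, force_update}.
```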
The two projects can be found at:
[1] ENTOP https://github.com/mazenharake/entop
[2] CECHO https://github.com/mazenharake/cecho
You may also be interested in : ) [Erlang 0054] Erlang Web Monitoring Tools