[Erlang 0041] io:format Explained in Detail


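  I recently ran into several problems, all related to Erlang shell output. The problems are solved, but the questions they raised deserve to be followed further, and the next few articles will all revolve around this topic. So let's start with io:format("hello world!").
%% Source path: \erl5.9\lib\stdlib-1.18\src\io.erl
format(Format) ->
    format(Format, []).

format(Format, Args) ->
    format(default_output(), Format, Args).

format(Io, Format, Args) ->
    o_request(Io, {format,Format,Args}, format).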

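 Open \erl5.9\lib\stdlib-1.18\src\io.erl: a call to io:format("hello world!") clearly ends up in format(default_output(), Format, Args). Our first question is this default_output, whose implementation is very simple:
default_output() ->
    group_leader().
 The group_leader/0 here is actually erlang:group_leader(). Here is how the official documentation describes it: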

group_leader() -> GroupLeader

Types:
GroupLeader = pid()
     Returns the pid of the group leader for the process which evaluates the function.
   Every process is a member of some process group and all groups have a group leader. All IO from the group is channeled to the group leader. When a new process is spawned, it gets the same group leader as the spawning process. Initially, at system start-up, init is both its own group leader and the group leader of all processes.
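
The inheritance rule quoted above can be checked with a small sketch. The module name gl_inherit and its function names are made up for this illustration; only spawn/1 and group_leader/0 come from Erlang itself.

```erlang
-module(gl_inherit).
-export([child_leader/0, same_as_parent/0]).

%% Spawn a child and have it report its own group leader back to us.
%% (Hypothetical helper, not part of OTP.)
child_leader() ->
    Parent = self(),
    Ref = make_ref(),
    spawn(fun() -> Parent ! {Ref, group_leader()} end),
    receive
        {Ref, Leader} -> Leader
    after 1000 ->
        timeout
    end.

%% True when the child's group leader is the same pid as the caller's.
same_as_parent() ->
    child_leader() =:= group_leader().
```

Calling gl_inherit:same_as_parent() from the shell should return true, since the child is spawned by the calling process and never changes its group leader.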

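 Erlang processes are not isolated: every process belongs to a process group, and every process group has a group leader. All I/O from a process group is redirected to its group leader. When a process is created, it inherits the group leader of its parent. At system initialization, init is the group leader of itself and of all other processes. Let's verify this in the Erlang shell:
1. What is the group leader of a process spawned in the shell?
2. What is the group leader of the current shell?
3. What is the group leader of the init process?
4. Crash the shell so that it restarts, and keep observing.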
Eshell V5.9  (abort with ^G)
1> self().
<0.30.0>
2> P=spawn(fun()-> receive after infinity -> hello end end ).
<0.33.0>
3> erlang:process_info(P).
[{current_function,{erl_eval,receive_clauses,8}},
{initial_call,{erlang,apply,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[]},
{dictionary,[]},
{trap_exit,false},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>}, %% Note: this is the group leader of process P
{total_heap_size,233},
{heap_size,233},
{stack_size,10},
{reductions,18},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,0}]},
{suspending,[]}]
4> erlang:process_info(pid(0,23,0)). %% Follow up: what kind of process is P's group leader?
[{registered_name,user}, %% Note: the registered_name of P's group leader is user!
{current_function,{user,server_loop,2}},
{initial_call,{erlang,apply,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.21.0>,<0.24.0>,#Port<0.319>,<0.5.0>]},
{dictionary,[{unicode,false},
{read_mode,list},
{shell,<0.24.0>}]},
{trap_exit,true},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>}, %% the group leader of user is user itself
{total_heap_size,3194},
{heap_size,2584},
{stack_size,9},
{reductions,1310},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,2}]},
{suspending,[]}]
5> whereis(init). %% Look at the metadata of the init process
<0.0.0>
6> erlang:process_info(pid(0,0,0)).
[{registered_name,init},
{current_function,{init,loop,1}},
{initial_call,{otp_ring0,start,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.5.0>,<0.6.0>,<0.3.0>]},
{dictionary,[]},
{trap_exit,true},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.0.0>}, % the group leader of init is init itself
{total_heap_size,1974},
{heap_size,1597},
{stack_size,2},
{reductions,2357},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,4}]},
{suspending,[]}]
7> erlang:process_info(pid(0,30,0)). % Now back to the current shell: what is its group leader?
[{current_function,{erl_eval,do_apply,6}},
{initial_call,{erlang,apply,2}},
{status,running},
{message_queue_len,0},
{messages,[]},
{links,[<0.24.0>]},
{dictionary,[]},
{trap_exit,false},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>}, % Remember whose pid this is? Right, it is user
{total_heap_size,3571},
{heap_size,2584},
{stack_size,24},
{reductions,18941},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,6}]},
{suspending,[]}]
8> self(). %% Next we are going to crash the current shell
<0.30.0>
9> 1/0.
** exception error: bad argument in an arithmetic expression
in operator '/'/2
called as 1 / 0
10> self(). %% Check again: the shell's pid has changed
<0.41.0>
11> erlang:process_info(pid(0,41,0)).
[{current_function,{erl_eval,do_apply,6}},
{initial_call,{erlang,apply,2}},
{status,running},
{message_queue_len,0},
{messages,[]},
{links,[<0.24.0>]},
{dictionary,[]},
{trap_exit,false},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>}, %% Note: the group leader here is still user
{total_heap_size,3571},
{heap_size,2584},
{stack_size,24},
{reductions,3281},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,8}]},
{suspending,[]}]
12>
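   At system startup the init process is created first (Pid <0.0.0>) and is its own group leader. As noted above, it is also the group leader of all processes, at least logically, because it is created first. Yet for the processes we spawn in the shell, and for the shell process itself, the group leader is the user process!
  What is the user process for? Here is the description from the official documentation: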

user 
Standard I/O Server
DESCRIPTION
user is a server which responds to all the messages defined in the I/O interface. The code in user.erl can be used as a model for building alternative I/O servers.

 So user is the standard I/O server. Here is a fragment from the creation of the user process:
%% Source path: \erl5.9\lib\kernel-2.15\src\user.erl
run(P) ->
    put(read_mode,list),
    put(unicode,false),
    case init:get_argument(noshell) of
        %% non-empty list -> noshell
        {ok, [_|_]} ->
            put(shell, noshell),
            server_loop(P, queue:new());
        _ ->
            group_leader(self(), self()),
            catch_loop(P, start_init_shell())
    end.
As you can see, this calls group_leader(self(), self()). Here is what that function does:
group_leader(GroupLeader, Pid) -> true
 Types:
GroupLeader = Pid = pid()

Sets the group leader of Pid to GroupLeader. Typically, this is used when a processes started from a certain shell should have another group leader than init.
See also group_leader/0.

This function sets the group leader of a process (Pid) to GroupLeader. The group_leader(self(), self()) call made by user above therefore makes user its own group leader.
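
To make the effect of group_leader/2 concrete, here is a minimal sketch in which a worker process adopts an arbitrary placeholder process as its group leader. The module gl_set and the placeholder process are assumptions made for the illustration; the placeholder does not handle any I/O, so the worker deliberately avoids printing.

```erlang
-module(gl_set).
-export([demo/0]).

%% Returns {InheritedFromParent, ChangedToPlaceholder}.
%% (Hypothetical helper, not part of OTP.)
demo() ->
    Parent = self(),
    Ref = make_ref(),
    %% A placeholder "leader" that just waits to be stopped.
    Leader = spawn(fun() -> receive stop -> ok end end),
    spawn(fun() ->
        Before = group_leader(),              % inherited from Parent
        true = group_leader(Leader, self()),  % switch our own leader
        Parent ! {Ref, Before, group_leader()}
    end),
    receive
        {Ref, Before, After} ->
            Leader ! stop,
            {Before =:= group_leader(), After =:= Leader}
    after 1000 ->
        timeout
    end.
```

gl_set:demo() should return {true,true}: the worker starts with the caller's group leader and ends with the placeholder.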
 
  So far we have not walked through the whole of io:format. Going back to the implementation of io:format, we are now at o_request(Io, {format,Format,Args}, format). Let's keep following:
%% Source path: \erl5.9\lib\stdlib-1.18\src\io.erl
o_request(Io, Request, Func) ->
    case request(Io, Request) of  % the value of Io here is the group leader
        {error, Reason} ->
            [_Name | Args] = tuple_to_list(to_tuple(Request)),
            {'EXIT',{get_stacktrace,[_Current|Mfas]}} = (catch erlang:error(get_stacktrace)),
            erlang:raise(error, conv_reason(Func, Reason), [{io, Func, [Io | Args]}|Mfas]);
        Other ->
            Other
    end.

request(Request) ->
    request(default_output(), Request).

request(standard_io, Request) ->
    request(group_leader(), Request);
request(Pid, Request) when is_pid(Pid) ->  %% look here: this is the clause we enter
    %% io_request/2 converts the request into the message format
    %% (its implementation is not reproduced here)
    execute_request(Pid, io_request(Pid, Request));
request(Name, Request) when is_atom(Name) ->
    case whereis(Name) of
        undefined ->
            {error, arguments};
        Pid ->
            request(Pid, Request)
    end.

execute_request(Pid, {Convert,Converted}) ->  %% and then we arrive here
    Mref = erlang:monitor(process, Pid),
    %% A message is sent to the group leader here; next we look at
    %% what the user process does when it receives it
    Pid ! {io_request,self(),Pid,Converted},
    if
        Convert ->
            convert_binaries(wait_io_mon_reply(Pid, Mref));
        true ->
            wait_io_mon_reply(Pid, Mref)
    end.
The trace above ends with the converted request being sent to the group leader as a message. Next we follow it into user.erl to see how that message is received and handled:
%% Source path: \erl5.9\lib\kernel-2.15\src\user.erl
server_loop(Port, Q) ->
    receive
        {io_request,From,ReplyAs,Request} when is_pid(From) ->
            server_loop(Port, do_io_request(Request, From, ReplyAs, Port, Q));
        {Port,{data,Bytes}} ->
            case get(shell) of
                noshell ->
                    server_loop(Port, queue:snoc(Q, Bytes));
                _ ->
                    case contains_ctrl_g_or_ctrl_c(Bytes) of
                        false ->
                            server_loop(Port, queue:snoc(Q, Bytes));
                        _ ->
                            throw(new_shell)
                    end
            end;
        {Port, eof} ->
            put(eof, true),
            server_loop(Port, Q);

        %% Ignore messages from port here.
        {'EXIT',Port,badsig} ->  % Ignore badsig errors
            server_loop(Port, Q);
        {'EXIT',Port,What} ->    % Port has exited
            exit(What);

        %% Check if shell has exited
        {'EXIT',SomePid,What} ->
            case get(shell) of
                noshell ->
                    server_loop(Port, Q);  % Ignore
                _ ->
                    throw({unknown_exit,{SomePid,What},Q})
            end;

        _Other ->  % Ignore other messages
            server_loop(Port, Q)
    end.
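 The shell shows up in this code: the user process keeps the Pid of the current shell in its process dictionary, and this is how our io_server, user, knows which terminal the output finally goes to. Let's run an experiment: inspect the metadata of user, then kill the shell, and see what user's process dictionary looks like afterwards: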
Eshell V5.9  (abort with ^G)
1> whereis(user).
<0.23.0>
2> erlang:process_info(whereis(user)).
[{registered_name,user},
{current_function,{user,server_loop,2}},
{initial_call,{erlang,apply,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.21.0>,<0.24.0>,#Port<0.319>,<0.5.0>]},
{dictionary,[{unicode,false},
{read_mode,list},
{shell,<0.24.0>}]},
{trap_exit,true},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>},
{total_heap_size,987},
{heap_size,610},
{stack_size,9},
{reductions,666},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,5}]},
{suspending,[]}]
3> exit(pid(0,24,0),kill).
*** ERROR: Shell process terminated! ***
Eshell V5.9 (abort with ^G)
1> whereis(user).
<0.23.0>
2> erlang:process_info(whereis(user)).
[{registered_name,user},
{current_function,{user,server_loop,2}},
{initial_call,{erlang,apply,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.5.0>,<0.21.0>,<0.34.0>,#Port<0.319>]},
{dictionary,[{unicode,false},
{read_mode,list},
{shell,<0.34.0>}]},
{trap_exit,true},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>},
{total_heap_size,1364},
{heap_size,987},
{stack_size,9},
{reductions,1659},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,8}]},
{suspending,[]}]
3>

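  By now the whole flow of io:format is fairly clear: an I/O request in the specified format is sent to the group leader, and the group leader is responsible for handling it. Normally the group leader is user; the user process keeps the pid of the shell attached to the output terminal, and updates its process dictionary after the shell is restarted. We can now do some fun things, such as sending a request straight to the user process without going through io:format, like this: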
Eshell V5.9  (abort with ^G)
1> U =whereis(user).
<0.23.0>
2> U!{io_request,self(),self(), {put_chars,unicode,io_lib,format, ["hello world :~p~n",[zen]]}}.
hello world :zen
{io_request,<0.30.0>,<0.30.0>,
{put_chars,unicode,io_lib,format,
["hello world :~p~n",[zen]]}}
3>
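
The shell experiment above sends the request and never looks at the answer, so any io_reply simply stays in the shell's mailbox. A slightly more complete client is sketched below: it monitors the server and waits for the reply, roughly in the spirit of execute_request in io.erl. The module name raw_io and the {io_server_down, Reason} error term are assumptions for this illustration, and the sketch uses the monitor reference as ReplyAs (any term is allowed there).

```erlang
-module(raw_io).
-export([format/3]).

%% Send a formatting request straight to an I/O server and wait
%% for its reply. (Hypothetical helper, not part of OTP.)
format(IoServer, Format, Args) when is_pid(IoServer) ->
    Mref = erlang:monitor(process, IoServer),
    IoServer ! {io_request, self(), Mref,
                {put_chars, unicode, io_lib, format, [Format, Args]}},
    receive
        {io_reply, Mref, Reply} ->
            %% Drop the monitor and any 'DOWN' that raced with the reply.
            erlang:demonitor(Mref, [flush]),
            Reply;
        {'DOWN', Mref, process, IoServer, Reason} ->
            {error, {io_server_down, Reason}}
    end.
```

For example, raw_io:format(group_leader(), "hello world :~p~n", [zen]) should print the line and return ok, while passing the pid of a dead process returns an error tuple instead of blocking.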
 Of course this also lets us solve some practical problems, such as this question on erlangqa:


litaocheng has already given a solution, and by now it should no longer look unfamiliar:
Change the group_leader to redirect the I/O.
For example, take this code:
cat test.erl
-module(test).
-compile([export_all]).

r() ->
    io:format("group leader:~p~n", [erlang:group_leader()]),
    io:format("node:~p~n", [node()]),
    erlang:group_leader(whereis(user), self()),
    io:format("hello world~n").
Then start two nodes:
erl -sname t1
erl -sname t2
On t1, run:
net_kernel:connect_node('t2@litao').
rpc:call('t2@litao', test, r, []).
You will see hello world printed on t2.
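
The solution above changes the group leader and leaves it changed. A variation, sketched here as an assumption rather than as part of litaocheng's answer, wraps the switch so that the previous group leader is restored afterwards. The module and function names are hypothetical.

```erlang
-module(io_redirect).
-export([with_user_output/1]).

%% Run Fun with the local user process as group leader, then restore
%% the previous group leader. (Hypothetical helper, not part of OTP.)
with_user_output(Fun) when is_function(Fun, 0) ->
    Old = group_leader(),
    case whereis(user) of
        undefined ->
            %% No local user process registered: run without redirecting.
            Fun();
        User ->
            group_leader(User, self()),
            try
                Fun()
            after
                group_leader(Old, self())
            end
    end.
```

With this helper, the remote function could call io_redirect:with_user_output(fun() -> io:format("hello world~n") end) so that only the wrapped output is sent to the local user process.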

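 You may have noticed that our code trace stopped at the point where the user process receives the message. That is because following it any further would sink completely into the details of the io_protocol. Let's take a brief look: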
The Erlang I/O-protocol
The I/O-protocol in Erlang specifies a way for a client to communicate with an io_server and vice versa. The io_server is a process handling the requests and that performs the requested task on i.e. a device. The client is any Erlang process wishing to read or write data from/to the device.
The common I/O-protocol has been present in OTP since the beginning, but has been fairly undocumented and has also somewhat evolved over the years. In an addendum to Robert Virdings rationale the original I/O-protocol is described. This document describes the current I/O-protocol.
The original I/O-protocol was simple and flexible. Demands for spacial and execution time efficiency has triggered extensions to the protocol over the years, making the protocol larger and somewhat less easy to implement than the original. It can certainly be argumented that the current protocol is too complex, but this text describes how it looks today, not how it should have looked.
The basic ideas from the original protocol still hold. The io_server and client communicate with one single, rather simplistic protocol and no server state is ever present in the client. Any io_server can be used together with any client code and client code need not be aware of the actual device the io_server communicates with.
1.1 Protocol basics
As described in Robert's paper, servers and clients communicate using io_request/io_reply tuples as follows:
{io_request, From, ReplyAs, Request}
{io_reply, ReplyAs, Reply}
The client sends an io_request to the io_server and the server eventually sends a corresponding reply.

* From is the pid() of the client, the process which the io_server sends the reply to.
* ReplyAs can be any datum and is simply returned in the corresponding io_reply. The io-module in the Erlang standard library simply uses the pid() of the io_server as the ReplyAs datum, but a more complicated client could have several outstanding io-requests to the same server and would then use i.e. a reference() or something else to differentiate among the incoming io_reply's. The ReplyAs element should be considered opaque by the io_server. Note that the pid() of the server is not explicitly present in the io_reply. The reply can be sent from any process, not necessarily the actual io_server. The ReplyAs element is the only thing that connects one io_request with an io_reply.
* Request and Reply are described below.

When an io_server receives an io_request, it acts upon the actual Request part and eventually sends an io_reply with the corresponding Reply part.
For the full protocol, see: http://erlang.org/doc/apps/stdlib/io_protocol.html
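
As a rough illustration of the server side of the io_request/io_reply exchange, the sketch below implements a tiny I/O server that collects output instead of printing it, and installs it as the group leader of a worker process. It only understands put_chars requests (both the plain and the Module/Function/Args forms) and answers everything else with {error, request}. The module name gl_capture and the done/captured messages are inventions of this example.

```erlang
-module(gl_capture).
-export([capture/1]).

%% Run Fun in a fresh process whose group leader is a small collector,
%% and return everything Fun wrote through io:format as a string.
%% (Hypothetical helper, not part of OTP.)
capture(Fun) when is_function(Fun, 0) ->
    Parent = self(),
    Collector = spawn(fun() -> collect(Parent, []) end),
    spawn(fun() ->
        group_leader(Collector, self()),
        %% io requests are synchronous, so all output has reached the
        %% collector by the time Fun returns.
        try Fun() after Collector ! done end
    end),
    receive
        {captured, Collector, Text} -> Text
    end.

%% Minimal I/O server loop: accumulate characters, acknowledge requests.
collect(Parent, Acc) ->
    receive
        {io_request, From, ReplyAs, {put_chars, _Enc, Chars}} ->
            From ! {io_reply, ReplyAs, ok},
            collect(Parent, [Acc, Chars]);
        {io_request, From, ReplyAs, {put_chars, _Enc, M, F, A}} ->
            From ! {io_reply, ReplyAs, ok},
            collect(Parent, [Acc, apply(M, F, A)]);
        {io_request, From, ReplyAs, _Unsupported} ->
            From ! {io_reply, ReplyAs, {error, request}},
            collect(Parent, Acc);
        done ->
            Parent ! {captured, self(), unicode:characters_to_list(Acc)}
    end.
```

For instance, gl_capture:capture(fun() -> io:format("hello ~p~n", [zen]) end) should return "hello zen\n" without anything appearing on the terminal. A real I/O server would also have to handle input requests, options and multi-requests as described in the protocol document.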

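The topic of I/O output is far from finished, and many questions remain worth thinking about. For example: how is I/O output controlled during an rpc:call? What is the group leader once an OTP application has started? What does the output flow look like when attaching to a node via JCL? So that is all for today; to be continued.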

io module online documentation






