[Erlang 0028] Erlang atom

Reposted article; original published 2012-01-04 15:57.

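Comparison is the only operation the Erlang atom data type supports. Module names and function names are atoms. Atoms are used to build tagged messages, and comparing two atoms takes constant time regardless of their length (with a binary as the tag, comparison time is linear). Atoms are designed for comparison; do not use them for any other kind of operation. Erlang M-F-A (module, function, arguments) calls can be very flexible. Let's try it in the shell: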

Eshell V5.9 (abort with ^G)
1> lists:seq(1,5).
[1,2,3,4,5]
2> L=lists.
lists
3> S=seq.
seq
4> L:S(1,5).
[1,2,3,4,5]
5> L:seq(1,5).
[1,2,3,4,5]
6> L2=list_to_atom("list" ++"s").
lists
7> L2:seq(1,5).
[1,2,3,4,5]
8> apply(list_to_atom("li"++"sts"),seq,[1,5]).
[1,2,3,4,5]
9>

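Erlang atoms are not garbage collected: once created, an atom is never removed. When the number of atoms exceeds the limit (1048576 by default), the VM terminates. For a long-running system, converting arbitrary strings to atoms is dangerous, because memory is slowly eaten up. If the atoms you use fall within a known set, such as the names of protocol modules, you can guard against this with list_to_existing_atom, which only returns atoms that have already been created.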

%% list_to_existing_atom demo
 
Eshell V5.9 (abort with ^G)
1> list_to_existing_atom("player_1").
** exception error: bad argument % list_to_existing_atom("player_1") fails because the atom player_1 has not been created yet
     in function list_to_existing_atom/1 % the error raised is bad argument
        called as list_to_existing_atom("player_1")
2> list_to_atom("player_1"). % create player_1
player_1
3> list_to_existing_atom("player_1"). % calling list_to_existing_atom again now succeeds
player_1
4> list_to_existing_atom("player_2").
** exception error: bad argument
     in function list_to_existing_atom/1
        called as list_to_existing_atom("player_2")
5> player_2. % an atom can also be created this way
player_2
6> list_to_existing_atom("player_2"). % this call now succeeds too
player_2
7>
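
The guard described above can be wrapped in a small helper so that callers get a tagged result instead of a badarg exception. This is only a minimal sketch: the module name safe_atom and the function to_existing/1 are hypothetical names chosen for illustration, not part of OTP.

```erlang
-module(safe_atom).
-export([to_existing/1]).

%% Convert a string to an atom only if that atom already exists.
%% Returns {ok, Atom}, or {error, unknown_atom} instead of raising badarg.
%% (safe_atom and to_existing/1 are illustrative names, not an OTP API.)
-spec to_existing(string()) -> {ok, atom()} | {error, unknown_atom}.
to_existing(String) when is_list(String) ->
    try list_to_existing_atom(String) of
        Atom -> {ok, Atom}
    catch
        error:badarg -> {error, unknown_atom}
    end.
```

With this helper, input that names an unknown atom is rejected without adding anything to the atom table, and the caller decides how to handle the error.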

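We can run string:tokens( binary_to_list(erlang:system_info(info)),"\n") in the shell to look at atom usage. The fragment of output shown below happens to include the memory used by atoms, the current number of atoms, and the atom limit.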

%% list_to_atom demo limit

Eshell V5.9 (abort with ^G)
1> string:tokens( binary_to_list(erlang:system_info(info)),"\n").
["=memory","total: 4331920","processes: 438877",
"processes_used: 438862","system: 3893043","atom: 146321",
"atom_used: 119102","binary: 327936","code: 1929551",
"ets: 123308","=hash_table:atom_tab","size: 4813",
"used: 3508","objs: 6410","depth: 7",
"=index_table:atom_tab","size: 7168","limit: 1048576",
"entries: 6410","=hash_table:module_code","size: 97",
"used: 52","objs: 72","depth: 4","=index_table:module_code",
"size: 1024","limit: 65536","entries: 72",
[...]|...]
2> [list_to_atom("player_"++integer_to_list(Item)) || Item <- lists:seq(1,1000000) ].
[player_1,player_2,player_3,player_4,player_5,player_6,
player_7,player_8,player_9,player_10,player_11,player_12,
player_13,player_14,player_15,player_16,player_17,player_18,
player_19,player_20,player_21,player_22,player_23,player_24,
player_25,player_26,player_27,player_28,player_29|...]
3> string:tokens( binary_to_list(erlang:system_info(info)),"\n").
["=memory","total: 93955448","processes: 37830214",
"processes_used: 37830214","system: 56125234",
"atom: 20296629","atom_used: 20279627","binary: 360656",
"code: 1965264","ets: 124260","=hash_table:atom_tab",
"size: 823117","used: 627221","objs: 1006479","depth: 7",
"=index_table:atom_tab","size: 1006592","limit: 1048576",
"entries: 1006479","=hash_table:module_code","size: 97",
"used: 54","objs: 74","depth: 4","=index_table:module_code",
"size: 1024","limit: 65536","entries: 74",
[...]|...]
4> [list_to_atom("player_"++integer_to_list(Item)) || Item <- lists:seq(1,1000000) ].
[player_1,player_2,player_3,player_4,player_5,player_6,
player_7,player_8,player_9,player_10,player_11,player_12,
player_13,player_14,player_15,player_16,player_17,player_18,
player_19,player_20,player_21,player_22,player_23,player_24,
player_25,player_26,player_27,player_28,player_29|...]
5> string:tokens( binary_to_list(erlang:system_info(info)),"\n").
["=memory","total: 98839096","processes: 42712630",
"processes_used: 42712630","system: 56126466",
"atom: 20296629","atom_used: 20279627","binary: 361888",
"code: 1965264","ets: 124260","=hash_table:atom_tab",
"size: 823117","used: 627221","objs: 1006479","depth: 7",
"=index_table:atom_tab","size: 1006592","limit: 1048576", % note that nothing changes here on the second run
"entries: 1006479","=hash_table:module_code","size: 97",
"used: 54","objs: 74","depth: 4","=index_table:module_code",
"size: 1024","limit: 65536","entries: 74",
[...]|...]
6>
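
On newer releases the same figures can be read directly rather than parsed out of the info dump. The sketch below assumes OTP 20 or later, where erlang:system_info/1 accepts atom_count and atom_limit; the shell sessions above run V5.9, which predates these items. The module name atom_usage is hypothetical.

```erlang
-module(atom_usage).
-export([report/0]).

%% Return the current atom count, the limit, and the percentage in use.
%% Assumes OTP 20+, where atom_count and atom_limit are available.
-spec report() -> #{count := non_neg_integer(),
                    limit := pos_integer(),
                    percent_used := float()}.
report() ->
    Count = erlang:system_info(atom_count),
    Limit = erlang:system_info(atom_limit),
    #{count => Count,
      limit => Limit,
      percent_used => Count * 100 / Limit}.
```

Calling atom_usage:report() periodically, for example from a monitoring process, is one way to notice a growing atom table before the limit is reached.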

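Let's push against the atom limit. Running [list_to_atom("player_"++integer_to_list(Item)) || Item <- lists:seq(1,100000000) ]. is all it takes; before long the atom table fills up and the VM terminates.

This is why unrestrained atom creation ranks near the top of the list in the article How to Crash Erlang: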

Run out of atoms. Atoms in Erlang are analogous to symbols in Lisp--that is, symbolic, non-string identifiers that make code more readable, like green or unknown_value--with one exception. Atoms in Erlang are not garbage collected. Once an atom has been created, it lives as long as the Erlang node is running. An easy way to crash the Erlang virtual machine is to loop from 1 to some large number, calling integer_to_list and then list_to_atom on the current loop index. The atom table will fill up with unused entries, eventually bringing the runtime system to halt.

Why is this allowed? Because garbage collecting atoms would involve a pass over all data in all processes, something the garbage collector was specifically designed to avoid. And in practice, running out of atoms will only happen if you write code that's generating new atoms on the fly.

Note:

Atoms are really nice and a great way to send messages or represent constants. However there are pitfalls to using atoms for too many things: an atom is referred to in an "atom table" which consumes memory (4 bytes/atom in a 32-bit system, 8 bytes/atom in a 64-bit system). The atom table is not garbage collected, and so atoms will accumulate until the system tips over, either from memory usage or because 1048577 atoms were declared.

This means atoms should not be generated dynamically for whatever reason; if your system has to be reliable and user input lets someone crash it at will by telling it to create atoms, you're in serious trouble. Atoms should be seen as tools for the developer because honestly, it's what they are.

Even though bit strings are pretty light, you should avoid using them to tag values. It could be tempting to use string literals to say {<<"temperature">>,50}, but always use atoms when doing that. Atoms were said to be taking only 4 or 8 bytes in space, no matter how long they are. By using them, you'll have basically no overhead when copying data from function to function or sending it to another Erlang node on another server.
Conversely, do not use atoms to replace strings because they are lighter. Strings can be manipulated (splitting, regular expressions, etc.) while atoms can only be compared and nothing else.

Link: http://learnyousomeerlang.com/starting-out-for-real

Using list_to_atom/1 to construct an atom that is passed to apply/3 like this

apply(list_to_atom("some_prefix"++Var), foo, Args)

is quite expensive and is not recommended in time-critical code.
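
One possible alternative, not taken from the quoted text, is to resolve the module through explicit function clauses so that no atom is ever built from input. In this sketch the external names "set" and "dict", the module name safe_dispatch, and the choice of ordsets and orddict as targets are all assumptions made for illustration.

```erlang
-module(safe_dispatch).
-export([from_list/2]).

%% Whitelist of accepted external names (hypothetical). Each clause names
%% its module with a literal atom, so no atom is created from input.
module_for("set")  -> {ok, ordsets};
module_for("dict") -> {ok, orddict};
module_for(_Other) -> {error, unknown_kind}.

%% Build a container of the requested kind by calling Mod:from_list/1 on
%% the whitelisted module; unknown kinds are rejected.
-spec from_list(string(), list()) -> {ok, term()} | {error, unknown_kind}.
from_list(Kind, Items) ->
    case module_for(Kind) of
        {ok, Mod} -> {ok, Mod:from_list(Items)};
        {error, _} = Err -> Err
    end.
```

The clause list doubles as documentation of which modules may be reached from outside, and an unexpected name yields an error tuple rather than a new atom.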

Updated 2012-08-30

Note: If you remember earlier texts, atoms can be used in a limited (though high) number. You shouldn't ever create dynamic atoms. This means naming processes should be reserved to important services unique to an instance of the VM and processes that should be there for the whole time your application runs.

If you need named processes but they are transient or there isn't any of them which can be unique to the VM, it may mean they need to be represented as a group instead. Linking and restarting them together if they crash might be the sane option, rather than trying to use dynamic names.

Updated 2014-8-5 11:44:00

Intent
    Prevent the atom table from filling up.
Motivation
    The VM will crash if you use too many atoms (by default 1048576).
    Atoms are created in many ways:
        hand-written code (modules, functions, intentional atoms)
        generated code (e.g. ASN.1 compiler, yecc)
        reading files (config, file:consult)
        parsing (e.g. XML, JSON, ...)
Recommendation
    Don't use list_to_atom/1 and beware of libraries that do (e.g. xmerl). Use list_to_existing_atom/1 or tag tuples with strings/binaries.
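
As a rough illustration of tagging tuples with binaries when parsing, the sketch below splits a "key=value;key=value" binary into pairs whose keys stay binaries. The input format, the module name kv_parse, and the function parse/1 are assumptions made for this example only; the trim_all option of binary:split/3 requires OTP 18 or later.

```erlang
-module(kv_parse).
-export([parse/1]).

%% Parse <<"k1=v1;k2=v2">> into [{<<"k1">>, <<"v1">>}, {<<"k2">>, <<"v2">>}].
%% Keys stay binaries, so untrusted input cannot grow the atom table.
-spec parse(binary()) -> [{binary(), binary()}].
parse(Bin) when is_binary(Bin) ->
    Pairs = binary:split(Bin, <<";">>, [global, trim_all]),
    [split_pair(P) || P <- Pairs].

%% Split one pair at the first "="; a key without "=" gets an empty value.
split_pair(Pair) ->
    case binary:split(Pair, <<"=">>) of
        [Key, Value] -> {Key, Value};
        [Key] -> {Key, <<>>}
    end.
```

Code that later needs an atom for a known key can still convert it explicitly, for example with list_to_existing_atom/1, once the key has been checked against the expected set.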

If you see the message "no more index entries in atom_tab (max=1048576)." in erl_crash.dump, run crashdump_viewer:start/0 and check which piece of logic created so many atoms.
