[Erlang 0035] Erlang SMP

Reposted article; originally published 2012-02-01 12:36. Category: Erlang


Work on Erlang SMP (Symmetric Multiprocessing) began around 1997-98. The project followed a "first make it work, then measure, then optimize" development strategy, and the first stable version was released with R11B in 2006.

Force a build with SMP support: ./configure --enable-smp-support
Build without the SMP emulator: ./configure --disable-smp-support
Enable or disable SMP when starting erl: -smp enable / -smp disable

How it works

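An Erlang VM without SMP support has a single scheduler, which runs in the main thread. The scheduler takes the Erlang processes and I/O jobs that need attention from the run queue. Because there is only one scheduler, no data needs to be locked (see the figure in [1]).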

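An Erlang VM with SMP support can run 1 to 1024 schedulers, each in its own operating-system thread. The operating system decides whether those threads run on different cores. Because there are several schedulers, shared data must be protected by locks, and a single Erlang process may be scheduled by different schedulers over its lifetime.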

Alright, so it was decided that lightweight processes with asynchronous message passing were the approach to take for Erlang. How to make this work? Well, first of all, the operating system can't be trusted to handle the processes. Operating systems have many different ways to handle processes, and their performance varies a lot. Most if not all of them are too slow or too heavy for what is needed by standard Erlang applications. By doing this in the VM, the Erlang implementers keep control of optimization and reliability. Nowadays, Erlang's processes take about 300 words of memory each and can be created in a matter of microseconds—not something doable on major operating systems these days.

To handle all these potential processes your programs could create, the VM starts one thread per core which acts as a scheduler. Each of these schedulers has a run queue, or a list of Erlang processes on which to spend a slice of time. When one of the schedulers has too many tasks in its run queue, some are migrated to another one. This is to say each Erlang VM takes care of doing all the load-balancing and the programmer doesn't need to worry about it. There are some other optimizations that are done, such as limiting the rate at which messages can be sent on overloaded processes in order to regulate and distribute the load.

link:http://learnyousomeerlang.com/the-hitchhikers-guide-to-concurrency
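The cheapness of process creation and the presence of scheduler run queues described above can be observed from inside the VM. The following is a minimal sketch, not a benchmark: the module name spawn_demo and the function spawn_many/1 are made up for illustration, and the numbers it reports will vary by machine and release.

```erlang
-module(spawn_demo).
-export([spawn_many/1]).

%% Spawn N idle processes, then report how long spawning took and what
%% the schedulers look like at that moment. Returns a property list.
spawn_many(N) when is_integer(N), N > 0 ->
    {Micros, Pids} =
        timer:tc(fun() ->
                     [spawn(fun() -> receive stop -> ok end end)
                      || _ <- lists:seq(1, N)]
                 end),
    %% Number of scheduler threads currently online.
    Online = erlang:system_info(schedulers_online),
    %% Total number of runnable processes waiting in the run queues.
    RunQueue = erlang:statistics(run_queue),
    %% Tell every spawned process to terminate.
    [Pid ! stop || Pid <- Pids],
    [{spawned, length(Pids)},
     {microseconds, Micros},
     {schedulers_online, Online},
     {run_queue, RunQueue}].
```

Calling spawn_demo:spawn_many(10000) from the shell returns the measurements as a property list. Keep N below the VM's process limit, which can be raised with the +P flag of erl.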

Every once in a while, processes are migrated between schedulers according to a quite intricate process. The aim of the heuristic is to balance load over multiple schedulers so all cores get utilized fully. But the algorithm also considers if there is enough work to warrant starting up new schedulers. If not, it is better to keep the scheduler turned off as this means the thread has nothing to do. And in turn this means the core can enter power save mode and get turned off. Yes, Erlang conserves power if possible. Schedulers can also work-steal if they are out of work. For the details of this, see [1].

IMPORTANT: In R15, schedulers are started and stopped in a "lagged" fashion. What this means is that Erlang/OTP recognizes that starting a scheduler or stopping one is rather expensive, so it only does this if really needed. Suppose there is no work for a scheduler. Rather than immediately putting it to sleep, the VM lets it spin for a little while in the hope that work arrives soon. If work arrives, it can be handled immediately with low latency. On the other hand, this means you cannot use tools like top(1) or the OS kernel to measure how efficiently your system is executing. You must use the internal calls in the Erlang system. Many people incorrectly assumed that R15 was worse than R14 for exactly this reason.

link:http://jlouisramblings.blogspot.dk/2013/01/how-erlang-does-scheduling.html How Erlang does scheduling
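One of the "internal calls" available for this purpose is erlang:statistics(scheduler_wall_time), which reports, per scheduler, how much time was spent doing work out of the total elapsed time. The sketch below samples it twice and computes a utilization ratio. It assumes a release that provides this statistic (R15B01 or later); the module name sched_util and the function sample/1 are illustrative only.

```erlang
-module(sched_util).
-export([sample/1]).

%% Measure per-scheduler utilization over IntervalMs milliseconds.
%% Returns [{SchedulerId, Utilization}], Utilization being a float
%% computed as active time divided by total time in the interval.
sample(IntervalMs) when is_integer(IntervalMs), IntervalMs > 0 ->
    %% The statistic is off by default and must be switched on first.
    erlang:system_flag(scheduler_wall_time, true),
    T0 = lists:sort(erlang:statistics(scheduler_wall_time)),
    timer:sleep(IntervalMs),
    T1 = lists:sort(erlang:statistics(scheduler_wall_time)),
    erlang:system_flag(scheduler_wall_time, false),
    [{Id, utilization(A1 - A0, W1 - W0)}
     || {{Id, A0, W0}, {Id, A1, W1}} <- lists:zip(T0, T1)].

%% Guard against a zero-length interval for a scheduler.
utilization(_Active, 0) -> 0.0;
utilization(Active, Total) -> Active / Total.
```

For example, sched_util:sample(1000) returns one tuple per scheduler for a one-second window. Unlike the CPU figure shown by top(1), this ratio is computed by the runtime itself.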

Performance

The figures below are fairly old (January 2008), but they are still useful as a reference:

Measurements from a real telecom product showed a 1.7 speed improvement between a single and a dual core system.

The SMP VM with only one scheduler is slightly slower (10%) than the non SMP VM.
This is because the SMP VM needs to use locks for all shared data structures. But as long as there are no lock conflicts, the overhead caused by locking is not that high (it is the lock conflicts that take time).
This explains why in some cases it can be more efficient to run several SMP VMs with one scheduler each instead of one SMP VM with several schedulers. Of course, running several VMs requires that the application can be split into many parallel tasks that have little or no communication with each other.

Update, 2014-02-28 10:56:46:
To gain performance by using the SMP emulator, your application must have more than one runnable Erlang process most of the time. Otherwise, the Erlang emulator can still only run one Erlang process at a time, but you must still pay the overhead for locking. Although we try to reduce the locking overhead as much as possible, it will never become exactly zero.
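The requirement of having more than one runnable process can be illustrated with a simple parallel map. This is a sketch under assumptions: pmap_demo, pmap/2 and compare/2 are hypothetical names, and whether the parallel version wins depends on how expensive the function is relative to the cost of spawning and message passing.

```erlang
-module(pmap_demo).
-export([pmap/2, compare/2]).

%% Apply F to every element of List, using one Erlang process per
%% element so that several schedulers have runnable work at once.
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Collect the results in the original order.
    [receive {Ref, Result} -> Result end || Ref <- Refs].

%% Time the sequential and the parallel version on the same input.
%% Returns {SequentialMicros, ParallelMicros}; the second match on R
%% also asserts that both versions computed the same result.
compare(F, List) ->
    {Seq, R} = timer:tc(fun() -> lists:map(F, List) end),
    {Par, R} = timer:tc(fun() -> pmap(F, List) end),
    {Seq, Par}.
```

With a CPU-heavy F and several schedulers online, the second number would be expected to drop. With a trivial F, the spawning and messaging overhead may well dominate.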

Usage

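Starting with OTP R12B, SMP is enabled automatically whenever the operating system reports more than one CPU (or core), and the number of schedulers is set to match the number of CPUs or cores. Let's start the shell and take a look: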

Erlang R15B (erts-5.9) [source] [64-bit] [smp:8:8] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9  (abort with ^G)
1>

Check the server's CPU information:

# grep "model name" /proc/cpuinfo | cut -f2 -d:
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz

As you can see, the 8-core machine started with smp:8:8 automatically. What do these two numbers mean?

+S Schedulers:SchedulerOnline

Sets the amount of scheduler threads to create and scheduler threads to set online when SMP support has been enabled. Valid range for both values are 1-1024. If the Erlang runtime system is able to determine the amount of logical processors configured and logical processors available, Schedulers will default to logical processors configured, and SchedulersOnline will default to logical processors available; otherwise, the default values will be 1. Schedulers may be omitted if :SchedulerOnline is not and vice versa. The amount of schedulers online can be changed at run time via erlang:system_flag(schedulers_online, SchedulersOnline).
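The run-time adjustment mentioned at the end of that description can be wrapped so that the previous setting is always restored. A minimal sketch follows; the module sched_online and the function with_schedulers_online/2 are names invented for this example, and it relies on erlang:system_flag/2 returning the old value of the flag.

```erlang
-module(sched_online).
-export([with_schedulers_online/2]).

%% Run Fun with at most N schedulers online, then restore the
%% previous setting even if Fun raises an exception.
with_schedulers_online(N, Fun) when is_integer(N), N >= 1 ->
    %% The value may not exceed the number of scheduler threads
    %% created at startup (the first number given to +S).
    Target = min(N, erlang:system_info(schedulers)),
    Old = erlang:system_flag(schedulers_online, Target),
    try
        Fun()
    after
        erlang:system_flag(schedulers_online, Old)
    end.
```

For instance, sched_online:with_schedulers_online(1, fun() -> my_benchmark() end) would run a (hypothetical) my_benchmark/0 on a single scheduler without restarting the VM.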

Note: the results are similar whether symmetric multiprocessing is enabled or not. To prove it, you can just test it out by starting the Erlang VM with $ erl -smp disable.

To see if your Erlang VM runs with or without SMP support in the first place, start a new VM without any options and look for the first line output. If you can spot the text [smp:2:2] [rq:2], it means you're running with SMP enabled, and that you have 2 run queues (rq, or schedulers) running on two cores. If you only see [rq:1], it means you're running with SMP disabled.

If you wanted to know, [smp:2:2] means there are two cores available, with two schedulers. [rq:2] means there are two run queues active. In earlier versions of Erlang, you could have multiple schedulers, but with only one shared run queue. Since R13B, there is one run queue per scheduler by default; this allows for better parallelism.
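The same information that appears in the startup banner can also be queried from a running node. The sketch below collects it with erlang:system_info/1; the module name smp_info and the function summary/0 are illustrative, and some of the values may be the atom unknown on platforms where the runtime cannot detect the processor topology.

```erlang
-module(smp_info).
-export([summary/0]).

%% Collect the SMP-related settings of the running VM.
summary() ->
    [%% true if the emulator was built with SMP support.
     {smp_support, erlang:system_info(smp_support)},
     %% Scheduler threads created at startup (first number in smp:X:Y).
     {schedulers, erlang:system_info(schedulers)},
     %% Scheduler threads currently online (second number in smp:X:Y).
     {schedulers_online, erlang:system_info(schedulers_online)},
     %% Logical processors configured, or 'unknown'.
     {logical_processors, erlang:system_info(logical_processors)},
     %% Logical processors available to this VM, or 'unknown'.
     {logical_processors_available,
      erlang:system_info(logical_processors_available)}].
```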

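Notes:

Setting more schedulers than there are CPUs or cores brings no benefit.
Some operating systems let you bind a process to specific CPUs with commands such as taskset. The Erlang VM only detects the number of CPUs available to it, because binding can happen at any moment; SchedulersOnline is the number of CPUs or cores actually available.
This parameter can be adjusted at run time: erlang:system_flag(schedulers_online, SchedulersOnline).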
http://erlang.org/doc/man/erl.html

To turn SMP off, use the following startup flag:

-smp [enable|auto|disable]
-smp enable and -smp starts the Erlang runtime system with SMP support enabled. This may fail if no runtime system with SMP support is available. -smp auto starts the Erlang runtime system with SMP support enabled if it is available and more than one logical processor are detected. -smp disable starts a runtime system without SMP support. By default -smp auto will be used unless a conflicting parameter has been passed, then -smp disable will be used. Currently only the -hybrid parameter conflicts with -smp auto.

Parallelism is not the answer to every problem. In some cases, going parallel will even slow down your application. This can happen whenever your program is 100% sequential, but still uses multiple processes.

One of the best examples of this is the ring benchmark. A ring benchmark is a test where many thousands of processes pass a piece of data from one to the next in a circular manner. Think of it as a game of telephone if you want. In this benchmark, only one process at a time does something useful, but the Erlang VM still spends time distributing the load across cores and giving every process its share of time.

This plays against many common hardware optimizations and makes the VM spend time doing useless stuff. This often makes purely sequential applications run much slower on many cores than on a single one. In this case, disabling symmetric multiprocessing ($ erl -smp disable) might be a good idea.
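A ring benchmark of the kind described above can be sketched in a few lines. This is one possible formulation, not the benchmark used by the quoted text: the module name ring and the function run/2 are assumptions, and the calling process acts as one node of the ring so that it can count laps.

```erlang
-module(ring).
-export([run/2]).

%% Build a ring of N relay processes (plus the caller) and pass a
%% token around it M times. Returns the elapsed time in microseconds.
run(N, M) when is_integer(N), N >= 1, is_integer(M), M >= 1 ->
    {Micros, ok} = timer:tc(fun() -> start(N, M) end),
    Micros.

start(N, M) ->
    Parent = self(),
    %% Each new process forwards to the one created before it; the
    %% first one created forwards back to the caller, closing the ring.
    First = lists:foldl(fun(_, Next) ->
                            spawn_link(fun() -> relay(Next) end)
                        end, Parent, lists:seq(1, N)),
    laps(First, M),
    %% Shut the ring down and wait until the stop message comes back.
    First ! stop,
    receive stop -> ok end.

%% Forward every message to the next node; terminate on 'stop'.
relay(Next) ->
    receive
        stop -> Next ! stop, ok;
        Token -> Next ! Token, relay(Next)
    end.

%% Send the token around the ring M times, one lap at a time.
laps(_First, 0) -> ok;
laps(First, M) ->
    First ! token,
    receive token -> laps(First, M - 1) end.
```

Running ring:run(10000, 100) with different numbers of schedulers online is one way to check whether the slowdown described above shows up on a given machine and release.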

Summary

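SMP implements multiple schedulers on top of operating-system threads. It takes advantage of multiple cores and CPUs while hiding the implementation details from the developer, so existing code can benefit without being modified or even recompiled. SMP also offers flexible startup options and entry points for adjusting it at run time.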

References:

[1] The figures in this article come from an Ericsson document: Inside the Erlang VM with focus on SMP

[2] Some facts about Erlang and SMP http://erlang.2086793.n4.nabble.com/Some-facts-about-Erlang-and-SMP-td2108770.html

[3] To bind to specific CPUs, use the taskset command http://linuxcommand.org/man_pages/taskset1.html (updated 2013-01-10 15:22:35)
