1. Introduction
A “greenlet” is a small independent pseudo-thread. Think of it as a small stack of frames; the outermost (bottom) frame is the initial function you called, and the innermost frame is the one in which the greenlet is currently paused. You work with greenlets by creating a number of such stacks and jumping execution between them. Jumps are never implicit: a greenlet must choose to jump to another greenlet, which will cause the former to suspend and the latter to resume where it was suspended. Jumping between greenlets is called “switching”.
When you create a greenlet, it gets an initially empty stack; when you first switch to it, it starts to run a specified function, which may call other functions, switch out of the greenlet, etc. When eventually the outermost function finishes its execution, the greenlet’s stack becomes empty again and the greenlet is “dead”. Greenlets can also die of an uncaught exception.
For example:
>>> from greenlet import greenlet
>>>
>>> def test1():
...     print(12)
...     gr2.switch()
...     print(34)
...
>>> def test2():
...     print(56)
...     gr1.switch()
...     print(78)
...
>>> gr1 = greenlet(test1)
>>> gr2 = greenlet(test2)
>>> gr1.switch()
12
56
34
The last line jumps to gr1, which runs its function test1. Because test1 takes no arguments, gr1.switch() is called without arguments. test1 prints 12, jumps to test2, which prints 56 and jumps back into test1, which prints 34; then test1 finishes and gr1 dies. At this point, the execution comes back to the original gr1.switch() call. Note that 78 is never printed.
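Supplement:
This part of the introduction to greenlet and eventlet is excerpted from the article 《Python幾種並發實現方案的性能比較》 (“A performance comparison of several concurrency implementations in Python”).
greenlet is not a true concurrency mechanism. Instead, it switches between the code blocks of different functions within a single thread, in a “you run for a while, then I run for a while” fashion, and every switch must state explicitly when to switch and where to switch to. The greenlet interface is fairly simple and easy to use, but using greenlet calls for a somewhat different way of thinking than other concurrency approaches.
With the thread/process model, the high-level design usually starts from concurrency: tasks that can be processed in parallel, and are worth processing in parallel, are separated out and run in different threads or processes. One then considers which mutual-exclusion and conflict problems the separation may cause, and protects the contended resources with locks to keep concurrent processing correct.
greenlet, by contrast, asks you to develop from the standpoint of avoiding blocking: when blocking occurs, you explicitly switch to another piece of code that is not blocked, and once the original blocking condition has cleared, you manually switch back to the original code and continue. greenlet is therefore essentially well-arranged serial execution. In the experiments, the greenlet approach performed fairly well mainly because sensible switching of the execution flow completely avoided deadlock and blocking (running ring_greenlet.py with screen output shows that the script always handles messages one at a time: it passes one message around the ring from start to end before starting on the next). Because greenlet is essentially serial, no other part of the code can run unless an explicit switch is made. To keep long-running code from monopolizing the CPU and making the program appear hung, greenlet still has to be combined with threads or processes (any number of greenlets can be created under each thread or process, but greenlets cannot switch to or communicate with each other across threads or processes).
Roughly speaking, greenlet means “when I am blocked I will do something else first, but the programmer has to tell greenlet explicitly what it can do in the meantime and when to come back”. greenlet appears to have borrowed the context-switching mechanism of Stackless, but it does not adapt the underlying resources for concurrency. In fact it has no need to, because it is essentially serial and single-threaded; unless it is mixed with another concurrency model, it cannot cause concurrent access to resources.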
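The eventlet approach: greenlet wrapped in a framework
eventlet is a concurrency framework for network applications built on greenlet. It provides “thread” pools, queues, and other APIs that closely resemble Python's thread and process models, plus a very lightweight way to adapt the libraries bundled with the Python distribution, and other modules, for concurrency, which makes it much more convenient than using greenlet directly. The solution originated in the well-known virtual-world game Second Life, so it can be called a battle-tested newcomer among concurrency models. Its basic principle is to adjust Python's socket calls so that when one blocks, execution switches to another greenlet, which keeps resources effectively used. Note the following:
- The functions eventlet provides can only handle socket calls made in Python code; they cannot modify socket calls made in the C parts of a module. For such modules, the code that calls the module still has to be wrapped in a standard Python thread, and an adapter provided by eventlet is then used to make eventlet and standard threads cooperate.
- Although eventlet wraps its API in a form very similar to the standard threading library, the actual concurrent execution flow of the two differs markedly. When no I/O blocking occurs, the currently running eventlet never yields the CPU to another eventlet unless told to explicitly, whereas standard threads always compete for the CPU whether or not anything blocks. eventlet is therefore of almost no help for long, compute-heavy operations that are unrelated to I/O blocking.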
2. Parent greenlets
Let’s see where execution goes when a greenlet dies. Every greenlet has a “parent” greenlet. The parent greenlet is initially the one in which the greenlet was created (this can be changed at any time). The parent is where execution continues when a greenlet dies. This way, greenlets are organized in a tree. Top-level code that doesn’t run in a user-created greenlet runs in the implicit “main” greenlet, which is the root of the tree.
In the above example, both gr1 and gr2 have the main greenlet as a parent. Whenever one of them dies, the execution comes back to “main”.
Uncaught exceptions are propagated into the parent, too. For example, if the above test2() contained a typo, it would generate a NameError that would kill gr2, and the exception would go back directly into “main”. The traceback would show test2, but not test1. Remember, switches are not calls, but transfers of execution between parallel “stack containers”, and the “parent” defines which stack logically comes “below” the current one.
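The propagation rule can be checked with a small sketch. The function names child and run_child are hypothetical and exist only for this illustration; the sketch assumes the greenlet package is installed.

```python
from greenlet import greenlet, getcurrent

def child():
    # Deliberate typo-style error: this name is not defined anywhere.
    undefined_name

def run_child():
    g = greenlet(child)                       # parent defaults to the current greenlet
    parent_is_current = g.parent is getcurrent()
    try:
        g.switch()                            # child raises NameError and dies
    except NameError as exc:
        # The uncaught exception surfaced here, in the parent.
        return parent_is_current, type(exc).__name__, g.dead

result = run_child()
print(result)
```

The NameError is raised inside child but is caught around the g.switch() call, because that call is where the parent was suspended.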
3. Instantiation
greenlet.greenlet
is the greenlet type, which supports the following operations:
greenlet(run=None, parent=None)
Create a new greenlet object (without running it). run is the callable to invoke, and parent is the parent greenlet, which defaults to the current greenlet.
greenlet.getcurrent()
Returns the current greenlet (i.e. the one which called this function).
greenlet.GreenletExit
This special exception does not propagate to the parent greenlet; it can be used to kill a single greenlet.
The greenlet type can be subclassed, too. A greenlet runs by calling its run attribute, which is normally set when the greenlet is created; but for subclasses it also makes sense to define a run method instead of giving a run argument to the constructor.
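A minimal sketch of the subclassing style described above. The class name Doubler is hypothetical; the sketch assumes the greenlet package is installed.

```python
from greenlet import greenlet

class Doubler(greenlet):
    # run() is defined as a method instead of being passed to the constructor.
    def run(self, value):
        return value * 2

g = Doubler()
doubled = g.switch(21)   # starts run(21); its return value is sent to the parent
print(doubled, g.dead)
```

The arguments given to the first switch() become the arguments of run(), exactly as with a function passed to the constructor.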
4. Switching between greenlets
Switches between greenlets occur when the method switch() of a greenlet is called, in which case execution jumps to the greenlet whose switch() is called, or when a greenlet dies, in which case execution jumps to the parent greenlet. During a switch, an object or an exception is “sent” to the target greenlet; this can be used as a convenient way to pass information between greenlets. For example:
>>> def test1(x, y):
...     z = gr2.switch(x+y)
...     print(z)
...
>>> def test2(u):
...     print(u)
...     gr1.switch(42)
...
>>> gr1 = greenlet(test1)
>>> gr2 = greenlet(test2)
>>> gr1.switch("hello", " world")
hello world
42
This prints “hello world” and 42, with the same order of execution as the previous example. Note that the arguments of test1() and test2() are not provided when the greenlet is created, but only the first time someone switches to it.
Here are the precise rules for sending objects around:
g.switch(*args, **kwargs)
Switches execution to the greenlet g, sending it the given arguments. As a special case, if g has not started yet, it will start to run now.
5. Dying greenlets
If a greenlet’s run() finishes, its return value is the object sent to its parent. If run() terminates with an exception, the exception is propagated to its parent (unless it is a greenlet.GreenletExit exception, in which case the exception object is caught and returned to the parent).
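Both outcomes can be observed from the parent. This is a small sketch with hypothetical function names, assuming the greenlet package is installed.

```python
from greenlet import greenlet, GreenletExit

def finishes_normally():
    return "done"

def exits_early():
    raise GreenletExit("stopped")

# The return value of run() is what switch() returns in the parent.
value = greenlet(finishes_normally).switch()

# GreenletExit does not propagate; the exception object itself is returned.
exit_obj = greenlet(exits_early).switch()

print(value)
print(isinstance(exit_obj, GreenletExit))
```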
Apart from the cases described above, the target greenlet normally receives the object as the return value of the call to switch() in which it was previously suspended. Indeed, although a call to switch() does not return immediately, it will still return at some point in the future, when some other greenlet switches back. When this occurs, execution resumes just after the switch() where it was suspended, and the switch() itself appears to return the object that was just sent. This means that x = g.switch(y) will send the object y to g, and will later put the (unrelated) object that some (unrelated) greenlet passes back to us into x.
Note that any attempt to switch to a dead greenlet actually goes to the dead greenlet’s parent, or its parent’s parent, and so on. (The final parent is the “main” greenlet, which is never dead.)
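The dead-greenlet rule can be seen in a short sketch (the function name short_lived is hypothetical, and the greenlet package is assumed to be installed). Here the dead greenlet's parent is the main greenlet, which is also the caller, so the second switch() lands back in the caller with the object it sent.

```python
from greenlet import greenlet

def short_lived():
    return "first result"

g = greenlet(short_lived)
first = g.switch()           # runs to completion; g is now dead
second = g.switch("again")   # g is dead, so the switch goes to its parent (main)
print(first, second, g.dead)
```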
6. Methods and attributes of greenlets
g.switch(*args, **kwargs)
Switches execution to the greenlet g.
g.run
The callable that g will run when it starts. After g has started, this attribute no longer exists.
g.parent
The parent greenlet. This is writeable, but it is not allowed to create cycles of parents.
g.gr_frame
The current top frame, or None.
g.dead
True if g is dead (i.e. it finished its execution).
bool(g)
True if g is active, False if it is dead or not yet started.
g.throw([typ, [val, [tb]]])
Switches execution to the greenlet g, but immediately raises the given exception in g. If no argument is provided, the exception defaults to greenlet.GreenletExit. The normal exception propagation rules apply, as described above. Note that calling this method is almost equivalent to the following:
def raiser():
    raise typ, val, tb   # Python 2 three-argument raise, as in the original documentation

g_raiser = greenlet(raiser, parent=g)
g_raiser.switch()
except that this trick does not work for the greenlet.GreenletExit exception, which would not propagate from g_raiser to g.
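The following sketch exercises bool(g), g.dead, and g.throw() together. The function name worker and the list states are hypothetical; the greenlet package is assumed to be installed.

```python
from greenlet import greenlet, getcurrent

main = getcurrent()

def worker():
    try:
        main.switch("paused")             # suspend here until resumed
    except ValueError as exc:
        return "caught: %s" % exc

g = greenlet(worker)
states = [(bool(g), g.dead)]              # not started yet
g.switch()                                # runs until worker switches back to main
states.append((bool(g), g.dead))          # suspended but alive
outcome = g.throw(ValueError("boom"))     # raised inside worker at its switch()
states.append((bool(g), g.dead))          # finished
print(states, outcome)
```

Because worker catches the exception and returns, its return value is sent to the parent and appears as the result of g.throw().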
7. Greenlets and Python threads
Greenlets can be combined with Python threads; in this case, each thread contains an independent “main” greenlet with a tree of sub-greenlets. It is not possible to mix or switch between greenlets belonging to different threads.
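A rough way to observe the per-thread “main” greenlet is sketched below. The names in_worker_thread and observed are hypothetical, and the sketch assumes the greenlet package is installed and that a cross-thread switch is rejected with greenlet.error.

```python
import threading
from greenlet import getcurrent, error

main_thread_main = getcurrent()
observed = {}

def in_worker_thread():
    # Each thread has its own implicit "main" greenlet.
    observed["own_main"] = getcurrent() is not main_thread_main
    try:
        # Attempt to switch to a greenlet that belongs to another thread.
        main_thread_main.switch()
    except error:
        observed["cross_thread_switch_failed"] = True

worker = threading.Thread(target=in_worker_thread)
worker.start()
worker.join()
print(observed)
```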
8. Garbage-collecting live greenlets
If all the references to a greenlet object go away (including the references from the parent attribute of other greenlets), then there is no way to ever switch back to this greenlet. In this case, a GreenletExit exception is generated into the greenlet. This is the only case where a greenlet receives the execution asynchronously. This gives try/finally blocks a chance to clean up resources held by the greenlet. This feature also enables a programming style in which greenlets are infinite loops waiting for data and processing it. Such loops are automatically interrupted when the last reference to the greenlet goes away.
The greenlet is expected to either die or be resurrected by having a new reference to it stored somewhere; just catching and ignoring the GreenletExit is likely to lead to an infinite loop.
Greenlets do not participate in garbage collection; cycles involving data that is present in a greenlet’s frames will not be detected. Storing references to other greenlets cyclically may lead to leaks.
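The infinite-loop style mentioned above might look like the sketch below. The names consumer and log are hypothetical; the sketch assumes the greenlet package is installed and relies on CPython's reference counting to finalize the greenlet as soon as the last reference is deleted.

```python
from greenlet import greenlet, getcurrent

log = []
main = getcurrent()

def consumer():
    try:
        while True:
            item = main.switch()      # wait in the parent for the next item
            log.append(item)
    finally:
        # Runs when GreenletExit is raised into the suspended greenlet.
        log.append("cleaned up")

g = greenlet(consumer)
g.switch()        # start the loop; it pauses at the first main.switch()
g.switch("a")
g.switch("b")
del g             # last reference gone: GreenletExit interrupts the loop
print(log)
```

The loop does not catch GreenletExit, so the greenlet dies after the finally block runs, which is the expected outcome described above.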
9. Tracing support
Standard Python tracing and profiling doesn’t work as expected when used with greenlet since stack and frame switching happens on the same Python thread. It is difficult to detect greenlet switching reliably with conventional methods, so to improve support for debugging, tracing and profiling greenlet based code there are new functions in the greenlet module:
greenlet.gettrace()
Returns a previously set tracing function, or None.
greenlet.settrace(callback)
Sets a new tracing function and returns a previous tracing function, or None. The callback is called on various events and is expected to have the following signature:
def callback(event, args):
    if event == 'switch':
        origin, target = args
        # Handle a switch from origin to target.
        # Note that callback is running in the context of target
        # greenlet and any exceptions will be passed as if
        # target.throw() was used instead of a switch.
        return
    if event == 'throw':
        origin, target = args
        # Handle a throw from origin to target.
        # Note that callback is running in the context of target
        # greenlet and any exceptions will replace the original, as
        # if target.throw() was used with the replacing exception.
        return
For compatibility it is very important to unpack the args tuple only when event is either 'switch' or 'throw', and not when event is potentially something else. This way the API can be extended to new events, similar to sys.settrace().
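A minimal sketch of installing and then restoring a trace function. The names events and callback are hypothetical, and the greenlet package is assumed to be installed. The callback unpacks args only for the two known events, as advised above.

```python
import greenlet

events = []

def callback(event, args):
    # Unpack args only for events whose payload is known.
    if event in ("switch", "throw"):
        origin, target = args
        events.append(event)

previous = greenlet.settrace(callback)    # returns the previous trace function, or None
try:
    g = greenlet.greenlet(lambda: None)
    g.switch()                            # switches into g, which finishes at once
finally:
    greenlet.settrace(previous)           # restore whatever was installed before

print(events)
```

Restoring the previous function in a finally block keeps the tracer from leaking into unrelated code if the traced section raises.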