What Is the Global Interpreter Lock (GIL)?
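The execution of Python code is controlled by the Python virtual machine (also called the interpreter main loop; this article refers to the CPython implementation). Python was designed so that only one thread executes in the interpreter's main loop at a time: at any given moment, at most one thread is running inside the interpreter. Access to the Python virtual machine is controlled by the global interpreter lock (GIL), and it is this lock that guarantees only one thread runs at a time.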
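In a multithreaded environment, the Python virtual machine executes as follows:
1. Set the GIL
2. Switch to a thread to run
3. Run:
a. for a specified number of bytecode instructions, or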
b. until the thread voluntarily yields control (for example, by calling time.sleep(0))
4. Put the thread to sleep
5. Unlock the GIL
6. Repeat all of the steps above
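The steps above can be sketched as a toy scheduler. This is only an illustrative model, not CPython's actual implementation: the function name run_toy_interpreter, the check_interval parameter, and the "YIELD" marker are all hypothetical, and everything runs in a single real thread (the Lock is there only to mark where the GIL would be set and unlocked).

```python
import threading
from collections import deque


def run_toy_interpreter(threads, check_interval=3):
    """Round-robin over (name, instructions) pairs and return the execution trace.

    Hypothetical model: each string stands for one bytecode instruction, and
    the string "YIELD" stands for a voluntary hand-off such as time.sleep(0).
    """
    gil = threading.Lock()
    ready = deque((name, iter(ops)) for name, ops in threads)
    trace = []
    while ready:                            # 6. repeat while any thread has work
        gil.acquire()                       # 1. set the GIL
        name, ops = ready.popleft()         # 2. switch to a thread
        finished = False
        for _ in range(check_interval):     # 3a. run at most check_interval instructions
            op = next(ops, None)
            if op is None:                  # no instructions left in this thread
                finished = True
                break
            trace.append((name, op))
            if op == "YIELD":               # 3b. the thread gives up its turn early
                break
        if not finished:
            ready.append((name, ops))       # 4. put the thread back to sleep
        gil.release()                       # 5. unlock the GIL
    return trace


trace = run_toy_interpreter([
    ("A", ["a1", "a2", "a3", "a4"]),
    ("B", ["b1", "YIELD", "b2"]),
])
print(trace)
```

In this run, thread A executes three instructions before it is switched out, thread B stops early at its YIELD, and then each thread finishes its remaining instruction on its next turn.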
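When the interpreter calls external code (such as a C/C++ extension function), the GIL stays locked until that function returns, unless the extension explicitly releases it. No Python bytecode runs during the call, so the interpreter performs no thread switch.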
Design Rationale and Limitations of the GIL
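The GIL simplifies the CPython implementation by making the object model, including critical built-in types such as dict, implicitly safe against concurrent access. Locking the entire interpreter makes multithreading support easier to implement, but it gives up much of the parallelism that multiprocessor machines offer.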
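As a small illustration of "implicitly safe against concurrent access", the sketch below has several threads append to one shared list without any explicit lock. The helper name append_many and the counts are hypothetical. Note that this safety covers individual operations on built-in objects only; compound operations such as `counter += 1` still need a lock of their own.

```python
import threading


def append_many(shared, count):
    """Append `count` items to a list shared by several threads, with no explicit lock."""
    for i in range(count):
        shared.append(i)


shared = []
threads = [
    threading.Thread(target=append_many, args=(shared, 10_000))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# No append is lost and the list is never corrupted.
print(len(shared))  # 40000
```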
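However, some extension modules, both standard and third-party, are designed to release the GIL while performing computationally intensive tasks such as compression or hashing.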
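A minimal sketch of what this looks like from Python code, using the standard hashlib module, whose documentation notes that the GIL is released while large buffers are being hashed. The helper name sha256_hex and the buffer sizes are assumptions chosen for illustration. How much wall-clock time the threads actually save depends on the machine, so the example only checks that the threaded results match the sequential ones.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256_hex(data):
    """Hash one buffer; for large inputs hashlib can release the GIL while it works."""
    return hashlib.sha256(data).hexdigest()


# Four 4 MiB buffers, each filled with a different byte value.
buffers = [bytes([i]) * (4 * 1024 * 1024) for i in range(4)]

# Hash the buffers in worker threads; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(sha256_hex, buffers))

# Hash the same buffers one after another for comparison.
sequential = [sha256_hex(b) for b in buffers]

print(threaded == sequential)  # True
```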
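In addition, the GIL is always released during I/O. For any I/O-oriented program (one that calls into built-in operating system C code), the GIL is released before the I/O call so that other threads can run while this thread waits for the I/O. A purely computational program with no I/O is handled differently: in Python 2 the interpreter releases the lock every 100 operations to give other threads a chance to run (this count can be tuned with sys.setcheckinterval), while Python 3.2 and later use a time-based switch interval instead, adjustable with sys.setswitchinterval. A thread that performs little I/O therefore holds the processor (and the GIL) for its entire time slice. In other words, I/O-bound Python programs benefit far more from a multithreaded environment than CPU-bound ones.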
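The effect on I/O-bound threads can be seen with a rough sketch in which time.sleep stands in for a blocking I/O call (like real blocking I/O, it releases the GIL while waiting). The helper name fake_io and the constants DELAY and N_THREADS are hypothetical.

```python
import threading
import time


def fake_io(delay):
    """Stand-in for a blocking I/O call; the GIL is released while this thread waits."""
    time.sleep(delay)


DELAY = 0.3      # seconds each simulated I/O call blocks
N_THREADS = 5

start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(DELAY,)) for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The waits overlap, so the total is close to one DELAY rather than N_THREADS * DELAY.
print(f"{N_THREADS} waits of {DELAY}s took {elapsed:.2f}s in total")
```

Replacing the sleep with a pure-Python computation would not overlap in the same way, because each thread would need the GIL for the whole of its work.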
The Python 2.7.9 documentation gives this brief description of the GIL:
The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. Locking the entire interpreter makes it easier for the interpreter to be multi-threaded, at the expense of much of the parallelism afforded by multi-processor machines.
However, some extension modules, either standard or third-party, are designed so as to release the GIL when doing computationally-intensive tasks such as compression or hashing. Also, the GIL is always released when doing I/O.
Past efforts to create a “free-threaded” interpreter (one which locks shared data at a much finer granularity) have not been successful because performance suffered in the common single-processor case. It is believed that overcoming this performance issue would make the implementation much more complicated and therefore costlier to maintain.
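As the passage above shows, past attempts to improve on the GIL, such as locking shared data at a much finer granularity, actually reduced performance in the single-processor case. It is generally believed that overcoming this performance problem would make the CPython implementation more complicated and therefore costlier to maintain.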