The following problem came up while writing a multiprocessing program in Python:
```python
from multiprocessing import Process, Queue

def download_from_web(q):
    data = [11, 22, 33, 44]
    for temp in data:
        q.put(temp)

def analysis_data(q):
    """Process the data."""
    waiting_analysis_data = list()
    while True:
        data = q.get()
        waiting_analysis_data.append(data)
        if q.empty():
            break
    print(waiting_analysis_data)

def main():
    q = Queue()
    p1 = Process(target=download_from_web, args=(q,))
    p2 = Process(target=analysis_data, args=(q,))
    p1.start()
    p2.start()

if __name__ == "__main__":
    main()
```
Running it fails with the following error (the same traceback is printed once per child process):
```
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
```
In fact, the code a Python child process executes depends on how that process was started. Depending on the platform, there are roughly three start methods:
- spawn: a process started this way only executes the code associated with the `target` argument or the `run()` method. It is the only method available on Windows, where it is also the default. Of the three methods, it is the slowest way to start a process.
- fork: a process started this way is essentially a copy of the parent process (the child inherits all of the parent's resources), so from the point of creation it runs the same program code as the parent. This method is only available on UNIX platforms; processes created with `os.fork()` are started this way.
- forkserver: with this method the program starts a server process. Each time a new process is requested, the parent connects to the server process and asks it to create the new process, so the new process does not need to inherit resources from the parent. This method is only available on UNIX platforms.
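The supported and currently active start methods can be inspected at runtime, which is a quick way to confirm what your platform is actually doing. A small sketch:

```python
import multiprocessing

# Which start methods does this platform support?
print(multiprocessing.get_all_start_methods())  # e.g. ['fork', 'spawn', 'forkserver'] on Linux

# Which one is currently the default? ('spawn' on macOS since Python 3.8)
print(multiprocessing.get_start_method())
```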
The cause here: since Python 3.8, the default start method on macOS is spawn (in earlier versions it was fork), and in this environment the spawned child process fails to rebuild the named semaphore backing the `Queue`, hence the `FileNotFoundError` raised from `SemLock._rebuild`. Switching the start method to fork makes the program work:
```python
from multiprocessing import Process, Queue, set_start_method

def download_from_web(q):
    data = [11, 22, 33, 44]
    for temp in data:
        q.put(temp)

def analysis_data(q):
    """Process the data."""
    waiting_analysis_data = list()
    while True:
        data = q.get()
        waiting_analysis_data.append(data)
        if q.empty():
            break
    print(waiting_analysis_data)

def main():
    # Must be called before any Queue or Process objects are created.
    set_start_method('fork')
    q = Queue()
    p1 = Process(target=download_from_web, args=(q,))
    p2 = Process(target=analysis_data, args=(q,))
    p1.start()
    p2.start()

if __name__ == "__main__":
    main()
```
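One caveat: `set_start_method()` may be called at most once in a program; a second call raises `RuntimeError` unless `force=True` is passed. A quick demonstration:

```python
import multiprocessing

multiprocessing.set_start_method('spawn')
try:
    multiprocessing.set_start_method('fork')
except RuntimeError as exc:
    # The context has already been set.
    print('second call rejected:', exc)

# force=True overrides the previously chosen method.
multiprocessing.set_start_method('fork', force=True)
print(multiprocessing.get_start_method())  # 'fork'
```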
Alternatively, `get_context()` can be used to obtain a context object. A context object has the same API as the multiprocessing module itself and allows several start methods to be used within the same program:
```python
from multiprocessing import Process, Queue, get_context

def download_from_web(q):
    data = [11, 22, 33, 44]
    for temp in data:
        q.put(temp)

def analysis_data(q):
    """Process the data."""
    waiting_analysis_data = list()
    while True:
        data = q.get()
        waiting_analysis_data.append(data)
        if q.empty():
            break
    print(waiting_analysis_data)

def main():
    ctx = get_context('fork')
    # Create the queue from the same context as the processes that use it.
    q = ctx.Queue()
    p1 = ctx.Process(target=download_from_web, args=(q,))
    p2 = ctx.Process(target=analysis_data, args=(q,))
    p1.start()
    p2.start()

if __name__ == "__main__":
    main()
```
With either change the expected result is printed:

```
[11, 22, 33, 44]
```
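Contexts obtained from `get_context()` are independent objects, which is what makes mixing start methods in one program possible. A minimal check (variable names are illustrative):

```python
from multiprocessing import get_context

fork_ctx = get_context('fork')    # only available on UNIX
spawn_ctx = get_context('spawn')

# Each context exposes the same API (Process, Queue, Pool, ...) but
# starts its processes with its own method.
print(fork_ctx.get_start_method())   # 'fork'
print(spawn_ctx.get_start_method())  # 'spawn'
```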
-------------------------------------------------------------------------------------------------------------------------------------------
The following code also produces the correct result; note that it uses the third-party `multiprocess` package (a fork of the standard `multiprocessing` module, installed with `pip install multiprocess`), not the standard library:
```python
# Requires the third-party package: pip install multiprocess
import multiprocess

def download_from_web(q):
    data = [11, 22, 33, 44]
    for temp in data:
        q.put(temp)

def analysis_data(q):
    """Process the data."""
    waiting_analysis_data = list()
    while True:
        data = q.get()
        waiting_analysis_data.append(data)
        if q.empty():
            break
    print(waiting_analysis_data)

def main():
    q = multiprocess.Queue()
    p1 = multiprocess.Process(target=download_from_web, args=(q,))
    p2 = multiprocess.Process(target=analysis_data, args=(q,))
    p1.start()
    p2.start()

if __name__ == "__main__":
    main()
```
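One caveat that applies to every version above: stopping the consumer when `q.empty()` becomes True is racy, because the queue can be momentarily empty while the producer is still putting items. A more robust pattern ends the stream with an explicit sentinel; the `None` sentinel below is my addition, not part of the original code:

```python
from multiprocessing import Process, Queue

SENTINEL = None  # end-of-stream marker (illustrative choice)

def download_from_web(q):
    for temp in [11, 22, 33, 44]:
        q.put(temp)
    q.put(SENTINEL)  # tell the consumer no more data is coming

def analysis_data(q):
    """Process the data."""
    waiting_analysis_data = []
    while True:
        data = q.get()
        if data is SENTINEL:
            break
        waiting_analysis_data.append(data)
    print(waiting_analysis_data)

def main():
    q = Queue()
    p1 = Process(target=download_from_web, args=(q,))
    p2 = Process(target=analysis_data, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

if __name__ == "__main__":
    main()
```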