multiprocessing.pool.apply_async can run tasks in parallel, but it queues every submitted task up front. For a modest number of tasks this is fine, but with very many, say 1 million or 10 million, and tasks that do not finish quickly, the backlog consumes a great deal of memory and can even exhaust it. How, then, can we cap the memory footprint? Searching online turned up one solution: monitor the length of pool._cache, and once it exceeds a threshold, make the most recently submitted task wait; when that task finishes, submit the next batch, thereby keeping memory usage down.
from multiprocessing import Pool
import time

def downloadGif(arg):
    print(arg[0])
    time.sleep(1)

def downloading_over(arg):
    pass

def foo(num):
    for i in range(num, 1000001):
        pic_info = []
        pic_info.append(str(i) + 'gif')
        txt_info = []
        txt_info.append(str(i) + 'txt')
        yield pic_info, txt_info

if __name__ == '__main__':
    pool = Pool(processes=5)  # set the max number of worker processes
    count = 1
    for download in foo(2):
        pool.apply_async(func=downloadGif, args=(download[0],), callback=downloading_over)
        last = pool.apply_async(func=downloadGif, args=(download[1],), callback=downloading_over)
        count = count + 1
        print(count)
        if len(pool._cache) > 1e3:
            print("waiting for cache to clear...")
            last.wait()
    # threshold 1e3: ~500 entries, ~10 MB of memory
    # threshold 1e4: ~5000 entries, ~20 MB
    # threshold 1e5: ~50000 entries, ~200 MB
    # threshold 1e6: ~500000 entries, ~2000 MB
    pool.close()
    pool.join()
The core code:
if len(pool._cache) > 1e3:
    print("waiting for cache to clear...")
    last.wait()
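Note that pool._cache is a private attribute and could change between Python versions. An alternative that achieves the same throttling without touching pool internals is to guard submissions with a threading.BoundedSemaphore that the task callback releases. This is a minimal sketch of that idea, not the original author's code; the MAX_PENDING value and the work function are illustrative.

```python
from multiprocessing import Pool
from threading import BoundedSemaphore

MAX_PENDING = 100  # hypothetical cap on tasks queued at once

def work(i):
    return i * i

if __name__ == '__main__':
    sem = BoundedSemaphore(MAX_PENDING)
    results = []

    def on_done(result):
        sem.release()  # free one submission slot when a task finishes

    with Pool(processes=5) as pool:
        for i in range(1000):
            sem.acquire()  # blocks while MAX_PENDING tasks are still pending
            # in production, also pass error_callback so failures release the slot
            results.append(pool.apply_async(work, (i,), callback=on_done))
        pool.close()
        pool.join()

    print(sum(r.get() for r in results))
```

Because at most MAX_PENDING results are ever pending, memory stays bounded regardless of how many tasks are generated.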
last is an AsyncResult instance, the value returned by pool.apply_async():
https://docs.python.org/3/library/multiprocessing.html
class multiprocessing.pool.AsyncResult

The class of the result returned by Pool.apply_async() and Pool.map_async().

- get([timeout]) - Return the result when it arrives. If timeout is not None and the result does not arrive within timeout seconds then multiprocessing.TimeoutError is raised. If the remote call raised an exception then that exception will be reraised by get().
- wait([timeout]) - Wait until the result is available or until timeout seconds pass.
- ready() - Return whether the call has completed.
- successful() - Return whether the call completed without raising an exception. Will raise ValueError if the result is not ready.
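The methods above can be exercised with a small standalone example; the square and fail functions here are illustrative, not from the original post:

```python
from multiprocessing import Pool

def square(x):
    return x * x

def fail():
    raise RuntimeError("boom")

if __name__ == '__main__':
    with Pool(processes=2) as pool:
        res = pool.apply_async(square, (7,))
        res.wait()                 # block until the result is available
        print(res.ready())         # True: the call has completed
        print(res.successful())    # True: it finished without an exception
        print(res.get(timeout=1))  # 49

        bad = pool.apply_async(fail)
        bad.wait()
        print(bad.successful())    # False: the call raised
        try:
            bad.get()              # get() re-raises the worker's exception
        except RuntimeError as e:
            print(e)               # boom
```

Calling successful() before the result is ready would raise ValueError, which is why wait() precedes it here.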
This article is based on the answer at the link below: