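When processing a large-scale dataset such as ImageNet, a single process is slow and cannot take full advantage of a multi-core CPU. It is therefore worth splitting the data across multiple processes, processing the pieces in parallel, and then merging the results. The following is demo code for multi-process processing. To use it in a real application, implement your own target_function and pass it the appropriate args.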
#coding=utf-8
from multiprocessing import Process
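
def target_function(index, sublist):
    # Placeholder worker: replace the body with the real per-sublist processing.
    print(index, sublist)

if __name__ == "__main__":
    TXT_FILE = "path/to/imagelist.txt"
    n_processes = 50  # number of processes

    # Read the image list, one entry per line.
    with open(TXT_FILE, 'r') as f:
        image_list = f.readlines()

    n_total = len(image_list)
    length = float(n_total) / float(n_processes)
    # n_processes + 1 boundaries are needed so that indices[i + 1] exists for the last sublist.
    indices = [int(round(i * length)) for i in range(n_processes + 1)]
    sublists = [image_list[indices[i]:indices[i + 1]] for i in range(n_processes)]

    # One process per sublist: start them all, then wait for all of them to finish.
    processes = [Process(target=target_function, args=(i, x)) for i, x in enumerate(sublists)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()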
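
The demo starts the workers but does not show the merge step mentioned at the start, because a plain Process does not hand a return value back to the parent. One possible way to collect results is sketched below. It swaps in multiprocessing.Pool and its map method in place of the hand-built Process list, so it is an alternative, not the demo's own approach. The names split_evenly, process_sublist and merge, and the made-up image list, are assumptions for illustration only.

```python
from multiprocessing import Pool


def split_evenly(items, n_chunks):
    """Split items into n_chunks contiguous slices of near-equal length."""
    length = len(items) / float(n_chunks)
    # n_chunks + 1 boundaries, so every slice has both a start and an end index.
    bounds = [int(round(i * length)) for i in range(n_chunks + 1)]
    return [items[bounds[i]:bounds[i + 1]] for i in range(n_chunks)]


def process_sublist(sublist):
    """Hypothetical stand-in for real work: length of each stripped entry."""
    return [len(line.strip()) for line in sublist]


def merge(partial_results):
    """Concatenate the per-worker lists into one list, keeping input order."""
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged


if __name__ == "__main__":
    # Made-up data standing in for the lines of an image list file.
    image_list = ["img_%04d.jpg\n" % i for i in range(10)]
    sublists = split_evenly(image_list, 4)

    pool = Pool(processes=4)
    try:
        # map returns one result per sublist, in the same order as the input.
        partial_results = pool.map(process_sublist, sublists)
    finally:
        pool.close()
        pool.join()

    print(merge(partial_results))
```

Because pool.map preserves the order of its inputs, the merged list lines up with the original image list. The worker function has to be defined at module level so that it can be pickled and sent to the child processes.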