Process Pools and Thread Pools
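When we first learn multiprocessing or multithreading, we are eager to build concurrent socket servers on top of them. This approach has a fatal flaw, however: the number of processes or threads the server starts grows with the number of concurrent clients.
That puts enormous pressure on the server host and can overload it until it goes down. We therefore have to cap the number of processes or threads the server starts, so that the machine stays within a load it can bear. That is what a process pool or thread pool is for.
A process pool, for example, is a pool that holds processes: it is still multiprocessing underneath, only with a limit on how many processes are started.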
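To make the cap concrete, here is a minimal sketch using a thread pool from concurrent.futures (the module introduced in the next section). The helper name measure_peak_concurrency and the 0.05-second sleep that stands in for handling one client are assumptions made for illustration only.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def measure_peak_concurrency(max_workers, n_tasks):
    """Run n_tasks short jobs in a pool and return the peak number running at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def job():
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)  # stand-in for serving one client
        with lock:
            running -= 1

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for _ in range(n_tasks):
            pool.submit(job)
    # Leaving the with-block waits for every submitted job to finish.
    return peak


if __name__ == "__main__":
    # 20 "clients" arrive, but the pool never runs more than 3 jobs at once.
    print(measure_peak_concurrency(max_workers=3, n_tasks=20))
```

However many tasks are submitted, the extra ones wait in the pool's internal queue instead of each getting a new thread.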
Python--concurrent.futures
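1. The concurrent.futures module is used to create parallel tasks; it provides a high-level interface for executing calls asynchronously.
2. The concurrent.futures module is very convenient to use, and its interface is kept simple.
3. The concurrent.futures module can provide both process pools and thread pools.
4. Importing the process pool and the thread pool from the module: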
from concurrent.futures import ProcessPoolExecutor,ThreadPoolExecutor
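p = ProcessPoolExecutor(max_workers)  # process pool: if max_workers is omitted, it defaults to the number of CPUs
p = ThreadPoolExecutor(max_workers)  # thread pool: if max_workers is omitted, the default is the number of CPUs * 5 on Python 3.5 to 3.7, and min(32, number of CPUs + 4) from Python 3.8 on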
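Because the default pool size depends on the machine and the Python version, passing max_workers explicitly keeps behaviour predictable. The following is a small sketch of that; the helper name run_with_pool_size is made up for this example. It also shows that a non-positive size is rejected with ValueError.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_pool_size(size):
    """Square 0..4 on a thread pool with an explicit worker count."""
    with ThreadPoolExecutor(max_workers=size) as pool:
        return list(pool.map(lambda x: x * x, range(5)))


if __name__ == "__main__":
    print(run_with_pool_size(2))
    try:
        run_with_pool_size(0)  # max_workers must be greater than 0
    except ValueError as exc:
        print("rejected:", exc)
```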
Basic methods
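1. submit(fn, *args, **kwargs): submit a task asynchronously; returns a Future.
2. map(func, *iterables, timeout=None, chunksize=1): replaces a for loop of submit calls.
3. shutdown(wait=True): equivalent to pool.close() + pool.join() on a multiprocessing process pool.
   wait=True: wait until every task in the pool has finished and its resources have been reclaimed before continuing.
   wait=False: return immediately, without waiting for the tasks in the pool to finish.
   Whatever the value of wait, the program as a whole does not exit until all tasks have finished.
   submit and map must be called before shutdown.
4. result(timeout=None): get the result (a method of the Future returned by submit).
5. add_done_callback(fn): attach a callback (also a method of the Future).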
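The five methods above can be exercised in one short sketch. The function names double and demo are placeholders chosen for this example; a thread pool is used, but the same calls exist on ProcessPoolExecutor.

```python
from concurrent.futures import ThreadPoolExecutor


def double(x):
    return x * 2


def demo():
    seen = []  # filled in by the callback
    pool = ThreadPoolExecutor(max_workers=2)

    future = pool.submit(double, 21)  # returns a Future immediately
    future.add_done_callback(lambda f: seen.append(f.result()))
    value = future.result(timeout=5)  # block until the result is ready

    mapped = list(pool.map(double, [1, 2, 3]))  # results come back in input order

    pool.shutdown(wait=True)  # wait for the workers; no new tasks after this
    try:
        pool.submit(double, 1)
        rejected = False
    except RuntimeError:  # submitting after shutdown is an error
        rejected = True
    return value, mapped, seen, rejected


if __name__ == "__main__":
    print(demo())
```

The callback list is read only after shutdown(wait=True), because result() can return slightly before the callback has run in the worker thread.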
Process pool
from concurrent.futures import ProcessPoolExecutor
import os, time, random

def task(n):
    print("%s is running " % os.getpid())
    time.sleep(random.randint(1, 3))
    return n * 2

if __name__ == '__main__':
    start = time.time()
    executor = ProcessPoolExecutor(4)
    res = []
    for i in range(10):  # start 10 tasks
        future = executor.submit(task, i)  # submit the task asynchronously
        res.append(future)
    executor.shutdown()  # wait for all processes to finish
    print("++++>")
    for r in res:
        print(r.result())  # print the result
    end = time.time()
    print(end - start)
--------------------- Output
2464 is running
9356 is running
10780 is running
9180 is running
2464 is running
10780 is running
9180 is running
9356 is running
10780 is running
9180 is running
++++>
0
2
4
6
8
10
12
14
16
18
6.643380165100098
Thread pool
from concurrent.futures import ThreadPoolExecutor
from threading import current_thread
import time, random

def task(n):
    print("%s is running " % current_thread().name)
    time.sleep(random.randint(1, 3))
    return n * 2

if __name__ == '__main__':
    start = time.time()
    executor = ThreadPoolExecutor(4)  # thread pool
    res = []
    for i in range(10):  # start 10 tasks
        future = executor.submit(task, i)  # submit the task asynchronously
        res.append(future)
    executor.shutdown()  # wait for all threads to finish
    print("++++>")
    for r in res:
        print(r.result())  # print the result
    end = time.time()
    print(end - start)
------------ Output
<concurrent.futures.thread.ThreadPoolExecutor object at 0x00000000025B0DA0>_0 is running
<concurrent.futures.thread.ThreadPoolExecutor object at 0x00000000025B0DA0>_1 is running
<concurrent.futures.thread.ThreadPoolExecutor object at 0x00000000025B0DA0>_2 is running
<concurrent.futures.thread.ThreadPoolExecutor object at 0x00000000025B0DA0>_3 is running
<concurrent.futures.thread.ThreadPoolExecutor object at 0x00000000025B0DA0>_3 is running
<concurrent.futures.thread.ThreadPoolExecutor object at 0x00000000025B0DA0>_1 is running
<concurrent.futures.thread.ThreadPoolExecutor object at 0x00000000025B0DA0>_0 is running
<concurrent.futures.thread.ThreadPoolExecutor object at 0x00000000025B0DA0>_2 is running
<concurrent.futures.thread.ThreadPoolExecutor object at 0x00000000025B0DA0>_3 is running
<concurrent.futures.thread.ThreadPoolExecutor object at 0x00000000025B0DA0>_1 is running
++++>
0
2
4
6
8
10
12
14
16
18
5.002286195755005
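In both listings the results print in submission order (0, 2, 4, ...) because the futures are read back in the order they were stored, not the order in which they finished. As a hedged sketch of the difference, the example below also reads the same futures with concurrent.futures.as_completed, which the listings above do not use. The names slow_double and collect and the delay values are assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_double(n, delay):
    time.sleep(delay)  # simulated work of varying length
    return n * 2


def collect(delays):
    """Return (results in submission order, results in completion order)."""
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        futures = [pool.submit(slow_double, i, d) for i, d in enumerate(delays)]
        # as_completed yields each future as soon as it finishes.
        by_completion = [f.result() for f in as_completed(futures)]
        # Iterating the original list keeps submission order.
        by_submission = [f.result() for f in futures]
    return by_submission, by_completion


if __name__ == "__main__":
    # Task 0 is the slowest, task 1 the fastest.
    print(collect([0.4, 0.0, 0.2]))
```

Reading in completion order is useful when each result should be handled as early as possible; reading in submission order is simpler when the position of each result matters.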
Callback functions
import requests
import time
from concurrent.futures import ThreadPoolExecutor

def get(url):
    print('GET {}'.format(url))
    response = requests.get(url)
    time.sleep(2)
    if response.status_code == 200:  # 200 means the download succeeded
        return {'url': url, 'content': response.text}
    # any other status code falls through and returns None

def parse(res):
    print('%s parse res is %s' % (res['url'], len(res['content'])))
    return '%s parse res is %s' % (res['url'], len(res['content']))

def save(res):
    print('save', res)

def task(res):
    # The callback receives a Future object, so call result() first to get the value.
    res = res.result()
    if res is None:  # the download failed, so there is nothing to parse
        return
    par_res = parse(res)
    save(par_res)

if __name__ == '__main__':
    urls = [
        'http://www.cnblogs.com/linhaifeng',
        'https://www.python.org',
        'https://www.openstack.org',
    ]
    pool = ThreadPoolExecutor(2)
    for i in urls:
        # Whichever task finishes first triggers its callback first.
        # Callbacks are a general programming idea: they work with process pools as well as thread pools.
        pool.submit(get, i).add_done_callback(task)
    pool.shutdown()  # equivalent to close + join on a process pool
------------- Output
GET http://www.cnblogs.com/linhaifeng
GET https://www.python.org
http://www.cnblogs.com/linhaifeng parse res is 17426
save http://www.cnblogs.com/linhaifeng parse res is 17426
GET https://www.openstack.org
https://www.python.org parse res is 48809
save https://www.python.org parse res is 48809
https://www.openstack.org parse res is 60632
save https://www.openstack.org parse res is 60632
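A callback also has to cope with a task that raised an exception, since calling result() on such a Future re-raises it inside the callback. The sketch below avoids the network entirely: fetch is a hypothetical stand-in for a download that fails for any URL containing "bad", and run and on_done are names invented for this example.

```python
import functools
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    # Hypothetical stand-in for a download: fails for URLs containing "bad".
    if "bad" in url:
        raise ValueError("cannot fetch " + url)
    return {"url": url, "content": "x" * len(url)}


def run(urls):
    report = {}

    def on_done(future, url):
        exc = future.exception()  # None if the task succeeded
        if exc is not None:
            report[url] = "error: {}".format(exc)
        else:
            report[url] = len(future.result()["content"])

    with ThreadPoolExecutor(max_workers=2) as pool:
        for url in urls:
            # partial binds the URL so the callback knows which task finished.
            pool.submit(fetch, url).add_done_callback(
                functools.partial(on_done, url=url))
    # The with-block has waited for all tasks, so every callback has run.
    return report


if __name__ == "__main__":
    print(run(["http://ok.example", "http://bad.example"]))
```

Checking exception() before result() keeps one failed task from hiding the outcome of the others.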
map
import requests
import time
from concurrent.futures import ThreadPoolExecutor

def get(url):
    print('GET {}'.format(url))
    response = requests.get(url)
    time.sleep(2)
    if response.status_code == 200:  # 200 means the download succeeded
        return {'url': url, 'content_len': len(response.text)}

if __name__ == '__main__':
    urls = [
        'http://www.cnblogs.com/linhaifeng',
        'https://www.python.org',
        'https://www.openstack.org',
    ]
    pool = ThreadPoolExecutor(2)
    res = pool.map(get, urls)  # map replaces for + submit
    pool.shutdown()  # equivalent to close + join on a process pool
    print('=' * 30)
    for r in res:  # map returns an iterator
        print(r)
------------- Output
GET http://www.cnblogs.com/linhaifeng
GET https://www.python.org
GET https://www.openstack.org
{'url': 'http://www.cnblogs.com/linhaifeng', 'content_len': 17426}
{'url': 'https://www.python.org', 'content_len': 48809}
{'url': 'https://www.openstack.org', 'content_len': 60632}
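Two details of map are worth a small sketch: several iterables are consumed in parallel (like the built-in zip), and an exception raised by a task surfaces only when the iterator reaches that item. The functions power, reciprocal, demo_map, and demo_map_error are illustrative names, not part of the example above.

```python
from concurrent.futures import ThreadPoolExecutor


def power(base, exp):
    return base ** exp


def reciprocal(x):
    return 1 / x


def demo_map():
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Calls power(2, 3), power(3, 2), power(4, 1); results keep input order.
        return list(pool.map(power, [2, 3, 4], [3, 2, 1]))


def demo_map_error():
    collected = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        results = pool.map(reciprocal, [1, 0, 2])
        try:
            for value in results:
                collected.append(value)
        except ZeroDivisionError:
            # Raised when iteration reaches the failed item; iteration stops here.
            collected.append("failed")
    return collected


if __name__ == "__main__":
    print(demo_map())
    print(demo_map_error())
```

Because iteration stops at the first failure, submit with per-future error handling is the better fit when every result must be inspected.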
A custom thread pool
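from threading import Thread, current_thread
import time
import queue
import traceback

class MyThread(Thread):
    def __init__(self, queue):
        super().__init__()
        self.queue = queue
        self.daemon = True  # worker threads exit together with the main thread
        self.start()

    def run(self):
        """
        1. Keep running forever,
        2. take tasks from the queue,
        3. and call each task's function (fetch a task, then run it).
        :return:
        """
        while True:
            func, args, kwargs = self.queue.get()  # take a task from the queue
            try:
                func(*args, **kwargs)
            except Exception:
                traceback.print_exc()  # report the error but keep the worker alive
            finally:
                # Mark the task as done (decrements the queue's unfinished-task count),
                # even when func raised, so that join() cannot hang.
                self.queue.task_done()

class MyPool(object):
    """
    Create the threads ahead of time, before any task arrives, and let them wait for work.
    """
    def __init__(self, num):  # number of threads
        self.num = num
        self.queue = queue.Queue()
        for i in range(self.num):
            MyThread(self.queue)

    def submit(self, func, args=(), kwargs=None):
        # kwargs defaults to None rather than {} to avoid a shared mutable default
        self.queue.put((func, args, kwargs or {}))

    def join(self):
        self.queue.join()  # wait until every task in the queue has been processed

def task(i):
    print(current_thread().name, i)
    time.sleep(2)

if __name__ == '__main__':
    start = time.time()
    pool = MyPool(3)  # create a thread pool
    for i in range(4):
        pool.submit(task, args=(i,))
    pool.join()
    print('Elapsed time: {} seconds'.format(time.time() - start))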
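The pool above runs tasks but discards their return values. One possible extension, sketched here under stated assumptions, hands back a concurrent.futures.Future from submit so that callers can read results or exceptions. The class name ResultPool, the _STOP sentinel, and the close method are inventions for this example. The Python documentation reserves creating Future objects directly for executor implementations and tests, which is the role this class plays.

```python
import queue
import threading
from concurrent.futures import Future


class ResultPool:
    """A hypothetical pool whose submit() returns a Future."""

    _STOP = object()  # sentinel telling a worker to exit

    def __init__(self, num):
        self._tasks = queue.Queue()
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(num)
        ]
        for t in self._threads:
            t.start()

    def _worker(self):
        while True:
            item = self._tasks.get()
            if item is self._STOP:
                break
            future, func, args, kwargs = item
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                future.set_exception(exc)  # deliver the error to the caller
            else:
                future.set_result(result)

    def submit(self, func, *args, **kwargs):
        future = Future()
        self._tasks.put((future, func, args, kwargs))
        return future

    def close(self):
        # One sentinel per worker, queued behind any pending tasks.
        for _ in self._threads:
            self._tasks.put(self._STOP)
        for t in self._threads:
            t.join()


if __name__ == "__main__":
    pool = ResultPool(3)
    futures = [pool.submit(pow, 2, i) for i in range(4)]
    print([f.result() for f in futures])
    pool.close()
```

In practice ThreadPoolExecutor already provides all of this; the sketch is only meant to show how little machinery sits between a task queue and a Future-based interface.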
