Asynchronous HTTP requests in Python with aiohttp


  

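Python has several HTTP libraries, including requests, aiohttp, and httpx.

requests sends synchronous requests only, aiohttp sends asynchronous requests only, and httpx can send both.

aiohttp is generally regarded as one of the fastest options for asynchronous requests, so let's walk through it: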

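Introduction

  aiohttp is built around asynchronous concurrency. It is based on asyncio and the async/await syntax, so a single thread can run many IO operations concurrently.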
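To make "single-threaded concurrent IO" concrete, here is a minimal sketch that uses only asyncio. The names fake_io and run_concurrently are made up for illustration, and asyncio.sleep stands in for a real network call. Three waits of 0.2 s each finish in roughly 0.2 s in total, because the event loop switches between coroutines while each one is waiting.

```python
import asyncio
import time


async def fake_io(name, delay):
    # asyncio.sleep stands in for a network call (an assumption for this sketch).
    # While this coroutine waits, the event loop runs the others on the same thread.
    await asyncio.sleep(delay)
    return name


async def run_concurrently():
    start = time.perf_counter()
    # gather schedules all three coroutines and returns results in argument order
    results = await asyncio.gather(
        fake_io("a", 0.2),
        fake_io("b", 0.2),
        fake_io("c", 0.2),
    )
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_concurrently())
    print(results, f"{elapsed:.2f}s")
```

aiohttp applies the same idea to HTTP: every await on a request or response hands control back to the loop so other requests can make progress.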

Installation

   pip install aiohttp

Usage

  Client usage

import asyncio
import aiohttp

async def my_request():
    async with aiohttp.ClientSession() as session:
        # ssl=False skips certificate verification to avoid SSL errors
        # (it replaces the deprecated verify_ssl=False)
        async with session.get('http://www.csdn.net/', ssl=False) as response:
            print('status:', response.status)
            print('content-type:', response.headers['content-type'])
            html = await response.text()
            print(f'body: {html[:15]}')

# Run the coroutine on an event loop
asyncio.run(my_request())

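Output: the response status, the content-type header, and the first 15 characters of the body.

 

 * asyncio.run() requires Python 3.7 or later. On older versions, create the loop explicitly with loop = asyncio.get_event_loop() and then call loop.run_until_complete(my_request()).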

 

  Server usage

from aiohttp import web

async def hello(request):
    # Read the {name} path segment, defaulting to 'jack'
    name = request.match_info.get('name', 'jack')
    text = 'hello ' + name
    return web.Response(text=text)

app = web.Application()
app.add_routes([
    web.get('/', hello),
    web.get('/{name}', hello),
])
web.run_app(app, host='127.0.0.1')

 

  A simple aiohttp client application

import asyncio
import aiohttp

async def get_html(session, url):
    # Send a GET request
    async with session.get(url, ssl=False) as response:
        print('status:', response.status)
        return await response.text()

async def main():
    # Create the client session and reuse it for both requests
    async with aiohttp.ClientSession() as session:
        html1 = await get_html(session, 'http://www.csdn.net/')
        html2 = await get_html(session, 'http://python.org')
        print(html1)
        print(html2)

asyncio.run(main())

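The same pattern also works for POST, DELETE, and PUT requests (session.post, session.delete, session.put), and each request accepts further parameters such as headers, params, and data.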
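As a rough sketch of those other methods and parameters, the example below sends POST, PUT, and DELETE requests to a throwaway local server so that it runs without network access. The echo handler, the /echo path, the X-Token header, and the form values are all invented for illustration. aiohttp.test_utils.TestServer is used here only as a convenient way to start the server on a free port.

```python
import asyncio

import aiohttp
from aiohttp import web
from aiohttp.test_utils import TestServer


async def echo(request):
    # Hypothetical handler: reports back what the client sent
    form = await request.post()
    return web.json_response({
        "method": request.method,
        "query": dict(request.query),
        "token": request.headers.get("X-Token"),
        "form": dict(form),
    })


async def demo():
    app = web.Application()
    app.add_routes([web.route("*", "/echo", echo)])
    async with TestServer(app) as server:
        url = server.make_url("/echo")
        results = []
        async with aiohttp.ClientSession() as session:
            # params -> query string, headers -> request headers, data -> form body
            async with session.post(url, params={"page": "1"},
                                    headers={"X-Token": "abc"},
                                    data={"name": "jack"}) as resp:
                results.append(await resp.json())
            async with session.put(url, data={"name": "rose"}) as resp:
                results.append(await resp.json())
            async with session.delete(url) as resp:
                results.append(await resp.json())
        return results


if __name__ == "__main__":
    for item in asyncio.run(demo()):
        print(item)
```

Against a real service, only the URL and the payloads would change; the session.post, session.put, and session.delete calls keep the same shape.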

  An asynchronous crawler with aiohttp

import asyncio
import time
import aiohttp

async def get_html(session, url):
    print('sending request:', url)
    async with session.get(url, ssl=False) as response:
        content = await response.content.read()
        print('got result', url, len(content))
        # Use the last path segment of the URL as the file name
        filename = url.rsplit('/', 1)[-1]
        print('downloading', filename)
        with open(filename, 'wb') as file_object:
            file_object.write(content)
            print(filename, 'downloaded')

async def main():
    async with aiohttp.ClientSession() as session:
        start_time = time.time()
        url_list = [
            'https://images.cnblogs.com/cnblogs_com/blueberry-mint/1877253/o_201106093544wallpaper1.jpg',
            'https://images.cnblogs.com/cnblogs_com/blueberry-mint/1877253/o_201106093557wallpaper2.jpg',
            'https://images.cnblogs.com/cnblogs_com/blueberry-mint/1877253/o_201106093613wallpaper3.jpg',
        ]

        # Schedule every download as a task so they run concurrently
        tasks = [asyncio.create_task(get_html(session, url)) for url in url_list]
        # gather waits for all tasks and re-raises the first failure, if any
        await asyncio.gather(*tasks)
        end_time = time.time()
        print('cost', round(end_time - start_time), 's')

asyncio.run(main())

 

 

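 An important ClientSession parameter: connector

  1. TCPConnector: for regular TCP sockets (supports both the HTTP and HTTPS schemes); used in the vast majority of cases.

  2. UnixConnector: for connecting through UNIX sockets (mainly used for testing).

All connectors should inherit from BaseConnector.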

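# Create a TCPConnector (ssl=False replaces the deprecated verify_ssl=False)
conn = aiohttp.TCPConnector(ssl=False)

# Pass it to ClientSession as the connector argument (inside a coroutine)
async with aiohttp.ClientSession(connector=conn) as session:
    ...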

 

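The most important TCPConnector parameters are:

  verify_ssl (bool): whether to verify SSL certificates for HTTPS requests (enabled by default). Set it to False to skip verification for sites with invalid certificates. It has been deprecated since aiohttp 3.0; use ssl=False instead.

  limit (int): the total number of simultaneous connections. If limit is 0, the connector has no limit (default: 100).

  limit_per_host (int): the maximum number of simultaneous connections to the same endpoint. Two endpoints are the same if their (host, port, is_ssl) triples are equal. If limit_per_host is 0, there is no per-host limit (default: 0).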
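As a small illustration of these two limits, the sketch below builds a connector with a total cap of 10 connections and at most 2 per endpoint, then reads the values back. The numbers and the function name make_limited_session are arbitrary choices for this example. The connector is created inside a coroutine so that it binds to the running event loop.

```python
import asyncio

import aiohttp


async def make_limited_session():
    # At most 10 connections overall and 2 per (host, port, is_ssl) endpoint
    connector = aiohttp.TCPConnector(limit=10, limit_per_host=2)
    async with aiohttp.ClientSession(connector=connector) as session:
        # session.connector is the connector passed in; limit and
        # limit_per_host are exposed as read-only properties
        return session.connector.limit, session.connector.limit_per_host


if __name__ == "__main__":
    print(asyncio.run(make_limited_session()))
```

Requests beyond either cap are not rejected; they wait inside the connector until a connection becomes free.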

Another way to limit concurrency (using Semaphore)

A Semaphore caps the number of requests in flight directly.

import asyncio
import logging
import time

import aiohttp
import backoff

# logging.basicConfig(level=logging.DEBUG)

# Log retry attempts to log.txt
my_logger = logging.getLogger(__name__)
my_handler = logging.FileHandler('log.txt')
my_handler.setLevel(logging.DEBUG)
formatter = logging.Formatter("%(asctime)s %(levelname)s %(pathname)s %(filename)s %(funcName)s %(lineno)s"
                              " -%(message)s", "%Y-%m-%d %H:%M:%S")
my_handler.setFormatter(formatter)
my_logger.addHandler(my_handler)
my_logger.setLevel(logging.DEBUG)


def now():
    return time.time()

# Retry up to 3 times with exponential backoff on aiohttp client errors
@backoff.on_exception(backoff.expo, aiohttp.ClientError, max_tries=3, logger=my_logger)
async def get_html(session, i, url):
    start = now()
    async with session.get(url, ssl=False) as response:
        # return await response.text()
        r = await response.read()
        end_time = now()
        cost = end_time - start
        msg = 'request {}, start time: {}, time spent: {}, response: {}\n'.format(i, start, cost, r.decode('utf-8'))
        print('running %d' % i, now(), msg)

# Use a semaphore to limit the maximum concurrency
async def bound_register(sem, session, i, url):
    async with sem:
        await get_html(session, i, url)

async def run(num, url):
    tasks = []
    sem = asyncio.Semaphore(100)
    # limit=0 removes the connector's own cap, so the semaphore is the only limit
    connector = aiohttp.TCPConnector(limit=0, ssl=False)
    async with aiohttp.ClientSession(connector=connector) as session:
        for i in range(num):
            task = asyncio.ensure_future(
                bound_register(sem=sem, session=session, i=i, url=url)
            )
            tasks.append(task)
        await asyncio.gather(*tasks)

start = now()
number = 200
# url2 = 'http://www.baidu.com'
url = 'http://127.0.0.1:8000/rest/href/single_href/?title=6'
asyncio.run(run(number, url))
print('total time: %0.3f' % (now() - start))
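To see the Semaphore cap in isolation, here is a minimal sketch that needs no server. The names worker and peak_concurrency are made up for illustration, and asyncio.sleep stands in for the HTTP request. It records how many workers are inside the semaphore at once and returns the highest value observed.

```python
import asyncio


async def worker(sem, state):
    async with sem:
        # Only `limit` workers can be inside this block at the same time
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
        await asyncio.sleep(0.01)  # stands in for the HTTP request
        state["running"] -= 1


async def peak_concurrency(total, limit):
    sem = asyncio.Semaphore(limit)
    state = {"running": 0, "peak": 0}
    await asyncio.gather(*(worker(sem, state) for _ in range(total)))
    return state["peak"]


if __name__ == "__main__":
    # 20 workers, but never more than 5 inside the semaphore at once
    print(asyncio.run(peak_concurrency(20, 5)))
```

All tasks are created up front; the semaphore only controls how many of them are past the async with line at any moment, which is the same role it plays in bound_register above.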

 

 

 

 

 

Reference: https://www.cnblogs.com/blueberry-mint/p/13937205.html

 

