HTTPX: A Next-Generation HTTP Client Library

Reposted from the original article. 2020-09-26 23:38

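A next-generation HTTP client library for Python 3.6 and later

Introduction

HTTPX is a project that has recently been getting a lot of attention on GitHub. According to the official site, its main features are: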

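As convenient to use as requests, with everything requests offers.
Support for both HTTP/1.1 and HTTP/2.
Ability to make requests directly to WSGI or ASGI applications.
Strict timeouts everywhere.
Fully type annotated.
100% test coverage.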

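One feature I particularly like is the full type annotation, which reminds me of Starlette, another fully type-annotated library. Type annotations mainly help IDEs with autocompletion and hints. Statically typed languages such as Java have always had this; Python added type hints relatively recently. The other features can wait; let's look at some examples.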

Installation

Installing httpx is simple. As with any other Python library, pip does the job:

python3 -m pip install httpx

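HTTP/2 support requires the optional http2 extra, and the client must then be created with http2=True. The quotes keep shells such as zsh from expanding the brackets:

python3 -m pip install "httpx[http2]"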

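Usage examples

import httpx
r = httpx.get('https://www.example.org/')
r.text         # response body decoded to str
r.content      # raw response body as bytes
r.json()       # body parsed as JSON (raises if the body is not valid JSON)
r.status_code  # integer HTTP status code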

For basic use, import the package and call get. The other methods work much as they do in requests:

r = httpx.put('https://httpbin.org/put', data={'key': 'value'})
r = httpx.delete('https://httpbin.org/delete')
r = httpx.head('https://httpbin.org/get')
r = httpx.options('https://httpbin.org/get')

OK, that covers the basics.

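If you are writing a crawler that has to carry cookies from one request to the next, one-off calls like these are no longer enough.
httpx provides a Client that plays the same role as the Session in requests.

import httpx
client = httpx.Client()  # similar to requests.Session()
try:
    r = client.get('https://www.example.org/')  # send every request through the same client
finally:
    client.close()  # close the connection pool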

A more elegant approach is to use the client as a context manager in a with statement:

with httpx.Client() as client:
    headers = {'X-Custom': 'value'}
    r = client.get('https://example.com', headers=headers)

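One point is worth emphasizing: headers can be set both on the Client and on the individual get call,
and the two sets are merged into the outgoing request. The official example: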

>>> headers = {'X-Auth': 'from-client'}
>>> params = {'client_id': 'client1'}
>>> with httpx.Client(headers=headers, params=params) as client:
...     headers = {'X-Custom': 'from-request'}
...     params = {'request_id': 'request1'}
...     r = client.get('https://example.com', headers=headers, params=params)
...
>>> r.request.url
URL('https://example.com?client_id=client1&request_id=request1')
>>> r.request.headers['X-Auth']
'from-client'
>>> r.request.headers['X-Custom']
'from-request'

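Next, proxies, which many readers care about. Note that in httpx a proxy can only be configured when the Client instance is created with httpx.Client; client.get has no such parameter. The proxies argument described here matches the httpx releases available when this was written; it was deprecated in httpx 0.26 and removed in 0.28 in favor of proxy and mounts.
Interestingly, the proxy setting accepts routing rules, so you can control which requests go through a proxy and which do not. Here are the official examples.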

Route all requests through a proxy

proxies = {
    "all://": "http://localhost:8030",
}

A value of None in the dictionary means that no proxy is used for that key.

Use different proxies for different protocols

proxies = {
    "http://": "http://localhost:8030",
    "https://": "http://localhost:8031",
}

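HTTP traffic goes through the proxy on port 8030 and HTTPS traffic through the one on port 8031. Note the difference from requests, where the keys are bare scheme names without the trailing ://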

proxies = {
    "http": "http://localhost:8030",
    "https": "http://localhost:8030",
}
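When porting a crawler from requests, the key difference above is easy to miss. The helper below is a hypothetical convenience (to_httpx_proxy_keys is not part of either library) that rewrites requests-style keys into the httpx style, assuming only plain scheme keys such as http and https are in use.

```python
from typing import Dict, Optional


def to_httpx_proxy_keys(
    requests_proxies: Dict[str, Optional[str]],
) -> Dict[str, Optional[str]]:
    """Convert requests-style keys ("http") to httpx-style keys ("http://")."""
    converted = {}
    for key, proxy_url in requests_proxies.items():
        # Keys that already contain "://" are left unchanged.
        new_key = key if "://" in key else f"{key}://"
        converted[new_key] = proxy_url
    return converted


requests_style = {
    "http": "http://localhost:8030",
    "https": "http://localhost:8030",
}
httpx_style = to_httpx_proxy_keys(requests_style)
print(httpx_style)
```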

Combining rules

You can also configure several rules together, as shown here:

proxies = {
    # Route all traffic through a proxy by default...
    "all://": "http://localhost:8030",
    # But don't use proxies for HTTPS requests to "domain.io"...
    "https://domain.io": None,
    # And use another proxy for requests to "example.com" and its subdomains...
    "all://*example.com": "http://localhost:8031",
    # And yet another proxy if HTTP is used,
    # and the "internal" subdomain on port 5550 is requested...
    "http://internal.example.com:5550": "http://localhost:8032",
}

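That is all for proxies. Next, the connection pool.
You can control the size of the connection pool with the limits keyword argument of Client. It takes an httpx.Limits instance, which defines:

max_keepalive_connections: the number of keep-alive connections allowed, or None to always allow. (Listed as 10 when this was written; recent httpx releases default to 20.)
max_connections: the maximum number of connections allowed, or None for no limit. (Default 100.)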

limits = httpx.Limits(max_keepalive_connections=5, max_connections=10)
client = httpx.Client(limits=limits)

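If the default connection counts are not enough, set your own. (In my experience they are not.)

I have only covered the parts a crawler is likely to need. For everything else, such as using httpx together with Flask, see the official site.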

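That is about all there is to say about synchronous requests in httpx. If the article stopped here, you would surely ask, "Is that it? What's wrong with requests?" Well, you would be mistaken: httpx supports asynchronous requests as well as synchronous ones, and it is much simpler to use than aiohttp. That is the real reason I recommend it.

Asynchronous requests with httpx

The official site gives async support a section of its own, which suggests there is something to it.
Enough talk, let's get started.

First, here is how a request is created and sent with aiohttp: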

import aiohttp
import asyncio

async def main():
    async with aiohttp.ClientSession() as client:
        async with client.get('http://httpbin.org/get') as resp:
            assert resp.status == 200
            html = await resp.text()
            print(html)

asyncio.run(main())

A single request needs two async with blocks. Now let's see how httpx does it:

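# Run this inside a coroutine (an async def function).
async with httpx.AsyncClient() as client:
    resp = await client.get('http://httpbin.org/get')
    assert resp.status_code == 200
    html = resp.text

Overall this feels much more comfortable to write than the aiohttp version, with far less async boilerplate.
When I used aiohttp I kept writing status_code where resp.status was needed, probably a habit picked up from requests. With httpx that mistake is no longer a concern.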

Afterword

I recently converted my earlier project discogs_aio_spider from aiohttp to httpx. If you are interested in httpx, feel free to study the project, and questions are welcome. 👏
Project name: discogs_aio_spider
Project URL: https://github.com/cxapython/discogs_aio_spider
Modules used: asyncio, httpx, motor, aio-pika, aioredis
