The requests Module in Python 3

Reposted article. Published 2016-06-29 21:55. Tags: python3, python, requests, Python modules

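　　The Python standard library provides modules such as urllib for making HTTP requests, but the API is clumsy. It was built for a different time and a different web, and it takes a great deal of work, sometimes even method overrides, to accomplish the simplest tasks.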

　　Sending a GET request

import urllib.request

f = urllib.request.urlopen('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
result = f.read().decode('utf-8')

　　Sending a GET request with request headers

import urllib.request

req = urllib.request.Request('http://www.example.com/')
req.add_header('Referer', 'http://www.python.org/')
r = urllib.request.urlopen(req)

result = r.read().decode('utf-8')

　　For more details, see the official urllib documentation.

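　　Requests is an HTTP library written in Python and released under the Apache2 License. It is a high-level wrapper built (via urllib3) on top of Python's built-in modules, and it makes network requests far more pleasant for Python developers. With Requests you can easily do almost anything a browser can do.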

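Features of the requests library:

Keep-Alive & connection pooling
International domains and URLs
Sessions with cookie persistence
Browser-style SSL verification
Automatic content decoding
Basic/Digest authentication
Elegant key/value cookies
Automatic decompression
Unicode response bodies
HTTP(S) proxy support
Multipart file uploads
Streaming downloads
Connection timeouts
Chunked requests
.netrc support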

1. Installing the module

Install:
	pip install requests
Upgrade:
	pip install --upgrade requests

2. Using the module

　　HTTP request methods include POST, GET, PUT, DELETE, HEAD and OPTIONS. POST and GET are the most commonly used.

　　GET requests

import requests

# Example without parameters
r = requests.get('https://httpbin.org/get')

Passing URL parameters:
    URLs often contain a ? character, as in http://httpbin.org/get?key=val. requests builds this kind of query string from the params argument.

d = {'k1': 'v1', 'k2': 'v2', 'k3': None, 'k4': ['v4', 'v5']}
    # Keys whose value is None are not added to the URL
    # Key/value pairs are joined with &
    # A value may be a list, as with 'k4'

# Example with parameters
r = requests.get('http://httpbin.org/get', params=d)
print(r.url)
Output: http://httpbin.org/get?k1=v1&k2=v2&k4=v4&k4=v5
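
The query string that params produces can also be inspected without sending anything. The following is a minimal sketch, assuming the requests.Request(...).prepare() API, which builds the request locally; the variable name prepared is only illustrative.

```python
import requests

# Same dictionary as above: None values are dropped, lists are expanded.
d = {'k1': 'v1', 'k2': 'v2', 'k3': None, 'k4': ['v4', 'v5']}

# prepare() encodes params into the URL but does not open a connection.
prepared = requests.Request('GET', 'http://httpbin.org/get', params=d).prepare()

print(prepared.url)
```

This is handy for checking how a dictionary will be encoded before pointing the code at a real server.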

　　POST requests

# 1. Basic POST example
 
import requests
 
payload = {'key1': 'value1', 'key2': 'value2'}
ret = requests.post("http://httpbin.org/post", data=payload)
print(ret.text)
# Output
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "key1": "value1", 
    "key2": "value2"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Content-Length": "23", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.18.4"
  }, 
  "json": null, 
  "origin": "(your IP address is returned here)", 
  "url": "http://httpbin.org/post"
}

 
# 2. Sending request headers and data
 
import requests
import json

url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
headers = {'content-type': 'application/json'}
ret = requests.post(url, data=json.dumps(payload), headers=headers)
print(ret.text)
print(ret.cookies)
# Output
{"message":"Not Found","documentation_url":"https://developer.github.com/v3"}
<RequestsCookieJar[]>

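The two POST styles above differ only in how the body is encoded. As a rough sketch, the requests can be prepared locally to compare them. It also uses the json= parameter, which serializes the body and sets the Content-Type header itself, so json.dumps() and a hand-written header are not strictly needed. The names form_req and json_req are illustrative.

```python
import json
import requests

url = 'http://httpbin.org/post'
payload = {'key1': 'value1', 'key2': 'value2'}

# data=dict is form-encoded, as in the first example.
form_req = requests.Request('POST', url, data=payload).prepare()

# json=dict is serialized to JSON and the Content-Type header is set automatically.
json_req = requests.Request('POST', url, json=payload).prepare()

print(form_req.headers['Content-Type'], form_req.body)
print(json_req.headers['Content-Type'], json.loads(json_req.body))
```

Nothing is sent here; prepare() only builds the headers and body that requests.post would transmit.
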
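　　About the response content

　requests returns a Response object, from which you can read whatever you need. Below, r stands for a Response object.

r.text     response body as text
r.content  response body as bytes
r.json()   response body parsed as JSON
r.raw      raw response stream

# Text response content
    A Response object carries a lot of information, and Requests seamlessly decodes most unicode character sets.
    After a request is sent, Requests makes an educated guess about the response encoding based on the HTTP headers.
    You can read the encoding from r.encoding, and you can also assign to r.encoding to change it.

# Binary response content
    For non-text responses, use r.content to read the body as bytes. Requests automatically decodes responses sent with the gzip and deflate transfer encodings.

# JSON response content
    Note that r.json() raises an exception if JSON decoding fails. A successful call to r.json() does not mean the request succeeded, however: some servers
    include a JSON object in their error responses, and that JSON is decoded and returned as well. To check whether a request succeeded, call r.raise_for_status()
    or check that r.status_code is what you expect.

# Raw response content
    To get the raw socket response from the server, use r.raw. Make sure stream=True was set on the initial request.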
r = requests.get('https://httpbin.org/get', stream=True)
print(r.raw)
print(r.raw.read(10))
# Output
<urllib3.response.HTTPResponse object at 0x061665F0>
b'{\n  "args"
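
The text, content and json() accessors can be tried without a server. The sketch below builds a Response by hand, which is an illustration-only trick: the helper make_response is hypothetical, and it sets the private _content attribute purely to avoid network access. Real code gets its Response from requests.get.

```python
import requests

def make_response(body, status=200):
    """Build a Response by hand (illustration only)."""
    r = requests.Response()
    r.status_code = status
    r._content = body            # private attribute, set here only to stay offline
    return r

r = make_response('{"name": "café"}'.encode('utf-8'))

r.encoding = 'utf-8'
text_utf8 = r.text               # decoded with the declared encoding
r.encoding = 'latin-1'
text_latin1 = r.text             # same bytes, decoded with a different encoding
r.encoding = 'utf-8'

print(r.content)                 # the undecoded bytes
print(text_utf8, text_latin1)
print(r.json())                  # parsed into a Python dict

# A body that is not JSON makes json() raise a ValueError subclass.
try:
    make_response(b'not json at all').json()
    json_failed = False
except ValueError:
    json_failed = True
print(json_failed)
```

Changing r.encoding affects only how r.text is decoded; r.content always returns the same bytes.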


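　　Custom headers

To add HTTP headers, pass a dict to the headers parameter. Note: all header values must be a string, bytestring, or unicode. Passing unicode header values is permitted but not advised.

Note: custom headers are given lower precedence than more specific sources of information. For example:

Authorization headers set with headers= are overridden if credentials are specified in .netrc, which in turn are overridden by the auth= parameter.
Authorization headers are removed if you get redirected to a different host.
Proxy-Authorization headers are overridden by proxy credentials provided in the URL.
Content-Length headers are overridden when the length of the content can be determined.

Furthermore, Requests does not change its behavior based on which custom headers are specified. The headers are simply passed on into the final request.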

url = 'https://api.github.com/some/endpoint'
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
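
To see which headers would actually go out, the request can be prepared through a Session without sending it. This is a minimal sketch that assumes Session.prepare_request merges the session's default headers with the per-request ones; the names session and prepared are illustrative.

```python
import requests

url = 'https://api.github.com/some/endpoint'
headers = {'user-agent': 'my-app/0.0.1'}

# The session contributes defaults such as Accept and Accept-Encoding;
# headers passed per request override defaults with the same name.
session = requests.Session()
prepared = session.prepare_request(requests.Request('GET', url, headers=headers))

for name, value in prepared.headers.items():
    print(name + ': ' + value)
```

The custom user-agent replaces the default python-requests one, while the other default headers are kept.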

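　　Response status codes

The status code tells you the outcome of a request; 200 generally means success. Requests also ships with a built-in status code lookup object, requests.codes: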

>>> r = requests.get('http://httpbin.org/get')
>>> r.status_code
200
>>> r.status_code == requests.codes.ok
True

# If we made a bad request (a 4XX client error or 5XX server error response), we can raise it with Response.raise_for_status():

>>> bad_r = requests.get('http://httpbin.org/status/404')
>>> bad_r.status_code
404

>>> bad_r.raise_for_status()
Traceback (most recent call last):
  File "requests/models.py", line 832, in raise_for_status
    raise http_error
requests.exceptions.HTTPError: 404 Client Error

# But since the status_code of r in our example is 200, calling raise_for_status() gives:
>>> r.raise_for_status()
None
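
In application code, raise_for_status() is usually wrapped in a try/except. The sketch below shows one possible pattern; check is a hypothetical helper, and it builds Response objects by hand only so that it runs offline (normally they come from requests.get).

```python
import http.client
import requests

def check(status):
    """Return 'ok' or an error description for a hand-built Response."""
    r = requests.Response()                      # illustration only
    r.status_code = status
    r.reason = http.client.responses.get(status)
    r.url = 'http://httpbin.org/status/%d' % status
    try:
        r.raise_for_status()                     # does nothing for non-error codes
    except requests.exceptions.HTTPError as exc:
        return 'error: %s' % exc
    return 'ok'

print(requests.codes.ok, requests.codes.not_found)
print(check(200))
print(check(404))
print(check(500))
```

4XX codes are reported as client errors and 5XX codes as server errors; every other code passes through silently.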

　　Response headers

>>> r.headers
{
    'content-encoding': 'gzip',
    'transfer-encoding': 'chunked',
    'connection': 'close',
    'server': 'nginx/1.0.4',
    'x-runtime': '148ms',
    'etag': '"e1ca502697e5c9317743dc078f67693f"',
    'content-type': 'application/json'
}

# This dictionary is special, though: it is made just for HTTP headers. According to RFC 2616, HTTP header names are case-insensitive.

>>> r.headers['Content-Type']
'application/json'

>>> r.headers.get('content-type')
'application/json'
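
The case-insensitive behavior comes from the CaseInsensitiveDict class in requests.structures, which is the type of r.headers. A small sketch of how it behaves on its own:

```python
from requests.structures import CaseInsensitiveDict

headers = CaseInsensitiveDict()
headers['Content-Type'] = 'application/json'

# Lookups ignore case...
print(headers['content-type'])
print(headers.get('CONTENT-TYPE'))
print('Content-type' in headers)

# ...but iteration keeps the spelling used when the key was set.
print(list(headers))
```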

　　Cookie

>>> url = 'http://example.com/some/cookie/setting/url'
>>> r = requests.get(url)

>>> r.cookies['example_cookie_name']
'example_cookie_value'

# To send your own cookies to the server, use the cookies parameter
>>> url = 'http://httpbin.org/cookies'
>>> cookies = dict(cookies_are='working')
>>> r = requests.get(url, cookies=cookies)
>>> r.text
'{"cookies": {"cookies_are": "working"}}'

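# Cookies are returned in a RequestsCookieJar, which acts like a dict but also offers a more complete interface, suitable for use over multiple domains or paths. Cookie jars can also be passed in to requests: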
>>> jar = requests.cookies.RequestsCookieJar()
>>> jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
>>> jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')
>>> url = 'http://httpbin.org/cookies'
>>> r = requests.get(url, cookies=jar)
>>> r.text
'{"cookies": {"tasty_cookie": "yum"}}'
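
The domain and path rules of the jar can be checked locally. In the sketch below, cookie_header is a hypothetical helper that prepares a request (without sending it) and returns the Cookie header that would be attached for a given URL.

```python
import requests

jar = requests.cookies.RequestsCookieJar()
jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')

def cookie_header(url):
    """Return the Cookie header requests would send to url, or None."""
    prepared = requests.Request('GET', url, cookies=jar).prepare()
    return prepared.headers.get('Cookie')

print(cookie_header('http://httpbin.org/cookies'))    # only the /cookies cookie matches
print(cookie_header('http://httpbin.org/elsewhere'))  # only the /elsewhere cookie matches
print(cookie_header('http://example.com/cookies'))    # different domain, nothing is sent
```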

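　　Timeouts

You can tell requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter; without it, your program may hang indefinitely.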

>>> requests.get('http://github.com', timeout=0.001)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)

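# Notes
    timeout is not a time limit on the entire response download. Rather, an exception is raised if the server has not issued a response within timeout seconds
    (more precisely, if no bytes have been received on the underlying socket for timeout seconds).
    If no timeout is specified explicitly, requests do not time out.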
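
The difference between connecting and waiting for data can be demonstrated locally. The sketch below is an assumption-laden illustration: it opens a listening socket on 127.0.0.1 that never replies, and it uses the (connect, read) tuple form of timeout. The connection itself succeeds, so it is the read timeout that should fire.

```python
import socket
import requests

# A listening socket that never answers: the OS completes the TCP handshake,
# but no HTTP response is ever written.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))       # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

session = requests.Session()
session.trust_env = False           # ignore proxy settings from the environment

try:
    # timeout=(connect timeout, read timeout), both in seconds
    session.get('http://127.0.0.1:%d/' % port, timeout=(1, 0.3))
    outcome = 'response received'
except requests.exceptions.ReadTimeout:
    outcome = 'read timed out'
except requests.exceptions.ConnectTimeout:
    outcome = 'connect timed out'
finally:
    session.close()
    server.close()

print(outcome)
```

A single number such as timeout=5 applies the same value to both phases.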

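　　Errors and exceptions

In the event of a network problem (e.g. DNS failure, refused connection), Requests raises a ConnectionError exception.
If an HTTP request returns an unsuccessful status code, Response.raise_for_status() raises an HTTPError exception.
If a request times out, a Timeout exception is raised.
If a request exceeds the configured maximum number of redirections, a TooManyRedirects exception is raised.
All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.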
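
One way to put this hierarchy to use is to catch the specific exceptions first and fall back to the base class. The function describe_failure below is a hypothetical helper, shown as a sketch rather than a recommended policy.

```python
import requests

def describe_failure(url, timeout=5):
    """Fetch url and return a short label describing what happened."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()
    except requests.exceptions.Timeout:
        # Checked before ConnectionError: ConnectTimeout inherits from both.
        return 'timeout'
    except requests.exceptions.ConnectionError:
        return 'connection error'
    except requests.exceptions.HTTPError as exc:
        return 'bad status: %s' % exc.response.status_code
    except requests.exceptions.TooManyRedirects:
        return 'too many redirects'
    except requests.exceptions.RequestException as exc:
        # Base class catch-all, e.g. for malformed URLs.
        return 'request failed: %s' % type(exc).__name__
    return 'ok'

# A URL without a scheme is rejected before any network access happens.
print(describe_failure('not-a-url'))
```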

　　Other request methods

requests.get(url, params=None, **kwargs)
requests.post(url, data=None, json=None, **kwargs)
requests.put(url, data=None, **kwargs)
requests.head(url, **kwargs)
requests.delete(url, **kwargs)
requests.patch(url, data=None, **kwargs)
requests.options(url, **kwargs)
 
# All of the methods above are built on top of this one
requests.request(method, url, **kwargs)
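
Because every helper funnels into the same method/url pair, the verb can be treated as plain data. A small sketch, preparing requests locally instead of sending them (the list name prepared_methods is illustrative):

```python
import requests

url = 'http://httpbin.org/get'
verbs = ('get', 'post', 'put', 'head', 'delete', 'patch', 'options')

# Preparing a request normalizes the verb to upper case, whichever helper
# or spelling was used to create it.
prepared_methods = [requests.Request(verb, url).prepare().method for verb in verbs]

print(prepared_methods)
```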

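3. HTTP requests and XML examples

Example: checking whether a QQ account is online

import urllib.request
import requests
from xml.etree import ElementTree as ET

# Send the HTTP request with the built-in urllib module and fetch the XML content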
"""
f = urllib.request.urlopen('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
result = f.read().decode('utf-8')
"""


# Send the HTTP request with the third-party requests module and fetch the XML content
r = requests.get('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
result = r.text

# Parse the XML content
node = ET.XML(result)

# Read the result
if node.text == "Y":
    print("online")
else:
    print("offline")
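
The parsing step can be tested separately from the HTTP call. The sample strings below are hypothetical response bodies, assumed to be a single element whose text is the status flag; is_online is an illustrative helper name.

```python
from xml.etree import ElementTree as ET

# Hypothetical response bodies (assumed shape: one element holding the flag).
sample_online = '<string xmlns="http://WebXml.com.cn/">Y</string>'
sample_offline = '<string xmlns="http://WebXml.com.cn/">N</string>'

def is_online(xml_text):
    """Return True when the root element's text is 'Y'."""
    node = ET.XML(xml_text)
    return (node.text or '').strip() == 'Y'

print(is_online(sample_online))
print(is_online(sample_offline))
```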

Example: looking up train stop information

import urllib.request
import requests
from xml.etree import ElementTree as ET

# Send the HTTP request with the built-in urllib module and fetch the XML content
"""
f = urllib.request.urlopen('http://www.webxml.com.cn/WebServices/TrainTimeWebService.asmx/getDetailInfoByTrainCode?TrainCode=G666&UserID=')
result = f.read().decode('utf-8')
"""

# Send the HTTP request with the third-party requests module and fetch the XML content
r = requests.get('http://www.webxml.com.cn/WebServices/TrainTimeWebService.asmx/getDetailInfoByTrainCode?TrainCode=G666&UserID=')
result = r.text

# Parse the XML content
root = ET.XML(result)
for node in root.iter('TrainDetailInfo'):
    print(node.find('TrainStation').text,node.find('StartTime').text,node.tag,node.attrib)
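
The iteration over TrainDetailInfo elements can be tried on a local string. The XML below is a simplified, hypothetical stand-in for the service response (the real document may carry extra wrappers and namespaces); only the element names TrainDetailInfo, TrainStation and StartTime come from the code above.

```python
from xml.etree import ElementTree as ET

# Hypothetical, simplified sample of the response structure.
sample = """
<DataSet>
  <TrainDetailInfo id="1">
    <TrainStation>Station A</TrainStation>
    <StartTime>08:00:00</StartTime>
  </TrainDetailInfo>
  <TrainDetailInfo id="2">
    <TrainStation>Station B</TrainStation>
    <StartTime>09:30:00</StartTime>
  </TrainDetailInfo>
</DataSet>
"""

root = ET.XML(sample)

# iter() walks the whole tree, so nesting depth does not matter.
stops = [(node.findtext('TrainStation'), node.findtext('StartTime'))
         for node in root.iter('TrainDetailInfo')]

for station, start in stops:
    print(station, start)
```

findtext returns None when a child element is missing, which avoids the AttributeError that node.find(...).text would raise in that case.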
