1. Two ways to read a response

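While using the requests module in Python 3, I noticed that there are two ways to get the response body:

The first is the text response content, r.text
The second is the binary response content, r.content

"Learning Python" explains the underlying types like this: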

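'''Python 3.X comes with three string object types, one for textual data and two for binary data:

str for representing Unicode text (both 8-bit and wider)
bytes for representing binary data
bytearray, a mutable flavor of the bytes type'''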
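
The three types in the quote can be seen side by side in a short sketch. The sample string and variable names here are made up for illustration and do not come from the book.

```python
# str holds Unicode text; bytes and bytearray hold raw binary data.
text = "中文"                    # str: 2 characters
data = text.encode("utf-8")      # bytes: 6 bytes once encoded as UTF-8
buf = bytearray(data)            # bytearray: a mutable copy of those bytes

print(type(text), len(text))
print(type(data), len(data))

buf[0] = 0x41                    # allowed, because bytearray is mutable
try:
    data[0] = 0x41               # bytes is immutable, so this raises TypeError
except TypeError as exc:
    print("bytes is immutable:", exc)

# Decoding the bytes with the same codec gives the original text back.
assert data.decode("utf-8") == text
```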

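In other words, r.text is the response content as Unicode text (a str), while r.content is the response content as raw bytes. Here is what the source code says: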

    @property
    def text(self):
        """Content of the response, in unicode.

        If Response.encoding is None, encoding will be guessed using
        ``chardet``.

        The encoding of the response content is determined based solely on HTTP
        headers, following RFC 2616 to the letter. If you can take advantage of
        non-HTTP knowledge to make a better guess at the encoding, you should
        set ``r.encoding`` appropriately before accessing this property.
        """

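In short, the r.text property returns the response content decoded into Unicode text (a str). The encoding used for decoding is taken from the HTTP headers, and if Response.encoding is None it is guessed with chardet. If you know the real encoding from some source other than HTTP, set r.encoding to it before accessing r.text. Put differently, you can change the encoding through r.encoding, and Requests will use the new value whenever you access r.text: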

>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'
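
To see why the choice of encoding matters, here is a rough sketch that assumes r.text is essentially r.content decoded with r.encoding. The variable raw is a made-up stand-in for r.content, so the example runs without a network request.

```python
# Stand-in for r.content: the UTF-8 bytes of a short Chinese string.
raw = "中文".encode("utf-8")

# Roughly what r.text yields while r.encoding is 'utf-8'.
as_utf8 = raw.decode("utf-8")

# Roughly what r.text yields after r.encoding = 'ISO-8859-1':
# each byte is mapped to one character, so the text is garbled.
as_latin1 = raw.decode("ISO-8859-1")

print(as_utf8)
print(len(as_utf8), len(as_latin1))
```

The bytes never change; only their interpretation does, which is why setting r.encoding before reading r.text is enough to repair a wrongly guessed encoding.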

Now the source of r.content:

    @property
    def content(self):
        """Content of the response, in bytes."""

r.content returns the response content as bytes.

2. The problem and its solution

Sending a GET request with requests as shown here produced output containing \u escape sequences:

import requests

url = "xxx"
params = "xxx"
cookies = {"xxx": "xxx"}

res = requests.request("get", url, params=params, cookies=cookies)
print(res.text)

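So the question is: where does that run of \u-escaped Unicode code points come from (see the Zhihu answer listed in the references), and how do we get the real characters to show? One way is: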

print(res.text.encode().decode("unicode_escape"))

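What is unicode_escape? It is a codec that stores each non-ASCII character as its Unicode code point written out as an ASCII escape sequence (for example \u4e2d), and converts the escapes back into characters when the data is read. That is the representation used in this response.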
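
A minimal round trip through the codec shows both directions. The sample string is an assumption chosen for illustration.

```python
text = "中文"

# Encoding with unicode_escape yields pure-ASCII bytes made of \uXXXX escapes.
escaped = text.encode("unicode_escape")
print(escaped)     # b'\\u4e2d\\u6587'

# Decoding with the same codec turns the escapes back into characters.
restored = escaped.decode("unicode_escape")
print(restored)    # 中文
```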

The same problem appears when we use res.content:

import requests

url = "xxx"
params = "xxx"
cookies = {"xxx": "xxx"}

res = requests.request("get", url, params=params, cookies=cookies)
print(res.content)

The fix is:

print(res.content.decode("unicode_escape"))

3. Summary

1. str.encode() converts a string into its raw bytes form

bytes.decode() converts raw bytes back into their string form

2. When you run into a similar encoding problem, first check the type of the response content text. If type(text) is bytes, use

text.decode('unicode_escape')

If type(text) is str, use

text.encode('latin-1').decode('unicode_escape')
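
The two cases can be folded into one small helper. This is only a sketch: the name unescape and the sample payloads are hypothetical and not part of requests.

```python
def unescape(text):
    """Turn literal backslash-u escape sequences into real characters."""
    if isinstance(text, bytes):
        # bytes can be decoded directly
        return text.decode("unicode_escape")
    if isinstance(text, str):
        # str has to become bytes first; latin-1 maps code points 0-255
        # one-to-one onto single bytes
        return text.encode("latin-1").decode("unicode_escape")
    raise TypeError("expected str or bytes, got " + type(text).__name__)


# Made-up payloads standing in for res.content and res.text.
print(unescape(b'{"msg": "\\u4e2d\\u6587"}'))
print(unescape('{"msg": "\\u4e2d\\u6587"}'))
```

Note that the str branch raises UnicodeEncodeError if the text already contains characters above U+00FF, and the bytes branch treats any non-ASCII byte as Latin-1, so this approach is safest when the body is plain ASCII apart from the escapes.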

References

https://www.zhihu.com/question/26921730

https://blog.csdn.net/bubblelone/article/details/70039419

https://blog.csdn.net/qq_23849183/article/details/51221993
