Using urllib.request and requests in Python, and how they differ

Reposted 2020-10-28 23:09. Tags: web scraping / python

Reposted from https://blog.csdn.net/qq_38783948/article/details/88239109

urllib.request

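As we know, urlopen() can send the most basic kind of request, but in practice that alone is rarely enough. When we need to add parameters such as headers, we build the request with the more capable Request class.

When no extra configuration is needed, urlopen() on its own is enough to send a simple web request.

Sending a simple request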

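import urllib.request

url = 'https://www.douban.com'
webPage = urllib.request.urlopen(url)
print(webPage)                # the response object itself, not the page
data = webPage.read()         # raw response body as bytes
print(data)
print(data.decode('utf-8'))   # decode the bytes into a str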

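For HTTP and HTTPS URLs, urlopen() returns an http.client.HTTPResponse object, which has to be processed further with read(). read() returns bytes, so we normally call decode() on the result, usually with utf-8, and only then do we have the page text we are after.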
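
As a minimal sketch of the read-then-decode step described above, the helper below takes the charset from the response headers instead of always assuming utf-8. The name fetch_text and its default_charset parameter are illustrative and not part of urllib. The demo uses a data: URL, which urllib resolves locally, so it runs without network access.

```python
import urllib.request


def fetch_text(url, default_charset="utf-8", timeout=10):
    """Open url, read the body as bytes, and decode it to str.

    fetch_text is a hypothetical helper name, not a urllib function.
    """
    # The response is a context manager, so it is closed for us on exit.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes
        # Prefer the charset declared in Content-Type, else use the default.
        charset = response.headers.get_content_charset() or default_charset
    return raw.decode(charset)


# A data: URL is resolved by urllib itself, so this demo needs no network.
demo_url = "data:text/html;charset=utf-8,%E4%BD%A0%E5%A5%BD%20urllib"
print(fetch_text(demo_url))
```

The same call works for a real page, for example fetch_text('https://www.douban.com'), provided the site accepts the request.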

Adding headers

import urllib.request

url = 'https://www.douban.com'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36',
}
# Request only describes the request; nothing is sent until urlopen() is called
request = urllib.request.Request(url=url, headers=headers)
webPage = urllib.request.urlopen(request)
print(webPage.read().decode('utf-8'))

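Calling the Request class gives us a urllib.request.Request object. It only describes the request; passing it to urlopen() is what actually sends it.
When scraping pages we usually need to add extra information while building the HTTP request, such as a User-Agent or cookies, or to route traffic through a proxy server. These are often necessary to get past a site's anti-scraping measures.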
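
The extras mentioned above (User-Agent, cookies, a proxy) can be sketched with the standard library alone. The helper names build_request and make_proxy_opener, the cookie value, and the proxy address 127.0.0.1:8080 are all assumptions made for illustration. Nothing is sent over the network here.

```python
import urllib.request

USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"  # shortened, illustrative


def build_request(url, cookies=None):
    """Build (but do not send) a Request with a User-Agent and optional cookies."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    if cookies:
        # The Cookie header is one line of name=value pairs joined by "; ".
        cookie_line = "; ".join(f"{name}={value}" for name, value in cookies.items())
        request.add_header("Cookie", cookie_line)
    return request


def make_proxy_opener(proxy_url):
    """Return an opener that routes http and https traffic through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


request = build_request("https://www.douban.com", cookies={"session": "abc123"})
# Request stores header names capitalized, hence "User-agent".
print(request.get_header("User-agent"))
print(request.get_header("Cookie"))

opener = make_proxy_opener("http://127.0.0.1:8080")  # hypothetical local proxy
# opener.open(request) would send the request through that proxy.
```

Using a dedicated opener keeps the proxy setting local to the calls made through it, rather than changing how every urlopen() call in the program behaves.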

requests

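In general, the requests library is the better choice when writing a scraper in Python, because it is more convenient than urllib. With requests, a GET or POST request is built and sent in a single call. With urllib.request, as soon as you need custom headers you first build a Request object and then pass it to urlopen().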

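import requests

url = 'https://www.douban.com'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36',
}
get_response = requests.get(url, headers=headers, params=None)
post_response = requests.post(url, headers=headers, data=None, json=None)
print(post_response)            # the Response object, shown with its status code
print(get_response.text)        # body decoded to str
print(get_response.content)     # raw body as bytes
try:
    print(get_response.json())  # json() is a method and must be called
except ValueError:
    print('body is not valid JSON')  # expected when the server returns HTML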

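get_response.text is the body as a str, decoded with the encoding that requests infers from the response.
get_response.content is the body as bytes, which you have to decode yourself. get_response.text is essentially this content already decoded.
get_response.json() (a method call, not an attribute) parses the body as JSON and returns the resulting Python object. It raises an error if the body is not valid JSON.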
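
To see the three forms side by side on a body that really is JSON, here is a small sketch that queries a throwaway local server instead of a real site. JsonHandler, inspect_response, demo, and the payload are all made up for this example. It assumes requests is installed and that the machine allows connections to 127.0.0.1.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

PAYLOAD = {"library": "requests", "ok": True}  # hypothetical response data


class JsonHandler(BaseHTTPRequestHandler):
    """Hypothetical local endpoint that always answers with PAYLOAD as JSON."""

    def do_GET(self):
        body = json.dumps(PAYLOAD).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def inspect_response(url):
    """Return the body as bytes, as str, and as parsed JSON."""
    with requests.Session() as session:
        session.trust_env = False  # ignore proxy environment variables
        response = session.get(url, timeout=10)
        response.raise_for_status()
        return response.content, response.text, response.json()


def demo():
    """Start the local server on a free port, query it once, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), JsonHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return inspect_response(f"http://127.0.0.1:{server.server_port}/")
    finally:
        server.shutdown()
        server.server_close()


raw, text, data = demo()
print(type(raw), type(text), type(data))
```

Against a real site the same three accessors apply, but json() only succeeds when the server actually returns JSON.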

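In short, requests offers a higher-level interface than urllib.request, which makes it more convenient to use. Note that it is built on the third-party urllib3 package rather than on the standard-library urllib. We recommend using requests wherever you can in real projects.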
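
The difference in convenience shows up when building a form POST. The sketch below prepares the same request with both libraries without sending it. The URL, the form fields, and the two helper names are assumptions made for illustration, and requests must be installed.

```python
import urllib.parse
import urllib.request

import requests

URL = "https://www.example.com/login"   # hypothetical endpoint
FORM = {"user": "alice", "lang": "zh"}  # hypothetical form fields


def urllib_post(url, form):
    """With urllib, the form must be URL-encoded and turned into bytes by hand."""
    body = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def requests_post(url, form):
    """With requests, the dict is passed as-is and encoded by the library."""
    return requests.Request("POST", url, data=form).prepare()


u = urllib_post(URL, FORM)
r = requests_post(URL, FORM)
print(u.get_method(), u.data)
print(r.method, r.body, r.headers["Content-Type"])
```

In everyday code you would simply call requests.post(URL, data=FORM); the prepared request is used here only so that the result can be inspected without any network traffic.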
