Crawler Learning (3): Parsing GET Request Parameters

Reposted article, originally published 2019-02-12 11:55. Topic: web crawling with Python.

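GET requests:

The user enters a search term, the crawler sends the request, and the content returned for that request is saved.

A GET request is essentially a request whose parameters are carried in the URL itself, exactly as they appear in the browser's address bar.

Use urllib.parse to parse the parameters: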

import urllib.parse

# Searching Baidu for the keyword "中国" produces the following URL:
string = "https://www.baidu.com/s?ie=utf-8&word=%E4%B8%AD%E5%9B%BD&tn=98537121_hao_pg"

# unquote() reverses quote(): it decodes the percent-encoded UTF-8 bytes
# back into readable Chinese text
string = urllib.parse.unquote(string)

print(string)

# Output:
# https://www.baidu.com/s?ie=utf-8&word=中国&tn=98537121_hao_pg
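
unquote() decodes the whole URL as one string. When the individual parameters are needed, a minimal sketch along the following lines may be more convenient; it uses urlsplit() and parse_qs() from the same module, and the variable names `parts` and `params` are only illustrative.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://www.baidu.com/s?ie=utf-8&word=%E4%B8%AD%E5%9B%BD&tn=98537121_hao_pg"

# Split the URL into scheme, host, path, query and fragment
parts = urlsplit(url)

# parse_qs() percent-decodes the query and returns a dict mapping
# each parameter name to a list of its values
params = parse_qs(parts.query)

print(parts.netloc)       # www.baidu.com
print(params["word"][0])  # 中国
```

Each value is a list because a parameter name may legally appear more than once in a query string.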

Method 1

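country = input("Enter the text to search for: ")
# Example input: 王家兴

# quote() percent-encodes the text: the Chinese characters become UTF-8 bytes
# written as %XX escapes
# Encode only the parameter value, not the whole URL
string = urllib.parse.quote(country)

# Write the parameters as a dictionary
data = {
    "ie": "utf-8",
    "word": string,
}

# Build the query string from key=value pairs
lt = []
for key, value in data.items():
    lt.append(key + "=" + value)

# join() concatenates the strings in lt, using "&" as the separator
var = "&".join(lt)

string = "https://www.baidu.com/s?%s" % var
print(string)

# Output:
# https://www.baidu.com/s?ie=utf-8&word=%E7%8E%8B%E5%AE%B6%E5%85%B4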
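
Because Method 1 joins the pairs by hand, the value has to be encoded before it is concatenated; otherwise a character such as "&" in the search term would be read as a parameter separator. The sketch below illustrates this with a made-up keyword (the string "王家兴 a&b" is an assumption for demonstration, not from the original) and shows the `safe` argument of quote().

```python
from urllib.parse import quote, unquote

# Hypothetical keyword containing a space and a reserved character
keyword = "王家兴 a&b"

encoded = quote(keyword)
print(encoded)  # %E7%8E%8B%E5%AE%B6%E5%85%B4%20a%26b

# unquote() restores the original text
print(unquote(encoded) == keyword)  # True

# By default quote() leaves "/" untouched (safe="/");
# pass safe="" to encode it as well
print(quote("a/b"))           # a/b
print(quote("a/b", safe=""))  # a%2Fb
```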

Method 2

# Write the parameters as a dictionary
data = {
    "ie": "utf-8",
    "word": "澳大利亚",
}
# urlencode() percent-encodes every value and joins the key=value pairs
# with "&", producing a ready-to-use query string
urldata = urllib.parse.urlencode(data)

print(urldata)
# Output:
# ie=utf-8&word=%E6%BE%B3%E5%A4%A7%E5%88%A9%E4%BA%9A

# Append the query string to the base URL
urlall = "https://www.baidu.com/s?%s" % urldata

print(urlall)
# Output:
# https://www.baidu.com/s?ie=utf-8&word=%E6%BE%B3%E5%A4%A7%E5%88%A9%E4%BA%9A
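
To see how the two methods relate, the following sketch wraps Method 1 in a small helper and compares it with urlencode(). The helper name `build_query_manual` is hypothetical and exists only for this comparison.

```python
from urllib.parse import quote, urlencode


def build_query_manual(params):
    """Method 1: quote each value, then join key=value pairs with '&'."""
    return "&".join(f"{key}={quote(str(value))}" for key, value in params.items())


params = {"ie": "utf-8", "word": "澳大利亚"}

manual = build_query_manual(params)
encoded = urlencode(params)
print(manual == encoded)  # True

# The two differ for spaces: urlencode() uses quote_plus() by default,
# which writes a space as "+", while quote() writes "%20"
print(urlencode({"word": "a b"}))           # word=a+b
print(build_query_manual({"word": "a b"}))  # word=a%20b
```

For ordinary search parameters both forms are generally accepted by servers, so urlencode() is usually the shorter and less error-prone choice.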

AJAX GET requests

Request information (as shown by the browser):

Request URL: https://movie.douban.com/top250?start=25&filter=
Request Method: GET
Status Code: 200 OK
Remote Address: 154.8.131.165:443
Referrer Policy: unsafe-url

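The crawler code is as follows:
import urllib.request
import urllib.parse

url = "https://movie.douban.com/top250?"
start = int(input("Enter the page number of the movie list to view: "))
# Each page holds 25 movies, so page n starts at offset (n - 1) * 25
data = {
    "start": (start - 1) * 25,
    "filter": "",
}
# Send a browser-like User-Agent so the request is not rejected as a script
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"}
# Encode the parameters and append them to the base URL
data = urllib.parse.urlencode(data)
url += data
print(url)
# Build the request with the custom headers, send it, and print the decoded page
request = urllib.request.Request(url, headers=headers)
response = urllib.request.urlopen(request)
print(response.read().decode("utf8"))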
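
The same steps can be split into a URL builder and a fetch function, which makes the offset arithmetic easy to check without touching the network. This is only a sketch: the names `BASE_URL`, `PAGE_SIZE`, `build_page_url` and `fetch_page` are hypothetical, the shortened User-Agent is a placeholder, and the page size of 25 is inferred from `start=25` in the request URL shown earlier.

```python
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://movie.douban.com/top250"
PAGE_SIZE = 25  # assumption: inferred from start=25 for the second page


def build_page_url(page):
    """Return the URL for a 1-based page number."""
    if page < 1:
        raise ValueError("page must be >= 1")
    query = urllib.parse.urlencode({"start": (page - 1) * PAGE_SIZE, "filter": ""})
    return f"{BASE_URL}?{query}"


def fetch_page(page, timeout=10):
    """Fetch one page and return its HTML, or None if the request fails."""
    request = urllib.request.Request(
        build_page_url(page),
        headers={"User-Agent": "Mozilla/5.0"},  # placeholder User-Agent
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except urllib.error.URLError as exc:  # also covers HTTPError
        print(f"request failed: {exc}")
        return None


print(build_page_url(2))  # https://movie.douban.com/top250?start=25&filter=
```

Calling `fetch_page(2)` would then perform the actual request; the timeout and the error handling keep a slow or blocked request from hanging or crashing the script.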
