Python Crawler (9): POST Requests with the requests Library

Reposted article, 2020-02-27 23:17

1. Method:

response=requests.post("https://www.baidu.com/s",data=data)
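To see what data=data actually sends without touching the network, the request can be built and inspected locally. This is a minimal sketch; the form fields (wd, pn) are hypothetical values chosen for illustration and are not taken from the original article.

```python
import requests

# Hypothetical form fields, for illustration only
data = {"wd": "python", "pn": 1}

# Build the request without sending it
request = requests.Request("POST", "https://www.baidu.com/s", data=data)
prepared = request.prepare()

print(prepared.method)                   # POST
print(prepared.headers["Content-Type"])  # application/x-www-form-urlencoded
print(prepared.body)                     # wd=python&pn=1
```

A dict passed as data is form-encoded into the request body. This differs from params, which would be appended to the URL's query string instead.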

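2. Fetching job listings from Lagou

Lagou (拉勾網) has anti-crawler measures in place, and some of its pages load their data through POST requests, so this is where the post method comes in.

Search Lagou for Python-related jobs. If we crawl the search results page itself, we get no job information, because the job data is not in that page: it is loaded from a separate Ajax endpoint. To get the job data we have to request that endpoint instead. For this example, pick a city and the student (campus recruitment) filter.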

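As before, right-click the page, choose Inspect Element, open the Network panel, refresh the page, and select the request that returns the job listings.

The request URL shown in the right-hand panel is the one stored in url in the code below.

The address of the page itself, shown in the address bar, is the one stored in urls.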

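The request method shown there is POST, so fetching the job listings requires the post function.

The form fields sent with that request go into the data parameter.

The request headers are copied over as well.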
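One way to avoid repeating the headers on every call is to attach them to a Session once. The sketch below prepares a request without sending it, so the merged result can be inspected offline; the header values are shortened placeholders and the endpoint URL omits its query string, both assumptions made for brevity.

```python
import requests

# Placeholder values for illustration; copy the real ones from the browser's Network panel
headers = {
    "User-Agent": "Mozilla/5.0 (example)",
    "Referer": "https://www.lagou.com/jobs/list_python",
    "Accept": "application/json, text/javascript, */*; q=0.01",
}
data = {"first": "true", "pn": 1, "kd": "python"}

session = requests.Session()
session.headers.update(headers)  # applied to every request made through this session

# Prepare (but do not send) a POST to inspect what would go on the wire
request = requests.Request("POST", "https://www.lagou.com/jobs/positionAjax.json", data=data)
prepared = session.prepare_request(request)

print(prepared.headers["Referer"])
print(prepared.body)
```

Headers passed directly to an individual get or post call are merged with the session-level ones, so per-request overrides remain possible.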

The code is as follows:

import requests

# Ajax endpoint that returns the job listings (the request URL from the Network panel)
url = 'https://www.lagou.com/jobs/positionAjax.json?xl=%E6%9C%AC%E7%A7%91&px=default&gx=%E5%85%A8%E8%81%8C&city=%E6%88%90%E9%83%BD&needAddtionalResult=false&isSchoolJob=1'
# Form fields sent in the POST body; kd is the search keyword, pn appears to be the page number
data = {
    'first': "true",
    'pn': 1,
    'kd': "python"
}
# Browser-like headers; Referer points at the listing page
headers = {
    'User-Agent': "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
    'Referer': "https://www.lagou.com/jobs/list_python/p-city_252?px=default&gx=%E5%85%A8%E8%81%8C&gj=&xl=%E6%9C%AC%E7%A7%91&isSchoolJob=1",
    'Accept': 'application/json, text/javascript, */*; q=0.01'
}
# Listing page shown in the browser's address bar
urls = 'https://www.lagou.com/jobs/list_python/p-city_252?px=default&gx=%E5%85%A8%E8%81%8C&gj=&xl=%E6%9C%AC%E7%A7%91&isSchoolJob=1#filterBox'

# Visit the listing page first so the session picks up its cookies
s = requests.Session()
s.get(urls, headers=headers, timeout=3)
# The session sends the stored cookies automatically with the POST
response = s.post(url, data=data, headers=headers, timeout=5)

print(response.text)
# Save the response body; set the encoding explicitly because it contains Chinese text
with open('py.html', 'w', encoding='utf-8') as file:
    file.write(response.text)
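The long query string in url is easier to read once its percent-encoded values are decoded. The sketch below uses only the standard library to split the endpoint from its filters; the variable names params and endpoint are introduced here for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

url = ('https://www.lagou.com/jobs/positionAjax.json?xl=%E6%9C%AC%E7%A7%91&px=default'
       '&gx=%E5%85%A8%E8%81%8C&city=%E6%88%90%E9%83%BD&needAddtionalResult=false&isSchoolJob=1')

parts = urlsplit(url)
# Decode the percent-encoded UTF-8 values into a readable dict
params = dict(parse_qsl(parts.query))
for key, value in params.items():
    print(key, "=", value)

# Re-encoding the decoded values reproduces the original query string
assert urlencode(params) == parts.query

endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
print(endpoint)  # https://www.lagou.com/jobs/positionAjax.json
```

With requests, the decoded dict could be passed as params=params together with endpoint, which should make filters such as city easier to change than editing the encoded URL by hand.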

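While running this, the following error came up: "您操作太頻繁，請稍后再訪問" (You are making requests too frequently; please try again later). For a fix, see: http://www.freesion.com/article/140098505/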
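Independently of the linked article (whose fix is not reproduced here), a common way to cope with this kind of throttling is to pause and retry. The following is a generic sketch under stated assumptions: the helper names is_throttled and fetch_with_retry are hypothetical, and detection relies only on the message text quoted above.

```python
import time

# Marker text from the error message quoted above (simplified and traditional forms)
THROTTLE_MARKERS = ("操作太频繁", "操作太頻繁")


def is_throttled(text):
    """Return True if the response body looks like the 'too frequent' error."""
    return any(marker in text for marker in THROTTLE_MARKERS)


def fetch_with_retry(fetch, attempts=3, delay=5.0, sleep=time.sleep):
    """Call fetch() until the body is not a throttle message or attempts run out.

    fetch is any zero-argument callable that returns the response body as a string.
    """
    for attempt in range(attempts):
        text = fetch()
        if not is_throttled(text):
            return text
        if attempt < attempts - 1:
            sleep(delay * (attempt + 1))  # wait a little longer after each failure
    raise RuntimeError("still throttled after %d attempts" % attempts)
```

In the script above, fetch could be a small function that repeats the s.get call on the listing page (to refresh the cookies) and then returns s.post(...).text. Whether a pause is enough depends on the site's current anti-crawler rules.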
