Python + requests: Redirects and Redirect Tracking
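I. What is a redirect
A redirect occurs when the server answers a network request by pointing the client to a different location, so the request ends up at another URL.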
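II. Why redirects are needed
Pages are typically redirected in these situations: the site is reorganized (for example, the directory structure changes), a page's address changes, a page's file extension changes (.php, .html, .asp), or a site has registered several domain names. All of these call for a redirect. Without one, visitors are likely to hit 404 errors (for example, URLs found online often return 404, possibly because the address changed and no redirect was set up).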
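III. Redirects with Python + requests
1. Redirects come in two main kinds: the 301 redirect, which is permanent, and the 302 redirect, which is temporary. The figure below, for example, shows a 302 temporary redirect.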
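As a small illustration of the two status codes above, the sketch below maps 301 and 302 to the labels used in this post. The names REDIRECT_KINDS and describe_redirect are made up for this example; HTTPStatus comes from Python's standard library.

```python
from http import HTTPStatus

# Hypothetical lookup table for the two redirect kinds discussed here.
REDIRECT_KINDS = {
    HTTPStatus.MOVED_PERMANENTLY: "permanent",  # 301
    HTTPStatus.FOUND: "temporary",              # 302
}


def describe_redirect(status_code):
    """Return 'permanent' or 'temporary', or None for any other status code."""
    # HTTPStatus members compare and hash like plain ints, so an int key works.
    return REDIRECT_KINDS.get(status_code)


print(describe_redirect(301))  # permanent
print(describe_redirect(302))  # temporary
print(describe_redirect(200))  # None
```

With requests, the same helper could be called as describe_redirect(r.status_code) on any response.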
2. Tracking redirects
import requests

url = 'http://home.cnblogs.com/u/xswt/'
r = requests.get(url, params=None, headers={'Content-Type': 'application/json'})
print(r.history)  # history holds the redirect responses, oldest first
Output:
[<Response [301]>, <Response [302]>, <Response [302]>, <Response [302]>]  # the request was redirected several times
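The printed history only shows status codes. To see where each hop went, one option is to pair every status code with its URL. This is a minimal sketch: redirect_chain is a hypothetical helper, and the SimpleNamespace objects are stand-ins that mimic the status_code, url, and history attributes of a requests Response so the example runs without network access. The example.com URLs are placeholders.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response."""
    hops = list(response.history) + [response]
    return [(hop.status_code, hop.url) for hop in hops]


# Stand-in objects (illustrative only, not real requests responses).
first = SimpleNamespace(status_code=301, url="http://example.com/old", history=[])
final = SimpleNamespace(status_code=200, url="https://example.com/new", history=[first])

for status, url in redirect_chain(final):
    print(status, url)
```

Passing a real response, as in redirect_chain(r), should work the same way, since only those three attributes are read.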
3. Getting the redirect URLs with Python + requests:
import requests

url = 'http://home.cnblogs.com/u/xswt/'
r = requests.get(url, headers={"Content-Type": "application/json"})
redirect_list = r.history  # a list of Response objects, one per redirect
print(f'Redirect history: {redirect_list}')
print(f'Headers of the first redirect response: {redirect_list[0].headers}')
print(f'Final redirect URL: {redirect_list[-1].headers["location"]}')
Output:
Redirect history: [<Response [301]>, <Response [302]>, <Response [302]>, <Response [302]>]
Headers of the first redirect response: {'Date': 'Fri, 06 Sep 2019 06:53:05 GMT', 'Content-Length': '0', 'Connection': 'keep-alive', 'Location': 'https://home.cnblogs.com/u/xswt/'}
Final redirect URL: https://account.cnblogs.com/signin?returnUrl=http%3a%2f%2fhome.cnblogs.com%2fu%2fxswt%2f
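Indexing into r.history raises IndexError when the request was not redirected at all, and a Location header may hold a relative path. The sketch below handles both cases. redirect_targets and final_url are hypothetical helper names, and the SimpleNamespace objects are stand-ins for requests responses with placeholder example.com URLs. In requests itself, r.url already holds the final URL, so this is mainly useful when every intermediate target is wanted.

```python
from types import SimpleNamespace
from urllib.parse import urljoin


def redirect_targets(response):
    """Return the absolute URL that each redirect in response.history pointed to."""
    targets = []
    for hop in response.history:
        location = hop.headers.get("Location")
        if location is not None:
            # Location may be relative, so resolve it against the hop's own URL.
            targets.append(urljoin(hop.url, location))
    return targets


def final_url(response):
    """Return the last redirect target, or the response's own URL if none exists."""
    targets = redirect_targets(response)
    return targets[-1] if targets else response.url


# Stand-in objects (illustrative only).
hop1 = SimpleNamespace(url="http://example.com/a", headers={"Location": "/b"}, history=[])
hop2 = SimpleNamespace(url="http://example.com/b",
                       headers={"Location": "https://example.com/login"}, history=[])
last = SimpleNamespace(url="https://example.com/login", headers={}, history=[hop1, hop2])

print(redirect_targets(last))
print(final_url(last))
```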
4. Disabling and re-enabling redirects with Python + requests
''' Disable redirects (allow_redirects=False) '''
import requests

url = 'http://home.cnblogs.com/u/xswt/'
r = requests.get(url, headers={"Content-Type": "application/json"}, allow_redirects=False)
print(r.status_code)
print(r.text)
Output:
301
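With redirects disabled, the caller decides whether to follow each hop. The sketch below shows one way to do that by hand, under some assumptions: follow_manually, its fetch parameter, and REDIRECT_CODES are hypothetical names, and fetch is any callable that returns an object with status_code and headers. A dictionary of stand-in responses plays that role here so the example runs offline.

```python
from types import SimpleNamespace
from urllib.parse import urljoin

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_manually(url, fetch, max_hops=10):
    """Follow redirects one hop at a time and return the list of URLs visited."""
    visited = [url]
    for _ in range(max_hops):
        response = fetch(url)
        location = response.headers.get("Location")
        if response.status_code not in REDIRECT_CODES or location is None:
            return visited
        # Resolve a possibly relative Location against the current URL.
        url = urljoin(url, location)
        visited.append(url)
    raise RuntimeError(f"gave up after {max_hops} redirects")


# Stand-in responses keyed by URL (illustrative only).
pages = {
    "http://example.com/old": SimpleNamespace(status_code=301, headers={"Location": "/new"}),
    "http://example.com/new": SimpleNamespace(status_code=200, headers={}),
}

print(follow_manually("http://example.com/old", pages.__getitem__))
```

With requests, fetch could be something like lambda u: requests.get(u, allow_redirects=False). The max_hops limit is there so that a redirect loop ends with an error instead of running forever.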
''' Re-enable redirects (allow_redirects=True, which is the default for GET) '''
import requests

url = 'http://home.cnblogs.com/u/xswt/'
r = requests.get(url, headers={"Content-Type": "application/json"}, allow_redirects=True)
print(r.status_code)
print(r.text)
Output:
200
<!DOCTYPE html><html lang=zh><head><meta charset=utf-8><meta http-equiv=X-UA-Compatible content="IE=EDGE"><meta name=viewport content="width=device-width, initial-scale=1, shrink-to-fit=no"><title>用戶登錄 - 博客園</title><link rel="shortcut icon" href=//common.cnblogs.com/favicon.ico type=image/x-icon><script src="/assets/account/signin-iconfont.js?v=01OkrFmCBcVIQNTQ6W3Q8sMKdVgWbmPjCL6jUR8-WG0"></script><link rel=stylesheet href="/assets/commons.bundle.css?v=Oz63dDHd7T_Cfz5h2Sq0d3vui_UXH--HRn9V4awJQzk"><link rel=stylesheet href="/assets/shared/_card.css?v=IL3_1zWqtnCRPXGhVd5DWxlqIbzUxrVAMDMRBgNJqr0"><link rel=stylesheet href="/assets/account/signin.css?v=OQC4pMzU7K-SBw0eOIhORW9tPgMtc8t_KMFfauwhOe4"><script>window.captcha={captchaType:'Geetest'};</script><body><!--[if IE]><div class=unsupported-browser>該頁面不支持 Internet Explorer 瀏覽器,建議使用 <a href="https://www.google.cn/intl/zh-CN/chrome/">Google Chrome</a>, <a href="https://www.mozilla.org/zh-CN/firefox/">Firefox</a> 或 <a href="https://www.microsoftedgeinsider.com/zh-CN/">Microsoft Edge</a></div><![endif]--><div class=center-container><div class="center-body card h-sm-100"><div class=card-body><div class="login-top text-center"><span class=login-title>博客園用戶登錄</span> <a href="https://www.cnblogs.com/"> <svg class=login-sign><use xlink:href=#icon-login-sign></use></svg> </a><div class=login-info>代碼改變世界</div></div><form id=loginForm method=post onsubmit="return false" action="/signin?returnurl=http%3A%2F%2Fhome.cnblogs.com%2Fu%2Fxswt%2F"><div class=form-group><input tabindex=1 class=form-control placeholder=登錄用戶名 autofocus type=text data-val=true data-val-required=請輸入登錄用戶名 id=LoginName name=LoginName> <span class="invalid-feedback field-validation-valid" data-valmsg-for=LoginName data-valmsg-replace=true></span> <a href=//passport.cnblogs.com/GetUsername.aspx class=txt-forget-sign>忘記登錄用戶名</a></div><div class=form-group><input tabindex=2 class=form-control placeholder=密碼 type=password data-val=true data-val-required=請輸入密碼
id=Password name=Password> <span class="invalid-feedback field-validation-valid" data-valmsg-for=Password data-valmsg-replace=true></span> <a class=txt-forget-sign href=/resetpassword>忘記密碼</a></div><div class="form-check form-remember"><input tabindex=3 type=checkbox id=IsRemember name=IsRemember value=true> <label class=label-remember for=IsRemember>記住我</label></div><button tabindex=4 id=submitBtn type=submit class="btn-login btn btn-primary btn-sm ladda-button px-4" data-style=slide-down> <span class=ladda-label>登錄</span> </button><div class=login-footer><div class=ajax-error-box><div class="ajax-error mb-2"></div></div><span>沒有帳戶,<a href=/signup>立即注冊</a></span></div><input name=__RequestVerificationToken type=hidden value=CfDJ8BQYbW6Qx5RFuF4UTI7QvU0JhTrWuqHSETm-ZBHqozMUxn_xVSGIuIjhJup5YFxpPklNDOD4T8n4eWmtuKVsaDDIYZfq53CJV9nH8hmpuWAnu9T-D8XnbDP7ouAqv6uHIjB_jLDh33Ncimy9Z6h8yec></form><input type=hidden id=PublicKey name=PublicKey value=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCp0wHYbg/NOPO3nzMD3dndwS0MccuMeXCHgVlGOoYyFwLdS24Im2e7YyhB0wrUsyYf0/nhzCzBK8ZC9eCWqd0aHbdgOQT6CuFQBMjbyGYvlVYU2ZP7kG9Ft6YV6oc9ambuO7nPZh+bvXH0zDKfi02prknrScAKC0XhadTHT3Al0QIDAQAB></div></div></div><script src="/assets/commons.bundle.js?v=hoU0LpMUGe-JXAnP-fFZtpXo0z2NRIKd7lcM9-aTiyw"></script><script src="/assets/shared/_withoutnav.js?v=y4G8garzujN3d6jIVIcqucumyuGzj_F89wPux5sCv80"></script><script src="/assets/account/signin.js?v=ZN5IPajeQxzfOVgdZ7bt4ZCCvcPFYWL-4fLGYVaP1Jk"></script>