Original article: https://www.cnblogs.com/yoyoketang/p/9098096.html
Original article: https://www.cnblogs.com/yoyoketang/p/6886610.html
Original article: https://www.cnblogs.com/yoyoketang/
Original article: https://www.cnblogs.com/yoyoketang/p/7259993.html
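Preface
When logging in to a website you often have to pass a token parameter. Correlating the token is not hard. The hard part is finding where the server first returns the token value; once you have extracted it, you can correlate it dynamically.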
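Logging in to Lagou
1. Open the login page https://passport.lagou.com/login/login.html, log in with your account and password, and capture the traffic to inspect the details.
2. Log in again and capture the traffic once more. Two request headers are dynamic: the token and code values differ every time and can only be used once.
X-Anit-Forge-Token: 45aa69d8-4afa-4235-8957-9dde7af1903e
X-Anit-Forge-Code: 20765316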
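Finding where the token is generated
1. Open the login page https://passport.lagou.com/login/login.html and press F5 to refresh (refresh only; do not enter an account or password), then look through the returned page for the place where the token is generated.
</script>
<!-- 頁面樣式 -->
<!-- 動態token,防御偽造請求,重復提交 -->
<script>
    window.X_Anti_Forge_Token = '286fd3ae-ef82-4019-89c4-9408947a0e26';
    window.X_Anti_Forge_Code = '74603111';
</script>
The comment in the front-end code ("dynamic token, guards against forged requests and duplicate submissions") gives away the token's location, heh!
2. Next, parse the token and code values out of the returned HTML.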
# coding:utf-8
import re

import requests
from bs4 import BeautifulSoup

# Author: 上海-悠悠, QQ group: 512200893


def getTokenCode(s):
    '''
    Extract the token and code from the login page so they can be added to the request headers.
    <!-- 頁面樣式 --><!-- 動態token,防御偽造請求,重復提交 -->
    <script type="text/javascript">
        window.X_Anti_Forge_Token = 'dde4db4a-888e-47ca-8277-0c6da6a8fc19';
        window.X_Anti_Forge_Code = '61142241';
    </script>
    :param s: pass s = requests.session()
    '''
    url = 'https://passport.lagou.com/login/login.html'
    h = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0",
    }
    # Update the session headers
    s.headers.update(h)
    data = s.get(url, verify=False)
    soup = BeautifulSoup(data.content, "html.parser", from_encoding='utf-8')
    tokenCode = {}
    try:
        # The token and code live in the second <script> tag of the page
        t = soup.find_all('script')[1].get_text()
        print(t)
        tokenCode['X_Anti_Forge_Token'] = re.findall(r"Token = '(.+?)'", t)[0]
        tokenCode['X_Anti_Forge_Code'] = re.findall(r"Code = '(.+?)'", t)[0]
    except Exception:
        # Fall back to empty values if the page layout is not what we expect
        print("Failed to get token and code")
        tokenCode['X_Anti_Forge_Token'] = ""
        tokenCode['X_Anti_Forge_Code'] = ""
    return tokenCode
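The function above picks the second <script> tag by position, which would break if the page gained or lost a script. As a minimal sketch of an alternative, the regular expressions can be keyed on the full variable names and run over the whole HTML text. The helper name extract_token_code is hypothetical, and the sample string is simply the snippet captured from the login page earlier.

```python
import re

# Patterns keyed on the full JavaScript variable names, so the result does
# not depend on which <script> tag holds the values.
TOKEN_RE = re.compile(r"X_Anti_Forge_Token\s*=\s*'([^']*)'")
CODE_RE = re.compile(r"X_Anti_Forge_Code\s*=\s*'([^']*)'")


def extract_token_code(html):
    """Return the token and code found in html, or empty strings if absent."""
    token = TOKEN_RE.search(html)
    code = CODE_RE.search(html)
    return {
        "X_Anti_Forge_Token": token.group(1) if token else "",
        "X_Anti_Forge_Code": code.group(1) if code else "",
    }


# Sample taken from the page source captured earlier in this article.
sample = """<script>
window.X_Anti_Forge_Token = '286fd3ae-ef82-4019-89c4-9408947a0e26';
window.X_Anti_Forge_Code = '74603111';
</script>"""

print(extract_token_code(sample))
```

Because it works on a plain string, this version can be checked offline against a saved copy of the page before it is wired into the session code.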
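Simulating the login
1. The password parameter is encrypted at login, but the encryption scheme is fixed, so you can simply copy the encrypted string from the captured traffic.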
# coding:utf-8
import requests

# Author: 上海-悠悠, QQ group: 512200893


def login(s, gtoken, user, psw):
    '''
    function: log in to the Lagou website
    :param s: pass s = requests.session()
    :param gtoken: the tokenCode returned by the previous function, getTokenCode
    :param user: account
    :param psw: password (the encrypted string)
    :return: the response JSON
    '''
    url2 = 'https://passport.lagou.com/login/login.json'
    h2 = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "X-Requested-With": "XMLHttpRequest",
        # The header names are spelled "Anit", exactly as in the captured request
        "X-Anit-Forge-Token": gtoken['X_Anti_Forge_Token'],
        "X-Anit-Forge-Code": gtoken['X_Anti_Forge_Code'],
        "Referer": "https://passport.lagou.com/login/login.html",
    }
    # Update the session headers
    s.headers.update(h2)
    body = {
        "isValidate": 'true',
        "username": user,
        "password": psw,
        "request_form_verifyCode": "",
        "submit": ""
    }
    r2 = s.post(url2, data=body, verify=False)
    print(r2.text)
    return r2.json()
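Two details in the login request are easy to get wrong. The JavaScript variables are spelled X_Anti_Forge_*, while the captured request headers are spelled X-Anit-Forge-*. The values are also single-use, so keeping them in the session's default headers can leave a stale token behind. The sketch below isolates the mapping in a hypothetical helper, build_login_headers, which is not part of the original code.

```python
def build_login_headers(token_code):
    """Map the values scraped from the page onto the captured header names."""
    return {
        "X-Requested-With": "XMLHttpRequest",
        # Header names follow the captured request, which spells them "Anit".
        "X-Anit-Forge-Token": token_code["X_Anti_Forge_Token"],
        "X-Anit-Forge-Code": token_code["X_Anti_Forge_Code"],
        "Referer": "https://passport.lagou.com/login/login.html",
    }


# Values taken from the page source shown earlier in this article.
headers = build_login_headers({
    "X_Anti_Forge_Token": "286fd3ae-ef82-4019-89c4-9408947a0e26",
    "X_Anti_Forge_Code": "74603111",
})
for name, value in headers.items():
    print(name + ": " + value)
```

With a requests session s, the result could be passed for a single request as s.post(url2, data=body, headers=build_login_headers(gtoken), verify=False). Per-request headers are merged with the session's defaults for that call only, so the used token is not stored on the session.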
Password encryption
1. The password is hashed with MD5 (I only learned this from other people's blog posts found via Baidu).
# coding:utf-8
import hashlib


def encryptPwd(passwd):
    # Author: 上海-悠悠, QQ group: 512200893
    # The password is hashed with MD5 twice
    passwd = hashlib.md5(passwd.encode('utf-8')).hexdigest()
    # 'veenike' is a hard-coded value found in the site's JS file
    passwd = 'veenike' + passwd + 'veenike'
    passwd = hashlib.md5(passwd.encode('utf-8')).hexdigest()
    return passwd


if __name__ == "__main__":
    # Test password: 123456
    print(encryptPwd("123456"))
Output:
2. Compare the output with the captured data: the two values are identical, which shows the encryption is correct.
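The comparison against the captured value can also be done in code rather than by eye. The following is a minimal sketch: md5_hex, encrypt_pwd and matches_capture are hypothetical names, and the captured string has to be pasted in from your own packet capture, since it is not reproduced here.

```python
import hashlib

# Hard-coded value the article reports finding in the site's JS file.
SALT = "veenike"


def md5_hex(text):
    """Return the lowercase hex MD5 digest of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def encrypt_pwd(passwd, salt=SALT):
    """Hash once, wrap the digest in the salt, then hash again."""
    return md5_hex(salt + md5_hex(passwd) + salt)


def matches_capture(passwd, captured):
    """Compare the computed hash with a value copied from a packet capture."""
    return encrypt_pwd(passwd) == captured.strip().lower()


print(encrypt_pwd("123456"))
```

Calling matches_capture("123456", "<value copied from the capture>") returns True when the scheme matches what the browser sent, ignoring surrounding whitespace and letter case in the pasted value.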
Reference code:
# coding:utf-8
import hashlib
import re

import requests
import urllib3
from bs4 import BeautifulSoup

# Suppress the warnings triggered by verify=False
urllib3.disable_warnings()

# Author: 上海-悠悠, QQ group: 512200893


class LoginLgw:

    def __init__(self, s):
        self.s = s

    def getTokenCode(self):
        '''
        Extract the token and code from the login page so they can be added to the request headers.
        <!-- 頁面樣式 --><!-- 動態token,防御偽造請求,重復提交 -->
        <script type="text/javascript">
            window.X_Anti_Forge_Token = 'dde4db4a-888e-47ca-8277-0c6da6a8fc19';
            window.X_Anti_Forge_Code = '61142241';
        </script>
        '''
        url = 'https://passport.lagou.com/login/login.html'
        h = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0",
        }
        # Update the session headers
        self.s.headers.update(h)
        data = self.s.get(url, verify=False)
        soup = BeautifulSoup(data.content, "html.parser", from_encoding='utf-8')
        tokenCode = {}
        try:
            # The token and code live in the second <script> tag of the page
            t = soup.find_all('script')[1].get_text()
            print(t)
            tokenCode['X_Anti_Forge_Token'] = re.findall(r"Token = '(.+?)'", t)[0]
            tokenCode['X_Anti_Forge_Code'] = re.findall(r"Code = '(.+?)'", t)[0]
        except Exception:
            # Fall back to empty values if the page layout is not what we expect
            print("Failed to get token and code")
            tokenCode['X_Anti_Forge_Token'] = ""
            tokenCode['X_Anti_Forge_Code'] = ""
        return tokenCode

    def encryptPwd(self, passwd):
        # The password is hashed with MD5 twice
        passwd = hashlib.md5(passwd.encode('utf-8')).hexdigest()
        # 'veenike' is a hard-coded value found in the site's JS file
        passwd = 'veenike' + passwd + 'veenike'
        passwd = hashlib.md5(passwd.encode('utf-8')).hexdigest()
        return passwd

    def login(self, user, psw):
        '''
        function: log in to the Lagou website
        :param user: account
        :param psw: password (plain text; it is hashed here)
        :return: the response JSON, or None if the response is not JSON
        '''
        gtoken = self.getTokenCode()
        print(gtoken)
        print(gtoken['X_Anti_Forge_Token'])
        print(gtoken['X_Anti_Forge_Code'])
        url2 = 'https://passport.lagou.com/login/login.json'
        h2 = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0",
            "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
            "X-Requested-With": "XMLHttpRequest",
            # The header names are spelled "Anit", exactly as in the captured request
            "X-Anit-Forge-Token": gtoken['X_Anti_Forge_Token'],
            "X-Anit-Forge-Code": gtoken['X_Anti_Forge_Code'],
            "Referer": "https://passport.lagou.com/login/login.html",
        }
        # Update the session headers
        self.s.headers.update(h2)
        passwd = self.encryptPwd(psw)
        body = {
            "isValidate": 'true',
            "username": user,
            "password": passwd,
            "request_form_verifyCode": "",
            "submit": ""
        }
        r2 = self.s.post(url2, data=body, verify=False)
        try:
            print(r2.text)
            return r2.json()
        except ValueError:
            # r2.json() raises a ValueError subclass when the body is not valid JSON
            print("Login error details: %s" % r2.text)
            return None


if __name__ == "__main__":
    s = requests.session()
    lgw = LoginLgw(s)
    lgw.login("15221000000", "123456")