Cracking 58's font-based anti-scraping with Python

Reposted article, 2019-05-01 21:29. Tags: crawler / python

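1. Open the target site, 58同城 (58.com).

2. Press F12 to inspect the page elements.

3. Hover over a number in the element inspector. The numbers show up as garbled characters rather than digits.

4. The class on the tag that wraps the garbled text has the same name as the one used in the style tag next to it.

So I searched the page source for fangchan-secret.

The search leads to a very long string preceded by base64, which shows the data is Base64-encoded (strictly speaking an encoding, not encryption). Decoding that string yields the embedded font, which is what makes it possible to get past the anti-scraping measure.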
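The decoding step described above can be tried in isolation before touching the real site. The following is a minimal sketch, not the author's code: the function name extract_font_bytes and the sample style tag are made up for illustration, and the payload is just the bytes "demo font" rather than a real font file.

```python
import base64
import re


def extract_font_bytes(html: str) -> bytes:
    """Return the decoded bytes of the first Base64 data URI found in html."""
    match = re.search(r"base64,(.*?)\)", html, re.S)
    if match is None:
        raise ValueError("no Base64 payload found in the page")
    # Drop quotes or whitespace that may sit between the payload and ")"
    return base64.b64decode(match.group(1).strip("'\" \n"))


# Stand-in for a page: the payload is NOT a real font, only demo bytes.
payload = base64.b64encode(b"demo font").decode("ascii")
sample_html = (
    "<style>@font-face{font-family:'fangchan-secret';"
    "src:url('data:application/font-ttf;charset=utf-8;base64,"
    + payload + "')}</style>"
)

print(extract_font_bytes(sample_html))
```

On a real page the returned bytes would be the font file itself, which is what the steps below write to disk.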

The code is as follows:

1. Fetch the whole page

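    import requests  # third-party: pip install requests

    # The functions in this article are methods of a scraper class (hence self).
    def get_html(self, url):
        # Send a browser-like User-Agent with the request
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
        }
        html = requests.get(url, headers=headers).text
        return html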

2. Parse the page, extract the Base64-encoded string, and save the decoded font to an .xml file

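    import base64
    import re

    from fontTools.ttLib import TTFont  # third-party: pip install fonttools

    def get_xml(self, html):
        # Capture everything between "base64," and the closing parenthesis
        base64_str = re.search(r'base64,(.*?)\)', html, re.S).group(1)
        font_bytes = base64.b64decode(base64_str)
        file_name = "58.ttf"
        with open(file_name, 'wb') as f:
            f.write(font_bytes)
        font = TTFont(file_name)  # open the local ttf file
        font.saveXML('58.xml')    # convert it to XML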

3. Open the XML file
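For readers who would rather inspect the file from code than in an editor, here is a minimal sketch using only the standard library. It assumes the layout fontTools normally writes for a cmap table (map elements with code and name attributes). The excerpt and its code points are invented for illustration, and read_cmap is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt shaped like the cmap section of a fontTools XML dump.
SAMPLE_TTX = """<ttFont>
  <cmap>
    <tableVersion version="0"/>
    <cmap_format_4 platformID="3" platEncID="1" language="0">
      <map code="0x9476" name="glyph00004"/>
      <map code="0x958f" name="glyph00001"/>
    </cmap_format_4>
  </cmap>
</ttFont>"""


def read_cmap(xml_text):
    """Return {code point string: glyph name} for every <map> element."""
    root = ET.fromstring(xml_text)
    return {m.get("code"): m.get("name") for m in root.iter("map")}


print(read_cmap(SAMPLE_TTX))
```

For the file written in step 2, ET.parse('58.xml').getroot() would take the place of ET.fromstring.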

4. Parse the contents of cmap to build a dictionary

    def get_dict(self, html):
        # Uses base64, re and fontTools.ttLib.TTFont, like get_xml
        base64_str = re.search(r'base64,(.*?)\)', html, re.S).group(1)
        font_bytes = base64.b64decode(base64_str)
        file_name = "58.ttf"
        with open(file_name, 'wb') as f:
            f.write(font_bytes)
        font = TTFont(file_name)  # open the local ttf file
        font.saveXML('58.xml')    # convert it to XML
        cmap = font['cmap'].getBestCmap()  # code point (int) -> glyph name
        pat = re.compile(r'(\d+)')  # compiled once, outside the loop
        newdict = {}
        for code_point, glyph_name in cmap.items():
            # The digit is taken to be the number in the glyph name minus 1
            value = int(pat.search(glyph_name).group(1)) - 1
            newdict[hex(code_point)] = value
        return newdict
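To make the mapping rule in get_dict concrete, the sketch below applies the same "number in the glyph name minus 1" logic to a hand-written cmap, so it runs without a font file. The code points and glyph names are invented sample values, and build_digit_map is a hypothetical name. Unlike the method above, it skips glyph names that contain no number instead of raising.

```python
import re


def build_digit_map(cmap):
    """cmap maps code point (int) -> glyph name, as getBestCmap() returns."""
    digits = {}
    for code_point, glyph_name in cmap.items():
        match = re.search(r"(\d+)", glyph_name)
        if match is None:
            continue  # no number in the name, so nothing to map
        digits[hex(code_point)] = int(match.group(1)) - 1
    return digits


# Invented sample values, not taken from a real 58 font.
sample_cmap = {0x9476: "glyph00004", 0x958F: "glyph00001"}
print(build_digit_map(sample_cmap))
```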

5. The resulting dictionary
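The article stops at the dictionary, so the natural next step, turning the garbled characters back into digits, is sketched here as an assumption about how the dictionary would be used. decode_text, the sample dictionary, and the sample string are all hypothetical. The keys follow the hex(code point) format produced by get_dict.

```python
def decode_text(text, digit_map):
    """Replace each character whose hex code point is a key in digit_map."""
    return "".join(str(digit_map.get(hex(ord(ch)), ch)) for ch in text)


# Invented sample values in the same shape as the dictionary from step 4.
sample_map = {"0x9476": 3, "0x958f": 0}
obfuscated = "\u9476\u958f\u958f\u958f yuan/month"

print(decode_text(obfuscated, sample_map))
```

If the page source carries the characters as numeric entities such as &#x9476; instead of raw characters, html.unescape from the standard library would convert them first.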
