Scraping summoned-beast aptitude data from Fantasy Westward Journey (夢幻西游) with Python (mutated variants not included)

Reposted article (see the original). 2021-12-24 15:57. Tags: crawler / python

I. Analysis

1. Target site: https://xyq.163.com/chongwu/

2. Get the page source:

requests.get("https://xyq.163.com/chongwu/").text

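This is where a problem shows up.

The source shown by the browser's "View page source", which is also what requests returns, contains no data: the list is empty.

The source shown in the developer tools' Inspect panel does contain the data.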
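
One way to make this comparison concrete is to count the list items in each version of the HTML. The sketch below uses only the standard-library parser; the two HTML fragments are hypothetical stand-ins, not copies of the real page, and the counter looks at every li with a data-id attribute rather than only those under .xxd.

```python
from html.parser import HTMLParser


class DataIdCounter(HTMLParser):
    """Count <li> elements that carry a data-id attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag
        if tag == "li" and any(name == "data-id" for name, _ in attrs):
            self.count += 1


def count_pet_items(html):
    """Return how many <li data-id=...> elements the HTML contains."""
    parser = DataIdCounter()
    parser.feed(html)
    return parser.count


# Hypothetical fragments, NOT taken from the real site.
static_html = '<ul class="xxd"></ul>'
rendered_html = '<ul class="xxd"><li data-id="1">A</li><li data-id="2">B</li></ul>'

print(count_pet_items(static_html))    # 0
print(count_pet_items(rendered_html))  # 2
```

If the count is zero for the text returned by requests but positive for the HTML copied from the Inspect panel, the list is being filled in after the initial page load.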

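The page's code does not appear to use JSON, and I found no matching JSON data in the JS either (update 2022.3.18: there actually is a JSON file, I just didn't find it at the time 😂). I therefore concluded that the data is loaded dynamically by JS and chose the Selenium + Firefox combination to crawl the page. Selenium is a testing tool; it starts Firefox through the Firefox driver (geckodriver), so it can return the HTML as it looks after the JS has run.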

3. Parse the HTML to get each summoned beast's data-id and name

Then request a new URL for each one in a loop: https://xyq.163.com/chongwu/zhsxq.html?id=<scraped id>&type=1
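
As a small illustration of this step, the detail URL can be built from a scraped id. The helper name `detail_url` is hypothetical; the URL and the two query parameters are the ones shown above.

```python
from urllib.parse import urlencode

DETAIL_PAGE = "https://xyq.163.com/chongwu/zhsxq.html"


def detail_url(pet_id):
    """Build the detail-page URL for one scraped data-id."""
    # int() rejects anything that is not a plain number, e.g. a missing attribute
    return DETAIL_PAGE + "?" + urlencode({"id": int(pet_id), "type": 1})


print(detail_url("12"))
```

Converting the id with int() mirrors what the script in the next section does with its %d format and fails early if a data-id is missing or malformed.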

As before, the source of the new URL cannot be obtained with requests

Parse the page and extract the data from the p tags
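
To show what extracting the p-tag data amounts to, here is a minimal sketch that pulls (label, value) pairs out of markup shaped like `<p>label <span>value</span></p>`. That shape is an assumption inferred from how the article's script reads each p tag; the sample fragment and the class name `AptitudeParser` are hypothetical, and the standard-library parser is used here instead of pyquery only to keep the example dependency-free. It does not filter by the .zhszz class.

```python
from html.parser import HTMLParser


class AptitudeParser(HTMLParser):
    """Collect (label, value) pairs from <p>label <span>value</span></p>."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._in_p = False
        self._in_span = False
        self._label = []
        self._value = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # Start a fresh pair for every paragraph
            self._in_p, self._label, self._value = True, [], []
        elif tag == "span" and self._in_p:
            self._in_span = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_span = False
        elif tag == "p" and self._in_p:
            self._in_p = False
            label = "".join(self._label).strip()
            value = "".join(self._value).strip()
            self.pairs.append((label, value))

    def handle_data(self, data):
        # Text inside the span is the value, other text in the <p> is the label
        if self._in_p:
            (self._value if self._in_span else self._label).append(data)


def parse_aptitudes(html):
    parser = AptitudeParser()
    parser.feed(html)
    return parser.pairs


# Hypothetical fragment, NOT copied from the real detail page.
sample = '<div class="zhszz"><p>attack <span>1400</span></p><p>defense <span>1300</span></p></div>'
print(parse_aptitudes(sample))  # [('attack', '1400'), ('defense', '1300')]
```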

II. Implementation:

from selenium import webdriver
from pyquery import PyQuery as pq
import pandas as pd

# Run Firefox headless and reuse one browser for every page
firefox_options = webdriver.FirefoxOptions()
firefox_options.add_argument('--headless')
browser = webdriver.Firefox(options=firefox_options)

zhs_list = []           # one row per summoned beast
title = ["召喚獸名稱"]   # CSV header; the first column is the beast's name
try:
    # List page: each li under .xxd carries a data-id and the beast's name
    browser.get("https://xyq.163.com/chongwu/")
    ret = pq(browser.page_source)(".xxd li").items()
    pets = [(li.attr("data-id"), li.text()) for li in ret]
    for pet_id, name in pets:
        # Detail page of one summoned beast
        browser.get("https://xyq.163.com/chongwu/zhsxq.html?id=%d&type=1" % int(pet_id))
        ret2 = pq(browser.page_source)(".zhszz p").items()
        # Aptitude values
        zizhi_list = [name]
        for j in ret2:
            if not zhs_list:
                # Take the column titles from the first beast only
                top = j.text().split(" ")[0]
                title.append(top)
            zizhi = j('span').text()
            zizhi_list.append(zizhi)
        zhs_list.append(zizhi_list)
finally:
    # Close Firefox even if a page fails to load or parse
    browser.quit()

table = pd.DataFrame(zhs_list, columns=title)
print(table)
table.to_csv("夢幻西游召喚獸信息.csv", index=False, encoding="utf-8")
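
The script above takes the column titles from the first summoned beast only, so it assumes every detail page lists the same fields in the same order. If that assumption does not hold, one option is to key each row by its field label so that missing fields become empty cells. The sketch below swaps pandas for the standard-library csv module purely to stay dependency-free; `rows_to_csv`, the `name` column and the sample rows are hypothetical and not part of the original script.

```python
import csv
import io


def rows_to_csv(rows, name_column="name"):
    """Serialize a list of dicts to CSV text, using the union of all keys as columns."""
    columns = [name_column]
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)  # keep first-seen order
    buffer = io.StringIO()
    # restval fills in fields that a particular row does not have
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows; the second one lacks the "defense" field.
rows = [
    {"name": "A", "attack": "1400", "defense": "1300"},
    {"name": "B", "attack": "1200"},
]
print(rows_to_csv(rows))
```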

III. Problems you may run into

1. If selenium is not installed, install it first

pip3 install selenium

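2. Download the Firefox driver, geckodriver, from https://github.com/mozilla/geckodriver/releases. After downloading, put the exe file in the same directory as python.exe and you can call webdriver.Firefox() directly; otherwise you need to add the following code: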

from selenium import webdriver
from selenium.webdriver.firefox.service import Service

s = Service(r"path to your geckodriver exe file")
browser = webdriver.Firefox(service=s)
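
The two cases above (driver found automatically versus an explicit path) can be folded into one lookup. This is only a sketch: `find_geckodriver` and the fallback path are hypothetical, and the `which` parameter exists just so the lookup can be exercised without a real driver installed.

```python
import shutil


def find_geckodriver(fallback_path, which=shutil.which):
    """Return geckodriver's location on PATH, or the fallback if it is not found."""
    found = which("geckodriver")
    return found if found else fallback_path


# Hypothetical fallback location; replace it with wherever the exe was saved.
driver_path = find_geckodriver(r"C:\tools\geckodriver.exe")
print(driver_path)
```

The returned path can then be passed to Service(...) from selenium.webdriver.firefox.service, as in the snippet above.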

IV. Results:
