Below is the crawler code for the popular apps on the Xiaomi app store.
It crawls only the first ten pages:
# coding=utf-8
import requests
from bs4 import BeautifulSoup

# Crawl the first ten pages of the Xiaomi app store top list
for count in range(1, 11):
    # Fetch the HTML of the ranking page
    wbdata = requests.get("http://app.mi.com/topList?page=" + str(count)).text
    print("開始爬取第" + str(count) + "頁")  # prints "Start crawling page N"
    soup = BeautifulSoup(wbdata, 'lxml')
    applist = soup.find(class_='applist')
    for li in applist.find_all(name='li'):
        # print('each li:', li)
        pkg_name = li.a['href']   # relative link containing the package id
        appname = li.h5.string    # app name
        category = li.p.string    # app category
        print(appname + '|' + pkg_name + '|' + category)
Result:
開始爬取第1頁
王者榮耀|/details?id=com.tencent.tmgp.sgame|網游RPG
QQ|/details?id=com.tencent.mobileqq|聊天社交
抖音短視頻|/details?id=com.ss.android.ugc.aweme|影音視聽
微信|/details?id=com.tencent.mm|聊天社交
快手|/details?id=com.smile.gifmaker|攝影攝像
…… (remaining output omitted)
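The second field of each output line is a relative link, not a bare package name. As a minimal sketch, assuming every record keeps the `name|href|category` layout shown in the result, the package id could be extracted with the standard library alone. `parse_record` is a hypothetical helper introduced here for illustration; it is not part of the original script.

```python
from urllib.parse import urlparse, parse_qs


def parse_record(line):
    """Split one 'name|href|category' line and extract the package id.

    Hypothetical helper: assumes the href and category fields
    contain no '|' characters.
    """
    # Split from the right so a '|' inside the app name would not break parsing
    appname, href, category = line.strip().rsplit("|", 2)
    # parse_qs maps each query key to a list of values, so take the first 'id'
    package = parse_qs(urlparse(href).query)["id"][0]
    return {"name": appname, "package": package, "category": category}


record = parse_record("微信|/details?id=com.tencent.mm|聊天社交")
print(record["package"])  # prints com.tencent.mm
```

Keeping the package id as its own field makes it easier to deduplicate apps across pages or to rebuild a detail-page URL later.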