Preface
Today's post shows how to download the top 100 short videos on Bilibili. Let's get started.
Development tools
Python version: 3.6.4
Related modules:
the requests module;
the click module;
and some modules that ship with Python.
Environment setup
Install Python and add it to your PATH, then use pip to install the required modules.
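As a quick sanity check after installation, a small script can report which of the required modules are still missing. This is only a sketch; `missing_modules` is a hypothetical helper name, not something from the original project.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two third-party modules this article relies on.
print(missing_modules(['requests', 'click']))
```

An empty list means both modules can be imported; any name printed still needs a `pip install`.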
How it works
First, of course, open the page that hosts Bilibili's short videos:
http://vc.bilibili.com/p/eden/rank#/?tab=%E5%85%A8%E9%83%A8
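Then open the browser's developer tools. A quick look at the captured requests shows that requesting the following link returns the real addresses of the videos:
http://api.vc.bilibili.com/board/v1/ranking/top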
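The request must carry the following parameters:
page_size: 10 # evidently the number of videos returned per page
next_offset: # paging down shows 11 for the second page and 21 for the third, so this should be the current offset
tag: 今日熱門 # the tag ("Today's Hot"); the value is fixed
platform: pc # declares the platform; the value is fixed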
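To make the paging parameters concrete, here is a minimal sketch that builds the query string for the first few pages with the standard library. `build_page_params` is a hypothetical helper. It assumes offsets of 0, 10, 20, which is how the function later in this article computes them; the values observed in the browser (11, 21) are off by one from that, so treat the exact offsets as an assumption to verify.

```python
from urllib.parse import urlencode


def build_page_params(page_index, page_size=10):
    """Return the query parameters for a zero-based page index (hypothetical helper)."""
    return {
        'page_size': page_size,
        'next_offset': page_index * page_size,  # assumption: offsets are 0, 10, 20, ...
        'tag': '今日熱門',
        'platform': 'pc',
    }


# Print the encoded query string for the first three pages.
for page_index in range(3):
    print(urlencode(build_page_params(page_index)))
```

`urlencode` percent-encodes the tag value, which is the same form the browser sends.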
Based on the analysis above, let's define a function that automatically collects the links of the top 100 short videos on Bilibili:
import random
import time

import requests


def getVideoTopNLinks(top_n):
    '''Get the links of the top top_n short videos on Bilibili.'''
    assert top_n > 0, '<top_n> in function getVideoTopNLinks must be larger than zero.'
    print('[INFO]: Start to get video topn links...')
    info_url = 'http://api.vc.bilibili.com/board/v1/ranking/top?'
    headers = {
        'Referer': 'http://vc.bilibili.com/p/eden/rank',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36'
    }
    params_base = {
        'page_size': 10,
        # Advanced by page_size before every request, so the first request uses offset 0.
        'next_offset': -10,
        'tag': '今日熱門',
        'platform': 'pc'
    }
    video_infos = []
    while True:
        params_base['next_offset'] += params_base['page_size']
        # On the last page, only ask for the videos that are still needed.
        if top_n <= 10:
            params_base['page_size'] = top_n
            top_n = 0
        else:
            top_n = top_n - 10
        try:
            res = requests.get(info_url, params=params_base, headers=headers)
            items = res.json()['data']['items']
            for item in items:
                title = item['item']['description']
                # The title becomes the file name, so strip characters that file names cannot contain.
                for char in '\\/:*?"<>|':
                    title = title.replace(char, '')
                link = item['item']['video_playurl']
                video_infos.append([title, link])
                print('[INFO]: Got %s...' % title)
        except Exception:
            print('[Warning]: Something went wrong when getting video links...')
        if top_n <= 0:
            break
        # Pause for a random interval (up to 2 seconds) between pages.
        time.sleep(random.random() * 2)
    print('[INFO]: Finish, get %d links in total...' % (len(video_infos)))
    return video_infos
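The paging arithmetic inside the loop can be checked without touching the network. The sketch below pulls it out into a hypothetical helper, `plan_pages`, which lists the (next_offset, page_size) pair each request would use. It is an illustration of the same arithmetic, not part of the original script.

```python
def plan_pages(top_n, page_size=10):
    """Return the (next_offset, page_size) pairs needed to cover top_n videos."""
    if top_n <= 0:
        raise ValueError('top_n must be larger than zero')
    pages = []
    offset = 0
    remaining = top_n
    while remaining > 0:
        # The last page only asks for the videos that are still needed.
        size = min(page_size, remaining)
        pages.append((offset, size))
        offset += page_size
        remaining -= size
    return pages


print(plan_pages(25))  # [(0, 10), (10, 10), (20, 5)]
```

For top_n = 100 this yields ten full pages, with offsets 0 through 90.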
Next, write a function that downloads a video:
import os
from contextlib import closing

import click
import requests


def downloadVideo(video_info, savepath):
    '''Download a single video.'''
    # checkDir is not shown in this article; it is expected to make sure savepath exists.
    checkDir(savepath)
    savename, video_link = '%s.mp4' % video_info[0], video_info[1]
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36'
    }
    # stream=True avoids loading the whole file into memory; verify=False skips TLS certificate checks.
    with closing(requests.get(video_link, headers=headers, stream=True, verify=False)) as res:
        if res.status_code == 200:
            total_size = int(res.headers['content-length'])
            label = '[%s, FileSize]:%0.2f MB' % (savename, total_size/(1024*1024))
            with click.progressbar(length=total_size, label=label) as progressbar:
                with open(os.path.join(savepath, savename), "wb") as f:
                    for chunk in res.iter_content(chunk_size=1024):
                        if chunk:
                            f.write(chunk)
                            # Advance by the bytes actually written; the last chunk may be shorter than 1024.
                            progressbar.update(len(chunk))
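The download function calls checkDir, which this article never shows. A minimal sketch follows, assuming it only has to create the directory when it is missing; the return value (whether the directory already existed) is also an assumption made for illustration.

```python
import os
import tempfile


def checkDir(dirpath):
    """Create dirpath if it is missing; return True if it already existed (assumed behavior)."""
    if os.path.exists(dirpath):
        return True
    os.makedirs(dirpath)
    return False


# Try it in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'videos')
    first = checkDir(target)   # the directory does not exist yet, so it is created
    second = checkDir(target)  # the directory exists now
    created = os.path.isdir(target)
```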
Finally, loop over the list of video links and download each one, and the job is done:
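# video_infos is the list returned by getVideoTopNLinks; savepath is the target directory.
for video_info in video_infos:
    try:
        downloadVideo(video_info, savepath)
    except Exception:
        print('[Warning]: Fail to download %s...' % video_info[1])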
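The loop above only prints a warning when a download fails. If you want to retry later, one option is to collect the failed entries. The sketch below is an illustration only: `downloadAll` and `fake_download` are hypothetical names, and the download function is passed in as a parameter so the logic can be tried with a stand-in instead of real network requests.

```python
def downloadAll(video_infos, savepath, download_func):
    """Call download_func for every entry and return the entries that failed."""
    failed = []
    for video_info in video_infos:
        try:
            download_func(video_info, savepath)
        except Exception as err:
            print('[Warning]: Fail to download %s (%s)...' % (video_info[1], err))
            failed.append(video_info)
    return failed


def fake_download(video_info, savepath):
    """Stand-in downloader that fails for one specific link."""
    if video_info[1].endswith('bad.mp4'):
        raise IOError('simulated failure')


demo_infos = [['a', 'http://example.com/a.mp4'], ['b', 'http://example.com/bad.mp4']]
failed = downloadAll(demo_infos, 'videos', fake_download)
print(failed)
```

In real use you would pass downloadVideo as download_func and, if needed, run the returned list through the same function again.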
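If you enjoyed this article, please give it a like. Follow me for daily Python crawler examples; the next article uses a Python crawler to analyze the hot posts on the FishC forum.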