Downloading videos with the Python requests library



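I. Summary

One-sentence summary:

Downloading a video works much like downloading an image. The stream parameter of the request controls whether the response body is fetched all at once as a single block (stream=False, the default) or read incrementally as a stream (stream=True).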
# Show progress while downloading a video
import requests

headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36",
}
url = "https://video.pearvideo.com/mp4/adshort/20200709/cont-1684816-15252785_adpkg-ad_hd.mp4"
# stream=True defers the body download until iter_content() is consumed
response = requests.get(url, headers=headers, stream=True)
print(response.status_code)
# Content-Length may be missing (e.g. with chunked transfer encoding), so default to 0
content_size = int(response.headers.get('content-length', 0))
print(content_size)
downloaded = 0
with open("v.mp4", "wb") as f:
    for chunk in response.iter_content(chunk_size=1024):
        f.write(chunk)
        # Count the bytes actually received; the last chunk is usually shorter than 1024
        downloaded += len(chunk)
        if content_size:
            print("Download progress: {0:.2%}".format(downloaded / content_size))
    print("Download complete")

 

 

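1. How does a crawler get the size of a video?

Read the content-length field of the response headers: response.headers['content-length']. The value is a string giving the body size in bytes, so convert it with int() before doing arithmetic on it.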
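A server is not required to send this header (it is typically absent when the body uses chunked transfer encoding), so a direct dictionary lookup can raise KeyError. The following is a minimal sketch of a more defensive lookup. The helper names parse_content_length and remote_size are hypothetical and do not come from the original article, and the use of a HEAD request to check the size without downloading the body is an assumption about what is useful here, not the article's method.

```python
from typing import Mapping, Optional


def parse_content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return the announced body size in bytes, or None if absent or malformed."""
    value = headers.get("Content-Length")
    if value is None:
        return None
    try:
        size = int(value)
    except ValueError:
        return None
    return size if size >= 0 else None


def remote_size(url: str) -> Optional[int]:
    """Ask the server for the size with a HEAD request (no body is downloaded)."""
    import requests  # imported here so parse_content_length works without it

    response = requests.head(url, allow_redirects=True, timeout=10)
    response.raise_for_status()
    return parse_content_length(response.headers)
```

With a real response object, parse_content_length(response.headers) works regardless of header capitalization, because requests stores headers in a case-insensitive mapping. With a plain dict the key must match exactly.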

 

 

 

II. Downloading videos with the Python requests library


 

import requests


# Download a video
def download(url):
    with requests.get(url, stream=True) as r:
        print('Download started...')
        with open('v.mp4', 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024):
                f.write(chunk)
    print('Download finished')


# Download a video and show progress
def download_level2(url):
    with requests.get(url, stream=True) as r:
        print('Download started...')
        # Content-Length may be absent, in which case no percentage is shown
        content_size = int(r.headers.get('content-length', 0))
        downloaded = 0
        with open('v.mp4', 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024):
                f.write(chunk)
                # Count bytes actually received so progress never exceeds 100%
                downloaded += len(chunk)
                if content_size:
                    print('Downloaded {0:.2%}'.format(downloaded / content_size))
    print('Download finished')


if __name__ == '__main__':
    URL = 'http://tb-video.bdstatic.com/tieba-smallvideo-transcode/3853363_adac7ec8907890797b3970e570aba43a_140b8b74a014_3.mp4'
    # Plain download
    # download(URL)
    # Download with progress
    download_level2(URL)
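The progress logic can be separated from the network call, which makes it easy to check without downloading anything. The sketch below assumes only that the chunks arrive as an iterable of bytes objects, which is what iter_content() yields. The names format_progress and save_chunks are hypothetical helpers, not part of requests or of the original article.

```python
from typing import Iterable, Optional


def format_progress(downloaded: int, total: Optional[int]) -> str:
    """Return a progress string; fall back to a byte count when total is unknown."""
    if total:
        # Clamp to 100% in case more bytes arrive than the server announced.
        return "{:.2%}".format(min(downloaded / total, 1.0))
    return "{} bytes".format(downloaded)


def save_chunks(chunks: Iterable[bytes], path: str,
                total: Optional[int] = None) -> int:
    """Write chunks to path, printing progress, and return the bytes written."""
    downloaded = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if not chunk:
                continue  # skip empty chunks
            f.write(chunk)
            # Use the real chunk length; the final chunk is usually shorter.
            downloaded += len(chunk)
            print("Downloaded", format_progress(downloaded, total))
    return downloaded
```

Inside a streamed request this could be called as save_chunks(r.iter_content(chunk_size=1024), 'v.mp4', total), where total is the integer Content-Length or None when the header is missing.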

 

 

 

