Scraping the videos on a Douyin user's profile page in 60 lines of code


Batch-scraping Douyin videos in 60 lines of code

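I won't go into how the crawler works here and will just post the code, mainly so that I can grab it easily myself. Anyone who needs it is welcome to take it, and please leave a like while you're at it.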

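How to use: open Douyin, go to a user's profile page, tap the three dots in the top-right corner, tap Share, then tap Copy Link. Run the program, enter the link, and wait for it to finish (remember to remove the words “抖音,記錄美好生活” from the copied text first). The program then downloads every video that user has uploaded.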
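The copied share text carries extra words around the link, which is why they have to be trimmed by hand. As a minimal sketch, assuming the link is the only `http(s)` URL in the pasted text, a regular expression could pull it out instead. The helper name `extract_share_url` is hypothetical and not part of the script below.

```python
import re

def extract_share_url(share_text):
    """Return the first http(s) URL found in the pasted share text, or None."""
    # Match printable ASCII only, so surrounding Chinese text and spaces end the URL.
    match = re.search(r'https?://[\x21-\x7e]+', share_text)
    return match.group(0) if match else None

# Sample input made up for illustration; the link is the example one from the script.
print(extract_share_url('抖音,記錄美好生活 https://v.douyin.com/eRENmGV/'))
```

With a helper like this, the whole copied text could be pasted at the prompt and the result passed on as `root_url`.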

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
# @Time : 2021-03-15
# @Author : wind_leaf
import requests
import json
import re
import sys

headers = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'accept-language': 'zh-CN,zh;q=0.9,en;q=0.8',
    'pragma': 'no-cache',
    'cache-control': 'no-cache',
    'upgrade-insecure-requests': '1',
    'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1',
}
''' Example of the post-list API request (query string split across lines for readability):
https://www.iesdouyin.com/web/api/v2/aweme/post/?
sec_uid=MS4wLjABAAAAeAIH1d_98INk5rNXF9Q4zrbGK9d1Eumyydy7qKL1WPk&
count=21&
max_cursor=0&
aid=1128&
_signature=j0NkqgAA7x12uIyl2MgN6I9DZL&dytk=
'''
''' Example redirect response body for a share link; the target URL carries sec_uid:
<a href="https://www.iesdouyin.com/share/user/72673737181?u_code=17fc9cg0a&amp;did=69773896663&amp;iid=1302254358919767&amp;sec_uid=MS4wLjABAAAAeAIH1d_98INk5rNXF9Q4zrbGK9d1Eumyydy7qKL1WPk&amp;timestamp=1615603669&amp;utm_source=copy&amp;utm_campaign=client_share&amp;utm_medium=android&amp;share_app_name=douyin">Found</a>.

'''
'''e.g. https://v.douyin.com/eRENmGV/    # 一條小團團OvO'''

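root_url = input('Enter the share link of the user whose videos you want to download: ').strip()
max_cursor = 0      # pagination cursor
has_more = True     # whether there is a next page
page = 1        # page counter (roughly 20 videos per page)
# Do not follow the redirect: the target URL in the Location header carries sec_uid
response = requests.get(url=root_url, headers=headers, allow_redirects=False)
sec_uid = re.search(r'sec_uid=([^&]+)', response.headers['location']).group(1)     # unique user id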

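while has_more:
    video_lis = []
    print(f'Fetching video URLs for page {page}---')
    response = requests.get(url=f'https://www.iesdouyin.com/web/api/v2/aweme/post/?sec_uid={sec_uid}&count=21&max_cursor={max_cursor}&aid=1128&_signature=dpcuDQAAFtyPbMYCi7BbQ3aXLh&dytk=', headers=headers)
    print(response.text)    # raw JSON, useful for debugging
    result = json.loads(response.text)
    # Note: an empty aweme_list leaves max_cursor and has_more unchanged,
    # so the same cursor is requested again on the next iteration.
    if result['aweme_list']:
        max_cursor = result['max_cursor']   # cursor for the next page
        has_more = result['has_more']
        for video_data in result['aweme_list']:
            dic = {'desc': video_data['desc']}
            # Prefer the third play address; fall back to the first if fewer are returned
            urls = video_data['video']['play_addr']['url_list']
            dic['url'] = urls[2] if len(urls) > 2 else urls[0]
            video_lis.append(dic)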
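    print('Starting download---')
    for i, video in enumerate(video_lis):
        print(f"Page {page}, video {i+1}: {video['desc']}")
        # stream=True fetches the body chunk by chunk instead of loading it all into memory
        response = requests.get(url=video['url'], headers=headers, stream=True)
        # content-length may be absent; report 0 in that case
        content_size = int(response.headers.get('content-length', 0))
        sys.stdout.write('----[File size]: %0.2f MB\n' % (content_size / 1024 / 1024))

        # Replace characters that are not allowed in file names; use an index if desc is empty
        name = re.sub(r'[\\/:*?"<>|\r\n]', '_', video['desc']).strip() or f'{page}_{i+1}'
        with open(name + '.mp4', 'wb') as f:
            for data in response.iter_content(chunk_size=1024):
                f.write(data)
    page += 1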
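
The script pulls `sec_uid` out of the redirect `location` header by pattern matching. A possible alternative, sketched here with the standard library's `urllib.parse`, is to parse the query string properly, which also works when `sec_uid` happens to be the last parameter. The helper name `get_sec_uid` is hypothetical, and the sample URL is a shortened form of the redirect target quoted in the script's comments.

```python
from urllib.parse import urlparse, parse_qs

def get_sec_uid(location):
    """Return the sec_uid query parameter from a redirect URL, or None if absent."""
    values = parse_qs(urlparse(location).query).get('sec_uid')
    return values[0] if values else None

# Shortened version of the redirect target shown in the script's comments.
location = ('https://www.iesdouyin.com/share/user/72673737181'
            '?u_code=17fc9cg0a'
            '&sec_uid=MS4wLjABAAAAeAIH1d_98INk5rNXF9Q4zrbGK9d1Eumyydy7qKL1WPk'
            '&timestamp=1615603669')
print(get_sec_uid(location))
```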

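The paging in the script follows `max_cursor` until `has_more` turns false. The sketch below isolates that loop so it can be tried without network access, and adds a retry limit for empty pages. It is an illustration under assumptions, not the author's code: `iter_pages`, `fetch_page` and `max_empty` are hypothetical names, and the fake pages only mimic the three fields the script reads.

```python
def iter_pages(fetch_page, max_empty=3):
    """Yield each non-empty aweme_list, following max_cursor until has_more is false.

    fetch_page(cursor) is supplied by the caller and must return the decoded
    JSON dict for one page of the API.
    """
    cursor, empty = 0, 0
    while True:
        result = fetch_page(cursor)
        items = result.get('aweme_list') or []
        if not items:
            empty += 1
            if empty >= max_empty:   # give up instead of retrying forever
                break
            continue                 # retry the same cursor
        empty = 0
        yield items
        cursor = result['max_cursor']
        if not result['has_more']:
            break

# Fake responses keyed by cursor, standing in for real API pages.
fake = {
    0: {'aweme_list': ['a', 'b'], 'max_cursor': 10, 'has_more': True},
    10: {'aweme_list': ['c'], 'max_cursor': 20, 'has_more': False},
}
pages = list(iter_pages(lambda cursor: fake[cursor]))
print(pages)
```

In the real script, `fetch_page` would wrap the `requests.get` call to the post-list API and return the parsed JSON.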

