Java crawler: extracting the real video address from a Tencent Video playback page


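I wanted to grab some postgraduate entrance exam prep videos from a WeChat official account,

so I spent about a day building this crawler. (It is not really a crawler, more of an address resolver that can process URLs in batches; half a crawler at most.)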

 

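Enough preamble; on to the main topic.

(This article is for readers who know basic Java. If you don't, cache the video in the desktop client and convert the file format instead.)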

 

Requirements:

1. A computer with internet access and a Java environment

2. Patience

 

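Backend endpoints used:

http://vv.video.qq.com/getinfo (this one alone is enough for low-definition video)

http://vv.video.qq.com/getkey (high-definition video also needs this one)

How it works (for low-definition video; let's get the principle working first, and I'll cover high definition later when I have time):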

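Step 1:

Get the Tencent Video page URL of the video you want to download (this is easy, so I won't go into it).

This article uses https://v.qq.com/x/page/f08302y6rof.html as the example.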

 

Step 2:

Get the video's vid.

Here the vid is f08302y6rof, the string between page/ and .html in the URL above.
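Step 2 can also be done in code. The following is a minimal sketch, assuming page URLs always have the form https://v.qq.com/x/page/<vid>.html; the class name VidExtractor and the method extractVid are hypothetical helpers, not part of the program later in this article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class VidExtractor {
    // Assumption: the vid is the alphanumeric segment between "/page/" and ".html".
    private static final Pattern PAGE_URL = Pattern.compile("/page/([A-Za-z0-9]+)\\.html");

    /** Returns the vid, or null if the URL does not match the assumed pattern. */
    static String extractVid(String pageUrl) {
        Matcher m = PAGE_URL.matcher(pageUrl);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Prints f08302y6rof
        System.out.println(extractVid("https://v.qq.com/x/page/f08302y6rof.html"));
    }
}
```

Returning null for a URL that does not match makes it easy to skip bad input lines instead of failing on a substring index.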

 

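Step 3:

Replace the value of the vids parameter in the URL below with your vid (this step calls the backend endpoint, which returns a JSON message):

http://vv.video.qq.com/getinfo?vids=f08302y6rof&platform=101001&charge=0&otype=json&defn=shd

Then open that URL in a browser.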
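As a small sketch of step 3, the query string can be assembled from the vid. The parameter values are copied from the example URL above; the class name GetInfoUrl is a hypothetical helper, and the vid is assumed to be plain alphanumeric, so no URL encoding is applied.

```java
// Hypothetical helper for illustration only.
class GetInfoUrl {
    static final String ENDPOINT = "http://vv.video.qq.com/getinfo";

    /** Builds the getinfo URL for one vid, using the fixed parameters from the example. */
    static String build(String vid) {
        return ENDPOINT + "?vids=" + vid
                + "&platform=101001&charge=0&otype=json&defn=shd";
    }

    public static void main(String[] args) {
        System.out.println(build("f08302y6rof"));
    }
}
```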

 

Step 4:

In the JSON message returned to the browser, find fn and fvkey.

The response I got was:

QZOutputJson={"dltype":1,"exem":0,"fl":{"cnt":2,"fi":[{"id":100701,"name":"msd","lmt":0,"sb":1,"cname":"標清;(270P)","br":29,"profile":2,"drm":0,"video":1,"audio":1,"fs":101567331,"super":0,"hdr10enh":0,"sl":1},{"id":2,"name":"mp4","lmt":0,"sb":1,"cname":"高清;(480P)","br":34,"profile":1,"drm":0,"video":1,"audio":1,"fs":130427092,"super":0,"hdr10enh":0,"sl":0}]},"hs":0,"ip":"111.79.225.65","ls":0,"preview":3383,"s":"o","sfl":{"cnt":0},"tm":1556431150,"vl":{"cnt":1,"vi":[{"br":29,"ch":0,"cl":{"fc":0,"keyid":"f08302y6rof.100701"},"ct":21600,"drm":0,"dsb":0,"fmd5":"74e3040ce70af50716abead16c9fba50","fn":"f08302y6rof.m701.mp4","fs":101567331,"fst":5,"fvkey":"D351DB69FA6EC791CB6DE47266F80B21BFFFAA3616A7B42975903ED5EA68589C0E2454137002A84799CF43B4FD972B415259C1F23C21CD34F2C34BC64F6D7D16F21BF3BF94F22B09491FC9D8C96CFFA3B3177345807F34EFDDAF94449E72FC3B8C55751EE9EADC5F","head":0,"hevc":0,"iflag":0,"level":0,"lnk":"f08302y6rof","logo":1,"mst":8,"pl":null,"share":1,"sp":0,"st":2,"tail":0,"td":"3383.47","ti":"2020考研數學寒假計划(第一次課)","tie":0,"type":3,"ul":{"ui":[{"url":"http://ugcws.video.gtimg.com/uwMROfz0r5zAoaQXGdGnC2dfhzlOR5XW60pRw41PvMP8tDlH/","vt":106,"dtc":0,"dt":2},{"url":"http://ugcydzd.qq.com/uwMROfz0r5zAoaQXGdGlC2dfhznfaJdqBNmJ_NLSRfZb0kLT/","vt":146,"dtc":0,"dt":2},{"url":"http://ugcsjy.qq.com/uwMROfz0r5zAoaQXGdGlK2dfhzmm-mdByiC0fycrmmUBpCVq/","vt":176,"dtc":0,"dt":2},{"url":"http://video.dispatch.tc.qq.com/uwMROfz0r5zAoaQXGdGlLGdfhzn3bYHHUWfJ-3lk8pLFnjzb/","vt":0,"dtc":0,"dt":2}]},"vh":480,"vid":"f08302y6rof","videotype":0,"vr":0,"vst":2,"vw":270,"wh":0.5625,"wl":{"wi":[{"id":19,"x":14,"y":14,"w":84,"h":27,"a":100,"md5":"dcc9dc5c478c4100ea2817c5e6020f26","url":"http://puui.qpic.cn/vcolumn_pic/0/logo_qing_xi_color_336_108.png/0","surl":"https://puui.qpic.cn/vcolumn_pic/0/logo_qing_xi_color_336_108.png/0"}]},"uptime":1548118095,"fvideo":0,"fvpint":0,"swhdcp":0}]}};

(The response comes back as a single line. I parsed it directly in Java code, since finding the fields by eye is tiring.)
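Note that the body is not bare JSON: it is wrapped as QZOutputJson=...; so the prefix and the trailing semicolon have to be dropped before a JSON parser would accept it. The sketch below shows one way to do that and to pull out fn and fvkey with a regular expression, using only the standard library. The class and method names are hypothetical, the sample string is a heavily shortened version of the response above, and the regex assumes the values contain no escaped quotes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class GetInfoParser {
    private static final String PREFIX = "QZOutputJson=";

    /** Strips the "QZOutputJson=" prefix and the trailing ';' so plain JSON remains. */
    static String stripWrapper(String body) {
        String s = body.trim();
        if (s.startsWith(PREFIX)) {
            s = s.substring(PREFIX.length());
        }
        if (s.endsWith(";")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    /** Returns the value of the first "key":"value" string pair, or null if absent. */
    static String firstStringField(String json, String key) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(key) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened sample; the real response has many more fields and a much longer fvkey.
        String body = "QZOutputJson={\"vl\":{\"vi\":[{\"fn\":\"f08302y6rof.m701.mp4\",\"fvkey\":\"D351DB69\"}]}};";
        String json = stripWrapper(body);
        System.out.println(firstStringField(json, "fn"));
        System.out.println(firstStringField(json, "fvkey"));
    }
}
```

A real JSON library would be more robust than a regex, but this keeps the example free of external dependencies.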

 

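Step 5:

Use the fn and fvkey you found to build the video download URL.

The URL built here is:

http://ugcws.video.gtimg.com/f08302y6rof.m701.mp4?vkey=2E657DF01414A1F95E0B3CF7F187CEB84B3E439F5D0BA2D7F052967654DEFDE53292F0BE8BCD373FA0F269BA6BE5CC1AD5CC4AEE269AB0B1C72261815608260190B1D14D9B1820B0394DAB0C8DA1D8561F3B3455FBE5BA27D618C81A0A233256DDDAB6429E3A05FF

Put your fn in place of the short file name before ?vkey=,

and your fvkey in place of the long string after vkey=.

(Note that the vkey in this example is not the fvkey from the response shown in step 4; it was presumably captured from a separate request.)

This is the complete download URL; you can download it with Xunlei.

Done.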
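Step 5 is plain string concatenation. A minimal sketch, assuming the same host as the example URL; the class name DownloadUrl is hypothetical.

```java
// Hypothetical helper for illustration only.
class DownloadUrl {
    // Host taken from the example download URL in this article.
    static final String HOST = "http://ugcws.video.gtimg.com/";

    /** Joins host, file name (fn) and key (fvkey) into the download URL. */
    static String build(String fn, String fvkey) {
        if (fn == null || fn.isEmpty() || fvkey == null || fvkey.isEmpty()) {
            throw new IllegalArgumentException("fn and fvkey must both be non-empty");
        }
        return HOST + fn + "?vkey=" + fvkey;
    }

    public static void main(String[] args) {
        // "ABC123" is a placeholder, not a real key.
        System.out.println(build("f08302y6rof.m701.mp4", "ABC123"));
    }
}
```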

 

The source code follows (please point out any mistakes or non-idiomatic code; it runs on my machine):

package catchVedio;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

/**
 * Fetches the JSON returned by the video endpoint
 * @author Administrator
 *
 */
public class CatchVedio {
//    Socket client = new Scoket();
    private URL url;
    private HttpURLConnection urlConnection;
    private int responseCode;
    private BufferedReader reader;
    private BufferedWriter writer;
    
    
    public static void main(String[] args) {
        CatchVedio cv = new CatchVedio();
        try {
            String[] VedioURL = cv.get_VedioURL();// one getinfo URL per input line
            for(String temp:VedioURL) {// temp is the getinfo URL for one video
                cv.toDownloadURL(cv.analyse(cv.get_Json(temp)));// write the download URL to the file
            }
        } catch (IOException e) {
            e.printStackTrace();
        }finally {
            try {
                // A stream is still null if it was never opened
                if (cv.reader != null) {
                    cv.reader.close();
                }
                if (cv.writer != null) {
                    cv.writer.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    
    void toDownloadURL(String real_url) throws IOException {// write one download URL to the output file
        if (this.writer == null) {// open the file once, in append mode, instead of once per URL
            this.writer = new BufferedWriter(new FileWriter("D:/worm/downloadURL.txt",true));
        }
        this.writer.write(real_url+"\r\n");
        this.writer.flush();
    }
    
    String analyse(String json) throws IOException {// parse the response and return the full download URL
        String fvkeyMark = "\"fvkey\":\"";
        String fnMark = "\"fn\":\"";
        int fvkey_index = json.indexOf(fvkeyMark);
        int fn_index = json.indexOf(fnMark);
        if (fvkey_index < 0 || fn_index < 0) {// e.g. the empty body of a non-200 response
            throw new IOException("fn or fvkey not found in response: " + json);
        }
        fvkey_index += fvkeyMark.length();
        String fvkey = json.substring(fvkey_index,json.indexOf("\"",fvkey_index));// the fvkey value

        fn_index += fnMark.length();
        String fn = json.substring(fn_index,json.indexOf("\"",fn_index));// the video file name

        String head = "http://ugcws.video.gtimg.com/";

        StringBuilder real_url = new StringBuilder();
        real_url.append(head);// host
        real_url.append(fn).append("?");// file name
        real_url.append("vkey=").append(fvkey);// unlock key
        return real_url.toString();
    }
    
    String get_Json(String url) throws UnsupportedEncodingException, IOException {
        String line = "";
        StringBuilder sb = new StringBuilder();
        this.url = new URL(url);
        this.urlConnection = (HttpURLConnection)this.url.openConnection();
        this.responseCode = this.urlConnection.getResponseCode();
        if (this.responseCode == 200) {
            // Local reader, closed automatically, so this.reader (the input file) is not overwritten
            try (BufferedReader in = new BufferedReader(new InputStreamReader(this.urlConnection.getInputStream(), "UTF-8"))) {
                while ((line = in.readLine()) != null) {
                    sb.append(line);// the endpoint returns a single line
                }
            }
            return sb.toString();
        }
        return "";// non-200 response: nothing to parse
    }
    
    String[] get_VedioURL() throws IOException {
        File file = new File("D:/worm/vedioURL.txt");// one page URL per line
        String line = "";
        this.reader = new BufferedReader(new FileReader(file));
        String[] t = new String[0];
        List<String> container = new ArrayList<String>();
        while(null!=(line = this.reader.readLine())) {
            if(line.equals("")) {// skip blank lines
                continue;
            }
            line = this.change(line);// convert the page URL to a getinfo URL
            container.add(line);// add it to the list
        }
        return container.toArray(t);
    }
    /**
     * Extracts the vid from a playback page URL and returns the backend getinfo URL.
     * Result format:
     * http://vv.video.qq.com/getinfo?vids=x0164ytbgov&platform=101001&charge=0&otype=json&defn=shd
     * Example page URLs:
     * https://v.qq.com/x/page/f08302y6rof.html
     * https://v.qq.com/x/page/y083158hphd.html
     * https://v.qq.com/x/page/c08503oe58c.html
     * @param str playback page URL
     * @return getinfo URL for that video
     */
    String change(String str) {
        String head = "http://vv.video.qq.com/getinfo?vids=";
        String tail = "&platform=101001&charge=0&otype=json&defn=shd";
        String vid = str.substring(str.indexOf("page/")+5,str.indexOf(".html"));
        return head+vid+tail;
    }
}

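Both input and output go through files: page URLs are read from D:/worm/vedioURL.txt, and the download URLs are appended to D:/worm/downloadURL.txt.

 

I hope this helps.

That's all.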
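For reference, the same file handling can be written more compactly with java.nio.file. This is only a sketch of an alternative, not what the program above does: the class name UrlFiles is hypothetical, it uses temporary files instead of the D:/worm paths, and Files.write uses the platform line separator rather than a hard-coded \r\n.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only.
class UrlFiles {
    /** Reads page URLs, one per line, skipping blank lines. */
    static List<String> readPageUrls(Path input) throws IOException {
        List<String> urls = new ArrayList<>();
        for (String line : Files.readAllLines(input, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                urls.add(trimmed);
            }
        }
        return urls;
    }

    /** Appends one URL as its own line, creating the file if it does not exist. */
    static void appendDownloadUrl(Path output, String url) throws IOException {
        Files.write(output, Collections.singletonList(url), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("vedioURL", ".txt");
        Path out = Files.createTempFile("downloadURL", ".txt");
        Files.write(in, Arrays.asList(
                "https://v.qq.com/x/page/f08302y6rof.html",
                "",
                "https://v.qq.com/x/page/y083158hphd.html"), StandardCharsets.UTF_8);
        for (String url : readPageUrls(in)) {
            appendDownloadUrl(out, url); // a real run would append the resolved download URL
        }
        System.out.println(Files.readAllLines(out, StandardCharsets.UTF_8));
    }
}
```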

 

