How to implement a P2P torrent search in Java (4): fetching torrents

Reposted article, 2019-04-23 14:32

Fetching torrents

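In the previous part we got as far as collecting infohashes from the DHT network, so all that remains is to use an infohash to fetch the torrent, read the file names out of it, and map those names to the infohash. At that point the search data is persisted; once it is imported into ES (Elasticsearch), the search itself is done.
To fetch a torrent we have to talk to other peers, which means sending a handshake packet over the peer wire protocol. The handshake is 68 bytes: the first byte must be 19, the length of the protocol string; next comes the fixed protocol string BitTorrent protocol, which is exactly 19 bytes; then 8 reserved bytes. That makes 28 bytes so far. The last 40 bytes are the 20-byte infohash followed by our 20-byte node ID (sent as the peer ID), for a total of 68 bytes.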

@Override
public void channelActive(ChannelHandlerContext ctx) throws Exception {
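    byte[] infoHash = DHTUtil.hexStr2Bytes(this.infoHash);
    // 68-byte handshake: 28-byte prefix + 20-byte infohash + 20-byte node id
    byte[] sendBytes = new byte[68];
    // bytes 0-27: length byte 19, "BitTorrent protocol", 8 reserved bytes
    System.arraycopy(HANDSHAKE_BYTES, 0, sendBytes, 0, 28);
    // bytes 28-47: infohash of the torrent we want
    System.arraycopy(infoHash, 0, sendBytes, 28, 20);
    // bytes 48-67: our own node id, sent as the peer id
    System.arraycopy(routingTable.getNodeId(), 0, sendBytes, 48, 20);
    ctx.channel().writeAndFlush(Unpooled.copiedBuffer(sendBytes));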
}
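
The HANDSHAKE_BYTES constant used above is not shown in the article. A minimal sketch of how such a 28-byte prefix could be built, assuming the client also wants to advertise extension-protocol support (BEP 10 uses bit 0x10 of reserved byte 5 for that). The class and method names are hypothetical, not taken from the project:

```java
import java.nio.charset.StandardCharsets;

final class HandshakePrefix {
    /** Builds the 28-byte prefix: length byte, protocol string, 8 reserved bytes. */
    static byte[] build() {
        byte[] protocol = "BitTorrent protocol".getBytes(StandardCharsets.US_ASCII);
        byte[] prefix = new byte[1 + protocol.length + 8];
        prefix[0] = (byte) protocol.length; // 19
        System.arraycopy(protocol, 0, prefix, 1, protocol.length);
        // Reserved bytes live at prefix[20..27]; reserved byte 5 is prefix[25].
        // Setting bit 0x10 there signals support for the extension protocol.
        prefix[25] |= 0x10;
        return prefix;
    }

    /** Checks whether a received handshake advertises the extension protocol. */
    static boolean supportsExtensions(byte[] handshake) {
        return handshake.length >= 28 && (handshake[25] & 0x10) != 0;
    }
}
```

The same bit can be checked on the peer's reply before sending the extension handshake, so that peers without extension support can be dropped early.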

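After the BitTorrent handshake we need to send a second handshake, because not every peer supports downloading the torrent itself. Torrent (metadata) download is defined by the BEP 9 extension (bep_0009), and it is negotiated through the extension protocol handshake described in BEP 10.
This handshake carries a dictionary stored under the key m. The format is: a 4-byte length prefix, then a 1-byte BitTorrent message ID (20 for extended messages), then one byte set to 0 meaning handshake, followed by the bencoded dictionary that contains m. The official description reads: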

This message is sent as any other bittorrent message, with a 4 byte length prefix and a single byte identifying the message (the single byte being 20 in this case). At the start of the payload of the message, is a single byte message identifier. This identifier can refer to different extension messages and only one ID is specified, 0. If the ID is 0, the message is a handshake message which is described below. The layout of a general extended message follows (including the message headers used by the bittorrent protocol):
uint32_t    length prefix. Specifies the number of bytes for the entire message. (Big endian)
uint8_t     bittorrent message ID, = 20
uint8_t     extended message ID. 0 = handshake, >0 = extended message as specified by the handshake.

The sending code:

public void sendHandshakeMsg(ChannelHandlerContext ctx) throws Exception{
    // Extension handshake payload: {"m": {"ut_metadata": 1}}
    Map<String, Object> extendMessageMap = new LinkedHashMap<>();
    Map<String, Object> extendMessageMMap = new LinkedHashMap<>();
    extendMessageMMap.put("ut_metadata", 1);
    extendMessageMap.put("m", extendMessageMMap);
    byte[] tempExtendBytes = bencode.encode(extendMessageMap);
    // 4-byte length prefix + message ID + extended message ID + payload
    byte[] extendMessageBytes = new byte[tempExtendBytes.length + 6];
    extendMessageBytes[4] = 20; // BitTorrent message ID: extended message
    extendMessageBytes[5] = 0;  // extended message ID: 0 = handshake
    // the length prefix counts the two ID bytes plus the payload
    byte[] lenBytes = DHTUtil.int2Bytes(tempExtendBytes.length + 2);
    System.arraycopy(lenBytes, 0, extendMessageBytes, 0, 4);
    System.arraycopy(tempExtendBytes, 0, extendMessageBytes, 6, tempExtendBytes.length);
    ctx.channel().writeAndFlush(Unpooled.copiedBuffer(extendMessageBytes));
}
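
The extension handshake above and the metadata requests sent later share the same 6-byte header, so the framing could be factored into one helper. This is only an illustration: ExtendedMessage and frame are hypothetical names, and java.nio.ByteBuffer (big-endian by default) stands in for the project's DHTUtil.int2Bytes:

```java
import java.nio.ByteBuffer;

final class ExtendedMessage {
    /** BitTorrent message ID reserved for extended messages. */
    static final byte EXTENDED_ID = 20;

    /**
     * Wraps a bencoded payload as: 4-byte big-endian length, message ID 20,
     * extended message ID, payload. The length covers everything after itself.
     */
    static byte[] frame(int extendedMessageId, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 2 + payload.length);
        buf.putInt(2 + payload.length);
        buf.put(EXTENDED_ID);
        buf.put((byte) extendedMessageId);
        buf.put(payload);
        return buf.array();
    }
}
```

With a helper like this, the handshake would be frame(0, payload) and a metadata request would be frame(ut_metadata, payload), using whatever ID the peer announced.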

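If the reply contains ut_metadata and metadata_size, the peer supports the metadata download protocol. metadata_size is the size of the torrent's metadata in bytes. Each piece carries at most 16 KiB, so the download has to be split into pieces based on the returned metadata_size. A request has two parameters: msg_type, which is 0 (request), 1 (data) or 2 (reject), and piece, the index of the piece to download. It looks fairly simple.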

@SneakyThrows
private void sendMetadataRequest(ChannelHandlerContext ctx, String s){
    // Read each bencoded integer up to its terminating 'e' so that
    // multi-digit values are parsed in full.
    int idStart = s.indexOf("ut_metadatai") + 12;
    int ut_metadata = Integer.parseInt(s.substring(idStart, s.indexOf('e', idStart)));
    int sizeStart = s.indexOf("metadata_sizei") + 14;
    int metadata_size = Integer.parseInt(s.substring(sizeStart, s.indexOf('e', sizeStart)));
    // number of 16 KiB pieces
    int blockSize = (int) Math.ceil((double) metadata_size / (16 << 10));
    bs=blockSize;
    log.info("blocksize="+blockSize);
    // send one metadata request per piece
    for (int i = 0; i < blockSize; i++) {
        Map<String, Object> metadataRequestMap = new LinkedHashMap<>();
        metadataRequestMap.put("msg_type", 0);
        metadataRequestMap.put("piece", i);
        byte[] metadataRequestMapBytes = bencode.encode(metadataRequestMap);
        byte[] metadataRequestBytes = new byte[metadataRequestMapBytes.length + 6];
        metadataRequestBytes[4] = 20;
        metadataRequestBytes[5] = (byte) ut_metadata;
        byte[] lenBytes = DHTUtil.int2Bytes(metadataRequestMapBytes.length + 2);
        System.arraycopy(lenBytes, 0, metadataRequestBytes, 0, 4);
        System.arraycopy(metadataRequestMapBytes, 0, metadataRequestBytes, 6, metadataRequestMapBytes.length);
        ctx.channel().writeAndFlush(Unpooled.copiedBuffer(metadataRequestBytes));
    }
}
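
Searching the raw reply for substrings works only while the keys appear in exactly the expected form. Decoding the bencoded dictionary is a sturdier way to pull out the two values. The sketch below is a simplified decoder written for this example: it is not the bencode library used by the project, MiniBencode and readMetadataInfo are hypothetical names, it maps byte strings to ISO-8859-1 Java strings, and it does no bounds checking on truncated input:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MiniBencode {
    private final byte[] data;
    private int pos;

    private MiniBencode(byte[] data, int offset) {
        this.data = data;
        this.pos = offset;
    }

    /** Decodes one bencoded value (Long, String, List or Map) starting at offset. */
    static Object decode(byte[] data, int offset) {
        return new MiniBencode(data, offset).next();
    }

    private Object next() {
        char c = (char) data[pos];
        if (c == 'i') {                       // integer: i<digits>e
            pos++;
            return readLong('e');
        }
        if (c == 'l') {                       // list: l<values>e
            pos++;
            List<Object> list = new ArrayList<>();
            while (data[pos] != 'e') list.add(next());
            pos++;
            return list;
        }
        if (c == 'd') {                       // dictionary: d<key><value>...e
            pos++;
            Map<String, Object> map = new LinkedHashMap<>();
            while (data[pos] != 'e') {
                String key = (String) next();
                map.put(key, next());
            }
            pos++;
            return map;
        }
        if (c >= '0' && c <= '9') {           // byte string: <length>:<bytes>
            int len = (int) readLong(':');
            String s = new String(data, pos, len, StandardCharsets.ISO_8859_1);
            pos += len;
            return s;
        }
        throw new IllegalArgumentException("Unexpected byte at position " + pos);
    }

    private long readLong(char terminator) {
        int start = pos;
        while (data[pos] != terminator) pos++;
        long value = Long.parseLong(new String(data, start, pos - start, StandardCharsets.US_ASCII));
        pos++; // skip the terminator
        return value;
    }

    /** Returns {ut_metadata id, metadata_size}, or null if the peer lacks either. */
    @SuppressWarnings("unchecked")
    static int[] readMetadataInfo(byte[] handshakePayload) {
        Object root = decode(handshakePayload, 0);
        if (!(root instanceof Map)) return null;
        Map<String, Object> dict = (Map<String, Object>) root;
        Object m = dict.get("m");
        Object size = dict.get("metadata_size");
        if (!(m instanceof Map) || !(size instanceof Long)) return null;
        Object id = ((Map<String, Object>) m).get("ut_metadata");
        if (!(id instanceof Long)) return null;
        return new int[] { ((Long) id).intValue(), ((Long) size).intValue() };
    }
}
```

The payload passed in is assumed to be the extension handshake body with the 6-byte header already stripped. A null result corresponds to the case described above where the peer does not support metadata download.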

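Once the requests are sent, decode the responses; they contain the torrent's file names, lengths, and so on. Finally, save the parsed file names together with the infohash to the database and to ES.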
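
Per BEP 9, each data message carries one piece of the metadata, and the SHA-1 of the fully assembled metadata should equal the infohash. A rough sketch of collecting pieces and checking the hash, assuming the piece bytes have already been separated from the bencoded header of each data message. MetadataAssembler is a hypothetical helper, not part of the project:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class MetadataAssembler {
    /** Every piece is 16 KiB except possibly the last one. */
    static final int PIECE_SIZE = 16 * 1024;

    private final byte[] buffer;
    private final boolean[] received;
    private int remaining;

    MetadataAssembler(int metadataSize) {
        buffer = new byte[metadataSize];
        int pieces = (metadataSize + PIECE_SIZE - 1) / PIECE_SIZE; // ceiling division
        received = new boolean[pieces];
        remaining = pieces;
    }

    int pieceCount() {
        return received.length;
    }

    /** Stores one piece; returns true once every piece has arrived. */
    boolean addPiece(int index, byte[] piece) {
        if (index < 0 || index >= received.length) {
            throw new IllegalArgumentException("bad piece index " + index);
        }
        int offset = index * PIECE_SIZE;
        int expected = Math.min(PIECE_SIZE, buffer.length - offset);
        if (piece.length != expected) {
            throw new IllegalArgumentException("bad piece length " + piece.length);
        }
        if (!received[index]) { // ignore duplicates
            System.arraycopy(piece, 0, buffer, offset, piece.length);
            received[index] = true;
            remaining--;
        }
        return remaining == 0;
    }

    /** True if all pieces are present and their SHA-1 equals the 20-byte infohash. */
    boolean matches(byte[] infoHash) {
        return remaining == 0 && MessageDigest.isEqual(sha1(buffer), infoHash);
    }

    static byte[] sha1(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}
```

Checking the hash before parsing guards against peers that send corrupt or unrelated data; only metadata that matches would then be decoded for file names and stored.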
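That wraps up the overall approach and implementation of the torrent search. In practice, the success rate of collecting infohashes from the DHT network and then downloading the torrents is not very high. I ran it for several days and did not get much data: there were plenty of infohashes, but many peers did not support metadata download, which was the most frustrating part. You can also scrape existing magnet-link sites over HTTP; collecting from several sources like this is how you end up with a decent amount of data.
I ran into a lot of problems along the way and read a lot of documentation and other material, so getting it working in the end felt like a real achievement. I also leaned on existing Java torrent-search implementations on GitHub, and a lot of the code is copied from them, hahaha.
Finally, here is the source code again: https://github.com/mistletoe9527/dht-spider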
