[LeetCode] Encode and Decode TinyURL | Short URL Algorithms

Reposted article, 2017-06-14 16:03. Tags: Design / hash table

https://leetcode.com/problems/encode-and-decode-tinyurl

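One approach is to assign each requested longURL an integer, starting from 0 and increasing by one per request. The integer serves both as the encoding of the longURL and as its index in storage. To decode a short URL, parse the integer out of it and look up the original long URL.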

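#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    // Encodes a URL to a shortened URL: the id is the URL's index in long_urls.
    string encode(string longUrl) {
        long_urls.push_back(longUrl);
        return "http://t.com/" + std::to_string(long_urls.size() - 1);
    }

    // Decodes a shortened URL to its original URL.
    string decode(string shortUrl) {
        auto pos = shortUrl.find_last_of('/');
        // Parse the integer id after the last '/'; stoul throws on malformed input.
        auto id = std::stoul(shortUrl.substr(pos + 1));
        return long_urls.at(id);  // at() throws instead of reading out of bounds
    }

private:
    vector<string> long_urls;  // long_urls[id] is the URL that was encoded as id
};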

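The incrementing approach guarantees that every encoding is unique, but it has an obvious drawback: encoding the same longURL twice produces two different results, which wastes ids and storage. A hash table from long URL to id removes that waste, but the incrementing approach still exposes the short-URL counter to users, which may be a security risk.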
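The hash-table fix mentioned above could look roughly like the sketch below. It is only an illustration: the class name `DedupCounterCodec`, the member names, and the choice to return an empty string for unknown ids are assumptions, not part of the original solution.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical class: counter-based ids plus a hash table so that
// the same long URL always maps to the same id.
class DedupCounterCodec {
public:
    // Returns the existing short URL if longUrl was seen before,
    // otherwise assigns the next integer id.
    std::string encode(const std::string& longUrl) {
        auto it = ids_.find(longUrl);
        if (it == ids_.end()) {
            it = ids_.emplace(longUrl, longUrls_.size()).first;
            longUrls_.push_back(longUrl);
        }
        return prefix_ + std::to_string(it->second);
    }

    // Returns the original URL, or an empty string for an unknown or malformed id.
    std::string decode(const std::string& shortUrl) const {
        auto pos = shortUrl.find_last_of('/');
        if (pos == std::string::npos) return "";
        const std::string digits = shortUrl.substr(pos + 1);
        // Cap the length so the manual parse below cannot overflow.
        if (digits.empty() || digits.size() > 18) return "";
        unsigned long long id = 0;
        for (char c : digits) {
            if (c < '0' || c > '9') return "";
            id = id * 10 + static_cast<unsigned long long>(c - '0');
        }
        return id < longUrls_.size() ? longUrls_[id] : "";
    }

private:
    const std::string prefix_ = "http://t.com/";
    std::unordered_map<std::string, std::size_t> ids_;  // long URL -> id
    std::vector<std::string> longUrls_;                 // id -> long URL
};
```

Encoding the same URL twice now returns the same short URL, so no id is wasted. The ids are still sequential, so the counter remains visible to users.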

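A better approach is to build the short URL from a character string. Using only digits and letters gives 10 + 2*26 = 62 symbols. Variable-length encoding is certainly feasible, but its rules can get complicated, and fixed-length encoding is sufficient. As for the length, Sina Weibo reportedly uses 7 characters, and $62^7 \approx 3.5 \times 10^{12}$, which is far more than the number of URLs on the Internet today. So a workable scheme is: for each new long URL, pick 7 characters at random from the 62 to form its key and store the pair in a hash table. If the key is already in use, generate another until an unused one is found, although collisions are very unlikely. To decode a short URL, look up its key in the hash table.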
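To check the key-space figure and the claim that collisions are rare, here is a small sketch. The function names `keySpace` and `collisionChance` are made up for illustration, and the collision estimate assumes each character of a key is drawn independently and uniformly.

```cpp
#include <cstdint>

// Number of distinct fixed-length keys: alphabetSize^length.
// Assumes the result fits in 64 bits, which holds for 62^7.
inline std::uint64_t keySpace(std::uint64_t alphabetSize, unsigned length) {
    std::uint64_t total = 1;
    for (unsigned i = 0; i < length; ++i) total *= alphabetSize;
    return total;
}

// Chance that one freshly drawn random key equals one of `used` keys
// already stored, out of `space` equally likely keys.
inline double collisionChance(std::uint64_t used, std::uint64_t space) {
    return static_cast<double>(used) / static_cast<double>(space);
}
```

`keySpace(62, 7)` evaluates to 3521614606208. Even with one billion keys already stored, `collisionChance` gives roughly 0.03% per draw, so the retry loop would almost always finish on the first attempt.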

In addition, to avoid wasting keys, a second hash table can record the short URL already assigned to each long URL.

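#include <random>
#include <string>
#include <unordered_map>
using namespace std;

class Solution {
public:
    Solution()
        : dict("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"),
          len_tiny(7),
          rng(random_device{}()) {}

    // Encodes a URL to a shortened URL.
    string encode(string longUrl) {
        // Reuse the existing key if this URL has been encoded before.
        auto it = long2short.find(longUrl);
        if (it != long2short.end()) {
            return "http://t.com/" + it->second;
        }
        // Draw random keys until an unused one is found.
        string tiny = random_key();
        while (short2long.count(tiny)) {
            tiny = random_key();
        }
        long2short[longUrl] = tiny;
        short2long[tiny] = longUrl;
        return "http://t.com/" + tiny;
    }

    // Decodes a shortened URL to its original URL.
    // An unknown short URL is returned unchanged.
    string decode(string shortUrl) {
        auto pos = shortUrl.find_last_of('/');
        auto it = short2long.find(shortUrl.substr(pos + 1));
        return it != short2long.end() ? it->second : shortUrl;
    }

private:
    // Picks len_tiny characters from dict independently, so repeated
    // characters are allowed and all 62^7 keys are reachable.
    // (std::random_shuffle, used originally, was removed in C++17 and
    // never repeats a character within a key.)
    string random_key() {
        uniform_int_distribution<size_t> pick(0, dict.size() - 1);
        string key;
        for (int i = 0; i < len_tiny; ++i) key += dict[pick(rng)];
        return key;
    }

    unordered_map<string, string> short2long, long2short;
    string dict;
    int len_tiny;
    mt19937 rng;
};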

References:

http://www.mamicode.com/info-detail-1724865.html
How to Design a Short URL Service (TinyURL), https://soulmachine.gitbooks.io/system-design/content/cn/distributed-id-generator.html
How to Design a Short URL System (TinyURL), http://cn.soulmachine.me/2017-04-10-how-to-design-tinyurl/
