[LeetCode] 642. Design Search Autocomplete System


 

Design a search autocomplete system for a search engine. Users may input a sentence (at least one word, ending with a special character '#'). For each character they type except '#', you need to return the top 3 historical hot sentences that have the same prefix as the part of the sentence already typed. Here are the specific rules:

  1. The hot degree for a sentence is defined as the number of times a user typed the exactly same sentence before.
  2. The returned top 3 hot sentences should be sorted by hot degree (The first is the hottest one). If several sentences have the same degree of hot, you need to use ASCII-code order (smaller one appears first).
  3. If less than 3 hot sentences exist, then just return as many as you can.
  4. When the input is a special character, it means the sentence ends, and in this case, you need to return an empty list.
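
Rules 1 to 3 amount to a sort order plus a cutoff. Here is a minimal sketch, assuming the candidates arrive as (sentence, count) pairs. The helper names `hotter` and `topThree` are hypothetical and are not part of the problem's API:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: true if a should be listed before b under rules 1-2.
bool hotter(const std::pair<std::string, int>& a,
            const std::pair<std::string, int>& b) {
    if (a.second != b.second) return a.second > b.second;  // higher hot degree first
    return a.first < b.first;                              // tie: smaller ASCII order first
}

// Hypothetical helper: at most three sentences, hottest first (rule 3).
std::vector<std::string> topThree(std::vector<std::pair<std::string, int>> candidates) {
    std::sort(candidates.begin(), candidates.end(), hotter);
    std::vector<std::string> res;
    for (std::size_t i = 0; i < candidates.size() && i < 3; ++i) {
        res.push_back(candidates[i].first);
    }
    return res;
}
```

Sorting every candidate is the simplest way to state the rule. A bounded heap that keeps only three entries avoids sorting the whole list.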

Your job is to implement the following functions:

The constructor function:

AutocompleteSystem(String[] sentences, int[] times): This is the constructor. The input is historical data. Sentences is a string array consisting of previously typed sentences. Times is the corresponding number of times each sentence has been typed. Your system should record this historical data.

Now, the user wants to input a new sentence. The following function will provide the next character the user types:

List<String> input(char c): The input c is the next character typed by the user. The character will only be lower-case letters ('a' to 'z'), blank space (' ') or a special character ('#'). Also, the previously typed sentence should be recorded in your system. The output will be the top 3 historical hot sentences that have prefix the same as the part of sentence already typed.

 

Example:
Operation: AutocompleteSystem(["i love you", "island","ironman", "i love leetcode"], [5,3,2,2]) 
The system has already tracked the following sentences and their corresponding times: 
"i love you" : 5 times 
"island" : 3 times 
"ironman" : 2 times 
"i love leetcode" : 2 times 
Now, the user begins another search: 

Operation: input('i') 
Output: ["i love you", "island","i love leetcode"] 
Explanation: 
There are four sentences that have prefix "i". Among them, "ironman" and "i love leetcode" have the same hot degree. Since ' ' has ASCII code 32 and 'r' has ASCII code 114, "i love leetcode" should be in front of "ironman". Also, we only need to output the top 3 hot sentences, so "ironman" will be ignored. 

Operation: input(' ') 
Output: ["i love you","i love leetcode"] 
Explanation: 
There are only two sentences that have prefix "i "

Operation: input('a') 
Output: [] 
Explanation: 
There are no sentences that have prefix "i a"

Operation: input('#') 
Output: [] 
Explanation: 
The user finished the input, the sentence "i a" should be saved as a historical sentence in system. And the following input will be counted as a new search. 

 

Note:

    1. The input sentence will always start with a letter and end with '#', and only one blank space will exist between two words.
    2. The number of complete sentences to be searched won't exceed 100. The length of each sentence, including those in the historical data, won't exceed 100.
    3. Please use double-quote instead of single-quote when you write test cases even for a character input.
    4. Please remember to RESET your class variables declared in class AutocompleteSystem, as static/class variables are persisted across multiple test cases.

 

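This problem asks us to implement a simple search autocomplete system. When we search with Google or Baidu, typing a few words makes the search box pop up complete sentences that begin with what we typed, and that is exactly a search autocomplete system. According to the problem, the suggested sentences are ordered by how often they appeared before, with the most frequent at the top; on a tie they are shown in ASCII order. Input arrives one character at a time, and after each character we return the autocompleted sentences. A '#' character marks the end of a complete sentence.

So we certainly need a HashMap that maps each sentence to its frequency, plus a string data that stores the characters typed so far. The constructor receives some sentences and their counts; we add them directly to the HashMap and initialize data to the empty string. In the input function, we first check whether the input character is '#'. If it is, the current data string is a complete sentence, so we increment its count in the HashMap, clear data, and return an empty list. Otherwise we append the character to data, and now need to find the three most frequent sentences that have data as a prefix.

We use a priority queue for this. The idea is to always keep the three hottest sentences in the queue, so the sentence with the lowest frequency (or, on a tie, the larger one in ASCII order) should sit at the front, where it can be removed at any time. The queue is therefore a min-heap. It holds pairs of a sentence and its frequency, ordered by frequency, which requires a custom comparator. We then iterate over all sentences in the HashMap and first verify that data is a prefix of the sentence. There is no particularly clever way to do this, so we compare character by character with a flag matched initialized to true; on a mismatch we set matched to false and break. If matched is still true, data is a prefix, so we push the pair into the priority queue. If the queue now holds more than three elements, we remove the front element; because it is a min-heap, the least frequent sentence is removed first. Finally we move the queue's elements into the result res. Since the least frequent sentence leaves the queue first, we fill res from the back. See the code below: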

 

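#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>
using namespace std;

class AutocompleteSystem {
public:
    AutocompleteSystem(vector<string> sentences, vector<int> times) {
        // Record the historical sentences and how often each was typed.
        for (size_t i = 0; i < sentences.size(); ++i) {
            freq[sentences[i]] += times[i];
        }
        data = "";
    }
    vector<string> input(char c) {
        if (c == '#') {
            // End of sentence: record it and start a new search.
            ++freq[data];
            data = "";
            return {};
        }
        data.push_back(c);
        // cmp(a, b) is true when a is hotter than b, so the top of the queue
        // is the least hot candidate (on a tie, the larger string in ASCII order).
        auto cmp = [](const pair<string, int>& a, const pair<string, int>& b) {
            return a.second > b.second || (a.second == b.second && a.first < b.first);
        };
        priority_queue<pair<string, int>, vector<pair<string, int>>, decltype(cmp)> q(cmp);
        for (const auto& f : freq) {
            bool matched = true;
            for (size_t i = 0; i < data.size(); ++i) {
                // Stop at the first mismatch, or when the sentence is shorter than data.
                if (i >= f.first.size() || data[i] != f.first[i]) {
                    matched = false;
                    break;
                }
            }
            if (matched) {
                q.push(f);
                if (q.size() > 3) q.pop();  // keep only the three hottest
            }
        }
        // The least hot sentence is popped first, so fill res from the back.
        vector<string> res(q.size());
        for (int i = (int)q.size() - 1; i >= 0; --i) {
            res[i] = q.top().first; q.pop();
        }
        return res;
    }

private:
    unordered_map<string, int> freq;  // sentence -> number of times typed
    string data;                      // characters typed so far
};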
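
The character-by-character prefix test in the solution can also be written with `std::string::compare`. This is a minimal sketch of that alternative, not the approach used in the solution above. The helper name `hasPrefix` is hypothetical:

```cpp
#include <string>

// Hypothetical helper: true if `prefix` is a prefix of `sentence`.
bool hasPrefix(const std::string& sentence, const std::string& prefix) {
    // compare(pos, len, str) compares sentence.substr(pos, len) with str.
    // The size check first rules out sentences shorter than the prefix.
    return sentence.size() >= prefix.size() &&
           sentence.compare(0, prefix.size(), prefix) == 0;
}
```

Inside `input`, a call such as `hasPrefix(f.first, data)` could stand in for the `matched` loop. The result is the same; the explicit loop only makes the early exit visible.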

 

GitHub mirror:

https://github.com/grandyang/leetcode/issues/642

 

Similar problems:

Implement Trie (Prefix Tree)

Top K Frequent Words

 

References:

https://leetcode.com/problems/design-search-autocomplete-system/

https://leetcode.com/problems/design-search-autocomplete-system/discuss/176550/Java-simple-solution-without-using-Trie-(only-use-HashMap-and-PriorityQueue)

https://leetcode.com/problems/design-search-autocomplete-system/discuss/105379/Straight-forward-hash-table-%2B-priority-queue-solution-in-c%2B%2B-no-trie

 

LeetCode All in One: index of problem explanations (continuously updated...)

