Given a paragraph and a list of banned words, return the most frequent word that is not in the list of banned words. It is guaranteed there is at least one word that isn't banned, and that the answer is unique.
Words in the list of banned words are given in lowercase, and free of punctuation. Words in the paragraph are not case sensitive. The answer is in lowercase.
Example:
Input:
paragraph = "Bob hit a ball, the hit BALL flew far after it was hit."
banned = ["hit"]
Output: "ball"
Explanation:
"hit" occurs 3 times, but it is a banned word.
"ball" occurs twice (and no other word does), so it is the most frequent non-banned word in the paragraph.
Note that words in the paragraph are not case sensitive, that punctuation is ignored (even if adjacent to words, such as "ball,"), and that "hit" isn't the answer even though it occurs more because it is banned.
Note:
- 1 <= paragraph.length <= 1000.
- 1 <= banned.length <= 100.
- 1 <= banned[i].length <= 10.
- The answer is unique, and written in lowercase (even if its occurrences in paragraph may have uppercase symbols, and even if it is a proper noun.)
- paragraph only consists of letters, spaces, or the punctuation symbols !?',;.
- There are no hyphens or hyphenated words.
- Words only consist of letters, never apostrophes or other punctuation symbols.
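This problem gives us a string, a sentence containing many words along with punctuation, plus an array of strings that works like a blacklist, and asks us to return the most frequent word in the sentence that is not on the blacklist. The requirement is simple and clear, so the approach can be simple and blunt too. Since the word we return cannot be in the blacklist, every word we count has to be checked against it. To make that efficient, we put all the blacklisted words into a HashSet, so each lookup takes constant time on average.

Next we need to process the paragraph string. Since it may contain punctuation, we filter it first, using the two built-in functions isalpha() and tolower(). Writing them yourself in a helper function would not be hard either, but we are being lazy here. We go through every character: if it is not a letter, we replace it with a space, and if it is an uppercase letter, we replace it with its lowercase form.

Then we use the C++ string stream class (istringstream) to read the words. In Java you could just call split directly, which is really slick, but the blogger stubbornly writes C++. No big deal, you get used to it. Here we also split on whitespace and read the words out one at a time. We need a variable mx to track the highest frequency seen so far, and a HashMap that maps each word to its frequency. For each word we read, if it is not in the blacklist and its count after adding 1 is greater than mx, we update mx and the result res. See the code below: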
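#include <cctype>
#include <sstream>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {
public:
    string mostCommonWord(string paragraph, vector<string>& banned) {
        // Blacklist as a hash set for constant-time (average) lookups.
        unordered_set<string> ban(banned.begin(), banned.end());
        // Word -> number of occurrences seen so far.
        unordered_map<string, int> strCnt;
        int mx = 0;
        // Lowercase every letter and turn everything else into a space.
        for (auto &c : paragraph) c = isalpha(c) ? tolower(c) : ' ';
        istringstream iss(paragraph);
        string t = "", res = "";
        // Read whitespace-separated words; keep the most frequent non-banned one.
        while (iss >> t) {
            if (!ban.count(t) && ++strCnt[t] > mx) {
                mx = strCnt[t];
                res = t;
            }
        }
        return res;
    }
};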
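As mentioned above, isalpha() and tolower() could be replaced by a hand-written helper. The following is only a rough sketch of that variant, not part of the original solution: the names normalizeChar and mostCommonWordSinglePass are made up for illustration, and the code assumes the input is plain ASCII, as the problem's notes guarantee. It also skips istringstream and builds each word while scanning the paragraph once.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the lowercase form of an ASCII letter,
// or '\0' if the character is not a letter.
static char normalizeChar(char c) {
    if (c >= 'a' && c <= 'z') return c;
    if (c >= 'A' && c <= 'Z') return static_cast<char>(c - 'A' + 'a');
    return '\0';
}

// Hypothetical single-pass variant of the solution above.
std::string mostCommonWordSinglePass(const std::string& paragraph,
                                     const std::vector<std::string>& banned) {
    std::unordered_set<std::string> ban(banned.begin(), banned.end());
    std::unordered_map<std::string, int> count;
    std::string word, res;
    int mx = 0;
    // Go one position past the end so the last word is flushed as well.
    for (std::size_t i = 0; i <= paragraph.size(); ++i) {
        char c = (i < paragraph.size()) ? normalizeChar(paragraph[i]) : '\0';
        if (c != '\0') {
            word += c;  // still inside a word
        } else if (!word.empty()) {
            // A word just ended: count it unless it is banned.
            if (!ban.count(word) && ++count[word] > mx) {
                mx = count[word];
                res = word;
            }
            word.clear();
        }
    }
    return res;
}
```

Calling mostCommonWordSinglePass with the paragraph from the example and banned = {"hit"} should give "ball", the same as the stream-based version. The trade-off is a little more bookkeeping in exchange for not modifying a copy of the paragraph first.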
References:
https://leetcode.com/problems/most-common-word/
https://leetcode.com/problems/most-common-word/discuss/124286/Clean-6ms-C%2B%2B-solution
