[LeetCode] 3. Longest Substring Without Repeating Characters 最長無重復字符的子串

本文轉載自查看原文 2015-05-06 01:05 52476 LeetCode

Given a string, find the length of the longest substring without repeating characters.

Example 1:

Input: "abcabcbb"
Output: 3 Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: "bbbbb"
Output: 1 Explanation: The answer is "b", with the length of 1.

Example 3:

Input: "pwwkew"
Output: 3 Explanation: The answer is "wke", with the length of 3. Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

這道求最長無重復子串的題和之前那道 Isomorphic Strings 很類似，屬於 LeetCode 早期經典題目，博主認為是可以跟 Two Sum 媲美的一道題。給了我們一個字符串，讓求最長的無重復字符的子串，注意這里是子串，不是子序列，所以必須是連續的。先不考慮代碼怎么實現，如果給一個例子中的例子 "abcabcbb"，讓你手動找無重復字符的子串，該怎么找。博主會一個字符一個字符的遍歷，比如 a，b，c，然后又出現了一個a，那么此時就應該去掉第一次出現的a，然后繼續往后，又出現了一個b，則應該去掉一次出現的b，以此類推，最終發現最長的長度為3。所以說，需要記錄之前出現過的字符，記錄的方式有很多，最常見的是統計字符出現的個數，但是這道題字符出現的位置很重要，所以可以使用 HashMap 來建立字符和其出現位置之間的映射。進一步考慮，由於字符會重復出現，到底是保存所有出現的位置呢，還是只記錄一個位置？我們之前手動推導的方法實際上是維護了一個滑動窗口，窗口內的都是沒有重復的字符，需要盡可能的擴大窗口的大小。由於窗口在不停向右滑動，所以只關心每個字符最后出現的位置，並建立映射。窗口的右邊界就是當前遍歷到的字符的位置，為了求出窗口的大小，需要一個變量 left 來指向滑動窗口的左邊界，這樣，如果當前遍歷到的字符從未出現過，那么直接擴大右邊界，如果之前出現過，那么就分兩種情況，在或不在滑動窗口內，如果不在滑動窗口內，那么就沒事，當前字符可以加進來，如果在的話，就需要先在滑動窗口內去掉這個已經出現過的字符了，去掉的方法並不需要將左邊界 left 一位一位向右遍歷查找，由於 HashMap 已經保存了該重復字符最后出現的位置，所以直接移動 left 指針就可以了。維護一個結果 res，每次用出現過的窗口大小來更新結果 res，就可以得到最終結果啦。

這里可以建立一個 HashMap，建立每個字符和其最后出現位置之間的映射，然后需要定義兩個變量 res 和 left，其中 res 用來記錄最長無重復子串的長度，left 指向該無重復子串左邊的起始位置的前一個，由於是前一個，所以初始化就是 -1，然后遍歷整個字符串，對於每一個遍歷到的字符，如果該字符已經在 HashMap 中存在了，並且如果其映射值大於 left 的話，那么更新 left 為當前映射值。然后映射值更新為當前坐標i，這樣保證了 left 始終為當前邊界的前一個位置，然后計算窗口長度的時候，直接用 i-left 即可，用來更新結果 res。

這里解釋下程序中那個 if 條件語句中的兩個條件 m.count(s[i]) && m[s[i]] > left，因為一旦當前字符 s[i] 在 HashMap 已經存在映射，說明當前的字符已經出現過了，而若 m[s[i]] > left 成立，說明之前出現過的字符在窗口內，那么如果要加上當前這個重復的字符，就要移除之前的那個，所以讓 left 賦值為 m[s[i]]，由於 left 是窗口左邊界的前一個位置（這也是 left 初始化為 -1 的原因，因為窗口左邊界是從0開始遍歷的），所以相當於已經移除出滑動窗口了。舉一個最簡單的例子 "aa"，當 i=0 時，建立了 a->0 的映射，並且此時結果 res 更新為1，那么當 i=1 的時候，發現a在 HashMap 中，並且映射值0大於 left 的 -1，所以此時 left 更新為0，映射對更新為 a->1，那么此時 i-left 還為1，不用更新結果 res，那么最終結果 res 還為1，正確，代碼如下：

C++ 解法一：

class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        int res = 0, left = -1, n = s.size();
        unordered_map<int, int> m;
        for (int i = 0; i < n; ++i) {
            if (m.count(s[i]) && m[s[i]] > left) {
                left = m[s[i]];  
            }
            m[s[i]] = i;
            res = max(res, i - left);            
        }
        return res;
    }
};

下面這種寫法是上面解法的精簡模式，這里我們可以建立一個 256 位大小的整型數組來代替 HashMap，這樣做的原因是 ASCII 表共能表示 256 個字符，但是由於鍵盤只能表示 128 個字符，所以用 128 也行，然后全部初始化為 -1，這樣的好處是不用像之前的 HashMap 一樣要查找當前字符是否存在映射對了，對於每一個遍歷到的字符，直接用其在數組中的值來更新 left，因為默認是 -1，而 left 初始化也是 -1，所以並不會產生錯誤，這樣就省了 if 判斷的步驟，其余思路都一樣：

C++ 解法二：

class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        vector<int> m(128, -1);
        int res = 0, left = -1;
        for (int i = 0; i < s.size(); ++i) {
            left = max(left, m[s[i]]);
            m[s[i]] = i;
            res = max(res, i - left);
        }
        return res;
    }
};

Java 解法二：

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        int[] m = new int[256];
        Arrays.fill(m, -1);
        int res = 0, left = -1;
        for (int i = 0; i < s.length(); ++i) {
            left = Math.max(left, m[s.charAt(i)]);
            m[s.charAt(i)] = i;
            res = Math.max(res, i - left);
        }
        return res;
    }
}

下面這種解法使用了 HashSet，核心算法和上面的很類似，把出現過的字符都放入 HashSet 中，遇到 HashSet 中沒有的字符就加入 HashSet 中並更新結果 res，如果遇到重復的，則從左邊開始刪字符，直到刪到重復的字符停止：

C++ 解法三：

class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        int res = 0, left = 0, i = 0, n = s.size();
        unordered_set<char> t;
        while (i < n) {
            if (!t.count(s[i])) {
                t.insert(s[i++]);
                res = max(res, (int)t.size());
            }  else {
                t.erase(s[left++]);
            }
        }
        return res;
    }
};

Java 解法三：

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        int res = 0, left = 0, right = 0;
        HashSet<Character> t = new HashSet<Character>();
        while (right < s.length()) {
            if (!t.contains(s.charAt(right))) {
                t.add(s.charAt(right++));
                res = Math.max(res, t.size());
            } else {
                t.remove(s.charAt(left++));
            }
        }
        return res;
    }
}

Github 同步地址：

https://github.com/grandyang/leetcode/issues/3

類似題目：

Longest Substring with At Most Two Distinct Characters

Longest Substring with At Most K Distinct Characters

Subarrays with K Different Integers

參考資料：

https://leetcode.com/problems/longest-substring-without-repeating-characters/

https://leetcode.com/problems/longest-substring-without-repeating-characters/discuss/1737/C++-code-in-9-lines.

https://leetcode.com/problems/longest-substring-without-repeating-characters/discuss/1812/Share-my-Java-solution-using-HashSet

https://leetcode.com/problems/longest-substring-without-repeating-characters/discuss/1729/11-line-simple-Java-solution-O(n)-with-explanation

LeetCode All in One 題目講解匯總(持續更新中...)

免責聲明！

本站轉載的文章為個人學習借鑒使用，本站對版權不負任何法律責任。如果侵犯了您的隱私權益，請聯系本站郵箱yoyou2525@163.com刪除。