[LeetCode] 471. Encode String with Shortest Length

Reposted article, dated 2016-12-18 14:12. Tag: LeetCode

Given a non-empty string, encode the string such that its encoded length is the shortest.

The encoding rule is: k[encoded_string], where the encoded_string inside the square brackets is being repeated exactly k times.

Note:

k will be a positive integer and encoded string will not be empty or have extra space.
You may assume that the input string contains only lowercase English letters. The string's length is at most 160.
If an encoding process does not make the string shorter, then do not encode it. If there are several solutions, returning any of them is fine.

Example 1:

Input: "aaa"
Output: "aaa"
Explanation: There is no way to encode it such that it is shorter than the input string, so we do not encode it.

Example 2:

Input: "aaaaa"
Output: "5[a]"
Explanation: "5[a]" is shorter than "aaaaa" by 1 character.

Example 3:

Input: "aaaaaaaaaa"
Output: "10[a]"
Explanation: "a9[a]" or "9[a]a" are also valid solutions, both of them have the same length = 5, which is the same as "10[a]".

Example 4:

Input: "aabcaabcd"
Output: "2[aabc]d"
Explanation: "aabc" occurs twice, so one answer can be "2[aabc]d".

Example 5:

Input: "abbbabbbcabbbabbbc"
Output: "2[2[abbb]c]"
Explanation: "abbbabbbc" occurs twice, but "abbbabbbc" can also be encoded to "2[abbb]c", so one answer can be "2[2[abbb]c]".
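The examples above all follow from one piece of arithmetic: writing k[body] costs the digits of k, two brackets, and the body. The sketch below spells that out. The helper names wrappedLength and worthEncoding are hypothetical and only illustrate the rule "do not encode unless it is strictly shorter"; they are not part of the problem statement or of the solutions in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length of "k[body]": the decimal digits of k, two brackets, and the body.
// Hypothetical helper for illustration only.
std::size_t wrappedLength(std::size_t k, std::size_t bodyLen) {
    assert(k >= 1);  // the problem guarantees a positive repeat count
    return std::to_string(k).size() + 2 + bodyLen;
}

// True only when "k[body]" is strictly shorter than the unit written k times.
// unitLen is the raw length of the repeated unit; bodyLen is the length of
// its (possibly already encoded) form inside the brackets.
bool worthEncoding(std::size_t k, std::size_t unitLen, std::size_t bodyLen) {
    return wrappedLength(k, bodyLen) < k * unitLen;
}
```

For a single repeated letter this gives 4 characters for "3[a]" against 3 for "aaa" (Example 1, not worth it) and 4 for "5[a]" against 5 for "aaaaa" (Example 2, worth it).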

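This problem asks us to compress a string: a repeated substring is wrapped in square brackets and prefixed with its repeat count. It is a fairly hard problem, and I only worked out the approach after reading posts from other solvers online. Dynamic programming is the right tool here.

We build a two-dimensional DP array in which dp[i][j] is the shortest encoded form of the substring of s covering the index range [i, j] (if no encoding is shorter than the substring itself, the substring is kept as-is). If s has length n, the final answer is stored in dp[0][n-1]. We then iterate over all substrings of s, from shortest to longest. For any substring [i, j], we split it at every position k with i <= k < j into two pieces, compare the combined length of dp[i][k] plus dp[k+1][j] with the length of dp[i][j], and assign the shorter string to dp[i][j].

Next, we take the substring t covering [i, j] and try to collapse it into a repetition. The trick is to append t to itself and look for the second position at which t starts inside t + t. If that position is less than the length of t, then t is made of a repeated unit. For example, with t = "abab" we get t + t = "abababab", and the second occurrence of t starts at position 2, which is less than t's length 4. So t repeats, and the repeat count is t.size() / pos = 2. The repeated unit then goes inside the brackets. Note that we should not put the raw unit there; instead we take the string stored at the corresponding dp position, because the unit may itself already have a shorter encoded form, as in Example 5. As another example, with t = "abc" we get t + t = "abcabc", and the second occurrence of t starts at position 3, which equals t's length 3, so t has no repetition and replace is just t.

Finally, we compare the length of replace with that of the string in dp[i][j] and keep the shorter one. The time complexity is O(n³) and the space complexity is O(n²), counting each string operation and each stored string as a single unit. See the code below: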

Solution 1:

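#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    string encode(string s) {
        int n = s.size();
        if (n == 0) return "";  // the problem guarantees non-empty input; guard anyway
        // dp[i][j] holds the shortest encoding found for s[i..j] (inclusive).
        vector<vector<string>> dp(n, vector<string>(n, ""));
        for (int step = 1; step <= n; ++step) {  // step = substring length
            for (int i = 0; i + step - 1 < n; ++i) {
                int j = i + step - 1;
                dp[i][j] = s.substr(i, step);  // default: leave unencoded
                // Try every split point k and keep the shorter concatenation.
                for (int k = i; k < j; ++k) {
                    const string& left = dp[i][k];
                    const string& right = dp[k + 1][j];
                    if (left.size() + right.size() < dp[i][j].size()) {
                        dp[i][j] = left + right;
                    }
                }
                // Try to write s[i..j] as count[unit], where the unit is
                // taken from dp so that it may itself be encoded.
                string t = s.substr(i, j - i + 1), replace = "";
                auto pos = (t + t).find(t, 1);
                if (pos >= t.size()) replace = t;  // t is not a repetition
                else replace = to_string(t.size() / pos) + '[' + dp[i][i + pos - 1] + ']';
                if (replace.size() < dp[i][j].size()) dp[i][j] = replace;
            }
        }
        return dp[0][n - 1];
    }
};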
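The (t + t).find(t, 1) check used in the solution above can be looked at in isolation. The following is a minimal sketch; the function name smallestRepeatUnit is hypothetical and not part of the original code. It returns the length of the shortest unit that t is built from, or t.size() when t is not a repetition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the length of the shortest unit u such that t equals u repeated
// t.size() / |u| times. Returns t.size() when t is not a repetition.
// Hypothetical helper for illustration only.
std::size_t smallestRepeatUnit(const std::string& t) {
    assert(!t.empty());
    // Search t + t for t, starting at index 1 to skip the trivial match at 0.
    // A match at index t.size() always exists, so find never returns npos.
    return (t + t).find(t, 1);
}
```

With t = "abab" this returns 2 (so the repeat count is 4 / 2 = 2), and with t = "abc" it returns 3, which equals the length and signals that nothing repeats.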

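Following a comment from reader iffalse, the method above can be optimized. If t is itself a repetition, there is no need to evaluate left.size() + right.size() < dp[i][j].size() at all. For example, if t is abcabcabcabcabc, the result will be 5[abc], and there is no need to consider 3[abc]+abcabc or abcabc+3[abc]. For a string that is itself a repetition, the shortest encoded form is n[REPEATED] (or the raw string when that is not shorter), never some left+right. So the loop over k should be moved after the t and replace code, and skipped when t repeats. This does improve running time somewhat. See the code below: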

Solution 2:

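#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    string encode(string s) {
        int n = s.size();
        if (n == 0) return "";  // the problem guarantees non-empty input; guard anyway
        // dp[i][j] holds the shortest encoding found for s[i..j] (inclusive).
        vector<vector<string>> dp(n, vector<string>(n, ""));
        for (int step = 1; step <= n; ++step) {  // step = substring length
            for (int i = 0; i + step - 1 < n; ++i) {
                int j = i + step - 1;
                dp[i][j] = s.substr(i, step);  // default: leave unencoded
                string t = s.substr(i, j - i + 1), replace = "";
                auto pos = (t + t).find(t, 1);
                if (pos < t.size()) {
                    // t is a repetition: use count[unit] if it is shorter,
                    // and skip the split loop entirely.
                    replace = to_string(t.size() / pos) + "[" + dp[i][i + pos - 1] + "]";
                    if (replace.size() < dp[i][j].size()) dp[i][j] = replace;
                    continue;
                }
                // t does not repeat: try every split point k.
                for (int k = i; k < j; ++k) {
                    const string& left = dp[i][k];
                    const string& right = dp[k + 1][j];
                    if (left.size() + right.size() < dp[i][j].size()) {
                        dp[i][j] = left + right;
                    }
                }
            }
        }
        return dp[0][n - 1];
    }
};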
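Since several outputs of the same length are acceptable, comparing against one fixed expected string is a fragile way to test either solution. One option is to decode the result and check that it expands back to the input. The decoder below is a rough sketch, not part of the original article; the names decode and decodeFrom are hypothetical, and it assumes well-formed input of the form produced by the encoding rule.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Expands k[...] groups recursively, starting at pos and stopping at the
// matching ']' or at the end of the string. Assumes well-formed input.
std::string decodeFrom(const std::string& enc, std::size_t& pos) {
    std::string out;
    while (pos < enc.size() && enc[pos] != ']') {
        if (std::isdigit(static_cast<unsigned char>(enc[pos]))) {
            // Read the full repeat count, which may have several digits.
            std::size_t k = 0;
            while (pos < enc.size() &&
                   std::isdigit(static_cast<unsigned char>(enc[pos]))) {
                k = k * 10 + static_cast<std::size_t>(enc[pos] - '0');
                ++pos;
            }
            assert(pos < enc.size() && enc[pos] == '[');
            ++pos;  // skip '['
            std::string body = decodeFrom(enc, pos);
            assert(pos < enc.size() && enc[pos] == ']');
            ++pos;  // skip ']'
            for (std::size_t r = 0; r < k; ++r) out += body;
        } else {
            out += enc[pos++];  // plain letter, copied through unchanged
        }
    }
    return out;
}

// Decodes a whole encoded string.
std::string decode(const std::string& enc) {
    std::size_t pos = 0;
    return decodeFrom(enc, pos);
}
```

A round-trip check would then assert that decode(Solution().encode(s)) == s and that the encoded string is no longer than s, which accepts "10[a]", "a9[a]" and "9[a]a" alike for Example 3.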

Similar problems:

Decode String

Number of Atoms

References:

https://leetcode.com/problems/encode-string-with-shortest-length/

https://leetcode.com/problems/encode-string-with-shortest-length/discuss/95599/Accepted-Solution-in-Java

https://leetcode.com/problems/encode-string-with-shortest-length/discuss/95605/Easy-to-understand-C%2B%2B-O(n3)-solution

https://leetcode.com/problems/encode-string-with-shortest-length/discuss/95619/C%2B%2B-O(N3)-time-O(N2)-space-solution-using-memorized-dynamic-programming-with-detail-explanations

LeetCode All in One: collected problem explanations (continuously updated...)
