LeetCode Notes: Applying Dynamic Programming to String Matching



1. [10. Regular Expression Matching]

1.1 Problem

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Note:

  • s could be empty and contains only lowercase letters a-z.
  • p could be empty and contains only lowercase letters a-z, and characters like . or *.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 4:

Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".

Example 5:

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false

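1.2 Idea && Approach

The dp array

First build a two-dimensional array whose columns correspond to the string S and whose rows correspond to the pattern P. dp[i][j] records whether P[0:i] (the first i characters of P) matches S[0:j] (the first j characters of S). P therefore matches S exactly when dp[len(P)][len(S)] == True. Note that dp[0][0] is the case where both S and P are empty; this always matches, so dp[0][0] = True.

The remaining work is to fill in the matrix cell by cell. To do that, first initialize dp[0][j] and dp[i][0], that is, row 0 and column 0.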

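  1. In row 0, P is empty, so nothing except the empty S matches. Hence dp[0][j] = False for every j >= 1.

  2. In column 0, we have to decide whether P can match the empty S. With S empty, a match is possible only if P is empty or P contains "*", so only these two cases need handling.

    1. When P is empty it matches S, so dp[0][0] = True.
    2. When P contains "*", for example "a*b*": "*" matches zero or more of the preceding character, so it can cancel that character and reduce the pair to "". Hence dp[i][0] = dp[i-2][0] and P[i-1] == "*". Why test dp[i-2][0] rather than dp[i-1][0]? dp[i-1][0] says whether P[0:i-1] matches the empty S, and P[0:i-1] still ends with the character that "*" is about to cancel. Once "*" cancels it, what remains is P[0:i-2], so the cell that matters is dp[i-2][0]. For example, P = "a*b*" can match the empty string "": at i = 2, j = 0 we need dp[0][0] == True to obtain dp[2][0] == True.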

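At this point the dp matrix is initialized: row 0 and column 0 (shaded green in the original figure) hold their final values, and the remaining blank cells are still to be filled.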
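The column-0 rule can be sketched on its own. This is only a minimal illustration: the helper name `init_first_column` is hypothetical and not part of the original solution, and it returns just column 0 of the table rather than the whole matrix.

```python
def init_first_column(p):
    """Return column 0 of the dp table: entry i says whether p[:i] matches ""."""
    col0 = [False] * (len(p) + 1)
    col0[0] = True  # the empty pattern matches the empty string
    for i in range(2, len(p) + 1):
        # A pair "x*" can be dropped as a whole, so look two rows up.
        col0[i] = p[i - 1] == "*" and col0[i - 2]
    return col0


print(init_first_column("a*b*"))  # [True, False, True, False, True]
print(init_first_column("a*b"))   # [True, False, True, False]
```

Entry 1 is always False because a single character (or a stray leading "*") can never match the empty string.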

Next, fill in the rest of the dp matrix. For dp[i][j] (i >= 1, j >= 1) there are the following cases:

  1. P[i-1] == "*":

    This case splits into two sub-cases:

    1. "*" cancels the preceding character, i.e. the pair matches the empty string:

      As described above, dp[i][j] = dp[i-2][j].

    2. "*" matches the preceding character N times (N >= 1):

      This requires (P[i-2] == "." or S[j-1] == P[i-2]) and dp[i][j-1] == True. Why? If the pair ending at P[i-1] consumes S[j-1], then the same prefix P[0:i] must already match S[0:j-1], which is what dp[i][j-1] records; "*" then absorbs one more character.

      For example, "a.*" matches "abb".

  2. P[i-1] == "." or P[i-1] is an ordinary character:

    This case is much simpler: dp[i][j] = (S[j-1] == P[i-1] or P[i-1] == ".") and dp[i-1][j-1] == True.
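The same case analysis can also be read top-down. The following is a minimal sketch, not the implementation used in this note: the names `is_match_topdown` and `match` are hypothetical, and `functools.lru_cache` stands in for the explicit table. Each branch corresponds to one of the cases listed above.

```python
from functools import lru_cache


def is_match_topdown(s, p):
    """Memoized recursion: match(i, j) answers "does p[:i] match s[:j]?"."""

    @lru_cache(maxsize=None)
    def match(i, j):
        if i == 0:
            return j == 0  # an empty pattern matches only an empty string
        if p[i - 1] == "*":
            if i < 2:
                return False  # a leading "*" has nothing to repeat
            # Sub-case 1: the pair "x*" matches the empty string.
            if match(i - 2, j):
                return True
            # Sub-case 2: "x*" absorbs s[j-1]; the pattern prefix stays the same.
            return j > 0 and p[i - 2] in (s[j - 1], ".") and match(i, j - 1)
        # "." or an ordinary character consumes exactly one character of s.
        return j > 0 and p[i - 1] in (s[j - 1], ".") and match(i - 1, j - 1)

    return match(len(p), len(s))


print(is_match_topdown("aab", "c*a*b"))  # True
print(is_match_topdown("aa", "a"))       # False
```

The recursion visits the same (i, j) states as the table, so it is mainly useful for checking that the recurrence is understood correctly.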

1.3 Implementation

class Solution(object):
    def isMatch(self, s, p):
        """
        :type s: str
        :type p: str
        :rtype: bool
        """
        # dp[i][j] is True when the first i characters of p match the first j characters of s.
        row = len(p) + 1
        col = len(s) + 1
        dp = [[False for j in range(col)] for i in range(row)]
        dp[0][0] = True  # empty p matches empty s
        # Column 0: s is empty, so only patterns such as a*b* can match.
        # Row 0 and column 0 stand for the empty string, so dp index i refers to p[i - 1].
        for i in range(1, row):
            dp[i][0] = (i > 1) and p[i - 1] == "*" and dp[i - 2][0]

        for i in range(1, row):
            for j in range(1, col):
                if p[i - 1] == "*":
                    # "x*" matches nothing (dp[i-2][j]) or absorbs s[j-1] (dp[i][j-1]).
                    # i > 1 guards against a malformed pattern that starts with "*".
                    same = p[i - 2] == s[j - 1] or p[i - 2] == "."
                    dp[i][j] = i > 1 and (dp[i - 2][j] or (same and dp[i][j - 1]))
                else:
                    # "." or an ordinary character consumes exactly one character of s.
                    dp[i][j] = (p[i - 1] == "." or p[i - 1] == s[j - 1]) and dp[i - 1][j - 1]

        return dp[row - 1][col - 1]
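As a sanity check, the table recurrence can be compared with Python's built-in `re.fullmatch`, which follows the same rules for well-formed patterns made only of lowercase letters, "." and "*". The sketch below assumes a hypothetical function name `is_match_dp`; it restates the recurrence compactly so the block runs on its own, and the test cases are the five examples from the problem statement.

```python
import re


def is_match_dp(s, p):
    """Bottom-up table: dp[i][j] is True when p[:i] matches s[:j]."""
    rows, cols = len(p) + 1, len(s) + 1
    dp = [[False] * cols for _ in range(rows)]
    dp[0][0] = True
    for i in range(1, rows):
        for j in range(cols):
            if p[i - 1] == "*":
                if i < 2:
                    continue  # malformed pattern starting with "*": leave False
                same = j > 0 and p[i - 2] in (s[j - 1], ".")
                dp[i][j] = dp[i - 2][j] or (same and dp[i][j - 1])
            elif j > 0:
                dp[i][j] = p[i - 1] in (s[j - 1], ".") and dp[i - 1][j - 1]
    return dp[-1][-1]


cases = [("aa", "a"), ("aa", "a*"), ("ab", ".*"),
         ("aab", "c*a*b"), ("mississippi", "mis*is*p*.")]
for s, p in cases:
    expected = re.fullmatch(p, s) is not None
    print(s, p, is_match_dp(s, p), expected)
    assert is_match_dp(s, p) == expected
```

The comparison is only meaningful for valid patterns; `re` raises an error on inputs such as a leading "*".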
        

2. [44. Wildcard Matching]

2.1 Problem

Given an input string (s) and a pattern (p), implement wildcard pattern matching with support for '?' and '*'.

'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

Note:

  • s could be empty and contains only lowercase letters a-z.
  • p could be empty and contains only lowercase letters a-z, and characters like ? or *.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "*"
Output: true
Explanation: '*' matches any sequence.

Example 3:

Input:
s = "cb"
p = "?a"
Output: false
Explanation: '?' matches 'c', but the second letter is 'a', which does not match 'b'.

Example 4:

Input:
s = "adceb"
p = "*a*b"
Output: true
Explanation: The first '*' matches the empty sequence, while the second '*' matches the substring "dce".

Example 5:

Input:
s = "acdcb"
p = "a*c?b"
Output: false

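2.2 Idea && Approach

This problem uses the same idea as the previous one: maintain a two-dimensional dp array. The differences are that the symbol matching any single character is now "?", and "*" now matches any sequence of characters, including the empty string. As a running example, take S = "abc" and P = "a?b*"; the dp array is built as follows.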

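As before, first initialize row 0 and column 0.

  1. Row 0 is easy: every cell except (0, 0) is False.
  2. In column 0, S = "", so a match is possible only when the current pattern character is "*", and the condition is then dp[i-1][0] == True.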
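The column-0 rule for wildcards means the column stays True only while the pattern consists of leading "*" characters. A minimal sketch, assuming a hypothetical helper name `wildcard_first_column` that is not part of the original solution:

```python
def wildcard_first_column(p):
    """Entry i says whether p[:i] matches the empty string."""
    col0 = [True]  # empty pattern vs. empty string
    for ch in p:
        # Only "*" can match nothing, and only if everything before it did too.
        col0.append(ch == "*" and col0[-1])
    return col0


print(wildcard_first_column("**a*"))  # [True, True, True, False, False]
```

Once a non-"*" character appears, every later entry in the column is False.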

When filling the rest of dp there are again two cases:

  1. When P[i-1] == "*":

    This again splits into two sub-cases:

    1. "*" used as the empty string: as before, dp[i][j] = (dp[i-1][j] == True)
    2. "*" used as an arbitrary string: dp[i][j] = (dp[i][j-1] == True). My reading of this: dp[i][j-1] == True says that P[0:i], which ends in this "*", already matches S[0:j-1]; the "*" simply extends its match by one more character, S[j-1].
  2. When P[i-1] == "?" or an ordinary character:

    dp[i][j] = (S[j-1] == P[i-1] or P[i-1] == "?") and dp[i-1][j-1] == True
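The wildcard recurrence can be checked against the standard library's `fnmatch.fnmatchcase`, which gives "?" and "*" the same meaning as this problem as long as the pattern contains no "[" characters. The sketch below assumes a hypothetical function name `is_match_wildcard` and uses the five examples from the problem statement.

```python
from fnmatch import fnmatchcase


def is_match_wildcard(s, p):
    """dp[i][j] is True when p[:i] matches s[:j] under '?' and '*' rules."""
    rows, cols = len(p) + 1, len(s) + 1
    dp = [[False] * cols for _ in range(rows)]
    dp[0][0] = True
    for i in range(1, rows):
        for j in range(cols):
            if p[i - 1] == "*":
                # "*" as the empty string (cell above) or extended by one
                # more character of s (cell to the left).
                dp[i][j] = dp[i - 1][j] or (j > 0 and dp[i][j - 1])
            elif j > 0:
                dp[i][j] = p[i - 1] in (s[j - 1], "?") and dp[i - 1][j - 1]
    return dp[-1][-1]


cases = [("aa", "a"), ("aa", "*"), ("cb", "?a"),
         ("adceb", "*a*b"), ("acdcb", "a*c?b")]
for s, p in cases:
    print(s, p, is_match_wildcard(s, p))
    assert is_match_wildcard(s, p) == fnmatchcase(s, p)
```

Letting j start at 0 inside the loop folds the column-0 initialization into the same "*" rule.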

2.3 Implementation

#!/bin/python

class Solution(object):
    def isMatch(self, s, p):
        """
        :type s: str
        :type p: str
        :rtype: bool
        """
        row = len(p) + 1
        col = len(s) + 1
        dp = [ [False for i in range( 0, col )] for j in range( 0, row ) ]

        dp[0][0] = True
        for j in range(1, col ):
            dp[0][j] = False
        for i in range( 1, row):
            if p[i-1] == "*":
                dp[i][0] = dp[i-1][0]

        for i in range( 1, row ):
            for j in range( 1, col ):
                if p[i-1] == "*":
                    dp[i][j] = dp[i-1][j] or dp[i][j-1]
                else:
                    dp[i][j] = (s[j-1] == p[i-1] or p[i-1] == "?") and dp[i-1][j-1]
        return dp[row-1][col-1]

if __name__ == "__main__":
    m = Solution()
    print("s:[aa],p[a] ret:" + str(m.isMatch("aa", "a")))
    print("s:[aa],p[*] ret:" + str(m.isMatch("aa", "*")))
    print("s:[cb],p[?a] ret:" + str(m.isMatch("cb", "?a")))
    print("s:[],p[] ret:" + str(m.isMatch("", "")))
    print("s:[acdcb],p[a*c?b] ret:" + str(m.isMatch("acdcb", "a*c?b")))
    print("s:[adceb],p[*a*b] ret:" + str(m.isMatch("adceb", "*a*b")))

3. [97. Interleaving String]

3.1 Problem

Given s1, s2, s3, find whether s3 is formed by the interleaving of s1 and s2.

Example 1:

Input: s1 = "aabcc", s2 = "dbbca", s3 = "aadbbcbcac"
Output: true

Example 2:

Input: s1 = "aabcc", s2 = "dbbca", s3 = "aadbbbaccc"
Output: false

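3.2 Idea && Approach

This problem follows the same idea: maintain a dp array. dp[i][j] records whether the first i characters of S1 and the first j characters of S2 can be interleaved to form the first i + j characters of S3, the last of which is S3[i+j-1]. Take S1 = "aa", S2 = "ab", S3 = "aaba". Starting from dp[0][0], a step to the right into column j uses S2[j-1] to supply S3[i+j-1] (in row 0, i = 0, so this is S3[j-1]); a step down into row i uses S1[i-1] to supply S3[i+j-1]. So dp[i][j] can be reached only if dp[i-1][j] or dp[i][j-1] is True and the character consumed by that step equals S3[i+j-1].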
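To make the "step right / step down" picture concrete, the sketch below builds the whole table for the example S1 = "aa", S2 = "ab", S3 = "aaba". The function name `interleave_table` is hypothetical; it returns the full table (or None when the lengths cannot add up) rather than only the final answer.

```python
def interleave_table(s1, s2, s3):
    """dp[i][j] is True when s1[:i] and s2[:j] interleave into s3[:i + j]."""
    if len(s1) + len(s2) != len(s3):
        return None
    rows, cols = len(s1) + 1, len(s2) + 1
    dp = [[False] * cols for _ in range(rows)]
    dp[0][0] = True
    for i in range(rows):
        for j in range(cols):
            # Step down: s1[i-1] supplies s3[i+j-1].
            if i > 0 and dp[i - 1][j] and s1[i - 1] == s3[i + j - 1]:
                dp[i][j] = True
            # Step right: s2[j-1] supplies s3[i+j-1].
            if j > 0 and dp[i][j - 1] and s2[j - 1] == s3[i + j - 1]:
                dp[i][j] = True
    return dp


for row in interleave_table("aa", "ab", "aaba"):
    print(row)
# [True, True, False]
# [True, True, True]
# [True, False, True]
```

The bottom-right cell is True, and one valid path is down, right, right, down: "a" from S1, "a" and "b" from S2, then "a" from S1.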

3.3 Implementation

#!/usr/bin/env python

class Solution(object):
    def isInterleave(self, s1, s2, s3):
        """
        :type s1: str
        :type s2: str
        :type s3: str
        :rtype: bool
        """
        row = len(s1) + 1
        col = len(s2) + 1
        t = len(s3)
        if row + col -2 !=t :
            return False
        dp = [ [False for j in range(col)] for i in range(row) ]
        dp[0][0] = True
        for j in range(1,col):
            dp[0][j] = dp[0][j-1] and s2[j-1] == s3[j-1]
        for i in range(1,row):
            dp[i][0] = dp[i-1][0] and s1[i-1] == s3[i-1]

        for i in range(1,row):
            for j in range(1,col):
       
                dp[i][j] = ( dp[i-1][j] and s1[i-1] == s3[i+j-1]) or (dp[i][j-1] and s2[j-1] == s3[i+j-1])
        return dp[row-1][col-1]

if __name__=='__main__':
    m = Solution()
    print(m.isInterleave("a","b","a"))

