LeetCode Notes: Applying Dynamic Programming to String Matching
1. [10. Regular Expression Matching]
1.1 Problem
Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.
'.' Matches any single character.
'*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Note:
s could be empty and contains only lowercase letters a-z.
p could be empty and contains only lowercase letters a-z, and characters like . or *.
Example 1:
Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".
Example 2:
Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".
Example 3:
Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".
Example 4:
Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".
Example 5:
Input:
s = "mississippi"
p = "mis*is*p*."
Output: false
1.2 Approach and Solution Method
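First build a two-dimensional array dp. Its columns correspond to the string S and its rows to the pattern P. dp[i][j] records whether P[0:i] (the first i characters of P) matches S[0:j] (the first j characters of S), so P matches S exactly when dp[len(P)][len(S)] == True. Note that dp[0][0] is the case where both S and P are empty. That always matches, so dp[0][0] = True.
Next, fill in the matrix step by step. This requires initializing dp[0][j] and dp[i][0] first, i.e. row 0 and column 0.
- For row 0, P is empty, so nothing matches unless S is empty as well. Hence dp[0][j] = False for j >= 1.
- For column 0, we need to decide whether P can match an empty S. With S empty, only an empty P or a P that uses "*" can match, so only these two cases need handling.
  - P is empty: it matches S, so dp[0][0] = True.
  - P uses "*", for example "a*b*": "*" matches zero or more of the preceding character, so a "*" can cancel the character before it and leave "". Hence dp[i][0] = dp[i-2][0] and P[i-1] == "*". Why test dp[i-2][0] rather than dp[i-1][0]? dp[i-1][0] says whether P[0:i-1], which still ends with the character that "*" applies to, matches the empty string, and a prefix ending in an ordinary character or "." never does. Once "*" cancels that character, what remains is P[0:i-2], whose status is dp[i-2][0]. For example, P = "a*b*" matches the empty string "": at i = 2, j = 0, dp[2][0] == True is obtained only because dp[0][0] == True.
The dp matrix is now initialized. In the original figure, the initialized cells are shown in green and the blank cells are the ones still to be filled.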
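The initialization described above can be sketched in a few lines. This is only an illustration of row 0 and column 0, not the full solution. The helper name `init_regex_dp` is hypothetical, and the sketch assumes a well-formed pattern in which every "*" has a preceding element.

```python
def init_regex_dp(s, p):
    """Build the regex dp table with only row 0 and column 0 filled in.

    Rows index prefixes of the pattern p, columns index prefixes of s.
    (Hypothetical helper, assuming every '*' follows another character.)
    """
    rows, cols = len(p) + 1, len(s) + 1
    dp = [[False] * cols for _ in range(rows)]
    dp[0][0] = True  # empty pattern matches empty string
    # Row 0 stays False for j >= 1: an empty pattern cannot match
    # a non-empty string.
    for i in range(2, rows):
        # "x*" may vanish as a unit, so look two pattern characters back.
        dp[i][0] = p[i - 1] == "*" and dp[i - 2][0]
    return dp


dp = init_regex_dp("", "a*b*")
print([row[0] for row in dp])  # [True, False, True, False, True]
```

Column 0 is True at i = 0, 2 and 4, the prefixes "", "a*" and "a*b*", which are exactly the prefixes that can shrink to the empty string.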
Next, fill in the rest of the dp matrix. For dp[i][j] with i >= 1 and j >= 1 there are the following cases:
- P[i-1] == "*":
  This case splits into two sub-cases:
  - "*" cancels the preceding character, i.e. the "x*" group matches the empty string:
    As described above, dp[i][j] = dp[i-2][j].
  - "*" matches the preceding character N times:
    Here we need (P[i-2] == "." or S[j-1] == P[i-2]) and dp[i][j-1] == True. Why? If the group absorbs S[j-1] as one of its repetitions, the same pattern prefix P[0:i], still ending in "*", must already match S[0:j-1], and that is exactly dp[i][j-1].
    For example, "a.*" matches "abb".
- P[i-1] == "." or P[i-1] is an ordinary character:
  This case is much simpler: dp[i][j] = (S[j-1] == P[i-1] or P[i-1] == ".") and dp[i-1][j-1].
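To see the case analysis in isolation, it may help to write it top-down. The sketch below swaps the bottom-up table for memoized recursion (`functools.lru_cache`), so it is a restatement of the same recurrence rather than the method used in this post. The names `is_match_top_down` and `match` are made up for illustration, and a well-formed pattern is assumed.

```python
from functools import lru_cache


def is_match_top_down(s, p):
    """Regex matching ('.' and '*') via memoized recursion on prefix lengths."""

    @lru_cache(maxsize=None)
    def match(i, j):
        # match(i, j) answers: does p[:i] match s[:j]? Same meaning as dp[i][j].
        if i == 0:
            return j == 0
        if p[i - 1] == "*":
            # Sub-case 1: "x*" matches nothing, drop both pattern characters.
            if i >= 2 and match(i - 2, j):
                return True
            # Sub-case 2: "x*" absorbs s[j-1]; the pattern prefix is unchanged.
            return (
                i >= 2
                and j > 0
                and p[i - 2] in (s[j - 1], ".")
                and match(i, j - 1)
            )
        # Ordinary character or '.': consume one character from each side.
        return j > 0 and p[i - 1] in (s[j - 1], ".") and match(i - 1, j - 1)

    return match(len(p), len(s))


print(is_match_top_down("aab", "c*a*b"))             # True
print(is_match_top_down("mississippi", "mis*is*p*."))  # False
```

Each branch corresponds to one bullet above: `match(i - 2, j)` is dp[i-2][j], `match(i, j - 1)` is dp[i][j-1], and `match(i - 1, j - 1)` is dp[i-1][j-1].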
1.3 Implementation
class Solution(object):
    def isMatch(self, s, p):
        """
        :type s: str
        :type p: str
        :rtype: bool
        """
        # dp[i][j] is True when the first i characters of p match the first j characters of s
        row = len(p) + 1
        col = len(s) + 1
        dp = [[False for _ in range(col)] for _ in range(row)]
        dp[0][0] = True  # p is empty and s is empty
        # Column 0: s is empty, and only patterns such as a*b* can match it.
        # This prepares the table for len(s) = 1, 2, 3, ..., n.
        # Index 0 of dp stands for the empty prefix, so p[i - 1] is the i-th character of p.
        for i in range(1, row):
            dp[i][0] = (i > 1) and p[i - 1] == "*" and dp[i - 2][0]
        for i in range(1, row):
            for j in range(1, col):
                if p[i - 1] == "*":
                    # "x*" matches nothing (dp[i-2][j]) or absorbs s[j-1] (dp[i][j-1])
                    dp[i][j] = dp[i - 2][j] or ((p[i - 2] == s[j - 1] or p[i - 2] == ".") and dp[i][j - 1])
                else:
                    dp[i][j] = (p[i - 1] == "." or p[i - 1] == s[j - 1]) and dp[i - 1][j - 1]
        return dp[row - 1][col - 1]
2. [44. Wildcard Matching]
2.1 Problem
Given an input string (s) and a pattern (p), implement wildcard pattern matching with support for '?' and '*'.
'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).
The matching should cover the entire input string (not partial).
Note:
s could be empty and contains only lowercase letters a-z.
p could be empty and contains only lowercase letters a-z, and characters like ? or *.
Example 1:
Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".
Example 2:
Input:
s = "aa"
p = "*"
Output: true
Explanation: '*' matches any sequence.
Example 3:
Input:
s = "cb"
p = "?a"
Output: false
Explanation: '?' matches 'c', but the second letter is 'a', which does not match 'b'.
Example 4:
Input:
s = "adceb"
p = "*a*b"
Output: true
Explanation: The first '*' matches the empty sequence, while the second '*' matches the substring "dce".
Example 5:
Input:
s = "acdcb"
p = "a*c?b"
Output: false
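2.2 Approach and Solution Method
This problem follows the same idea as the previous one: maintain a two-dimensional array dp. The differences are that the symbol matching any single character is now "?", and "*" now matches any sequence, including the empty string. As an example, take S = "abc" and P = "a?b*"; the dp array is built as follows.
As before, first initialize row 0 and column 0.
- Row 0 is easy: every cell except position (0, 0) is False.
- For column 0, S = "", so a match is possible only when the current pattern character is "*", and then the condition is dp[i-1][0] == True. That is, dp[i][0] = (P[i-1] == "*") and dp[i-1][0].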
When filling in dp, there are again two cases:
- P[i-1] == "*":
  This case again has two sub-cases:
  - "*" is used as the empty string: analogous to the previous problem, dp[i][j] = dp[i-1][j].
  - "*" is used as an arbitrary non-empty string: dp[i][j] = dp[i][j-1]. My understanding is as follows: dp[i][j-1] == True means that P[0:i], which ends in this "*", already matches S[0:j-1]. Since "*" can absorb one more character, S[j-1], the same prefix also matches S[0:j].
- P[i-1] == "?" or an ordinary character:
  dp[i][j] = (S[j-1] == P[i-1] or P[i-1] == "?") and dp[i-1][j-1].
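For the example S = "abc", P = "a?b*" mentioned at the start of this section, the dp array can be reproduced with a short sketch like the one below. The helper name `wildcard_table` is hypothetical; it applies the rules above and returns the whole table instead of only the final cell.

```python
def wildcard_table(s, p):
    """Return the full dp table for wildcard matching ('?' and '*')."""
    rows, cols = len(p) + 1, len(s) + 1
    dp = [[False] * cols for _ in range(rows)]
    dp[0][0] = True  # empty pattern matches empty string
    for i in range(1, rows):
        # Column 0: only a run of leading '*' can match the empty string.
        dp[i][0] = p[i - 1] == "*" and dp[i - 1][0]
    for i in range(1, rows):
        for j in range(1, cols):
            if p[i - 1] == "*":
                # '*' is empty (cell above) or absorbs s[j-1] (cell to the left).
                dp[i][j] = dp[i - 1][j] or dp[i][j - 1]
            else:
                dp[i][j] = p[i - 1] in (s[j - 1], "?") and dp[i - 1][j - 1]
    return dp


# One row per pattern prefix; the first row is the empty pattern.
for label, row in zip(["''"] + list("a?b*"), wildcard_table("abc", "a?b*")):
    print(label.rjust(2), ["T" if cell else "F" for cell in row])
```

The bottom-right cell comes out False for this example: "a" and "?" consume "ab", but the literal "b" in the pattern then has to line up with "c".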
2.3 Implementation
#!/bin/python
class Solution(object):
    def isMatch(self, s, p):
        """
        :type s: str
        :type p: str
        :rtype: bool
        """
        row = len(p) + 1
        col = len(s) + 1
        dp = [[False for _ in range(col)] for _ in range(row)]
        dp[0][0] = True  # empty pattern matches empty string
        # Row 0: an empty pattern matches no non-empty string
        for j in range(1, col):
            dp[0][j] = False
        # Column 0: only leading "*" characters can match the empty string
        for i in range(1, row):
            if p[i - 1] == "*":
                dp[i][0] = dp[i - 1][0]
        for i in range(1, row):
            for j in range(1, col):
                if p[i - 1] == "*":
                    # "*" is empty (dp[i-1][j]) or absorbs s[j-1] (dp[i][j-1])
                    dp[i][j] = dp[i - 1][j] or dp[i][j - 1]
                else:
                    dp[i][j] = (s[j - 1] == p[i - 1] or p[i - 1] == "?") and dp[i - 1][j - 1]
        return dp[row - 1][col - 1]

if __name__ == "__main__":
    m = Solution()
    print("s:[aa],p[a] ret:" + str(m.isMatch("aa", "a")))
    print("s:[aa],p[*] ret:" + str(m.isMatch("aa", "*")))
    print("s:[cb],p[?a] ret:" + str(m.isMatch("cb", "?a")))
    print("s:[],p[] ret:" + str(m.isMatch("", "")))
    print("s:[acdcb],p[a*c?b] ret:" + str(m.isMatch("acdcb", "a*c?b")))
    print("s:[adceb],p[*a*b] ret:" + str(m.isMatch("adceb", "*a*b")))
3. [97. Interleaving String]
3.1 Problem
Given s1, s2, s3, find whether s3 is formed by the interleaving of s1 and s2.
Example 1:
Input: s1 = "aabcc", s2 = "dbbca", s3 = "aadbbcbcac"
Output: true
Example 2:
Input: s1 = "aabcc", s2 = "dbbca", s3 = "aadbbbaccc"
Output: false
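3.2 Approach and Solution Method
The idea is the same again: maintain a two-dimensional array dp. dp[i][j] records whether the first i characters of S1 and the first j characters of S2 can be interleaved to form the first i + j characters of S3, i.e. everything up to and including S3[i+j-1]. Take S1 = "aa", S2 = "ab", S3 = "aaba". Starting from dp[0][0], one step to the right uses the next character of S2 to supply the next character of S3 (in row 0, where i = 0, moving into column j uses S2[j-1] for S3[j-1]). Likewise, one step down uses the next character of S1 (in column 0, moving into row i uses S1[i-1] for S3[i-1]). So dp[i][j] is reachable only if dp[i-1][j] is True and S1[i-1] == S3[i+j-1], or dp[i][j-1] is True and S2[j-1] == S3[i+j-1].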
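The walk through the grid can be made concrete for the example S1 = "aa", S2 = "ab", S3 = "aaba". The following is a rough sketch with a hypothetical helper `interleave_table` that returns the whole table, so the reachable cells can be inspected.

```python
def interleave_table(s1, s2, s3):
    """Return the interleaving dp table, or None when the lengths cannot add up."""
    if len(s1) + len(s2) != len(s3):
        return None
    rows, cols = len(s1) + 1, len(s2) + 1
    dp = [[False] * cols for _ in range(rows)]
    dp[0][0] = True  # two empty prefixes form the empty prefix of s3
    for i in range(rows):
        for j in range(cols):
            if i > 0 and dp[i - 1][j] and s1[i - 1] == s3[i + j - 1]:
                dp[i][j] = True  # step down: take a character from s1
            if j > 0 and dp[i][j - 1] and s2[j - 1] == s3[i + j - 1]:
                dp[i][j] = True  # step right: take a character from s2
    return dp


table = interleave_table("aa", "ab", "aaba")
for row in table:
    print(["T" if cell else "F" for cell in row])
print(table[-1][-1])  # True
```

Here one valid path is down, right, right, down: "a" from S1, "a" and "b" from S2, then the last "a" from S1.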
3.3 Implementation
#!/bin/python
class Solution(object):
    def isInterleave(self, s1, s2, s3):
        """
        :type s1: str
        :type s2: str
        :type s3: str
        :rtype: bool
        """
        row = len(s1) + 1
        col = len(s2) + 1
        t = len(s3)
        # The lengths must add up, otherwise no interleaving exists
        if row + col - 2 != t:
            return False
        dp = [[False for _ in range(col)] for _ in range(row)]
        dp[0][0] = True
        # Row 0: only characters of s2 are used
        for j in range(1, col):
            dp[0][j] = dp[0][j - 1] and s2[j - 1] == s3[j - 1]
        # Column 0: only characters of s1 are used
        for i in range(1, row):
            dp[i][0] = dp[i - 1][0] and s1[i - 1] == s3[i - 1]
        for i in range(1, row):
            for j in range(1, col):
                # Arrive from above (take s1[i-1]) or from the left (take s2[j-1])
                dp[i][j] = (dp[i - 1][j] and s1[i - 1] == s3[i + j - 1]) or (dp[i][j - 1] and s2[j - 1] == s3[i + j - 1])
        return dp[row - 1][col - 1]

if __name__ == '__main__':
    m = Solution()
    print(m.isInterleave("a", "b", "a"))