Single-line (re.S) and multi-line (re.M) modes in Python regular expressions

Reposted article, 2018-12-15 14:16

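Almost every function in Python's re module takes a flags parameter, and multiple flags are combined with the bitwise OR operator (|). Two of these flags are single-line mode (re.DOTALL, or re.S) and multi-line mode (re.MULTILINE, or re.M). They are hard to grasp at first, but they can be very useful. Both modes also exist in PHP and JavaScript.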
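As a minimal sketch of how flags combine, the snippet below uses only names from the standard re module; the variable names are chosen here for illustration.

```python
import re

# The short names are aliases of the long ones.
aliases_match = (re.S == re.DOTALL) and (re.M == re.MULTILINE)

# Flags are bit flags, so several of them are combined with bitwise OR.
combined = re.DOTALL | re.MULTILINE

# A combined value can be tested for an individual flag with bitwise AND.
has_dotall = bool(combined & re.DOTALL)
has_ignorecase = bool(combined & re.IGNORECASE)

# OR is idempotent: repeating a flag leaves the value unchanged.
repeated = re.DOTALL | re.DOTALL

print(aliases_match, has_dotall, has_ignorecase, repeated == re.DOTALL)
```

Adding flags with + gives the same value as | when each flag appears exactly once, but adding the same flag twice changes the value, so | is the safer habit.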

Single-line mode: re.DOTALL

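In single-line mode, the text is forced to be treated as a single line. What kind of text is not treated as a single line by default? Text that contains newline characters, for example: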

This is the first line.\nThis is the second line.\nThis is the third line.

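The dot (.) matches any character except a newline. Suppose we want to match the entire string: when the dot is used against the string above, matching stops at the first newline. For example: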

>>> a = 'This is the first line.\nThis is the second line.\nThis is the third line.'
>>> print(a)
This is the first line.
This is the second line.
This is the third line.
>>> import re
>>> p = re.match(r'This.*line.', a)
>>> p.group(0)
'This is the first line.'

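In the example above, even though matching is greedy by default, it still stops at the end of the first line. In single-line mode, the newline is treated as an ordinary character and is matched by the dot (.):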

>>> q = re.match(r'This.*line.', a, flags=re.DOTALL)
>>> q.group(0)
'This is the first line.\nThis is the second line.\nThis is the third line.'

The dot (.) now matches every character, including newlines. So, more fundamentally:

Single-line mode changes the matching behavior of the dot (.)
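The same behavior can be reached in several ways. The sketch below reuses the sample string from this section and compares the default, the re.S alias, the inline (?s) form, and a flag-free character class. The final dot is escaped here as \. so that it matches only a literal period; the variable names are illustrative.

```python
import re

a = 'This is the first line.\nThis is the second line.\nThis is the third line.'

# Default: the dot stops at the first newline.
default = re.match(r'This.*line\.', a).group(0)

# re.S is the short alias of re.DOTALL.
with_flag = re.match(r'This.*line\.', a, flags=re.S).group(0)

# The inline form (?s) at the start of the pattern sets the same flag.
inline = re.match(r'(?s)This.*line\.', a).group(0)

# Without any flag, [\s\S] matches every character, newlines included.
no_flag = re.match(r'This[\s\S]*line\.', a).group(0)

print(default == 'This is the first line.')
print(with_flag == inline == no_flag == a)
```

The inline form is handy when the pattern is passed to code that does not expose a flags argument.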

Multi-line mode: re.MULTILINE

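In multi-line mode, the text is forced to be treated as multiple lines. As the single-line section showed, the dot by default already treats a string containing newlines as several lines. The line-start anchor ^ and the line-end anchor $, however, match only at the start and end of the entire string ($ also matches just before a newline that ends the string). For these anchors, a string containing newlines behaves as if it were a single line.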
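A small sketch of this default anchoring, using the same sample string and no flags; the variable names are illustrative.

```python
import re

a = 'This is the first line.\nThis is the second line.\nThis is the third line.'

# With no anchors, 'This' is found once per line.
unanchored = re.findall(r'This', a)

# With ^ and no flags, only the occurrence at the start of the string counts.
at_start = re.findall(r'^This', a)

# With $ and no flags, only the occurrence at the end of the string counts.
at_end = re.findall(r'line\.$', a)

print(len(unanchored), len(at_start), len(at_end))  # 3 1 1
```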

In the example below, we want to match the three sentences separately, using re.findall() to list every match:

>>> a = 'This is the first line.\nThis is the second line.\nThis is the third line.'
>>> print(a)
This is the first line.
This is the second line.
This is the third line.
>>> import re
>>> re.findall(r'^This.*line.$', a)
[]

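The result is empty because, by default, the dot does not match newlines, so the pattern cannot stretch from the start of the string (^) to its end ($). We need to set re.DOTALL: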

>>> re.findall(r'^This.*line.$', a, flags=re.DOTALL)
['This is the first line.\nThis is the second line.\nThis is the third line.']

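This matches the whole text as a single item. Matching is greedy by default, so we add a question mark to switch to non-greedy mode: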

>>> re.findall(r'^This.*?line.$', a, flags=re.DOTALL)
['This is the first line.\nThis is the second line.\nThis is the third line.']

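The result is still the whole text, because ^ and $ match only the start and end of the entire string. In multi-line mode, ^ matches the position right after each newline in addition to the start of the string, and $ matches the position right before each newline in addition to the end of the string.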

>>> re.findall(r'^This.*?line.$', a, flags=re.DOTALL | re.MULTILINE)
['This is the first line.', 'This is the second line.', 'This is the third line.']
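For line-by-line matching like this, re.DOTALL is not strictly required. The sketch below, with illustrative variable names and the final dot escaped as \., uses re.MULTILINE alone: the dot still cannot cross a newline, so even a greedy .* stays within one line.

```python
import re

a = 'This is the first line.\nThis is the second line.\nThis is the third line.'

# Only re.MULTILINE: ^ and $ anchor at every line, and the dot stays in its line.
multiline_only = re.findall(r'^This.*line\.$', a, flags=re.MULTILINE)

# The inline form (?m) sets the same flag inside the pattern.
inline = re.findall(r'(?m)^This.*line\.$', a)

# Both flags with a non-greedy quantifier give the same three items.
both_flags = re.findall(r'^This.*?line\.$', a, flags=re.DOTALL | re.MULTILINE)

print(multiline_only == inline == both_flags == a.splitlines())
```

Leaving re.DOTALL out avoids depending on the non-greedy quantifier to keep each match inside a single line.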

More fundamentally:

Multi-line mode changes the matching behavior of ^ and $
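To make the positions concrete, the sketch below lists where ^ and $ match in a short made-up string, with and without re.M. It also shows \A and \Z, which always anchor to the whole string and are not affected by the flag. The sample text and variable names are assumptions for illustration.

```python
import re

text = 'ab\ncd\nef'  # hypothetical sample; the newlines sit at indexes 2 and 5

def positions(pattern, flags=0):
    """Return the index of every (empty) match of an anchor pattern."""
    return [m.start() for m in re.finditer(pattern, text, flags=flags)]

caret_default = positions(r'^')           # start of the string only
caret_multiline = positions(r'^', re.M)   # also right after each newline
dollar_default = positions(r'$')          # end of the string only
dollar_multiline = positions(r'$', re.M)  # also right before each newline
string_start = positions(r'\A', re.M)     # unaffected by re.M
string_end = positions(r'\Z', re.M)       # unaffected by re.M

print(caret_default, caret_multiline)     # [0] [0, 3, 6]
print(dollar_default, dollar_multiline)   # [8] [2, 5, 8]
print(string_start, string_end)           # [0] [8]
```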

This article was reposted from:

https://www.lfhacks.com/tech/python-re-single-multiline
