Using regular expressions in XPath

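I have never actually needed this myself; I am writing it down in case it comes in handy later.
Suppose the post body on one page of a site is located at: //*[@id='postmessage_32199']
and the post body on a sibling page is located at: //*[@id='postmessage_32153']
Plain XPath is enough to match both: //*[starts-with(@id, 'postmessage_')]
or //*[contains(@id, 'postmessage_')]
Alternatively, you can use a regular expression inside the XPath, which lxml supports through the EXSLT regular-expressions namespace: doc.xpath(r'//*[re:match(@id, "postmessage_\d+")]', namespaces={"re": "http://exslt.org/regular-expressions"})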
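
To compare the two approaches, here is a minimal sketch. It assumes lxml is installed; the HTML fragment, including the postmessage_draft and sidebar ids, is made up for illustration. It uses re:test, the boolean EXSLT function, instead of re:match, and anchors the pattern with ^ and $ because lxml applies the regex as a search rather than a full match.

```python
from lxml import html

# Namespace mapping for the EXSLT regular-expression functions.
EXSLT_RE = {"re": "http://exslt.org/regular-expressions"}

# Hypothetical page fragment, invented for this example.
SAMPLE = """
<html><body>
  <div id="postmessage_32199">first post</div>
  <div id="postmessage_32153">second post</div>
  <div id="postmessage_draft">not a post body</div>
  <div id="sidebar">unrelated</div>
</body></html>
"""

doc = html.fromstring(SAMPLE)

# Prefix match: accepts any id that starts with "postmessage_".
by_prefix = doc.xpath("//*[starts-with(@id, 'postmessage_')]")

# Regex match: only ids made of "postmessage_" followed by digits.
by_regex = doc.xpath(
    r'//*[re:test(@id, "^postmessage_\d+$")]',
    namespaces=EXSLT_RE,
)

prefix_ids = [el.get("id") for el in by_prefix]
regex_ids = [el.get("id") for el in by_regex]
print(prefix_ids)
print(regex_ids)
```

In this sample the prefix query also picks up postmessage_draft, while the regex query keeps only the two numeric ids. That is the kind of case where the regex version may be worth the extra namespace argument.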

How to see the original markup of an element selected with XPath

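After selecting a page element el, etree.tostring returns the element's original markup. The result is of type bytes, which can be converted to a string with bytes.decode: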
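from lxml import etree
# el is an element selected earlier, for example with doc.xpath(...)[0]
result = etree.tostring(el)  # bytes
print(result.decode('utf-8'))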
