Python: escaping and unescaping HTML

Reposted article, 2017-12-18 20:28

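When writing a web crawler, you often need to unescape the HTML you retrieve, because it frequently arrives with its special characters escaped, like this: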

html = '&lt;abc&gt;'

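In Python 3, use html.unescape (available since Python 3.4). The older HTMLParser().unescape method was deprecated in 3.4 and removed in Python 3.9:

import html

escaped = '&lt;abc&gt;'
txt = html.unescape(escaped)  # decode HTML entities back to characters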

The result is:

txt = '<abc>'
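As a small sketch of how this might look in a crawler, assuming Python 3.4 or later, the snippet below runs html.unescape over a few strings. The sample strings are made up for illustration; they show that numeric character references (decimal and hexadecimal) are decoded as well as named entities.

```python
import html

# Hypothetical sample fragments, as a crawler might receive them.
samples = [
    "&lt;abc&gt;",      # named entities
    "Tom &amp; Jerry",  # escaped ampersand
    "&#60;p&#62;",      # decimal character references
    "&#x3C;br&#x3E;",   # hexadecimal character references
]

# Decode every fragment back to plain characters.
decoded = [html.unescape(s) for s in samples]

for raw, text in zip(samples, decoded):
    print(f"{raw!r} -> {text!r}")
```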

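To convert back, use html.escape. The cgi.escape function used in older code was deprecated in Python 3.2 and removed in Python 3.8, and the cgi module itself was removed in Python 3.13:

import html

escaped = html.escape(txt)  # this gives '&lt;abc&gt;' again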
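A minimal round-trip sketch using the standard library's html module follows. The input string is a made-up example. Note that html.escape also escapes quotes by default (quote=True), which matters when the text ends up inside an HTML attribute; pass quote=False to leave them alone.

```python
import html

# Hypothetical text containing characters that are special in HTML.
txt = '<a href="x">Tom & Jerry</a>'

# Escape &, <, > and, by default, both kinds of quotes.
escaped = html.escape(txt)

# Escape only &, < and >; quotes are left untouched.
escaped_noquote = html.escape(txt, quote=False)

# Unescaping the escaped text recovers the original string.
restored = html.unescape(escaped)

print(escaped)
print(escaped_noquote)
print(restored == txt)
```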
