ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes

Reposted article. 2021-07-15 14:52

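This error looks like an encoding problem. More precisely, it is raised when a string contains characters that XML cannot represent, such as NULL bytes or control characters.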

Traceback (most recent call last):
  File "/tmp/a.py", line 4, in <module>
    html5lib.parse('<p>&#1;', treebuilder='lxml')
  File "/home/simon/.virtualenvs/weasyprint/lib/python3.3/site-packages/html5lib/html5parser.py", line 28, in parse
    return p.parse(doc, encoding=encoding)
  File "/home/simon/.virtualenvs/weasyprint/lib/python3.3/site-packages/html5lib/html5parser.py", line 224, in parse
    parseMeta=parseMeta, useChardet=useChardet)
  File "/home/simon/.virtualenvs/weasyprint/lib/python3.3/site-packages/html5lib/html5parser.py", line 93, in _parse
    self.mainLoop()
  File "/home/simon/.virtualenvs/weasyprint/lib/python3.3/site-packages/html5lib/html5parser.py", line 183, in mainLoop
    new_token = phase.processCharacters(new_token)
  File "/home/simon/.virtualenvs/weasyprint/lib/python3.3/site-packages/html5lib/html5parser.py", line 991, in processCharacters
    self.tree.insertText(token["data"])
  File "/home/simon/.virtualenvs/weasyprint/lib/python3.3/site-packages/html5lib/treebuilders/_base.py", line 320, in insertText
    parent.insertText(data)
  File "/home/simon/.virtualenvs/weasyprint/lib/python3.3/site-packages/html5lib/treebuilders/etree_lxml.py", line 240, in insertText
    builder.Element.insertText(self, data, insertBefore)
  File "/home/simon/.virtualenvs/weasyprint/lib/python3.3/site-packages/html5lib/treebuilders/etree.py", line 108, in insertText
    self._element.text += data
  File "lxml.etree.pyx", line 921, in lxml.etree._Element.text.__set__ (src/lxml/lxml.etree.c:41467)
  File "apihelpers.pxi", line 652, in lxml.etree._setNodeText (src/lxml/lxml.etree.c:18888)
  File "apihelpers.pxi", line 1335, in lxml.etree._utf8 (src/lxml/lxml.etree.c:24701)
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters
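In the traceback above, the input contains the character reference `&#1;`, which appears to reach lxml as the control character U+0001. As a rough diagnostic sketch, the following lists the positions of characters that fall outside the XML 1.0 `Char` production, so the offending input can be located before it is handed to lxml. The helper name `find_invalid_xml_chars` is hypothetical and is not part of lxml or html5lib.

```python
import re

# Characters outside the XML 1.0 "Char" production. Allowed are:
# #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF.
_INVALID_XML_CHAR = re.compile(
    r"[^\x09\x0a\x0d\x20-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]"
)


def find_invalid_xml_chars(text):
    """Return (index, repr) pairs for characters XML 1.0 does not allow.

    Hypothetical helper for illustration only.
    """
    return [(m.start(), repr(m.group())) for m in _INVALID_XML_CHAR.finditer(text)]


# Example: a string with U+0001 and a NULL byte mixed into normal text.
print(find_invalid_xml_chars("ok\x01text\x00"))
```

An empty list means the string should be safe to pass on as XML text.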

While generating a document, I suddenly ran into this error. My first idea was to fix it by changing the encoding, as shown below:

p = document.add_paragraph(u"哈哈 ")
or:
p = document.add_paragraph(s.encode('utf-8').decode('utf-8'))

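However, the error persisted with both approaches, so I fell back on stripping the offending characters with a substitution, which resolved the immediate error (this is a compromise for now; if I find a better solution later, I will update this post):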

import re
# Remove control characters that XML does not allow (tab, LF and CR are kept)
s = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", s)
p = self.doc.add_paragraph(s)
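The substitution above can be wrapped in a small reusable helper. This is only a sketch under stated assumptions: the function name `strip_invalid_xml_chars` is hypothetical, and the character class follows the XML 1.0 `Char` production rather than anything in the original post. It removes every character outside the allowed ranges (control characters other than tab, line feed and carriage return, lone surrogates, U+FFFE and U+FFFF). It keeps `\x7f`, which XML 1.0 permits; add it to the pattern if the target library rejects it. The example also shows why the encoding attempts had no effect: a UTF-8 round trip returns the same string, control characters included.

```python
import re

# Everything outside the XML 1.0 "Char" production (assumption: XML 1.0 rules apply).
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0a\x0d\x20-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]"
)


def strip_invalid_xml_chars(text, replacement=""):
    """Replace characters that XML 1.0 does not allow (hypothetical helper)."""
    return _INVALID_XML_CHARS.sub(replacement, text)


raw = "Hello\x00 wor\x0bld\x1f!\n"

# Re-encoding does not help: the control characters survive a UTF-8 round trip.
assert raw.encode("utf-8").decode("utf-8") == raw

clean = strip_invalid_xml_chars(raw)
print(repr(clean))
```

In the author's code this would presumably be used as `self.doc.add_paragraph(strip_invalid_xml_chars(s))`.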
