Python development pitfalls (1): XPath parsing raises ValueError: Unicode strings with encoding declaration are not supported


Traceback (most recent call last):
  File "/Users/*******.py", line 37, in <module>
    BtcSpider().run()
  File "/Users/******.py", line 34, in run
    self.parse_data(data)
  File "/Users/******.py", line 21, in parse_data
    xpath_data = etree.HTML(data)
  File "src/lxml/etree.pyx", line 3161, in lxml.etree.HTML
  File "src/lxml/parser.pxi", line 1872, in lxml.etree._parseMemoryDocument
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

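I was scraping a forum whose pages declare <meta http-equiv="Content-Type" content="text/html; charset=gb2312">. On my Mac, however, the fetched page only decoded correctly as UTF-8, and passing the decoded string to the parser for XPath raised the error above.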

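Encoding the string back to bytes before handing it to the parser fixes it. As the error message says, lxml wants bytes input when the document carries an encoding declaration. The code: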

xpath_data = etree.HTML(data.encode('utf-8'))

Problem solved.
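The error message names two ways out: bytes input, or a string without the declaration. Below is a minimal sketch of both, not the original spider's code. The helper names `as_parser_input` and `strip_xml_declaration` and the `sample` document are hypothetical, and the lxml call only runs if lxml is installed.

```python
import re

# Matches a leading XML declaration such as
# <?xml version="1.0" encoding="gb2312"?>
_XML_DECL = re.compile(r'^\s*<\?xml[^>]*\?>', re.IGNORECASE)


def as_parser_input(data, encoding="utf-8"):
    """Return bytes for the parser: str is encoded, bytes pass through."""
    if isinstance(data, str):
        return data.encode(encoding)
    return data


def strip_xml_declaration(text):
    """Alternative: drop a leading XML declaration and keep the str."""
    return _XML_DECL.sub("", text, count=1)


# Hypothetical document; the real page content is not shown in the post.
sample = ('<?xml version="1.0" encoding="gb2312"?>'
          '<html><body><p>hi</p></body></html>')

if __name__ == "__main__":
    try:
        from lxml import etree  # third-party dependency
    except ImportError:
        etree = None

    if etree is not None:
        # Bytes input, same idea as data.encode('utf-8') above.
        root = etree.HTML(as_parser_input(sample))
        print(root.xpath("//p/text()"))
```

If the HTTP client already exposes the raw response bytes, passing those straight to etree.HTML avoids the decode and re-encode round trip altogether.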

