Parsing HTML with bs4 in Python

Reposted article, 2016-04-24 11:54. Tag: python

Documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc.zh/

Encoding is one of the more frustrating parts of working with Python.

decode: converts bytes to str (decoding)
encode: converts str to bytes (encoding)
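To make the two directions concrete, here is a minimal sketch that uses only built-in types. The sample string and variable names are made up for illustration and do not come from the original article.

```python
# Illustrative sample text (hypothetical; not from the original article).
text = "解析"

# encode: str -> bytes
data = text.encode("utf-8")

# decode: bytes -> str
roundtrip = data.decode("utf-8")

print(type(data).__name__, len(data))
print(roundtrip == text)

# Decoding with the wrong codec normally raises UnicodeDecodeError;
# errors="replace" substitutes U+FFFD for each undecodable byte instead.
lossy = data.decode("ascii", errors="replace")
print(lossy)
```

Decoding with the same codec that was used for encoding gives back the original string; picking a different codec is the usual source of garbled text.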


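Declare the source encoding at the top of the file:

# -*- coding: utf-8 -*-
This tells Python to read the source file as UTF-8. Python 3 already assumes UTF-8 for source files, so the declaration mainly matters under Python 2.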

# -*- coding: utf-8 -*-
__author__ = 'Administrator'

import sys

import requests
from bs4 import BeautifulSoup


def getHtml(url):
    # Download the page and decode the raw response bytes as UTF-8.
    r = requests.get(url)
    content = r.content.decode('utf8')
    # Name the parser explicitly; otherwise bs4 picks one itself and warns.
    soup = BeautifulSoup(content, 'html.parser')
    # Print every <h2> and <p> element found in the page.
    print(soup.find_all('h2'))
    print(soup.find_all('p'))


if __name__ == "__main__":
    # Show the interpreter's default string encoding.
    print(sys.getdefaultencoding())
    print("start.......")
    url = "http://www.jiakaobaodian.com/mnks/exercise/0-c1-kemu1-chengdu.html?id=800000"
    getHtml(url)
    print("end.......")
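The same decode-then-parse flow can be tried without a network connection. The sketch below is only an illustration: the HTML snippet and the helper name `extract_text` are hypothetical, and it assumes the `bs4` package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the bytes of a downloaded page.
html_bytes = """
<html><body>
  <h2>Question 1</h2>
  <p>First paragraph.</p>
  <p>Second paragraph.</p>
</body></html>
""".encode("utf-8")


def extract_text(raw, tag):
    """Return the stripped text of every <tag> element in raw UTF-8 bytes."""
    # Decode bytes to str first, then parse with the built-in parser.
    soup = BeautifulSoup(raw.decode("utf-8"), "html.parser")
    return [el.get_text(strip=True) for el in soup.find_all(tag)]


headings = extract_text(html_bytes, "h2")
paragraphs = extract_text(html_bytes, "p")
print(headings)
print(paragraphs)
```

Returning plain strings via `get_text` instead of printing the `Tag` objects makes the result easier to store or compare later.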

