1. Data scraping and database storage

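After running into problems scraping the data with jsoup, I changed approach and used Python to fetch the epidemic data quickly.

After spending some time learning the relevant Python, I used a few methods from requests, JSON conversion with the json module, and database insert operations.

The scraping code is as follows: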

import json

import requests
from pymysql import connect

headers = {
    "User-Agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Mobile Safari/537.36",
    "Referer": "https://wp.m.163.com/163/page/news/virus_report/index.html?_nw_=1&_anw_=1",
}


def _parse_url(url):
    response = requests.get(url, headers=headers, timeout=3)  # give up if there is no response within 3 seconds
    return response.content.decode()


def parse_url(url):
    try:
        html_str = _parse_url(url)
    except requests.RequestException:  # network error or timeout
        html_str = None
    return html_str


class yiqing:
    url = "https://c.m.163.com/ug/api/wuhan/app/data/list-total?t=316765429316"

    def getContent_list(self, html_str):
        dict_data = json.loads(html_str)
        # The payload that contains the per-province data
        content_list = dict_data["data"]

        return content_list

    def saveContent_list(self, i):
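        # Open the database connection (host / user name / password / database name)
        con = connect(host="localhost", user="root", password="123456", database="web01")
        # Create a cursor object with cursor()
        cursors = con.cursor()
        # execute() runs the SQL statement and returns the number of affected rows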

        row = cursors.execute("insert into provinces values(%s,%s,%s,%s,%s,%s,%s,%s)",
                              (i.get('id'), i.get('name'), i.get('total').get('confirm'),
                               i.get('total').get('suspect'), i.get('total').get('heal'),
                               i.get('total').get('dead'), i.get('total').get('severe'),
                               i.get('lastUpdateTime')))
        for j in i.get('children'):
            row = cursors.execute("insert into citys values(%s,%s,%s,%s,%s,%s,%s,%s)",
                                  (j.get('id'), j.get('name'), j.get('total').get('confirm'),
                                   j.get('total').get('suspect'), j.get('total').get('heal'),
                                   j.get('total').get('dead'), j.get('total').get('severe'),
                                   j.get('lastUpdateTime')))
        con.commit()  # commit the transaction
        con.close()  # close the database connection

    def run(self):  # main logic
        # Request the data
        html_str = parse_url(self.url)
        if html_str is None:
            raise RuntimeError("request failed: " + self.url)
        # Parse the data
        content_list = self.getContent_list(html_str)
        values = content_list["areaTree"][0]["children"]
        for i in values:
            self.saveContent_list(i)


if __name__ == '__main__':
    yq = yiqing()
    yq.run()
    print('Scraping and storing succeeded!')
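
The script above opens a new database connection for every province. As a rough sketch of an alternative, the rows can be flattened first and then written in one transaction with executemany. To keep the example runnable without a MySQL server it swaps in the standard-library sqlite3 module and an in-memory database; the sample payload, the column names, and the helpers to_row, flatten and save are all assumptions made for illustration and are not part of the original code.

```python
import sqlite3

# Hypothetical sample shaped like the "areaTree" entries the scraper reads;
# the numbers are made up for illustration only.
SAMPLE_PROVINCES = [
    {
        "id": "420000",
        "name": "Hubei",
        "lastUpdateTime": "2020-03-11 10:00:00",
        "total": {"confirm": 100, "suspect": 0, "heal": 60, "dead": 5, "severe": 10},
        "children": [
            {
                "id": "420100",
                "name": "Wuhan",
                "lastUpdateTime": "2020-03-11 10:00:00",
                "total": {"confirm": 80, "suspect": 0, "heal": 50, "dead": 4, "severe": 8},
            }
        ],
    }
]

FIELDS = ("confirm", "suspect", "heal", "dead", "severe")


def to_row(area):
    """Build one 8-column row in the same order the original insert uses."""
    total = area.get("total") or {}
    return (area.get("id"), area.get("name"),
            *(total.get(field) for field in FIELDS),
            area.get("lastUpdateTime"))


def flatten(provinces):
    """Split the nested province/city structure into two flat row lists."""
    province_rows, city_rows = [], []
    for province in provinces:
        province_rows.append(to_row(province))
        for city in province.get("children") or []:
            city_rows.append(to_row(city))
    return province_rows, city_rows


def save(con, province_rows, city_rows):
    """Insert all rows over a single connection and a single transaction."""
    placeholders = ",".join("?" * 8)  # sqlite3 uses "?", PyMySQL uses "%s"
    with con:  # commits on success, rolls back if an insert fails
        con.executemany(f"insert into provinces values({placeholders})", province_rows)
        con.executemany(f"insert into citys values({placeholders})", city_rows)


# In-memory stand-in for the MySQL database; the column names are assumed.
con = sqlite3.connect(":memory:")
columns = ("id text, name text, confirm int, suspect int, heal int, "
           "dead int, severe int, lastUpdateTime text")
con.execute(f"create table provinces ({columns})")
con.execute(f"create table citys ({columns})")

province_rows, city_rows = flatten(SAMPLE_PROVINCES)
save(con, province_rows, city_rows)
```

With PyMySQL the same idea should carry over by calling cursor.executemany with %s placeholders on one shared connection, followed by a single commit.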

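2. Visualization

The result is shown in the figure below:

Once the data has been scraped, only a few changes to the SQL query from the previous version are needed, along with some minor adjustments to the ECharts format.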
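
The post does not show the query or the chart code, so the following is only a guess at the shape of that step: rows of (name, value) coming back from a query are turned into the list of name/value objects that ECharts map and pie series accept as data. The sample rows and the helper to_echarts_data are hypothetical.

```python
import json

# Hypothetical rows, as they might come back from a query such as
# "select name, confirm from provinces"; the values are made up.
rows = [("Hubei", 100), ("Guangdong", 20)]


def to_echarts_data(rows):
    """Turn (name, value) rows into a list of {"name": ..., "value": ...} dicts."""
    return [{"name": name, "value": value} for name, value in rows]


# ensure_ascii=False keeps Chinese province names readable in the JSON output.
payload = json.dumps(to_echarts_data(rows), ensure_ascii=False)
print(payload)
```

The resulting JSON string can be returned to the page and assigned to the data field of a series in the ECharts option.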

3. PSP table for the learning and implementation process

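Date	Start time	End time	Interruption	Net time	Activity	Notes
3.10	15:35	17:35	10min	1h50min	Learning how to use jsoup	Watched videos and gained a general understanding of jsoup
3.11	9:50	10:50	5min	55min	Hands-on practice with jsoup	Successfully scraped images from a web page by following a video example
3.11	13:30	15:30	0	2h	Scraping the data with jsoup	Content generated dynamically by JavaScript could not be captured; found a workaround using the PhantomJS plugin, read up on it, and tried it
3.11	16:00	17:00	0	1h	Using the PhantomJS plugin	Failed to scrape the data; changed approach and switched to Python for scraping
3.11	19:00	22:00	30min	2h30min	Learning basic Python syntax and scraping techniques	Scraped the data with Python, adapted the given example to store the data in the database, and visualized it with ECharts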
