Tiantian Fund Scraper: Scraping Fund Information from Tiantian Fund to Total Up Holdings by Net Asset Value

Reposted article. 2021-05-28 22:36. Tags: crawler / python

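I. Fetching the page
II. Parsing the data
- 1. Identifying the basic information we need
III. Totaling the holdings
- 1. Steps
- 2. Complete code
IV. If you liked this, please follow! And don't forget to long-press the like button for a one-tap triple!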

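I. Fetching the page

1. Open a fund page

First open the Tiantian Fund site (天天基金網) and pick any fund page, for example 161725, 招商中證白酒指數 (China Merchants CSI Baijiu Index).

Link:
http://fund.eastmoney.com/161725.html

The URL follows the pattern http://fund.eastmoney.com/ + fund code + .html
~~Strong Coconut Palm coconut-juice vibes d(ŐдŐ๑).~~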
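
The URL pattern above can be captured in a small helper. This is only a minimal sketch: the name `fund_url` and the six-digit check are assumptions made for illustration (every code used in this article happens to have six digits), not something the site documents.

```python
import re

BASE_URL = 'http://fund.eastmoney.com/'

def fund_url(code: str) -> str:
    """Build the fund page URL from a fund code (hypothetical helper)."""
    # Assumption: fund codes are exactly six digits, like '161725'.
    if not re.fullmatch(r'\d{6}', code):
        raise ValueError(f'unexpected fund code: {code!r}')
    return BASE_URL + code + '.html'

print(fund_url('161725'))  # http://fund.eastmoney.com/161725.html
```

Keeping the code as a string rather than an int matters here, because codes such as '004432' start with zeros.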

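2. Inspect the page

Right-click, choose Inspect, open the Network tab, and look for the document that returns the fund's details.
It appears to be the file named after the fund code, so that is the one we will fetch.

3. Fetch the page

The library we need is requests, which fetches the page.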

import requests

Next, a classic set of request headers:

header={
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36'
}

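Write a function that fetches the page:

def getfund(code:str):
    url='http://fund.eastmoney.com/'+code+'.html'
    # pass the headers defined above so the request looks like a browser
    page=requests.get(url,headers=header)
    
    print(page.text)

Run it to see the page content: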

if __name__=='__main__':
    getfund('161725')

The Chinese text in the output is garbled: page.text was decoded with the wrong encoding, so we should decode the raw bytes as UTF-8 ourselves.
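
The garbling can be reproduced offline. This sketch assumes the page bytes are UTF-8 and that they were decoded as ISO-8859-1, which is what requests falls back to for text responses whose headers declare no charset. Whether that is exactly what happened here is an assumption, and the sample string simply reuses the fund name from above.

```python
# Sample text standing in for the page body (for illustration only).
original = '招商中證白酒指數'

# A server sends bytes, not text; here they are UTF-8 encoded.
raw = original.encode('utf-8')

# Decoding with the wrong codec gives mojibake instead of an error,
# because every byte is a valid ISO-8859-1 character.
garbled = raw.decode('iso-8859-1')

# Decoding with the right codec recovers the text.
fixed = raw.decode('utf-8')

print(garbled == original)  # False
print(fixed == original)    # True
```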

Decoding as UTF-8:

def getfund(code:str):
    url='http://fund.eastmoney.com/'+code+'.html'
    page=requests.get(url,headers=header)
    
    html=str(page.content,'utf-8')
    # decode the raw bytes in content as UTF-8
    
    print(html)

Run it again: this time the Chinese text comes out correctly.
~~That title is really something ๑乛v乛๑.~~

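II. Parsing the data

1. Identifying the basic information we need

First decide what we want: for the basics, the net asset value (NAV) and the fund name.

Select the NAV in the browser's inspector.

The NAV sits under an element with class dataNums, next to the change percentage.
Reading the markup, the NAV is here:

<span class="ui-font-large ui-color-red ui-num">1.5613</span>

Next we turn the fetched page into a form that is easy to search, using BeautifulSoup.
Then we use find() with the matching attributes to pull out the NAV.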

import requests
from bs4 import BeautifulSoup

header={
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36'
}

def getfund(code:str):
    url='http://fund.eastmoney.com/'+code+'.html'
    page=requests.get(url,headers=header)
    
    html=str(page.content,'utf-8')
    # decode the raw bytes in content as UTF-8
    
    soup=BeautifulSoup(html,'lxml')
    # the NAV is the first span inside the second dd.dataNums block
    value=soup.find_all('dd',{'class':'dataNums'})[1].find('span').getText()
    print(value)

Test it:

Yes, that is the value we wanted.
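
To see how the lookup behaves without hitting the site, the same selectors can be run against a small HTML fragment. This is a sketch under assumptions: the fragment is hand-written to imitate the structure described above (only the first `span` of the second block comes from the real page), the helper name `parse_nav` is hypothetical, and it uses Python's built-in `html.parser` backend so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hand-written fragment imitating the page layout (an assumption);
# the first dd stands in for another dataNums block ahead of the NAV.
SAMPLE = '''
<dd class="dataNums"><span>1.5600</span></dd>
<dd class="dataNums">
  <span class="ui-font-large ui-color-red ui-num">1.5613</span>
  <span class="ui-font-middle ui-color-red ui-num">0.50%</span>
</dd>
'''

def parse_nav(html: str):
    """Return the NAV as a float, or None if the layout is not as expected."""
    soup = BeautifulSoup(html, 'html.parser')
    blocks = soup.find_all('dd', {'class': 'dataNums'})
    if len(blocks) < 2:          # the article reads the second block
        return None
    span = blocks[1].find('span')
    if span is None:
        return None
    try:
        return float(span.get_text().strip())
    except ValueError:           # e.g. a non-numeric placeholder such as '--'
        return None

print(parse_nav(SAMPLE))            # 1.5613
print(parse_nav('<p>no data</p>'))  # None
```

Returning None instead of raising means one fund with an unexpected page layout does not have to abort a loop over many funds.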

The fund name and the date are obtained the same way.

def getfund(code:str):
    url='http://fund.eastmoney.com/'+code+'.html'
    page=requests.get(url,headers=header)
    
    html=str(page.content,'utf-8')
    # decode the raw bytes in content as UTF-8
    
    soup=BeautifulSoup(html,'lxml')
    value=soup.find_all('dd',{'class':'dataNums'})[1].find('span').getText()
    print(value)
    
    # the fund name is the text of the link that points back to this page
    name=soup.find('a',{'href':url,'target':"_self"}).getText()
    print(name)
    
    date=soup.find('dl',{'class':"dataItem02"}).find('p').getText()
    # keep only the date: drop the first 6 characters and the last one
    print(date[6:-1])

Test it:

Good, no problems.
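
The slice `date[6:-1]` depends on the exact length of the label in front of the date. A regular expression is less sensitive to that. This is a sketch: the sample string is an assumption about what the `<p>` text looks like (a label followed by a date in parentheses), and `extract_date` is a hypothetical helper name.

```python
import re

def extract_date(text: str):
    """Return the first YYYY-MM-DD date found in text, or None."""
    match = re.search(r'\d{4}-\d{2}-\d{2}', text)
    return match.group(0) if match else None

# Assumed shape of the <p> text: a label, then the date in parentheses.
sample = '單位凈值 (2021-05-28)'

print(extract_date(sample))  # 2021-05-28
print(sample[6:-1])          # 2021-05-28 (the slice used in the article)
```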

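III. Totaling the holdings

1. Steps

First write a dictionary that stores the codes of the funds we hold and the number of shares of each.
Tip: this is only an example and is not investment advice; any resemblance to real holdings is purely coincidental.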

funds={
    '004432':2673.06,
    '001156':739.65,
    '009265':893.87,
    '160222':2888.71,
    '009821':1000.00,
    '008903':2215.10,
    '161725':2513.26,
    '001475':1781.60,
    '161028':2571.06,
    '270002':2772.19,
    '008168':9905.49}

Then make getfund() return the NAV.

def getfund(code:str):
    url='http://fund.eastmoney.com/'+code+'.html'
    page=requests.get(url,headers=header)
    
    html=str(page.content,'utf-8')
    # decode the raw bytes in content as UTF-8
    
    soup=BeautifulSoup(html,'lxml')
    
    value=soup.find_all('dd',{'class':'dataNums'})[1].find('span').getText()
    name=soup.find('a',{'href':url,'target':"_self"}).getText()
    date=soup.find('dl',{'class':"dataItem02"}).find('p').getText()[6:-1]
    
    print("Fund code:",code,'\nFund name:',name,"\nDate:",date,"NAV:",value)
    return float(value)

We take each code from the dictionary in turn, fetch its NAV, and add up the market value, remembering to keep two decimal places.

if __name__=='__main__':
    total=0
    for code in funds:
        share=funds[code]
        # market value of this holding = shares * NAV
        price=share*getfund(code)
        total+=price
        
        print('Shares:',share,'Market value:','%.2f'%price)
        
    print('Total:','%.2f'%total)

Run it and take a look:
That basically does what we set out to do.
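
The arithmetic can be checked without any network access by passing the NAV lookup in as a function. A minimal sketch: `portfolio_value` and the fake NAV table are made up for illustration, and the numbers are not real prices.

```python
def portfolio_value(holdings, get_nav):
    """Return (per-fund market values, total) for {code: shares} holdings.

    get_nav is any callable mapping a fund code to its NAV as a float,
    for example the getfund() function from this article.
    """
    values = {code: shares * get_nav(code) for code, shares in holdings.items()}
    return values, sum(values.values())

# Made-up NAVs so the example runs offline.
fake_navs = {'161725': 1.5, '009821': 2.0}
holdings = {'161725': 100.0, '009821': 50.0}

values, total = portfolio_value(holdings, fake_navs.__getitem__)
for code, value in values.items():
    print(code, '%.2f' % value)
print('Total:', '%.2f' % total)  # Total: 250.00
```

Separating the lookup from the sum also makes it easy to swap in cached NAVs while testing, so the site is not requested on every run.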

2. Complete code

#by concyclics
import requests
from bs4 import BeautifulSoup

header={
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36'
}

funds={
    '004432':2673.06,
    '001156':739.65,
    '009265':893.87,
    '160222':2888.71,
    '009821':1000.00,
    '008903':2215.10,
    '161725':2513.26,
    '001475':1781.60,
    '161028':2571.06,
    '270002':2772.19,
    '008168':9905.49}

def getfund(code:str):
    url='http://fund.eastmoney.com/'+code+'.html'
    page=requests.get(url,headers=header)
    
    html=str(page.content,'utf-8')
    # decode the raw bytes in content as UTF-8
    
    soup=BeautifulSoup(html,'lxml')
    
    value=soup.find_all('dd',{'class':'dataNums'})[1].find('span').getText()
    name=soup.find('a',{'href':url,'target':"_self"}).getText()
    date=soup.find('dl',{'class':"dataItem02"}).find('p').getText()[6:-1]
    
    print("Fund code:",code,'\nFund name:',name,"\nDate:",date,"NAV:",value)
    return float(value)


if __name__=='__main__':
    total=0
    for code in funds:
        share=funds[code]
        # market value of this holding = shares * NAV
        price=share*getfund(code)
        total+=price
        
        print('Shares:',share,'Market value:','%.2f'%price)
        
    print('Total:','%.2f'%total)

IV. If you liked this, please follow! And don't forget to long-press the like button for a one-tap triple!
