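Python scraper (regex extraction): read fund codes from a spreadsheet, fetch each fund's latest net asset value (NAV), write it back to the same spreadsheet, and compute the gain or loss relative to the most recent purchase NAV (the title is a bit long)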


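Funds have been falling hard lately. Even on a fixed regular investment plan, raising the contribution a little during a sharp drop is one way to lower the average cost.

Checking and calculating by hand every day takes too much time, so I wrote a scraper that fetches the data and does the calculation automatically.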

 

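How it works:

1. Create an Excel workbook and list every fund in the current investment plan in it.

2. The script reads the fund codes from the workbook one by one.

3. For each fund code, it fetches the latest NAV from the "天天基金網" site (fund.eastmoney.com).

4. It writes the fetched NAV and the update date back into the workbook.

5. An Excel formula computes the gain or loss relative to the NAV at the most recent purchase.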
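Step 5 is handled by an Excel formula in the author's workbook, and that formula is not shown in the post. As a rough illustration of the arithmetic only, the same figure can be computed in Python. The function name `pct_change` and the sample numbers are hypothetical and do not come from the author's spreadsheet.

```python
def pct_change(latest_nav: float, purchase_nav: float) -> float:
    """Return the percentage change of latest_nav relative to purchase_nav."""
    if purchase_nav <= 0:
        raise ValueError("purchase NAV must be positive")
    return (latest_nav - purchase_nav) / purchase_nav * 100


# Hypothetical numbers: bought at 2.0000, latest NAV is 1.9000.
print(round(pct_change(1.9, 2.0), 2))  # -5.0
```

A negative result means the fund is currently below the purchase price.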

 

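Later the script could be moved to a cloud server and run automatically every day, sending an email or SMS alert when a fund's drop reaches a set threshold. That is a topic for another time.

This post covers only the code.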

 

Step 1: read the fund codes from the workbook. The script is as follows:

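# Read the fund codes from the workbook
# (xlrd 2.0 and later read only .xls files, so opening this .xlsx needs xlrd < 2.0)
def code():
    wb = xlrd.open_workbook(path + '\\更新數據.xlsx')  # open the Excel file
    data = wb.sheet_by_name('Sheet1')  # get the worksheet by its name
    b = data.col_values(1)  # values of the column at index 1 (the second column), as a list
    codes = []
    for c in b[1:]:  # skip the header row
        d = int(c)
        s = "%06d" % d  # fund codes have 6 digits; numeric cells lose leading zeros, so pad them back
        codes.append(s)
    return codes
code = code()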

The returned data has the following format:

['161005', '161903', '110003', '160222', '000248', '163406']

This step reads from Excel. For details, see: https://www.cnblogs.com/becks/p/11397995.html
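The zero-padding matters because xlrd returns numeric cells as floats, so a code such as 000248 arrives as 248.0. The following is a minimal sketch of that normalization as a standalone helper. The name `normalize_fund_code` and the validation rules are assumptions for illustration and are not part of the author's script.

```python
def normalize_fund_code(cell_value) -> str:
    """Turn a spreadsheet cell value into a six-digit fund code string."""
    if isinstance(cell_value, str):
        text = cell_value.strip()
        if not text.isdigit():
            raise ValueError(f"not a numeric fund code: {cell_value!r}")
        return text.zfill(6)
    # Numeric cells arrive as floats, e.g. 248.0 -> 248 -> '000248'.
    number = int(cell_value)
    if number != cell_value or not 0 <= number <= 999999:
        raise ValueError(f"not a valid fund code: {cell_value!r}")
    return f"{number:06d}"


print([normalize_fund_code(v) for v in (161005.0, 248.0, "110003")])
# ['161005', '000248', '110003']
```

Accepting text cells as well as numeric ones means the helper still works if the code column is formatted as text in Excel.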

 

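Step 2: take each fund code from the list returned in the previous step, build the page URL from it, and scrape the fund information. The script is as follows: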

def data(cookies, headers, params, code):
    valuelist = []
    timelist = []
    for num in code:
        response = requests.get('http://fund.eastmoney.com/' + num + '.html', headers=headers, params=params, cookies=cookies, verify=False)
        response.encoding = 'utf-8'  # set the encoding before reading the text
        response = response.text
        # Get the fund's current NAV
        valueD = re.findall('<span class="fix_dwjz  bold ui-color-green">(.+?)</span>', response, re.S)
        if len(valueD) > 0:  # a match here means the NAV fell today (green), so take the value from valueD
            value = valueD
        else:  # otherwise the NAV rose today (red), so take the value from valueZ
            valueZ = re.findall('<span class="fix_dwjz  bold ui-color-red">(.+?)</span>', response, re.S)
            value = valueZ
        # Get the date of the latest NAV update
        valuetime = re.findall('<div class="titleItems tabBtn titleItemActive" data-date="(.+?)" data-href-more', response, re.S)
        valuelist.append(value[0])
        timelist.append(valuetime)
    return (valuelist, timelist)
test = data(cookies, headers, params, code)

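The returned data looks like this. Nothing is post-processed at this step; the result holds the NAV of every fund together with the update date:

(['3.7090', '2.1096', '2.5932', '1.2488', '3.2830', '2.1112'], ['2021-03-05'])

This step extracts data with regular expressions. For details, see: https://www.cnblogs.com/becks/p/14494929.html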
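The two NAV lookups (green for a fall, red for a rise) can also be folded into a single pattern. The sketch below tries this on a hand-written HTML fragment that is modelled on the patterns in the script above. The fragment is illustrative only and the live page may differ or may have changed since the post was written. The names `NAV_RE`, `DATE_RE` and `extract_nav` are hypothetical.

```python
import re

# One pattern for both colours; \s+ tolerates one or more spaces between classes.
NAV_RE = re.compile(
    r'<span class="fix_dwjz\s+bold\s+ui-color-(?:red|green)">(.+?)</span>', re.S
)
DATE_RE = re.compile(
    r'<div class="titleItems tabBtn titleItemActive" data-date="(.+?)" data-href-more',
    re.S,
)


def extract_nav(html):
    """Return (nav, date) as strings, or None for a part that was not found."""
    nav = NAV_RE.search(html)
    date = DATE_RE.search(html)
    return (nav.group(1) if nav else None, date.group(1) if date else None)


# Hand-made sample, not a copy of the real page.
SAMPLE = (
    '<div class="titleItems tabBtn titleItemActive" data-date="2021-03-05" data-href-more="#">'
    '<span class="fix_dwjz  bold ui-color-green">3.7090</span></div>'
)

print(extract_nav(SAMPLE))           # ('3.7090', '2021-03-05')
print(extract_nav("<html></html>"))  # (None, None)
```

Returning `None` instead of indexing an empty match list lets the caller decide what to do when the page layout does not match, rather than failing with an `IndexError`.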

 

Step 3: process the data and save it to the Excel workbook. The script is as follows:

def save(test):
    xfile = openpyxl.load_workbook(path + '\\更新數據.xlsx')
    sheet1 = xfile.worksheets[0]
    for i in range(len(test[0])):
        sheet1.cell(i + 2, 5).value = test[0][i]  # column 5: latest NAV
    for i in range(len(test[0])):
        sheet1.cell(i + 2, 6).value = test[1][i][0]  # column 6: NAV update date
    xfile.save(path + '\\更新數據.xlsx')
save(test)

This step writes to an existing Excel file. For details, see: https://www.cnblogs.com/becks/p/12250052.html
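Before writing, it can help to reshape the scraped tuple into one record per worksheet row. The sketch below does only that reshaping and does not touch Excel. The helper name `rows_to_write`, the `None` placeholder for a missing date, and the conversion of the NAV to `float` are choices made for this illustration, not the author's.

```python
def rows_to_write(values, dates, first_row=2):
    """Pair each NAV with its date and the worksheet row it belongs in."""
    rows = []
    for offset, (nav, date_matches) in enumerate(zip(values, dates)):
        # date_matches is the list returned by re.findall; it may be empty.
        date = date_matches[0] if date_matches else None
        rows.append((first_row + offset, float(nav), date))
    return rows


# Hypothetical scrape result: the second fund has no date match.
sample = (["3.7090", "2.1096"], [["2021-03-05"], []])
print(rows_to_write(*sample))
# [(2, 3.709, '2021-03-05'), (3, 2.1096, None)]
```

Each tuple could then be written with `sheet1.cell(row, 5).value = nav` and `sheet1.cell(row, 6).value = date`, following the column layout used in the script above.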

 

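After the whole script runs, the NAV and update-date cells in the workbook (boxed in red in the original screenshot) are updated automatically, and the gain/loss column is recalculated from the NAV at purchase.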

 

 

Note that the Excel file must keep the name used in the script and must sit in the same directory as the script.
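The script locates the workbook relative to its own location through `sys.argv[0]`. An equivalent, platform-independent way to build that path with `pathlib` is sketched here. The helper name `workbook_path` and the example script path are hypothetical.

```python
from pathlib import Path


def workbook_path(script_file, name="更新數據.xlsx"):
    """Return the path of the workbook that sits next to the given script file."""
    return Path(script_file).resolve().parent / name


# Hypothetical script location, used only to show the result.
print(workbook_path("/tmp/demo/fund.py").name)
```

Inside a script this would typically be called as `workbook_path(__file__)`, which avoids hard-coding the Windows-style `\\` separator.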

 

Full code:

# -*- coding: utf-8 -*-
import requests
import os
import sys
import re
import time
import random
import openpyxl
import xlrd

path = os.path.abspath(os.path.dirname(sys.argv[0]))

cookies = {
    'intellpositionL': '1010.67px',
    'em_hq_fls': 'js',
    'em-quote-version': 'topspeed',
    'qgqp_b_id': '134fc3fcff10a2a2c4035eebfe40f119',
    'intellpositionT': '455px',
    'HAList': 'a-sh-600909-%u534E%u5B89%u8BC1%u5238%2Ca-sh-605003-%u4F17%u671B%u5E03%u827A%2Ca-sz-300855-%u56FE%u5357%u80A1%u4EFD%2Ca-sz-300015-%u7231%u5C14%u773C%u79D1%2Ca-sh-603939-%u76CA%u4E30%u836F%u623F%2Ca-sz-300677-%u82F1%u79D1%u533B%u7597%2Ca-sh-600036-%u62DB%u5546%u94F6%u884C%2Ca-sz-000860-%u987A%u946B%u519C%u4E1A%2Ca-sz-002352-%u987A%u4E30%u63A7%u80A1%2Ca-sz-002034-%u65FA%u80FD%u73AF%u5883',
    'kforders': '0%3B-1%3B%3B%3B0%2C2%2C24%2C25%2C18%2C19%2C22%2C23%2C21%2C3',
    'Eastmoney_Fund_Transform': 'true',
    'Eastmoney_Fund': '000297_000001_000011',
    'st_si': '89636864382855',
    'st_asi': 'delete',
    'ASP.NET_SessionId': 'kbxmonijodtnjbezidg0jzmq',
    'searchbar_code': '519736_000032_000205_040040_206018_000248_110003_163406_161903_161005',
    'EMFUND0': '02-08%2015%3A15%3A53@%23%24%u6613%u65B9%u8FBE%u4FE1%u7528%u503A%u503A%u5238A@%23%24000032',
    'EMFUND2': '02-08%2015%3A16%3A12@%23%24%u534E%u5B89%u7EAF%u503A%u503A%u5238A@%23%24040040',
    'EMFUND1': '02-08%2015%3A16%3A07@%23%24%u6613%u65B9%u8FBE%u6295%u8D44%u7EA7%u4FE1%u7528%u503A%u503A%u5238A@%23%24000205',
    'EMFUND3': '02-08%2015%3A26%3A18@%23%24%u9E4F%u534E%u4EA7%u4E1A%u503A%u503A%u5238@%23%24206018',
    'EMFUND4': '02-08%2015%3A27%3A25@%23%24%u6C47%u6DFB%u5BCC%u4E2D%u8BC1%u4E3B%u8981%u6D88%u8D39ETF%u8054%u63A5@%23%24000248',
    'EMFUND5': '02-08%2015%3A28%3A40@%23%24%u6613%u65B9%u8FBE%u4E0A%u8BC150%u589E%u5F3AA@%23%24110003',
    'EMFUND6': '02-08%2015%3A29%3A32@%23%24%u5174%u5168%u5408%u6DA6%u6DF7%u5408%28LOF%29@%23%24163406',
    'EMFUND7': '02-08%2015%3A30%3A22@%23%24%u4E07%u5BB6%u884C%u4E1A%u4F18%u9009%u6DF7%u5408%28LOF%29@%23%24161903',
    'EMFUND8': '03-07%2020%3A24%3A02@%23%24%u5357%u65B9%u660C%u5143%u8F6C%u503AA@%23%24006030',
    'st_pvi': '18238554782275',
    'st_sp': '2020-01-23%2014%3A56%3A33',
    'st_inirUrl': 'https%3A%2F%2Fwww.baidu.com%2Flink',
    'st_sn': '5',
    'st_psi': '20210307202806812-112200305282-6714006413',
}

headers = {
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'If-None-Match': 'W/"6044c497-25fbe"',
    'If-Modified-Since': 'Sun, 07 Mar 2021 12:18:31 GMT',
}

params = (
    ('spm', 'search'),
)

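# Read the fund codes from the workbook
# (xlrd 2.0 and later read only .xls files, so opening this .xlsx needs xlrd < 2.0)
def code():
    wb = xlrd.open_workbook(path + '\\更新數據.xlsx')  # open the Excel file
    data = wb.sheet_by_name('Sheet1')  # get the worksheet by its name
    b = data.col_values(1)  # values of the column at index 1 (the second column), as a list
    codes = []
    for c in b[1:]:  # skip the header row
        d = int(c)
        s = "%06d" % d  # fund codes have 6 digits; numeric cells lose leading zeros, so pad them back
        codes.append(s)
    return codes
code = code()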

def data(cookies, headers, params, code):
    valuelist = []
    timelist = []
    for num in code:
        response = requests.get('http://fund.eastmoney.com/' + num + '.html', headers=headers, params=params, cookies=cookies, verify=False)
        response.encoding = 'utf-8'  # set the encoding before reading the text
        response = response.text
        # Get the fund's current NAV
        valueD = re.findall('<span class="fix_dwjz  bold ui-color-green">(.+?)</span>', response, re.S)
        if len(valueD) > 0:  # a match here means the NAV fell today (green), so take the value from valueD
            value = valueD
        else:  # otherwise the NAV rose today (red), so take the value from valueZ
            valueZ = re.findall('<span class="fix_dwjz  bold ui-color-red">(.+?)</span>', response, re.S)
            value = valueZ
        # Get the date of the latest NAV update
        valuetime = re.findall('<div class="titleItems tabBtn titleItemActive" data-date="(.+?)" data-href-more', response, re.S)
        valuelist.append(value[0])
        timelist.append(valuetime)
    return (valuelist, timelist)
test = data(cookies, headers, params, code)

def save(test):
    xfile = openpyxl.load_workbook(path + '\\更新數據.xlsx')
    sheet1 = xfile.worksheets[0]
    for i in range(len(test[0])):
        sheet1.cell(i + 2, 5).value = test[0][i]  # column 5: latest NAV
    for i in range(len(test[0])):
        sheet1.cell(i + 2, 6).value = test[1][i][0]  # column 6: NAV update date
    xfile.save(path + '\\更新數據.xlsx')
save(test)

def over(test):
    print("Data updated", test[1][0])
over(test)

 

