程序簡介

利用乘法型的時間序列分解算法預測北京氣溫變化
程序輸入：觀測數據，周期長度，需要往后預測的個數
程序輸出：預測值，模型結構

時間序列分解使用加法模型或乘法模型講原始系列拆分為四部分：長期趨勢變動T、季節變動S(顯式周期，固定幅度、長度的周期波動)、循環變動C(隱式周期，周期長不具嚴格規則的波動)和不規則變動L。本例使用的是乘法模型。

程序/數據集下載

點擊進入下載地址

代碼分析

導入模塊、路徑

# -*- coding: utf-8 -*-
from Module.BuildModel import TimeSeriesSplit
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os

#路徑目錄
baseDir = ''#當前目錄
staticDir = os.path.join(baseDir,'Static')#靜態文件目錄
resultDir = os.path.join(baseDir,'Result')#結果文件目錄

讀取北京氣象數據，分為測試集和訓練集，查看內容，程序主要使用平均溫度數據

#讀取數據
data = pd.read_excel(staticDir+'/北京氣象數據.xlsx')
#前24個做訓練數據，后面的做測試數據
train = data[0:24]['平均溫度(℃)'].values
test = data[24:]['平均溫度(℃)'].values
data.head()

	城市	日期	最低溫度(℃)	最高溫度(℃)	平均溫度(℃)	濕度(%)	風速(m/s)	風級(級)	氣壓(hpa)	能見度(km)	總降水量(mm)	平均總雲量(%)
0	北京市	2017年03月	-1.2	19.1	9.0	38	2.5	2	1018	14.2	5.7	64
1	北京市	2017年04月	5.4	33.1	17.4	35	2.5	2	1008	14.5	0.0	66
2	北京市	2017年05月	12.5	35.4	23.2	41	2.6	2	1006	14.6	15.5	65
3	北京市	2017年06月	12.7	38.0	25.5	50	2.3	2	1003	14.3	60.2	73
4	北京市	2017年07月	19.8	36.3	27.9	70	1.8	2	1001	9.2	48.7	77

進行時間分解建模，使用的TimeSeriesSplit類位於項目文件的Modlue/BuildModel.py中，下文將給出代碼

本文使用的時間分解乘法模型，其公式為D=T×S×C×I，D為預測值，T為長期趨勢，S為季節性因子，C為周期波動(不容易得出，往往由研究者自定或忽略)，I為隨機波動(無法計算，忽略)

而T長期趨勢是由單獨的一個模型求出，本程序使用的是線性模型

#建模
EMA = 12#周期長度，即12個月
model = TimeSeriesSplit(train,EMA)
#預測
result = model.predict(test.shape[0])
print('季節性因子',np.round(result['seasonFactor']['value'],2))
print('長期趨勢系數和截距',np.round(result['Ta']['value'],2),np.round(result['Tb']['value'],2))
print('預測值',np.round(result['predict']['value'],2))

季節性因子 [ 0.61  1.14  1.6   1.96  2.03  2.01  1.66  0.94  0.34 -0.01 -0.23 -0.05]
長期趨勢系數和截距 -0.52 19.75
預測值 [ 4.45  7.76 10.09 11.34 10.69  9.55  7.01  3.48  1.08 -0.02 -0.49]

這部分是上文使用的TimeSeriesSplit類對序列進行時間分解乘法建模，輸入為原序列和周期長度(這里為12，因為1年12個月)。該代碼塊可以直接運行用來進行小實驗測試。

# -*- coding: utf-8 -*-
from sklearn.linear_model import LinearRegression
import pandas as pd
import numpy as np

class TimeSeriesSplit():
    def __init__(self,series,EMA):
        '''
        時間序列分解算法，乘法模型，由於循環波動難以確認，受隨機因素影響大，不予考慮
        series：時間序列
        EMA：移動平均項數，也是周期的時長        
        '''
        self.buildModel(series,EMA)
    
    def predict(self,num):
        '''
        往后預測num個數，返回的是整個模型的信息
        num：預測個數
        '''
        result = []
        for i in range(num):
            #季節因子
            S = self.seasFactors[(i+len(self.series))%len(self.seasFactors)]
            #長期趨勢
            T = self.regression.predict(i+len(self.series))[0][0]
            result.append(T*S)
        info = {
                'predict':{'value':result,'desc':'往后預測的%s個數'%num},
                'Ta':{'value':self.regression.coef_[0][0],'desc':'長期趨勢線性模型的系數'},
                'Tb':{'value':self.regression.intercept_[0],'desc':'長期趨勢線性模型的截距'},
                'seasonFactor':{'value':self.seasFactors,'desc':'季節因子'},
                }
        return info
        
    
    def buildModel(self,series,EMA):
        '''
        建模，預測
        series：時間序列
        EMA：移動平均項數，也是周期的時長
        '''
        series = np.array(series).reshape(-1)
        #移動平均數
        moveSeies = self.calMoveSeries(series,EMA)
        #季節因子
        seasonFactors = self.calSeasonFactors(series,moveSeies,EMA)
        #長期趨勢建模
        regression = self.buildLongTrend(series)
        #收尾，設置對象屬性
        self.series = series
        self.seasFactors = seasonFactors
        self.regression = regression

    def calMoveSeries(self,series,EMA):
        '''
        計算移動平均數
        series：時間序列
        EMA：移動平均項數，也是周期的時長
        '''
        #計算移動平均
        moveSeries = []
        for i in range(0,series.shape[0]-EMA+1):
            moveSeries.append(series[i:i+EMA].mean())
        moveSeries = np.array(moveSeries).reshape(-1)
        #如果項數為復數，則移動平均后數據索引無法對應原數據，要進行第2次項數為2的移動平均
        if EMA % 2 == 0:
            moveSeries2 = []
            for i in range(0,moveSeries.shape[0]-2+1):
                moveSeries2.append(moveSeries[i:i+2].mean())
            moveSeries = np.array(moveSeries2).reshape(-1)
        return moveSeries

    def calSeasonFactors(self,series,moveSeries,EMA):
        '''
        計算季節性因子
        series：時間序列
        moveSeries：移動平均數
        EMA：移動平均項數，也是周期的時長
        '''
        #移動平均后的第一項索引對應原數據的索引
        startIndex = int((series.shape[0] - moveSeries.shape[0])/2)
        #觀測值除以移動平均值
        factors = []
        for i in range(len(moveSeries)):
            factors.append(series[startIndex+i]/moveSeries[i])
        #去掉尾部多余部分
        rest = len(factors)%EMA
        factors = factors[:len(factors)-rest]
        factors = np.array(factors).reshape(-1,EMA)

        #平均值可能不是1,調整
        seasonFactors = factors.mean(axis=0)/factors.mean()
        #按季順序進行索引調整
        seasonFactors = seasonFactors[startIndex:].reshape(-1).tolist() + seasonFactors[:startIndex].reshape(-1).tolist()
        seasonFactors = np.array(seasonFactors).reshape(-1)
        return seasonFactors
    
    def buildLongTrend(self,series):
        '''
        計算長期趨勢
        series：時間序列
        '''
        #建立線性模型
        reg = LinearRegression()
        #季節索引從0開始
        index = np.array(range(series.shape[0])).reshape(-1,1)
        reg.fit(index,series.reshape(-1,1))
        return reg
    
if __name__ == "__main__":
    import matplotlib.pyplot as plt
    #https://wenku.baidu.com/view/89933c24a417866fb94a8e0a.html?from=search
    #https://blog.csdn.net/weixin_40159138/article/details/90603344
    #銷售數據
    data = [
            3017.60,3043.54,2094.35,2809.84,
            3274.80,3163.28,2114.31,3024.57,
            3327.48,3439.48,3493.93,3490.79,
            3685.08,3661.23,2378.43,3459.55,
            3849.63,3701.18,2642.38,3585.52,
            4078.66,3907.06,2818.46,4089.50,
            4339.61,4148.60,2976.45,4084.64,
            4242.42,3997.58,2881.01,4036.23,
            4360.33,4360.53,3172.18,4223.76,
            4690.48,4694.48,3342.35,4577.63,
            4965.46,5026.05,3470.14,4525.94,
            5258.71,5489.58,3596.76,3881.60            
    ]
    #plt.plot(range(len(data)),data)
    model = TimeSeriesSplit(data,4)
    #往后預測4個數，也就是1年4個季度的數
    print(model.predict(4))

將預測值和觀測值可視化，因為時間序列分解法適合長期預測，所以只需要一次建模，從圖中可看出序列的趨勢得到很好的擬合，這一點其他時間序列預測算法不易得到，因為大多數算法只適合短期預測

#用來正常顯示中文標簽
plt.rcParams['font.sans-serif']=['SimHei'] 
#用來正常顯示負號
plt.rcParams['axes.unicode_minus']=False
plt.plot(range(len(result['predict']['value'])),result['predict']['value'],label="預測值")
plt.plot(range(len(result['predict']['value'])),test,label="觀測值")
plt.legend()
plt.title('時間序列分解法預測效果')
plt.savefig(resultDir+'/時間序列分解法預測效果.png',dpi=100,bbox_inches='tight')

免責聲明！

本站轉載的文章為個人學習借鑒使用，本站對版權不負任何法律責任。如果侵犯了您的隱私權益，請聯系本站郵箱yoyou2525@163.com刪除。

猜您在找 TensorFlow實現時間序列預測 Time-Series Analysis MATLAB時間序列預測Prediction of time series with NAR neural network 通過 Python 代碼實現時間序列數據的統計學預測模型用python做時間序列預測三：時間序列分解時間序列分解-STL分解法用R分析時間序列(time series)數據用R分析時間序列(time series)數據如何使用Cassandra來存儲time-series類型的數據（二）Superset 1.3圖表篇——Time-series Table