建立ARMAX模型需要運用R的dse包,在R的dse包中The ARMA model representation is general, so that VAR, VARX,ARIMA, ARMAX, ARIMAX can all be considered to be special cases.
數據集為天然氣爐中的天然氣(input)與產生的CO2(output),數據來源為王燕應用時間序列分析第三版附錄表A1-24,首先隊數據做簡要的分析,做出時序圖以及協方差圖
input<-ts(input,frequency = 1,start = c(1,1)) output<-ts(output,frequency = 1,start = c(1,1)) par(mfrow=c(1,2)) plot.ts(input) plot.ts(output)
做兩個變量間的協方差函數需要使用ccf函數
ccf(output,input,lag.max = NULL,type = c("correlation"),plot = TRUE)
可以看到兩變量存在邏輯上的因果關系,變量延遲的階數在3~5左右,因此可以通過dse包進行建模,需要使用的函數有ARMA,TSdata,estMaxLik,forecast
首先使用ARMA函數定義一個模型表達式,注意ARMA只能定義一個模型結構,而不能估計其參數,其定義的結構為
A(L)y(t) = B(L)w(t) + C(L)u(t) + TREND(t)
where
A (axpxp) is the auto-regressive polynomial array.
B (bxpxp) is the moving-average polynomial array.
C (cxpxm) is the input polynomial array. C should be NULL if there is no input
y is the p dimensional output data.
u is the m dimensional control (input) data.
TREND is a matrix the same dimension as y, a p-vector (which gets replicated for each time
period), or NULL.
需要注意定義模型中參數與變量的區別,以及模型設置的初始值設置
By default, elements in parameter arrays are treated as constants if they are exactly 1.0 or 0.0, and
as parameters otherwise. A value of 1.001 would be treated as a parameter, and this is the easiest
way to initialize an element which is not to be treated as a constant of value 1.0. Any array elements
can be fixed to constants by specifying the list constants. Arrays which are not specified in the list
will be treated in the default way. An alternative for fixing constants is the function fixConstants.
因此我們可以如下定義模型
MA <- array(c(1,-1.5,0.5),c(3,1,1)) C <- array(c(0,0,0,-0.5,-0.5,-0.5),c(5,1,1)) AR <- array(c(1,-0.5),c(2,1,1)) TR <-array(c(50),c(1,1,1)) ARMAX <- ARMA(A=AR, B=MA, C=C ,TREND<-TR)
定義好模型后,還需要將數據轉化為需要的格式,這里要使用TSdata函數,把input與output數據存儲在一起
隨后就可以使用estMaxLik函數對模型進行求解了,先說另一個求解函數estVARXls,顧名思義,采用的求解方法為ls,該函數的好處是可以自動定階,如:
armaxdata<-TSdata(input=input,output=output) model <- estVARXls(armaxdata)
> model
neg. log likelihood= 821.601
A(L) =
1-1.705922L1+0.6411281L2+0.2290077L3-0.128453L4-0.0686206L5+0.03279703L6
B(L) =
1
C(L) =
-0.08792481+0.2803458L1-0.2801545L2-0.3750078L3-0.07263863L4+0.5139702L5
如果想要用前面設置的ARMA形式來建模的話則需要使用estMaxLik,該函數采用Maximum likelihood estimation,注意每次計算的結果可能與ARMA設置的初始值有關,如果想要得到更好的結果,需要對該函數參數進行討論。
model <- estMaxLik(armaxdata, ARMAX, algorithm="optim")
> model
neg. log likelihood= 231.8164
TREND= 53.493
A(L) =
1+0.001700993L1
B(L) =
1+1.486967L1+0.7612954L2
C(L) =
0-1.002061L3-1.601826L4
最后還要使用forecast函數對模型進行預測,注意如果input和output一樣長,則不會產生新的未來預測值,但是會產生模型擬合值pred,如果想要對未來進行預測,需要討論conditioning.inputs.forecasts(A time series matrix or list of time series matrices to append to input variables for the forecast periods.)
pr <- forecast(model, conditioning.inputs=inputData(armaxdata))
提取出pred項畫出預測圖
pred<-ts(as.vector(pr$pred),frequency = 1,start = c(1,1)) plot(pred,col="red",type="p") par(new=TRUE) plot(output)
筆者也是ARMAX初學者,若有錯誤請大家指出