Robust regression
Syntax
b = robustfit(X,y)
b = robustfit(X,y,wfun,tune)
b = robustfit(X,y,wfun,tune,const)
[b,stats] = robustfit(...)
Description
b = robustfit(X,y) estimates the linear model y = Xb by robust regression and returns the vector b of regression coefficients. X is an n-by-p matrix of predictor variables and y is an n-by-1 vector of observations. The fit is computed by iteratively reweighted least squares with the bisquare weighting function. By default, robustfit adds a column of ones to X; this column corresponds to the constant term, which is the first element of b. Do not add a column of ones to X yourself; to change this behavior, use the 'const' argument described in the third form below. robustfit treats NaNs in X or y as missing values and removes them before fitting.
b = robustfit(X,y,wfun,tune) adds a weighting function 'wfun' and a tuning constant 'tune'. The residual vector is divided by 'tune' before the weights are computed; if 'wfun' is specified as a function handle, 'tune' is required. The weighting function 'wfun' can be any of the functions in the following table:
| Weight Function | Equation | Default Tuning Constant |
|---|---|---|
| 'andrews' | w = (abs(r)<pi) .* sin(r) ./ r | 1.339 |
| 'bisquare' (default) | w = (abs(r)<1) .* (1 - r.^2).^2 | 4.685 |
| 'cauchy' | w = 1 ./ (1 + r.^2) | 2.385 |
| 'fair' | w = 1 ./ (1 + abs(r)) | 1.400 |
| 'huber' | w = 1 ./ max(1, abs(r)) | 1.345 |
| 'logistic' | w = tanh(r) ./ r | 1.205 |
| 'ols' | Ordinary least squares estimation (no weighting function) | None |
| 'talwar' | w = 1 * (abs(r)<1) | 2.795 |
| 'welsch' | w = exp(-(r.^2)) | 2.985 |
b = robustfit(X,y,wfun,tune,const) adds a 'const' argument that controls whether the model includes a constant term. The default is 'on' (include the constant term).
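For instance, a minimal sketch (the data below are invented purely for illustration) of passing a weighting function, a tuning constant, and the const flag:

x = (1:20)';
y = 3*x + randn(20,1); y(20) = 100;              % one gross outlier
b1 = robustfit(x, y, 'huber');                   % Huber weights with the default tune (1.345)
b2 = robustfit(x, y, 'huber', 2.0);              % larger tuning constant: outliers are downweighted less
b3 = robustfit(x, y, 'bisquare', 4.685, 'off');  % no constant term, so b3 contains only the slope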
[b,stats] = robustfit(...) also returns a STATS structure containing the following fields:
'ols_s'      sigma estimate (RMSE) from the least squares fit
'robust_s'   robust estimate of sigma
'mad_s'      MAD estimate of sigma; used for scaling residuals during the iterative fitting
's'          final estimate of sigma: the larger of robust_s and a weighted average of ols_s and robust_s
'se'         standard errors of the coefficient estimates
't'          ratio of b to stats.se
'p'          p-values for stats.t
'covb'       estimated covariance matrix for the coefficient estimates
'coeffcorr'  estimated correlation of the coefficient estimates
'w'          vector of weights from the robust fit
'h'          vector of leverage values from the least squares fit
'dfe'        degrees of freedom for error
'R'          R factor in the QR decomposition of the X matrix
The ROBUSTFIT function estimates the variance-covariance matrix of the coefficient estimates as V=inv(X'*X)*STATS.S^2. The standard errors and correlations are derived from V.
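A short sketch of reading these fields, reusing the data from the example below:

x = (1:10)';
y = 10 - 2*x + randn(10,1); y(10) = 0;
[b, stats] = robustfit(x, y);
stats.se                 % standard errors of the coefficient estimates
sqrt(diag(stats.covb))   % the same values, recovered from the covariance matrix
stats.p                  % p-values for the t statistics stats.t
stats.w                  % robust weights; the outlier y(10) receives a small weight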
MATLAB example:
x = (1:10)';
y = 10 - 2*x + randn(10,1); y(10) = 0;
Fitting with ordinary least squares and with robust regression gives:
bls = regress(y,[ones(10,1) x])
bls =
7.2481
-1.3208
brob = robustfit(x,y)
brob =
9.1063
-1.8231
Plot the results:
scatter(x,y,'filled'); grid on; hold on
plot(x,bls(1)+bls(2)*x,'r','LineWidth',2);
plot(x,brob(1)+brob(2)*x,'g','LineWidth',2)
legend('Data','Ordinary Least Squares','Robust Regression')

A MATLAB implementation of an example from the web:
Estimator (for k > 0): the weight function implemented below is

w(r) = 1                 if -(k-1) <= r <= k
w(r) = (k-1)^4 / r^4     if r < -(k-1)
w(r) = k^4 / r^4         if r > k
MATLAB implementation:
function wf = robust(x, y, k)
% Robust regression via iteratively reweighted least squares (M-estimation)
% find starting values using ordinary least squares
w = x\y;
r = y - x*w;
scale = 1;
% optionally, estimate the scale with the MAD:
% scale = median(abs(r - median(r)))/0.6745;
cvg = 1;  % convergence measure
while (cvg > 1e-5)
    r = r/scale;
    wf = w;             % save the current coefficients
    WH = wfun(r, k);    % weights w(r) = psi(r)./r
    % do weighted least squares
    yst = y.*sqrt(WH);
    xst = bsxfun(@times, x, sqrt(WH));  % scale each row of x by sqrt(WH)
    w = xst\yst;
    % the new residuals
    r = y - x*w;
    % relative change in the coefficients
    cvg = max(abs(w - wf)./abs(wf));
end
wf = w;  % return the final coefficient estimates

function W = wfun(r, k)
% piecewise weight function described above
W = zeros(length(r), 1);
for i = 1:length(r)
    if (r(i) >= -(k-1)) && (r(i) <= k)
        W(i) = 1;
    elseif r(i) < -(k-1)
        W(i) = (k-1)^4/(r(i)^4);
    else
        W(i) = k^4/(r(i)^4);
    end
end
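A quick usage sketch, assuming the function above is saved as robust.m and reusing the data from the robustfit example; the choice k = 2 is arbitrary:

x = (1:10)';
y = 10 - 2*x + randn(10,1); y(10) = 0;
X = [ones(10,1) x];    % robust() does not add the constant column, so add it explicitly
b = robust(X, y, 2);   % M-estimates of the intercept and slope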
In addition, the blog post at http://blog.csdn.net/abcjennifer/article/details/7449435# (on the M-estimator as used for fitting geometric models) gives a good explanation of M-estimation.
