Samples
\[x_i=(x_{i1};x_{i2}; ...; x_{im}), \quad \text{with target value } y_i \]
Each sample has \(m\) variables.
Regression surface
\[f(x_i) = x_i^T \omega +b \]
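where \(\omega = (\omega_1; \omega_2; ...; \omega_m)\). To absorb the bias \(b\) into the weight vector, append a constant 1 to each sample and let
\[\hat x_i=(x_{i1};x_{i2}; ...; x_{im}; 1) \]
\[\hat \omega = (\omega_1; \omega_2; ...; \omega_m; b) \]
Then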
\[f(x_i) = \hat x_i^T \hat \omega \]
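The equivalence of the two forms can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the values of `x`, `omega`, and `b` are made up purely for illustration.

```python
import numpy as np

# Hypothetical sample with m = 3 variables, plus made-up weights and bias
x = np.array([1.0, 2.0, 3.0])
omega = np.array([0.5, -1.0, 2.0])
b = 0.7

# Original form: f(x) = x^T omega + b
f_plain = x @ omega + b

# Augmented form: append 1 to x and b to omega, then take a single dot product
x_hat = np.append(x, 1.0)
omega_hat = np.append(omega, b)
f_aug = x_hat @ omega_hat

print(f_plain, f_aug)  # the two values should agree up to rounding
```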
Suppose there are \(n\) samples, and let
\[X = [\hat x_1^T; \hat x_2^T; ...; \hat x_n^T] \]
Then the mean squared error is
\[\frac 1n (X \hat \omega - Y)^T (X \hat \omega - Y) \]
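where \(Y = (y_1; y_2; ...; y_n)\).
Loss function (the constant factor \(\frac 1n\) is dropped, since it does not change the minimizer)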
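As a quick sanity check, the matrix expression for the mean squared error should match the plain average of the squared per-sample errors. This is a sketch under assumed random data; the seed, the sizes, and the weight vector `omega_hat` are arbitrary choices, not fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)                # fixed seed so the run is repeatable
n, m = 5, 2
features = rng.random((n, m))
X = np.column_stack((features, np.ones(n)))   # n x (m+1); row i is x_hat_i^T
Y = rng.random((n, 1))
omega_hat = rng.random((m + 1, 1))            # arbitrary weights, not a fitted solution

# Matrix form: (1/n) (X w - Y)^T (X w - Y)
residual = X @ omega_hat - Y
mse_matrix = (residual.T @ residual).item() / n

# Direct form: mean of the squared per-sample errors
mse_direct = np.mean((X @ omega_hat - Y) ** 2)

print(mse_matrix, mse_direct)
```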
\[J(\hat \omega^*) = (X \hat \omega^* - Y)^T (X\hat \omega^* - Y) \]
Setting the derivative to zero,
\[\frac{{\rm d} J(\hat \omega^*)}{{\rm d} \hat \omega^*} = 2X^T(X \hat \omega^* - Y) = 0 \]
gives
\[\hat \omega^* = (X^TX)^{-1}X^TY \]
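The closed-form solution can be compared against NumPy's own least-squares routine. The sketch below assumes random data with more samples than variables, so that \(X^TX\) is invertible; it solves the linear system with `np.linalg.solve` rather than forming the inverse, which is a common numerical choice and not something the derivation requires.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 2
X = np.column_stack((rng.random((n, m)) * 10, np.ones(n)))  # augmented design matrix
Y = rng.random((n, 1)) * 10

# Normal equations: solve (X^T X) w = X^T Y
omega_normal = np.linalg.solve(X.T @ X, X.T @ Y)

# Reference least-squares solution computed by NumPy
omega_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Derivative of J at the solution, 2 X^T (X w - Y), should be numerically zero
grad = 2 * X.T @ (X @ omega_normal - Y)

print(omega_normal.ravel())
print(omega_lstsq.ravel())
```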
When there are many variables but too few samples (\(m>n\)), \(X^TX\) is not invertible and the solution \(\hat \omega^*\) is not unique.
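The non-uniqueness can be illustrated with a small assumed example in which \(n = 3\) and \(m = 5\). The sketch shows that \(X^TX\) is rank deficient and that adding a null-space direction of \(X\) to one least-squares solution produces a different weight vector with exactly the same predictions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 5                                             # fewer samples than variables
X = np.column_stack((rng.random((n, m)), np.ones(n)))   # shape (3, 6)
Y = rng.random((n, 1))

gram = X.T @ X                                 # (m+1) x (m+1)
rank = np.linalg.matrix_rank(gram)             # at most n, hence below m+1
print("rank of X^T X:", rank, "of", m + 1)

# One least-squares solution (lstsq returns the minimum-norm one)
omega_a, *_ = np.linalg.lstsq(X, Y, rcond=None)

# The last rows of vt from the full SVD span the null space of X
_, _, vt = np.linalg.svd(X)
null_dir = vt[-1].reshape(-1, 1)
omega_b = omega_a + null_dir                   # a different vector with the same fit

print(np.allclose(X @ omega_a, X @ omega_b))
```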
L2 regularization
Introduce an L2 regularization term on \(\hat \omega^*\):
\[\hat J(\hat \omega^*) = (X \hat \omega^* - Y)^T (X\hat \omega^* - Y) + \frac{\lambda}{2} ||\hat \omega^*||_2^2 \]
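The regularization term is the squared 2-norm of \(\hat \omega^*\) multiplied by \(\frac{\lambda}{2}\).
\(\lambda\) controls the relative weight of the regularization term against the squared-error term.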
Setting the derivative to zero,
\[\frac{{\rm d} \hat J(\hat \omega^*)}{{\rm d} \hat \omega^*} = 2X^T(X \hat \omega^* - Y) + \lambda \hat \omega^* = 0 \]
gives
\[\hat \omega^* = (X^TX + \frac{\lambda}{2}I)^{-1}X^TY \]
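The regularized solution stays well defined even when \(m>n\). The sketch below is one way to check this, assuming random data; `ridge_fit` is a hypothetical helper name introduced here for illustration, and it penalizes every component of \(\hat \omega^*\), including \(b\), exactly as the formula is written.

```python
import numpy as np

def ridge_fit(X, Y, lam):
    """Closed-form minimizer of (Xw - Y)^T (Xw - Y) + (lam / 2) * ||w||_2^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + (lam / 2) * np.eye(d), X.T @ Y)

rng = np.random.default_rng(0)
n, m = 3, 5                                             # m > n: X^T X alone is singular
X = np.column_stack((rng.random((n, m)), np.ones(n)))
Y = rng.random((n, 1))
lam = 0.01

omega_ridge = ridge_fit(X, Y, lam)

# Derivative of the regularized loss at the solution should be numerically zero
grad = 2 * X.T @ (X @ omega_ridge - Y) + lam * omega_ridge

print(omega_ridge.ravel())
```

For any \(\lambda > 0\) the matrix \(X^TX + \frac{\lambda}{2}I\) is positive definite, which is why the linear system has exactly one solution here.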
Python program
import numpy as np
import matplotlib.pyplot as plt

# Randomly generate N samples with two variables each
N = 50
feature1 = np.random.rand(N) * 10
feature2 = np.random.rand(N) * 10
# Augmented design matrix X (N x 3): one row per sample, last column of ones for the bias b
X = np.column_stack((feature1, feature2, np.ones(N)))
# Random target values, stored as an N x 1 column vector
temp_Y = np.random.rand(N) * 10
Y = temp_Y.reshape(-1, 1)
# Scatter plot of the samples
fig = plt.figure()
ax1 = fig.add_subplot(projection="3d")  # Axes3D(fig) no longer attaches itself to the figure in recent Matplotlib
ax1.scatter(feature1, feature2, temp_Y)
# Solve for Omega
Lbd = 0.01  # regularization strength lambda
# Unlike the formula above, the last diagonal entry is set to zero so the bias b is not penalized
I = np.eye(3)
I[2, 2] = 0
# Solve (X^T X + lambda/2 * I) Omega = X^T Y instead of forming the inverse explicitly
Omega = np.linalg.solve(X.T @ X + Lbd / 2 * I, X.T @ Y)
# Plot the regression surface
xx = np.linspace(0, 10, num=50)
yy = np.linspace(0, 10, num=50)
xx_1, yy_1 = np.meshgrid(xx, yy)
Omega_h = Omega.T  # shape (1, 3): [omega_1, omega_2, b]
zz_1 = Omega_h[0, 0] * xx_1 + Omega_h[0, 1] * yy_1 + Omega_h[0, 2]
ax1.plot_surface(xx_1, yy_1, zz_1, alpha=0.6, color="r")
plt.show()
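The role of \(\lambda\) can be seen by refitting the same kind of data for several values and watching the norm of the weight vector. This is a sketch with assumed values of \(\lambda\) and a fixed random seed; unlike the program above, it uses the full identity matrix, so \(b\) is penalized as well, following the formula as written.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
X = np.column_stack((rng.random(N) * 10, rng.random(N) * 10, np.ones(N)))
Y = rng.random((N, 1)) * 10

lambdas = [0.0, 1.0, 100.0, 10000.0]   # assumed values, chosen only to span a wide range
norms = []
for lam in lambdas:
    # Closed-form solution (X^T X + lambda/2 * I)^{-1} X^T Y
    omega = np.linalg.solve(X.T @ X + lam / 2 * np.eye(3), X.T @ Y)
    norms.append(float(np.linalg.norm(omega)))
    print(f"lambda={lam:g}  ||omega||_2={norms[-1]:.4f}")
```

A larger \(\lambda\) puts more weight on the penalty term, so the 2-norm of the fitted weights should shrink as \(\lambda\) grows.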