1. One-Dimensional Linear Regression
The best-known way to solve one-dimensional linear regression is the method of least squares.
Problem statement: given a dataset $D=\left \{ \left ( x_{1},y_{1} \right ),\left ( x_{2},y_{2} \right ),\cdots ,\left ( x_{m},y_{m} \right ) \right \}$, one-dimensional linear regression seeks a function $f\left ( x_{i} \right )=wx_{i}+b$ whose values are as close as possible to the $y_{i}$.
Loss function: $$L\left ( w,b \right )=\sum_{i=1}^{m}\left [ f\left ( x_{i} \right )- y_{i} \right ]^{2}$$
Objective: $$\left ( w^{*},b^{*} \right )=\underset{w,b}{argmin}\sum_{i=1}^{m}\left [ f\left ( x_{i} \right )- y_{i} \right ]^{2}=\underset{w,b}{argmin}\sum_{i=1}^{m}\left (y_{i}- wx_{i}-b \right )^{2} $$
Minimizing the loss function is straightforward: set its partial derivatives to zero, i.e.: $$\frac{\partial L\left ( w,b \right ) }{\partial w}=2\sum_{i=1}^{m}\left (y_{i}- wx_{i}-b \right )\left ( - x_{i}\right )\\=2\sum_{i=1}^{m}\left [ wx_{i}^{2} -\left ( y_{i}-b \right )x_{i}\right ]=2\left ( w\sum_{i=1}^{m}x_{i}^{2}- \sum_{i=1}^{m}\left ( y_{i}-b \right )x_{i}\right )=0$$
$$\frac{\partial L\left ( w,b \right ) }{\partial b}= 2 \sum_{i=1}^{m}\left (wx_{i}+b -y_{i}\right )=2\left ( mb- \sum_{i=1}^{m}\left ( y_{i}-wx_{i} \right )\right )=0$$
Solving the two equations above gives:
$$ b= \frac{1}{m}\sum_{i=1}^{m}\left ( y_{i}-wx_{i} \right ) $$
$$w\sum_{i=1}^{m}x_{i}^{2}-\sum_{i=1}^{m}\left ( y_{i}-b \right )x_{i}=0$$
$$w\sum_{i=1}^{m}x_{i}^{2}-\sum_{i=1}^{m}y_{i}x_{i}+ \frac{1}{m}\sum_{i=1}^{m}\left ( y_{i}-wx_{i} \right )\sum_{i=1}^{m}x_{i}=0$$
$$w\sum_{i=1}^{m}x_{i}^{2}-\sum_{i=1}^{m}y_{i}x_{i}+\sum_{i=1}^{m}y_{i}\bar{x}-\frac{w}{m}\left ( \sum_{i=1}^{m}x_{i} \right )^{2}=0$$
$$w\left [ \sum_{i=1}^{m}x_{i}^{2} -\frac{1}{m}\left ( \sum_{i=1}^{m}x_{i} \right )^{2}\right ]=\sum_{i=1}^{m}y_{i}\left ( x_{i}-\bar{x} \right )$$
$$w=\frac{\sum_{i=1}^{m}y_{i}\left ( x_{i}-\bar{x} \right )}{\sum_{i=1}^{m}x_{i}^{2} -\frac{1}{m}\left ( \sum_{i=1}^{m}x_{i} \right )^{2}}$$
where $\bar{x}=\frac{1}{m}\sum_{i=1}^{m}x_{i}$ is the mean of the $x_{i}$.
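The closed-form solution above is easy to check numerically. A minimal sketch in NumPy, on made-up synthetic data (the true slope 2 and intercept 1 are illustrative choices, not from the text):

```python
import numpy as np

# Synthetic data (illustrative only): y = 2x + 1 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, size=100)

m = len(x)
x_bar = x.mean()

# w = sum_i y_i (x_i - x_bar) / (sum_i x_i^2 - (sum_i x_i)^2 / m)
w = np.sum(y * (x - x_bar)) / (np.sum(x ** 2) - np.sum(x) ** 2 / m)
# b = (1/m) sum_i (y_i - w x_i)
b = np.mean(y - w * x)
```

With this little noise the estimates land very close to the true slope and intercept used to generate the data.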
2. Multivariate Linear Regression
Suppose each sample $x_{i}$ has $d$ attributes, i.e.
$x_{i} = \begin{bmatrix}
x_{i}^{\left(1\right )}\\
x_{i}^{\left(2\right )}\\
\vdots \\
x_{i}^{\left(d\right )}
\end{bmatrix}$
We try to learn a regression function $f\left(\mathbf{ x_{i}} \right)=\mathbf{w}^{T}\mathbf{x_{i}}+b$.
The loss function again takes the mean-squared-error form, and least squares can again be used to estimate $\mathbf{w}$ and $b$. To simplify the computation, we absorb $b$ into the same vector as $\mathbf{w}$ and append a column of ones to the data matrix, as follows:
$$\mathbf{w }= \begin{bmatrix}
w_{1}\\
w_{2}\\
\vdots\\
w_{d}\\
b
\end{bmatrix}$$
$$X=\begin{bmatrix}
x_{1}^{\left ( 1 \right )} & x_{1}^{\left ( 2 \right )}&... & x_{1}^{\left ( d \right )} &1 \\
x_{2}^{\left ( 1 \right )} & x_{2}^{\left ( 2 \right )}& ... & x_{2}^{\left ( d \right )} &1 \\
\vdots & \vdots & \ddots & \vdots & \vdots\\
x_{m}^{\left ( 1 \right )} & x_{m}^{\left ( 2 \right )}& ... & x_{m}^{\left ( d \right )} &1
\end{bmatrix}$$
$$X\mathbf{w}=\begin{bmatrix}
f\left ( x_{1} \right )\\
f\left ( x_{2} \right )\\
\vdots\\
f\left ( x_{m} \right )
\end{bmatrix}$$
$$\mathbf{Y}=\begin{bmatrix}
y_{1}\\
y_{2}\\
\vdots\\
y_{m}
\end{bmatrix}$$
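The augmentation above can be sketched directly; the feature values and weights here are arbitrary illustrative numbers:

```python
import numpy as np

# Illustrative data: m = 4 samples, d = 2 attributes (values are made up).
X_raw = np.array([[1.0, 2.0],
                  [2.0, 0.5],
                  [3.0, 1.5],
                  [4.0, 3.0]])
m, d = X_raw.shape

# Append a column of ones so that b becomes the last entry of w.
X = np.hstack([X_raw, np.ones((m, 1))])

# With w = [w_1, ..., w_d, b]^T, X @ w evaluates f(x_i) for every sample at once.
w = np.array([0.5, -1.0, 2.0])  # arbitrary example weights; here b = 2.0
preds = X @ w  # one prediction per row of X
```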
3. Deriving Multivariate Linear Regression
Before deriving the multivariate solution, we first list some identities for traces and matrix derivatives that are used below.
$\mathbf{z}^{T}\mathbf{z}=\sum_{i}z_{i}^{2}$, where $\mathbf{z}$ is a column vector.
If $\mathbf{A}$ and $\mathbf{B}$ are matrices and $tr$ denotes the trace of a matrix, then:
$$tr\left ( \mathbf{AB }\right )=tr\left ( \mathbf{BA} \right )\\
tr\left ( \mathbf{ABC }\right )=tr\left ( \mathbf{CAB} \right )=tr\left ( \mathbf{BCA }\right )$$
If $f\left ( \mathbf{A} \right )=tr\left ( \mathbf{AB} \right )$, then $\bigtriangledown _{A} tr\left ( \mathbf{AB }\right )=\mathbf{B}^{T}$.
$$tr\left (\mathbf{A} \right )=tr\left ( \mathbf{A}^{T} \right )\\
\text{if } a\in \mathbb{R},\quad tr\left ( a \right )=a$$
$$\bigtriangledown _{A}tr\left ( \mathbf{ABA^{T}C} \right )=\mathbf{CAB}+\mathbf{C^{T}AB^{T}}$$
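The trace identities can be sanity-checked numerically with random matrices (shapes chosen so that every product is defined):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 3))
C = rng.normal(size=(3, 3))

# tr(AB) vs tr(BA)
t_ab, t_ba = np.trace(A @ B), np.trace(B @ A)
# Cyclic property: tr(ABC), tr(CAB), tr(BCA)
t_abc = np.trace(A @ B @ C)
t_cab = np.trace(C @ A @ B)
t_bca = np.trace(B @ C @ A)
# tr(A) = tr(A^T) for a square matrix
t_c, t_ct = np.trace(C), np.trace(C.T)
```

All three cyclic permutations give the same trace even though the intermediate products have different shapes.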
From the problem setup we have $L\left (\mathbf{ w} \right )=\frac{1}{2}\left ( X\mathbf{w}-\mathbf{Y} \right )^{T}\left ( X\mathbf{w}-\mathbf{Y} \right )$, so:
$$\begin{aligned}
\bigtriangledown _{w}L\left (\mathbf{ w} \right )&=\frac{1}{2}\bigtriangledown _{w}\left ( X\mathbf{w}-\mathbf{Y} \right )^{T}\left ( X\mathbf{w}-\mathbf{Y} \right )\\
&=\frac{1}{2}\bigtriangledown _{w}\left ( \mathbf{w^{T}}X^{T}-\mathbf{Y^{T}} \right )\left ( X\mathbf{w}-\mathbf{Y} \right )\\
&=\frac{1}{2}\bigtriangledown _{w}\left ( \mathbf{w^{T}}X^{T}X\mathbf{w}- \mathbf{w^{T}}X^{T}\mathbf{Y}-\mathbf{Y}^{T}X\mathbf{w}+\mathbf{Y}^{T}\mathbf{Y}\right ) \\
&=\frac{1}{2}\bigtriangledown _{w}tr\left ( \mathbf{w^{T}}X^{T}X\mathbf{w}- \mathbf{w^{T}}X^{T}\mathbf{Y}-\mathbf{Y}^{T}X\mathbf{w}+\mathbf{Y}^{T}\mathbf{Y}\right )\\
&=\frac{1}{2}\bigtriangledown _{w}tr\left ( \mathbf{w^{T}}X^{T}X\mathbf{w}- \mathbf{w^{T}}X^{T}\mathbf{Y}-\mathbf{Y}^{T}X\mathbf{w}\right )\\
&=\frac{1}{2}\left [ \bigtriangledown _{w}tr \left ( \mathbf{w^{T}}X^{T}X\mathbf{w} \right )-\bigtriangledown _{w}tr \left ( \mathbf{w^{T}}X^{T}\mathbf{Y} \right )-\bigtriangledown _{w}tr \left ( \mathbf{Y}^{T}X\mathbf{w} \right )\right ] \\
&=\frac{1}{2}\left [ \bigtriangledown _{w}tr \left ( \mathbf{w}I\mathbf{w^{T}}X^{T}X \right )-\bigtriangledown _{w}tr \left ( \mathbf{Y}^{T}X\mathbf{w } \right )-\bigtriangledown _{w}tr \left (\mathbf{Y}^{T}X\mathbf{w} \right )\right ]\\
&= \frac{1}{2}\left [ X^{T}X\mathbf{w}+ X^{T}X\mathbf{w} - X^{T}\mathbf{Y} - X^{T}\mathbf{Y}\right ]\\
&=X^{T}\left ( X\mathbf{w}-\mathbf{Y} \right )=0
\end{aligned}$$
Notes:
- The factor of $1/2$ in front of the loss is added purely to simplify the derivative; it does not affect the optimal $w$ and $b$.
- Solving the equation above involves a matrix inverse, so we need $X^{T}X$ to be a full-rank (or positive definite) matrix; in that case $\mathbf{w}=\left ( X^{T}X \right )^{-1}X^{T}\mathbf{Y}$.
- When $X^{T}X$ is not full-rank in a real task, we can use gradient descent to solve for $\mathbf{w}$ instead.
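Both routes can be sketched together: the closed-form normal equation when $X^{T}X$ is invertible, and plain batch gradient descent as the fallback. The data is synthetic and the learning rate and iteration count are arbitrary choices for this sketch:

```python
import numpy as np

# Synthetic data (illustrative only): known weights and intercept plus tiny noise.
rng = np.random.default_rng(2)
m, d = 200, 3
X_raw = rng.normal(size=(m, d))
true_w = np.array([1.5, -2.0, 0.5])
y = X_raw @ true_w + 0.7 + rng.normal(0.0, 0.01, size=m)

X = np.hstack([X_raw, np.ones((m, 1))])  # augmented design matrix

# Closed form w = (X^T X)^{-1} X^T y; np.linalg.solve avoids an explicit inverse.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Batch gradient descent on L(w) = 1/2 ||Xw - y||^2; the gradient is X^T (Xw - y).
w_gd = np.zeros(d + 1)
lr = 0.01
for _ in range(5000):
    w_gd -= lr / m * (X.T @ (X @ w_gd - y))
```

Here both solvers agree; gradient descent only becomes the necessary choice when $X^{T}X$ is singular or too large to factor.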