[Original] Multivariate Linear Regression in Python 3 (Derivation and Source Code)


This is an original article; please credit the source and the original author when reprinting or quoting it!

 

Given a data set $D=\{(x_1,y_1),(x_2,y_2),\cdots,(x_m,y_m)\}$, where $x_i=\left[x_i^{(1)},x_i^{(2)},\cdots,x_i^{(d)}\right]^T$ means each sample has $d$ attributes, the goal is to find a $d$-dimensional column vector $w$ and a constant $b$ such that the predictions of the model $f(x_i)=w^Tx_i+b$ are as close as possible to the true values $y_i$.

 

Let $y={(y_1,y_2,\cdots,y_m)}^T$. To handle $w$ and $b$ together, write $\widehat w=(w;b)$ and let $X$ be the $m\times(d+1)$ matrix whose $i$-th row is $(x_i^T,1)$. The goal is then to minimize $\left\|y-X\widehat w\right\|^2$.
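As a minimal sketch of this setup (the array names and toy values are illustrative and not from the original post), the augmented design matrix can be built in NumPy by appending a column of ones, so that the last entry of $\widehat w$ plays the role of $b$:

import numpy as np

# Hypothetical toy data: m = 4 samples, d = 2 features
X_raw = np.array([[1.0, 2.0],
                  [2.0, 0.5],
                  [3.0, 1.5],
                  [4.0, 3.0]])
y = np.array([5.0, 4.0, 8.0, 12.0])

# Append a column of ones so the bias b becomes the last component of w_hat
X_aug = np.hstack([X_raw, np.ones((X_raw.shape[0], 1))])  # shape: (m, d+1)

# The objective h(w_hat) = ||y - X_aug @ w_hat||^2 for a candidate w_hat
w_hat = np.zeros(X_aug.shape[1])
residual = y - X_aug @ w_hat
print('h(w_hat) =', residual @ residual)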

 

Here the objective is treated as a function of the single "variable" $\widehat w$. Let $h(\widehat w)={(y-X\widehat w)}^T(y-X\widehat w)$; to minimize it, differentiate with respect to $\widehat w$ and take the value of $\widehat w$ at which the derivative is 0. The derivation is as follows:

\begin{align*}
\frac{\partial h(\boldsymbol{\hat w})}{\partial \boldsymbol{\hat w}}
&= \frac{\partial [(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})^T(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})]}{\partial \boldsymbol{\hat w}}\\
&= 2\frac{\partial (\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})^T}{\partial \boldsymbol{\hat w}}(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})\\
&= 2\frac{\partial \boldsymbol y^T}{\partial \boldsymbol{\hat w}}(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})-2\frac{\partial (\boldsymbol X\boldsymbol{\hat {w}})^T}{\partial \boldsymbol{\hat w}}(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}}) \\
&= 0-2\boldsymbol X^T(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})\\
&= 2\boldsymbol X^T(\boldsymbol X\boldsymbol{\hat {w}}-\boldsymbol y)\tag 1\\
\end{align*}
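The step from the third line to the fourth uses two standard matrix-calculus identities, stated here explicitly for reference (in the same layout convention as the derivation above):

\begin{eqnarray*}
\frac{\partial \boldsymbol y^T}{\partial \boldsymbol{\hat w}}=\boldsymbol 0,\qquad
\frac{\partial (\boldsymbol X\boldsymbol{\hat w})^T}{\partial \boldsymbol{\hat w}}=\boldsymbol X^T
\end{eqnarray*}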

 

Setting the above expression to 0 gives:

\begin{eqnarray*}
&&\because 2\boldsymbol X^T(\boldsymbol X\boldsymbol{\hat {w}}-\boldsymbol y)=2\boldsymbol X^T\boldsymbol X\boldsymbol{\hat {w}}-2\boldsymbol X^T\boldsymbol y=0\\
&&\therefore \boldsymbol X^T\boldsymbol X\boldsymbol{\hat {w}}=\boldsymbol X^T\boldsymbol y\\
&&\therefore \boldsymbol{\hat {w}}=(\boldsymbol X^T\boldsymbol X)^{-1}\boldsymbol X^T\boldsymbol y\\
\end{eqnarray*}

Solving directly with this formula is the least squares (normal equation) method. Since $\boldsymbol X^T\boldsymbol X$ is not necessarily invertible, gradient descent is usually used instead, with the following update rule:

\begin{eqnarray*}
\boldsymbol{\hat w}^{(k)}=\boldsymbol{\hat w}^{(k-1)}-\gamma\left.\frac{\partial h(\boldsymbol{\hat w})}{\partial\boldsymbol{\hat w}}\right|_{\boldsymbol{\hat w}=\boldsymbol{\hat w}^{(k-1)}}
\end{eqnarray*}

Here $\gamma$ is the learning rate, and the gradient $\frac{\partial h(\boldsymbol{\hat w})}{\partial\boldsymbol{\hat w}}$ is given by equation (1).
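For comparison, here is a minimal sketch of the closed-form (normal equation) solution in NumPy. It is not part of the original post; the function name is illustrative, and np.linalg.pinv is used so the call still returns a solution when $\boldsymbol X^T\boldsymbol X$ happens to be singular:

import numpy as np

def normal_equation(X, y):
    # Closed-form least squares: w_hat = (X^T X)^{-1} X^T y
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])      # append the bias column
    return np.linalg.pinv(X_aug.T @ X_aug) @ X_aug.T @ y  # pinv handles a singular X^T X

The last entry of the returned vector is the intercept $b$ and the remaining entries are the regression coefficients $w$; on data like that generated below, it should closely match sklearn's LinearRegression.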

 

The full Python 3 code is as follows:

#-*- coding: gbk -*-
# @Author  : zhangchao
# @Date    : 2020-06-26
# @Cnblogs :https://www.cnblogs.com/zhangchaolts

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
  

# Min-max (linear) normalization (expects the expanded X of shape [m, 1, d])
def min_max_normalization(X):
    Xmin = np.zeros((1, X.shape[2]))
    Xmax = np.zeros((1, X.shape[2]))
    for i in range(X.shape[2]):
        Xmin[0,i] = np.min(X[:,0,i])
        Xmax[0,i] = np.max(X[:,0,i])
    return (X - Xmin) / (Xmax - Xmin)


# Zero-mean (z-score) standardization (expects the expanded X of shape [m, 1, d])
def zero_mean_normalization(X):
    mu = np.zeros((1, X.shape[2]))
    sigma = np.zeros((1, X.shape[2]))
    for i in range(X.shape[2]):
        mu[0,i] = np.mean(X[:,0,i])
        sigma[0,i] = np.std(X[:,0,i])
    return (X - mu) / sigma


# Randomly generate 50 samples with 2 features each; coef in the return value holds the true regression coefficients. X.shape: [50, 2], Y.shape: [50], coef.shape: [2]
X, Y, coef = make_regression(n_samples=50, n_features=2, noise=10, bias=10., coef=True)
print('coef:', coef)

# Use sklearn's LinearRegression to obtain the regression coefficients and the intercept
model = LinearRegression()
model.fit(X, Y)
print('model.coef_', model.coef_) # regression coefficients
print('model.intercept_', model.intercept_) # intercept

# Below is the hand-written multivariate linear regression model

# Add a dimension to X and Y to simplify the matrix computations
X = np.expand_dims(X, 1) # [50, 1, 2]
Y = np.expand_dims(Y, 1) # [50, 1]
Y = np.expand_dims(Y, 2) # [50, 1, 1]

#X = min_max_normalization(X)
#X = zero_mean_normalization(X)

# Loss function: mean squared error (MSE)
loss_function = lambda w, intercept: np.mean([(y[0][0] - (x.dot(w)[0][0] + intercept)) ** 2 for x, y in zip(X, Y)])

# Solve for the regression coefficients w and the intercept with (stochastic) gradient descent
def gradient_descent(w, intercept, learning_rate):
    # One pass over the data, updating w and intercept after each sample
    for x, y in zip(X, Y):
        #y_predict = w[0][0] * x[0][0] + w[1][0] * x[0][1] + intercept
        y_predict = x.dot(w)[0][0] + intercept
        diff = y_predict - y
        #w[0][0] -= learning_rate * diff * x[0][0]
        #w[1][0] -= learning_rate * diff * x[0][1]
        w -= learning_rate * 2 * x.transpose() * diff  # gradient of the squared error w.r.t. w is 2 * x^T * diff
        intercept -= learning_rate * diff
    return w, intercept

# Initialize the regression coefficients w and the intercept
w = np.ones((X.shape[-1], 1)) # [2, 1]
intercept = 0

# Record the loss at each step
loss_history = []

for i in range(50):
    loss = loss_function(w, intercept)
    loss_history.append(loss)
    print('step%d loss:' % (i+1), loss)
    w, intercept = gradient_descent(w, intercept, 1.0/(100+i)) # dynamic learning rate, decreasing over time

print('w:', w)
print('intercept:', intercept)

# Plot the loss curve
plt.plot(loss_history)
plt.ylabel('loss')
plt.xlabel('iter count')
plt.title('convergence graph')
plt.savefig('linear_regression.png')
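The update above is per-sample (stochastic). As an alternative, the full-batch gradient $2\boldsymbol X^T(\boldsymbol X\boldsymbol{\hat w}-\boldsymbol y)$ from equation (1) can be applied to all samples at once. The sketch below is not part of the original code; the function name and the default learning rate are illustrative choices:

import numpy as np

def batch_gradient_descent(X_aug, y, learning_rate=0.05, n_iters=500):
    # Full-batch gradient descent on h(w_hat), using the gradient 2 X^T (X w_hat - y) from equation (1)
    m = X_aug.shape[0]
    w_hat = np.zeros(X_aug.shape[1])
    for _ in range(n_iters):
        gradient = 2 * X_aug.T @ (X_aug @ w_hat - y)
        w_hat -= learning_rate * gradient / m  # divide by m so the step size does not grow with the sample count
    return w_hat

Calling it on the augmented data, for example batch_gradient_descent(np.hstack([X, np.ones((X.shape[0], 1))]), Y) with the X and Y generated above (before expand_dims), should give coefficients and an intercept close to sklearn's.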

 

Running the original code above produces the following output:

coef: [84.54421311  5.2545353 ]
model.coef_ [84.9310078   5.48811974]
model.intercept_ 9.434942960315071
step1 loss: 6040.4032177086465
step2 loss: 1102.8264849114787
step3 loss: 283.25257014464876
step4 loss: 123.54863036366649
step5 loss: 85.00921056460024
step6 loss: 73.48634556845407
step7 loss: 69.44523759664635
step8 loss: 67.88178378711535
step9 loss: 67.23948955078853
step10 loss: 66.96354271958877
step11 loss: 66.83946411821033
step12 loss: 66.78031963233637
step13 loss: 66.74975970519239
step14 loss: 66.73219950499299
step15 loss: 66.72078988721178
step16 loss: 66.71243648657185
step17 loss: 66.70569500741604
step18 loss: 66.6998648623189
step19 loss: 66.69459085397719
step20 loss: 66.68968396030029
step21 loss: 66.68503840714563
step22 loss: 66.6805919983039
step23 loss: 66.676306398435
step24 loss: 66.67215691343604
step25 loss: 66.66812695912645
step26 loss: 66.66420493944473
step27 loss: 66.66038241625293
step28 loss: 66.65665300068504
step29 loss: 66.65301166318754
step30 loss: 66.64945429467006
step31 loss: 66.64597742244224
step32 loss: 66.64257802368101
step33 loss: 66.63925340142816
step34 loss: 66.63600110122701
step35 loss: 66.63281885446035
step36 loss: 66.62970453938905
step37 loss: 66.62665615401313
step38 loss: 66.62367179688592
step39 loss: 66.62074965331212
step40 loss: 66.61788798521602
step41 loss: 66.6150851235298
step42 loss: 66.61233946232669
step43 loss: 66.60964945417334
step44 loss: 66.60701360634461
step45 loss: 66.60443047765548
step46 loss: 66.60189867574294
step47 loss: 66.59941685468131
step48 loss: 66.59698371285032
step49 loss: 66.59459799099857
step50 loss: 66.59225847046397
w: [[85.29343409]
 [ 5.25287288]]
intercept: [[9.31683096]]

 

 

The loss curve is shown in the figure (saved as linear_regression.png):

 


 

