[Original] Multivariate Linear Regression in Python 3 (Formula Derivation and Source Code)


This is an original article. Please credit the source and the original author when reposting or quoting its content!

 

Given a data set $D=\{(x_1,y_1),(x_2,y_2),\cdots,(x_m,y_m)\}$, where $x_i=\left[x_i^{(1)},x_i^{(2)},\cdots,x_i^{(d)}\right]^T$ means each sample has $d$ attributes, our goal is to find a $d$-dimensional column vector $w$ and a constant $b$ such that the predictions of the model $f(x_i)=w^Tx_i+b$ are as close as possible to the true values $y_i$.

 

To absorb the bias into the weight vector, let $\widehat w=(w^T,b)^T$ be the $(d+1)$-dimensional vector obtained by appending $b$ to $w$, and let $X$ be the $m\times(d+1)$ matrix whose $i$-th row is $(x_i^T,1)$. Writing $y={(y_1,y_2,\cdots,y_m)}^T$, the goal becomes minimizing $\left\|y-X\widehat w\right\|^2$.

 

Here the objective is treated as a "single-variable" function of $\widehat w$. Let $h(\widehat w)={(y-X\widehat w)}^T(y-X\widehat w)$; to minimize it, take the derivative with respect to $\widehat w$, and the value of $\widehat w$ at which the derivative is zero is the solution. The derivation is as follows:

\begin{align*}
\frac{\partial h(\boldsymbol{\hat w})}{\partial \boldsymbol{\hat w}}
&= \frac{\partial [(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})^T(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})]}{\partial \boldsymbol{\hat w}}\\
&= 2\frac{\partial (\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})^T}{\partial \boldsymbol{\hat w}}(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})\\
&= 2\frac{\partial \boldsymbol y^T}{\partial \boldsymbol{\hat w}}(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})-2\frac{\partial (\boldsymbol X\boldsymbol{\hat {w}})^T}{\partial \boldsymbol{\hat w}}(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}}) \\
&= 0-2\boldsymbol X^T(\boldsymbol y-\boldsymbol X\boldsymbol{\hat {w}})\\
&= 2\boldsymbol X^T(\boldsymbol X\boldsymbol{\hat {w}}-\boldsymbol y)\tag 1\\
\end{align*}
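
In the derivation above the derivatives are taken in denominator layout, so $\frac{\partial\boldsymbol y^T}{\partial\widehat{\boldsymbol w}}=\boldsymbol 0$ (because $\boldsymbol y$ does not depend on $\widehat{\boldsymbol w}$) and $\frac{\partial(\boldsymbol X\widehat{\boldsymbol w})^T}{\partial\widehat{\boldsymbol w}}=\boldsymbol X^T$, which is how the third line simplifies to the fourth.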

 

Setting the expression above to zero gives:

\begin{eqnarray*}
&&\because 2\boldsymbol X^T(\boldsymbol X\boldsymbol{\hat {w}}-\boldsymbol y)=2\boldsymbol X^T\boldsymbol X\boldsymbol{\hat {w}}-2\boldsymbol X^T\boldsymbol y=0\\
&&\therefore \boldsymbol X^T\boldsymbol X\boldsymbol{\hat {w}}=\boldsymbol X^T\boldsymbol y\\
&&\therefore \boldsymbol{\hat {w}}=(\boldsymbol X^T\boldsymbol X)^{-1}\boldsymbol X^T\boldsymbol y\\
\end{eqnarray*}
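
As an aside, the closed-form solution above can be computed directly with NumPy. The following is only a minimal sketch: the names X_aug, w_hat and normal_equation are illustrative assumptions (they are not used in the code later in this post), and the pseudo-inverse stands in for the plain inverse in case $\boldsymbol X^T\boldsymbol X$ is singular.

# Minimal sketch of the normal-equation solution w_hat = (X^T X)^{-1} X^T y.
# X_aug is assumed to already carry a trailing column of ones, so the last
# entry of w_hat plays the role of the intercept b.
import numpy as np

def normal_equation(X_aug, y):
    return np.linalg.pinv(X_aug.T @ X_aug) @ X_aug.T @ y

# Example usage with data shaped like the make_regression output used below:
# X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
# w_hat = normal_equation(X_aug, Y.reshape(-1, 1))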

Solving with the formula above directly is the (ordinary) least-squares approach. Because $\boldsymbol X^T\boldsymbol X$ is not always invertible, the solution is usually found by gradient descent instead; its update rule is:

\begin{eqnarray*}
\widehat{\boldsymbol w}^{\,k}=\widehat{\boldsymbol w}^{\,k-1}-\gamma\left.\frac{\partial h(\widehat{\boldsymbol w})}{\partial\widehat{\boldsymbol w}}\right|_{\widehat{\boldsymbol w}=\widehat{\boldsymbol w}^{\,k-1}}
\end{eqnarray*}

In the formula above, $\gamma$ is the learning rate, and $\frac{\partial h(\widehat{\boldsymbol w})}{\partial\widehat{\boldsymbol w}}$ is given by equation (1).
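
The code below applies this update one sample at a time (stochastic gradient descent). For comparison, a minimal full-batch sketch of the same update, plugging in the gradient from equation (1), could look like the following; the names X_aug, w_hat, gamma and the default step count are illustrative assumptions rather than part of the original code.

# Minimal sketch: full-batch gradient descent for h(w_hat) = ||y - X_aug w_hat||^2,
# assuming X_aug has a trailing column of ones (intercept absorbed into w_hat).
import numpy as np

def batch_gradient_descent(X_aug, y, gamma=1e-3, n_steps=1000):
    w_hat = np.zeros((X_aug.shape[1], 1))
    for _ in range(n_steps):
        grad = 2 * X_aug.T @ (X_aug @ w_hat - y)  # equation (1)
        w_hat -= gamma * grad                     # gamma is the learning rate
    return w_hat

Because the full-batch gradient sums over all samples, gamma generally has to be smaller than a per-sample learning rate (or the gradient averaged over the samples) for the iteration to converge.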

 

The full Python 3 code is as follows:

# -*- coding: utf-8 -*-
# @Author  : zhangchao
# @Date    : 2020-06-26
# @Cnblogs : https://www.cnblogs.com/zhangchaolts

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
  

# Min-max (linear) normalization; expects X with shape [N, 1, d]
def min_max_normalization(X):
    Xmin = np.zeros((1, X.shape[2]))
    Xmax = np.zeros((1, X.shape[2]))
    for i in range(X.shape[2]):
        Xmin[0,i] = np.min(X[:,0,i])
        Xmax[0,i] = np.max(X[:,0,i])
    return (X - Xmin) / (Xmax - Xmin)


# Zero-mean (z-score) normalization; expects X with shape [N, 1, d]
def zero_mean_normalization(X):
    mu = np.zeros((1, X.shape[2]))
    sigma = np.zeros((1, X.shape[2]))
    for i in range(X.shape[2]):
        mu[0,i] = np.mean(X[:,0,i])
        sigma[0,i] = np.std(X[:,0,i])
    return (X - mu) / sigma


# Randomly generate 50 samples with 2 features each; coef holds the true regression coefficients. X.shape: [50, 2], Y.shape: [50], coef.shape: [2]
X, Y, coef = make_regression(n_samples=50, n_features=2, noise=10, bias=10., coef=True)
print('coef:', coef)

# Use sklearn's LinearRegression to obtain the regression coefficients and intercept
model = LinearRegression()
model.fit(X, Y)
print('model.coef_', model.coef_) # regression coefficients
print('model.intercept_', model.intercept_) # intercept

# Below is our own implementation of the multivariate linear regression model

# Add a dimension to X and Y so the per-sample matrix arithmetic is easier
X = np.expand_dims(X, 1) # [50, 1, 2]
Y = np.expand_dims(Y, 1) # [50, 1]
Y = np.expand_dims(Y, 2) # [50, 1, 1]

#X = min_max_normalization(X)
#X = zero_mean_normalization(X)

# The loss function is the mean squared error (MSE)
loss_function = lambda w, intercept: np.mean([(y[0][0] - (x.dot(w)[0][0] + intercept)) ** 2 for x, y in zip(X, Y)])

# Stochastic gradient descent: one pass over the samples, updating the regression
# coefficients w and the intercept after each sample
def gradient_descent(w, intercept, learning_rate):
    for x, y in zip(X, Y):
        #y_predict = w[0][0] * x[0][0] + w[1][0] * x[0][1] + intercept
        y_predict = x.dot(w)[0][0] + intercept
        diff = y_predict - y
        #w[0][0] -= learning_rate * diff * x[0][0]
        #w[1][0] -= learning_rate * diff * x[0][1]
        w -= learning_rate * 2 * x.transpose() * diff
        intercept -= learning_rate * diff
    return w, intercept

# Initialize the regression coefficients w and the intercept
w = np.ones((X.shape[-1], 1)) # [2, 1]
intercept = 0

# Record how the loss changes
loss_history = []

for i in range(50):
    loss = loss_function(w, intercept)
    loss_history.append(loss)
    print('step%d loss:' % (i+1), loss)
    w, intercept = gradient_descent(w, intercept, 1.0/(100+i)) # dynamic learning rate, decreasing over time

print('w:', w)
print('intercept:', intercept)

# Plot the loss curve
plt.plot(loss_history)
plt.ylabel('loss')
plt.xlabel('iter count')
plt.title('convergence graph')
plt.savefig('linear_regression.png')
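
# Optional sanity check (illustrative only): compare with the closed-form
# normal-equation solution sketched earlier in this post. X still has shape
# [50, 1, 2] and Y has shape [50, 1, 1] here, so squeeze them first.
# X2d = X.squeeze(1)
# X_aug = np.hstack([X2d, np.ones((X2d.shape[0], 1))])
# w_hat = np.linalg.pinv(X_aug.T @ X_aug) @ X_aug.T @ Y.squeeze(2)
# print('closed-form [w1, w2, intercept]:', w_hat.ravel())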

 

Running the script produces the following output:

coef: [84.54421311  5.2545353 ]
model.coef_ [84.9310078   5.48811974]
model.intercept_ 9.434942960315071
step1 loss: 6040.4032177086465
step2 loss: 1102.8264849114787
step3 loss: 283.25257014464876
step4 loss: 123.54863036366649
step5 loss: 85.00921056460024
step6 loss: 73.48634556845407
step7 loss: 69.44523759664635
step8 loss: 67.88178378711535
step9 loss: 67.23948955078853
step10 loss: 66.96354271958877
step11 loss: 66.83946411821033
step12 loss: 66.78031963233637
step13 loss: 66.74975970519239
step14 loss: 66.73219950499299
step15 loss: 66.72078988721178
step16 loss: 66.71243648657185
step17 loss: 66.70569500741604
step18 loss: 66.6998648623189
step19 loss: 66.69459085397719
step20 loss: 66.68968396030029
step21 loss: 66.68503840714563
step22 loss: 66.6805919983039
step23 loss: 66.676306398435
step24 loss: 66.67215691343604
step25 loss: 66.66812695912645
step26 loss: 66.66420493944473
step27 loss: 66.66038241625293
step28 loss: 66.65665300068504
step29 loss: 66.65301166318754
step30 loss: 66.64945429467006
step31 loss: 66.64597742244224
step32 loss: 66.64257802368101
step33 loss: 66.63925340142816
step34 loss: 66.63600110122701
step35 loss: 66.63281885446035
step36 loss: 66.62970453938905
step37 loss: 66.62665615401313
step38 loss: 66.62367179688592
step39 loss: 66.62074965331212
step40 loss: 66.61788798521602
step41 loss: 66.6150851235298
step42 loss: 66.61233946232669
step43 loss: 66.60964945417334
step44 loss: 66.60701360634461
step45 loss: 66.60443047765548
step46 loss: 66.60189867574294
step47 loss: 66.59941685468131
step48 loss: 66.59698371285032
step49 loss: 66.59459799099857
step50 loss: 66.59225847046397
w: [[85.29343409]
 [ 5.25287288]]
intercept: [[9.31683096]]

 

 

The loss curve (the convergence graph saved as linear_regression.png) is shown in the figure below; the learned w and intercept are close to sklearn's model.coef_ and model.intercept_ above.

 


 

