This article describes how to implement multivariate linear regression in Python. It is based on Andrew Ng's course videos and Dr. Huang Haiguang's notes.
Now extend the house-price model with more features, such as the number of rooms and the number of floors, giving a model with multiple variables. The features of the model are (x1, x2, ..., xn).

The hypothesis is written as:

h(x) = theta0 + theta1 * x1 + theta2 * x2 + ... + thetan * xn

Introducing x0 = 1, the formula becomes:

h(x) = theta0 * x0 + theta1 * x1 + ... + thetan * xn = theta^T * x
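As a quick illustration (a minimal sketch with hypothetical numbers, not the article's data or parameters), prepending x0 = 1 lets the hypothesis be computed as a single dot product:

import numpy as np

# hypothetical feature vector: x1 = living area, x2 = number of bedrooms
x = np.array([2104.0, 3.0])
x = np.hstack([1.0, x])                 # prepend x0 = 1 for the intercept term
theta = np.array([340.0, 110.0, -6.5])  # hypothetical parameters (theta0, theta1, theta2)
h = theta.dot(x)                        # h(x) = theta^T x
print(h)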
1. Loading the training data
The data format is:
X1,X2,Y
2104,3,399900
1600,3,329900
2400,3,369000
1416,2,232000
Read the data line by line, split each line on commas, and put the result into an np.array.
import numpy as np

# Load the data
def load_exdata(filename):
    data = []
    with open(filename, 'r') as f:
        for line in f.readlines():
            line = line.split(',')
            current = [int(item) for item in line]
            data.append(current)
    return data

data = load_exdata('ex1data2.txt')
data = np.array(data, np.int64)

x = data[:, (0, 1)].reshape((-1, 2))
y = data[:, 2].reshape((-1, 1))
m = y.shape[0]

# Print out some data points
print('First 10 examples from the dataset: \n')
print(' x = ', x[range(10), :], '\ny=', y[range(10), :])
First 10 examples from the dataset:
x = [[2104 3]
[1600 3]
[2400 3]
[1416 2]
[3000 4]
[1985 4]
[1534 3]
[1427 3]
[1380 3]
[1494 3]]
y= [[399900]
[329900]
[369000]
[232000]
[539900]
[299900]
[314900]
[198999]
[212000]
[242500]]
2. Solving for theta with gradient descent
(1) Feature scaling. With multiple features, make sure the features have similar scales; this helps gradient descent converge faster.

The fix is to rescale all the features to roughly the range -1 to 1. The simplest way is (X - mu) / sigma, where mu is the mean and sigma is the standard deviation.
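A minimal sketch of this normalization on a small hypothetical matrix (the featureNormalize function actually used in this article appears in the complete code below):

import numpy as np

# hypothetical raw feature matrix: each row is (living area, number of bedrooms)
X = np.array([[2104, 3],
              [1600, 2],
              [2400, 4]], dtype=np.float64)

mu = np.mean(X, axis=0)      # per-column mean
sigma = np.std(X, axis=0)    # per-column standard deviation
X_norm = (X - mu) / sigma    # each column now has zero mean and unit variance
print(X_norm)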


(2) Gradient descent. As in the single-variable case, the goal is to find the set of parameters that minimizes the cost function. The batch gradient descent update for multivariate linear regression is:

theta_j := theta_j - alpha * d/d(theta_j) J(theta)

Taking the derivative gives:

theta_j := theta_j - (alpha/m) * sum_{i=1..m} (h(x^(i)) - y^(i)) * x_j^(i)
(3) Vectorized computation
Vectorizing the computation makes it much faster. How do we turn the update into a vectorized form?
In the multivariate case, the cost function can be written as:

J(theta) = (1/(2*m)) * (X.dot(theta) - y).T.dot(X.dot(theta) - y)

Differentiating with respect to theta gives:

(1/m) * (X.T.dot(X.dot(theta) - y))

Therefore, the update rule for theta is:

theta = theta - (alpha/m) * (X.T.dot(X.dot(theta) - y))
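As a quick sanity check (a minimal sketch with hypothetical random data, separate from the article's code), the vectorized gradient can be compared against the element-wise summation form derived above; the two should agree up to floating-point error:

import numpy as np

np.random.seed(0)
m, n = 5, 3                      # hypothetical small problem size
X = np.random.rand(m, n)         # design matrix (bias column assumed included)
y = np.random.rand(m, 1)
theta = np.random.rand(n, 1)

# vectorized gradient: (1/m) * X^T (X theta - y)
grad_vec = (1 / m) * X.T.dot(X.dot(theta) - y)

# element-wise form: (1/m) * sum_i (h(x^(i)) - y^(i)) * x_j^(i)
grad_sum = np.zeros((n, 1))
for j in range(n):
    for i in range(m):
        grad_sum[j] += (X[i, :].dot(theta) - y[i]) * X[i, j]
grad_sum /= m

print(np.allclose(grad_vec, grad_sum))  # True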
(4) The complete code is as follows:

# Feature scaling
def featureNormalize(X):
    mu = np.zeros((1, X.shape[1]))
    sigma = np.zeros((1, X.shape[1]))
    for i in range(X.shape[1]):
        mu[0, i] = np.mean(X[:, i])    # mean of each column
        sigma[0, i] = np.std(X[:, i])  # standard deviation of each column
    X_norm = (X - mu) / sigma
    return X_norm, mu, sigma

# Compute the cost
def computeCost(X, y, theta):
    m = y.shape[0]
    # equivalent form: J = (np.sum((X.dot(theta) - y)**2)) / (2*m)
    C = X.dot(theta) - y
    J2 = (C.T.dot(C)) / (2 * m)
    return J2

# Gradient descent
def gradientDescent(X, y, theta, alpha, num_iters):
    m = y.shape[0]
    # store the cost at every iteration
    J_history = np.zeros((num_iters, 1))
    for iter in range(num_iters):
        # vectorized update: X.T (3, m) dot (X.dot(theta) - y) (m, 1) -> (3, 1)
        theta = theta - (alpha / m) * (X.T.dot(X.dot(theta) - y))
        J_history[iter] = computeCost(X, y, theta)
    return J_history, theta

iterations = 10000  # number of iterations
alpha = 0.01        # learning rate

x = data[:, (0, 1)].reshape((-1, 2))
y = data[:, 2].reshape((-1, 1))
m = y.shape[0]
x, mu, sigma = featureNormalize(x)
X = np.hstack([x, np.ones((x.shape[0], 1))])  # append the bias column of ones
theta = np.zeros((3, 1))

j = computeCost(X, y, theta)  # initial cost with theta = 0
J_history, theta = gradientDescent(X, y, theta, alpha, iterations)

print('Theta found by gradient descent', theta)
Theta found by gradient descent [[ 109447.79646964]
[ -6578.35485416]
[ 340412.65957447]]
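As an optional sanity check (not part of the original article), the result can be compared with the closed-form least-squares solution from np.linalg.lstsq on the same normalized design matrix; after 10000 iterations the two should be very close:

# closed-form least-squares solution on the same X and y
theta_ls, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print('Theta found by np.linalg.lstsq', theta_ls)
print('Max difference vs gradient descent:', np.max(np.abs(theta_ls - theta)))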
Plot the convergence curve:

import matplotlib.pyplot as plt

plt.plot(J_history)
plt.ylabel('cost')
plt.xlabel('iter count')
plt.title('convergence graph')
plt.show()

Use the model to predict a price:

def predict(data):
    testx = np.array(data)
    testx = (testx - mu) / sigma  # apply the same scaling that was used for training
    testx = np.hstack([testx, np.ones((testx.shape[0], 1))])  # append the bias column
    price = testx.dot(theta)
    print('price is %d ' % (price))

predict([1650, 3])
price is 293081
The complete code is available for download.
