[Deep Learning 01] Linear Regression + PyTorch Implementation

Reposted article. 2022-03-27 13:42, Deep Learning

1. Linear Regression

1.1 Linear Model

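When the input contains d features, the prediction is expressed as:

$$\hat{y} = w_1 x_1 + \cdots + w_d x_d + b$$

Let x be the feature vector of a sample and w the weight vector. The expression above can then be written as:

$$\hat{y} = \mathbf{w}^\top \mathbf{x} + b$$

For a dataset containing n samples, X denotes the feature matrix of all n samples, where each row is a sample and each column is a feature. The predictions can then be written as a matrix product:

$$\hat{\mathbf{y}} = \mathbf{X}\mathbf{w} + b$$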
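
The per-sample and matrix forms of the prediction can be checked against each other with a few lines of PyTorch. This is a minimal sketch: the values of X, w, and b below are made up for illustration and do not come from the article.

```python
import torch

# Hypothetical toy data: n = 3 samples, d = 2 features
X = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])
w = torch.tensor([0.5, -1.0])  # weight vector, shape (d,)
b = 2.0                        # scalar bias

# Per-sample form: y_hat_i = w^T x_i + b
y_hat_loop = torch.stack([torch.dot(w, x) + b for x in X])

# Matrix form: y_hat = Xw + b (b is broadcast to all n samples)
y_hat = torch.matmul(X, w) + b

print(y_hat_loop)
print(y_hat)
```

Both forms give the same vector of n predictions; the matrix form is the one used in the rest of the article because it processes all samples in one call.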

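Given training features X and the corresponding known labels y, the goal of linear regression is to find a weight vector w and a bias b such that, for the features of new samples drawn from the same distribution as X, the error of the predicted labels is as small as possible.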

1.2 Loss Function

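The loss function, also called the cost function, measures the error between the actual value of the target and the predicted value. In regression problems, the most common loss function is the squared error. For sample i:

$$l^{(i)}(\mathbf{w}, b) = \frac{1}{2}\left(\hat{y}^{(i)} - y^{(i)}\right)^2$$

Averaged over the n samples of the training set:

$$L(\mathbf{w}, b) = \frac{1}{n}\sum_{i=1}^{n} l^{(i)}(\mathbf{w}, b)$$

Our goal is to find the values of the parameters w and b that minimize the loss function:

$$\mathbf{w}^*, b^* = \operatorname*{argmin}_{\mathbf{w}, b} L(\mathbf{w}, b)$$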
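
As a small illustration of the squared error loss with the conventional factor of 1/2, the sketch below evaluates it on three made-up predictions and labels (the numbers are assumptions chosen only to make the arithmetic easy to follow).

```python
import torch

y_hat = torch.tensor([2.5, 0.0, 2.0])  # hypothetical predictions
y = torch.tensor([3.0, -0.5, 2.0])     # hypothetical labels

# Per-sample loss: 1/2 * (y_hat_i - y_i)^2
per_sample = (y_hat - y) ** 2 / 2

# Average over the samples
mean_loss = per_sample.mean()

print(per_sample)
print(mean_loss)
```

The first two samples are each off by 0.5 and contribute 0.125; the third is predicted exactly and contributes 0.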

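There are generally two ways to solve this problem:

1> Normal equation (analytic solution). With the bias absorbed into w by appending a column of ones to X, the solution is:

$$\mathbf{w}^* = (\mathbf{X}^\top \mathbf{X})^{-1}\mathbf{X}^\top \mathbf{y}$$

2> Gradient descent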

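(1) Initialize the values of the model parameters, for example randomly;

(2) Randomly draw a minibatch of samples from the dataset, update the parameters in the direction of the negative gradient, and repeat this step:

$$(\mathbf{w}, b) \leftarrow (\mathbf{w}, b) - \frac{\alpha}{n} \sum_{i \in \mathcal{B}} \partial_{(\mathbf{w}, b)}\, l^{(i)}(\mathbf{w}, b)$$

In the formula above, B is the minibatch, l^(i) is the loss on sample i, n is the number of samples in each minibatch, also called the batch size (here n is the batch size, not the dataset size used earlier), and α is the learning rate. The values of n and α must be specified manually in advance rather than learned during training. Parameters of this kind are called hyperparameters, and the process of choosing them is called hyperparameter tuning.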
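
The two steps above can be sketched without autograd by writing the gradient of the squared loss by hand. This is a minimal sketch under stated assumptions: the data are synthetic and noiseless, the true parameters (2, -3.4, 4.2) are borrowed from the examples later in this article, and the values alpha = 0.1, n = 10, and 500 steps are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical noiseless data generated from known parameters
X = torch.randn(100, 2)
true_w, true_b = torch.tensor([2.0, -3.4]), 4.2
y = X @ true_w + true_b

# Step (1): initialize the parameters
w = torch.zeros(2)
b = torch.zeros(1)

alpha, n = 0.1, 10  # learning rate and batch size (hyperparameters)

# Step (2): repeat the minibatch update
for step in range(500):
    idx = torch.randint(0, len(X), (n,))  # random minibatch, sampled with replacement
    Xb, yb = X[idx], y[idx]
    err = Xb @ w + b - yb                 # prediction error, shape (n,)
    grad_w = Xb.T @ err / n               # average gradient of 1/2 * err^2 w.r.t. w
    grad_b = err.mean()                   # average gradient w.r.t. b
    w -= alpha * grad_w
    b -= alpha * grad_b

print(w, b)
```

Dividing the summed gradient by n is what makes the step size independent of the batch size, which is the role of the 1/n factor in the update rule.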

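Comparison of gradient descent and the normal equation:

(1) Gradient descent requires choosing a learning rate and iterating many times, but it scales to large numbers of features and samples and also applies to models that have no closed-form solution;

(2) The normal equation needs no learning rate and no iterations, but it requires solving a d×d linear system in X^T X, which costs roughly O(d^3) and becomes expensive when the number of features d is large, and it applies only to linear least squares.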
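
For comparison, the normal equation can be solved directly in a few lines. This is a sketch under stated assumptions: the data are synthetic and noiseless, the true parameters are borrowed from the examples later in this article, and torch.linalg.solve is assumed to be available, which is the case only in relatively recent PyTorch releases (it is not present in PyTorch 1.1.0).

```python
import torch

torch.manual_seed(0)

# Hypothetical noiseless data generated from known parameters
X = torch.randn(100, 2)
y = X @ torch.tensor([2.0, -3.4]) + 4.2

# Append a column of ones so that the bias is learned as an extra weight
X_aug = torch.cat([X, torch.ones(len(X), 1)], dim=1)

# Solve (X^T X) theta = X^T y; solving the linear system avoids forming an explicit inverse
theta = torch.linalg.solve(X_aug.T @ X_aug, X_aug.T @ y)
w_hat, b_hat = theta[:2], theta[2]

print(w_hat, b_hat)
```

No learning rate or iteration count appears here, but the cost of the solve grows quickly with the number of features.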

1.3 Vectorization for Speed

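To speed up model training, computations can be vectorized, which often yields an order-of-magnitude speedup. The code below gives a simple comparison of the speedup from vectorized computation.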

import math
import time
import numpy as np
import torch
from d2l import torch as d2l

# a and b are 10000-dimensional vectors of all ones
n = 10000
a = torch.ones(n)
b = torch.ones(n)


class Timer:
    def __init__(self):
        """Record multiple running times."""
        self.tik = None
        self.times = []
        self.start()

    def start(self):
        """Start the timer."""
        self.tik = time.time()

    def stop(self):
        """Stop the timer and record the elapsed time in a list."""
        self.times.append(time.time() - self.tik)
        return self.times[-1]

    def avg(self):
        """Return the average time."""
        return sum(self.times) / len(self.times)

    def sum(self):
        """Return the total time."""
        return sum(self.times)

    def cumsum(self):
        """Return the cumulative times."""
        return np.array(self.times).cumsum().tolist()


# Element-by-element addition in a Python for loop
c = torch.zeros(n)
timer = Timer()
for i in range(n):
    c[i] = a[i] + b[i]
print(f'{timer.stop():.5f} sec')

# Vectorized addition in a single call
timer.start()
d = a + b
print(f'{timer.stop():.5f} sec')

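Running the code shows that the vectorized version is indeed dramatically faster.

Note: the measured time for the vectorized d = a + b came out as 0 here. This is most likely a timer-resolution effect: time.time() can have coarse resolution on some platforms (on Windows it is typically around 16 ms), so a very short operation can register as 0. time.perf_counter() provides a higher-resolution clock for this kind of measurement.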
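
One way to obtain a non-zero measurement is to use time.perf_counter() and average over many repetitions. The sketch below is an illustration, not part of the original code; the helper name time_vectorized_add and the repeat count of 1000 are assumptions.

```python
import time
import torch

n = 10000
a = torch.ones(n)
b = torch.ones(n)


def time_vectorized_add(repeats=1000):
    """Return the average time in seconds of one a + b, using time.perf_counter()."""
    start = time.perf_counter()
    for _ in range(repeats):
        d = a + b
    return (time.perf_counter() - start) / repeats


avg = time_vectorized_add()
print(f'{avg:.8f} sec per vectorized add')
```

Averaging over repetitions also smooths out one-off effects such as the first call being slower than later ones.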

2. Implementing Linear Regression from Scratch

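The implementation of linear regression can be summarized in the following steps:

(1) Read (or construct) the data, convert it to the required format and type, and generate the labels;

(2) Initialize the model parameters, and define the model, the loss function, and the optimization algorithm;

(3) Train the model with the optimization algorithm.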

import random
import torch
import numpy as np
from matplotlib import pyplot as plt
from d2l import torch as d2l


# Construct the dataset
def synthetic_data(w, b, num_examples):
    """Generate y = Xw + b + noise."""
    # Random numbers with mean 0 and standard deviation 1; rows are samples, columns are features (len(w) columns)
    X = torch.normal(0, 1, (num_examples, len(w)))  # newer PyTorch versions
    # X = torch.tensor(np.random.normal(0, 1, (num_examples, len(w))), dtype=torch.float32)  # PyTorch 1.1.0
    y = torch.matmul(X, w) + b
    # Noise term with mean 0 and standard deviation 0.01
    y += torch.normal(0, 0.01, y.shape)  # newer PyTorch versions
    # y += torch.tensor(np.random.normal(0, 0.01, y.shape), dtype=torch.float32)  # PyTorch 1.1.0
    return X, y.reshape((-1, 1))


true_w = torch.tensor([2, -3.4])
true_b = 4.2
features, labels = synthetic_data(true_w, true_b, 1000)
print('features:', features[0], '\nlabel:', labels[0])

d2l.set_figsize()
d2l.plt.scatter(features[:, 1].detach().numpy(), labels.detach().numpy(), 1)


# data_iter takes the batch size, feature matrix, and label vector as input and yields minibatches of size batch_size
def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    # The samples are read in random order
    random.shuffle(indices)
    for i in range(0, num_examples, batch_size):
        batch_indices = torch.tensor(indices[i:min(i+batch_size, num_examples)])
        yield features[batch_indices], labels[batch_indices]


batch_size = 10
for X, y in data_iter(batch_size, features, labels):
    print(X, '\n', y)
    break

# Initialize the model parameters
w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)  # newer PyTorch versions
# w = torch.autograd.Variable(torch.tensor(np.random.normal(0, 0.01, size=(2, 1)),
#                                          dtype=torch.float32), requires_grad=True)  # PyTorch 1.1.0
b = torch.zeros(1, requires_grad=True)


# Define the model
def linreg(X, w, b):
    """Linear regression model."""
    return torch.matmul(X, w) + b


# Define the loss function
def squared_loss(y_hat, y):
    """Squared loss (per sample, with a factor of 1/2)."""
    return (y_hat - y.reshape(y_hat.shape))**2 / 2


# Define the optimization algorithm
def sgd(params, lr, batch_size):
    """Minibatch stochastic gradient descent."""
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size
            param.grad.zero_()


# Training loop
lr = 0.03
num_epochs = 3
net = linreg
loss = squared_loss

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y)  # minibatch loss on X and y
        # l has shape (batch_size, 1) rather than being a scalar, so its elements are
        # summed before computing the gradients with respect to [w, b]
        l.sum().backward()
        sgd([w, b], lr, batch_size)  # update the parameters using their gradients
    with torch.no_grad():
        train_l = loss(net(features, w, b), labels)
        print(f'epoch {epoch + 1}, loss {float(train_l.mean()):f}')

print(f'estimation error of w: {true_w - w.reshape(true_w.shape)}')
print(f'estimation error of b: {true_b - b}')

3. Implementing Linear Regression with a Deep Learning Framework (PyTorch)

PyTorch's high-level APIs make it possible to implement linear regression quickly and concisely.

import numpy as np
import torch
from torch import nn  # 'nn' is short for neural network
from torch.utils import data
from d2l import torch as d2l


# Construct the dataset
def synthetic_data(w, b, num_examples):
    """Generate y = Xw + b + noise."""
    # Random numbers with mean 0 and standard deviation 1; rows are samples, columns are features (len(w) columns)
    X = torch.normal(0, 1, (num_examples, len(w)))  # newer PyTorch versions
    # X = torch.tensor(np.random.normal(0, 1, (num_examples, len(w))), dtype=torch.float32)  # PyTorch 1.1.0
    y = torch.matmul(X, w) + b
    # Noise term with mean 0 and standard deviation 0.01
    y += torch.normal(0, 0.01, y.shape)  # newer PyTorch versions
    # y += torch.tensor(np.random.normal(0, 0.01, y.shape), dtype=torch.float32)  # PyTorch 1.1.0
    return X, y.reshape((-1, 1))


true_w = torch.tensor([2, -3.4])
true_b = 4.2
features, labels = synthetic_data(true_w, true_b, 1000)

d2l.set_figsize()
d2l.plt.scatter(features[:, 1].detach().numpy(), labels.detach().numpy(), 1)


# Use the framework's existing API to read the data
def load_array(data_arrays, batch_size, is_train=True):
    """Construct a PyTorch data iterator."""
    dataset = data.TensorDataset(*data_arrays)
    return data.DataLoader(dataset, batch_size, shuffle=is_train)


batch_size = 10
data_iter = load_array((features, labels), batch_size)

print(next(iter(data_iter)))

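# Use a layer predefined by the framework
net = nn.Sequential(nn.Linear(2, 1))

# Initialize the model parameters (equivalent to the manually defined w, b, and network above)
net[0].weight.data.normal_(0, 0.01)  # overwrite the weights with samples from a normal distribution
net[0].bias.data.fill_(0)

# MSELoss computes the mean squared error, also known as the squared L2 norm
loss = nn.MSELoss()

# Instantiate the SGD optimizer
trainer = torch.optim.SGD(net.parameters(), lr=0.03)

# Training
num_epochs = 3  # train for three epochs
for epoch in range(num_epochs):
    for X, y in data_iter:
        l = loss(net(X), y)
        trainer.zero_grad()  # clear the gradients first
        l.backward()
        trainer.step()  # update the model parameters
    l = loss(net(features), labels)
    print(f'epoch {epoch + 1}, loss {l:f}')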

w = net[0].weight.data
print('estimation error of w:', true_w - w.reshape(true_w.shape))
b = net[0].bias.data
print('estimation error of b:', true_b - b)
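
One detail worth checking when comparing the two implementations is that nn.MSELoss (with its default reduction='mean') averages the squared errors without the factor of 1/2 used by squared_loss in section 2. The sketch below illustrates this on made-up numbers; the tensors are assumptions for illustration only.

```python
import torch
from torch import nn

y_hat = torch.tensor([[2.5], [0.0], [2.0]])  # hypothetical predictions
y = torch.tensor([[3.0], [-0.5], [2.0]])     # hypothetical labels

# From-scratch definition from section 2, averaged over the samples
half_squared = ((y_hat - y) ** 2 / 2).mean()

# Framework loss: mean of squared errors, no factor of 1/2
mse = nn.MSELoss()(y_hat, y)

print(half_squared, mse)
```

The framework loss is exactly twice the from-scratch loss, so with the same lr = 0.03 the parameter updates in section 3 are twice as large as in section 2. Both versions still converge on this problem, but the printed loss values are not directly comparable.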

4. Summary of Errors

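1. torch.normal() raises an error. This is caused by the PyTorch version: the parameter format and usage of torch.normal() have changed between versions.

To generate random numbers with mean 0 and standard deviation 1, the following forms can be used under PyTorch 1.9.0 and PyTorch 1.1.0: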

# PyTorch 1.9.0
X = torch.normal(0, 1, (num_examples, len(w)))
# PyTorch 1.1.0 (also works on newer versions)
X = torch.tensor(np.random.normal(0, 1, (num_examples, len(w))), dtype=torch.float32)
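
Another option that avoids the torch.normal() signature altogether is torch.randn, which samples from the standard normal distribution and takes only a shape. The sketch below is a suggestion rather than part of the original code; it should behave the same on old and new PyTorch versions, and the sizes and variable names are hypothetical.

```python
import torch

num_examples, num_features = 1000, 2  # hypothetical sizes
mean, std = 0.0, 1.0

# torch.randn draws from N(0, 1); scale by std and shift by mean for other parameters
X = mean + std * torch.randn(num_examples, num_features)

# The same idea for the noise term with standard deviation 0.01
noise = 0.01 * torch.randn(num_examples)

print(X.shape, X.dtype)
```

The result is already a float32 tensor, so no conversion from NumPy is needed.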

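2. Errors when installing the d2l library. On my work computer a single pip install d2l succeeded, but on my own computer at home I ran into all kinds of errors. After resolving them, I found that most came down to an unreachable package index, missing dependencies, or incompatible library versions.

Installation: conda install d2l or pip install d2l. If the download is too slow, a mirror in China can be used: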

pip install d2l -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com

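Commonly used mirrors in China:

# Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple
# Alibaba Cloud: http://mirrors.aliyun.com/pypi/simple/
# University of Science and Technology of China: https://pypi.mirrors.ustc.edu.cn/simple/
# Huazhong University of Science and Technology: http://pypi.hustunique.com/
# Shandong University of Technology: http://pypi.sdutlinux.org/
# Douban: http://pypi.douban.com/simple/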

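Note: sometimes conda install d2l fails to download the package, while switching to pip succeeds. This is because some packages can only be installed with pip. Anaconda provides over 1,500 packages, including the most popular data science, machine learning, and AI frameworks, but this is only a small fraction of the more than 150,000 packages available on PyPI.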

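Official Python methods for installing .whl and .tar.gz packages:

Installing a .whl package: pip install wheel, then pip install xxx.whl

Installing a .tar.gz package: cd into the extracted directory, then run python setup.py install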

References

[1] Python Error Notes (2): PyTorch's torch.normal() function

[2] Dive into Deep Learning (動手學深度學習), Mu Li
