Fitting Data with a Neural Network

Reposted on 2021-03-10 09:25 · Hands-on Deep Learning (PyTorch)

1 Neurons

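At its core, a neuron is nothing more than a linear transformation of the input (for example, multiplying the input by a number [the weight] and adding a constant [the bias]), followed by a fixed nonlinear function (called the activation function).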

[Figure: a neuron, that is, a linear transformation followed by a nonlinear function]

Mathematically, you can write this as

$o = f(wx + b)$

where $x$ is the input, $w$ the weight, $b$ the bias, and $f$ the activation function.
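The formula can be tried out on plain numbers. The following is a minimal sketch, assuming `tanh` as the activation; the function name `neuron` and the input values are illustrative and do not come from the article.

```python
import math

def neuron(x, w, b, f=math.tanh):
    """Single neuron: the linear transform w * x + b followed by the activation f."""
    return f(w * x + b)

# Illustrative values only (not from the article).
o = neuron(x=2.0, w=0.5, b=-1.0)
print(o)  # tanh(0.5 * 2.0 - 1.0) = tanh(0.0) = 0.0
```

Swapping `f` for another function changes only the nonlinearity; the linear part stays the same.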

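In a multilayer network, the output of one layer of neurons is used as the input to the next layer. Keep in mind that $w_0$, the weight of the first layer, is a matrix and $x$ is a vector! Working with a vector input is what lets $w_0$ hold an entire layer of neurons rather than a single weight.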
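To make the matrix and vector shapes concrete, here is a small sketch of two stacked layers written out by hand. The layer sizes, the random values, and the names `w1`, `b0`, `b1`, `x1` are assumptions chosen for illustration; `tanh` is assumed as the activation.

```python
import torch

torch.manual_seed(0)

n_in, n_out = 3, 4             # illustrative sizes, not from the article
w0 = torch.randn(n_out, n_in)  # one row of weights per neuron in the layer
b0 = torch.randn(n_out)        # one bias per neuron
x = torch.randn(n_in)          # a single input vector

x1 = torch.tanh(w0 @ x + b0)   # output of the first layer, one value per neuron

w1 = torch.randn(1, n_out)
b1 = torch.randn(1)
y = w1 @ x1 + b1               # the second layer consumes the first layer's output

print(x1.shape, y.shape)  # torch.Size([4]) torch.Size([1])
```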

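Activation functions: the simplest unit in a (deep) neural network is a linear operation (scaling plus offset) followed by an activation function. Your previous model had a linear operation, and that linear operation was the entire model. The role of the activation function is to concentrate the outputs of the preceding linear operation into a given range.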
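The squashing effect is easy to see by pushing a few extreme values through some activations. This is a sketch only; the choice of `tanh`, `sigmoid`, and `relu` and the input values are assumptions for illustration.

```python
import torch

# Pretend these are outputs of a preceding linear operation.
z = torch.tensor([-100.0, -1.0, 0.0, 1.0, 100.0])

t = torch.tanh(z)      # squashed into [-1, 1]
s = torch.sigmoid(z)   # squashed into [0, 1]
r = torch.relu(z)      # negatives clipped to 0, positives left unchanged

print(t)
print(s)
print(r)
```

Note that `relu` bounds the output only from below, so not every activation maps into a closed range.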

[Figure: several common and not-so-common activation functions]

2 PyTorch's nn Module

Instantiate nn.Linear and call it as if it were a function:

import torch
import torch.nn as nn
import torch.optim as optim

linear_model = nn.Linear(1, 1) # arguments: input size, output size, bias (default True)
linear_model.weight   # the weight
linear_model.bias   # the bias
linear_model.parameters() # an iterator over all parameters

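Any module in nn is written to produce outputs for a batch of multiple inputs at the same time. So, if you need to run nn.Linear on 10 samples, you can create an input tensor of size B x Nin, where B is the batch size and Nin is the number of input features, and run it through the model once: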

x = torch.ones(10, 1)
linear_model(x)
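The B x Nin convention can be checked on a few raw measurements. The sketch below assumes three values taken from the `t_u` list used later in the article and adds the missing feature axis with `unsqueeze`; the variable names `samples`, `batch`, and `out` are illustrative.

```python
import torch
import torch.nn as nn

linear_model = nn.Linear(1, 1)

samples = torch.tensor([35.7, 55.9, 58.2])  # shape (3,): B values, no feature axis
batch = samples.unsqueeze(1)                # shape (3, 1): B x Nin with Nin = 1

out = linear_model(batch)                   # one output row per input row
print(batch.shape, out.shape)  # torch.Size([3, 1]) torch.Size([3, 1])

# Each row of the batch is processed independently of the others.
first_alone = linear_model(batch[0])
print(torch.allclose(out[0], first_alone))
```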

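Now update the original training code. First replace the earlier handwritten model with nn.Linear(1, 1), then pass the linear model's parameters to the optimizer: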

linear_model = nn.Linear(1, 1)
optimizer = optim.SGD(
    linear_model.parameters(),  # previously you had to create the parameters yourself and pass them as the first argument to optim.SGD
    lr=1e-2)
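To see that the optimizer really holds the model's own parameters, the following sketch runs a single SGD step and checks that both the weight and the bias moved. The toy inputs and targets (`y = 2x`) are assumptions for illustration, not the article's temperature data.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
linear_model = nn.Linear(1, 1)
optimizer = optim.SGD(linear_model.parameters(), lr=1e-2)

x = torch.tensor([[1.0], [2.0], [3.0]])  # toy inputs, shape B x Nin
y = torch.tensor([[2.0], [4.0], [6.0]])  # toy targets (y = 2x)

# Snapshot the weight and bias before the update.
before = [p.detach().clone() for p in linear_model.parameters()]

loss = nn.MSELoss()(linear_model(x), y)
optimizer.zero_grad()   # clear any stale gradients
loss.backward()         # populate .grad on weight and bias
optimizer.step()        # update the parameters in place

after = [p.detach().clone() for p in linear_model.parameters()]
changed = [not torch.equal(a, b) for a, b in zip(before, after)]
print(changed)
```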

Using a neural network instead of the linear model as the approximating function:

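Next we redefine the model and keep everything else, including the loss function, unchanged. We again build the simplest possible neural network: a linear module followed by an activation function, whose output is then fed into another linear module.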

# nn provides a simple way to chain modules through the nn.Sequential container:
seq_model = nn.Sequential(
            nn.Linear(1, 13),
            nn.Tanh(),
            nn.Linear(13, 1))
seq_model
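What the container does can be verified by applying its sub-modules one at a time and comparing with a direct call. This is a sketch; the input `x` and the intermediate names `hidden`, `activated`, and `manual` are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_model = nn.Sequential(
    nn.Linear(1, 13),
    nn.Tanh(),
    nn.Linear(13, 1))

x = torch.ones(10, 1)             # a batch of 10 samples with 1 feature each

hidden = seq_model[0](x)          # first linear module: (10, 1) -> (10, 13)
activated = seq_model[1](hidden)  # tanh applied element-wise, shape unchanged
manual = seq_model[2](activated)  # second linear module: (10, 13) -> (10, 1)

print(hidden.shape, manual.shape)
print(torch.allclose(manual, seq_model(x)))  # True
```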

Sequential also accepts an OrderedDict, which lets you give each module a name:

from collections import OrderedDict

seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 8)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(8, 1))
]))

seq_model
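With named modules, the parameter names become readable and each sub-module can be reached as an attribute. The sketch below lists the parameters of the same model and counts them; the variable names `names` and `n_params` are illustrative.

```python
from collections import OrderedDict

import torch.nn as nn

seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 8)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(8, 1))
]))

# Parameter names are prefixed with the module names given above.
names = [name for name, _ in seq_model.named_parameters()]
for name, param in seq_model.named_parameters():
    print(name, tuple(param.shape))

# 8 weights + 8 biases in the hidden layer, 8 weights + 1 bias in the output layer.
n_params = sum(p.numel() for p in seq_model.parameters())
print(n_params)  # 25

# Sub-modules are also reachable as attributes by name.
print(tuple(seq_model.output_linear.bias.shape))
```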


The complete code for this section:

import torch
import torch.nn as nn
import torch.optim as optim

torch.set_printoptions(edgeitems=2)
torch.manual_seed(2020)


t_c = [0.5,  14.0, 15.0, 28.0, 11.0,  8.0,  3.0, -4.0,  6.0, 13.0, 21.0]
t_u = [35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4]
t_c = torch.tensor(t_c).unsqueeze(1) # add a feature dimension: shape (11,) -> (11, 1)
t_u = torch.tensor(t_u).unsqueeze(1) # add a feature dimension: shape (11,) -> (11, 1)

t_u.shape


n_samples = t_u.shape[0]
n_val = int(0.2 * n_samples)

shuffled_indices = torch.randperm(n_samples)

train_indices = shuffled_indices[:-n_val]
val_indices = shuffled_indices[-n_val:]

train_indices, val_indices


t_u_train = t_u[train_indices]  # training subset of t_u
t_c_train = t_c[train_indices]  # targets in t_c matching the t_u training subset

t_u_val = t_u[val_indices]   # validation subset of t_u
t_c_val = t_c[val_indices]   # targets in t_c matching the t_u validation subset

t_un_train = 0.1 * t_u_train  # rescale the inputs by a factor of 0.1
t_un_val = 0.1 * t_u_val


linear_model = nn.Linear(1, 1) # arguments: input size, output size, bias (default True)
linear_model(t_un_val)


# You now have an nn.Linear instance with one input and one output feature; it needs one weight
linear_model.weight


# and one bias
linear_model.bias


# You can call the module with some input
x = torch.ones(1)
linear_model(x)


# Any module in nn is written to produce outputs for a batch of multiple inputs at once
x = torch.ones(10,1)
linear_model(x)


# That is all you need to do to switch to nn.Linear. Inputs of size B must be reshaped
# to B x Nin, where Nin is 1. You can do that easily with unsqueeze:
t_c = [0.5,  14.0, 15.0, 28.0, 11.0,  8.0,  3.0, -4.0,  6.0, 13.0, 21.0]
t_u = [35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4]
t_c = torch.tensor(t_c).unsqueeze(1) # add a feature dimension: shape (11,) -> (11, 1)
t_u = torch.tensor(t_u).unsqueeze(1) # add a feature dimension: shape (11,) -> (11, 1)

t_u.shape


linear_model = nn.Linear(1, 1)
optimizer = optim.SGD(
    linear_model.parameters(),
    lr=1e-2)


print(linear_model.parameters())
print(list(linear_model.parameters()))


def training_loop(n_epochs, optimizer, model, loss_fn,
                  t_u_train, t_u_val, t_c_train, t_c_val):
    for epoch in range(1, n_epochs + 1):
        # forward pass on the training inputs passed in as arguments
        t_p_train = model(t_u_train)
        loss_train = loss_fn(t_p_train, t_c_train)

        # the validation loss is only monitored, so no gradients are needed
        with torch.no_grad():
            t_p_val = model(t_u_val)
            loss_val = loss_fn(t_p_val, t_c_val)

        optimizer.zero_grad()
        loss_train.backward()
        optimizer.step()

        if epoch == 1 or epoch % 1000 == 0:
            print('Epoch %d, Training loss %.4f, Validation loss %.4f' % (
                    epoch, float(loss_train), float(loss_val)))


linear_model = nn.Linear(1, 1)
optimizer = optim.SGD(linear_model.parameters(), lr=1e-2)

training_loop(
    n_epochs = 3000,
    optimizer = optimizer,
    model = linear_model,
    loss_fn = nn.MSELoss(), # no longer using a hand-written loss function
    t_u_train = t_un_train,
    t_u_val = t_un_val,
    t_c_train = t_c_train,
    t_c_val = t_c_val)

print()
print(linear_model.weight)
print(linear_model.bias)

# nn provides a simple way to chain modules through the nn.Sequential container:
seq_model = nn.Sequential(
            nn.Linear(1, 13),
            nn.Tanh(),
            nn.Linear(13, 1))
seq_model


[param.shape for param in seq_model.parameters()]


for name, param in seq_model.named_parameters():
    print(name, param.shape)

# Sequential also accepts an OrderedDict, which lets you give each module a name:
from collections import OrderedDict

seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 8)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(8, 1))
]))

seq_model


for name, param in seq_model.named_parameters():
    print(name, param.shape)


optimizer = optim.SGD(seq_model.parameters(), lr=1e-3) # learning rate lowered for stability

training_loop(
    n_epochs = 5000,
    optimizer = optimizer,
    model = seq_model,
    loss_fn = nn.MSELoss(),
    t_u_train = t_un_train,
    t_u_val = t_un_val,
    t_c_train = t_c_train,
    t_c_val = t_c_val)

print('output', seq_model(t_un_val))
print('answer', t_c_val)
print('hidden', seq_model.hidden_linear.weight.grad)


from matplotlib import pyplot as plt

t_range = torch.arange(20., 90.).unsqueeze(1)

fig = plt.figure(dpi=100)
plt.xlabel("Fahrenheit")
plt.ylabel("Celsius")
plt.plot(t_u.numpy(), t_c.numpy(), 'o')
plt.plot(t_range.numpy(), seq_model(0.1 * t_range).detach().numpy(), 'c-')
plt.plot(t_u.numpy(), seq_model(0.1 * t_u).detach().numpy(), 'kx')
plt.show()

