1. nn.Module
We can define a model as a class that inherits from nn.Module. Whenever a model is more complex than what a Sequential container can express, subclassing nn.Module is the way to define it.
Implementing just two methods, __init__ and forward, is enough to build a custom network model:
__init__() defines the model architecture, instantiating each layer.
forward() implements the forward pass and returns y_pred.
import torch

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign them as
        member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred
# N is the batch size; D_in, H, D_out are the input, hidden, and output dimensions.
N, D_in, H, D_out = 64, 1000, 100, 10

# Random input and target data.
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = TwoLayerNet(D_in, H, D_out)
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

for t in range(500):
    y_pred = model(x)            # forward pass
    loss = criterion(y_pred, y)
    print(t, loss.item())

    optimizer.zero_grad()  # clear gradients accumulated from the previous step
    loss.backward()        # backward pass
    optimizer.step()       # update the parameters
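Because linear1 and linear2 are assigned as attributes in __init__, nn.Module registers them automatically; that is why model.parameters() above already hands every weight and bias to the optimizer. An illustrative snippet (not part of the original) to see the registered parameters:

# nn.Module tracks submodules assigned as attributes, so both layers'
# weights and biases show up without any manual bookkeeping.
for name, p in model.named_parameters():
    print(name, tuple(p.shape))
# linear1.weight (100, 1000)
# linear1.bias (100,)
# linear2.weight (10, 100)
# linear2.bias (10,)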
2. An Example: FizzBuzz
FizzBuzz is a simple game. The rules: count upward from 1; on a multiple of 3 say "fizz", on a multiple of 5 say "buzz", on a multiple of 15 say "fizzbuzz", and otherwise just say the number.
# Encode the desired output as a class index: 0 -> the number itself,
# 1 -> "fizz", 2 -> "buzz", 3 -> "fizzbuzz".
def fizz_buzz_encode(i):
    if i % 15 == 0: return 3
    elif i % 5 == 0: return 2
    elif i % 3 == 0: return 1
    else: return 0

def fizz_buzz_decode(i, prediction):
    return [str(i), "fizz", "buzz", "fizzbuzz"][prediction]
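A quick sanity check (not in the original): decoding an encoded label should reproduce the game itself.

print([fizz_buzz_decode(i, fizz_buzz_encode(i)) for i in range(1, 16)])
# ['1', '2', 'fizz', '4', 'buzz', 'fizz', '7', '8', 'fizz', 'buzz',
#  '11', 'fizz', '13', '14', 'fizzbuzz']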
First, define the model's inputs and outputs (the training data).
import numpy as np
import torch

NUM_DIGITS = 10

# Represent each input by an array of its binary digits.
def binary_encode(i, num_digits):
    # i >> d shifts the bits of i right by d positions; & 1 then keeps only
    # the lowest bit, so this extracts bit d. [::-1] puts the most
    # significant bit first.
    return np.array([i >> d & 1 for d in range(num_digits)])[::-1]

# Train on 101 .. 2**NUM_DIGITS - 1 = 1023; the numbers 1..100 are held out.
trX = torch.Tensor([binary_encode(i, NUM_DIGITS) for i in range(101, 2 ** NUM_DIGITS)])
trY = torch.LongTensor([fizz_buzz_encode(i) for i in range(101, 2 ** NUM_DIGITS)])  # class indices, hence LongTensor
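For illustration (these prints are not in the original), the encoding and the resulting tensor shapes look like this:

print(binary_encode(11, NUM_DIGITS))  # [0 0 0 0 0 0 1 0 1 1], i.e. 11 = 0b1011
print(trX.shape, trY.shape)           # torch.Size([923, 10]) torch.Size([923])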
Then define the model, the loss function, and the optimizer with PyTorch.
# Define the model
NUM_HIDDEN = 100
model = torch.nn.Sequential(
    torch.nn.Linear(NUM_DIGITS, NUM_HIDDEN),
    torch.nn.ReLU(),
    torch.nn.Linear(NUM_HIDDEN, 4)  # 4 classes: number, fizz, buzz, fizzbuzz
)
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
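CrossEntropyLoss combines LogSoftmax and NLLLoss, so the model ends in raw logits and needs no softmax layer. A quick check of the expected shapes (an illustrative snippet, not from the original):

logits = model(trX[:2])          # raw scores of shape (2, 4)
print(loss_fn(logits, trY[:2]))  # a scalar loss tensor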
The training code follows.
# Start training it
BATCH_SIZE = 128
for epoch in range(10000):
    for start in range(0, len(trX), BATCH_SIZE):
        end = start + BATCH_SIZE
        batchX = trX[start:end]
        batchY = trY[start:end]

        y_pred = model(batchX)
        loss = loss_fn(y_pred, batchY)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Find loss on training data
    loss = loss_fn(model(trX), trY).item()
    print('Epoch:', epoch, 'Loss:', loss)
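Since the numbers 1..100 were held out of the training range, they can serve as a test set; a minimal sketch of that final prediction step (not part of the original section), reusing fizz_buzz_decode from above:

# Evaluate on 1..100, which the model never saw during training.
testX = torch.Tensor([binary_encode(i, NUM_DIGITS) for i in range(1, 101)])
with torch.no_grad():
    testY = model(testX)
predictions = zip(range(1, 101), testY.max(1)[1].tolist())
print([fizz_buzz_decode(i, x) for i, x in predictions])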