When defining a network in PyTorch to recognize handwritten digits, you can define the parameters w and b manually (see the previous section), define the layers directly with nn.Linear, or, most conveniently, subclass nn.Module to define your own network structure.
1. The nn.Linear approach
import torch
import torch.nn as nn
import torch.nn.functional as F

# Simulate one 28*28 image flattened into a vector
x = torch.randn(1, 784)       # shape=[1, 784]

# Define three fully connected layers
layer1 = nn.Linear(784, 200)  # (in, out)
layer2 = nn.Linear(200, 200)
layer3 = nn.Linear(200, 10)

x = layer1(x)                 # shape=[1, 200]
x = F.relu(x, inplace=True)   # inplace=True modifies the tensor in place, saving memory

x = layer2(x)                 # shape=[1, 200]
x = F.relu(x, inplace=True)

x = layer3(x)                 # shape=[1, 10]
x = F.relu(x, inplace=True)
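One detail worth noting (a side note, not from the original): nn.Linear(in, out) stores its weight with shape [out, in], and registers both weight and bias as parameters, which is what .parameters() later hands to an optimizer. A quick check:

import torch.nn as nn

layer1 = nn.Linear(784, 200)
print(layer1.weight.shape)  # torch.Size([200, 784]) -- stored as [out, in]
print(layer1.bias.shape)    # torch.Size([200])
# total parameter count of this layer: 784*200 weights + 200 biases
print(sum(p.numel() for p in layer1.parameters()))  # 157000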
2. Subclassing nn.Module
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

# Hyperparameters
batch_size = 200
learning_rate = 0.01
epochs = 10

# Load the training data
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,  # train=True selects the training set
                   transform=transforms.Compose([         # transform applies preprocessing
                       transforms.ToTensor(),             # convert to a Tensor
                       transforms.Normalize((0.1307,), (0.3081,))  # normalize (subtract the mean, divide by the standard deviation)
                   ])),
    batch_size=batch_size, shuffle=True)  # batches of batch_size with the batch dimension first; shuffle=True shuffles the order

# Load the test data
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=False, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
    ])),
    batch_size=batch_size, shuffle=True)


class MLP(nn.Module):

    def __init__(self):
        super(MLP, self).__init__()

        self.model = nn.Sequential(  # define each layer; nn.ReLU can be swapped for another activation, e.g. nn.LeakyReLU()
            nn.Linear(784, 200),
            nn.ReLU(inplace=True),
            nn.Linear(200, 200),
            nn.ReLU(inplace=True),
            nn.Linear(200, 10),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        x = self.model(x)
        return x


net = MLP()
# Define the SGD optimizer with the parameters to optimize and the learning rate;
# net.parameters() returns the parameters of the network defined by this class: [w1, b1, w2, b2, ...]
optimizer = optim.SGD(net.parameters(), lr=learning_rate)
criteon = nn.CrossEntropyLoss()

for epoch in range(epochs):

    for batch_idx, (data, target) in enumerate(train_loader):
        data = data.view(-1, 28*28)  # flatten the 2D images to [num_samples, 784]

        logits = net(data)              # forward pass
        loss = criteon(logits, target)  # nn.CrossEntropyLoss() applies Softmax internally

        optimizer.zero_grad()  # clear the gradients
        loss.backward()        # backpropagate to compute gradients
        optimizer.step()       # update the parameters

        if batch_idx % 100 == 0:  # print progress every 100 batches
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))

    test_loss = 0
    correct = 0  # correct counts the number of correctly classified samples
    for data, target in test_loader:
        data = data.view(-1, 28 * 28)
        logits = net(data)
        test_loss += criteon(logits, target).item()  # the scalar value of criteon(logits, target)

        pred = logits.data.max(dim=1)[1]  # equivalent to pred = logits.argmax(dim=1)
        correct += pred.eq(target.data).sum()

    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
Train Epoch: 0 [0/60000 (0%)] Loss: 2.307840
Train Epoch: 0 [20000/60000 (33%)] Loss: 2.022810
Train Epoch: 0 [40000/60000 (67%)] Loss: 1.342542
Test set: Average loss: 0.0038, Accuracy: 8374/10000 (84%)
Train Epoch: 1 [0/60000 (0%)] Loss: 0.802759
Train Epoch: 1 [20000/60000 (33%)] Loss: 0.627895
Train Epoch: 1 [40000/60000 (67%)] Loss: 0.482087
Test set: Average loss: 0.0020, Accuracy: 8926/10000 (89%)
Train Epoch: 2 [0/60000 (0%)] Loss: 0.496279
Train Epoch: 2 [20000/60000 (33%)] Loss: 0.420009
Train Epoch: 2 [40000/60000 (67%)] Loss: 0.429296
Test set: Average loss: 0.0017, Accuracy: 9069/10000 (91%)
Train Epoch: 3 [0/60000 (0%)] Loss: 0.304612
Train Epoch: 3 [20000/60000 (33%)] Loss: 0.356296
Train Epoch: 3 [40000/60000 (67%)] Loss: 0.405541
Test set: Average loss: 0.0015, Accuracy: 9149/10000 (91%)
Train Epoch: 4 [0/60000 (0%)] Loss: 0.304062
Train Epoch: 4 [20000/60000 (33%)] Loss: 0.406027
Train Epoch: 4 [40000/60000 (67%)] Loss: 0.385962
Test set: Average loss: 0.0014, Accuracy: 9201/10000 (92%)
Train Epoch: 5 [0/60000 (0%)] Loss: 0.186269
Train Epoch: 5 [20000/60000 (33%)] Loss: 0.196249
Train Epoch: 5 [40000/60000 (67%)] Loss: 0.228671
Test set: Average loss: 0.0013, Accuracy: 9248/10000 (92%)
Train Epoch: 6 [0/60000 (0%)] Loss: 0.364886
Train Epoch: 6 [20000/60000 (33%)] Loss: 0.295816
Train Epoch: 6 [40000/60000 (67%)] Loss: 0.244240
Test set: Average loss: 0.0012, Accuracy: 9290/10000 (93%)
Train Epoch: 7 [0/60000 (0%)] Loss: 0.228807
Train Epoch: 7 [20000/60000 (33%)] Loss: 0.192547
Train Epoch: 7 [40000/60000 (67%)] Loss: 0.223399
Test set: Average loss: 0.0012, Accuracy: 9329/10000 (93%)
Train Epoch: 8 [0/60000 (0%)] Loss: 0.176273
Train Epoch: 8 [20000/60000 (33%)] Loss: 0.346954
Train Epoch: 8 [40000/60000 (67%)] Loss: 0.253838
Test set: Average loss: 0.0011, Accuracy: 9359/10000 (94%)
Train Epoch: 9 [0/60000 (0%)] Loss: 0.246411
Train Epoch: 9 [20000/60000 (33%)] Loss: 0.201452
Train Epoch: 9 [40000/60000 (67%)] Loss: 0.162228
Test set: Average loss: 0.0011, Accuracy: 9377/10000 (94%)
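One possible refinement of the test loop above (my suggestion, not part of the original code): wrap evaluation in torch.no_grad() so PyTorch skips gradient tracking, which saves memory and time. This sketch reuses the net, criteon, and test_loader defined above:

import torch

test_loss = 0
correct = 0
with torch.no_grad():  # no gradients are needed during evaluation
    for data, target in test_loader:
        data = data.view(-1, 28 * 28)
        logits = net(data)
        test_loss += criteon(logits, target).item()
        correct += logits.argmax(dim=1).eq(target).sum()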
Distinguishing nn.ReLU() from F.relu()
nn.ReLU() is a class-style API (the name starts with a capital letter; it must be instantiated before it can be called; for layers that have parameters, w and b are internal members accessed via .parameters()).
F.relu() is a function-style API (any parameters are managed by you).
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 10)
print(F.relu(x, inplace=True))  # tensor([[0.2846, 0.6158, 0.0000, 0.0000, 0.0000, 1.7980, 0.6466, 0.4263, 0.0000, 0.0000]])

layer = nn.ReLU()
print(layer(x))  # tensor([[0.2846, 0.6158, 0.0000, 0.0000, 0.0000, 1.7980, 0.6466, 0.4263, 0.0000, 0.0000]])
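To make the parameter-management difference concrete, here is a small sketch (my illustration, using the shapes from earlier): nn.Linear carries its weight and bias internally, while the functional F.linear requires you to create and pass them yourself:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Class style: parameters live inside the module and are found via .parameters()
fc = nn.Linear(784, 200)
print(fc.weight.shape, fc.bias.shape)  # torch.Size([200, 784]) torch.Size([200])

# Function style: you create and manage the parameters yourself
w = torch.randn(200, 784, requires_grad=True)
b = torch.zeros(200, requires_grad=True)
x = torch.randn(1, 784)
out = F.linear(x, w, b)  # shape=[1, 200]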
3. GPU acceleration
In PyTorch, torch.device() selects and returns an abstract device; calling .to(device) on a network module or a Tensor then moves it onto that device.
device = torch.device('cuda:0')             # use the first GPU
net = MLP().to(device)                      # the network
criteon = nn.CrossEntropyLoss().to(device)  # the loss function
Each batch drawn from the training and test sets must also be moved to the GPU:
data, target = data.to(device), target.cuda()  # .to(device) and .cuda() are two ways to do this
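A common variant (an assumption on my part, not in the original) picks the GPU only when one is available, so the same script also runs on a CPU-only machine, and uses .to(device) uniformly:

import torch

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
data, target = data.to(device), target.to(device)  # .to(device) works for both CPU and GPU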
Adding GPU acceleration to the example above, the complete code is as follows:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

# Hyperparameters
batch_size = 200
learning_rate = 0.01
epochs = 10

# Load the training data
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,  # train=True selects the training set
                   transform=transforms.Compose([         # transform applies preprocessing
                       transforms.ToTensor(),             # convert to a Tensor
                       transforms.Normalize((0.1307,), (0.3081,))  # normalize (subtract the mean, divide by the standard deviation)
                   ])),
    batch_size=batch_size, shuffle=True)  # batches of batch_size with the batch dimension first; shuffle=True shuffles the order

# Load the test data
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=False, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
    ])),
    batch_size=batch_size, shuffle=True)


class MLP(nn.Module):

    def __init__(self):
        super(MLP, self).__init__()

        self.model = nn.Sequential(  # define each layer of the network
            nn.Linear(784, 200),
            nn.ReLU(inplace=True),
            nn.Linear(200, 200),
            nn.ReLU(inplace=True),
            nn.Linear(200, 10),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        x = self.model(x)
        return x


device = torch.device('cuda:0')
net = MLP().to(device)
# Define the SGD optimizer with the parameters to optimize and the learning rate;
# net.parameters() returns the parameters of the network defined by this class: [w1, b1, w2, b2, ...]
optimizer = optim.SGD(net.parameters(), lr=learning_rate)
criteon = nn.CrossEntropyLoss().to(device)


for epoch in range(epochs):

    for batch_idx, (data, target) in enumerate(train_loader):
        data = data.view(-1, 28*28)  # flatten the 2D images to [num_samples, 784]
        data, target = data.to(device), target.cuda()

        logits = net(data)              # forward pass
        loss = criteon(logits, target)  # nn.CrossEntropyLoss() applies Softmax internally

        optimizer.zero_grad()  # clear the gradients
        loss.backward()        # backpropagate to compute gradients
        optimizer.step()       # update the parameters

        if batch_idx % 100 == 0:  # print progress every 100 batches
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))

    test_loss = 0
    correct = 0  # correct counts the number of correctly classified samples
    for data, target in test_loader:
        data = data.view(-1, 28 * 28)
        data, target = data.to(device), target.cuda()

        logits = net(data)
        test_loss += criteon(logits, target).item()  # the scalar value of criteon(logits, target)

        pred = logits.data.max(dim=1)[1]  # equivalent to pred = logits.argmax(dim=1)
        correct += pred.eq(target.data).sum()

    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
4. Multi-class accuracy test
In the example below, logits is a 4×10 Tensor, which can be read as predictions for 4 images, each a 10-dimensional vector. Applying softmax followed by argmax(dim=1) to each image's output yields the predicted labels, which are compared against the true labels to compute the accuracy.
import torch
import torch.nn.functional as F

logits = torch.rand(4, 10)
pred = F.softmax(logits, dim=1)

pred_label = pred.argmax(dim=1)        # e.g. tensor([5, 8, 4, 7]); same result as logits.argmax(dim=1)

label = torch.tensor([5, 3, 2, 7])
correct = torch.eq(pred_label, label)  # e.g. tensor([True, False, False, True]); same as pred_label.eq(label)

print(correct.sum().float().item() / len(logits))  # 0.5
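Because softmax is a monotonically increasing transform within each row, it never changes which entry is largest, which is why pred.argmax(dim=1) and logits.argmax(dim=1) agree. A quick check of this claim:

import torch
import torch.nn.functional as F

logits = torch.rand(4, 10)
# softmax preserves the per-row ordering, so the argmax is unchanged
assert torch.equal(F.softmax(logits, dim=1).argmax(dim=1), logits.argmax(dim=1))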