PyTorch nn.Module class: using the Module class to define custom models

This article is a repost. Posted 2020-06-03 20:47, tagged PyTorch.

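Preface

For ordinary sequential models in PyTorch, the torch.nn.Sequential class is enough, much as in Keras. More often, though, the model is more complex (multiple inputs and outputs, multiple branches, skip connections across layers, custom layers, and so on), and you have to define the model yourself. This article explains in detail how to use the Module class to define a custom model.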

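1. Overview of the torch.nn.Module class

In my view, PyTorch is not as low-level as TensorFlow and not as high-level as Keras. Let's first compare a few small differences between Keras and PyTorch.

(1) In Keras, the more common practice is to implement custom layers by subclassing the Layer class; defining a whole model by subclassing the Model class is less commonly recommended. See the official Keras documentation for the reasoning.

(2) PyTorch draws no real distinction between a layer and a module: custom layers, custom blocks, and custom models are all built by subclassing Module. This is an important point. The Sequential class itself is also a subclass of Module.

Note: it is also possible to define a custom layer by subclassing torch.autograd.Function directly, but this is discouraged; the reasons are covered later.

Summary: in PyTorch, essentially every custom component is implemented by subclassing nn.Module.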
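
As a quick check of the summary above, the sketch below tests a handful of familiar classes with issubclass. The list named candidates is just an illustrative selection, not an exhaustive one.

```python
import torch.nn as nn

# Layers, containers, activations and loss functions all derive from nn.Module.
candidates = [nn.Linear, nn.Conv2d, nn.ReLU, nn.Sequential, nn.CrossEntropyLoss]

all_are_modules = all(issubclass(cls, nn.Module) for cls in candidates)
print(all_are_modules)
```

This is why the same subclassing pattern works whether you are writing a layer, a block, or a whole model.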

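This article only covers using Module to implement custom models and blocks; custom layers are left for later.

2. A brief introduction to the torch.nn.Module class

torch.nn.Module is the base class for all neural network modules. Your models should also subclass this class. Modules can contain other modules, which allows them to be nested in a tree structure. You can assign submodules as regular attributes.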
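
A minimal sketch of that tree structure follows. The class names Block and Outer are made up for illustration; the point is only that assigning a module to an attribute registers it as a child, and that nesting shows up as dotted names.

```python
import torch.nn as nn

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)   # registered as a submodule named "fc"

    def forward(self, x):
        return self.fc(x)

class Outer(nn.Module):
    def __init__(self):
        super().__init__()
        self.block = Block()        # a module nested inside another module
        self.head = nn.Linear(4, 2)

    def forward(self, x):
        return self.head(self.block(x))

outer = Outer()
child_names = [name for name, _ in outer.named_children()]  # direct children only
tree_names = [name for name, _ in outer.named_modules()]    # whole tree, root is ''
print(child_names)
print(tree_names)
```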

First, a quick look at its definition:

class Module(object):
    def __init__(self): ...
    def forward(self, *input): ...

    def add_module(self, name, module): ...
    def cuda(self, device=None): ...
    def cpu(self): ...
    def __call__(self, *input, **kwargs): ...
    def parameters(self, recurse=True): ...
    def named_parameters(self, prefix='', recurse=True): ...
    def children(self): ...
    def named_children(self): ...
    def modules(self): ...
    def named_modules(self, memo=None, prefix=''): ...
    def train(self, mode=True): ...
    def eval(self): ...
    def zero_grad(self): ...
    def __repr__(self): ...
    def __dir__(self): ...
'''
Only some of the methods are listed here; bodies are omitted.
'''

When defining our own network, we subclass nn.Module and override two methods: the constructor __init__ and forward. A few tips:

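(1) Layers with learnable parameters (fully connected layers, convolutional layers, and so on) are normally created in the constructor __init__(); layers without parameters can of course be created there too.

(2) Layers without learnable parameters (such as ReLU, dropout, and pooling layers) may be created in the constructor or left out of it. If they are left out of __init__, use the nn.functional equivalents in forward instead. Note that BatchNorm layers do hold learnable parameters by default (the affine weight and bias) as well as running statistics, so they belong in __init__. Also, F.dropout has to be passed training=self.training if it is to respect train() and eval().

(3) forward must be overridden. It implements the model's computation and is where the connections between layers are defined.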
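
To make tip (2) concrete, here is a small sketch comparing dropout as a module with dropout as a function. The class names DropoutAsModule and DropoutAsFunction are hypothetical; the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropoutAsModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 8)
        self.drop = nn.Dropout(p=0.5)   # follows train()/eval() automatically

    def forward(self, x):
        return self.drop(self.fc(x))

class DropoutAsFunction(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 8)

    def forward(self, x):
        # F.dropout defaults to training=True, so pass the module's own flag.
        return F.dropout(self.fc(x), p=0.5, training=self.training)

x = torch.randn(2, 8)
m1 = DropoutAsModule().eval()
m2 = DropoutAsFunction().eval()

# In eval mode dropout is a no-op, so the output equals the linear layer's output.
same1 = torch.equal(m1(x), m1.fc(x))
same2 = torch.equal(m2(x), m2.fc(x))
print(same1, same2)
```

Either style works; the functional one simply leaves it to you to forward the training flag.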

Here is a simple example first.

import torch

class MyNet(torch.nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()  # first statement: call the parent class constructor
        self.conv1 = torch.nn.Conv2d(3, 32, 3, 1, 1)
        self.relu1 = torch.nn.ReLU()
        self.max_pooling1 = torch.nn.MaxPool2d(2, 1)

        # in_channels must match the 32 output channels of conv1
        self.conv2 = torch.nn.Conv2d(32, 32, 3, 1, 1)
        self.relu2 = torch.nn.ReLU()
        self.max_pooling2 = torch.nn.MaxPool2d(2, 1)

        # 32 * 3 * 3 assumes a 3x3 feature map here, e.g. a 5x5 input with these pooling settings
        self.dense1 = torch.nn.Linear(32 * 3 * 3, 128)
        self.dense2 = torch.nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.max_pooling1(x)
        x = self.conv2(x)
        x = self.relu2(x)
        x = self.max_pooling2(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, features) before the linear layers
        x = self.dense1(x)
        x = self.dense2(x)
        return x

model = MyNet()
print(model)

'''Output:
MyNet(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (max_pooling1): MaxPool2d(kernel_size=2, stride=1, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu2): ReLU()
  (max_pooling2): MaxPool2d(kernel_size=2, stride=1, padding=0, dilation=1, ceil_mode=False)
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (dense2): Linear(in_features=128, out_features=10, bias=True)
)
'''

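Note: here all the layers are created in the constructor __init__, but that only defines a set of layers; it says nothing about how they are connected. The connections are all implemented in forward, and in this case they are still sequential. Now look at another example: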

import torch
import torch.nn.functional as F

class MyNet(torch.nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()  # first statement: call the parent class constructor
        self.conv1 = torch.nn.Conv2d(3, 32, 3, 1, 1)
        # in_channels must match the 32 output channels of conv1
        self.conv2 = torch.nn.Conv2d(32, 32, 3, 1, 1)

        # 32 * 3 * 3 assumes a 3x3 feature map here, e.g. a 5x5 input with the pooling used in forward
        self.dense1 = torch.nn.Linear(32 * 3 * 3, 128)
        self.dense2 = torch.nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = F.max_pool2d(x, kernel_size=2, stride=1)  # kernel_size is a required argument
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, kernel_size=2, stride=1)
        x = x.view(x.size(0), -1)  # flatten to (batch, features) before the linear layers
        x = self.dense1(x)
        x = self.dense2(x)
        return x

model = MyNet()
print(model)

'''Output:
MyNet(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (dense2): Linear(in_features=128, out_features=10, bias=True)
)
'''

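Note: this time the layers without trainable parameters are not created in the constructor, so they do not appear in the printed model; their computation is carried out in forward through the functional API.

Summary: every layer created in the constructor __init__ is an "inherent attribute" of the model, that is, a registered submodule.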
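
The two styles can be compared directly. In the sketch below (class names WithReluModule and WithReluFunction are invented for this illustration), the module-style ReLU shows up as a child of the model, yet both models end up with exactly the same learnable parameters.

```python
import torch.nn as nn
import torch.nn.functional as F

class WithReluModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)
        self.relu = nn.ReLU()       # registered child, but holds no parameters

    def forward(self, x):
        return self.relu(self.fc(x))

class WithReluFunction(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return F.relu(self.fc(x))   # same computation, nothing registered

a, b = WithReluModule(), WithReluFunction()
children_a = [name for name, _ in a.named_children()]
children_b = [name for name, _ in b.named_children()]
keys_a = list(a.state_dict().keys())
keys_b = list(b.state_dict().keys())
print(children_a, children_b)
print(keys_a, keys_b)
```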

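3. Several ways to implement a torch.nn.Module subclass

The above was just a simple demonstration. The Module class is very flexible and can be used in many different ways, which are introduced one by one below.

3.1 Wrapping layers with Sequential

That is, several layers are wrapped together as one larger layer (a block). An earlier article covers the Sequential class in detail, including its three common usages and the pros and cons of each; see: https://blog.csdn.net/qq_27825451/article/details/90551513

So layers can of course be wrapped in those same three ways here.

(1) Method 1 (simple sequential stacking):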

import torch.nn as nn
from collections import OrderedDict
class MyNet(nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()
        self.conv_block = nn.Sequential(
            nn.Conv2d(3, 32, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.dense_block = nn.Sequential(
            nn.Linear(32 * 3 * 3, 128),
            nn.ReLU(),
            nn.Linear(128, 10)
        )
    # The connections between layers are implemented here; this is the forward pass
    def forward(self, x):
        conv_out = self.conv_block(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense_block(res)
        return out
 
model = MyNet()

As in the earlier article, the layers inside each wrapped block have no names here; by default they are named by their index: 0, 1, 2, and so on.
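
The default naming can be seen with a short sketch that rebuilds just the convolution block from Method 1 on its own:

```python
import torch.nn as nn

conv_block = nn.Sequential(
    nn.Conv2d(3, 32, 3, 1, 1),
    nn.ReLU(),
    nn.MaxPool2d(2),
)

# Unnamed layers are registered under their position, as strings.
names = [name for name, _ in conv_block.named_children()]
first_layer = conv_block[0]   # Sequential supports integer indexing
print(names)
print(first_layer)
```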

(2) Method 2 (defined with an OrderedDict):

import torch.nn as nn
from collections import OrderedDict
class MyNet(nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()
        self.conv_block = nn.Sequential(
            OrderedDict(
                [
                    ("conv1", nn.Conv2d(3, 32, 3, 1, 1)),
                    ("relu1", nn.ReLU()),
                    ("pool", nn.MaxPool2d(2))
                ]
            ))
 
        self.dense_block = nn.Sequential(
            OrderedDict([
                ("dense1", nn.Linear(32 * 3 * 3, 128)),
                ("relu2", nn.ReLU()),
                ("dense2", nn.Linear(128, 10))
            ])
        )
 
    def forward(self, x):
        conv_out = self.conv_block(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense_block(res)
        return out
 
model = MyNet()
print(model)
'''Output:
MyNet(
  (conv_block): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU()
    (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense_block): Sequential(
    (dense1): Linear(in_features=288, out_features=128, bias=True)
    (relu2): ReLU()
    (dense2): Linear(in_features=128, out_features=10, bias=True)
  )
)
'''

(3) Method 3 (adding modules one by one with add_module()):

import torch  # needed because the code below refers to torch.nn.* by its full name
import torch.nn as nn
class MyNet(nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()
        self.conv_block=torch.nn.Sequential()
        self.conv_block.add_module("conv1",torch.nn.Conv2d(3, 32, 3, 1, 1))
        self.conv_block.add_module("relu1",torch.nn.ReLU())
        self.conv_block.add_module("pool1",torch.nn.MaxPool2d(2))
 
        self.dense_block = torch.nn.Sequential()
        self.dense_block.add_module("dense1",torch.nn.Linear(32 * 3 * 3, 128))
        self.dense_block.add_module("relu2",torch.nn.ReLU())
        self.dense_block.add_module("dense2",torch.nn.Linear(128, 10))
 
    def forward(self, x):
        conv_out = self.conv_block(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense_block(res)
        return out
 
model = MyNet()
print(model)
'''Output:
MyNet(
  (conv_block): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU()
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense_block): Sequential(
    (dense1): Linear(in_features=288, out_features=128, bias=True)
    (relu2): ReLU()
    (dense2): Linear(in_features=128, out_features=10, bias=True)
  )
)
'''

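With Methods 2 and 3, every layer inside each wrapped block has a name, given through the OrderedDict keys or through the name argument of add_module().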
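
One practical consequence of naming is that a layer can be reached as an attribute as well as by index. The sketch below mixes both naming mechanisms in one small block for brevity:

```python
import torch.nn as nn
from collections import OrderedDict

block = nn.Sequential(OrderedDict([
    ("conv1", nn.Conv2d(3, 32, 3, 1, 1)),
    ("relu1", nn.ReLU()),
]))
block.add_module("pool1", nn.MaxPool2d(2))   # appended after the existing layers

names = [name for name, _ in block.named_children()]
by_name = block.conv1    # attribute access uses the registered name
by_index = block[0]      # integer indexing still works on a named Sequential
same_object = by_name is by_index
print(names, same_object)
```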

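3.2 Using several common methods of the Module class

The Sequential class implements integer indexing, so a layer can be fetched with model[index]. The Module class does not implement integer indexing, so layers cannot be retrieved that way. What to do instead? Module provides several main methods: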

def children(self):

def named_children(self):

def modules(self):

def named_modules(self, memo=None, prefix=''):

'''
Note: each of these methods returns an iterator, so iterate over it with a for loop, or call next() on it.
'''
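
Both points can be sketched with a tiny model (TinyNet is a hypothetical name used only here): integer indexing on a plain Module raises a TypeError, while children() hands back an iterator that can be advanced with next().

```python
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 4)
        self.fc2 = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc2(self.fc1(x))

tiny = TinyNet()

# A plain Module has no __getitem__, so tiny[0] is not allowed.
try:
    tiny[0]
    indexable = True
except TypeError:
    indexable = False

it = tiny.children()
first_child = next(it)   # consume one element from the iterator
remaining = list(it)     # whatever is left after the first next()
print(indexable, first_child, len(remaining))
```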

The following uses the network built in section 3.1 (Method 3) as the example.

(1) The model.children() method

import torch  # needed because the code below refers to torch.nn.* by its full name
import torch.nn as nn
class MyNet(nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()
        self.conv_block=torch.nn.Sequential()
        self.conv_block.add_module("conv1",torch.nn.Conv2d(3, 32, 3, 1, 1))
        self.conv_block.add_module("relu1",torch.nn.ReLU())
        self.conv_block.add_module("pool1",torch.nn.MaxPool2d(2))
 
        self.dense_block = torch.nn.Sequential()
        self.dense_block.add_module("dense1",torch.nn.Linear(32 * 3 * 3, 128))
        self.dense_block.add_module("relu2",torch.nn.ReLU())
        self.dense_block.add_module("dense2",torch.nn.Linear(128, 10))
 
    def forward(self, x):
        conv_out = self.conv_block(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense_block(res)
        return out
 
model = MyNet()
 
for i in model.children():
    print(i)
    print(type(i)) # check the type of each element: here it is Sequential, so an integer index can be used to get a specific layer inside it
 
'''Output:
Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
<class 'torch.nn.modules.container.Sequential'>
Sequential(
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (relu2): ReLU()
  (dense2): Linear(in_features=128, out_features=10, bias=True)
)
<class 'torch.nn.modules.container.Sequential'>
'''

(2) The model.named_children() method

for i in model.named_children():
    print(i)
    print(type(i)) # check the type of each element: it is a tuple whose first element is the name and whose second is the module
 
'''Output:
('conv_block', Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
))
<class 'tuple'>
('dense_block', Sequential(
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (relu2): ReLU()
  (dense2): Linear(in_features=128, out_features=10, bias=True)
))
<class 'tuple'>
'''

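Summary:

(1) model.children() and model.named_children() return iterators;

(2) model.children(): each element yielded is a direct child of the model. In this example every child is a Sequential, which in turn supports integer indexing to reach the specific layers inside it, such as the conv layer or the dense layers;
(3) model.named_children(): each element yielded is a tuple; its first element is the name and its second element is the corresponding layer or Sequential.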
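
Putting points (2) and (3) to use, the sketch below rebuilds a reduced two-block network (TwoBlockNet is a made-up name, with fewer layers than the example above) and reaches an inner layer through a child returned by children().

```python
import torch.nn as nn

class TwoBlockNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv_block = nn.Sequential(nn.Conv2d(3, 32, 3, 1, 1), nn.ReLU())
        self.dense_block = nn.Sequential(nn.Linear(32, 10))

    def forward(self, x):
        out = self.conv_block(x)
        out = out.mean(dim=(2, 3))        # average over height and width -> (batch, 32)
        return self.dense_block(out)

net = TwoBlockNet()

blocks = list(net.children())             # materialize the iterator
first_layer = blocks[0][0]                # index into the Sequential child
child_map = dict(net.named_children())    # name -> module lookup
print(type(first_layer).__name__, list(child_map))
```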

(3) The model.modules() method

for i in model.modules():
    print(i)
    print("==================================================")
'''Output:
MyNet(
  (conv_block): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU()
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense_block): Sequential(
    (dense1): Linear(in_features=288, out_features=128, bias=True)
    (relu2): ReLU()
    (dense2): Linear(in_features=128, out_features=10, bias=True)
  )
)
==================================================
Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
==================================================
Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
==================================================
ReLU()
==================================================
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
==================================================
Sequential(
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (relu2): ReLU()
  (dense2): Linear(in_features=128, out_features=10, bias=True)
)
==================================================
Linear(in_features=288, out_features=128, bias=True)
==================================================
ReLU()
==================================================
Linear(in_features=128, out_features=10, bias=True)
==================================================
'''

(4) The model.named_modules() method

for i in model.named_modules():
    print(i)
    print("==================================================")
'''Output:
('', MyNet(
  (conv_block): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU()
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense_block): Sequential(
    (dense1): Linear(in_features=288, out_features=128, bias=True)
    (relu2): ReLU()
    (dense2): Linear(in_features=128, out_features=10, bias=True)
  )
))
==================================================
('conv_block', Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
))
==================================================
('conv_block.conv1', Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
==================================================
('conv_block.relu1', ReLU())
==================================================
('conv_block.pool1', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False))
==================================================
('dense_block', Sequential(
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (relu2): ReLU()
  (dense2): Linear(in_features=128, out_features=10, bias=True)
))
==================================================
('dense_block.dense1', Linear(in_features=288, out_features=128, bias=True))
==================================================
('dense_block.relu2', ReLU())
==================================================
('dense_block.dense2', Linear(in_features=128, out_features=10, bias=True))
==================================================
'''

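Summary:

(1) model.modules() and model.named_modules() return iterators;

(2) Both modules() and named_modules() traverse every component of the model (wrapped blocks, individual layers, custom layers, and so on) from shallow to deep. modules() yields the layer object itself, while named_modules() yields a tuple whose first element is the name and whose second element is the layer object.

(3) How to understand the difference between children and modules: in PyTorch, models, layers, activation functions, and loss functions can all be treated as subclasses of Module. modules() and named_modules() therefore recurse level by level, from shallow to deep, yielding each custom block and then each layer inside the block as a module. children() is more direct: it yields only the immediate "children" and does not recurse any deeper.
(In short: model.children() returns only the direct children; model.modules() returns the model itself, its children, and their descendants; model.named_children() and model.named_modules() return the corresponding tuple versions, where the first element is the name and the second is the Sequential or layer.)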
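
The difference is easy to confirm by counting. The sketch below rebuilds the same two-block layout under the hypothetical name BlockNet, and also filters modules() down to leaf layers, which is a common way to visit only the actual layers.

```python
import torch.nn as nn
from collections import OrderedDict

class BlockNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv_block = nn.Sequential(OrderedDict([
            ("conv1", nn.Conv2d(3, 32, 3, 1, 1)),
            ("relu1", nn.ReLU()),
            ("pool1", nn.MaxPool2d(2)),
        ]))
        self.dense_block = nn.Sequential(OrderedDict([
            ("dense1", nn.Linear(32 * 3 * 3, 128)),
            ("relu2", nn.ReLU()),
            ("dense2", nn.Linear(128, 10)),
        ]))

    def forward(self, x):
        out = self.conv_block(x)
        return self.dense_block(out.view(out.size(0), -1))

net = BlockNet()
n_children = len(list(net.children()))   # the two Sequential blocks
n_modules = len(list(net.modules()))     # the model, 2 blocks, 6 layers
# A leaf is a module that has no children of its own.
leaves = [m for m in net.modules() if len(list(m.children())) == 0]
module_names = [name for name, _ in net.named_modules()]
print(n_children, n_modules, len(leaves))
```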

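Note: the four methods above were illustrated with wrapped layers. They work just as well without any wrapping, and the results follow the same pattern, so they are not listed here.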
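
For completeness, here is a brief sketch of the unwrapped case, using a hypothetical FlatNet. With no Sequential blocks in between, children() yields the layers directly, and modules() yields the model itself followed by those same layers.

```python
import torch.nn as nn

class FlatNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))

flat = FlatNet()
flat_children = [name for name, _ in flat.named_children()]
flat_modules = [name for name, _ in flat.named_modules()]
print(flat_children)
print(flat_modules)
```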

Original link: https://blog.csdn.net/qq_27825451/java/article/details/90550890
