Since my research direction is relation extraction, one-dimensional convolution is an essential technique to master for text processing. Here is a brief introduction to reinforce what I have learned.
Official PyTorch parameter description:
Conv1d
class torch.nn.Conv1d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
- in_channels (int) – number of channels in the input signal; for text classification, this is the dimension of the word vectors
- out_channels (int) – number of channels produced by the convolution; each output channel requires its own 1-D convolution kernel
- kernel_size (int or tuple) – size of the convolving kernel. The kernel is (k,); its second dimension is determined by in_channels, so the effective kernel size is kernel_size * in_channels
- stride (int or tuple, optional) – stride of the convolution
- padding (int or tuple, optional) – number of zeros padded on each side of the input
- dilation (int or tuple, optional) – spacing between kernel elements
- groups (int, optional) – number of blocked connections from input channels to output channels
- bias (bool, optional) – if bias=True, adds a learnable bias to the output
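For completeness, the output length of Conv1d follows the same floor formula that PyTorch documents for Conv2d (the height/width versions appear near the end of this post):

$L_{out} = \left\lfloor\frac{L_{in} + 2 \times \text{padding} - \text{dilation} \times (\text{kernel\_size} - 1) - 1}{\text{stride}} + 1\right\rfloor$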
One blogger explained this very clearly, so I include his content here; the link is listed in the references below.
```python
import torch
import torch.nn as nn

conv1 = nn.Conv1d(in_channels=256, out_channels=100, kernel_size=2)
input = torch.randn(32, 35, 256)
# batch_size x text_len x embedding_size -> batch_size x embedding_size x text_len
input = input.permute(0, 2, 1)
out = conv1(input)
print(out.size())  # torch.Size([32, 100, 34])
```
Here 32 is the batch_size, 35 is the maximum sentence length, and 256 is the word-vector dimension.
Before feeding the data into the 1-D convolution, the 32×35×256 tensor must be permuted to 32×256×35, because Conv1d slides over the last dimension. The output size is therefore 32×100×(35-2+1) = 32×100×34.
In the figure, the word-vector dimension is 5 and the input size is 7×5. There are 1-D convolution kernels of sizes 2, 3, and 4, two of each, giving six feature maps in total.
For k=4 (the large red matrix in the figure), the kernel size is 4×5 and the stride is 1. The kernel sweeps the input from top to bottom, producing an output vector of size ((7-4)/1+1)×1 = 4×1, which a max pooling with kernel size 4 then reduces to a single value. The six resulting values are concatenated and passed through a fully connected layer, which outputs the probabilities of the 2 classes.
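A minimal sketch of the k=4 branch described above (the 7×5 input and the kernel shape follow the figure; the variable names are mine):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 5, 7)                    # batch=1, embedding_size=5, text_len=7
conv = nn.Conv1d(in_channels=5, out_channels=1, kernel_size=4)
feat = conv(x)                              # (1, 1, 4): (7 - 4) / 1 + 1 = 4
pooled = nn.MaxPool1d(kernel_size=4)(feat)  # (1, 1, 1): one value per kernel
print(feat.shape, pooled.shape)
```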
A detailed walkthrough of the attached code follows:
Here, embedding_size=256, feature_size=100, window_sizes=[3,4,5,6], and max_text_len=35.
```python
import os
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, config):
        super(TextCNN, self).__init__()
        self.is_training = True
        self.dropout_rate = config.dropout_rate
        self.num_class = config.num_class
        self.use_element = config.use_element
        self.config = config

        self.embedding = nn.Embedding(num_embeddings=config.vocab_size,
                                      embedding_dim=config.embedding_size)
        self.convs = nn.ModuleList([
            nn.Sequential(nn.Conv1d(in_channels=config.embedding_size,
                                    out_channels=config.feature_size,
                                    kernel_size=h),
                          # nn.BatchNorm1d(num_features=config.feature_size),
                          nn.ReLU(),
                          nn.MaxPool1d(kernel_size=config.max_text_len - h + 1))
            for h in config.window_sizes
        ])
        self.fc = nn.Linear(in_features=config.feature_size * len(config.window_sizes),
                            out_features=config.num_class)
        if os.path.exists(config.embedding_path) and config.is_training and config.is_pretrain:
            print("Loading pretrain embedding...")
            self.embedding.weight.data.copy_(torch.from_numpy(np.load(config.embedding_path)))

    def forward(self, x):
        embed_x = self.embedding(x)
        # print('embed size 1', embed_x.size())  # 32*35*256
        # batch_size x text_len x embedding_size -> batch_size x embedding_size x text_len
        embed_x = embed_x.permute(0, 2, 1)
        # print('embed size 2', embed_x.size())  # 32*256*35
        out = [conv(embed_x) for conv in self.convs]  # out[i]: batch_size x feature_size x 1
        # for o in out:
        #     print('o', o.size())  # 32*100*1
        out = torch.cat(out, dim=1)  # concatenate along dim 1, e.g. 5*2*1 and 5*3*1 become 5*5*1
        # print(out.size(1))  # 32*400*1
        out = out.view(-1, out.size(1))
        # print(out.size())  # 32*400
        if not self.use_element:
            out = F.dropout(input=out, p=self.dropout_rate)
            out = self.fc(out)
        return out
```
embed_x starts out with size 32×35×256, where 32 is the batch_size. After permute it becomes 32×256×35. Passing it through the custom network produces an out list whose 4 elements each have size 32×100×1. Concatenating along dim=1 gives 32×400×1; view reshapes this to 32×400; and finally a 400×num_class fully connected layer produces 32×2.
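A quick way to check these shapes is to run the model on dummy data. The config values below are the ones stated above, except vocab_size, which is an arbitrary choice; SimpleNamespace stands in for the author's config object:

```python
import torch
from types import SimpleNamespace

config = SimpleNamespace(vocab_size=5000, embedding_size=256, feature_size=100,
                         window_sizes=[3, 4, 5, 6], max_text_len=35, num_class=2,
                         dropout_rate=0.5, use_element=False,
                         embedding_path='', is_training=True, is_pretrain=False)
model = TextCNN(config)
x = torch.randint(0, config.vocab_size, (32, 35))  # a batch of token-id sequences
print(model(x).shape)  # torch.Size([32, 2])
```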
In a relation-extraction paper I read, the author uses the following CNN code:
```python
import torch
import torch.nn as nn

class CNN3(nn.Module):
    def __init__(self, config):
        super(CNN3, self).__init__()
        self.config = config
        self.word_emb = nn.Embedding(config.data_word_vec.shape[0], config.data_word_vec.shape[1])
        self.word_emb.weight.data.copy_(torch.from_numpy(config.data_word_vec))
        self.word_emb.weight.requires_grad = False

        # self.char_emb = nn.Embedding(config.data_char_vec.shape[0], config.data_char_vec.shape[1])
        # self.char_emb.weight.data.copy_(torch.from_numpy(config.data_char_vec))
        # char_dim = config.data_char_vec.shape[1]
        # char_hidden = 100
        # self.char_cnn = nn.Conv1d(char_dim, char_hidden, 5)

        self.coref_embed = nn.Embedding(config.max_length, config.coref_size, padding_idx=0)
        self.ner_emb = nn.Embedding(7, config.entity_type_size, padding_idx=0)
        # input_size is 140 dimensions in this configuration
        input_size = config.data_word_vec.shape[1] + config.coref_size + config.entity_type_size  # + char_hidden
        self.out_channels = 200
        self.in_channels = input_size
        self.kernel_size = 3
        self.stride = 1
        self.padding = int((self.kernel_size - 1) / 2)
        self.cnn_1 = nn.Conv1d(self.in_channels, self.out_channels, self.kernel_size, self.stride, self.padding)
        self.cnn_2 = nn.Conv1d(self.out_channels, self.out_channels, self.kernel_size, self.stride, self.padding)
        self.cnn_3 = nn.Conv1d(self.out_channels, self.out_channels, self.kernel_size, self.stride, self.padding)
        self.max_pooling = nn.MaxPool1d(self.kernel_size, stride=self.stride, padding=self.padding)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(config.cnn_drop_prob)
        self.bili = torch.nn.Bilinear(self.out_channels + config.dis_size,
                                      self.out_channels + config.dis_size, config.relation_num)
        self.dis_embed = nn.Embedding(20, config.dis_size, padding_idx=10)

    # model(context_idxs, context_pos, context_ner, context_char_idxs, input_lengths,
    #       h_mapping, t_mapping, relation_mask, dis_h_2_t, dis_t_2_h)
    def forward(self, context_idxs, pos, context_ner, context_char_idxs, context_lens,
                h_mapping, t_mapping, relation_mask, dis_h_2_t, dis_t_2_h):
        # para_size, char_size, bsz = context_idxs.size(1), context_char_idxs.size(2), context_idxs.size(0)
        # context_ch = self.char_emb(context_char_idxs.contiguous().view(-1, char_size)).view(bsz * para_size, char_size, -1)
        # context_ch = self.char_cnn(context_ch.permute(0, 2, 1).contiguous()).max(dim=-1)[0].view(bsz, para_size, -1)

        # self.word_emb(context_idxs).shape   = [40, 512, config.data_word_vec.shape[1]]
        # self.coref_embed(pos).shape         = [40, 512, config.coref_size]
        # self.ner_emb(context_ner).shape     = [40, 512, config.entity_type_size]
        sent = torch.cat([self.word_emb(context_idxs), self.coref_embed(pos), self.ner_emb(context_ner)], dim=-1)
        sent = sent.permute(0, 2, 1)  # torch.Size([40, 140, 512]): batch * embedding_size * max_len

        x = self.cnn_1(sent)     # (b, 140, 512) -> (b, 200, 512)
        x = self.max_pooling(x)  # (b, 200, 512) -> (b, 200, 512)
        x = self.relu(x)
        x = self.dropout(x)

        x = self.cnn_2(x)        # (b, 200, 512) -> (b, 200, 512)
        x = self.max_pooling(x)
        x = self.relu(x)
        x = self.dropout(x)

        x = self.cnn_3(x)        # (b, 200, 512) -> (b, 200, 512)
        x = self.max_pooling(x)
        x = self.relu(x)
        x = self.dropout(x)

        # padding=1 at every step adds a column on each side, so the sequence length never changes
        context_output = x.permute(0, 2, 1)  # (b, 512, 200)
        start_re_output = torch.matmul(h_mapping, context_output)  # (b,1800,512)*(b,512,200) -> (b,1800,200)
        end_re_output = torch.matmul(t_mapping, context_output)
        s_rep = torch.cat([start_re_output, self.dis_embed(dis_h_2_t)], dim=-1)  # (b, 1800, 200+20)
        t_rep = torch.cat([end_re_output, self.dis_embed(dis_t_2_h)], dim=-1)
        predict_re = self.bili(s_rep, t_rep)  # (b, 1800, 97)
        return predict_re
```
The author defines three CNN layers. The first layer increases the number of channels and uses a kernel of size 3 to extract features; the padding is chosen so that the sentence-length dimension stays unchanged through each convolution layer.
Relation classification is not done directly by the CNN. The first layer extracts features, then two more CNN layers follow (the data shape is unchanged; their exact role is unclear). The context_output obtained from the 3 CNN layers already carries some document-level information. Then h_mapping, which holds the head-entity mask information, is multiplied with context_output, and t_mapping, which holds the tail-entity mask information, is multiplied with context_output (a small sketch of this multiplication follows below).
Next, the head-to-tail distance features are concatenated to start_re_output along the feature dimension, and the tail-to-head distance features are concatenated to end_re_output along the feature dimension.
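Conceptually, each row of h_mapping holds weights that average the context vectors over a head entity's mention positions. A minimal sketch of the multiplication (the shapes follow the comments in the code; how the paper actually constructs the mapping is my assumption):

```python
import torch

b, num_pairs, seq_len, hidden = 2, 1800, 512, 200
context_output = torch.randn(b, seq_len, hidden)

# hypothetical mapping: averaging weights over a head entity spanning tokens 0-1
h_mapping = torch.zeros(b, num_pairs, seq_len)
h_mapping[:, :, 0:2] = 0.5

start_re_output = torch.matmul(h_mapping, context_output)
print(start_re_output.shape)  # torch.Size([2, 1800, 200])
```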
Finally, both representations are fed into the predefined bilinear layer to produce the prediction.
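nn.Bilinear applies a learned bilinear form $x_1^{\top} A x_2 + b$ per output feature; a quick shape check matching the dimensions in the code comments above:

```python
import torch
import torch.nn as nn

bili = nn.Bilinear(220, 220, 97)  # out_channels + dis_size = 200 + 20; relation_num = 97
s_rep = torch.randn(2, 1800, 220)
t_rep = torch.randn(2, 1800, 220)
print(bili(s_rep, t_rep).shape)   # torch.Size([2, 1800, 97])
```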
The commented-out code (the char_emb/char_cnn lines in __init__ and at the top of forward) shows that the author tried to use pretrained GloVe character vectors but apparently ran into problems and abandoned the attempt, which is perhaps why the results were not ideal.
Conv2d
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
| Parameter | Description |
| --- | --- |
| in_channels | Number of channels in the input image |
| out_channels | Number of channels produced by the convolution |
| kernel_size | Size of the convolving kernel; an int or an (int, int) tuple |
| stride | Stride of the cross-correlation; an int or an (int, int) tuple |
| padding | Amount of zero-padding added to each side of the input |
| dilation | Spacing between kernel elements |
| groups | Number of blocked connections from input channels to output channels; in_channels and out_channels must both be divisible by it, so it ranges from 1 to in_channels (see the sketch after this table) |
| bias | If True, adds a learnable bias to the output |
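For example, setting groups=in_channels (with out_channels a multiple of it) gives a depthwise convolution, in which each input channel is convolved with its own filters; a minimal sketch:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 4, 8, 8)
depthwise = nn.Conv2d(in_channels=4, out_channels=4, kernel_size=3, padding=1, groups=4)
print(depthwise(x).shape)  # torch.Size([1, 4, 8, 8]); each channel is convolved separately
```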
Example:
```python
import torch

x = torch.randn(2, 1, 7, 3)
conv = torch.nn.Conv2d(1, 8, (2, 3))
res = conv(x)
print(res.shape)  # torch.Size([2, 8, 6, 1])
```
Input x: `[batch_size, channels, height_1, width_1]`

| Dimension | Meaning | Value |
| --- | --- | --- |
| batch_size | number of samples in a batch | 2 |
| channels | number of channels, i.e. the depth of the current layer | 1 |
| height_1 | image height | 7 |
| width_1 | image width | 3 |
Conv2d parameters: `[channels, output, height_2, width_2]`

| Dimension | Meaning | Value |
| --- | --- | --- |
| channels | number of channels, matching the input depth above | 1 |
| output | output depth | 8 |
| height_2 | filter height | 2 |
| width_2 | filter width | 3 |
Output res: `[batch_size, output, height_3, width_3]`

| Dimension | Meaning | Value |
| --- | --- | --- |
| batch_size | number of samples in a batch, as above | 2 |
| output | output depth | 8 |
| height_3 | height of the convolution result | h1 - h2 + 1 = 7 - 2 + 1 = 6 |
| width_3 | width of the convolution result | w1 - w2 + 1 = 3 - 3 + 1 = 1 |
Shape:
Input: $(N, C_{in}, H_{in}, W_{in})$
Output: $(N, C_{out}, H_{out}, W_{out})$
$H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor$
$W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor$
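A quick numerical check of these formulas against PyTorch, using the example above:

```python
import math
import torch
import torch.nn as nn

H_in, W_in = 7, 3
conv = nn.Conv2d(1, 8, kernel_size=(2, 3), stride=1, padding=0, dilation=1)
H_out = math.floor((H_in + 2 * 0 - 1 * (2 - 1) - 1) / 1 + 1)  # 6
W_out = math.floor((W_in + 2 * 0 - 1 * (3 - 1) - 1) / 1 + 1)  # 1
out = conv(torch.randn(2, 1, H_in, W_in))
print(out.shape, (H_out, W_out))  # torch.Size([2, 8, 6, 1]) (6, 1)
```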
References:
- pytorch 中nn.MaxPool1d() 和nn.MaxPool2d()對比: https://www.jianshu.com/p/c5b8e02bedbe
- pytorch中的nn.Bilinear的計算原理詳解: https://blog.csdn.net/nihate/article/details/90480459
- pytorch之nn.Conv1d詳解: https://blog.csdn.net/sunny_xsc1994/article/details/82969867
- pytorch中的matmul: https://blog.csdn.net/yu_1628060739/article/details/102720385
- torch.nn.Conv2d()函數詳解: https://blog.csdn.net/m0_37586991/article/details/87855342