1.Conv3d
class torch.nn.Conv3d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
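Parameters:
- in_channels (int) - number of channels in the input signal
- out_channels (int) - number of channels produced by the convolution
- kernel_size (int or tuple) - size of the convolution kernel
- stride (int or tuple, optional) - stride of the convolution
- padding (int or tuple, optional) - number of layers of zeros added to each side of the input
- dilation (int or tuple, optional) - spacing between kernel elements
- groups (int, optional) - number of blocked connections from input channels to output channels
- bias (bool, optional) - if bias=True, a learnable bias is added to the output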
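To make the effect of groups and bias more concrete, here is a minimal sketch that counts the learnable parameters of a convolution layer. It assumes the weight tensor has shape (out_channels, in_channels // groups, *kernel_size), which is how PyTorch stores convolution weights. The helper name conv_param_count is hypothetical and not part of torch.

```python
from math import prod

def conv_param_count(in_channels, out_channels, kernel_size, groups=1, bias=True):
    """Count learnable parameters of a convolution layer (hypothetical helper)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    # Each output channel only sees in_channels // groups input channels.
    weights = out_channels * (in_channels // groups) * prod(kernel_size)
    # One bias value per output channel when bias=True.
    return weights + (out_channels if bias else 0)

print(conv_param_count(16, 33, (3, 3, 3)))              # 14289
print(conv_param_count(16, 33, (3, 3, 3), bias=False))  # 14256
print(conv_param_count(16, 32, (3, 3, 3), groups=4))    # 3488
```

With groups=4 each output channel is connected to only a quarter of the input channels, so the weight count drops by roughly a factor of four.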
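A 3D convolution layer. The input has shape (N, C_in, D, H, W) and the output has shape (N, C_out, D_out, H_out, W_out).
Shape:
- input: (N, C_in, D_in, H_in, W_in)
- output: (N, C_out, D_out, H_out, W_out)
Note: the input to a 3D convolution is a 5-dimensional tensor.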
Example from the official documentation:
>>> import torch
>>> import torch.nn as nn
>>> # With square kernels and equal stride
>>> m = nn.Conv3d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
>>> m = nn.Conv3d(16, 33, (3, 5, 2), stride=(2, 1, 1), padding=(4, 2, 0))
>>> input = torch.randn(20, 16, 10, 50, 100)
>>> output = m(input)
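The output shape of the example above can be worked out without running the layer. The sketch below applies the per-dimension size formula given in the PyTorch documentation, out = floor((in + 2*padding - dilation*(kernel - 1) - 1) / stride) + 1, to D, H and W in turn. The helper names conv_out_size and conv3d_out_shape are made up for illustration.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

def conv3d_out_shape(in_shape, out_channels, kernel, stride, padding):
    """Shape (N, C_out, D_out, H_out, W_out) for an input (N, C_in, D, H, W)."""
    n, _, *spatial = in_shape
    out = [conv_out_size(s, k, st, p)
           for s, k, st, p in zip(spatial, kernel, stride, padding)]
    return (n, out_channels, *out)

in_shape = (20, 16, 10, 50, 100)

# nn.Conv3d(16, 33, 3, stride=2)
print(conv3d_out_shape(in_shape, 33, (3, 3, 3), (2, 2, 2), (0, 0, 0)))
# (20, 33, 4, 24, 49)

# nn.Conv3d(16, 33, (3, 5, 2), stride=(2, 1, 1), padding=(4, 2, 0))
print(conv3d_out_shape(in_shape, 33, (3, 5, 2), (2, 1, 1), (4, 2, 0)))
# (20, 33, 8, 50, 99)
```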
2.nn.Conv2d
class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
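Parameters:
in_channels: number of channels in the input data, e.g. 3 for an RGB image;
out_channels: number of channels in the output data, chosen according to the model;
kernel_size: size of the convolution kernel, either an int or a tuple; kernel_size=2 means a (2, 2) kernel, and kernel_size=(2, 3) means a non-square kernel of height 2 and width 3;
stride: step size, default 1; like kernel_size, stride=2 means a step of 2 both vertically and horizontally, and stride=(2, 3) means a step of 2 along the height (vertical) and 3 along the width (horizontal);
padding: zero padding, the number of zeros added to each side of the input, default 0.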
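The int-or-tuple convention described above can be sketched as a small normalisation step. This is only an illustration of the documented behaviour: to_pair is a hypothetical helper, not a PyTorch function, and the tuple order is assumed to be (height, width).

```python
def to_pair(value):
    """Expand an int to (value, value); pass a 2-tuple through as (height, width)."""
    if isinstance(value, int):
        return (value, value)
    pair = tuple(value)
    if len(pair) != 2:
        raise ValueError("expected an int or a tuple of two ints")
    return pair

print(to_pair(2))       # (2, 2)  square kernel or equal stride
print(to_pair((2, 3)))  # (2, 3)  2 along height, 3 along width
```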
Example:
import torch
import torch.nn as nn

x = torch.randn(10, 16, 30, 32)  # batch, channels, height, width
print(x.shape)
m = nn.Conv2d(16, 33, (3, 2), (2, 1))  # in_channels, out_channels, kernel_size, stride
print(m)
y = m(x)
print(y.shape)
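Console output:
torch.Size([10, 16, 30, 32])
Conv2d(16, 33, kernel_size=(3, 2), stride=(2, 1))
torch.Size([10, 33, 14, 31])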
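How the output size is computed:
out = floor((in - kernel_size + 2*padding) / stride) + 1, applied to h and w separately (with dilation=1)
x has shape [10, 16, 30, 32], so h=30 and w=32; the kernel size is 3 along h and 2 along w; the stride is 2 along h and 1 along w; padding defaults to 0;
h = floor((30 - 3 + 2*0) / 2) + 1 = floor(27/2) + 1 = 13 + 1 = 14
w = floor((32 - 2 + 2*0) / 1) + 1 = floor(30/1) + 1 = 30 + 1 = 31
batch = 10, out_channels = 33
Therefore: y has shape [10, 33, 14, 31]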
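One way to see why the division is rounded down is to list every position where the kernel fits inside the input and count them. The sketch below does this for the example above; window_starts is a hypothetical helper and assumes dilation=1.

```python
def window_starts(size, kernel, stride=1, padding=0):
    """Start indices of every kernel position that fits inside the padded input."""
    padded = size + 2 * padding
    return list(range(0, padded - kernel + 1, stride))

h_starts = window_starts(30, kernel=3, stride=2)
w_starts = window_starts(32, kernel=2, stride=1)

print(h_starts[-1], len(h_starts))  # 26 14
print(w_starts[-1], len(w_starts))  # 30 31
```

Along the height, the last window starts at row 26 and covers rows 26 to 28. A window starting at row 28 would need row 30, which does not exist, so row 29 is never used. That leftover row is what the floor in the formula discards.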
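3.Single-channel and multi-channel convolution
(1) Single-channel convolution:
With 32 convolution kernels the layer can learn 32 kinds of features. When there are multiple kernels, the output contains one feature map per kernel, here 32 feature maps.
conv2d(in_channels=1, out_channels=N)
N filters are applied to the input, producing N results (feature maps), one per filter.
(2) Multi-channel convolution:
conv2d(in_channels=X (X > 1), out_channels=N)
N×X filters (N groups of X filters each) are applied to the input. Within each group, the X filters are applied to the X input channels separately and the results are summed into a single output; the N groups therefore produce N results (feature maps).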
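The multi-channel case can be written out as a naive loop. The following is a minimal sketch in plain Python, assuming stride 1, no padding, no bias and groups=1. Like PyTorch, it computes a cross-correlation (the kernel is not flipped). The function name conv2d_naive and the toy data are made up for illustration.

```python
def conv2d_naive(x, weight):
    """Naive multi-channel 2D convolution (cross-correlation).

    x:      nested list of shape [X][H][W]       (X input channels)
    weight: nested list of shape [N][X][kH][kW]  (N groups of X filters)
    returns nested list of shape [N][H-kH+1][W-kW+1]
    """
    H, W = len(x[0]), len(x[0][0])
    kH, kW = len(weight[0][0]), len(weight[0][0][0])
    out = []
    for group in weight:                     # one group of X filters per output channel
        fmap = [[0.0] * (W - kW + 1) for _ in range(H - kH + 1)]
        for channel, filt in zip(x, group):  # filter c is applied to input channel c
            for i in range(H - kH + 1):
                for j in range(W - kW + 1):
                    # accumulate, so the per-channel results are summed
                    fmap[i][j] += sum(
                        channel[i + a][j + b] * filt[a][b]
                        for a in range(kH) for b in range(kW)
                    )
        out.append(fmap)
    return out

# X = 2 input channels of size 3x3, N = 3 output channels, 2x2 kernels of all ones
x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
     [[1, 1, 1], [1, 1, 1], [1, 1, 1]]]
weight = [[[[1, 1], [1, 1]] for _ in range(2)] for _ in range(3)]

y = conv2d_naive(x, weight)
print(len(y), len(y[0]), len(y[0][0]))  # 3 2 2
print(y[0])                             # [[16.0, 20.0], [28.0, 32.0]]
```

Passing a single input channel (X = 1) reduces this to the single-channel case: each of the N filters produces its own feature map with no summation across channels.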
References:
https://blog.csdn.net/qq_26369907/article/details/88366147
https://www.pytorchtutorial.com/docs
https://zhuanlan.zhihu.com/p/32190799