Introduction
Image classification places two demands on a network architecture: accuracy and speed, and these two demands have driven the evolution of network designs.
- ResNeXt: grouped convolution, which reduces the number of network parameters.
- DenseNet: dense skip connections.
- MobileNet: factors a standard convolution into a depthwise convolution and a pointwise convolution, i.e., a depthwise separable convolution.
- SENet: a channel attention mechanism.
For simplicity, the code from [1] is used with layer4 commented out, giving the baseline resnet14. Local structures are then modified on top of it to verify the effect on classification.
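One way to build such a baseline is sketched below. This is an illustrative assumption using torchvision, not the hand-built model from [1]: removing layer4 drops four of resnet18's eighteen weighted layers, leaving fourteen.

import torch.nn as nn
from torchvision import models

# Start from a stock resnet18 and drop its fourth stage; 18 - 4 = 14 layers.
model = models.resnet18(num_classes=10)
model.layer4 = nn.Identity()
model.fc = nn.Linear(256, 10)  # layer3 ends at 256 channels, not 512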
Experimental results
GPU: GTX 1070
Hyperparameters: epochs=80, lr=0.001, optim=Adam
Dataset: CIFAR-10, batch_size=100
Grouped convolution
# 3x3 convolution with grouping
def conv3x3(in_channels, out_channels, stride=1, groups=1):
    return nn.Conv2d(in_channels, out_channels, kernel_size=3,
                     stride=stride, padding=1, bias=False, groups=groups)
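As a quick check on where the savings come from, here is an illustrative snippet using the conv3x3 defined above (the 64-channel width is arbitrary):

import torch.nn as nn

def count_params(module):
    return sum(p.numel() for p in module.parameters())

# With groups=g, each output channel sees only in_channels/g input channels,
# so the 3x3 weight count drops from in*out*9 to in*out*9/g.
print(count_params(conv3x3(64, 64, groups=1)))  # 36864
print(count_params(conv3x3(64, 64, groups=2)))  # 18432
print(count_params(conv3x3(64, 64, groups=4)))  # 9216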
_ | Params (k) | GPU memory (MB) | Training time (s) | Test time (s) | Accuracy (%) |
---|---|---|---|---|---|
resnet14 | 195 | 617 | 665 | 0.34 | 87 |
groups=2 | 99 | 615 | 727 | 0.40 | 85 |
groups=4 | 50 | 615 | 834 | 0.50 | 81 |
Conclusion: grouped convolution reduces the parameter count, but it also reduces both speed and accuracy.
Dense connections
def forward(self, x):  # basic block with dense connections
    residual = x
    if self.downsample:
        residual = self.downsample(x)
    out = self.layer1(x)
    out = self.relu(out)
    out2 = self.layer2(out)
    out2 = self.relu(out2)
    out3 = torch.cat([out, out2], 1)   # dense: reuse earlier features
    out = self.layer3(out3)
    out4 = self.relu(out)
    out5 = torch.cat([out3, out4], 1)  # dense: 3x the base width
    out = self.layer4(out5)            # 1x1 conv back to the specified channels
    out += residual                    # residual shortcut
    out = self.relu(out)
    return out
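The concatenations change the input widths of the later layers. A minimal sketch of a matching constructor follows; the class name and exact layer shapes are assumptions, and the forward() above completes the module:

import torch
import torch.nn as nn

class DenseBasicBlock(nn.Module):
    def __init__(self, in_channels, channels, stride=1, downsample=None):
        super().__init__()
        self.layer1 = nn.Conv2d(in_channels, channels, 3, stride=stride, padding=1, bias=False)
        self.layer2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        # layer3 consumes cat([out, out2], 1): 2*channels inputs
        self.layer3 = nn.Conv2d(2 * channels, channels, 3, padding=1, bias=False)
        # layer4 consumes cat([out3, out4], 1): 3*channels inputs,
        # and its 1x1 kernel maps back to the block's output width
        self.layer4 = nn.Conv2d(3 * channels, channels, 1, bias=False)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample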
_ | Params (k) | GPU memory (MB) | Training time (s) | Test time (s) | Accuracy (%) |
---|---|---|---|---|---|
resnet14 | 195 | 617 | 665 | 0.34 | 87 |
dense connections | 341 | 679 | 703 | 0.43 | 88 |
Conclusion: both the parameter count and the accuracy increase somewhat, while speed drops slightly.
Depthwise separable convolution
# depthwise separable convolution: a per-channel (depthwise) conv
# followed by a 1x1 pointwise conv
def Conv2d(in_channels, out_channels, kernel_size=1, padding=0, stride=1):
    return nn.Sequential(
        nn.Conv2d(in_channels, in_channels, kernel_size, stride=stride,
                  padding=padding, groups=in_channels, bias=False),
        nn.Conv2d(in_channels, out_channels, 1, bias=False),
    )
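For a 64-to-64 channel, 3x3 case, the factorization shrinks the weight count from 64*64*3*3 = 36864 to 64*3*3 + 64*64 = 4672. An illustrative check using the factory defined above:

import torch.nn as nn

standard = nn.Conv2d(64, 64, 3, padding=1, bias=False)
separable = Conv2d(64, 64, kernel_size=3, padding=1)
print(sum(p.numel() for p in standard.parameters()))   # 36864
print(sum(p.numel() for p in separable.parameters()))  # 4672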
_ | Params (k) | GPU memory (MB) | Training time (s) | Test time (s) | Accuracy (%) |
---|---|---|---|---|---|
resnet14 | 195 | 617 | 665 | 0.34 | 87 |
groups=2 | 99 | 615 | 727 | 0.40 | 85 |
groups=4 | 50 | 615 | 834 | 0.50 | 81 |
depthwise separable | 27 | 665 | 788 | 0.40 | 84 |
Conclusion: depthwise separable convolution reduces the parameter count, but it also reduces speed and accuracy. Compared with grouped convolution (groups=4), its accuracy is slightly higher.
Attention mechanism
The code from [2] is used, with the channel counts corrected.
def forward(self, x):  # BasicBlock with squeeze-and-excitation
    residual = x
    out = self.conv1(x)
    out = self.bn1(out)
    out = self.relu(out)
    out = self.conv2(out)
    out = self.bn2(out)
    if self.downsample:
        residual = self.downsample(x)
    # attention
    original_out = out
    out = F.avg_pool2d(out, out.size()[2:])  # squeeze: global average pool
    out = out.view(out.size(0), -1)
    out = self.fc1(out)
    out = self.relu(out)
    out = self.fc2(out)
    out = self.sigmoid(out)  # per-channel gates in (0, 1)
    out = out.view(out.size(0), out.size(1), 1, 1)
    out = out * original_out  # excite: rescale the feature maps
    out += residual
    out = self.relu(out)
    return out
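The same attention branch can be packaged as a standalone layer. A minimal sketch, assuming the reduction ratio of 16 from the SENet paper (the SELayer name is illustrative):

import torch.nn as nn

class SELayer(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)
        self.fc2 = nn.Linear(channels // reduction, channels)
        self.relu = nn.ReLU(inplace=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        b, c = x.size(0), x.size(1)
        w = x.mean(dim=(2, 3))         # squeeze: global average pool
        w = self.relu(self.fc1(w))     # bottleneck of width channels/reduction
        w = self.sigmoid(self.fc2(w))  # per-channel gates
        return x * w.view(b, c, 1, 1)  # excite: rescale each channel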
_ | Params (k) | GPU memory (MB) | Training time (s) | Test time (s) | Accuracy (%) |
---|---|---|---|---|---|
resnet14 | 195 | 617 | 665 | 0.34 | 87 |
attention | 201 | 641 | 838 | 0.51 | 87 |
Conclusion: the parameter count and accuracy barely change, while the slowdown is noticeable.
Code references
[1] https://github.com/yunjey/pytorch-tutorial/tree/master/tutorials/02-intermediate/deep_residual_network/main.py
[2] https://github.com/miraclewkf/SENet-PyTorch/blob/master/se_resnet.py
References
- Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. 2016.
- Xie, S., Girshick, R., Dollár, P., et al. Aggregated Residual Transformations for Deep Neural Networks. 2016.
- Huang, G., Liu, Z., van der Maaten, L., et al. Densely Connected Convolutional Networks. 2016.
- Howard, A. G., Zhu, M., Chen, B., et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. 2017.
- Hu, J., Shen, L., Albanie, S., et al. Squeeze-and-Excitation Networks. 2017.
- https://www.cnblogs.com/liaohuiqiang/p/9691458.html