Accuracy on CIFAR10
These are the best results I obtained for each model; more detailed comparison experiments for the same models are described in the Experiments section below.
MODEL | ACCURACY |
---|---|
VGG16 | 90.06% |
BN-VGG16 | 92.31% |
BN-Inception | 92.41% |
Inception-v3 | 92.94% |
ResNet-v1 | 93.54% |
ResNet-v2 | 95.35% |
Only ResNet is walked through here; for more code see __my GitHub__.
PyTorch implementation of ResNet
import torch.nn as nn
import torch.utils.model_zoo as model_zoo
# The ResNet variants implemented here and the URLs of their pretrained weights
__all__=['ResNet','resnet18','resnet34','resnet50','resnet101','resnet152']
model_urls = {
'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
}
def conv3x3(in_planes,out_planes,stride=1):
    '''3x3 convolution with padding'''
    return nn.Conv2d(in_planes,out_planes,kernel_size=3,stride=stride,padding=1,bias=False)

def conv1x1(in_planes,out_planes,stride=1):
    '''1x1 convolution'''
    return nn.Conv2d(in_planes,out_planes,kernel_size=1,stride=stride,bias=False)
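A quick check of the two helpers (just a sketch with arbitrary shapes): conv3x3 keeps the spatial size when stride=1 thanks to padding=1, while conv1x1 only mixes channels.
import torch

x = torch.randn(1, 64, 32, 32)
print(conv3x3(64, 128)(x).shape)            # torch.Size([1, 128, 32, 32])
print(conv3x3(64, 128, stride=2)(x).shape)  # torch.Size([1, 128, 16, 16])
print(conv1x1(64, 256)(x).shape)            # torch.Size([1, 256, 32, 32])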
Implementation of the basic residual block
class BasicBlock(nn.Module):
    expansion=1
    def __init__(self,inplanes,planes,stride=1,downsample=None):
        super(BasicBlock,self).__init__()
        self.conv1=conv3x3(inplanes,planes,stride)
        self.bn1=nn.BatchNorm2d(planes)
        self.relu=nn.ReLU(inplace=True)
        self.conv2=conv3x3(planes,planes)
        self.bn2=nn.BatchNorm2d(planes)
        self.downsample=downsample
        self.stride=stride

    def forward(self,x):
        residual=x
        out=self.conv1(x)
        out=self.bn1(out)
        out=self.relu(out)
        out=self.conv2(out)
        out=self.bn2(out)
        # The spatial resolution shrinks only when the number of channels doubles,
        # i.e. only the first convolution of each stage uses stride=2.
        if self.downsample is not None:
            residual=self.downsample(x)
        out+=residual
        out=self.relu(out)
        return out
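A quick sanity check of the block (a minimal sketch; the shapes and the hand-built downsample are only illustrative and mirror what the ResNet class constructs later):
import torch

# Identity case: same channels, stride 1, no downsample needed.
block = BasicBlock(64, 64)
x = torch.randn(2, 64, 56, 56)
print(block(x).shape)          # torch.Size([2, 64, 56, 56])

# Downsampling case: channels double and stride=2, so the shortcut
# needs a 1x1 projection to match the residual branch.
downsample = nn.Sequential(conv1x1(64, 128, stride=2), nn.BatchNorm2d(128))
block = BasicBlock(64, 128, stride=2, downsample=downsample)
print(block(x).shape)          # torch.Size([2, 128, 28, 28])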
Implementation of the bottleneck residual block
It reduces the number of parameters: in deeper networks, replacing the basic block with the bottleneck block cuts the parameter count substantially (see the sketch after the class for a concrete comparison).
class Bottleneck(nn.Module):
    expansion=4
    def __init__(self,inplanes,planes,stride=1,downsample=None):
        super(Bottleneck,self).__init__()
        self.conv1=conv1x1(inplanes,planes)
        self.bn1=nn.BatchNorm2d(planes)
        self.conv2=conv3x3(planes,planes,stride)
        self.bn2=nn.BatchNorm2d(planes)
        self.conv3=conv1x1(planes,planes*self.expansion)
        # The first 1x1 conv shrinks the channels (typically by 4x); this one restores them.
        self.bn3=nn.BatchNorm2d(planes*self.expansion)
        self.relu=nn.ReLU(inplace=True)
        self.downsample=downsample
        self.stride=stride

    def forward(self,x):
        residual=x
        out=self.conv1(x)
        out=self.bn1(out)
        out=self.relu(out)
        out=self.conv2(out)
        out=self.bn2(out)
        out=self.relu(out)
        out=self.conv3(out)
        out=self.bn3(out)
        if self.downsample is not None:
            residual=self.downsample(x)
        out+=residual
        out=self.relu(out)
        return out
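To make the parameter saving mentioned above concrete (a minimal sketch; the 256-channel setting and the helper n_params are just for illustration):
def n_params(m):
    return sum(p.numel() for p in m.parameters())

# A bottleneck block working on 256 channels (squeezed to 64 inside) versus a
# basic block built directly on 256 channels:
print(n_params(Bottleneck(256, 64)))    # about 70k parameters
print(n_params(BasicBlock(256, 256)))   # about 1.18M parameters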
The overall ResNet architecture:
Detailed configurations of ResNets of different depths
class ResNet(nn.Module):
    def __init__(self,block,layers,num_classes=1000):
        '''
        block: which residual module to use, basic or bottleneck
        layers: how many times the block is repeated in each stage
        num_classes: number of outputs of the fc layer, i.e. the number of classes
        '''
        super(ResNet, self).__init__()
        self.inplanes=64
        self.conv1=nn.Conv2d(3,64,kernel_size=7,stride=2,padding=3,bias=False)
        self.bn1=nn.BatchNorm2d(64)
        self.relu=nn.ReLU(inplace=True)
        self.maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=1)
        # The stem is the same for every ResNet depth.
        # input: 224x224, output: 56x56 -- both conv1 and maxpool downsample.
        # This first step throws away a lot of information; the point is probably to save memory.
        self.layer1=self._make_layer(block,64,layers[0])
        self.layer2=self._make_layer(block,128,layers[1],stride=2)
        self.layer3=self._make_layer(block,256,layers[2],stride=2)
        self.layer4=self._make_layer(block,512,layers[3],stride=2)
        # Each stage doubles the channels and halves the spatial resolution.
        self.avgpool=nn.AdaptiveAvgPool2d((1,1))
        self.fc=nn.Linear(512*block.expansion,num_classes)
        for m in self.modules():
            if isinstance(m,nn.Conv2d):
                nn.init.kaiming_normal_(m.weight,mode='fan_out',nonlinearity='relu')
            elif isinstance(m,nn.BatchNorm2d):
                nn.init.constant_(m.weight,1)
                nn.init.constant_(m.bias,0)

    def _make_layer(self,block,planes,blocks,stride=1):
        downsample=None
        if stride!=1 or self.inplanes!=planes*block.expansion:
            downsample=nn.Sequential(
                conv1x1(self.inplanes,planes*block.expansion,stride),
                nn.BatchNorm2d(planes*block.expansion),
            )
            # If stride!=1, x also needs spatial downsampling.
            # Here the spatial downsampling also remixes the channel information.
        layers=[]
        layers.append(block(self.inplanes,planes,stride,downsample))
        self.inplanes=planes*block.expansion
        for _ in range(1,blocks):
            layers.append(block(self.inplanes,planes))
        return nn.Sequential(*layers)

    def forward(self,x):
        x=self.conv1(x)
        x=self.bn1(x)
        x=self.relu(x)
        x=self.maxpool(x)
        x=self.layer1(x)
        x=self.layer2(x)
        x=self.layer3(x)
        x=self.layer4(x)
        x=self.avgpool(x)
        x=x.view(x.size(0),-1)
        x=self.fc(x)
        return x
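A quick check of the resolutions noted in the comments above (a throwaway sketch; the 18-layer configuration and the 224x224 input are just for illustration):
import torch

net = ResNet(BasicBlock, [2, 2, 2, 2], num_classes=10)    # the 18-layer configuration
x = torch.randn(1, 3, 224, 224)
x = net.maxpool(net.relu(net.bn1(net.conv1(x))))
print(x.shape)                # torch.Size([1, 64, 56, 56]) -- the stem already divides by 4
for stage in (net.layer1, net.layer2, net.layer3, net.layer4):
    x = stage(x)
    print(x.shape)            # 64x56x56 -> 128x28x28 -> 256x14x14 -> 512x7x7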
def resnet18(pretrained=False,**kwargs):
    model=ResNet(BasicBlock,[2,2,2,2],**kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet18']))
    return model

def resnet34(pretrained=False,**kwargs):
    model=ResNet(BasicBlock,[3,4,6,3],**kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet34']))
    return model

def resnet50(pretrained=False,**kwargs):
    model=ResNet(Bottleneck,[3,4,6,3],**kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet50']))
    return model

def resnet101(pretrained=False,**kwargs):
    model=ResNet(Bottleneck,[3,4,23,3],**kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet101']))
    return model

def resnet152(pretrained=False,**kwargs):
    model=ResNet(Bottleneck,[3,8,36,3],**kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet152']))
    return model
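Typical usage (just a sketch: pretrained=True downloads the ImageNet weights from the URLs above, and overriding num_classes only makes sense when training from scratch, since the published fc weights have 1000 outputs):
import torch

model = resnet18(pretrained=False, num_classes=10)    # e.g. for a 10-class problem
out = model(torch.randn(2, 3, 224, 224))
print(out.shape)                                      # torch.Size([2, 10])

# With the default 1000 ImageNet classes the published weights can be loaded directly
# (this downloads the checkpoint the first time):
model = resnet50(pretrained=True)
model.eval()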
The improved residual structure
The original residual structure above works very well, but once the network reached 1202 layers it performed worse than the 110-layer one. According to the authors' analysis, a deeper network should in principle do better, and the reason it does not is that the objective is hard to optimize.
The authors first write the residual unit in a unified form:
\[ y_l = h(x_l) + F(x_l, W_l), \qquad x_{l+1} = f(y_l) \]
The original shortcut is an identity mapping, so \(h(x_l)=x_l\), but \(x_{l+1}=\mathrm{ReLU}(x_l+F(x_l,W_l))\) is not an identity mapping. The authors argue that this information flow is what makes optimization difficult, so they turn both operations into identities:
\[ x_{l+1} = x_l + F(x_l, W_l) \]
The feature of any deeper layer \(L\) can then be written as
\[ x_L = x_l + \sum_{i=l}^{L-1} F(x_i, W_i) \]
and the gradient at any layer becomes
\[ \frac{\partial \mathcal{E}}{\partial x_l} = \frac{\partial \mathcal{E}}{\partial x_L}\left(1 + \frac{\partial}{\partial x_l}\sum_{i=l}^{L-1} F(x_i, W_i)\right) \]
So the gradient propagates directly to any shallow layer and never vanishes. Moreover, the second term can be positive or negative; with proper initialization the backward gradients are roughly Gaussian-distributed, so they do not explode either.
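A toy numerical illustration of the identity term in the gradient (just a sketch; the 1-D chain and the made-up residual function 0.1*tanh(x) are not part of the model):
import torch

x0 = torch.randn(10, requires_grad=True)
x = x0
for _ in range(50):                  # 50 stacked "residual blocks" with identity shortcuts
    x = x + 0.1 * torch.tanh(x)      # x_{l+1} = x_l + F(x_l), F made up for illustration
loss = x.sum()                       # so dE/dx_L is a vector of ones
loss.backward()
print(x0.grad.min())                 # every entry is >= 1: the identity path never vanishes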
class BasicBlock(nn.Module):
    expansion=1
    def __init__(self,inplanes,planes,stride=1,downsample=None):
        super(BasicBlock,self).__init__()
        # Pre-activation: BN and ReLU come before each convolution.
        self.bn1=nn.BatchNorm2d(inplanes)
        self.relu=nn.ReLU(inplace=True)
        self.conv1=conv3x3(inplanes,planes,stride)
        self.bn2=nn.BatchNorm2d(planes)
        self.conv2=conv3x3(planes,planes)
        self.downsample=downsample
        self.stride=stride

    def forward(self,x):
        residual=x
        out=self.bn1(x)
        out=self.relu(out)
        out=self.conv1(out)
        out=self.bn2(out)
        out=self.relu(out)
        out=self.conv2(out)
        if self.downsample is not None:
            residual=self.downsample(x)
        out+=residual
        # No ReLU after the addition: the shortcut path stays a pure identity.
        return out
class Bottleneck(nn.Module):
    expansion=4
    def __init__(self,inplanes,planes,stride=1,downsample=None):
        super(Bottleneck,self).__init__()
        self.bn1=nn.BatchNorm2d(inplanes)
        self.conv1=conv1x1(inplanes,planes)
        self.bn2=nn.BatchNorm2d(planes)
        self.conv2=conv3x3(planes,planes,stride)
        self.bn3=nn.BatchNorm2d(planes)
        self.conv3=conv1x1(planes,planes*self.expansion)
        self.relu=nn.ReLU(inplace=True)
        self.downsample=downsample
        self.stride=stride

    def forward(self,x):
        residual=x
        out=self.bn1(x)
        out=self.relu(out)
        out=self.conv1(out)
        out=self.bn2(out)
        out=self.relu(out)
        out=self.conv2(out)
        out=self.bn3(out)
        out=self.relu(out)
        out=self.conv3(out)
        if self.downsample is not None:
            residual=self.downsample(x)
        out+=residual
        return out
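These pre-activation blocks can be dropped into the same ResNet class above. One detail I should flag as an assumption (it is not shown in the code here): in the full pre-activation design of the v2 paper there is an extra BN + ReLU after the last residual stage, before average pooling, because the blocks themselves no longer end with an activation. A minimal sketch of that wiring, reusing the class above (the name PreActResNet and bn_final are mine):
import torch

class PreActResNet(ResNet):
    '''Same body as ResNet, but with a final BN + ReLU before pooling.'''
    def __init__(self,block,layers,num_classes=1000):
        super(PreActResNet,self).__init__(block,layers,num_classes)
        self.bn_final=nn.BatchNorm2d(512*block.expansion)

    def forward(self,x):
        x=self.maxpool(self.relu(self.bn1(self.conv1(x))))
        x=self.layer4(self.layer3(self.layer2(self.layer1(x))))
        x=self.relu(self.bn_final(x))        # extra pre-activation "tail"
        x=self.avgpool(x)
        return self.fc(x.view(x.size(0),-1))

# Bottleneck here refers to the pre-activation version defined above.
model = PreActResNet(Bottleneck, [3, 4, 6, 3], num_classes=10)
print(model(torch.randn(2, 3, 224, 224)).shape)   # torch.Size([2, 10])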
Experiments
Test results on __CIFAR10__:
ResNet:
These are the results of my implementation of the 18-, 34-, 50-, 101- and 152-layer ResNet-v1 described in the original paper. I forgot to add a legend: the left plot shows the 18- and 34-layer networks, the right one the 50-, 101- and 152-layer networks. The deeper the network, the lower the accuracy.


To cope with CIFAR10's low resolution, I changed the first convolution from 7x7 to 3x3 and kept the spatial resolution (a code sketch of this change follows the figure discussion below); everything after that follows the settings of the original paper. To compare v1 and v2 and to reproduce the reported effect that deeper v1 networks actually get worse, the deepest I went was a 302-layer network, but there was no obvious effect; anything above 1000 layers is simply beyond my compute, and the 302-layer run already took 10 hours on an RTX 2080 Ti. See the figures:


As the networks get deeper, the accuracy of v1 essentially stops improving, while v2 still gains a little. Comparing across depths, ResNet-v2 also has the edge.
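For reference, the CIFAR10 stem modification described above is roughly the following (a sketch only; the name resnet_for_cifar10 is mine, and the exact recipe I used in the experiments may differ in details):
import torch

def resnet_for_cifar10(block=BasicBlock, layers=(2, 2, 2, 2)):
    model = ResNet(block, layers, num_classes=10)
    # Keep the 32x32 resolution in the stem: 3x3 conv, stride 1, no max-pooling.
    model.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
    model.maxpool = nn.Identity()
    return model

net = resnet_for_cifar10()
print(net(torch.randn(2, 3, 32, 32)).shape)   # torch.Size([2, 10])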
VGG16:

vgg16_bn uses an initial learning rate of 0.1; plain vgg uses 0.01.
If plain vgg also started at 0.1, the whole network diverged and never recovered, so the BN version of VGG is clearly less sensitive to the learning rate, and it also ends up noticeably better. Note that before epoch 100 the BN version is actually worse than the original VGG; the reason should be that the learning rate is still too large, and once it is decayed by 10x the BN version overtakes the original VGG.
Inception:
BN-Inception:
I tested learning rates of 0.01 and 0.001 in both settings. With weight_decay=0.00001 as in the original paper I could never get above 90% accuracy, so I ran weight decay of 1e-5 (left plot) and 5e-5 (right plot). The result is interesting: in the left plot the different learning rates end up at about the same accuracy, which suggests that BN really does give some tolerance to the learning rate. In the right plot the gap is large. My own reading is that BN does not act as a regularizer, so an extra regularization term is still needed to prevent overfitting; but then the learning rate matters a lot again, which I cannot fully explain yet. My guess is that with the smaller learning rate the model ends up under-fitting once it is constrained by the regularizer, but this needs more experiments to confirm.


Inception-v3:
Slightly better than BN-Inception.