I. A Quick Review of the EfficientNet Architecture
The EfficientNet-B0 baseline network is organized into 9 stages; the operator used in Stages 2-8 is MBConv throughout.
The MBConv structure
On the main branch there is first a 1x1 expansion convolution whose output channel count is n times the input channels, followed by BN and the Swish activation; then comes the DW (depthwise) convolution, again followed by BN + Swish; then the SE attention module; then a 1x1 projection (dimension-reducing) convolution + BN; and finally a dropout layer (implemented as DropPath in the code below).
The SE module structure
It consists of a global average pooling layer followed by two fully connected layers.
With this quick review done, let's look at the code.
II. Code
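All of the snippets below live in a single model file. To make them easier to run on their own, they assume roughly the following imports (a DropPath / stochastic-depth helper is also needed in section 5, where a minimal sketch of it is given):

import copy
import math
from collections import OrderedDict
from functools import partial
from typing import Callable, Optional

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import Tensor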
1. _make_divisible
Its job is to adjust the channel count to the nearest multiple of 8, which is friendlier to the hardware.
def _make_divisible(ch, divisor=8, min_ch=None):
    if min_ch is None:
        min_ch = divisor
    new_ch = max(min_ch, int(ch + divisor / 2) // divisor * divisor)
    # Make sure that round down does not go down by more than 10%.
    if new_ch < 0.9 * ch:
        new_ch += divisor
    return new_ch
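A quick sanity check (values chosen purely for illustration): scaling 32 channels by a width factor of 1.2 gives 38.4, which rounds to 40; 19 would round down to 16, but 16 loses more than 10% of 19, so the guard adds one divisor back:

print(_make_divisible(32 * 1.2))  # 40  (38.4 -> nearest multiple of 8)
print(_make_divisible(28))        # 32
print(_make_divisible(19))        # 24  (16 < 0.9 * 19, so 8 is added back)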
2. The ConvBNActivation module
This class defines the Conv + BN + activation building block used to assemble the MBConv module. It takes the input channels, the output channels, the kernel size, the stride, a groups argument (which controls whether the convolution is an ordinary Conv or a DW conv), the normalization layer (BN) and the activation function. Padding is computed from the kernel size, and the defaults for normalization and activation are BatchNorm2d and SiLU (available since torch 1.7; SiLU is the same as Swish). In the call to super() we pass the layers to be built: the convolution layer (in_channels, out_channels, kernel_size, stride, padding, groups, bias=False, all taken from the arguments), the normalization layer norm_layer applied to the output channels, and the activation layer.
class ConvBNActivation(nn.Sequential):
    def __init__(self,
                 in_planes: int,
                 out_planes: int,
                 kernel_size: int = 3,
                 stride: int = 1,
                 groups: int = 1,
                 norm_layer: Optional[Callable[..., nn.Module]] = None,
                 activation_layer: Optional[Callable[..., nn.Module]] = None):
        padding = (kernel_size - 1) // 2
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if activation_layer is None:
            activation_layer = nn.SiLU  # alias Swish  (torch>=1.7)

        super(ConvBNActivation, self).__init__(nn.Conv2d(in_channels=in_planes,
                                                         out_channels=out_planes,
                                                         kernel_size=kernel_size,
                                                         stride=stride,
                                                         padding=padding,
                                                         groups=groups,
                                                         bias=False),
                                               norm_layer(out_planes),
                                               activation_layer())
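As a quick illustration (hypothetical channel numbers from a typical MBConv block), an ordinary 1x1 convolution and a 3x3 DW convolution differ only in the groups argument:

expand = ConvBNActivation(24, 144, kernel_size=1)         # ordinary conv, groups=1
dw = ConvBNActivation(144, 144, kernel_size=3, stride=1,
                      groups=144)                          # DW conv, groups == channels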
3. The SE module
The module takes input_c, the channel count at the input of the MBConv block, and expand_c, the channel count after the first (expansion) convolution; since the DW convolution does not change the channel dimension, the feature map entering the SE module also has expand_c channels. squeeze_c is the number of nodes of the first fully connected layer and equals input_c / squeeze_factor, where squeeze_factor defaults to 4 as in the paper.
First we compute squeeze_c. The two fully connected layers are implemented with 1x1 convolutions. The first one maps expand_c channels (the output of the expansion convolution) down to squeeze_c channels and is followed by the Swish activation (nn.SiLU). The second one maps squeeze_c channels back up to expand_c channels, because the result has to be multiplied element-wise with the feature map, and is followed by a Sigmoid activation.
The forward pass is then: apply global average pooling to the input, reducing each channel to a single value; pass the result through fc1, activation 1, fc2 and activation 2; and multiply the resulting scale with x to obtain the output.
class SqueezeExcitation(nn.Module):
    def __init__(self,
                 input_c: int,    # block input channel
                 expand_c: int,   # block expand channel
                 squeeze_factor: int = 4):
        super(SqueezeExcitation, self).__init__()
        squeeze_c = input_c // squeeze_factor
        self.fc1 = nn.Conv2d(expand_c, squeeze_c, 1)
        self.ac1 = nn.SiLU()  # alias Swish
        self.fc2 = nn.Conv2d(squeeze_c, expand_c, 1)
        self.ac2 = nn.Sigmoid()

    def forward(self, x: Tensor) -> Tensor:
        scale = F.adaptive_avg_pool2d(x, output_size=(1, 1))
        scale = self.fc1(scale)
        scale = self.ac1(scale)
        scale = self.fc2(scale)
        scale = self.ac2(scale)
        return scale * x
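A small shape check (illustrative numbers): with input_c=24 and expand_c=144, squeeze_c is 24 // 4 = 6, so fc1 maps 144 -> 6 and fc2 maps 6 -> 144, and the output keeps the shape of the input feature map:

se = SqueezeExcitation(input_c=24, expand_c=144)
x = torch.randn(1, 144, 56, 56)   # e.g. the feature map after the DW conv
print(se(x).shape)                # torch.Size([1, 144, 56, 56])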
4. InvertedResidualConfig
This class holds the configuration parameters of a single MBConv module. The parameters involved are: kernel (3 or 5); input_c, the channel count at the input of the MBConv block; out_c, the channel count at its output; expanded_ratio (1 or 6); stride (1 or 2); use_se, a boolean flag for the SE module; drop_rate; index (1a, 2a, ...), a string recording the name of the current MBConv block; and width_coefficient, the width multiplier of the network (a float).
The class also defines a static method, adjust_channels, which multiplies the channel count by the width multiplier and then calls the _make_divisible function from section 1 to round the result to the nearest multiple of 8.
In the constructor, both the input and the output channel counts go through adjust_channels; the remaining arguments are assigned directly.
class InvertedResidualConfig:
    # kernel_size, in_channel, out_channel, exp_ratio, strides, use_SE, drop_connect_rate
    def __init__(self,
                 kernel: int,          # 3 or 5
                 input_c: int,
                 out_c: int,
                 expanded_ratio: int,  # 1 or 6
                 stride: int,          # 1 or 2
                 use_se: bool,         # True
                 drop_rate: float,
                 index: str,           # 1a, 2a, 2b, ...
                 width_coefficient: float):
        self.input_c = self.adjust_channels(input_c, width_coefficient)
        self.kernel = kernel
        self.expanded_c = self.input_c * expanded_ratio
        self.out_c = self.adjust_channels(out_c, width_coefficient)
        self.use_se = use_se
        self.stride = stride
        self.drop_rate = drop_rate
        self.index = index

    @staticmethod
    def adjust_channels(channels: int, width_coefficient: float):
        return _make_divisible(channels * width_coefficient, 8)
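For example, B4 uses a width multiplier of 1.4, so the 32 stem channels become 32 × 1.4 = 44.8, which adjust_channels rounds to 48; with B0's multiplier of 1.0 nothing changes:

print(InvertedResidualConfig.adjust_channels(32, 1.0))  # 32
print(InvertedResidualConfig.adjust_channels(32, 1.4))  # 48  (44.8 rounded to a multiple of 8)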
5. The InvertedResidual (MBConv) module
Arguments: cnf, the configuration object from the previous section, and norm_layer, the BN layer.
First, check whether cnf.stride is 1 or 2 and raise an error otherwise. Second, decide whether the shortcut branch is needed: it exists only when the output has the same shape as the input, i.e. when the stride is 1 and the input channels equal the output channels. Third, create an OrderedDict called layers, set the activation to SiLU, and build the network layer by layer.
When the expansion factor n = 1 there is no 1x1 expansion convolution; that is, none of the MBConv blocks in Stage 2 has this first expansion layer (similar to MobileNetV3). So we check whether expanded_c equals input_c: if they are equal, n = 1 and the expansion convolution is skipped; otherwise an expand_conv block is built with the ConvBNActivation class implemented earlier. Next, the DW convolution is built, again with ConvBNActivation; then, if use_se is set, the SE module is added; finally, the 1x1 projection convolution is built with activation_layer=nn.Identity, so no activation is applied. Passing the OrderedDict layers to nn.Sequential produces the main branch of the MBConv block. The dropout layer is only created when drop_rate > 0 (and the shortcut is used).
In the forward pass, the input feature map x goes through the main branch (block), then through the dropout layer; if the shortcut branch is used, x is added to the result before returning, otherwise the result is returned directly.
class InvertedResidual(nn.Module):
    def __init__(self,
                 cnf: InvertedResidualConfig,
                 norm_layer: Callable[..., nn.Module]):
        super(InvertedResidual, self).__init__()

        if cnf.stride not in [1, 2]:
            raise ValueError("illegal stride value.")

        self.use_res_connect = (cnf.stride == 1 and cnf.input_c == cnf.out_c)

        layers = OrderedDict()
        activation_layer = nn.SiLU  # alias Swish

        # expand
        if cnf.expanded_c != cnf.input_c:
            layers.update({"expand_conv": ConvBNActivation(cnf.input_c,
                                                           cnf.expanded_c,
                                                           kernel_size=1,
                                                           norm_layer=norm_layer,
                                                           activation_layer=activation_layer)})

        # depthwise
        layers.update({"dwconv": ConvBNActivation(cnf.expanded_c,
                                                  cnf.expanded_c,
                                                  kernel_size=cnf.kernel,
                                                  stride=cnf.stride,
                                                  groups=cnf.expanded_c,
                                                  norm_layer=norm_layer,
                                                  activation_layer=activation_layer)})

        if cnf.use_se:
            layers.update({"se": SqueezeExcitation(cnf.input_c,
                                                   cnf.expanded_c)})

        # project
        layers.update({"project_conv": ConvBNActivation(cnf.expanded_c,
                                                        cnf.out_c,
                                                        kernel_size=1,
                                                        norm_layer=norm_layer,
                                                        activation_layer=nn.Identity)})

        self.block = nn.Sequential(layers)
        self.out_channels = cnf.out_c
        self.is_strided = cnf.stride > 1

        # the dropout (DropPath) layer is only used when the shortcut connection is present
        if self.use_res_connect and cnf.drop_rate > 0:
            self.dropout = DropPath(cnf.drop_rate)
        else:
            self.dropout = nn.Identity()

    def forward(self, x: Tensor) -> Tensor:
        result = self.block(x)
        result = self.dropout(result)
        if self.use_res_connect:
            result += x
        return result
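The DropPath layer used above (stochastic depth, i.e. the drop-connect applied to the main branch) is not listed in these snippets; a minimal sketch following the common implementation looks like this:

def drop_path(x, drop_prob: float = 0., training: bool = False):
    # randomly drop the whole main branch of a residual block, per sample
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)   # broadcast over all dims except batch
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()                        # binarize: 0 = drop, 1 = keep
    return x.div(keep_prob) * random_tensor


class DropPath(nn.Module):
    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)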
6. EfficientNet
Finally, we define the EfficientNet class. Its arguments are: width_coefficient, the width multiplier; depth_coefficient, the depth multiplier; num_classes, the number of classes; dropout_rate, the dropout placed in front of the fully connected layer in Stage 9 of the overall architecture; drop_connect_rate, the drop rate of the stochastic-depth layer at the end of each MBConv block; block, the MBConv module itself; and norm_layer, the BN layer. default_cnf records the default configuration of Stages 2-8. depth_coefficient is the multiplier along the depth dimension (it only applies to Stages 2-8): for example, Stage 7 of EfficientNet-B0 repeats its block L = 4 times, so in B6 (depth factor 2.6) this becomes 4 × 2.6 = 10.4, which is rounded up to 11.
class EfficientNet(nn.Module):
    def __init__(self,
                 width_coefficient: float,
                 depth_coefficient: float,
                 num_classes: int = 1000,
                 dropout_rate: float = 0.2,
                 drop_connect_rate: float = 0.2,
                 block: Optional[Callable[..., nn.Module]] = None,
                 norm_layer: Optional[Callable[..., nn.Module]] = None
                 ):
        super(EfficientNet, self).__init__()

        # kernel_size, in_channel, out_channel, exp_ratio, strides, use_SE, drop_connect_rate, repeats
        default_cnf = [[3, 32, 16, 1, 1, True, drop_connect_rate, 1],
                       [3, 16, 24, 6, 2, True, drop_connect_rate, 2],
                       [5, 24, 40, 6, 2, True, drop_connect_rate, 2],
                       [3, 40, 80, 6, 2, True, drop_connect_rate, 3],
                       [5, 80, 112, 6, 1, True, drop_connect_rate, 3],
                       [5, 112, 192, 6, 2, True, drop_connect_rate, 4],
                       [3, 192, 320, 6, 1, True, drop_connect_rate, 1]]

        def round_repeats(repeats):
            """Round number of repeats based on depth multiplier."""
            return int(math.ceil(depth_coefficient * repeats))

        if block is None:
            block = InvertedResidual

        if norm_layer is None:
            norm_layer = partial(nn.BatchNorm2d, eps=1e-3, momentum=0.1)

        adjust_channels = partial(InvertedResidualConfig.adjust_channels,
                                  width_coefficient=width_coefficient)

        # build inverted_residual_setting
        bneck_conf = partial(InvertedResidualConfig,
                             width_coefficient=width_coefficient)

        b = 0
        num_blocks = float(sum(round_repeats(i[-1]) for i in default_cnf))
        inverted_residual_setting = []
        for stage, args in enumerate(default_cnf):
            cnf = copy.copy(args)
            for i in range(round_repeats(cnf.pop(-1))):
                if i > 0:
                    # strides equal 1 except first cnf
                    cnf[-3] = 1  # strides
                    cnf[1] = cnf[2]  # input_channel equal output_channel

                cnf[-1] = args[-2] * b / num_blocks  # update dropout ratio
                index = str(stage + 1) + chr(i + 97)  # 1a, 2a, 2b, ...
                inverted_residual_setting.append(bneck_conf(*cnf, index))
                b += 1

        # create layers
        layers = OrderedDict()

        # first conv
        layers.update({"stem_conv": ConvBNActivation(in_planes=3,
                                                     out_planes=adjust_channels(32),
                                                     kernel_size=3,
                                                     stride=2,
                                                     norm_layer=norm_layer)})

        # building inverted residual blocks
        for cnf in inverted_residual_setting:
            layers.update({cnf.index: block(cnf, norm_layer)})

        # build top
        last_conv_input_c = inverted_residual_setting[-1].out_c
        last_conv_output_c = adjust_channels(1280)
        layers.update({"top": ConvBNActivation(in_planes=last_conv_input_c,
                                               out_planes=last_conv_output_c,
                                               kernel_size=1,
                                               norm_layer=norm_layer)})

        self.features = nn.Sequential(layers)
        self.avgpool = nn.AdaptiveAvgPool2d(1)

        classifier = []
        if dropout_rate > 0:
            classifier.append(nn.Dropout(p=dropout_rate, inplace=True))
        classifier.append(nn.Linear(last_conv_output_c, num_classes))
        self.classifier = nn.Sequential(*classifier)

        # initial weights
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode="fan_out")
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def _forward_impl(self, x: Tensor) -> Tensor:
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x

    def forward(self, x: Tensor) -> Tensor:
        return self._forward_impl(x)
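To actually build a model, the EfficientNet class is typically wrapped in small factory functions; a sketch for B0 (compound-scaling coefficients: width 1.0, depth 1.0, dropout 0.2, 224×224 input) could look like this:

def efficientnet_b0(num_classes: int = 1000) -> EfficientNet:
    # B0: width_coefficient=1.0, depth_coefficient=1.0, dropout_rate=0.2
    return EfficientNet(width_coefficient=1.0,
                        depth_coefficient=1.0,
                        dropout_rate=0.2,
                        num_classes=num_classes)


model = efficientnet_b0(num_classes=5)
out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 5])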