Building a Complex Network from Scratch (MobileNetV2 in TensorFlow)

Over the past few years, TensorFlow has grown into one of the most widely used neural-network frameworks. MobileNetV2, building on Xception's demonstration of depthwise separable convolutions, is a relatively mature and moderately complex architecture, which makes it an excellent case study for learning how to assemble a network.

The MobileNetV2 architecture is based on the inverted residual block. It is still fundamentally a residual design: a traditional residual block has many channels at its two ends and few in the middle, while the inverted residual has few channels at the ends and many inside the block, roughly the difference between an hourglass and a spindle. It also retains depthwise separable convolutions. The paper validates the model on ImageNet classification, COCO object detection, and VOC image segmentation, striking a balance between accuracy, model size, and computation time.
 
 
Now let's get started!
First, we build the overall network skeleton:
import tensorflow as tf
from mobilenet_v2.ops import *


def mobilenetv2(inputs, num_classes, is_train=True, reuse=False):
    exp = 6  # expansion ratio
    with tf.variable_scope('mobilenetv2', reuse=reuse):
        net = conv2d_block(inputs, 32, 3, 2, is_train, name='conv1_1')  # size/2

        net = res_block(net, 1, 16, 1, is_train, name='res2_1')

        net = res_block(net, exp, 24, 2, is_train, name='res3_1')  # size/4
        net = res_block(net, exp, 24, 1, is_train, name='res3_2')

        net = res_block(net, exp, 32, 2, is_train, name='res4_1')  # size/8
        net = res_block(net, exp, 32, 1, is_train, name='res4_2')
        net = res_block(net, exp, 32, 1, is_train, name='res4_3')

        net = res_block(net, exp, 64, 2, is_train, name='res5_1')  # size/16
        net = res_block(net, exp, 64, 1, is_train, name='res5_2')
        net = res_block(net, exp, 64, 1, is_train, name='res5_3')
        net = res_block(net, exp, 64, 1, is_train, name='res5_4')

        net = res_block(net, exp, 96, 1, is_train, name='res6_1')
        net = res_block(net, exp, 96, 1, is_train, name='res6_2')
        net = res_block(net, exp, 96, 1, is_train, name='res6_3')

        net = res_block(net, exp, 160, 2, is_train, name='res7_1')  # size/32
        net = res_block(net, exp, 160, 1, is_train, name='res7_2')
        net = res_block(net, exp, 160, 1, is_train, name='res7_3')

        net = res_block(net, exp, 320, 1, is_train, name='res8_1', shortcut=False)

        net = pwise_block(net, 1280, is_train, name='conv9_1')
        net = global_avg(net)
        logits = flatten(conv_1x1(net, num_classes, name='logits'))

        pred = tf.nn.softmax(logits, name='prob')
        return logits, pred

MobileNetV2 starts with a standard 3×3 convolution (32 output channels, stride 2), then moves into the stacked res_block (residual) layers. After the residual stack comes pwise_block (essentially a 1×1 convolution that adjusts the channel count to 1280), followed by global average pooling and a final 1×1 convolution that maps the output down to the number of classes.
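Before dissecting the modules, here is a quick smoke-test sketch of the finished graph (the input size and class count are assumptions; with a 224×224 input, each size/2 step above halves the feature map, ending at 7×7 before pooling):

inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
logits, pred = mobilenetv2(inputs, num_classes=1000, is_train=False)
print(logits.get_shape())  # (?, 1000)
print(pred.get_shape())    # (?, 1000)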

Next, let's walk through each module in detail.

First up is the initial convolution module. It is built from the two functions below: a convolution layer, followed by batch normalization and ReLU6.

# These helpers live in mobilenet_v2/ops.py (imported above).
import tensorflow as tf

weight_decay = 1e-4  # assumed L2 regularization strength; defined once at module level


def conv2d(input_, output_dim, k_h, k_w, d_h, d_w, stddev=0.02, name='conv2d', bias=False):
    with tf.variable_scope(name):
        w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
              regularizer=tf.contrib.layers.l2_regularizer(weight_decay),
              initializer=tf.truncated_normal_initializer(stddev=stddev))
        # truncated_normal_initializer draws weights from a truncated normal distribution
        conv = tf.nn.conv2d(input_, w, strides=[1, d_h, d_w, 1], padding='SAME')
        if bias:
            biases = tf.get_variable('bias', [output_dim], initializer=tf.constant_initializer(0.0))
            conv = tf.nn.bias_add(conv, biases)

        return conv


def conv2d_block(input, out_dim, k, s, is_train, name):
    with tf.name_scope(name), tf.variable_scope(name):
        net = conv2d(input, out_dim, k, k, s, s, name='conv2d')
        net = batch_norm(net, train=is_train, name='bn')
        net = relu(net)
        return net

The convolution layer first defines w. Think of w as the convolution kernel: a tensor with shape [filter_height, filter_width, in_channels, out_channels], created with L2 regularization and a truncated-normal initializer. It then performs the convolution via tf.nn.conv2d and optionally adds a bias.
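For instance, calling it standalone (a sketch; the placeholder shape is an assumption):

x = tf.placeholder(tf.float32, [None, 224, 224, 3])
y = conv2d(x, 32, 3, 3, 2, 2, name='demo_conv')
print(y.get_shape())  # (?, 112, 112, 32): SAME padding with stride 2 halves height and width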

TensorFlow provides ReLU6 and batch normalization directly; we simply wrap them:

def relu(x, name='relu6'):
    return tf.nn.relu6(x, name)


def batch_norm(x, momentum=0.9, epsilon=1e-5, train=True, name='bn'):
    return tf.layers.batch_normalization(x,
                      momentum=momentum,
                      epsilon=epsilon,
                      scale=True,
                      training=train,
                      name=name)
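One caveat: tf.layers.batch_normalization only refreshes its moving mean and variance through ops it adds to the UPDATE_OPS collection, so any training op must depend on them, or the inference-mode statistics will never update. A minimal sketch (loss and the optimizer choice are placeholders, not from the original code):

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)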

Next, we stack res_block inverted residual modules on top of the first convolution layer:

def res_block(input, expansion_ratio, output_dim, stride, is_train, name, bias=False, shortcut=True):
    with tf.name_scope(name), tf.variable_scope(name):
        # pw
        bottleneck_dim=round(expansion_ratio*input.get_shape().as_list()[-1])
        net = conv_1x1(input, bottleneck_dim, name='pw', bias=bias)
        net = batch_norm(net, train=is_train, name='pw_bn')
        net = relu(net)
        # dw
        net = dwise_conv(net, strides=[1, stride, stride, 1], name='dw', bias=bias)
        net = batch_norm(net, train=is_train, name='dw_bn')
        net = relu(net)
        # pw & linear
        net = conv_1x1(net, output_dim, name='pw_linear', bias=bias)
        net = batch_norm(net, train=is_train, name='pw_linear_bn')

        # element wise add, only for stride==1
        if shortcut and stride == 1:
            in_dim=int(input.get_shape().as_list()[-1])
            if in_dim != output_dim:
                ins=conv_1x1(input, output_dim, name='ex_dim')
                net=ins+net
            else:
                net=input+net

        return net

The residual module uses the inverted residual structure.

The MobileNetV2 architecture is built on this inverted residual structure: in an ordinary residual block, the main branch has three convolutions, and the two pointwise convolutions carry the larger channel counts. The inverted residual is the reverse: the middle convolution (which still uses the depthwise separable structure) has the most channels, while the two ends have fewer.

Each block therefore consists of a 1×1 expansion convolution, a 3×3 depthwise convolution, and a 1×1 projection convolution whose output is kept linear (no activation).

The bottleneck width is set by the expansion ratio (6 here): the first 1×1 convolution expands the input channels to expansion_ratio × input_dim.
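To make the arithmetic concrete, here is a minimal shape trace through one block (a sketch; the demo name and placeholder are mine, but the dimensions match res4_1 in the main function, where a 56×56×24 feature map enters with stride 2):

x = tf.placeholder(tf.float32, [None, 56, 56, 24])
# pw expands 24 -> round(6 * 24) = 144 channels, dw halves the spatial size,
# pw_linear projects 144 -> 32; no shortcut because stride != 1
y = res_block(x, 6, 32, 2, is_train=False, name='res4_1_demo')
print(y.get_shape())  # (?, 28, 28, 32)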

The depthwise convolution, dwise_conv, is implemented as follows:

def dwise_conv(input, k_h=3, k_w=3, channel_multiplier= 1, strides=[1,1,1,1],
               padding='SAME', stddev=0.02, name='dwise_conv', bias=False):
    with tf.variable_scope(name):
        in_channel=input.get_shape().as_list()[-1]
        w = tf.get_variable('w', [k_h, k_w, in_channel, channel_multiplier],
                        regularizer=tf.contrib.layers.l2_regularizer(weight_decay),
                        initializer=tf.truncated_normal_initializer(stddev=stddev))
        conv = tf.nn.depthwise_conv2d(input, w, strides, padding)
        if bias:
            biases = tf.get_variable('bias', [in_channel*channel_multiplier], initializer=tf.constant_initializer(0.0))
            conv = tf.nn.bias_add(conv, biases)

        return conv

The depthwise kernel w has shape [k_h, k_w, in_channel, channel_multiplier]; with channel_multiplier = 1, each input channel gets its own filter and the channel count is preserved.

The 1×1 convolution is defined as follows:

def conv_1x1(input, output_dim, name, bias=False):
    with tf.name_scope(name):
        return conv2d(input, output_dim, 1,1,1,1, stddev=0.02, name=name, bias=bias)
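The main function also calls pwise_block, which this post never shows. Based on the earlier description (a 1×1 convolution that adjusts the channel count), here is a minimal sketch, assuming it mirrors conv2d_block with a pointwise convolution plus batch norm and ReLU6:

def pwise_block(input, output_dim, is_train, name, bias=False):
    with tf.name_scope(name), tf.variable_scope(name):
        # pointwise (1x1) convolution to output_dim channels, then BN + ReLU6
        out = conv_1x1(input, output_dim, name='pwb', bias=bias)
        out = batch_norm(out, train=is_train, name='pwb_bn')
        out = relu(out)
        return out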

Back in res_block, we add the shortcut connection to the residual branch:

       # element wise add, only for stride==1
        if shortcut and stride == 1:
            in_dim=int(input.get_shape().as_list()[-1])
            if in_dim != output_dim:
                ins=conv_1x1(input, output_dim, name='ex_dim')
                net=ins+net
            else:
                net=input+net

The shortcut is enabled only when stride == 1, so that the spatial dimensions of input and output match; if the channel counts differ as well, a 1×1 convolution ('ex_dim') first projects the input to output_dim before the element-wise add.

These blocks are then stacked as shown in the main function above.

At the end of the stack we apply global average pooling:

def global_avg(x):
    with tf.name_scope('global_avg'):
        net=tf.layers.average_pooling2d(x, x.get_shape()[1:-1], 1)
        return net
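A quick shape check (a sketch; with a 224×224 input, the feature map reaching this point is 7×7×1280):

x = tf.placeholder(tf.float32, [None, 7, 7, 1280])
net = global_avg(x)
print(net.get_shape())  # (?, 1, 1, 1280)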

Instead of a fully connected layer, we use a 1×1 convolution to map the feature dimension down to the number of classes, then flatten the result. The flatten helper used in the main function is just a thin wrapper:

def flatten(x):
    return tf.contrib.layers.flatten(x)

Finally, softmax turns the logits into class probabilities:

        pred = tf.nn.softmax(logits, name='prob')

And that's it: MobileNetV2 is built.

This pattern of network construction can serve as a template: once the inputs and outputs are fixed, the network is easy to plug into an Estimator, swap out for another architecture, or fine-tune later.
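To illustrate that, here is a minimal sketch of a tf.estimator model_fn around mobilenetv2 (the loss, optimizer, and params keys are my assumptions, not part of the original post):

def model_fn(features, labels, mode, params):
    is_train = (mode == tf.estimator.ModeKeys.TRAIN)
    logits, pred = mobilenetv2(features, params['num_classes'], is_train=is_train)

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions={'prob': pred})

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    if mode == tf.estimator.ModeKeys.TRAIN:
        # depend on UPDATE_OPS so batch-norm statistics keep updating
        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
        with tf.control_dependencies(update_ops):
            train_op = tf.train.AdamOptimizer(params['learning_rate']).minimize(
                loss, global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
    return tf.estimator.EstimatorSpec(mode, loss=loss)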

This time we built the network with tf.nn; next time we'll try building it with slim and tf.layers.