Building a Complex Network from Scratch (Using MobileNetV2 as an Example)
After several years of development, TensorFlow has grown into one of the most widely used neural network frameworks. MobileNetV2, which builds on Xception's experience with depthwise separable convolutions, is a relatively mature and complex architecture, which makes it an excellent exercise for learning how to assemble a network.
Related reading: A Detailed Look at the Lightweight CNN MobileNet Papers (V1 & V2)
import tensorflow as tf
from mobilenet_v2.ops import *


def mobilenetv2(inputs, num_classes, is_train=True, reuse=False):
    exp = 6  # expansion ratio
    with tf.variable_scope('mobilenetv2', reuse=reuse):
        net = conv2d_block(inputs, 32, 3, 2, is_train, name='conv1_1')  # size/2

        net = res_block(net, 1, 16, 1, is_train, name='res2_1')

        net = res_block(net, exp, 24, 2, is_train, name='res3_1')  # size/4
        net = res_block(net, exp, 24, 1, is_train, name='res3_2')

        net = res_block(net, exp, 32, 2, is_train, name='res4_1')  # size/8
        net = res_block(net, exp, 32, 1, is_train, name='res4_2')
        net = res_block(net, exp, 32, 1, is_train, name='res4_3')

        net = res_block(net, exp, 64, 1, is_train, name='res5_1')
        net = res_block(net, exp, 64, 1, is_train, name='res5_2')
        net = res_block(net, exp, 64, 1, is_train, name='res5_3')
        net = res_block(net, exp, 64, 1, is_train, name='res5_4')

        net = res_block(net, exp, 96, 2, is_train, name='res6_1')  # size/16
        net = res_block(net, exp, 96, 1, is_train, name='res6_2')
        net = res_block(net, exp, 96, 1, is_train, name='res6_3')

        net = res_block(net, exp, 160, 2, is_train, name='res7_1')  # size/32
        net = res_block(net, exp, 160, 1, is_train, name='res7_2')
        net = res_block(net, exp, 160, 1, is_train, name='res7_3')

        net = res_block(net, exp, 320, 1, is_train, name='res8_1', shortcut=False)

        net = pwise_block(net, 1280, is_train, name='conv9_1')
        net = global_avg(net)
        logits = flatten(conv_1x1(net, num_classes, name='logits'))

        pred = tf.nn.softmax(logits, name='prob')
        return logits, pred
MobileNetV2 starts with a 3×3 convolution (32 output channels, stride 2) in the first layer, then moves into a stack of res_block modules (inverted residual blocks). After the residual stack it applies pwise_block (essentially a 1×1 convolution that raises the channel count to 1280), then global average pooling, and finally a 1×1 convolution that maps the output to the number of classes.
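Before dissecting the modules, here is a minimal usage sketch to show how the builder is called (hypothetical: the 224×224 RGB input and 1000 classes are assumptions, not part of the original code):

x = tf.placeholder(tf.float32, [None, 224, 224, 3], name='input')
logits, pred = mobilenetv2(x, num_classes=1000, is_train=True)
print(logits.get_shape())  # (?, 1000)
print(pred.get_shape())    # (?, 1000), each row sums to 1 after the softmax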
Now let's walk through each module in detail.
First, the initial convolution module.
The convolution module is made up of the two functions below; it consists of a convolution layer, batch normalization (batch_normalization), and ReLU6.
weight_decay = 1e-4  # L2 regularization strength; the code below assumes this module-level constant

def conv2d(input_, output_dim, k_h, k_w, d_h, d_w, stddev=0.02, name='conv2d', bias=False):
    with tf.variable_scope(name):
        w = tf.get_variable('w', [k_h, k_w, input_.get_shape()[-1], output_dim],
                            regularizer=tf.contrib.layers.l2_regularizer(weight_decay),
                            initializer=tf.truncated_normal_initializer(stddev=stddev))
        # truncated_normal_initializer draws random values from a truncated normal distribution
        conv = tf.nn.conv2d(input_, w, strides=[1, d_h, d_w, 1], padding='SAME')
        if bias:
            biases = tf.get_variable('bias', [output_dim], initializer=tf.constant_initializer(0.0))
            conv = tf.nn.bias_add(conv, biases)
        return conv

def conv2d_block(input, out_dim, k, s, is_train, name):
    with tf.name_scope(name), tf.variable_scope(name):
        net = conv2d(input, out_dim, k, k, s, s, name='conv2d')
        net = batch_norm(net, train=is_train, name='bn')
        net = relu(net)
        return net
The convolution layer first defines w (think of w as the convolution kernel: a Tensor of shape [filter_height, filter_width, in_channels, out_channels]), attaching an L2 regularizer and initializing it from a truncated normal distribution. It then performs the convolution with tf.nn.conv2d and, if requested, adds a bias.
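As a quick shape check (a hypothetical toy run, assuming the helpers above are in scope): with SAME padding and stride 2, the stem convolution halves the spatial dimensions:

x = tf.placeholder(tf.float32, [1, 224, 224, 3])
y = conv2d(x, 32, 3, 3, 2, 2, name='demo_conv')
print(y.get_shape())  # (1, 112, 112, 32)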
For ReLU6 and batch normalization, TensorFlow ships ready-made functions; we simply wrap them:
def relu(x, name='relu6'):
    return tf.nn.relu6(x, name)

def batch_norm(x, momentum=0.9, epsilon=1e-5, train=True, name='bn'):
    return tf.layers.batch_normalization(x,
                                         momentum=momentum,
                                         epsilon=epsilon,
                                         scale=True,
                                         training=train,
                                         name=name)
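One caveat with tf.layers.batch_normalization: the moving mean/variance updates are placed in the UPDATE_OPS collection, so during training the train op should depend on them (optimizer and loss below are placeholders for your own):

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)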
Next, we stack res_block residual modules on top of the first convolution layer:
def res_block(input, expansion_ratio, output_dim, stride, is_train, name, bias=False, shortcut=True):
    with tf.name_scope(name), tf.variable_scope(name):
        # pw: 1x1 pointwise convolution expands the channels by expansion_ratio
        bottleneck_dim = round(expansion_ratio * input.get_shape().as_list()[-1])
        net = conv_1x1(input, bottleneck_dim, name='pw', bias=bias)
        net = batch_norm(net, train=is_train, name='pw_bn')
        net = relu(net)
        # dw: 3x3 depthwise convolution (carries the stride)
        net = dwise_conv(net, strides=[1, stride, stride, 1], name='dw', bias=bias)
        net = batch_norm(net, train=is_train, name='dw_bn')
        net = relu(net)
        # pw & linear: 1x1 projection back down, with no activation (linear bottleneck)
        net = conv_1x1(net, output_dim, name='pw_linear', bias=bias)
        net = batch_norm(net, train=is_train, name='pw_linear_bn')
        # element-wise add, only for stride == 1
        if shortcut and stride == 1:
            in_dim = int(input.get_shape().as_list()[-1])
            if in_dim != output_dim:
                ins = conv_1x1(input, output_dim, name='ex_dim')
                net = ins + net
            else:
                net = input + net
        return net
The residual module uses an inverted residual structure.
The MobileNetV2 architecture is built on the inverted residual structure. In a classic residual block, the main branch has three convolutions and the two pointwise convolutions carry the larger channel counts; the inverted residual block is the other way around: the middle convolution (still a depthwise separable one) has the most channels, while the two ends are narrow.
Each residual block is formed by a 1×1 convolution, a 3×3 depthwise convolution, and a final 1×1 convolution with a linear (identity) activation.
The bottleneck width is governed by the expansion ratio (6 here): a 1×1 convolution widens the input from in_channels to expansion ratio × in_channels.
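To make this concrete, trace the shapes through one block (res4_1 above; assuming a 224×224 input, this block sees a 56×56×24 feature map):

56×56×24 → pw (1×1, ×6 expansion) → 56×56×144 → dw (3×3, stride 2) → 28×28×144 → pw_linear (1×1) → 28×28×32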
The depthwise convolution dwise_conv is implemented as follows:
def dwise_conv(input, k_h=3, k_w=3, channel_multiplier=1, strides=[1, 1, 1, 1],
               padding='SAME', stddev=0.02, name='dwise_conv', bias=False):
    with tf.variable_scope(name):
        in_channel = input.get_shape().as_list()[-1]
        w = tf.get_variable('w', [k_h, k_w, in_channel, channel_multiplier],
                            regularizer=tf.contrib.layers.l2_regularizer(weight_decay),
                            initializer=tf.truncated_normal_initializer(stddev=stddev))
        conv = tf.nn.depthwise_conv2d(input, w, strides, padding, rate=None, name=None, data_format=None)
        if bias:
            biases = tf.get_variable('bias', [in_channel * channel_multiplier],
                                     initializer=tf.constant_initializer(0.0))
            conv = tf.nn.bias_add(conv, biases)
        return conv
The kernel here has shape [k_h, k_w, in_channel, channel_multiplier]; with channel_multiplier = 1, each input channel is convolved with its own single 3×3 filter, so the channel count is unchanged.
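A small demo of that behavior (hypothetical shapes, matching the res4_1 trace above):

x = tf.placeholder(tf.float32, [1, 56, 56, 144])
y = dwise_conv(x, strides=[1, 2, 2, 1], name='demo_dw')
print(y.get_shape())  # (1, 28, 28, 144): spatial dims halved, channels untouched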
The 1×1 convolution is defined as follows:
def conv_1x1(input, output_dim, name, bias=False):
    with tf.name_scope(name):
        return conv2d(input, output_dim, 1, 1, 1, 1, stddev=0.02, name=name, bias=bias)
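One helper called in mobilenetv2() above, pwise_block, is never shown in this post. A minimal sketch consistent with the other helpers, assuming it is simply a 1×1 convolution followed by batch normalization and ReLU6:

def pwise_block(input, output_dim, is_train, name, bias=False):
    with tf.name_scope(name), tf.variable_scope(name):
        out = conv_1x1(input, output_dim, name='pwb', bias=bias)
        out = batch_norm(out, train=is_train, name='bn')
        out = relu(out)
        return out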
We also add a shortcut connection to the residual block; this is the tail end of res_block above.
# element-wise add, only for stride == 1
if shortcut and stride == 1:
    in_dim = int(input.get_shape().as_list()[-1])
    if in_dim != output_dim:
        ins = conv_1x1(input, output_dim, name='ex_dim')
        net = ins + net
    else:
        net = input + net
The shortcut connection is enabled only when stride == 1, because the element-wise add requires the input and output feature maps to have the same spatial size; when the channel counts differ, a 1×1 convolution ('ex_dim') projects the input to output_dim first.
These res blocks are then stacked, exactly as listed in mobilenetv2() at the top.
At the very end we apply global average pooling:
def global_avg(x):
    with tf.name_scope('global_avg'):
        # pool over the full spatial extent: the pool size equals the feature map's H x W
        net = tf.layers.average_pooling2d(x, x.get_shape()[1:-1], 1)
        return net
Instead of a fully connected layer, we use a 1×1 convolution to map the channel dimension to the number of classes, and then flatten the result.
def flatten(x):
    # collapse [N, 1, 1, num_classes] into [N, num_classes]
    return tf.contrib.layers.flatten(x)
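Putting the head together (a hypothetical trace, assuming a 224×224 input and 1000 classes): the last res block outputs 7×7×320 and pwise_block lifts it to 7×7×1280, so:

net = tf.placeholder(tf.float32, [1, 7, 7, 1280])
net = global_avg(net)                                      # -> (1, 1, 1, 1280)
logits = flatten(conv_1x1(net, 1000, name='demo_logits'))  # -> (1, 1000)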
Finally, softmax turns the logits into class probabilities:
pred = tf.nn.softmax(logits, name='prob')
And with that, MobileNetV2 is fully assembled.
This style of network construction works well as a template: once the inputs and outputs are pinned down, the model function is easy to drop into an Estimator, which makes swapping networks and later fine-tuning straightforward.
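As an illustration, here is a minimal Estimator model_fn sketch (hypothetical: the loss, optimizer, and learning rate are our own choices, not part of the original post):

def model_fn(features, labels, mode):
    logits, pred = mobilenetv2(features, num_classes=1000,
                               is_train=(mode == tf.estimator.ModeKeys.TRAIN))
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions={'prob': pred})
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    # run the batch-norm update ops alongside the optimizer step, as noted earlier
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_op = tf.train.AdamOptimizer(1e-4).minimize(
            loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)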
This time we built the network with tf.nn; next time we will try building it with slim and tf.layers.