Batch Normalization: Building a Neural Network with Batch Normalization Using tf.layers High-Level Functions




References
Andrew Ng's deeplearning.ai course
Course notes
Udacity course
In "Building a Neural Network Using tf.layers High-Level Functions" we used the tf.layers package to build a convolutional neural network without Batch Normalization, which serves as the baseline for comparison with the model in this section.
In this section we use the tf.layers package to build a convolutional neural network that includes Batch Normalization.
"""
向生成全連接層的'fully_connected'函數中添加Batch Normalization,我們需要以下步驟:
1.在函數聲明中添加'is_training'參數,以確保可以向Batch Normalization層中傳遞信息
2.去除函數中bias偏置屬性和激活函數
3.使用'tf.layers.batch_normalization'來標准化神經層的輸出,注意,將“is_training”傳遞給該層,以確保網絡適時更新數據集均值和方差統計信息。
4.將經過Batch Normalization后的值傳遞到ReLU激活函數中
"""
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True, reshape=False)
def fully_connected(prev_layer, num_units, is_training):
    """
    Create a fully connected layer with num_units neurons, using prev_layer as its input.

    :param prev_layer: Tensor
        Input tensor for this layer
    :param num_units: int
        Number of neurons in this layer
    :param is_training: bool or Tensor
        Whether the network is currently training; tells the Batch Normalization layer
        whether it should update its population mean/variance statistics or use them.
    :returns Tensor
        A new fully connected layer
    """
    # No bias and no activation here: Batch Normalization provides its own shift term,
    # and the activation is applied after normalization.
    layer = tf.layers.dense(prev_layer, num_units, use_bias=False, activation=None)
    layer = tf.layers.batch_normalization(layer, training=is_training)
    layer = tf.nn.relu(layer)
    return layer
"""
向生成卷積層的'conv_layer'函數中添加Batch Normalization,我們需要以下步驟:
1.在函數聲明中添加'is_training'參數,以確保可以向Batch Normalization層中傳遞信息
2.去除conv2d層中bias偏置屬性和激活函數
3.使用'tf.layers.batch_normalization'來標准化卷積層的輸出,注意,將"is_training"傳遞給該層,以確保網絡適時更新數據集均值和方差統計信息。
4.將經過Batch Normalization后的值傳遞到ReLU激活函數中
PS:和'fully_connected'函數比較,你會發現如果你使用tf.layers包函數對全連接層進行BN操作和對卷積層進行BN操作沒有任何的區別,但是如果使用tf.nn包中函數實現BN會發現一些小的變動
"""
"""
我們會運用以下方法來構建神經網絡的卷積層,這個卷積層很基本,我們總是使用3x3內核,ReLU激活函數,
在具有奇數深度的圖層上步長為1x1,在具有偶數深度的圖層上步長為2x2。在這個網絡中,我們並不打算使用池化層。
PS:該版本的函數包括批量標准化操作。
"""
def conv_layer(prev_layer, layer_depth, is_training):
    """
    Create a convolutional layer from the given inputs.

    :param prev_layer: Tensor
        Input tensor for this layer
    :param layer_depth: int
        Depth index of this layer within the network; we use it to set the stride and the
        number of feature maps. This is not a good way to build a real CNN, but it lets us
        create this example with very little code.
    :param is_training: bool or Tensor
        Whether the network is currently training; tells the Batch Normalization layer
        whether it should update its population mean/variance statistics or use them.
    :returns Tensor
        A new convolutional layer
    """
    strides = 2 if layer_depth % 3 == 0 else 1
    conv_layer = tf.layers.conv2d(prev_layer, layer_depth*4, 3, strides, 'same', use_bias=False, activation=None)
    conv_layer = tf.layers.batch_normalization(conv_layer, training=is_training)
    conv_layer = tf.nn.relu(conv_layer)
    return conv_layer
"""
批量標准化仍然是一個新的想法,研究人員仍在發現如何最好地使用它。
一般來說,人們似乎同意刪除層的偏差(因為批處理已經有了縮放和移位的術語),並且在層的非線性激活函數之前添加了批處理規范化。
然而,對於某些網絡來說,使用其他的方法也能得到不錯的結果
為了演示這一點,以下三個版本的conv_layer展示了實現批量標准化的其他方法。
如果您嘗試使用這些函數的任何一個版本,它們都應該仍然運行良好(盡管有些版本可能仍然比其他版本更好)。
"""
# Keep the bias in the convolutional layer (use_bias=True), still adding batch normalization before the ReLU activation.
# def conv_layer(prev_layer, layer_num, is_training):
#     strides = 2 if layer_num % 3 == 0 else 1
#     conv_layer = tf.layers.conv2d(prev_layer, layer_num*4, 3, strides, 'same', use_bias=True, activation=None)
#     conv_layer = tf.layers.batch_normalization(conv_layer, training=is_training)
#     conv_layer = tf.nn.relu(conv_layer)
#     return conv_layer

# Keep the bias in the convolutional layer (use_bias=True), applying the ReLU activation first and batch normalization afterwards.
# def conv_layer(prev_layer, layer_num, is_training):
#     strides = 2 if layer_num % 3 == 0 else 1
#     conv_layer = tf.layers.conv2d(prev_layer, layer_num*4, 3, strides, 'same', use_bias=True, activation=tf.nn.relu)
#     conv_layer = tf.layers.batch_normalization(conv_layer, training=is_training)
#     return conv_layer

# Remove the bias from the convolutional layer (use_bias=False), but apply the ReLU activation first and batch normalization afterwards.
# def conv_layer(prev_layer, layer_num, is_training):
#     strides = 2 if layer_num % 3 == 0 else 1
#     conv_layer = tf.layers.conv2d(prev_layer, layer_num*4, 3, strides, 'same', use_bias=False, activation=tf.nn.relu)
#     conv_layer = tf.layers.batch_normalization(conv_layer, training=is_training)
#     return conv_layer
"""
為了修改訓練函數,我們需要做以下工作:
1.Added is_training, a placeholder to store a boolean value indicating whether or not the network is training.
添加is_training,一個用於存儲布爾值的占位符,該值指示網絡是否正在訓練
2.Passed is_training to the conv_layer and fully_connected functions.
傳遞is_training到conv_layer和fully_connected函數
3.Each time we call run on the session, we added to feed_dict the appropriate value for is_training
每次調用sess.run函數時,我們都添加到feed_dict中is_training的適當值用以表示當前是正在訓練還是預測
4.Moved the creation of train_opt inside a with tf.control_dependencies... statement.
This is necessary to get the normalization layers created with tf.layers.batch_normalization to update their population statistics,
which we need when performing inference.
將train_opt訓練函數放進with tf.control_dependencies... 的函數結構體中
這是我們得到由tf.layers.batch_normalization創建的BN層的值所必須的操作,我們由這個操作來更新訓練數據的統計分布,使在inference前向傳播預測時使用正確的數據分布值
"""
def train(num_batches, batch_size, learning_rate):
    # Build placeholders for the input samples and labels
    inputs = tf.placeholder(tf.float32, [None, 28, 28, 1])
    labels = tf.placeholder(tf.float32, [None, 10])

    # Add placeholder to indicate whether or not we're training the model
    is_training = tf.placeholder(tf.bool)

    # Feed the inputs into a series of convolutional layers
    layer = inputs
    for layer_i in range(1, 20):
        layer = conv_layer(layer, layer_i, is_training)

    # Flatten the output from the convolutional layers
    orig_shape = layer.get_shape().as_list()
    layer = tf.reshape(layer, shape=[-1, orig_shape[1]*orig_shape[2]*orig_shape[3]])

    # Add one fully connected layer with 100 units
    layer = fully_connected(layer, 100, is_training)

    # Create the output layer with 1 node for each class
    logits = tf.layers.dense(layer, 10)

    # Define loss and training operations
    model_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=labels))

    # Tell TensorFlow to update the population statistics while training
    with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS)):
        train_opt = tf.train.AdamOptimizer(learning_rate).minimize(model_loss)

    # Create operations to test accuracy
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # Train and test the network
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for batch_i in range(num_batches):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)

            # Train this batch
            sess.run(train_opt, {inputs: batch_xs, labels: batch_ys, is_training: True})

            # Periodically check the validation or training loss and accuracy
            if batch_i % 100 == 0:
                loss, acc = sess.run([model_loss, accuracy], {inputs: mnist.validation.images,
                                                              labels: mnist.validation.labels,
                                                              is_training: False})
                print('Batch: {:>2}: Validation loss: {:>3.5f}, Validation accuracy: {:>3.5f}'.format(batch_i, loss, acc))
            elif batch_i % 25 == 0:
                loss, acc = sess.run([model_loss, accuracy], {inputs: batch_xs, labels: batch_ys, is_training: False})
                print('Batch: {:>2}: Training loss: {:>3.5f}, Training accuracy: {:>3.5f}'.format(batch_i, loss, acc))

        # At the end, score the final accuracy for both the validation and test sets
        acc = sess.run(accuracy, {inputs: mnist.validation.images,
                                  labels: mnist.validation.labels,
                                  is_training: False})
        print('Final validation accuracy: {:>3.5f}'.format(acc))
        acc = sess.run(accuracy, {inputs: mnist.test.images,
                                  labels: mnist.test.labels,
                                  is_training: False})
        print('Final test accuracy: {:>3.5f}'.format(acc))

        # Score the first 100 test images individually, just to make sure batch normalization really worked
        correct = 0
        for i in range(100):
            correct += sess.run(accuracy, feed_dict={inputs: [mnist.test.images[i]],
                                                     labels: [mnist.test.labels[i]],
                                                     is_training: False})
        print("Accuracy on 100 samples:", correct/100)
num_batches = 800  # Number of training batches
batch_size = 64  # Samples per batch
learning_rate = 0.002  # Learning rate

tf.reset_default_graph()
with tf.Graph().as_default():
    train(num_batches, batch_size, learning_rate)
"""
通過批量標准化,我們現在獲得了出色的性能。
事實上,在僅僅500個批次之后,驗證精度幾乎達到94%。
還要注意輸出的最后一行:100個樣本的精確性。
如果這個值很低,而其他一切看起來都很好,那意味着您沒有正確地實現批量標准化。
具體地說,這意味着你要么在訓練時沒有計算總體均值和方差,要么在推理過程中沒有使用這些值。
"""
# Extracting MNIST_data/train-images-idx3-ubyte.gz
# Extracting MNIST_data/train-labels-idx1-ubyte.gz
# Extracting MNIST_data/t10k-images-idx3-ubyte.gz
# Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
# Batch: 0: Validation loss: 0.69091, Validation accuracy: 0.10700
# Batch: 25: Training loss: 0.57651, Training accuracy: 0.14062
# Batch: 50: Training loss: 0.46147, Training accuracy: 0.09375
# Batch: 75: Training loss: 0.38943, Training accuracy: 0.03125
# Batch: 100: Validation loss: 0.35058, Validation accuracy: 0.11260
# Batch: 125: Training loss: 0.33055, Training accuracy: 0.17188
# Batch: 150: Training loss: 0.32800, Training accuracy: 0.15625
# Batch: 175: Training loss: 0.34861, Training accuracy: 0.18750
# Batch: 200: Validation loss: 0.40572, Validation accuracy: 0.11260
# Batch: 225: Training loss: 0.33194, Training accuracy: 0.23438
# Batch: 250: Training loss: 0.46818, Training accuracy: 0.25000
# Batch: 275: Training loss: 0.38155, Training accuracy: 0.43750
# Batch: 300: Validation loss: 0.25433, Validation accuracy: 0.55320
# Batch: 325: Training loss: 0.17981, Training accuracy: 0.73438
# Batch: 350: Training loss: 0.18110, Training accuracy: 0.76562
# Batch: 375: Training loss: 0.06763, Training accuracy: 0.92188
# Batch: 400: Validation loss: 0.04946, Validation accuracy: 0.92360
# Batch: 425: Training loss: 0.07999, Training accuracy: 0.89062
# Batch: 450: Training loss: 0.04927, Training accuracy: 0.93750
# Batch: 475: Training loss: 0.00216, Training accuracy: 1.00000
# Batch: 500: Validation loss: 0.04071, Validation accuracy: 0.94060
# Batch: 525: Training loss: 0.01940, Training accuracy: 0.98438
# Batch: 550: Training loss: 0.05709, Training accuracy: 0.90625
# Batch: 575: Training loss: 0.04652, Training accuracy: 0.93750
# Batch: 600: Validation loss: 0.05811, Validation accuracy: 0.91580
# Batch: 625: Training loss: 0.01401, Training accuracy: 0.96875
# Batch: 650: Training loss: 0.04626, Training accuracy: 0.93750
# Batch: 675: Training loss: 0.03831, Training accuracy: 0.95312
# Batch: 700: Validation loss: 0.03709, Validation accuracy: 0.94960
# Batch: 725: Training loss: 0.00235, Training accuracy: 1.00000
# Batch: 750: Training loss: 0.02916, Training accuracy: 0.96875
# Batch: 775: Training loss: 0.01792, Training accuracy: 0.98438
# Final validation accuracy: 0.94040
# Final test accuracy: 0.93840
# Accuracy on 100 samples: 0.95