Implementing BN (Batch Normalization) in TensorFlow, in several ways:

Theory (see this expert's write-up): https://blog.csdn.net/hjimce/article/details/50866313

Background:

① tf.nn.moments: this function returns exactly the mean and variance that BN needs.
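To make the role of tf.nn.moments concrete, here is a minimal plain-Python sketch of the same statistics for a 2-D input (batch x features). It is an illustration, not TensorFlow's implementation: the helper name `moments` and the sample data are assumptions, and it reduces only over the batch axis. Like tf.nn.moments, it uses the population variance (divide by N, not N-1).

```python
def moments(batch):
    """Per-feature mean and population variance over the batch axis."""
    n = len(batch)
    num_features = len(batch[0])
    # Mean of each feature column across all samples in the batch
    mean = [sum(row[j] for row in batch) / n for j in range(num_features)]
    # Population variance: average squared deviation from the column mean
    variance = [
        sum((row[j] - mean[j]) ** 2 for row in batch) / n
        for j in range(num_features)
    ]
    return mean, variance


# Two samples, two features (hypothetical data)
batch = [[1.0, 10.0], [3.0, 30.0]]
mean, variance = moments(batch)
print(mean, variance)  # prints [2.0, 20.0] [1.0, 100.0]
```

For a 4-D convolutional input, the same reduction would run over the batch, height and width axes, leaving one mean and one variance per channel.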

Approach 1:

tf.nn.batch_normalization(`x, mean, variance, offset, scale, variance_epsilon, name=None`): using the low-level interface directly

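·x the input tensor
·mean one of the outputs of tf.nn.moments
·variance the other output of tf.nn.moments
·offset a parameter that BN learns (beta)
·scale a parameter that BN learns (gamma)
·variance_epsilon a small constant added during normalization so that the denominator is never 0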
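The way these arguments combine can be sketched for a single value as y = scale * (x - mean) / sqrt(variance + variance_epsilon) + offset. The function below is a plain-Python illustration of that formula, not the TensorFlow op itself; the sample numbers are made up.

```python
import math


def batch_normalization(x, mean, variance, offset, scale, variance_epsilon):
    """Normalize one value, then apply the learned scale (gamma) and offset (beta)."""
    inv_std = 1.0 / math.sqrt(variance + variance_epsilon)
    return scale * (x - mean) * inv_std + offset


# A value one standard deviation above the mean (hypothetical numbers)
y = batch_normalization(3.0, mean=2.0, variance=1.0,
                        offset=0.5, scale=2.0, variance_epsilon=1e-5)

# With zero variance the epsilon keeps the division well defined
y_const = batch_normalization(4.0, mean=4.0, variance=0.0,
                              offset=0.5, scale=2.0, variance_epsilon=1e-5)
print(y, y_const)
```

The second call shows why variance_epsilon exists: a feature that is constant across the batch has variance 0, and without the epsilon the square root in the denominator would be 0.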

Implementation:

import tensorflow as tf

# Batch Normalization built from tf.nn.moments and tf.nn.batch_normalization (TensorFlow 1.x)
def bn_layer(x, is_training, name='BatchNorm', moving_decay=0.9, eps=1e-5):
    # Get the input shape and check that it matches a conv layer (rank 4) or a fully connected layer (rank 2)
    shape = x.shape
    assert len(shape) in [2, 4]

    # One gamma and one beta per entry of the last dimension (per channel / per feature)
    param_shape = shape[-1]
    with tf.variable_scope(name):
        # The only two learnable parameters in BN: y = gamma * x_hat + beta, where x_hat is the normalized input
        gamma = tf.get_variable('gamma', param_shape, initializer=tf.constant_initializer(1))
        beta = tf.get_variable('beta', param_shape, initializer=tf.constant_initializer(0))

        # Mean and variance of the current batch, reduced over every axis except the last
        axes = list(range(len(shape) - 1))
        batch_mean, batch_var = tf.nn.moments(x, axes, name='moments')

        # Track the mean and variance with an exponential moving average
        ema = tf.train.ExponentialMovingAverage(moving_decay)

        def mean_var_with_update():
            # Update the moving averages first, then return the batch statistics
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        # Training: use the batch statistics and update the moving averages.
        # Testing: use the moving averages saved during training.
        mean, var = tf.cond(tf.equal(is_training, True), mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))

        # Finally, apply batch normalization
        return tf.nn.batch_normalization(x, mean, var, beta, gamma, eps)
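The moving-average bookkeeping used above can be sketched in plain Python. This only illustrates the update rule shadow = decay * shadow + (1 - decay) * value; it is not TensorFlow's implementation. The class name `MovingAverage` and the starting value of 0.0 are assumptions made for the example, and any bias correction is left out.

```python
class MovingAverage:
    """Exponential moving average of a stream of scalar observations."""

    def __init__(self, decay, initial=0.0):
        self.decay = decay
        self.value = initial  # the "shadow" value that tracks the stream

    def update(self, observation):
        # Keep `decay` of the old value and blend in (1 - decay) of the new one
        self.value = self.decay * self.value + (1.0 - self.decay) * observation
        return self.value


# Feed the same batch mean repeatedly (hypothetical value 1.0)
ema = MovingAverage(decay=0.9)
history = [ema.update(1.0) for _ in range(50)]
print(history[0], history[1], history[-1])
```

The shadow value starts far from the batch mean and approaches it step by step, which is why the statistics used at test time are only reliable after enough training updates.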

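Approach 2:

tf.contrib.layers.batch_norm: a ready-made batch normalization layer (TensorFlow 1.x only; tf.contrib was removed in TensorFlow 2.x)

In fact, tf.contrib.layers.batch_norm is a wrapper around tf.nn.moments and tf.nn.batch_normalization.

Parameters: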

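1 inputs: the input tensor

2 decay: decay rate of the moving averages. Reasonable values are close to 1.0, typically with several 9s: 0.999, 0.99, 0.9. If the model does well on the training set but poorly on the validation/test set, choose a lower value (0.9 is recommended). For better stability, set zero_debias_moving_mean to True

3 center: if True, the beta offset is added; if False, beta is not used

4 scale: if True, the output is multiplied by gamma; if False, gamma is not used. When the next layer is linear (this also holds for e.g. nn.relu), the scaling can be disabled, because the next layer can do it

5 epsilon: a small constant added to the variance to avoid division by zero

6 activation_fn: the activation function to apply; the default is None, which leaves the output linear

7 param_initializers: optional initializers for beta, gamma, moving mean and moving variance

8 param_regularizers: optional regularizers for beta and gamma

9 updates_collections: the collections that gather the update ops. The update ops must be run together with train_op. If None, control dependencies are added so that the updates are computed in place

10 is_training: whether the layer is in training mode. In training mode it accumulates the batch statistics into moving_mean and moving_variance using an exponential moving average with the given decay. Outside training mode it uses the stored moving_mean and moving_variance

11 scope: optional scope for variable_scope

Note: during training, moving_mean and moving_variance must be updated. By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they must be added as a dependency of train_op. For example: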

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)

You can set updates_collections=None to force the updates to happen in place, but this may cost speed, especially in distributed settings.
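The advice on choosing decay in the parameter list can be made concrete with a rough estimate. After n updates, the initial value of a moving average still carries a weight of decay**n, so the sketch below counts how many updates it takes for that weight to fall to 1%. The helper name `steps_to_track` and the 1% threshold are assumptions chosen for illustration, and the estimate ignores zero-debiasing.

```python
import math


def steps_to_track(decay, tolerance=0.01):
    """Updates needed before the weight of the initial value, decay**n, drops to `tolerance`."""
    return math.ceil(math.log(tolerance) / math.log(decay))


for decay in (0.9, 0.99, 0.999):
    print(decay, steps_to_track(decay))
```

Under these assumptions the counts come out at roughly 44, 459 and 4603 updates. This is one way to see why a lower decay such as 0.9 can help when training is short or validation results lag behind: the moving statistics catch up with the batch statistics much sooner.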

Implementation:

import tensorflow as tf

def batch_norm(x, epsilon=1e-5, momentum=0.9, train=True, name="batch_norm"):
    # updates_collections=None updates moving_mean and moving_variance in place,
    # so no extra dependency on tf.GraphKeys.UPDATE_OPS is needed
    return tf.contrib.layers.batch_norm(x, decay=momentum, updates_collections=None, epsilon=epsilon,
                                        scale=True, is_training=train, scope=name)

Where is BN usually placed?

A BN layer is usually arranged in the order conv -> bn -> scale -> relu, and these together form one block.

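How does BN differ between training and testing?

During training, the BN layer normalizes each batch with the mean and std of the current batch. At test time, it normalizes with the mean and std of the whole training set, which in practice are the moving averages accumulated during training.

So during training the BN layer must be active, and the statistics and parameters it learns must be saved. At test time, load those saved values and use them to normalize the test data.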
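The difference between the two modes can be seen in a toy, single-feature version written in plain Python. This is a sketch under stated assumptions, not how TensorFlow implements the layer: the class name `ToyBatchNorm` is hypothetical, gamma and beta are fixed at 1 and 0 instead of being learned, and the moving statistics are assumed to start at mean 0 and variance 1.

```python
import math


class ToyBatchNorm:
    """Batch norm over one feature; each batch is a list of floats."""

    def __init__(self, decay=0.9, eps=1e-5):
        self.decay, self.eps = decay, eps
        self.gamma, self.beta = 1.0, 0.0               # fixed here, learned in a real layer
        self.moving_mean, self.moving_var = 0.0, 1.0   # assumed starting values

    def __call__(self, batch, is_training):
        if is_training:
            # Training: normalize with the statistics of the current batch ...
            mean = sum(batch) / len(batch)
            var = sum((v - mean) ** 2 for v in batch) / len(batch)
            # ... and fold them into the moving averages for later use
            self.moving_mean = self.decay * self.moving_mean + (1 - self.decay) * mean
            self.moving_var = self.decay * self.moving_var + (1 - self.decay) * var
        else:
            # Testing: normalize with the statistics accumulated during training
            mean, var = self.moving_mean, self.moving_var
        return [self.gamma * (v - mean) / math.sqrt(var + self.eps) + self.beta
                for v in batch]


bn = ToyBatchNorm()
for _ in range(200):                          # "train" on batches with mean 2, variance 1
    bn([1.0, 3.0], is_training=True)

test_out = bn([5.0], is_training=False)       # uses the moving statistics
train_out = bn([5.0], is_training=True)       # uses the statistics of this one-sample batch
print(test_out, train_out)
```

In test mode the sample 5.0 is placed about three standard deviations above the training mean, whereas in training mode a one-sample batch always normalizes to 0. This is why test-time inference has to switch to the saved statistics.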

