Batch Normalization in PyTorch

Reposted article. Originally published 2018-07-18 21:17. Tag: pytorch

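I have been discussing with friends how the batch normalization layer is actually implemented. What the layer is for is not covered here; there is a post on Zhihu that walks through the paper (link).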

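This post only looks at the concrete computation. Assume that somewhere in the middle of the network, after some convolutions, the output feature map has size 4×3×2×2.

Here 4 is the batch size, 3 is the number of channels, and 2×2 is the height and width of the feature map.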

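The whole computation of the BN layer is shown in the figure below.

In the figure above, the batch size is 4, and each sample in the batch has a feature map of size 3×2×2.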

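The mean and variance are computed over the elements of the same channel across all samples in the batch. In the figure above, for example, taking the last channel from every sample gives 4×4=16 elements,

and the mean and variance of those 16 elements are then computed (the figure only shows the mean, not the variance).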
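The per-channel reduction described above can be sketched as a single reduction over the batch and spatial dimensions. This is only a minimal sketch: the tensor name `x` and the seed are illustrative choices, not taken from the original post.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
x = torch.randn(4, 3, 2, 2)  # (batch, channel, height, width)

# Reduce over batch (0), height (2) and width (3): one value per channel,
# each computed from 4 * 2 * 2 = 16 elements.
channel_mean = x.mean(dim=(0, 2, 3))
channel_var = x.var(dim=(0, 2, 3), unbiased=False)  # biased (population) variance

# The same mean for the last channel, written out by hand.
last_channel = x[:, 2, :, :]  # shape (4, 2, 2)
manual_mean = last_channel.sum() / last_channel.numel()

print(channel_mean.shape)                             # one mean per channel
print(last_channel.numel())                           # 16
print(torch.allclose(channel_mean[2], manual_mean))   # reduction matches the hand-written mean
```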

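Once the mean and variance are known, each of the 16 elements is normalized by subtracting the mean and dividing by the square root of the variance (plus a small epsilon for numerical stability), and the result is then multiplied by gamma and shifted by beta.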
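For reference, the transform can be written out in the standard form used in the PyTorch documentation for `nn.BatchNorm2d`. The index notation below (N samples, H×W spatial positions, channel index c) is chosen here for illustration; ε corresponds to the layer's `eps` argument.

```latex
\mu_c = \frac{1}{N H W} \sum_{n,h,w} x_{n,c,h,w}, \qquad
\sigma_c^2 = \frac{1}{N H W} \sum_{n,h,w} \left( x_{n,c,h,w} - \mu_c \right)^2, \qquad
y_{n,c,h,w} = \gamma_c \, \frac{x_{n,c,h,w} - \mu_c}{\sqrt{\sigma_c^2 + \epsilon}} + \beta_c
```

For the example in this post, N = 4 and H = W = 2, so each sum runs over 16 elements.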

So for a batch normalization layer, the mean and variance are computed over the same channel across all samples in the batch. That is where the "batch" in batch normalization comes from.

As for learnable parameters, a batch normalization layer has exactly two per channel, gamma and beta, so the total is twice the number of channels.
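The parameter count can be checked directly on the layer. In this short sketch, gamma and beta appear under PyTorch's own names `weight` and `bias`; the variable names `n_learnable` and `buffer_names` are illustrative.

```python
from torch import nn

m = nn.BatchNorm2d(3)  # 3 channels

# gamma is stored as `weight`, beta as `bias`; each has one entry per channel.
for name, p in m.named_parameters():
    print(name, tuple(p.shape))

n_learnable = sum(p.numel() for p in m.parameters())
print(n_learnable)  # 2 * 3 = 6

# The running statistics are buffers, not learnable parameters.
buffer_names = [name for name, _ in m.named_buffers()]
print(buffer_names)
```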

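Let's verify this with PyTorch: compute the mean by the method above, compare it with the mean tracked by the batch normalization layer, and see whether they agree.

Here is the code: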

# -*- coding: utf-8 -*-
import torch
from torch import nn

m = nn.BatchNorm2d(3)  # the argument to BatchNorm2d is the number of channels
input = torch.randn(4, 3, 2, 2)
output = m(input)  # a forward pass in training mode updates m.running_mean
# print(output)
# Mean of each channel over all 4 samples and 2x2 positions (16 elements)
a = (input[0, 0, :, :] + input[1, 0, :, :] + input[2, 0, :, :] + input[3, 0, :, :]).sum() / 16
b = (input[0, 1, :, :] + input[1, 1, :, :] + input[2, 1, :, :] + input[3, 1, :, :]).sum() / 16
c = (input[0, 2, :, :] + input[1, 2, :, :] + input[2, 2, :, :] + input[3, 2, :, :]).sum() / 16
print('The mean value of the first channel is %f' % a.item())
print('The mean value of the second channel is %f' % b.item())
print('The mean value of the third channel is %f' % c.item())
# One running mean per channel: index 0, 1 and 2
print('The running mean of the BN layer is %f, %f, %f' % (m.running_mean[0].item(), m.running_mean[1].item(), m.running_mean[2].item()))
print(m)

Here,

m = nn.BatchNorm2d(3)

declares a new batch normalization layer, and

input = torch.randn(4, 3, 2, 2)

simulates a feature map of the size discussed above.

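Output:

Hmm, why don't they match? The layer's running mean looks like the batch mean with the decimal point shifted one place, that is, one tenth of it. This is related to the BN layer's momentum: running_mean starts at 0 and is updated as (1 - momentum) * running_mean + momentum * batch_mean, and momentum defaults to 0.1, so after a single forward pass it holds 0.1 times the batch mean. Let's set momentum to 1 on the batch normalization layer and try again: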

m.momentum=1

Output:

Now the values match.
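The momentum behaviour can also be checked programmatically rather than by eye. The sketch below assumes the running-average update documented for PyTorch's BatchNorm layers, running_mean = (1 - momentum) * running_mean + momentum * batch_mean, with running_mean initialized to zero; the names `bn_default` and `bn_full` are illustrative.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
x = torch.randn(4, 3, 2, 2)
batch_mean = x.mean(dim=(0, 2, 3))

# Default momentum = 0.1 and running_mean starts at zero, so after one
# training-mode forward pass: running_mean = 0.9 * 0 + 0.1 * batch_mean.
bn_default = nn.BatchNorm2d(3)
bn_default(x)
print(torch.allclose(bn_default.running_mean, 0.1 * batch_mean, atol=1e-6))

# With momentum = 1 the running mean is simply replaced by the batch mean.
bn_full = nn.BatchNorm2d(3, momentum=1.0)
bn_full(x)
print(torch.allclose(bn_full.running_mean, batch_mean, atol=1e-6))
```

Passing `momentum=1.0` to the constructor has the same effect here as assigning `m.momentum = 1` before the forward pass.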

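As for the variance and the output values, they are presumably computed in much the same way. I will leave that as an open item for now.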
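One possible way to close that open item is sketched below. It relies on two behaviours described in the PyTorch documentation: in training mode the output is normalized with the biased batch variance, while `running_var` is updated with the unbiased estimate. The variable names are illustrative, and `momentum=1.0` is used only so that the running statistics equal the batch statistics after one pass.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
x = torch.randn(4, 3, 2, 2)

bn = nn.BatchNorm2d(3, momentum=1.0)
out = bn(x).detach()  # the layer is in training mode by default

mean = x.mean(dim=(0, 2, 3), keepdim=True)
var_biased = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
var_unbiased = x.var(dim=(0, 2, 3), unbiased=True)

# Normalize with the biased variance, then apply gamma (weight) and beta (bias).
gamma = bn.weight.detach().view(1, 3, 1, 1)
beta = bn.bias.detach().view(1, 3, 1, 1)
manual_out = gamma * (x - mean) / torch.sqrt(var_biased + bn.eps) + beta

print(torch.allclose(out, manual_out, atol=1e-5))
print(torch.allclose(bn.running_var, var_unbiased, atol=1e-5))
```

At initialization gamma is all ones and beta all zeros, so the output is just the normalized input; the multiplication and shift are kept in the sketch so it stays valid after training changes them.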
