How should we understand what Normalization does for neural networks (deep learning)?
Link: https://www.zhihu.com/question/326034346/answer/730051338
Source: Zhihu
A quick review of work around normalization, from the newest back to the oldest (BatchNorm):
2019, Weight Standardization (not formally published, but with Alan Yuille behind it)
Weight Standardization 2019
WS (Weight Standardization) builds on the view that BN smooths the loss landscape / smooths the activations. It targets the setting where GN underperforms BN in micro-batch training, and lifts GN up to BN-level accuracy. The stated mechanism: WS reduces the Lipschitz constants of the loss and of the gradients. A sketch follows.
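For illustration, a minimal PyTorch-style sketch of the idea: standardize each convolutional filter over its own weights on every forward pass. `WSConv2d` is a made-up name, not the authors' code.

```python
# Minimal sketch of Weight Standardization (WS), assuming a PyTorch Conv2d
# whose filters are standardized (zero mean, unit variance) at every forward.
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):  # hypothetical name, for illustration only
    def forward(self, x):
        w = self.weight  # (out_channels, in_channels, kH, kW)
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5
        w = (w - mean) / std  # each output filter: zero mean, unit variance
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```

In the paper this layer is used together with GN, so the activations are normalized by GN while the weights are standardized by WS.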
2019, Dynamic Normalization
Differentiable Dynamic Normalization for Learning Deep Representation ICML 2019
Similar to SN, with GN added to the mix.
2019, Switchable Normalization
Differentiable Learning-to-Normalize via Switchable Normalization ICLR 2019
SN selects/learns the appropriate normalizer (IN, LN, and BN) for each layer. It is evaluated on ImageNet, COCO, Cityscapes, ADE20K, and Kinetics, covering image classification, object detection, semantic segmentation, and video classification. A sketch of the core idea follows.
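A minimal sketch of the core computation under simplifying assumptions (training mode only; the running statistics SN uses at inference are omitted): compute IN, LN, and BN statistics and mix them with learned softmax weights. `SwitchNorm2d` is a hypothetical name, not the authors' code.

```python
# Minimal sketch of Switchable Normalization: learn softmax weights that
# mix IN/LN/BN means and variances, then normalize with the mixture.
import torch
import torch.nn as nn

class SwitchNorm2d(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, num_channels, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(1, num_channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.mean_w = nn.Parameter(torch.ones(3))  # mixing weights for means
        self.var_w = nn.Parameter(torch.ones(3))   # mixing weights for variances

    def forward(self, x):  # x: (N, C, H, W)
        stats = []
        for dims in [(2, 3), (1, 2, 3), (0, 2, 3)]:  # IN, LN, BN axes
            stats.append((x.mean(dims, keepdim=True),
                          x.var(dims, keepdim=True, unbiased=False)))
        mw = torch.softmax(self.mean_w, 0)
        vw = torch.softmax(self.var_w, 0)
        mean = sum(w * m for w, (m, _) in zip(mw, stats))
        var = sum(w * v for w, (_, v) in zip(vw, stats))
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return self.gamma * x_hat + self.beta
```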
2019, Iterative Normalization (CVPR)
Iterative Normalization: Beyond Standardization towards Efficient Whitening CVPR 2019
An efficient version of DBN: it approximates the whitening matrix with Newton's iterations instead of an explicit eigendecomposition.
2019, Spatially-Adaptive Normalization (CVPR)
Semantic Image Synthesis with Spatially-Adaptive Normalization CVPR 2019
Used for image generation (semantic image synthesis).
2018, Gradient Normalization (ICML)
GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks ICML 2018
2018, Kalman Normalization
Kalman Normalization: Normalizing Internal Representations Across Network Layers NIPS 2018
2018, Decorrelated Batch Normalization
Decorrelated Batch Normalization CVPR 2018
BN + whitening: instead of standardizing each channel independently, DBN decorrelates the activations with ZCA whitening. A sketch of the whitening step follows.
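A minimal sketch of that whitening step, assuming activations flattened to shape (N, C); the paper's group-wise whitening, running statistics, and the backward pass through the eigendecomposition are not shown.

```python
# Minimal sketch of ZCA whitening as used in Decorrelated Batch Normalization.
import torch

def zca_whiten(x, eps=1e-5):
    # x: (N, C) activations; center, then whiten across the batch.
    xc = x - x.mean(0, keepdim=True)
    cov = xc.t() @ xc / x.size(0)              # (C, C) covariance matrix
    eigvals, eigvecs = torch.linalg.eigh(cov)  # symmetric eigendecomposition
    inv_sqrt = eigvecs @ torch.diag((eigvals + eps).rsqrt()) @ eigvecs.t()
    return xc @ inv_sqrt  # output is decorrelated with unit variance
```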
2018, Spectral Normalization (ICLR)
Spectral Normalization for Generative Adversarial Networks ICLR 2018
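Spectral Normalization stabilizes GAN training by constraining the Lipschitz constant of the discriminator: each weight matrix is divided by an estimate of its largest singular value, obtained cheaply with power iteration. A minimal sketch (PyTorch ships a maintained version as `torch.nn.utils.spectral_norm`):

```python
# Minimal sketch of Spectral Normalization via one step of power iteration.
import torch
import torch.nn.functional as F

def spectral_normalize(w, u, n_iters=1, eps=1e-12):
    # w: (out_features, in_features) weight; u: persistent (out_features,)
    # vector carried across calls so the power iteration keeps converging.
    for _ in range(n_iters):
        v = F.normalize(w.t() @ u, dim=0, eps=eps)
        u = F.normalize(w @ v, dim=0, eps=eps)
    sigma = u @ w @ v          # estimate of the largest singular value
    return w / sigma, u        # spectrally normalized weight, updated u
```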
2018, Group Normalization (ECCV)
Group Normalization ECCV 2018
Used when the batch size is small, as in object detection and semantic segmentation.
GroupNorm sits between LayerNorm and InstanceNorm: with a single group it reduces to LN, and with one group per channel it reduces to IN (see the sketch below).
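A minimal sketch of the GN computation; note that the statistics never involve the batch dimension, which is why GN is insensitive to batch size.

```python
# Minimal sketch of Group Normalization; num_groups=1 gives LayerNorm-style
# statistics, num_groups=C gives InstanceNorm. C must be divisible by groups.
import torch

def group_norm(x, num_groups, eps=1e-5):
    # x: (N, C, H, W)
    n, c, h, w = x.shape
    x = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = x.mean((2, 3, 4), keepdim=True)          # per sample, per group
    var = x.var((2, 3, 4), keepdim=True, unbiased=False)
    x = (x - mean) / torch.sqrt(var + eps)
    return x.reshape(n, c, h, w)  # the per-channel affine is omitted here
```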
2018, Batch-Instance Normalization
Batch-Instance Normalization for Adaptively Style-Invariant Neural Networks NIPS 2018
2018, Instance-Batch Normalization
Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net ECCV 2018
2016, Layer Normalization (not formally published)
Used for RNNs: statistics are computed per sample over the features, so they are independent of the batch size and the sequence length.
2016, Instance Normalization (not formally published, but well validated in practice)
Used for style transfer.
2016, Weight Normalization (NIPS)
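WN reparameterizes each weight vector as w = g * v / ||v||, decoupling its magnitude g from its direction v. A minimal sketch of the reparameterization (PyTorch ships a maintained version as `torch.nn.utils.weight_norm`):

```python
# Minimal sketch of Weight Normalization: w = g * v / ||v||, so the
# magnitude g and the direction v are optimized as separate parameters.
import torch

def weight_norm(v, g):
    # v: (out_features, in_features) direction; g: (out_features, 1) magnitude.
    return g * v / v.norm(dim=1, keepdim=True)
```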
2015, Batch Normalization (ICML)
Used for ConvNets and image classification. A sketch of the training-mode transform follows.
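A minimal sketch of the BN transform in training mode; the running statistics used at inference and their momentum update are omitted.

```python
# Minimal sketch of Batch Normalization (training mode): normalize each
# channel over the batch and spatial dimensions, then scale and shift.
import torch

def batch_norm(x, gamma, beta, eps=1e-5):
    # x: (N, C, H, W); gamma, beta: learnable tensors of shape (1, C, 1, 1).
    mean = x.mean((0, 2, 3), keepdim=True)
    var = x.var((0, 2, 3), keepdim=True, unbiased=False)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    return gamma * x_hat + beta
```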