[PyTorch] L2 Regularization


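The paper Bag of Tricks for Image Classification with Convolutional Neural Networks [1] notes that applying L2 regularization drives the regularized weights toward 0. For CNNs, L2 regularization (weight decay) is generally applied only to the weights of convolutional and fully connected layers, not to the biases. The parameters of Batch Normalization layers are left unregularized as well.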
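To see why L2 regularization pulls weights toward 0, consider a single step of plain SGD without momentum. This is a simplified sketch: the symbols $L_0$ (unregularized loss), $\lambda$ (weight decay coefficient), and $\eta$ (learning rate) are notation introduced here, not taken from the original post.

```latex
% L2-regularized objective (assumed notation)
L(w) = L_0(w) + \frac{\lambda}{2}\,\lVert w \rVert_2^2

% One plain SGD step on L(w)
w \leftarrow w - \eta\,\bigl(\nabla L_0(w) + \lambda w\bigr)
  = (1 - \eta\lambda)\,w - \eta\,\nabla L_0(w)
```

Each step first shrinks $w$ by the factor $(1 - \eta\lambda)$, hence the name weight decay. With `weight_decay=5e-4` and `lr=0.1`, that factor is $1 - 5 \times 10^{-5}$ per step under this simplified update.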

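In PyTorch, L2 regularization (weight decay) can be restricted to the weights of convolutional and fully connected layers by using per-parameter-group options: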

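import torch

# Assumes `model` is an nn.Module whose BatchNorm layers have "bn" in their parameter names.
# Group 1: conv / fully connected weights, which receive weight decay.
weight_decay_list = [param for name, param in model.named_parameters() if not name.endswith('bias') and 'bn' not in name]
# Group 2: biases and BatchNorm parameters, which do not.
no_decay_list = [param for name, param in model.named_parameters() if name.endswith('bias') or 'bn' in name]
parameters = [{'params': weight_decay_list},
              {'params': no_decay_list, 'weight_decay': 0.}]

# The optimizer-level weight_decay is the default for groups that do not set their own.
optimizer = torch.optim.SGD(parameters, lr=0.1, momentum=0.9, weight_decay=5e-4, nesterov=True)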
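The name-based filter above only works when every BatchNorm layer has "bn" in its parameter name. A layer registered under another name (for example inside an `nn.Sequential`, where names are just indices) would silently receive weight decay. The following is a minimal sketch of an alternative that groups parameters by module type instead. The helper `split_decay_params` and the toy model are hypothetical, introduced only for illustration, and are not part of the original post or of PyTorch.

```python
import torch
import torch.nn as nn


def split_decay_params(model):
    """Hypothetical helper: split parameters into (decay, no_decay) lists."""
    norm_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)
    decay, no_decay, seen = [], [], set()
    for module in model.modules():
        # recurse=False yields only the parameters owned directly by this module.
        for name, param in module.named_parameters(recurse=False):
            if not param.requires_grad or id(param) in seen:
                continue  # skip frozen parameters and shared ones already assigned
            seen.add(id(param))
            if isinstance(module, norm_types) or name == "bias":
                no_decay.append(param)   # biases and BatchNorm weight/bias
            else:
                decay.append(param)      # conv / linear weights
    return decay, no_decay


# Toy model for illustration; the BatchNorm layer has no "bn" in its name here.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 4 * 4, 10),
)

decay, no_decay = split_decay_params(model)
optimizer = torch.optim.SGD(
    [{"params": decay}, {"params": no_decay, "weight_decay": 0.0}],
    lr=0.1, momentum=0.9, weight_decay=5e-4, nesterov=True,
)
```

In this toy model the decay group holds the two conv/linear weight tensors and the no-decay group holds the remaining four tensors. Other normalization layers (such as `nn.LayerNorm` or `nn.GroupNorm`) are not covered by this sketch and would need to be added to `norm_types` if they should be excluded too.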

References

[1] He, T., Zhang, Z., Zhang, H., Zhang, Z., Xie, J., & Li, M. (2019). Bag of Tricks for Image Classification with Convolutional Neural Networks. CVPR 2019. https://dx.doi.org/10.1109/cvpr.2019.00065

