DenseNet is both similar to and different from the residual network (ResNet). The key difference lies in the cross-layer connection: ResNet uses addition, while DenseNet uses concatenation. As a result, in DenseNet the output of a module A is directly connected to every layer that follows a later module B. This is why the architecture is called "densely connected".
The main building blocks of DenseNet are the dense block and the transition layer. The former defines how inputs and outputs are concatenated, while the latter controls the number of channels so that it does not grow too large.
1. Dense Block
DenseNet uses the "batch normalization, activation, and convolution" structure from the improved version of ResNet.
import time
import torch
from torch import nn, optim
import torch.nn.functional as F
import sys
sys.path.append("..")
import d2lzh_pytorch as d2l

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

def conv_block(in_channels, out_channels):
    # BN -> ReLU -> 3x3 convolution; padding=1 preserves height and width
    blk = nn.Sequential(nn.BatchNorm2d(in_channels),
                        nn.ReLU(),
                        nn.Conv2d(in_channels, out_channels,
                                  kernel_size=3, padding=1))
    return blk
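Because a 3×3 convolution with padding=1 preserves height and width, conv_block changes only the channel count. A minimal shape check with an arbitrary input tensor:

blk = conv_block(3, 10)
X = torch.rand(4, 3, 8, 8)  # (batch, channels, height, width), arbitrary sizes
print(blk(X).shape)  # torch.Size([4, 10, 8, 8]): spatial size preserved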
A dense block consists of multiple conv_block units, each with the same number of output channels. In the forward computation, however, the input and output of each block are concatenated along the channel dimension.
class DenseBlock(nn.Module):
    def __init__(self, num_convs, in_channels, out_channels):
        super(DenseBlock, self).__init__()
        net = []
        for i in range(num_convs):
            # each block sees the original input plus all earlier outputs
            in_c = in_channels + i * out_channels
            net.append(conv_block(in_c, out_channels))
        self.net = nn.ModuleList(net)
        # total number of output channels
        self.out_channels = in_channels + num_convs * out_channels

    def forward(self, X):
        for blk in self.net:
            Y = blk(X)
            X = torch.cat((X, Y), dim=1)  # concatenate input and output on the channel dimension
        return X
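For instance, a dense block with 2 convolution blocks of 10 output channels, applied to a 3-channel input, produces 3 + 2 × 10 = 23 output channels (a small shape check with an arbitrary input tensor):

blk = DenseBlock(2, 3, 10)
X = torch.rand(4, 3, 8, 8)  # (batch, channels, height, width)
Y = blk(X)
print(Y.shape)  # torch.Size([4, 23, 8, 8]): 3 + 2 * 10 = 23 channels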
2. Transition Layer
Since each dense block increases the number of channels, stacking too many of them produces an overly complex model. A transition layer is used to control model complexity: it reduces the number of channels with a 1×1 convolutional layer and halves the height and width with an average pooling layer of stride 2, further reducing complexity.
def transition_block(in_channels, out_channels):
    blk = nn.Sequential(nn.BatchNorm2d(in_channels),
                        nn.ReLU(),
                        # 1x1 convolution shrinks the channel count
                        nn.Conv2d(in_channels, out_channels, kernel_size=1),
                        # stride-2 average pooling halves height and width
                        nn.AvgPool2d(kernel_size=2, stride=2))
    return blk
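Continuing the shape check from the dense block example above, a transition layer reduces the 23 channels to 10 and halves the 8×8 spatial size:

blk = transition_block(23, 10)
print(blk(Y).shape)  # torch.Size([4, 10, 4, 4])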
3. Building the Model
DenseNet first uses the same single convolutional layer and max pooling layer as ResNet.
net = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3),  # single-channel input
    nn.BatchNorm2d(64),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1))
Reference: https://zh.d2l.ai/chapter_convolutional-neural-networks/densenet.html