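Source code: https://github.com/aitorzip/PyTorch-CycleGAN
As shown in the figure, the CycleGAN network consists of two generators, G (X->Y) and F (Y->X), and two discriminators, Dx and Dy.
Generator: the network first downsamples and then upsamples, with a series of residual blocks in between. The number of residual blocks depends on the use case. According to the paper, 6 residual blocks are used for 128x128 inputs and 9 residual blocks for inputs of 256x256 or higher. The source code is as follows: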
import torch.nn as nn

# ResidualBlock is defined elsewhere in the source repository.
class Generator(nn.Module):
    def __init__(self, input_nc, output_nc, n_residual_blocks=9):
        super(Generator, self).__init__()

        # Initial convolution block: reflection padding + 7x7 conv keeps H and W
        model = [
            nn.ReflectionPad2d(3),
            nn.Conv2d(input_nc, 64, 7),
            nn.InstanceNorm2d(64),
            nn.ReLU(inplace=True)
        ]

        # Downsampling: two stride-2 convs, 64 -> 128 -> 256 channels
        in_features = 64
        out_features = in_features*2
        for _ in range(2):
            model += [
                nn.Conv2d(in_features, out_features, 3, stride=2, padding=1),
                nn.InstanceNorm2d(out_features),
                nn.ReLU(inplace=True)
            ]
            in_features = out_features
            out_features = in_features*2

        # Residual blocks
        for _ in range(n_residual_blocks):
            model += [ResidualBlock(in_features)]

        # Upsampling: two stride-2 transposed convs, 256 -> 128 -> 64 channels
        out_features = in_features//2
        for _ in range(2):
            model += [
                nn.ConvTranspose2d(in_features, out_features, 3, stride=2, padding=1, output_padding=1),
                nn.InstanceNorm2d(out_features),
                nn.ReLU(inplace=True)
            ]
            in_features = out_features
            out_features = in_features//2

        # Output layer
        model += [
            nn.ReflectionPad2d(3),
            nn.Conv2d(64, output_nc, 7),
            nn.Tanh()
        ]

        self.model = nn.Sequential(*model)

    def forward(self, x):
        return self.model(x)
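The Generator above relies on a ResidualBlock class that is not shown here. The following is only a sketch of what such a block might look like, assuming two 3x3 convolutions with reflection padding, instance normalization, and a skip connection; the layer layout is an assumption for illustration, so check the repository for the actual definition.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Assumed layout: two padded 3x3 convs plus an identity skip connection."""

    def __init__(self, in_features):
        super().__init__()
        self.conv_block = nn.Sequential(
            nn.ReflectionPad2d(1),                   # pad 1 so the 3x3 conv keeps H and W
            nn.Conv2d(in_features, in_features, 3),
            nn.InstanceNorm2d(in_features),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(in_features, in_features, 3),
            nn.InstanceNorm2d(in_features),
        )

    def forward(self, x):
        # The block learns a residual that is added back onto its input.
        return x + self.conv_block(x)


# After two stride-2 downsampling steps, a 256x256 input becomes 256 channels at 64x64.
block = ResidualBlock(256)
with torch.no_grad():
    x = torch.randn(1, 256, 64, 64)
    y = block(x)
print(y.shape)
```

Because each block maps a tensor to one of the same shape, any number of them (6 or 9 here) can be stacked between the downsampling and upsampling stages.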
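Two layers here deserve attention: nn.ReflectionPad2d and nn.InstanceNorm2d. The former is paired with the 7x7 convolution: it first pads the feature map on all sides by reflection, so that the feature map keeps its size after the convolution. The padded values mirror the feature map about its border, extending outward.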
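To make the reflection behaviour concrete, here is a small sketch on a 3x3 feature map, followed by a check that ReflectionPad2d(3) plus an unpadded 7x7 convolution preserves the spatial size. The tensor sizes are arbitrary values chosen for illustration.

```python
import torch
import torch.nn as nn

# A 1x1x3x3 feature map holding the values 0..8.
x = torch.arange(9, dtype=torch.float32).reshape(1, 1, 3, 3)

# Pad one pixel on every side. The border row/column acts as the mirror axis,
# so the border values themselves are not duplicated.
padded = nn.ReflectionPad2d(1)(x)
print(padded[0, 0])
# tensor([[4., 3., 4., 5., 4.],
#         [1., 0., 1., 2., 1.],
#         [4., 3., 4., 5., 4.],
#         [7., 6., 7., 8., 7.],
#         [4., 3., 4., 5., 4.]])

# ReflectionPad2d(3) adds 6 to H and W; the 7x7 conv then removes 6 again.
feat = torch.randn(1, 3, 32, 32)
block = nn.Sequential(nn.ReflectionPad2d(3), nn.Conv2d(3, 64, 7))
out = block(feat)
print(out.shape)  # torch.Size([1, 64, 32, 32])
```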
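nn.InstanceNorm2d is a normalization method better suited than BatchNorm to image generation and style transfer. BatchNorm computes its statistics per channel across all samples in the batch, whereas InstanceNorm computes them per channel for each sample separately. The argument in parentheses is the number of channels.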
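The difference between the two comes down to which axes the mean and variance are taken over. The sketch below reproduces both layers by hand on a random tensor, assuming the default settings (no affine parameters for InstanceNorm2d, training mode for BatchNorm2d); the tensor shape is arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3, 8, 8)  # (batch, channels, height, width)

# InstanceNorm2d: statistics over (H, W), separately for each sample and channel.
inst = nn.InstanceNorm2d(3)
mean_in = x.mean(dim=(2, 3), keepdim=True)
var_in = x.var(dim=(2, 3), keepdim=True, unbiased=False)
manual_in = (x - mean_in) / torch.sqrt(var_in + inst.eps)
inst_matches = torch.allclose(inst(x), manual_in, atol=1e-4)

# BatchNorm2d (training mode): statistics over (N, H, W), separately for each channel.
bn = nn.BatchNorm2d(3, affine=False).train()
mean_bn = x.mean(dim=(0, 2, 3), keepdim=True)
var_bn = x.var(dim=(0, 2, 3), keepdim=True, unbiased=False)
manual_bn = (x - mean_bn) / torch.sqrt(var_bn + bn.eps)
batch_matches = torch.allclose(bn(x), manual_bn, atol=1e-4)

print(inst_matches, batch_matches)
```

Since InstanceNorm never mixes statistics between samples, its output for one image does not depend on what else is in the batch.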
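Discriminator: the structure is simpler than the generator's. After 5 convolutional layers the channel count is reduced to 1. Average pooling then reduces the spatial size to 1x1, and a final reshape gives an output of shape (batchsize, 1).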
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    def __init__(self, input_nc):
        super(Discriminator, self).__init__()

        # A bunch of convolutions one after another
        model = [
            nn.Conv2d(input_nc, 64, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True)
        ]

        model += [
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.InstanceNorm2d(128),
            nn.LeakyReLU(0.2, inplace=True)
        ]

        model += [
            nn.Conv2d(128, 256, 4, stride=2, padding=1),
            nn.InstanceNorm2d(256),
            nn.LeakyReLU(0.2, inplace=True)
        ]

        model += [
            nn.Conv2d(256, 512, 4, padding=1),
            nn.InstanceNorm2d(512),
            nn.LeakyReLU(0.2, inplace=True)
        ]

        # FCN classification layer
        model += [nn.Conv2d(512, 1, 4, padding=1)]

        self.model = nn.Sequential(*model)

    def forward(self, x):
        x = self.model(x)
        # Average pooling over the full spatial extent, then flatten to (batchsize, 1)
        return F.avg_pool2d(x, x.size()[2:]).view(x.size()[0], -1)
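To see what the average pooling actually collapses, the spatial size after each of the five convolutions can be traced with the standard convolution output-size formula. This is a rough sketch that assumes a 256x256 input; `conv_out` is a helper name introduced here for illustration and is not part of the repository.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) of the five Conv2d layers in Discriminator
layers = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 1), (4, 1, 1)]

size = 256  # assumed input resolution
sizes = []
for kernel, stride, padding in layers:
    size = conv_out(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)  # [128, 64, 32, 31, 30]
```

Under this assumption the last convolution produces a 30x30 single-channel map, and the average pooling reduces that map to one score per image.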