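I. Before reading
1. The paper "Densely Connected Convolutional Networks" describes what is, at the time of writing, one of the best-performing CNN architectures, reported to outperform even ResNet, so it is worth studying why it works so well.
2. Code: https://github.com/liuzhuang13/DenseNet
3. The paper builds mainly on Highway Networks, Residual Networks (ResNets), and GoogLeNet, so it helps to read those papers first. Very Deep Learning with Highway Networks is also worth a look.
4. References: "ResNet && DenseNet (Principles)" and "The DenseNet Model"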
II. Reading notes
Abstract
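Recent work has shown that convolutional networks can be substantially deeper, more accurate, and more efficient to train if they contain shorter connections between layers close to the input and layers close to the output. Building on this observation, the paper proposes the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. A traditional convolutional network with L layers has only L connections, one between each layer and the next. In DenseNet every layer is connected directly not only to its neighbour but also to all subsequent layers, so the network has L(L+1)/2 direct connections. The input of any layer is the feature maps of all preceding layers, and its own feature maps are the input of all subsequent layers. DenseNet has several compelling advantages: 1. it alleviates the vanishing-gradient problem; 2. it strengthens feature propagation; 3. it substantially reduces the number of parameters. The architecture is evaluated on four highly competitive object recognition benchmarks: CIFAR-10, CIFAR-100, SVHN, and ImageNet. On most of them DenseNet obtains large improvements, reaching the best recognition accuracy reported at the time.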
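The L(L+1)/2 count can be checked with a few lines of Python. This is only a sketch of the counting argument, not code from the paper; the function name direct_connections is made up for illustration.

```python
def direct_connections(num_layers: int) -> int:
    """Count direct connections in a densely connected stack of num_layers layers."""
    # Layer l (1-indexed) receives the network input plus the outputs of
    # layers 1..l-1, i.e. l incoming connections.
    return sum(range(1, num_layers + 1))


for L in (5, 40, 100):
    # The sum 1 + 2 + ... + L equals the closed form L(L+1)/2.
    assert direct_connections(L) == L * (L + 1) // 2
    print(f"L={L}: plain chain has {L} connections, dense has {direct_connections(L)}")
```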
1.Introduction
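CNNs are a powerful machine learning approach for visual recognition. Although they were introduced more than 20 years ago, only in recent years have improvements in computer hardware and network structure made it possible to train truly deep CNNs. The original LeNet5 had 5 layers and VGG had 19; only last year did Highway Networks and ResNets pass the 100-layer mark.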
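III. Impressions:
I translated about half of the paper before realizing there was no need: the English original is perfectly readable on its own. The paper is written in plain, accessible language, unlike papers by big names such as Hinton, Bengio, or Yann LeCun, which I read straight through once and come away only half understanding. Reading this one was like eating honey: having finished, I wanted to read it again. Even the final analysis of the experimental results and the Discussion section are very well written, especially the figure in the Discussion. The idea is excellent and simple. Above all, it comes from the kind of source I like best: summarize why many earlier papers worked well, find what they have in common, then strengthen that common factor to get a better result.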
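IV. DenseNet structure:
1. The DenseNet-BC structure used for training on CIFAR-10:
With depth=40, growth_rate=12, bottleneck=True, and reduction=0.5 (reduction = 1 - compression), the number of layers in each dense block is n_layers = ((40-4)/3)//2 = 6, where //2 means dividing by 2 and rounding down.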
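The formula above can be written as a small helper. This is a sketch assuming three dense blocks, as in the CIFAR setup described here; layers_per_block is a hypothetical name, not a function from the DenseNet code.

```python
def layers_per_block(depth: int, bottleneck: bool) -> int:
    """Layers (conv modules) per dense block for a 3-block DenseNet of the given depth."""
    # 4 layers sit outside the dense blocks: 1 initial conv, 2 transitions, 1 classifier.
    if (depth - 4) % 3 != 0:
        raise ValueError("depth must be of the form 3n + 4")
    n = (depth - 4) // 3
    # With bottleneck, each module holds two convolutions (1x1 then 3x3).
    return n // 2 if bottleneck else n


print(layers_per_block(40, bottleneck=True))   # 6
print(layers_per_block(40, bottleneck=False))  # 12
```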
Note: conv denotes a plain 2D convolution; CONV denotes BN-ReLU-conv.
The structure is as follows:
input:(32,32,3)
conv(24,3,3), % conv(24,3,3) = conv(filters=2*growth_rate=24, kernel_size=(3,3))
# Dense block 1
CONV(48,1,1)-CONV(12,3,3)-merge(36)- % CONV(48,1,1) = CONV(filters=inter_channel=4*growth_rate=48, kernel_size=(1,1)); after the merge nb_filter = 24+12 = 36
CONV(48,1,1)-CONV(12,3,3)-merge(48)- % as above; after the merge nb_filter = 36+12 = 48
CONV(48,1,1)-CONV(12,3,3)-merge(60)-
CONV(48,1,1)-CONV(12,3,3)-merge(72)-
CONV(48,1,1)-CONV(12,3,3)-merge(84)-
CONV(48,1,1)-CONV(12,3,3)-merge(96)- % nb_filter grows by growth_rate=12 with every layer; a dense block has 6 layers, so it grows by 72 in total: nb_filter = 24+72 = 96
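# Transition layer 1
CONV(48,1,1) % with compression applied, nb_filter = nb_filter*compression = 96*0.5 = 48; note that the merge counts of dense block 2 and the Keras summary further down continue from 96 channels, i.e. they correspond to a run without compression
AveragePool(2,2,(2,2)) % pool_size=(2,2), strides=(2,2)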
# Dense block 2
CONV(48,1,1)-CONV(12,3,3)-merge(108)- % CONV(48,1,1) = CONV(filters=inter_channel=4*growth_rate=48, kernel_size=(1,1)); after the merge nb_filter = 96+12 = 108
CONV(48,1,1)-CONV(12,3,3)-merge(120)-
CONV(48,1,1)-CONV(12,3,3)-merge(132)-
CONV(48,1,1)-CONV(12,3,3)-merge(144)-
CONV(48,1,1)-CONV(12,3,3)-merge(156)-
CONV(48,1,1)-CONV(12,3,3)-merge(168)- % nb_filter grows by growth_rate=12 with every layer; a dense block has 6 layers, so it grows by 72 in total: nb_filter = 96+72 = 168
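# Transition layer 2
CONV(60,1,1) % with compression applied throughout, dense block 2 would end at 48+72 = 120 channels, giving nb_filter = 120*0.5 = 60; the merge counts listed here and the Keras summary further down instead carry 168 channels through this layer (no compression)
AveragePool(2,2,(2,2)) % pool_size=(2,2), strides=(2,2)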
# Dense block 3
CONV(48,1,1)-CONV(12,3,3)-merge(180)- % CONV(48,1,1) = CONV(filters=inter_channel=4*growth_rate=48, kernel_size=(1,1))
CONV(48,1,1)-CONV(12,3,3)-merge(192)-
CONV(48,1,1)-CONV(12,3,3)-merge(204)-
CONV(48,1,1)-CONV(12,3,3)-merge(216)-
CONV(48,1,1)-CONV(12,3,3)-merge(228)-
CONV(48,1,1)-CONV(12,3,3)-merge(240)- % nb_filter grows by growth_rate=12 with every layer; a dense block has 6 layers, so it grows by 72 in total: nb_filter = 168+72 = 240
ReLU-GlobalAveragePool-softmax
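The channel bookkeeping in the listing can be reproduced with a short sketch. The function trace_channels and its parameters are hypothetical names chosen for illustration. It suggests that the merge counts in the listing (36 to 96, 108 to 168, 180 to 240) correspond to compression=1.0, while the transition-layer values 48 and 60 correspond to compression=0.5.

```python
def trace_channels(growth_rate=12, n_layers=6, n_blocks=3, compression=1.0):
    """Return the channel count after every merge, grouped by dense block."""
    nb_filter = 2 * growth_rate  # output channels of the initial convolution
    trace = []
    for block in range(n_blocks):
        merges = []
        for _ in range(n_layers):
            nb_filter += growth_rate  # each layer contributes growth_rate new feature maps
            merges.append(nb_filter)
        trace.append(merges)
        if block < n_blocks - 1:
            # A transition layer follows every block except the last one.
            nb_filter = int(nb_filter * compression)
    return trace


print(trace_channels(compression=1.0))  # blocks end at 96, 168, 240
print(trace_channels(compression=0.5))  # blocks end at 96, 120, 132
```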
To check the analysis above, the model was built with keras==1.2.0, which printed the following summary:
Model created
____________________________________________________________________________________________________
Layer (type)                      Output Shape         Param #  Connected to
====================================================================================================
input_1 (InputLayer)              (None, 32, 32, 3)    0
initial_conv2D (Convolution2D)    (None, 32, 32, 24)   648      input_1[0][0]
batchnormalization_1 (BatchNorma  (None, 32, 32, 24)   96       initial_conv2D[0][0]
activation_1 (Activation)         (None, 32, 32, 24)   0        batchnormalization_1[0][0]
convolution2d_1 (Convolution2D)   (None, 32, 32, 48)   1152     activation_1[0][0]
batchnormalization_2 (BatchNorma  (None, 32, 32, 48)   192      convolution2d_1[0][0]
activation_2 (Activation)         (None, 32, 32, 48)   0        batchnormalization_2[0][0]
convolution2d_2 (Convolution2D)   (None, 32, 32, 12)   5184     activation_2[0][0]
merge_1 (Merge)                   (None, 32, 32, 36)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
batchnormalization_3 (BatchNorma  (None, 32, 32, 36)   144      merge_1[0][0]
activation_3 (Activation)         (None, 32, 32, 36)   0        batchnormalization_3[0][0]
convolution2d_3 (Convolution2D)   (None, 32, 32, 48)   1728     activation_3[0][0]
batchnormalization_4 (BatchNorma  (None, 32, 32, 48)   192      convolution2d_3[0][0]
activation_4 (Activation)         (None, 32, 32, 48)   0        batchnormalization_4[0][0]
convolution2d_4 (Convolution2D)   (None, 32, 32, 12)   5184     activation_4[0][0]
merge_2 (Merge)                   (None, 32, 32, 48)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
                                                                convolution2d_4[0][0]
batchnormalization_5 (BatchNorma  (None, 32, 32, 48)   192      merge_2[0][0]
activation_5 (Activation)         (None, 32, 32, 48)   0        batchnormalization_5[0][0]
convolution2d_5 (Convolution2D)   (None, 32, 32, 48)   2304     activation_5[0][0]
batchnormalization_6 (BatchNorma  (None, 32, 32, 48)   192      convolution2d_5[0][0]
activation_6 (Activation)         (None, 32, 32, 48)   0        batchnormalization_6[0][0]
convolution2d_6 (Convolution2D)   (None, 32, 32, 12)   5184     activation_6[0][0]
merge_3 (Merge)                   (None, 32, 32, 60)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
                                                                convolution2d_4[0][0]
                                                                convolution2d_6[0][0]
batchnormalization_7 (BatchNorma  (None, 32, 32, 60)   240      merge_3[0][0]
activation_7 (Activation)         (None, 32, 32, 60)   0        batchnormalization_7[0][0]
convolution2d_7 (Convolution2D)   (None, 32, 32, 48)   2880     activation_7[0][0]
batchnormalization_8 (BatchNorma  (None, 32, 32, 48)   192      convolution2d_7[0][0]
activation_8 (Activation)         (None, 32, 32, 48)   0        batchnormalization_8[0][0]
convolution2d_8 (Convolution2D)   (None, 32, 32, 12)   5184     activation_8[0][0]
merge_4 (Merge)                   (None, 32, 32, 72)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
                                                                convolution2d_4[0][0]
                                                                convolution2d_6[0][0]
                                                                convolution2d_8[0][0]
batchnormalization_9 (BatchNorma  (None, 32, 32, 72)   288      merge_4[0][0]
activation_9 (Activation)         (None, 32, 32, 72)   0        batchnormalization_9[0][0]
convolution2d_9 (Convolution2D)   (None, 32, 32, 48)   3456     activation_9[0][0]
batchnormalization_10 (BatchNorm  (None, 32, 32, 48)   192      convolution2d_9[0][0]
activation_10 (Activation)        (None, 32, 32, 48)   0        batchnormalization_10[0][0]
convolution2d_10 (Convolution2D)  (None, 32, 32, 12)   5184     activation_10[0][0]
merge_5 (Merge)                   (None, 32, 32, 84)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
                                                                convolution2d_4[0][0]
                                                                convolution2d_6[0][0]
                                                                convolution2d_8[0][0]
                                                                convolution2d_10[0][0]
batchnormalization_11 (BatchNorm  (None, 32, 32, 84)   336      merge_5[0][0]
activation_11 (Activation)        (None, 32, 32, 84)   0        batchnormalization_11[0][0]
convolution2d_11 (Convolution2D)  (None, 32, 32, 48)   4032     activation_11[0][0]
batchnormalization_12 (BatchNorm  (None, 32, 32, 48)   192      convolution2d_11[0][0]
activation_12 (Activation)        (None, 32, 32, 48)   0        batchnormalization_12[0][0]
convolution2d_12 (Convolution2D)  (None, 32, 32, 12)   5184     activation_12[0][0]
merge_6 (Merge)                   (None, 32, 32, 96)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
                                                                convolution2d_4[0][0]
                                                                convolution2d_6[0][0]
                                                                convolution2d_8[0][0]
                                                                convolution2d_10[0][0]
                                                                convolution2d_12[0][0]
batchnormalization_13 (BatchNorm  (None, 32, 32, 96)   384      merge_6[0][0]
activation_13 (Activation)        (None, 32, 32, 96)   0        batchnormalization_13[0][0]
convolution2d_13 (Convolution2D)  (None, 32, 32, 96)   9216     activation_13[0][0]
averagepooling2d_1 (AveragePooli  (None, 16, 16, 96)   0        convolution2d_13[0][0]
batchnormalization_14 (BatchNorm  (None, 16, 16, 96)   384      averagepooling2d_1[0][0]
activation_14 (Activation)        (None, 16, 16, 96)   0        batchnormalization_14[0][0]
convolution2d_14 (Convolution2D)  (None, 16, 16, 48)   4608     activation_14[0][0]
batchnormalization_15 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_14[0][0]
activation_15 (Activation)        (None, 16, 16, 48)   0        batchnormalization_15[0][0]
convolution2d_15 (Convolution2D)  (None, 16, 16, 12)   5184     activation_15[0][0]
merge_7 (Merge)                   (None, 16, 16, 108)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
batchnormalization_16 (BatchNorm  (None, 16, 16, 108)  432      merge_7[0][0]
activation_16 (Activation)        (None, 16, 16, 108)  0        batchnormalization_16[0][0]
convolution2d_16 (Convolution2D)  (None, 16, 16, 48)   5184     activation_16[0][0]
batchnormalization_17 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_16[0][0]
activation_17 (Activation)        (None, 16, 16, 48)   0        batchnormalization_17[0][0]
convolution2d_17 (Convolution2D)  (None, 16, 16, 12)   5184     activation_17[0][0]
merge_8 (Merge)                   (None, 16, 16, 120)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
                                                                convolution2d_17[0][0]
batchnormalization_18 (BatchNorm  (None, 16, 16, 120)  480      merge_8[0][0]
activation_18 (Activation)        (None, 16, 16, 120)  0        batchnormalization_18[0][0]
convolution2d_18 (Convolution2D)  (None, 16, 16, 48)   5760     activation_18[0][0]
batchnormalization_19 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_18[0][0]
activation_19 (Activation)        (None, 16, 16, 48)   0        batchnormalization_19[0][0]
convolution2d_19 (Convolution2D)  (None, 16, 16, 12)   5184     activation_19[0][0]
merge_9 (Merge)                   (None, 16, 16, 132)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
                                                                convolution2d_17[0][0]
                                                                convolution2d_19[0][0]
batchnormalization_20 (BatchNorm  (None, 16, 16, 132)  528      merge_9[0][0]
activation_20 (Activation)        (None, 16, 16, 132)  0        batchnormalization_20[0][0]
convolution2d_20 (Convolution2D)  (None, 16, 16, 48)   6336     activation_20[0][0]
batchnormalization_21 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_20[0][0]
activation_21 (Activation)        (None, 16, 16, 48)   0        batchnormalization_21[0][0]
convolution2d_21 (Convolution2D)  (None, 16, 16, 12)   5184     activation_21[0][0]
merge_10 (Merge)                  (None, 16, 16, 144)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
                                                                convolution2d_17[0][0]
                                                                convolution2d_19[0][0]
                                                                convolution2d_21[0][0]
batchnormalization_22 (BatchNorm  (None, 16, 16, 144)  576      merge_10[0][0]
activation_22 (Activation)        (None, 16, 16, 144)  0        batchnormalization_22[0][0]
convolution2d_22 (Convolution2D)  (None, 16, 16, 48)   6912     activation_22[0][0]
batchnormalization_23 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_22[0][0]
activation_23 (Activation)        (None, 16, 16, 48)   0        batchnormalization_23[0][0]
convolution2d_23 (Convolution2D)  (None, 16, 16, 12)   5184     activation_23[0][0]
merge_11 (Merge)                  (None, 16, 16, 156)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
                                                                convolution2d_17[0][0]
                                                                convolution2d_19[0][0]
                                                                convolution2d_21[0][0]
                                                                convolution2d_23[0][0]
batchnormalization_24 (BatchNorm  (None, 16, 16, 156)  624      merge_11[0][0]
activation_24 (Activation)        (None, 16, 16, 156)  0        batchnormalization_24[0][0]
convolution2d_24 (Convolution2D)  (None, 16, 16, 48)   7488     activation_24[0][0]
batchnormalization_25 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_24[0][0]
activation_25 (Activation)        (None, 16, 16, 48)   0        batchnormalization_25[0][0]
convolution2d_25 (Convolution2D)  (None, 16, 16, 12)   5184     activation_25[0][0]
merge_12 (Merge)                  (None, 16, 16, 168)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
                                                                convolution2d_17[0][0]
                                                                convolution2d_19[0][0]
                                                                convolution2d_21[0][0]
                                                                convolution2d_23[0][0]
                                                                convolution2d_25[0][0]
batchnormalization_26 (BatchNorm  (None, 16, 16, 168)  672      merge_12[0][0]
activation_26 (Activation)        (None, 16, 16, 168)  0        batchnormalization_26[0][0]
convolution2d_26 (Convolution2D)  (None, 16, 16, 168)  28224    activation_26[0][0]
averagepooling2d_2 (AveragePooli  (None, 8, 8, 168)    0        convolution2d_26[0][0]
batchnormalization_27 (BatchNorm  (None, 8, 8, 168)    672      averagepooling2d_2[0][0]
activation_27 (Activation)        (None, 8, 8, 168)    0        batchnormalization_27[0][0]
convolution2d_27 (Convolution2D)  (None, 8, 8, 48)     8064     activation_27[0][0]
batchnormalization_28 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_27[0][0]
activation_28 (Activation)        (None, 8, 8, 48)     0        batchnormalization_28[0][0]
convolution2d_28 (Convolution2D)  (None, 8, 8, 12)     5184     activation_28[0][0]
merge_13 (Merge)                  (None, 8, 8, 180)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
batchnormalization_29 (BatchNorm  (None, 8, 8, 180)    720      merge_13[0][0]
activation_29 (Activation)        (None, 8, 8, 180)    0        batchnormalization_29[0][0]
convolution2d_29 (Convolution2D)  (None, 8, 8, 48)     8640     activation_29[0][0]
batchnormalization_30 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_29[0][0]
activation_30 (Activation)        (None, 8, 8, 48)     0        batchnormalization_30[0][0]
convolution2d_30 (Convolution2D)  (None, 8, 8, 12)     5184     activation_30[0][0]
merge_14 (Merge)                  (None, 8, 8, 192)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
                                                                convolution2d_30[0][0]
batchnormalization_31 (BatchNorm  (None, 8, 8, 192)    768      merge_14[0][0]
activation_31 (Activation)        (None, 8, 8, 192)    0        batchnormalization_31[0][0]
convolution2d_31 (Convolution2D)  (None, 8, 8, 48)     9216     activation_31[0][0]
batchnormalization_32 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_31[0][0]
activation_32 (Activation)        (None, 8, 8, 48)     0        batchnormalization_32[0][0]
convolution2d_32 (Convolution2D)  (None, 8, 8, 12)     5184     activation_32[0][0]
merge_15 (Merge)                  (None, 8, 8, 204)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
                                                                convolution2d_30[0][0]
                                                                convolution2d_32[0][0]
batchnormalization_33 (BatchNorm  (None, 8, 8, 204)    816      merge_15[0][0]
activation_33 (Activation)        (None, 8, 8, 204)    0        batchnormalization_33[0][0]
convolution2d_33 (Convolution2D)  (None, 8, 8, 48)     9792     activation_33[0][0]
batchnormalization_34 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_33[0][0]
activation_34 (Activation)        (None, 8, 8, 48)     0        batchnormalization_34[0][0]
convolution2d_34 (Convolution2D)  (None, 8, 8, 12)     5184     activation_34[0][0]
merge_16 (Merge)                  (None, 8, 8, 216)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
                                                                convolution2d_30[0][0]
                                                                convolution2d_32[0][0]
                                                                convolution2d_34[0][0]
batchnormalization_35 (BatchNorm  (None, 8, 8, 216)    864      merge_16[0][0]
activation_35 (Activation)        (None, 8, 8, 216)    0        batchnormalization_35[0][0]
convolution2d_35 (Convolution2D)  (None, 8, 8, 48)     10368    activation_35[0][0]
batchnormalization_36 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_35[0][0]
activation_36 (Activation)        (None, 8, 8, 48)     0        batchnormalization_36[0][0]
convolution2d_36 (Convolution2D)  (None, 8, 8, 12)     5184     activation_36[0][0]
merge_17 (Merge)                  (None, 8, 8, 228)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
                                                                convolution2d_30[0][0]
                                                                convolution2d_32[0][0]
                                                                convolution2d_34[0][0]
                                                                convolution2d_36[0][0]
batchnormalization_37 (BatchNorm  (None, 8, 8, 228)    912      merge_17[0][0]
activation_37 (Activation)        (None, 8, 8, 228)    0        batchnormalization_37[0][0]
convolution2d_37 (Convolution2D)  (None, 8, 8, 48)     10944    activation_37[0][0]
batchnormalization_38 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_37[0][0]
activation_38 (Activation)        (None, 8, 8, 48)     0        batchnormalization_38[0][0]
convolution2d_38 (Convolution2D)  (None, 8, 8, 12)     5184     activation_38[0][0]
merge_18 (Merge)                  (None, 8, 8, 240)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
                                                                convolution2d_30[0][0]
                                                                convolution2d_32[0][0]
                                                                convolution2d_34[0][0]
                                                                convolution2d_36[0][0]
                                                                convolution2d_38[0][0]
batchnormalization_39 (BatchNorm  (None, 8, 8, 240)    960      merge_18[0][0]
activation_39 (Activation)        (None, 8, 8, 240)    0        batchnormalization_39[0][0]
globalaveragepooling2d_1 (Global  (None, 240)          0        activation_39[0][0]
dense_1 (Dense)                   (None, 10)           2410     globalaveragepooling2d_1[0][0]
====================================================================================================
Total params: 257,218
Trainable params: 249,946
Non-trainable params: 7,272
____________________________________________________________________________________________________
Finished compiling
Building model...
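The parameter counts in this summary can be re-derived by hand. The sketch below is not the author's code; it rests on assumptions inferred from the numbers above: convolutions carry no bias (648 = 3*3*3*24, 5184 = 3*3*48*12), each BatchNormalization layer has 4 parameters per channel (96 = 4*24) of which 2 are non-trainable, and the transition layers keep their channel count (compression=1.0, e.g. 9216 = 96*96). The function name densenet_param_count is hypothetical.

```python
def densenet_param_count(depth=40, growth_rate=12, compression=1.0, num_classes=10):
    """Return (total, trainable, non_trainable) parameters for a 3-block bottleneck DenseNet."""
    n_layers = ((depth - 4) // 3) // 2
    inter = 4 * growth_rate            # width of the 1x1 bottleneck convolution
    c = 2 * growth_rate                # channels after the initial convolution
    total = 3 * 3 * 3 * c              # initial 3x3 conv on an RGB input, no bias (assumed)
    bn_channels = 0                    # channels passing through BatchNormalization layers
    for block in range(3):
        for _ in range(n_layers):
            total += 4 * c + c * inter                        # BN + 1x1 conv
            total += 4 * inter + 3 * 3 * inter * growth_rate  # BN + 3x3 conv
            bn_channels += c + inter
            c += growth_rate                                  # concatenation adds growth_rate maps
        if block < 2:
            out = int(c * compression)
            total += 4 * c + c * out                          # transition: BN + 1x1 conv
            bn_channels += c
            c = out
    total += 4 * c                                            # final BN before global pooling
    bn_channels += c
    total += c * num_classes + num_classes                    # dense layer with bias
    non_trainable = 2 * bn_channels                           # moving mean and variance
    return total, total - non_trainable, non_trainable


print(densenet_param_count())  # (257218, 249946, 7272)
```

Under these assumptions the result agrees with the totals printed by Keras, which supports reading this particular run as one without compression in the transition layers.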
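V. Questions:
1. After running the Keras experiment I noticed a Merge after every CONV(48,1,1)-CONV(12,3,3)- pair, yet I had not seen one anywhere in the code. I must have overlooked something, so where does it come from?
Answer: I had missed the following lines in the definition of dense_block: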
    for i in range(nb_layers):
        x = conv_block(x, growth_rate, bottleneck, dropout_rate, weight_decay)
        feature_list.append(x)
        # concatenate every feature map collected so far along the channel axis
        x = merge(feature_list, mode='concat', concat_axis=concat_axis)
        nb_filter += growth_rate
In other words, a Merge follows every such module: the outputs of all layers so far are concatenated into a new tensor.
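The effect of that loop on the "Connected to" column of the summary can be mimicked without Keras. This is a minimal sketch that tracks layer names only; merge_inputs and its arguments are hypothetical, and the naming pattern (two convolutions per bottleneck module, the second being the module output) is read off the summary above.

```python
def merge_inputs(block_input, first_conv_index, n_layers=6):
    """List the inputs of each Merge layer inside one dense block."""
    feature_list = [block_input]       # the block input is the first entry
    merges = []
    for i in range(n_layers):
        # Each bottleneck module creates two convolutions (1x1 then 3x3);
        # the 3x3 one is the module output that gets appended.
        feature_list.append(f"convolution2d_{first_conv_index + 2 * i + 1}")
        merges.append(list(feature_list))  # snapshot: this Merge sees everything so far
    return merges


# Dense block 1 starts from initial_conv2D, and its first conv is convolution2d_1.
for inputs in merge_inputs("initial_conv2D", 1)[:3]:
    print(inputs)
```

The third printed list matches the inputs of merge_3 in the summary: initial_conv2D, convolution2d_2, convolution2d_4, and convolution2d_6.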
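2. Why is the number of layers in each dense block n_layers = ((40-4)/3)//2 = 6, where //2 means dividing by 2 and rounding down? In other words, why subtract 4?
Answer: Besides the many layers inside the dense blocks, the network has 1 initial convolution layer, 2 transition layers, and 1 final classification layer. The remaining 40-4 = 36 layers are split evenly across the 3 dense blocks (12 each), and every bottleneck module contributes 2 convolutions, hence 12//2 = 6 modules per block. Note that the depth L used in the paper does not include the input layer.
So depth (L) in this paper is counted as follows:
a. the initial convolution conv counts as 1 layer;
b. each transition layer counts as 1 layer;
c. each CONV(48,1,1)-CONV(12,3,3) module in a dense block counts as 2 layers, i.e. every CONV counts as 1 layer;
d. the final output module ReLU-GlobalAveragePool-softmax counts as 1 layer.
Put another way: the depth is the number of convolution layers plus 1 softmax layer.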
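Rules a to d can be tallied directly. The sketch below uses a hypothetical helper, count_depth, and is only a restatement of the counting convention described above.

```python
def count_depth(n_layers_per_block=6, n_blocks=3, bottleneck=True):
    """Tally the depth following rules a-d."""
    initial_conv = 1                           # rule a
    transitions = n_blocks - 1                 # rule b: one transition between consecutive blocks
    convs_per_module = 2 if bottleneck else 1  # rule c: every CONV counts as one layer
    dense_convs = n_blocks * n_layers_per_block * convs_per_module
    classifier = 1                             # rule d
    return initial_conv + transitions + dense_convs + classifier


print(count_depth())  # 1 + 2 + 36 + 1 = 40
```

This agrees with the Keras summary, which lists initial_conv2D, convolution2d_1 through convolution2d_38, and dense_1: 39 convolutions plus 1 classifier layer.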
