The network structure of a single prediction layer is shown below:
As shown, it is made up of three branches: a "PriorBox" layer plus the conf and loc prediction layers. The num_output of the conf and loc prediction layers is derived from the PriorBox parameters, and is computed as follows:
min_size and max_size each correspond to prior boxes of one scale (as many values as there are, that many scales; the two lists are paired one-to-one); every min_size and every max_size contributes one box of its own (the max_size box has aspect ratio 1 at the scale sqrt(min_size*max_size)), while the aspect_ratio entries are applied per min_size;
the flip parameter adds the reciprocal of each aspect_ratio, so the aspect_ratio boxes are doubled; the per-cell total is therefore A = num_min_size * (1 + 2 * num_aspect_ratio) + num_max_size;
the num_output of the conf and loc layers is then A multiplied by the number of classes (background included) and by 4, respectively (see the sketch below).
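As a quick sanity check of that counting rule, here is a minimal Python sketch (the helper name prediction_channels is made up for illustration, and it assumes the aspect_ratio list contains neither duplicates nor the value 1, which the real PriorBox layer would filter out); it is called with the values from the prototxt example further down:

def prediction_channels(min_sizes, max_sizes, aspect_ratios, flip, num_classes):
    """Return (priors per cell A, loc num_output, conf num_output)."""
    # one box per min_size (aspect ratio 1), plus the aspect_ratio boxes,
    # doubled when flip adds the reciprocal ratios
    a = len(min_sizes) * (1 + len(aspect_ratios) * (2 if flip else 1))
    # plus one extra box per max_size (scale sqrt(min_size * max_size))
    a += len(max_sizes)
    return a, a * 4, a * num_classes

A, loc_out, conf_out = prediction_channels(
    min_sizes=[12.0, 6.0], max_sizes=[30.0, 20.0],
    aspect_ratios=[2, 2.5, 3], flip=True, num_classes=2)
print(A, loc_out, conf_out)   # 16 64 32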
Below are the layer parameters for one scale when two classes need to be predicted;
what is computed above is the number of conf and loc outputs that each grid cell has to predict;
each prediction layer has H*W grid cells, so the total number of loc and conf predictions is the per-cell count multiplied by H*W;
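Continuing the sketch, for a hypothetical 40x40 feature map (H and W are not given here; they depend on the input resolution and the layer's step, so 40x40 is just an assumed example):

H, W = 40, 40
total_boxes = H * W * A         # 1600 * 16 = 25600 prior boxes for this layer
total_loc   = H * W * loc_out   # 1600 * 64 = 102400 predicted loc values
total_conf  = H * W * conf_out  # 1600 * 32 = 51200 predicted conf values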
Below is an example of one such layer (reposted from: http://www.360doc.com/content/17/1013/16/42392246_694639090.shtml).
Note that the num_priorbox value mentioned at the end is not the same as the per-cell count above; there it refers to the total number of output boxes of the whole prediction layer, i.e. the per-cell count multiplied by H*W:
layer { name: "combined_2_EltwisePROD_relu" type: "ReLU" bottom: "combined_2_EltwisePROD" top: "combined_2_EltwisePROD_relu" } ########################################### ################################################################### layer { name: "rescombined_2_EltwisePROD_relu_inter256_mbox_locnew_inter" type: "Convolution" bottom: "combined_2_EltwisePROD_relu" top: "rescombined_2_EltwisePROD_relu_inter256_mbox_locnew_inter" param { lr_mult: 1 decay_mult: 1 } convolution_param { num_output: 128 bias_term: false pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } } } layer { name: "rescombined_2_EltwisePROD_relu_inter256_mbox_locnew_inter_bn" type: "BatchNorm" bottom: "rescombined_2_EltwisePROD_relu_inter256_mbox_locnew_inter" top: "rescombined_2_EltwisePROD_relu_inter256_mbox_locnew_inter" param { lr_mult: 0 decay_mult: 0 } param { lr_mult: 0 decay_mult: 0 } param { lr_mult: 0 decay_mult: 0 } batch_norm_param { moving_average_fraction: 0.999 eps: 0.001 } } layer { name: "rescombined_2_EltwisePROD_relu_inter256_mbox_locnew_inter_scale" type: "Scale" bottom: "rescombined_2_EltwisePROD_relu_inter256_mbox_locnew_inter" top: "rescombined_2_EltwisePROD_relu_inter256_mbox_locnew_inter" param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 1 decay_mult: 0 } scale_param { filler { type: "constant" value: 1.0 } bias_term: true bias_filler { type: "constant" value: 0.0 } } } layer { name: "rescombined_2i_EltwisePROD_relu_inter256_mbox_locnew_inter" type: "Convolution" bottom: "combined_2_EltwisePROD_relu" top: "rescombined_2i_EltwisePROD_relu_inter256_mbox_locnew_inter" param { lr_mult: 1 decay_mult: 1 } convolution_param { num_output: 128 bias_term: false pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "gaussian" std: 0.01 } } } layer { name: "rescombined_2i_EltwisePROD_relu_inter256_mbox_locnew_inter_bn" type: "BatchNorm" bottom: "rescombined_2i_EltwisePROD_relu_inter256_mbox_locnew_inter" top: "rescombined_2i_EltwisePROD_relu_inter256_mbox_locnew_inter" param { lr_mult: 0 decay_mult: 0 } param { lr_mult: 0 decay_mult: 0 } param { lr_mult: 0 decay_mult: 0 } batch_norm_param { moving_average_fraction: 0.999 eps: 0.001 } } layer { name: "rescombined_2i_EltwisePROD_relu_inter256_mbox_locnew_inter_scale" type: "Scale" bottom: "rescombined_2i_EltwisePROD_relu_inter256_mbox_locnew_inter" top: "rescombined_2i_EltwisePROD_relu_inter256_mbox_locnew_inter" param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 1 decay_mult: 0 } scale_param { filler { type: "constant" value: 1.0 } bias_term: true bias_filler { type: "constant" value: 0.0 } } } layer { name: "combined_2_EltwisePROD_relu_mbox_loc" type: "Convolution" bottom: "rescombined_2_EltwisePROD_relu_inter256_mbox_locnew_inter" top: "combined_2_EltwisePROD_relu_mbox_loc" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { engine: CAFFE num_output: 64 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "combined_2_EltwisePROD_relu_mbox_loc_perm" type: "Permute" bottom: "combined_2_EltwisePROD_relu_mbox_loc" top: "combined_2_EltwisePROD_relu_mbox_loc_perm" permute_param { order: 0 order: 2 order: 3 order: 1 } } layer { name: "combined_2_EltwisePROD_relu_mbox_loc_flat" type: "Flatten" bottom: "combined_2_EltwisePROD_relu_mbox_loc_perm" top: "combined_2_EltwisePROD_relu_mbox_loc_flat" flatten_param { axis: 1 } } layer { name: "combined_2_EltwisePROD_relu_mbox_conf_new" type: "Convolution" bottom: 
"rescombined_2i_EltwisePROD_relu_inter256_mbox_locnew_inter" top: "combined_2_EltwisePROD_relu_mbox_conf_new" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { engine: CAFFE num_output: 32 pad: 1 kernel_size: 3 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "combined_2_EltwisePROD_relu_mbox_conf_new_perm" type: "Permute" bottom: "combined_2_EltwisePROD_relu_mbox_conf_new" top: "combined_2_EltwisePROD_relu_mbox_conf_new_perm" permute_param { order: 0 order: 2 order: 3 order: 1 } } layer { name: "combined_2_EltwisePROD_relu_mbox_conf_new_flat" type: "Flatten" bottom: "combined_2_EltwisePROD_relu_mbox_conf_new_perm" top: "combined_2_EltwisePROD_relu_mbox_conf_new_flat" flatten_param { axis: 1 } } layer { name: "combined_2_EltwisePROD_relu_mbox_priorbox" type: "PriorBox" bottom: "combined_2_EltwisePROD_relu" bottom: "data" top: "combined_2_EltwisePROD_relu_mbox_priorbox" prior_box_param { min_size: 12.0 min_size: 6.0 max_size: 30.0 max_size: 20.0 aspect_ratio: 2 aspect_ratio: 2.5 aspect_ratio: 3 flip: true clip: false variance: 0.1 variance: 0.1 variance: 0.2 variance: 0.2 step: 4 offset: 0.5 } }