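In the previous post, the data was converted to VOC2007 format, so it can now be used to train Faster-RCNN.
1. Data-specific modifications
Edit datasets\VOCdevkit2007\VOCcode\VOCinit.m. I only have two classes: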
VOCopts.classes={... 'dog' 'flower'};
Edit functions\fast_rcnn\fast_rcnn_train.m: val_iters must not exceed the number of validation images (I only have a few dozen).
ip.addParamValue('val_iters', 20, @isscalar);
Edit functions\rpn\proposal_train.m in the same way.
ip.addParamValue('val_iters', 20, @isscalar);
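Because val_iters must not exceed the number of validation images, it can help to count them before editing the two files. The following is a minimal sketch in Python, not part of the Faster-RCNN code. It assumes the validation list is the usual VOC image-set file with one image id per line (ImageSets/Main/val.txt), and the helper names are hypothetical.

```python
from pathlib import Path


def count_val_images(val_list: Path) -> int:
    """Count non-empty lines (one image id per line) in a VOC image-set file."""
    with val_list.open(encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def safe_val_iters(requested: int, val_list: Path) -> int:
    """Clamp the requested val_iters to the number of validation images."""
    return min(requested, count_val_images(val_list))


if __name__ == "__main__":
    # Assumed location; adjust to the local dataset path.
    path = Path("datasets/VOCdevkit2007/VOC2007/ImageSets/Main/val.txt")
    if path.exists():
        print(safe_val_iters(100, path))
```

The value printed is an upper bound for the 'val_iters' default in both files.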
Edit train_val.prototxt and test.prototxt in both subfolders of models\fast_rcnn_prototxts, adjusting the values below according to the number of classes K (4 files, 12 changes in total).
input: "bbox_targets"
input_dim: 1  # to be changed on-the-fly to match num ROIs
input_dim: 12 # 4 * (K+1), i.e. 12 for K = 2 classes
input_dim: 1
input_dim: 1

input: "bbox_loss_weights"
input_dim: 1  # to be changed on-the-fly to match num ROIs
input_dim: 12 # 4 * (K+1), i.e. 12 for K = 2 classes
input_dim: 1
input_dim: 1
type: "InnerProduct"
inner_product_param {
	num_output: 3 # K+1
	weight_filler {
		type: "gaussian"
		std: 0.01
	}
	bias_filler {
		type: "constant"
		value: 0
	}
}
layer {
	bottom: "fc7"
	top: "cls_score"
	name: "cls_score"
	param {
		lr_mult: 1.0
	}
	param {
		lr_mult: 2.0
	}
	type: "InnerProduct"
	inner_product_param {
		num_output: 3 # K+1
		weight_filler {
			type: "gaussian"
			std: 0.01
		}
		bias_filler {
			type: "constant"
			value: 0
		}
	}
}

layer {
	bottom: "fc7"
	top: "bbox_pred"
	name: "bbox_pred"
	type: "InnerProduct"
	param {
		lr_mult: 1.0
	}
	param {
		lr_mult: 2.0
	}
	inner_product_param {
		num_output: 12 # 4 * (K+1)
		weight_filler {
			type: "gaussian"
			std: 0.001
		}
		bias_filler {
			type: "constant"
			value: 0
		}
	}
}
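The sizes in these prototxt files all follow from the class count K: the classifier needs K+1 outputs (the classes plus background) and the box regressor needs 4 * (K+1). A small Python sketch of that arithmetic, with a hypothetical function name, may help when adapting the files to a different K:

```python
def detection_head_sizes(num_classes: int) -> dict:
    """Return the prototxt sizes implied by K foreground classes."""
    k_plus_1 = num_classes + 1  # K classes plus one background class
    return {
        "cls_score_num_output": k_plus_1,
        "bbox_pred_num_output": 4 * k_plus_1,   # 4 box coordinates per class
        "bbox_targets_input_dim": 4 * k_plus_1,
        "bbox_loss_weights_input_dim": 4 * k_plus_1,
    }


if __name__ == "__main__":
    # K = 2 (dog, flower) in this post.
    for name, value in detection_head_sizes(2).items():
        print(name, value)
```

For K = 2 this gives 3 and 12, matching the values edited above; for the original 20 VOC classes it gives 21 and 84.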
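In experiments\+Model\ZF_for_Faster_RCNN_VOC2007.m, change the three solver entries to solver_30k40k.prototxt; the default 60k80k takes too long.
2. Modifications for hardware capability
My graphics card is a GTX750 with 2 GB of video memory. Even though the dataset is small, the default settings produced an out-of-memory error.
Edit functions\fast_rcnn\fast_rcnn_config.m. A trailing % comment gives the default value of a setting I changed.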
%% training
% whether use gpu
ip.addParamValue('use_gpu', gpuDeviceCount > 0, ...
    @islogical);
% Image scales -- the short edge of input image
ip.addParamValue('scales', 60, @ismatrix); % default: 600
% Max pixel size of a scaled input image
ip.addParamValue('max_size', 1000, @isscalar);
% Images per batch
ip.addParamValue('ims_per_batch', 2, @isscalar);
% Minibatch size
ip.addParamValue('batch_size', 32, @isscalar); % default: 128
% Fraction of minibatch that is foreground labeled (class > 0)
ip.addParamValue('fg_fraction', 0.25, @isscalar);
% Overlap threshold for a ROI to be considered foreground (if >= fg_thresh)
ip.addParamValue('fg_thresh', 0.5, @isscalar);
% Overlap threshold for a ROI to be considered background (class = 0 if
% overlap in [bg_thresh_lo, bg_thresh_hi))
ip.addParamValue('bg_thresh_hi', 0.5, @isscalar);
ip.addParamValue('bg_thresh_lo', 0.1, @isscalar);
% mean image, in RGB order
ip.addParamValue('image_means', 128, @ismatrix);
% Use horizontally-flipped images during training?
ip.addParamValue('use_flipped', true, @islogical);
% Valid training sample (IoU > bbox_thresh) for bounding box regression
ip.addParamValue('bbox_thresh', 0.5, @isscalar);
% random seed
ip.addParamValue('rng_seed', 6, @isscalar);

%% testing
ip.addParamValue('test_scales', 60, @isscalar); % default: 600
ip.addParamValue('test_max_size', 1000, @isscalar);
ip.addParamValue('test_nms', 0.3, @isscalar);
ip.addParamValue('test_binary', false, @islogical);
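To see why lowering scales reduces memory use, it helps to compute the size an image is resized to. The Python sketch below assumes the usual Fast R-CNN style rule (scale the short edge to the target, then cap the long edge at max_size). The repository's own resizing code is not shown in this post, so treat this as an illustration of the idea rather than its exact implementation; the function name is hypothetical.

```python
def scaled_size(height: int, width: int, scale: int, max_size: int) -> tuple:
    """Return (height, width) after resizing under the assumed rule."""
    short_edge = min(height, width)
    long_edge = max(height, width)
    factor = scale / short_edge              # bring the short edge to `scale`
    if round(factor * long_edge) > max_size:
        factor = max_size / long_edge        # cap the long edge at `max_size`
    return round(height * factor), round(width * factor)


if __name__ == "__main__":
    # A typical 375 x 500 VOC image with the default and the reduced setting.
    print(scaled_size(375, 500, scale=600, max_size=1000))
    print(scaled_size(375, 500, scale=60, max_size=1000))
```

Under this rule, shrinking the short edge by a factor of 10 shrinks the pixel count, and with it the activation memory of the convolutional layers, by roughly a factor of 100.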
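3. Start training
Before training, delete or back up output and imdb\cache, then run experiments/script_faster_rcnn_VOC2007_ZF.m to start training.
On my graphics card, training finished after four hours.
Below is the result of rerunning without deleting output (which is fast, since the cached results are reused).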
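Backing up rather than deleting keeps earlier results available for comparison. A minimal Python sketch of the backup step, assuming it is run from the repository root; the function name and the timestamped ".bak-" suffix are hypothetical choices, not something the Faster-RCNN code expects:

```python
import shutil
import time
from pathlib import Path


def backup_dirs(root: Path, names=("output", "imdb/cache")) -> list:
    """Rename each existing directory to <name>.bak-<timestamp>; return the new paths."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    moved = []
    for name in names:
        src = root / name
        if src.exists():
            dst = src.with_name(f"{src.name}.bak-{stamp}")
            shutil.move(str(src), str(dst))
            moved.append(dst)
    return moved


if __name__ == "__main__":
    for path in backup_dirs(Path(".")):
        print("backed up to", path)
```

Directories that do not exist are skipped, so the script is safe to run before a first training session.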
*************** stage one proposal ***************
aver_boxes_num = 1090, select top 2000
aver_boxes_num = 1091, select top 2000
*************** stage one fast rcnn ***************
!!! dog : 0.8969 0.9418
!!! flower : 0.9006 0.9458
~~~~~~~~~~~~~~~~~~~~
Results:
89.6920
90.0606
89.8763
~~~~~~~~~~~~~~~~~~~~
*************** stage two proposal ***************
aver_boxes_num = 1263, select top 2000
aver_boxes_num = 1271, select top 2000
*************** stage two fast rcnn ***************
*************** final test ***************
aver_boxes_num = 233, select top 300
!!! dog : 0.8893 0.9449
!!! flower : 0.8990 0.9445
~~~~~~~~~~~~~~~~~~~~
Results:
88.9304
89.9025
89.4165
~~~~~~~~~~~~~~~~~~~~
Cleared 0 solvers and 2 stand-alone nets
please modify detection_test.prototxt file for sharing conv layers with proposal model (delete layers until relu5)
>>
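In each Results block, the last number appears to be the mean of the per-class values before it (dog and flower here). This is an inference from the numbers themselves, not something the log states. A short Python check of that reading, with a hypothetical helper name:

```python
def last_value_is_mean(numbers: str, tol: float = 1e-3) -> bool:
    """Check whether the last number equals the mean of the preceding ones."""
    values = [float(token) for token in numbers.split()]
    *per_class, reported_mean = values
    return abs(sum(per_class) / len(per_class) - reported_mean) <= tol


if __name__ == "__main__":
    # Values copied from the stage-one and final-test Results blocks.
    print(last_value_is_mean("89.6920 90.0606 89.8763"))
    print(last_value_is_mean("88.9304 89.9025 89.4165"))
```

Both blocks are consistent with that reading, which would put the final mean AP on this small test set at about 89.4.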
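4. Testing
As the message at the end of training says, detection_test.prototxt has to be modified first.
Change the data input to 1*256*50*50, remove the layers before roi_pool5, and change the bottom of roi_pool5 to data.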
name: "Zeiler_conv5"

input: "data"
input_dim: 1
input_dim: 256
input_dim: 50
input_dim: 50

input: "rois"
input_dim: 1 # to be changed on-the-fly to num ROIs
input_dim: 5 # [batch ind, x1, y1, x2, y2] zero-based indexing
input_dim: 1
input_dim: 1

layer {
	bottom: "data"
	bottom: "rois"
	top: "pool5"
	name: "roi_pool5"
	type: "ROIPooling"
	roi_pooling_param {
		pooled_w: 6
		pooled_h: 6
		spatial_scale: 0.0625 # (1/16)
	}
}
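With the convolutional layers removed, roi_pool5 reads the shared conv5 feature map directly, and spatial_scale (1/16) is what maps ROI coordinates from image pixels onto that map. The Python sketch below illustrates the mapping, assuming coordinates are multiplied by spatial_scale and rounded to the nearest integer. It is an illustration only, not the layer's actual code, and the function name is hypothetical.

```python
import math


def project_roi(box, spatial_scale: float = 0.0625) -> tuple:
    """Map a non-negative (x1, y1, x2, y2) box from image pixels to feature-map cells."""
    # floor(v + 0.5) rounds halves upward, unlike Python's built-in round().
    return tuple(int(math.floor(v * spatial_scale + 0.5)) for v in box)


if __name__ == "__main__":
    # A hypothetical ROI in image coordinates.
    print(project_roi((32, 48, 320, 240)))
```

The projected region is then divided into a pooled_w by pooled_h grid (6 by 6 here) regardless of its size.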
In experiments\script_faster_rcnn_demo.m, change the paths to the corresponding local paths. The thres value can be adjusted according to the test results.
model_dir = fullfile(pwd, 'output', 'faster_rcnn_final', 'faster_rcnn_VOC2007_ZF'); %% ZF_test
im_names = {'000001.jpg','000002.jpg','000034.jpg','000212.jpg','000213.jpg', '001150.jpg'};
thres = 0.3; %0.6
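Lowering thres from 0.6 to 0.3 keeps more low-confidence boxes in the displayed result. A minimal Python sketch of score thresholding, assuming each detection is a row of (x1, y1, x2, y2, score) and that boxes scoring at or above the threshold are kept; the sample detections are made up for illustration:

```python
def filter_detections(detections, thres: float) -> list:
    """Keep detections whose score (last element) is at least `thres`."""
    return [det for det in detections if det[4] >= thres]


if __name__ == "__main__":
    # Hypothetical detections: (x1, y1, x2, y2, score).
    dets = [
        (10, 20, 110, 220, 0.92),
        (15, 25, 100, 200, 0.45),
        (50, 60, 90, 120, 0.12),
    ]
    print(len(filter_detections(dets, 0.6)))
    print(len(filter_detections(dets, 0.3)))
```

A lower threshold trades more detections for more false positives, so it is worth trying a few values on the demo images.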
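Detection is fast, but the detection quality on my data was poor this time, possibly because there is too little data, the boxes were drawn carelessly, or some parameter is wrong in a way I have not noticed -_-!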