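This blog post is part of a series of code notes analyzing the source of the CharlesShang/TFFRCNN implementation on GitHub.
--------------- Personal study notes ---------------
---------------- Author: 疆 (Jiang) --------------
------ Original post on cnblogs (博客園) ------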
1. The bbox_pred coordinate offsets predicted by the RPN in Faster RCNN (the RCNN subnet predicts offsets of the same form; this parameterization was first proposed in RCNN)
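Regressing box coordinates directly is hard, whereas predicting a transformation is easier. The regressed offsets are (tx, ty, tw, th), and the corresponding gt labels are (tx*, ty*, tw*, th*). The first and second lines of Eq. (2) define tx = (x - xa) / wa, ty = (y - ya) / ha, tw = log(w / wa), th = log(h / ha), so at test time the following relations hold:
x = tx · wa + xa
y = ty · ha + ya
w = wa · exp(tw)
h = ha · exp(th)
Here (x, y, w, h) are the center x-coordinate, center y-coordinate, width, and height of the box after regression, and (xa, ya, wa, ha) are the same quantities for the box before regression.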
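As a quick numeric check of these relations, the sketch below decodes a single box by hand. The anchor and offset values are made-up illustration numbers rather than anything taken from the repository, and `decode_box` is a hypothetical helper name.

```python
import math


def decode_box(anchor, t):
    """Apply offsets t = (tx, ty, tw, th) to anchor = (xa, ya, wa, ha)."""
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = t
    x = tx * wa + xa          # shift the center by a fraction of the anchor width
    y = ty * ha + ya          # shift the center by a fraction of the anchor height
    w = wa * math.exp(tw)     # scale the width in log space
    h = ha * math.exp(th)     # scale the height in log space
    return x, y, w, h


anchor = (50.0, 50.0, 100.0, 100.0)          # assumed (xa, ya, wa, ha)
offsets = (0.1, -0.2, math.log(2.0), 0.0)    # assumed (tx, ty, tw, th)
x, y, w, h = decode_box(anchor, offsets)
print(x, y, w, h)  # center moves to about (60, 30); width doubles, height is unchanged
```

Because tx and ty are expressed as fractions of the anchor size and tw and th live in log space, the same offsets describe the same relative change for a small anchor and a large one.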
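2. bbox_transform_inv(boxes, deltas): the regression step that returns pred_boxes
boxes has shape (R2, 4) and deltas has shape (R2, n_classes*4). The returned pred_boxes has the same shape as deltas, i.e. (R2, n_classes*4), which means every box is regressed once for each class.
It is called from train.py, test.py, and other modules.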
    import numpy as np


    def bbox_transform_inv(boxes, deltas):
        if boxes.shape[0] == 0:
            return np.zeros((0, deltas.shape[1]), dtype=deltas.dtype)

        boxes = boxes.astype(deltas.dtype, copy=False)
        widths = boxes[:, 2] - boxes[:, 0] + 1.0     # wa
        heights = boxes[:, 3] - boxes[:, 1] + 1.0    # ha
        ctr_x = boxes[:, 0] + 0.5 * widths           # xa
        ctr_y = boxes[:, 1] + 0.5 * heights          # ya

        dx = deltas[:, 0::4]    # tx, taking every 4th column
        dy = deltas[:, 1::4]    # ty
        dw = deltas[:, 2::4]    # tw
        dh = deltas[:, 3::4]    # th

        pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]
        pred_ctr_y = dy * heights[:, np.newaxis] + ctr_y[:, np.newaxis]
        pred_w = np.exp(dw) * widths[:, np.newaxis]    # exponential with base e
        pred_h = np.exp(dh) * heights[:, np.newaxis]

        # pred_boxes has the same shape as deltas
        pred_boxes = np.zeros(deltas.shape, dtype=deltas.dtype)
        # x1
        pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w
        # y1
        pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h
        # x2
        pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * pred_w
        # y2
        pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * pred_h

        return pred_boxes

    # -*- coding:utf-8 -*-
    # Author: WUJiang
    # Test of the strided slicing

    import numpy as np

    deltas = np.array([
        [0, 1, 2, 3, 4, 5, 6, 7],
        [1, 2, 3, 4, 5, 6, 7, 8],
        [2, 3, 4, 5, 6, 7, 8, 9]
    ])
    """
    [[0 4]
     [1 5]
     [2 6]]
    """
    dx = deltas[:, 0::4]  # step of 4 along the second dimension
    print(dx)
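To make the "every box is regressed once for each class" point concrete, here is a small sketch with 2 boxes and 3 classes. The function `decode` is a compact restatement of bbox_transform_inv so that the snippet runs on its own; the box coordinates, `n_classes`, and the offsets are assumed values chosen for illustration.

```python
import numpy as np


def decode(boxes, deltas):
    """Compact restatement of bbox_transform_inv, kept only so this sketch runs alone."""
    boxes = boxes.astype(deltas.dtype, copy=False)
    w = boxes[:, 2] - boxes[:, 0] + 1.0
    h = boxes[:, 3] - boxes[:, 1] + 1.0
    cx = boxes[:, 0] + 0.5 * w
    cy = boxes[:, 1] + 0.5 * h
    # (R,) arrays become (R, 1) so they broadcast against the (R, n_classes) slices
    pcx = deltas[:, 0::4] * w[:, None] + cx[:, None]
    pcy = deltas[:, 1::4] * h[:, None] + cy[:, None]
    pw = np.exp(deltas[:, 2::4]) * w[:, None]
    ph = np.exp(deltas[:, 3::4]) * h[:, None]
    out = np.zeros_like(deltas)
    out[:, 0::4] = pcx - 0.5 * pw
    out[:, 1::4] = pcy - 0.5 * ph
    out[:, 2::4] = pcx + 0.5 * pw
    out[:, 3::4] = pcy + 0.5 * ph
    return out


n_classes = 3                                   # assumed, for illustration
boxes = np.array([[0.0, 0.0, 9.0, 9.0],
                  [10.0, 20.0, 29.0, 59.0]])    # 2 boxes, shape (2, 4)
deltas = np.zeros((2, n_classes * 4))           # shape (2, 12)
deltas[0, 4:8] = [0.5, 0.0, 0.0, 0.0]           # class 1 shifts box 0 right by half its width

pred = decode(boxes, deltas)
print(pred.shape)      # (2, 12)
print(pred[0, 0:4])    # box 0 regressed for class 0 (zero offsets)
print(pred[0, 4:8])    # box 0 regressed for class 1 (shifted in x)
```

One detail this exposes: with zero offsets the decoded x2 and y2 come out one pixel larger than the input, because widths and heights are computed with `+ 1.0` while the corners are rebuilt from the center without subtracting it back.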
3. Other functions
clip_boxes(boxes, im_shape) clips out-of-bounds boxes to the image boundary. test.py also defines this function. It is called by proposal_layer(...) in rpn_msr/proposal_layer_tf.py.
    def clip_boxes(boxes, im_shape):
        """
        Clip boxes to image boundaries.
        """
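Since only the signature and docstring are shown here, the following is a minimal sketch of what clipping to the image boundary could look like, not the repository's code. It assumes `im_shape` is ordered (height, width, ...) and that boxes are stored as repeated (x1, y1, x2, y2) groups, matching the layout used by bbox_transform_inv. The name `clip_boxes_sketch` and the choice to work on a copy are my own.

```python
import numpy as np


def clip_boxes_sketch(boxes, im_shape):
    """Clamp every (x1, y1, x2, y2) group to [0, width - 1] x [0, height - 1]."""
    height, width = im_shape[0], im_shape[1]    # assumes (height, width, ...) order
    boxes = boxes.copy()                        # leave the caller's array untouched
    boxes[:, 0::4] = np.clip(boxes[:, 0::4], 0, width - 1)     # x1
    boxes[:, 1::4] = np.clip(boxes[:, 1::4], 0, height - 1)    # y1
    boxes[:, 2::4] = np.clip(boxes[:, 2::4], 0, width - 1)     # x2
    boxes[:, 3::4] = np.clip(boxes[:, 3::4], 0, height - 1)    # y2
    return boxes


boxes = np.array([[-5.0, -5.0, 120.0, 80.0],
                  [10.0, 10.0, 20.0, 20.0]])
clipped = clip_boxes_sketch(boxes, (60, 100))   # image is 60 pixels high, 100 wide
print(clipped)
```

The first box is pulled back inside the image, while the second, which was already inside, is left unchanged.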
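bbox_transform(ex_rois, gt_rois) computes the gt values tx*, ty*, tw*, th* of the regression offsets from the not-yet-regressed ex_rois and the true gt_rois in the RCNN subnet (during training); see the third and fourth lines of Eq. (2). (No call site has been found yet; presumably it is called during training.)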
    import numpy as np


    def bbox_transform(ex_rois, gt_rois):
        # From the not-yet-regressed ex_rois and the true gt_rois in the RCNN subnet
        # (during training), compute the gt offsets tx*, ty*, tw*, th*
        """
        computes the distance from ground-truth boxes to the given boxes, normed by their size
        :param ex_rois: n * 4 numpy array, given boxes
        :param gt_rois: n * 4 numpy array, ground-truth boxes
        :return: deltas: n * 4 numpy array, regression targets (tx*, ty*, tw*, th*)
        """
        ex_widths = ex_rois[:, 2] - ex_rois[:, 0] + 1.0     # wa
        ex_heights = ex_rois[:, 3] - ex_rois[:, 1] + 1.0    # ha
        ex_ctr_x = ex_rois[:, 0] + 0.5 * ex_widths          # xa
        ex_ctr_y = ex_rois[:, 1] + 0.5 * ex_heights         # ya

        assert np.min(ex_widths) > 0.1 and np.min(ex_heights) > 0.1, \
            'Invalid boxes found: {} {}'.format(
                ex_rois[np.argmin(ex_widths), :], ex_rois[np.argmin(ex_heights), :])

        gt_widths = gt_rois[:, 2] - gt_rois[:, 0] + 1.0     # w*
        gt_heights = gt_rois[:, 3] - gt_rois[:, 1] + 1.0    # h*
        gt_ctr_x = gt_rois[:, 0] + 0.5 * gt_widths          # x*
        gt_ctr_y = gt_rois[:, 1] + 0.5 * gt_heights         # y*

        # warnings.catch_warnings()
        # warnings.filterwarnings('error')
        targets_dx = (gt_ctr_x - ex_ctr_x) / ex_widths      # tx*
        targets_dy = (gt_ctr_y - ex_ctr_y) / ex_heights     # ty*
        targets_dw = np.log(gt_widths / ex_widths)          # tw*
        targets_dh = np.log(gt_heights / ex_heights)        # th*

        targets = np.vstack(
            (targets_dx, targets_dy, targets_dw, targets_dh)).transpose()
        return targets

    # -*- coding:utf-8 -*-
    # Author: WUJiang
    # Test of transpose

    import numpy as np

    a = np.array([
        [0, 1, 2, 3, 4, 5, 6, 7],
        [1, 2, 3, 4, 5, 6, 7, 8],
        [2, 3, 4, 5, 6, 7, 8, 9]
    ])
    """
    [[0 1 2]
     [1 2 3]
     [2 3 4]
     [3 4 5]
     [4 5 6]
     [5 6 7]
     [6 7 8]
     [7 8 9]]
    """
    print(a.transpose())  # transpose
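The encoding in bbox_transform and the decoding in bbox_transform_inv are inverses of each other in center/size form. The sketch below checks this on one assumed pair of boxes: it computes the targets the same way bbox_transform does, then applies them as if they were predictions. `to_ctr_size` is a hypothetical helper and the box values are made up.

```python
import numpy as np


def to_ctr_size(rois):
    """Hypothetical helper: (x1, y1, x2, y2) -> (ctr_x, ctr_y, width, height)."""
    w = rois[:, 2] - rois[:, 0] + 1.0
    h = rois[:, 3] - rois[:, 1] + 1.0
    return rois[:, 0] + 0.5 * w, rois[:, 1] + 0.5 * h, w, h


ex_rois = np.array([[0.0, 0.0, 9.0, 9.0]])      # assumed box before regression
gt_rois = np.array([[2.0, 3.0, 11.0, 14.0]])    # assumed ground-truth box

xa, ya, wa, ha = to_ctr_size(ex_rois)
xg, yg, wg, hg = to_ctr_size(gt_rois)

# Encode: the targets tx*, ty*, tw*, th*, stacked as rows and then transposed to (n, 4)
targets = np.vstack(((xg - xa) / wa, (yg - ya) / ha,
                     np.log(wg / wa), np.log(hg / ha))).transpose()
print(targets.shape)    # (1, 4)

# Decode: apply the targets as if they were the network's predictions
tx, ty, tw, th = targets[:, 0], targets[:, 1], targets[:, 2], targets[:, 3]
x, y = tx * wa + xa, ty * ha + ya
w, h = wa * np.exp(tw), ha * np.exp(th)
print(np.allclose([x, y, w, h], [xg, yg, wg, hg]))    # the gt center and size are recovered
```

This also shows why `vstack(...).transpose()` is needed: stacking four length-n vectors gives a (4, n) array, and the transpose turns it into the (n, 4) layout with one row of targets per box.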