Anchors
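Mask R-CNN generates anchors in essentially the same way as SSD:
There is one center point per feature-map pixel (with an anchor stride of 1).
Boxes are generated around each center point.
Bbox coordinates are normalized to the range 0 to 1, relative to the size of the input image.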
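Basic generation, for an anchor of size scale:
H = scale / np.sqrt(anchor_ratio)
W = scale * np.sqrt(anchor_ratio)
This gives W:H = ratio, while the area stays at scale * scale.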
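The relationship can be checked with a short derivation. The symbols s (anchor scale), r (anchor ratio), h and w are chosen here for illustration; the convention that the ratio means width/height follows the docstring of generate_anchors quoted later in this article.

```latex
h = \frac{s}{\sqrt{r}}, \qquad w = s\sqrt{r}
\quad\Longrightarrow\quad
\frac{w}{h} = \frac{s\sqrt{r}}{s/\sqrt{r}} = r, \qquad
w \cdot h = s^{2}
```

Dividing one side by sqrt(r) and multiplying the other by it changes the shape of the box but not its area, so every ratio at a given scale covers the same number of pixels.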
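Mask R-CNN
self.config.BACKBONE_STRIDES = [4, 8, 16, 32, 64]
# Downsampling factor of each feature level, used when computing the center points
self.config.RPN_ANCHOR_RATIOS = [0.5, 1, 2] # anchor generation parameter for the feature levels
self.config.RPN_ANCHOR_SCALES = [32, 64, 128, 256, 512] # anchor size at each feature level
Anchor generation:
The entry point for anchor generation is the get_anchors method in model.py. It takes image_shape, which only needs to contain [h, w]; [h, w, c] is also accepted: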
def get_anchors(self, image_shape):
    """Returns anchor pyramid for the given image size."""
    # [N, (height, width)]
    backbone_shapes = compute_backbone_shapes(self.config, image_shape)
    # Cache anchors and reuse if image shape is the same
    if not hasattr(self, "_anchor_cache"):
        self._anchor_cache = {}
    if not tuple(image_shape) in self._anchor_cache:
        # Generate Anchors: [anchor_count, (y1, x1, y2, x2)]
        a = utils.generate_pyramid_anchors(
            self.config.RPN_ANCHOR_SCALES,  # (32, 64, 128, 256, 512)
            self.config.RPN_ANCHOR_RATIOS,  # [0.5, 1, 2]
            backbone_shapes,                # with shape [N, (height, width)]
            self.config.BACKBONE_STRIDES,   # [4, 8, 16, 32, 64]
            self.config.RPN_ANCHOR_STRIDE)  # 1
        # Keep a copy of the latest anchors in pixel coordinates because
        # it's used in inspect_model notebooks.
        # TODO: Remove this after the notebook are refactored to not use it
        self.anchors = a
        # Normalize coordinates
        self._anchor_cache[tuple(image_shape)] = utils.norm_boxes(a, image_shape[:2])
    return self._anchor_cache[tuple(image_shape)]
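The caching step in get_anchors can be looked at in isolation. The sketch below is not the library's code: AnchorCache, build_fn and calls are hypothetical names introduced only to show why the shape is converted to a tuple before being used as a dictionary key.

```python
import numpy as np


class AnchorCache:
    """Hypothetical stand-in for the caching part of get_anchors."""

    def __init__(self, build_fn):
        self._build_fn = build_fn   # builds the anchors for one image shape
        self._anchor_cache = {}     # maps tuple(image_shape) -> anchors
        self.calls = 0              # how often anchors were actually built

    def get(self, image_shape):
        # Lists and arrays are unhashable, so the shape is turned into a tuple.
        key = tuple(image_shape)
        if key not in self._anchor_cache:
            self.calls += 1
            self._anchor_cache[key] = self._build_fn(image_shape)
        return self._anchor_cache[key]


# A dummy builder is enough to observe the cache behaviour.
cache = AnchorCache(lambda shape: np.zeros((4, 4), dtype=np.float32))
first = cache.get([1024, 1024, 3])    # built
second = cache.get((1024, 1024, 3))   # same shape, served from the cache
third = cache.get([512, 512, 3])      # new shape, built again
print(cache.calls)
```

Because the key is the full image_shape, a shape given as [h, w] and the same size given as [h, w, c] would be cached as two separate entries.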
It first calls compute_backbone_shapes to compute the shape of each feature level:
def compute_backbone_shapes(config, image_shape):
    """Computes the width and height of each stage of the backbone network.

    Returns:
        [N, (height, width)]. Where N is the number of stages
    """
    if callable(config.BACKBONE):
        return config.COMPUTE_BACKBONE_SHAPE(image_shape)

    # Currently supports ResNet only
    assert config.BACKBONE in ["resnet50", "resnet101"]
    return np.array(
        [[int(math.ceil(image_shape[0] / stride)),
          int(math.ceil(image_shape[1] / stride))]
         for stride in config.BACKBONE_STRIDES])  # [4, 8, 16, 32, 64]
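To see what this returns in practice, here is a minimal standalone sketch of the ResNet branch only. The helper name backbone_shapes and the two example image sizes are assumptions made for illustration; the strides are the ones from the config above.

```python
import math

import numpy as np

BACKBONE_STRIDES = [4, 8, 16, 32, 64]


def backbone_shapes(image_shape, strides=BACKBONE_STRIDES):
    """Feature-map (height, width) per level: ceil(image size / stride)."""
    return np.array(
        [[int(math.ceil(image_shape[0] / s)),
          int(math.ceil(image_shape[1] / s))]
         for s in strides])


square = backbone_shapes([1024, 1024, 3])   # divides evenly at every level
odd = backbone_shapes([800, 600])           # needs rounding up at deeper levels
print(square.tolist())
print(odd.tolist())
```

For the 1024 x 1024 input the levels are 256, 128, 64, 32 and 16 pixels on a side. For 800 x 600 the deeper levels do not divide evenly, which is where the ceil matters (for example 600 / 16 = 37.5 becomes 38).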
It then calls utils.generate_pyramid_anchors to generate all of the anchors:
def generate_pyramid_anchors(scales, ratios, feature_shapes, feature_strides,
                             anchor_stride):
    """Generate anchors at different levels of a feature pyramid. Each scale
    is associated with a level of the pyramid, but each ratio is used in
    all levels of the pyramid.

    Returns:
    anchors: [N, (y1, x1, y2, x2)]. All generated anchors in one array. Sorted
        with the same order of the given scales. So, anchors of scale[0] come
        first, then anchors of scale[1], and so on.
    """
    # Anchors
    # [anchor_count, (y1, x1, y2, x2)]
    anchors = []
    for i in range(len(scales)):
        anchors.append(generate_anchors(scales[i], ratios, feature_shapes[i],
                                        feature_strides[i], anchor_stride))
    # [anchor_count, (y1, x1, y2, x2)]
    return np.concatenate(anchors, axis=0)
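Since each level contributes one anchor per center point and ratio, the size of the concatenated array can be worked out without generating any boxes. The sketch below assumes a 1024 x 1024 input (so the feature shapes are 256 down to 16) and an anchor stride of 1; the variable names are illustrative.

```python
# Feature-map shapes for a 1024 x 1024 image with strides [4, 8, 16, 32, 64].
feature_shapes = [(256, 256), (128, 128), (64, 64), (32, 32), (16, 16)]
ratios = [0.5, 1, 2]
anchor_stride = 1

# One anchor per (center point, ratio) pair on every level.
counts = [
    len(range(0, h, anchor_stride)) * len(range(0, w, anchor_stride)) * len(ratios)
    for h, w in feature_shapes
]
total = sum(counts)
print(counts)
print(total)
```

Under these assumptions the first level alone accounts for 196608 of the 261888 anchors, so most anchors are the small ones from the highest-resolution feature map.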
utils.generate_pyramid_anchors in turn calls utils.generate_anchors to generate the anchors of each level (np.meshgrid is introduced in the post 『Numpy』np.meshgrid):
def generate_anchors(scales, ratios, shape, feature_stride, anchor_stride):
    """
    scales: 1D array of anchor sizes in pixels. Example: [32, 64, 128]
    ratios: 1D array of anchor ratios of width/height. Example: [0.5, 1, 2]
    shape: [height, width] spatial shape of the feature map over which
            to generate anchors.
    feature_stride: Stride of the feature map relative to the image in pixels.
    anchor_stride: Stride of anchors on the feature map. For example, if the
        value is 2 then generate anchors for every other feature map pixel.
    """
    # Get all combinations of scales and ratios
    scales, ratios = np.meshgrid(np.array(scales), np.array(ratios))
    scales = scales.flatten()
    ratios = ratios.flatten()

    # Enumerate heights and widths from scales and ratios
    heights = scales / np.sqrt(ratios)
    widths = scales * np.sqrt(ratios)

    # Enumerate shifts in feature space
    shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
    shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
    shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)

    # Enumerate combinations of shifts, widths, and heights
    box_widths, box_centers_x = np.meshgrid(widths, shifts_x)    # (n, 3) (n, 3)
    box_heights, box_centers_y = np.meshgrid(heights, shifts_y)  # (n, 3) (n, 3)

    # Reshape to get a list of (y, x) and a list of (h, w)
    # (n, 3, 2) -> (3n, 2)
    # box_centers_y and box_centers_x are coordinate matrices; stacking them
    # with np.stack along the given axis recovers the (y, x) pair of each point.
    box_centers = np.stack(
        [box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
    box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])

    # Convert to corner coordinates (y1, x1, y2, x2)
    boxes = np.concatenate([box_centers - 0.5 * box_sizes,
                            box_centers + 0.5 * box_sizes], axis=1)
    # Box coordinates are relative to the original image, [N, (y1, x1, y2, x2)]
    return boxes
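The two rounds of np.meshgrid are easier to follow on a tiny input. The walk-through below repeats the same steps for an assumed 2 x 2 feature map with feature stride 4, one scale (32) and the three ratios, so that every intermediate shape can be inspected; these sizes are chosen for illustration only.

```python
import numpy as np

ratios = np.array([0.5, 1.0, 2.0])
heights = 32 / np.sqrt(ratios)   # one entry per ratio
widths = 32 * np.sqrt(ratios)

# Center points of a 2 x 2 feature map with stride 4: pixel offsets 0 and 4.
shifts_y = np.arange(0, 2, 1) * 4
shifts_x = np.arange(0, 2, 1) * 4
shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)         # both (2, 2)

# Pair each of the 4 centers with each of the 3 box sizes.
box_widths, box_centers_x = np.meshgrid(widths, shifts_x)    # both (4, 3)
box_heights, box_centers_y = np.meshgrid(heights, shifts_y)  # both (4, 3)

# (4, 3, 2) -> (12, 2): three consecutive rows share one center.
box_centers = np.stack([box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])

boxes = np.concatenate([box_centers - 0.5 * box_sizes,
                        box_centers + 0.5 * box_sizes], axis=1)
print(box_centers_x.shape, box_centers.shape, boxes.shape)
print(boxes[:3])   # the three ratios at center (0, 0)
```

The anchors come out grouped by center point, with the ratios varying fastest. Boxes centered at (0, 0) have negative corner coordinates, because anchors are not clipped to the image at this stage.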
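Heights and widths of the boxes:
self.config.RPN_ANCHOR_RATIOS = [0.5, 1, 2] # anchor generation parameter for the feature levels
self.config.RPN_ANCHOR_SCALES = [32, 64, 128, 256, 512] # anchor size at each feature level
Smallest boxes (scale 32):
ratio 0.5: height = 32 / sqrt(0.5) = 45.25, width = 32 * sqrt(0.5) = 22.63
ratio 1: height = 32 / sqrt(1) = 32, width = 32 * sqrt(1) = 32
Largest boxes (scale 512):
ratio 2: height = 512 / sqrt(2) = 362.04, width = 512 * sqrt(2) = 724.08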
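The remaining combinations can be listed the same way. This is a small sketch that applies the height and width formulas from generate_anchors to every scale and ratio in the config; the rows list is a helper introduced here.

```python
import numpy as np

scales = [32, 64, 128, 256, 512]
ratios = [0.5, 1, 2]

rows = []
for s in scales:
    for r in ratios:
        h = s / np.sqrt(r)   # height shrinks as the ratio grows
        w = s * np.sqrt(r)   # width grows as the ratio grows
        rows.append((s, r, float(h), float(w)))
        print(f"scale={s:3d}  ratio={r:<3}  height={h:7.2f}  width={w:7.2f}")
```

Each scale yields one tall box, one square box and one wide box of equal area, which gives 15 distinct anchor shapes across the five levels.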
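Finally, back in get_anchors, utils.norm_boxes is called to normalize the anchor coordinates to the range 0 to 1 (anchors that extend past the image border end up slightly outside that range):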
def norm_boxes(boxes, shape):
    """Converts boxes from pixel coordinates to normalized coordinates.
    boxes: [N, (y1, x1, y2, x2)] in pixel coordinates
    shape: [..., (height, width)] in pixels

    Note: In pixel coordinates (y2, x2) is outside the box. But in normalized
    coordinates it's inside the box.

    Returns:
        [N, (y1, x1, y2, x2)] in normalized coordinates
    """
    h, w = shape
    scale = np.array([h - 1, w - 1, h - 1, w - 1])
    shift = np.array([0, 0, 1, 1])
    return np.divide((boxes - shift), scale).astype(np.float32)
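A quick numeric check shows the effect of the shift and scale arrays. The function body below is copied from the excerpt above so that the example runs on its own; the two sample boxes and the 1024 x 1024 image size are assumptions made for illustration.

```python
import numpy as np


def norm_boxes(boxes, shape):
    """Pixel (y1, x1, y2, x2) -> normalized coordinates, as in the excerpt."""
    h, w = shape
    scale = np.array([h - 1, w - 1, h - 1, w - 1])
    shift = np.array([0, 0, 1, 1])
    return np.divide((boxes - shift), scale).astype(np.float32)


image_shape = [1024, 1024, 3]
boxes = np.array([
    [0, 0, 1024, 1024],                # box covering the whole image
    [-22.63, -11.31, 22.63, 11.31],    # scale-32 anchor centered on pixel (0, 0)
])
normed = norm_boxes(boxes, image_shape[:2])   # only (height, width) is passed
print(normed)
```

The whole-image box maps exactly to (0, 0, 1, 1): subtracting 1 from y2 and x2 moves the exclusive pixel corner onto the last pixel, and dividing by h - 1 and w - 1 places that pixel at 1. The anchor hanging over the top-left corner keeps negative y1 and x1 after normalization.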
Copied from: https://www.cnblogs.com/hellcat/p/9854736.html