Understanding the anchor part of Mask R-CNN

Reposted article, 2019-01-10 19:58

Anchors

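Mask R-CNN generates anchors in essentially the same way as SSD:
The number of anchor centers equals the number of pixels in the feature map.
Boxes are generated around each center.
Bbox coordinates are normalized to the range 0 to 1, relative to the size of the input image.
Basic generation scheme:
H is divided by np.sqrt(anchor_ratio)
W is multiplied by np.sqrt(anchor_ratio)
so that W:H = ratio (the width/height convention documented in generate_anchors below).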
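Mask R-CNN
self.config.BACKBONE_STRIDES = [4, 8, 16, 32, 64]
# Downsampling factor of each feature level; used to compute the anchor centers
self.config.RPN_ANCHOR_RATIOS = [0.5, 1, 2] # anchor generation parameter (aspect ratios) for the feature levels
self.config.RPN_ANCHOR_SCALES = [32, 64, 128, 256, 512] # anchor size in pixels, one per feature level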

Anchor generation:

The entry point for anchor generation is the get_anchors function in model.py. It takes image_shape, which only needs to contain [h, w], although [h, w, c] also works:

def get_anchors(self, image_shape):
    """Returns anchor pyramid for the given image size."""
    # [N, (height, width)]
    backbone_shapes = compute_backbone_shapes(self.config, image_shape)
    # Cache anchors and reuse if image shape is the same
    if not hasattr(self, "_anchor_cache"):
        self._anchor_cache = {}
    if not tuple(image_shape) in self._anchor_cache:
        # Generate Anchors: [anchor_count, (y1, x1, y2, x2)]
        a = utils.generate_pyramid_anchors(
            self.config.RPN_ANCHOR_SCALES,  # (32, 64, 128, 256, 512)
            self.config.RPN_ANCHOR_RATIOS,  # [0.5, 1, 2]
            backbone_shapes,                # with shape [N, (height, width)]
            self.config.BACKBONE_STRIDES,   # [4, 8, 16, 32, 64]
            self.config.RPN_ANCHOR_STRIDE)  # 1
        # Keep a copy of the latest anchors in pixel coordinates because
        # it's used in inspect_model notebooks.
        # TODO: Remove this after the notebook are refactored to not use it
        self.anchors = a
        # Normalize coordinates
        self._anchor_cache[tuple(image_shape)] = utils.norm_boxes(a, image_shape[:2])
    return self._anchor_cache[tuple(image_shape)]
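
The caching behaviour of get_anchors can be illustrated in isolation. The sketch below is not the library code: AnchorCacheDemo and its _compute placeholder are hypothetical names that stand in for the real anchor generation, and only the cache logic is copied from the function above. It suggests that the cache key is the whole image_shape tuple, so [h, w, c] and [h, w] are cached separately even though both would yield the same anchors.

```python
import numpy as np


class AnchorCacheDemo:
    """Hypothetical stand-in that mimics only the caching logic of get_anchors."""

    def __init__(self):
        self.compute_calls = 0

    def _compute(self, image_shape):
        # Placeholder for generate_pyramid_anchors + norm_boxes (assumption:
        # the real result depends only on image_shape[:2]).
        self.compute_calls += 1
        return np.zeros((1, 4), dtype=np.float32)

    def get_anchors(self, image_shape):
        if not hasattr(self, "_anchor_cache"):
            self._anchor_cache = {}
        key = tuple(image_shape)  # lists are unhashable, hence tuple()
        if key not in self._anchor_cache:
            self._anchor_cache[key] = self._compute(image_shape)
        return self._anchor_cache[key]


demo = AnchorCacheDemo()
a = demo.get_anchors([1024, 1024, 3])
b = demo.get_anchors((1024, 1024, 3))  # same key after tuple(): cache hit
c = demo.get_anchors((1024, 1024))     # different key: computed again
print(demo.compute_calls)
```

Passing the shape in a consistent form therefore avoids redundant anchor generation.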

It calls compute_backbone_shapes to compute the shape of each feature level:

def compute_backbone_shapes(config, image_shape):
    """Computes the width and height of each stage of the backbone network.
 
    Returns:
        [N, (height, width)]. Where N is the number of stages
    """
    if callable(config.BACKBONE):
        return config.COMPUTE_BACKBONE_SHAPE(image_shape)
 
    # Currently supports ResNet only
    assert config.BACKBONE in ["resnet50", "resnet101"]
    return np.array(
        [[int(math.ceil(image_shape[0] / stride)),
            int(math.ceil(image_shape[1] / stride))]
            for stride in config.BACKBONE_STRIDES])  # [4, 8, 16, 32, 64]
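
To make the effect of math.ceil concrete, here is a minimal sketch that restates only the ResNet branch of compute_backbone_shapes as a standalone function. The name backbone_shapes and the two example image sizes are assumptions chosen for illustration; the strides are the ones quoted in this article.

```python
import math

import numpy as np

BACKBONE_STRIDES = [4, 8, 16, 32, 64]  # values quoted in the article


def backbone_shapes(image_shape, strides=BACKBONE_STRIDES):
    """Per-level [height, width]; only image_shape[0] and [1] are read."""
    return np.array([[int(math.ceil(image_shape[0] / s)),
                      int(math.ceil(image_shape[1] / s))]
                     for s in strides])


square = backbone_shapes((1024, 1024, 3))  # divisible by every stride
odd = backbone_shapes((1000, 600))         # not divisible by 16, 32, 64

print(square.tolist())
print(odd.tolist())
```

When the image size is not a multiple of the stride, the feature map is rounded up, so the last row or column of anchor centers can fall close to or beyond the image border.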

It then calls utils.generate_pyramid_anchors to generate all anchors:

def generate_pyramid_anchors(scales, ratios, feature_shapes, feature_strides,
                             anchor_stride):
    """Generate anchors at different levels of a feature pyramid. Each scale
    is associated with a level of the pyramid, but each ratio is used in
    all levels of the pyramid.
 
    Returns:
    anchors: [N, (y1, x1, y2, x2)]. All generated anchors in one array. Sorted
        with the same order of the given scales. So, anchors of scale[0] come
        first, then anchors of scale[1], and so on.
    """
    # Anchors
    # [anchor_count, (y1, x1, y2, x2)]
    anchors = []
    for i in range(len(scales)):
        anchors.append(generate_anchors(scales[i],
                                        ratios,
                                        feature_shapes[i],
                                        feature_strides[i],
                                        anchor_stride))
    # [anchor_count, (y1, x1, y2, x2)]
    return np.concatenate(anchors, axis=0)
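
Because each level contributes one anchor per ratio at every sampled feature-map position, the size of the concatenated array can be estimated without generating any boxes. The sketch below assumes a 1024x1024 input (so the feature shapes are 256 down to 16) and RPN_ANCHOR_STRIDE = 1; anchors_per_level is a hypothetical helper, not part of the library.

```python
import numpy as np

ratios = [0.5, 1, 2]
# Assumed feature shapes for a 1024x1024 input with strides [4, 8, 16, 32, 64]
feature_shapes = [(256, 256), (128, 128), (64, 64), (32, 32), (16, 16)]
anchor_stride = 1


def anchors_per_level(shape, n_ratios, anchor_stride):
    # np.arange(0, n, step) yields the sampled positions along one axis
    rows = len(np.arange(0, shape[0], anchor_stride))
    cols = len(np.arange(0, shape[1], anchor_stride))
    return rows * cols * n_ratios


counts = [anchors_per_level(s, len(ratios), anchor_stride) for s in feature_shapes]
# Start index of each level inside the concatenated anchor array
offsets = np.concatenate([[0], np.cumsum(counts)[:-1]]).tolist()
total = sum(counts)

print(counts)
print(offsets)
print(total)
```

Under these assumptions the first level alone accounts for three quarters of all anchors, and the offsets show where each scale begins in the returned array.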

utils.generate_pyramid_anchors calls utils.generate_anchors to generate the anchors of each level (for background, see the 『Numpy』np.meshgrid post):

def generate_anchors(scales, ratios, shape, feature_stride, anchor_stride):
    """
    scales: 1D array of anchor sizes in pixels. Example: [32, 64, 128]
    ratios: 1D array of anchor ratios of width/height. Example: [0.5, 1, 2]
    shape: [height, width] spatial shape of the feature map over which
            to generate anchors.
    feature_stride: Stride of the feature map relative to the image in pixels.
    anchor_stride: Stride of anchors on the feature map. For example, if the
        value is 2 then generate anchors for every other feature map pixel.
    """
    # Get all combinations of scales and ratios
    scales, ratios = np.meshgrid(np.array(scales), np.array(ratios))
    scales = scales.flatten()
    ratios = ratios.flatten()
 
    # Enumerate heights and widths from scales and ratios
    heights = scales / np.sqrt(ratios)
    widths = scales * np.sqrt(ratios)
 
    # Enumerate shifts in feature space
    shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
    shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
    shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)
 
    # Enumerate combinations of shifts, widths, and heights
    box_widths, box_centers_x = np.meshgrid(widths, shifts_x)    # (n, 3) (n, 3)
    box_heights, box_centers_y = np.meshgrid(heights, shifts_y)  # (n, 3) (n, 3)
 
    # Reshape to get a list of (y, x) and a list of (h, w)
    # (n, 3, 2) -> (3n, 2)
    box_centers = np.stack([box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
    # box_centers_y and box_centers_x are coordinate matrices; stacking them
    # with np.stack along the given axis recovers the (y, x) pair of each point.
    box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])

    # Convert to corner coordinates (y1, x1, y2, x2)
    boxes = np.concatenate([box_centers - 0.5 * box_sizes,
                            box_centers + 0.5 * box_sizes], axis=1)
    # Boxes are expressed relative to the original image: [N, (y1, x1, y2, x2)]
    return boxes
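
The meshgrid steps are easier to follow on a tiny feature map. The sketch below is a condensed restatement of the function above for a single scale; tiny_generate_anchors is a hypothetical name, and the 2x2 feature map with stride 16 is an assumed toy input rather than a real pyramid level.

```python
import numpy as np


def tiny_generate_anchors(scale, ratios, shape, feature_stride, anchor_stride=1):
    """Condensed restatement of generate_anchors for one scale."""
    ratios = np.asarray(ratios, dtype=np.float64)
    heights = scale / np.sqrt(ratios)
    widths = scale * np.sqrt(ratios)

    # Anchor centers in image pixels
    shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
    shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
    shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)

    # Pair every center with every (height, width): arrays of shape (n, 3)
    box_widths, box_centers_x = np.meshgrid(widths, shifts_x)
    box_heights, box_centers_y = np.meshgrid(heights, shifts_y)

    box_centers = np.stack([box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
    box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])
    return np.concatenate([box_centers - 0.5 * box_sizes,
                           box_centers + 0.5 * box_sizes], axis=1)


anchors = tiny_generate_anchors(32, [0.5, 1, 2], shape=(2, 2), feature_stride=16)
print(anchors.shape)        # (12, 4): 4 centers x 3 ratios
print(anchors[:3].round(2))  # the three boxes around the first center (0, 0)
```

The three ratios of one center are stored consecutively, and boxes around border centers extend to negative coordinates, that is, outside the image.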

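Box heights and widths:

self.config.RPN_ANCHOR_RATIOS = [0.5, 1, 2] # anchor generation parameter (aspect ratios) for the feature levels
self.config.RPN_ANCHOR_SCALES = [32, 64, 128, 256, 512] # anchor size in pixels, one per feature level

Smallest boxes (scale 32):

ratio 0.5: height = 32 / sqrt(0.5) = 45.25, width = 32 * sqrt(0.5) = 22.63

ratio 1: height = 32 / sqrt(1) = 32, width = 32 * sqrt(1) = 32

Largest box (scale 512, ratio 2):

height = 512 / sqrt(2) = 362.04, width = 512 * sqrt(2) = 724.08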
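
The same arithmetic can be tabulated for every scale and ratio at once. This is only a sketch built on the formulas in generate_anchors; the variable names and the printing layout are illustrative choices.

```python
import numpy as np

scales = np.array([32, 64, 128, 256, 512], dtype=np.float64)
ratios = np.array([0.5, 1, 2])

# Broadcasting gives one row per scale and one column per ratio
heights = scales[:, None] / np.sqrt(ratios)[None, :]
widths = scales[:, None] * np.sqrt(ratios)[None, :]

for s, hs, ws in zip(scales, heights, widths):
    row = ", ".join(f"{h:.2f} x {w:.2f}" for h, w in zip(hs, ws))
    print(f"scale {int(s):>3} (h x w): {row}")
```

Two properties follow directly from the formulas: width / height equals the ratio, and height * width equals scale squared, so changing the ratio reshapes the box without changing its area.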

Finally, back in get_anchors, utils.norm_boxes is called to normalize the anchor coordinates to the range 0 to 1:

def norm_boxes(boxes, shape):
    """Converts boxes from pixel coordinates to normalized coordinates.
    boxes: [N, (y1, x1, y2, x2)] in pixel coordinates
    shape: [..., (height, width)] in pixels
 
    Note: In pixel coordinates (y2, x2) is outside the box. But in normalized
    coordinates it's inside the box.
 
    Returns:
        [N, (y1, x1, y2, x2)] in normalized coordinates
    """
    h, w = shape
    scale = np.array([h - 1, w - 1, h - 1, w - 1])
    shift = np.array([0, 0, 1, 1])
    return np.divide((boxes - shift), scale).astype(np.float32)
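
A short check of the shift and scale terms may help here. The sketch below copies norm_boxes as shown above and applies it to two assumed boxes on a 1024x1024 image: one covering the full image and one anchor centered on the top-left pixel.

```python
import numpy as np


def norm_boxes(boxes, shape):
    """Same arithmetic as the function above, repeated so the example runs alone."""
    h, w = shape
    scale = np.array([h - 1, w - 1, h - 1, w - 1])
    shift = np.array([0, 0, 1, 1])
    return np.divide((boxes - shift), scale).astype(np.float32)


image_shape = (1024, 1024, 3)           # assumed input size
boxes = np.array([[0, 0, 1024, 1024],   # full image in pixel coordinates
                  [-16, -16, 16, 16]])  # anchor centered on pixel (0, 0)

normalized = norm_boxes(boxes, image_shape[:2])
print(normalized)
```

The full-image box maps exactly to [0, 0, 1, 1] because (y2, x2) is shifted by one pixel before dividing by (h - 1, w - 1). Anchors that extend past the border keep values slightly outside the 0 to 1 range, since norm_boxes does not clip.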

Source: https://www.cnblogs.com/hellcat/p/9854736.html
