mmdetection: various anchor generation methods and their label assigners (1)


anchor_generator.py integrates quite a few anchor generation methods. While reading the mmdetection source code, I summarize them here.

(Faster R-CNN, YOLOv3 and SSD are covered first; more will be added later.)

一、Anchor generation

         The overall idea of anchor generation is: first generate the base anchors, then use a mesh grid (meshgrid) to replicate them over all locations and obtain the remaining anchors.

        1、Faster R-CNN

               Faster R-CNN's anchor generation is the classic case, and the other methods differ from it only in small details. In anchor_generator.py you can see how the anchors are generated without any for loop, purely through broadcasting.

              First, the base anchors. The anchor center sits at the origin (0, 0) by default (controlled by center_offset), (stride, stride) is taken as the basic (w, h), and multiple anchors are obtained by combining it with the scales and ratios. For example, with scales = [8, 16, 32] (multipliers on w and h) and ratios = [0.5, 1.0, 2.0] (the ratio of h to w), 9 base anchors are generated:

def gen_single_level_base_anchors(self,
                                  base_size,
                                  scales,
                                  ratios,
                                  center=None):
    """Generate base anchors of a single level.

    Args:
        base_size (int | float): Basic size of an anchor.
        scales (torch.Tensor): Scales of the anchor.
        ratios (torch.Tensor): The ratio between the height
            and width of anchors in a single level.
        center (tuple[float], optional): The center of the base anchor
            related to a single feature grid. Defaults to None.

    Returns:
        torch.Tensor: Anchors in a single-level feature maps.
    """
    w = base_size
    h = base_size
    if center is None:
        x_center = self.center_offset * w
        y_center = self.center_offset * h
    else:
        x_center, y_center = center

    h_ratios = torch.sqrt(ratios)
    w_ratios = 1 / h_ratios
    if self.scale_major:
        ws = (w * w_ratios[:, None] * scales[None, :]).view(-1)
        hs = (h * h_ratios[:, None] * scales[None, :]).view(-1)
    else:
        ws = (w * scales[:, None] * w_ratios[None, :]).view(-1)
        hs = (h * scales[:, None] * h_ratios[None, :]).view(-1)

    # use float anchor and the anchor's center is aligned with the
    # pixel center
    base_anchors = [
        x_center - 0.5 * ws, y_center - 0.5 * hs, x_center + 0.5 * ws,
        y_center + 0.5 * hs
    ]
    base_anchors = torch.stack(base_anchors, dim=-1)

    return base_anchors
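
          To see the broadcasting concretely, here is a minimal standalone sketch (plain PyTorch, not the AnchorGenerator class itself) that reproduces the computation above for base_size = 16, scales = [8, 16, 32] and ratios = [0.5, 1.0, 2.0], giving the 9 base anchors centered at (0, 0):

import torch

# standalone reproduction of gen_single_level_base_anchors (scale_major=True, center_offset=0)
base_size = 16
scales = torch.tensor([8., 16., 32.])
ratios = torch.tensor([0.5, 1.0, 2.0])
x_center, y_center = 0., 0.

h_ratios = torch.sqrt(ratios)      # sqrt(ratio) so that h/w == ratio while w*h stays constant
w_ratios = 1 / h_ratios
ws = (base_size * w_ratios[:, None] * scales[None, :]).view(-1)   # 3 ratios x 3 scales = 9 widths
hs = (base_size * h_ratios[:, None] * scales[None, :]).view(-1)   # 9 heights

base_anchors = torch.stack([
    x_center - 0.5 * ws, y_center - 0.5 * hs,
    x_center + 0.5 * ws, y_center + 0.5 * hs
], dim=-1)                         # (9, 4) in (x1, y1, x2, y2) format
print(base_anchors.shape)          # torch.Size([9, 4])
print(hs / ws)                     # h/w ratios: approximately 0.5, 0.5, 0.5, 1, 1, 1, 2, 2, 2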

          With the base anchors in hand, all that remains is to shift them to every other location. So meshgrid is first used to generate every grid position (scaled by the stride), and the base anchors are then added to each position.

def _meshgrid(self, x, y, row_major=True):
    """Generate mesh grid of x and y.

    Args:
        x (torch.Tensor): Grids of x dimension.
        y (torch.Tensor): Grids of y dimension.
        row_major (bool, optional): Whether to return y grids first.
            Defaults to True.

    Returns:
        tuple[torch.Tensor]: The mesh grids of x and y.
    """
    xx = x.repeat(len(y))
    yy = y.view(-1, 1).repeat(1, len(x)).view(-1)
    if row_major:
        return xx, yy
    else:
        return yy, xx

def single_level_grid_anchors(self,
                              base_anchors,
                              featmap_size,
                              stride=(16, 16),
                              device='cuda'):
    """Generate grid anchors of a single level.

    Note:
        This function is usually called by method ``self.grid_anchors``.

    Args:
        base_anchors (torch.Tensor): The base anchors of a feature grid.
        featmap_size (tuple[int]): Size of the feature maps.
        stride (tuple[int], optional): Stride of the feature map in order
            (w, h). Defaults to (16, 16).
        device (str, optional): Device the tensor will be put on.
            Defaults to 'cuda'.

    Returns:
        torch.Tensor: Anchors in the overall feature maps.
    """
    feat_h, feat_w = featmap_size
    # convert Tensor to int, so that we can convert to ONNX correctly
    feat_h = int(feat_h)
    feat_w = int(feat_w)
    shift_x = torch.arange(0, feat_w, device=device) * stride[0]
    shift_y = torch.arange(0, feat_h, device=device) * stride[1]

    shift_xx, shift_yy = self._meshgrid(shift_x, shift_y)
    shifts = torch.stack([shift_xx, shift_yy, shift_xx, shift_yy], dim=-1)
    shifts = shifts.type_as(base_anchors)
    # first feat_w elements correspond to the first row of shifts
    # add A anchors (1, A, 4) to K shifts (K, 1, 4) to get
    # shifted anchors (K, A, 4), reshape to (K*A, 4)

    all_anchors = base_anchors[None, :, :] + shifts[:, None, :]
    all_anchors = all_anchors.view(-1, 4)
    # first A rows correspond to A anchors of (0, 0) in feature map,
    # then (0, 1), (0, 2), ...
    return all_anchors
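
          The shift-and-broadcast step can be checked on a toy case. The sketch below (standalone code, not a call into mmdetection) shifts a single 32x32 base anchor over a 2x2 feature map with stride 16, following the same repeat/meshgrid logic as above:

import torch

# one base anchor (A=1) of size 32x32 centered at (0, 0)
base_anchors = torch.tensor([[-16., -16., 16., 16.]])

feat_h, feat_w = 2, 2
stride_w, stride_h = 16, 16
shift_x = torch.arange(0, feat_w) * stride_w          # tensor([ 0, 16])
shift_y = torch.arange(0, feat_h) * stride_h          # tensor([ 0, 16])

# _meshgrid: xx varies fastest (row-major), so grid points come out as
# (0,0), (16,0), (0,16), (16,16)
xx = shift_x.repeat(len(shift_y))
yy = shift_y.view(-1, 1).repeat(1, len(shift_x)).view(-1)
shifts = torch.stack([xx, yy, xx, yy], dim=-1).float()   # (K, 4), K = feat_h * feat_w

# (1, A, 4) + (K, 1, 4) -> (K, A, 4) -> (K*A, 4)
all_anchors = (base_anchors[None, :, :] + shifts[:, None, :]).view(-1, 4)
print(all_anchors)
# tensor([[-16., -16.,  16.,  16.],
#         [  0., -16.,  32.,  16.],
#         [-16.,   0.,  16.,  32.],
#         [  0.,   0.,  32.,  32.]])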

             2、YOLOv2 & YOLOv3

                  The only difference from Faster R-CNN lies in the base anchors: YOLO's base anchors are obtained by clustering the boxes of the dataset. As shown below, there is no need to compute scales and ratios again here; the rest is the same grid shifting to generate the remaining anchors.

def gen_single_level_base_anchors(self, base_sizes_per_level, center=None):
    """Generate base anchors of a single level.

    Args:
        base_sizes_per_level (list[tuple[int, int]]): Basic sizes of
            anchors.
        center (tuple[float], optional): The center of the base anchor
            related to a single feature grid. Defaults to None.

    Returns:
        torch.Tensor: Anchors in a single-level feature maps.
    """
    x_center, y_center = center
    base_anchors = []
    for base_size in base_sizes_per_level:
        w, h = base_size

        # use float anchor and the anchor's center is aligned with the
        # pixel center
        base_anchor = torch.Tensor([
            x_center - 0.5 * w, y_center - 0.5 * h, x_center + 0.5 * w,
            y_center + 0.5 * h
        ])
        base_anchors.append(base_anchor)
    base_anchors = torch.stack(base_anchors, dim=0)

    return base_anchors
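
          The clustering itself is not part of anchor_generator.py; the base_sizes are simply passed in from the config (for COCO, the familiar 9 YOLOv3 anchors). As a rough illustration of how such anchors can be obtained, here is a minimal k-means sketch on GT (w, h) pairs using 1 - IoU as the distance, the standard YOLOv2/YOLOv3 recipe; the helper names and the tiny dataset are made up for the example:

import numpy as np

def wh_iou(wh, centers):
    """IoU between (w, h) boxes and cluster centers, assuming a shared top-left corner."""
    inter = np.minimum(wh[:, None, 0], centers[None, :, 0]) * \
            np.minimum(wh[:, None, 1], centers[None, :, 1])
    union = (wh[:, None, 0] * wh[:, None, 1]
             + centers[None, :, 0] * centers[None, :, 1] - inter)
    return inter / union

def kmeans_anchors(wh, k=9, iters=50, seed=0):
    """k-means on GT box sizes with distance = 1 - IoU (hypothetical helper, not mmdet API)."""
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), k, replace=False)].astype(float)
    for _ in range(iters):
        assign = np.argmax(wh_iou(wh, centers), axis=1)    # nearest center = highest IoU
        for j in range(k):
            if np.any(assign == j):
                centers[j] = wh[assign == j].mean(axis=0)  # update center as cluster mean
    return centers[np.argsort(centers.prod(axis=1))]       # sort by area, small to large

# toy data: (w, h) of GT boxes in pixels; a real run would use the whole training set
wh = np.array([[30, 40], [35, 45], [120, 80], [110, 90], [300, 250], [280, 260]], dtype=float)
print(kmeans_anchors(wh, k=3, iters=20))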

            3、SSD

                         SSD is similar; the difference is that the anchor scale is no longer fixed but varies across levels (see the scale formula given in the paper): as the feature map gets smaller, the scale grows (a larger receptive field calls for larger anchors).

           The rest is the same as Faster R-CNN. See https://zhuanlan.zhihu.com/p/33544892 for reference.

# compute the anchor sizes on the original image (side lengths 60, 111, 162, 213, 264)
min_sizes = []
max_sizes = []
for ratio in range(int(min_ratio), int(max_ratio) + 1, step):
    min_sizes.append(int(self.input_size * ratio / 100))
    max_sizes.append(int(self.input_size * (ratio + step) / 100))

# add one extra, smaller anchor size (e.g. 30)
if self.input_size == 300:
    if basesize_ratio_range[0] == 0.15:  # SSD300 COCO
        min_sizes.insert(0, int(self.input_size * 7 / 100))
        max_sizes.insert(0, int(self.input_size * 15 / 100))

# convert the sizes into scales and ratios
anchor_ratios = []
anchor_scales = []
for k in range(len(self.strides)):
    scales = [1., np.sqrt(max_sizes[k] / min_sizes[k])]
    anchor_ratio = [1.]
    for r in ratios[k]:
        anchor_ratio += [1 / r, r]  # 4 or 6 ratios
    anchor_ratios.append(torch.Tensor(anchor_ratio))
    anchor_scales.append(torch.Tensor(scales))
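
          Plugging in the numbers for SSD300 on VOC (input_size = 300, basesize_ratio_range = (0.2, 0.9), 6 feature levels) reproduces the sizes mentioned in the comments. The quick check below assumes the step is computed as (max_ratio - min_ratio) // (num_levels - 2), which is not shown in the quoted snippet:

# quick check of the sizes quoted in the comments, for SSD300 on VOC
# (the step value is an assumption: (max_ratio - min_ratio) // (num_levels - 2) = 17)
input_size = 300
basesize_ratio_range = (0.2, 0.9)
num_levels = 6

min_ratio = int(basesize_ratio_range[0] * 100)        # 20
max_ratio = int(basesize_ratio_range[1] * 100)        # 90
step = (max_ratio - min_ratio) // (num_levels - 2)    # 17

min_sizes, max_sizes = [], []
for ratio in range(min_ratio, max_ratio + 1, step):
    min_sizes.append(int(input_size * ratio / 100))
    max_sizes.append(int(input_size * (ratio + step) / 100))

# extra small size for the first level (VOC branch: 10% of the input size)
min_sizes.insert(0, int(input_size * 10 / 100))
max_sizes.insert(0, int(input_size * 20 / 100))

print(min_sizes)   # [30, 60, 111, 162, 213, 264]
print(max_sizes)   # [60, 111, 162, 213, 264, 315]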

二、Anchor assigner

          Once the anchors have been generated, they need to be labeled, i.e. we decide which of them are positive samples and which are negative samples.

          1、MaxIoUAssigner

                 This is the strategy used by SSD and Faster R-CNN: compute the IoU between every anchor and the GT boxes. For each anchor, if its maximum IoU > pos_iou_thr it becomes a positive sample; if its maximum IoU < neg_iou_thr it is assigned to the background class. There is one more detail in the code: under the strategy above, some GTs may fail to match any anchor, so a low-quality matching step is added to enlarge the positive set. It traverses every GT, looks at the anchor with the highest IoU for that GT, and if that IoU >= min_pos_iou, marks that anchor as positive. Note that this still cannot guarantee that every GT gets an anchor (it depends on the traversal order of the GTs), and it may introduce some poor-quality positives, so it does not necessarily improve results. The following post gives a nice example of this:

         目标检测(MMdetection)——Retina(Anchor、Focal Loss) - 知乎 (zhihu.com)

if self.match_low_quality:
    # Low-quality matching will overwrite the assigned_gt_inds assigned
    # in Step 3. Thus, the assigned gt might not be the best one for
    # prediction.
    # For example, if bbox A has 0.9 and 0.8 iou with GT bbox 1 & 2,
    # bbox 1 will be assigned as the best target for bbox A in step 3.
    # However, if GT bbox 2's gt_argmax_overlaps = A, bbox A's
    # assigned_gt_inds will be overwritten to be GT bbox 2.
    # This might be the reason that it is not used in ROI Heads.
    for i in range(num_gts):
        if gt_max_overlaps[i] >= self.min_pos_iou:
            if self.gt_max_assign_all:
                max_iou_inds = overlaps[i, :] == gt_max_overlaps[i]
                assigned_gt_inds[max_iou_inds] = i + 1
            else:
                assigned_gt_inds[gt_argmax_overlaps[i]] = i + 1
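
        A toy example makes both the thresholding and the low-quality overwrite visible. The sketch below is a simplified standalone reproduction of the assignment steps (not the real MaxIoUAssigner class) with 2 GTs, 4 anchors and made-up IoU values:

import torch

# overlaps[i, j] = IoU between GT i and anchor j (made-up numbers)
overlaps = torch.tensor([
    [0.9, 0.20, 0.1, 0.0],   # GT 1
    [0.8, 0.25, 0.1, 0.0],   # GT 2
])
pos_iou_thr, neg_iou_thr, min_pos_iou = 0.7, 0.3, 0.3

num_gts, num_anchors = overlaps.shape
assigned_gt_inds = overlaps.new_full((num_anchors,), -1, dtype=torch.long)  # -1: ignore

# step 2: negatives (max IoU over all GTs below neg_iou_thr -> background 0)
max_overlaps, argmax_overlaps = overlaps.max(dim=0)
assigned_gt_inds[max_overlaps < neg_iou_thr] = 0

# step 3: positives (max IoU above pos_iou_thr -> 1-based GT index)
pos_inds = max_overlaps >= pos_iou_thr
assigned_gt_inds[pos_inds] = argmax_overlaps[pos_inds] + 1

# step 4: low-quality matching, may overwrite step 3
gt_max_overlaps, gt_argmax_overlaps = overlaps.max(dim=1)
for i in range(num_gts):
    if gt_max_overlaps[i] >= min_pos_iou:
        assigned_gt_inds[gt_argmax_overlaps[i]] = i + 1

print(assigned_gt_inds)
# tensor([2, 0, 0, 0]): anchor 0 ends up matched to GT 2 instead of its best GT 1,
# and GT 1 is left without any positive anchor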


       2、GridAssigner

              The strategy used by YOLO. Again, the IoU between anchors and GT boxes is computed first. Negatives are labeled in the same way; the difference is in the positives. An anchor can only be matched to a GT whose center falls into the anchor's grid cell, and it becomes positive when its IoU with such a GT exceeds pos_iou_thr; in addition, for every GT, the anchor in its responsible cell with the highest IoU is assigned to that GT. In effect, each GT ends up with essentially one positive anchor.
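
      Below is a minimal sketch of the "responsible cell" idea (a simplified illustration, not the actual GridAssigner code; the anchor boxes and the GT are made up): the GT center selects one grid cell, and among the anchors of that cell only the one with the highest IoU becomes the positive sample.

import torch
from torchvision.ops import box_iou

stride = 32
gt_bbox = torch.tensor([[40., 50., 120., 150.]])       # one GT box, (x1, y1, x2, y2)
cx, cy = (40 + 120) / 2, (50 + 150) / 2                # GT center = (80, 100)
cell = (int(cx // stride), int(cy // stride))          # responsible cell (2, 3)

# three anchors whose centers lie in cell (2, 3), e.g. from three clustered base sizes
anchors_in_cell = torch.tensor([
    [72., 88., 104., 120.],    # small anchor
    [48., 60., 128., 148.],    # medium anchor
    [16., 20., 160., 188.],    # large anchor
])
ious = box_iou(anchors_in_cell, gt_bbox).squeeze(1)
pos_idx = ious.argmax()        # only the best-matching anchor in the responsible cell is positive
print(cell, ious, int(pos_idx))   # the medium anchor wins (IoU ~= 0.73)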

三、Box encoding

         During training, the GT and anchor coordinates are not regressed directly; to speed up convergence they are first encoded, and the encoding schemes differ slightly between detectors.

            For this part, see 史上最详细的Yolov3边框预测分析 (逍遥王的博客, CSDN).
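
             For a concrete reference point, the Faster R-CNN/SSD style encoding (what mmdetection's DeltaXYWHBBoxCoder implements, usually followed by mean/std normalization) regresses offsets of the GT relative to the matched anchor instead of raw coordinates, while YOLO predicts sigmoid-bounded center offsets within the responsible grid cell and log-scale factors of its clustered anchors. A minimal sketch of the delta encoding, without the mean/std step:

import torch

def bbox2delta(anchors, gts):
    """Encode GT boxes as (dx, dy, dw, dh) relative to anchors (simplified, no mean/std)."""
    ax = (anchors[:, 0] + anchors[:, 2]) * 0.5
    ay = (anchors[:, 1] + anchors[:, 3]) * 0.5
    aw = anchors[:, 2] - anchors[:, 0]
    ah = anchors[:, 3] - anchors[:, 1]

    gx = (gts[:, 0] + gts[:, 2]) * 0.5
    gy = (gts[:, 1] + gts[:, 3]) * 0.5
    gw = gts[:, 2] - gts[:, 0]
    gh = gts[:, 3] - gts[:, 1]

    dx = (gx - ax) / aw            # center offset, normalized by the anchor size
    dy = (gy - ay) / ah
    dw = torch.log(gw / aw)        # log scale, so size changes are symmetric
    dh = torch.log(gh / ah)
    return torch.stack([dx, dy, dw, dh], dim=-1)

anchors = torch.tensor([[0., 0., 64., 64.]])
gts = torch.tensor([[8., 4., 72., 84.]])
print(bbox2delta(anchors, gts))    # tensor([[0.1250, 0.1875, 0.0000, 0.2231]])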
