The prior_box layer

Reposted article (original linked below), 2018-06-26 19:50. Tag: ssd

https://www.jianshu.com/p/5195165bbd06

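1. step_w and step_h play the same role as feat_stride in Faster R-CNN: they map each point of the feature map back to the original image. It follows that min_size and max_size are expressed directly in original-image pixels.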
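A minimal sketch of that mapping, assuming no explicit step is given in the prototxt, so that step_w and step_h fall back to the input size divided by the feature-map size. `Center` and `cell_center` are illustrative names, not part of Caffe.

```cpp
#include <cassert>

// Centre of one feature-map cell, in input-image pixels.
struct Center {
  float x;
  float y;
};

// Hypothetical helper: maps cell (w, h) of a layer_w x layer_h feature map
// back to the img_w x img_h input image. Assumption: step = image size / map size.
Center cell_center(int w, int h, int layer_w, int layer_h,
                   int img_w, int img_h, float offset = 0.5f) {
  assert(layer_w > 0 && layer_h > 0);
  const float step_w = static_cast<float>(img_w) / layer_w;
  const float step_h = static_cast<float>(img_h) / layer_h;
  return {(w + offset) * step_w, (h + offset) * step_h};
}
```

With a 10x10 feature map on a 300x300 input, each cell covers a 30-pixel stride, so cell (0, 0) is centred at pixel (15, 15).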

2. Taking MobileNet-SSD as an example: https://github.com/chuanqi305/MobileNet-SSD/blob/master/train.prototxt

layer {
  name: "conv11_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv11"
  bottom: "data"
  top: "conv11_mbox_priorbox"
  prior_box_param {
    min_size: 60.0
    aspect_ratio: 2.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
  }
}

layer {
  name: "conv13_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv13"
  bottom: "data"
  top: "conv13_mbox_priorbox"
  prior_box_param {
    min_size: 105.0
    max_size: 150.0
    aspect_ratio: 2.0
    aspect_ratio: 3.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
  }
}

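Only conv11 has 3 anchors per location; the other five layers have 6 each. conv11 has a min_size but no max_size and only one aspect_ratio, whereas the other five layers have both sizes and two aspect ratios. So conv11 gets 1 + 1*2 = 3 anchors and each of the other five layers gets 1 + 1 + 2*2 = 6.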
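The counting rule can be checked with a short sketch. `count_priors` is a hypothetical helper written for this note; it assumes, as in the two layers shown, that max_size is either absent or given once per min_size.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of anchors per feature-map location.
// The ratio list always starts with 1; each new ratio adds itself and,
// when flip is true, its reciprocal.
int count_priors(std::size_t num_min_sizes, std::size_t num_max_sizes,
                 const std::vector<float>& aspect_ratios, bool flip) {
  assert(num_max_sizes == 0 || num_max_sizes == num_min_sizes);
  std::vector<float> ratios{1.0f};
  for (float ar : aspect_ratios) {
    bool exists = false;
    for (float r : ratios) {
      if (std::fabs(ar - r) < 1e-6f) {
        exists = true;
        break;
      }
    }
    if (exists) continue;
    ratios.push_back(ar);
    if (flip) ratios.push_back(1.0f / ar);
  }
  // One box per (ratio, min_size) pair, plus one extra square per max_size.
  return static_cast<int>(ratios.size() * num_min_sizes + num_max_sizes);
}
```

For conv11 this is `count_priors(1, 0, {2.0f}, true)`, and for conv13 it is `count_priors(1, 1, {2.0f, 3.0f}, true)`.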

In prior_box_layer.cpp, aspect_ratios_ stores the aspect ratios taken from this layer's param. If flip is true, each aspect ratio in the param is stored as two values: the ratio itself and its reciprocal.

  aspect_ratios_.clear();
  aspect_ratios_.push_back(1.);
  flip_ = prior_box_param.flip();
  for (int i = 0; i < prior_box_param.aspect_ratio_size(); ++i) {
    float ar = prior_box_param.aspect_ratio(i);
    bool already_exist = false;
    // Check whether this ratio is already stored.
    for (int j = 0; j < aspect_ratios_.size(); ++j) {
      if (fabs(ar - aspect_ratios_[j]) < 1e-6) {
        already_exist = true;
        break;
      }
    }
    if (!already_exist) {
      aspect_ratios_.push_back(ar);
      // If flip is true, store the ratio and its reciprocal;
      // otherwise store only the ratio itself.
      if (flip_) {
        aspect_ratios_.push_back(1./ar);
      }
    }
  }

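For each point, the layer first computes the square anchor whose side is min_size. If max_size is set, it then computes the square whose side is sqrt(min_size_ * max_size_). Finally it walks through every ratio in aspect_ratios_ (skipping 1) and computes box_width = min_size_ * sqrt(ar) and box_height = min_size_ / sqrt(ar). Because each ratio in the prototxt param is stored together with its reciprocal when flip is true, one ratio produces two anchors.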

  for (int h = 0; h < layer_height; ++h) {
    for (int w = 0; w < layer_width; ++w) {
      float center_x = (w + offset_) * step_w;
      float center_y = (h + offset_) * step_h;
      float box_width, box_height;
      for (int s = 0; s < min_sizes_.size(); ++s) {
        int min_size_ = min_sizes_[s];
        // first prior: aspect_ratio = 1, size = min_size
        box_width = box_height = min_size_;
        // xmin
        top_data[idx++] = (center_x - box_width / 2.) / img_width;
        // ymin
        top_data[idx++] = (center_y - box_height / 2.) / img_height;
        // xmax
        top_data[idx++] = (center_x + box_width / 2.) / img_width;
        // ymax
        top_data[idx++] = (center_y + box_height / 2.) / img_height;

        if (max_sizes_.size() > 0) {
          CHECK_EQ(min_sizes_.size(), max_sizes_.size());
          int max_size_ = max_sizes_[s];
          // second prior: aspect_ratio = 1, size = sqrt(min_size * max_size)
          box_width = box_height = sqrt(min_size_ * max_size_);
          // xmin
          top_data[idx++] = (center_x - box_width / 2.) / img_width;
          // ymin
          top_data[idx++] = (center_y - box_height / 2.) / img_height;
          // xmax
          top_data[idx++] = (center_x + box_width / 2.) / img_width;
          // ymax
          top_data[idx++] = (center_y + box_height / 2.) / img_height;
        }

        // rest of priors
        for (int r = 0; r < aspect_ratios_.size(); ++r) {
          float ar = aspect_ratios_[r];
          if (fabs(ar - 1.) < 1e-6) {
            continue;
          }
          box_width = min_size_ * sqrt(ar);
          box_height = min_size_ / sqrt(ar);
          // xmin
          top_data[idx++] = (center_x - box_width / 2.) / img_width;
          // ymin
          top_data[idx++] = (center_y - box_height / 2.) / img_height;
          // xmax
          top_data[idx++] = (center_x + box_width / 2.) / img_width;
          // ymax
          top_data[idx++] = (center_y + box_height / 2.) / img_height;
        }
      }
    }
  }
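The loop body above can be condensed into a sketch that returns the priors of a single cell for a single min_size. `Box` and `priors_at` are illustrative names; `ratios` is assumed to be the already expanded list (1, ar, 1/ar, ...), and a non-positive `max_size` is this sketch's own convention for "not set".

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Box {
  float xmin, ymin, xmax, ymax;
};

// Hypothetical helper: priors centred at (cx, cy) in pixels, returned in
// coordinates normalised by the image size, in the same order as the layer:
// min_size square, optional sqrt(min*max) square, then the non-unit ratios.
std::vector<Box> priors_at(float cx, float cy, float min_size, float max_size,
                           const std::vector<float>& ratios,
                           float img_w, float img_h) {
  assert(img_w > 0 && img_h > 0);
  std::vector<Box> out;
  auto emit = [&](float bw, float bh) {
    out.push_back({(cx - bw / 2) / img_w, (cy - bh / 2) / img_h,
                   (cx + bw / 2) / img_w, (cy + bh / 2) / img_h});
  };
  emit(min_size, min_size);
  if (max_size > 0) {
    const float side = std::sqrt(min_size * max_size);
    emit(side, side);
  }
  for (float ar : ratios) {
    if (std::fabs(ar - 1.0f) < 1e-6f) continue;  // ratio 1 is already covered
    emit(min_size * std::sqrt(ar), min_size / std::sqrt(ar));
  }
  return out;
}
```

Calling it with min_size 60, no max_size and ratios {1, 2, 0.5} gives the three conv11 boxes; the ratio-2 box is wider than it is tall, and the ratio-0.5 box is the same shape rotated by 90 degrees.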

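3. Reshape shows that the output shape is (1, 2, layer_width * layer_height * num_priors_ * 4). The last dimension is the number of feature-map points times the number of anchors per point times the 4 coordinates of each anchor. For example, the first 4 values of the blob hold the 4 coordinates of the min_size square anchor at the first feature-map pixel, and, when max_size is set, the next 4 values hold the coordinates of the max_size anchor at that same pixel.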

void PriorBoxLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
  const int layer_width = bottom[0]->width();
  const int layer_height = bottom[0]->height();
  vector<int> top_shape(3, 1);
  // Since all images in a batch has same height and width, we only need to
  // generate one set of priors which can be shared across all images.
  top_shape[0] = 1;
  // 2 channels. First channel stores the mean of each prior coordinate.
  // Second channel stores the variance of each prior coordinate.
  top_shape[1] = 2;
  top_shape[2] = layer_width * layer_height * num_priors_ * 4;
  CHECK_GT(top_shape[2], 0);
  top[0]->Reshape(top_shape);
}
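Given that layout, the position of any coordinate inside the first channel follows from plain row-major indexing. A small sketch, with `prior_index` as a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: flat index, within channel 0, of coordinate `coord`
// (0 = xmin, 1 = ymin, 2 = xmax, 3 = ymax) of anchor `prior` at cell (h, w).
std::size_t prior_index(int h, int w, int prior, int coord,
                        int layer_width, int num_priors) {
  assert(coord >= 0 && coord < 4);
  assert(prior >= 0 && prior < num_priors);
  assert(w >= 0 && w < layer_width && h >= 0);
  const std::size_t cell = static_cast<std::size_t>(h) * layer_width + w;
  return (cell * num_priors + prior) * 4 + coord;
}
```

For a 19x19 map with 3 anchors per cell (sizes chosen only for illustration), the second anchor of the first cell starts at index 4 and the last coordinate of the last cell sits at index 19 * 19 * 3 * 4 - 1 = 4331.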

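Note that the output has 2 channels. The first channel stores the actual 4 coordinates of every anchor; the second stores the variances. variance_ is initialized in LayerSetUp with 4 values taken from the prototxt param, one for each of the 4 coordinates. Every anchor gets the same 4 variance values, so the second channel is simply those 4 values repeated once per anchor.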

  top_data += top[0]->offset(0, 1);
  if (variance_.size() == 1) {
    caffe_set<Dtype>(dim, Dtype(variance_[0]), top_data);
  } else {
    int count = 0;
    for (int h = 0; h < layer_height; ++h) {
      for (int w = 0; w < layer_width; ++w) {
        for (int i = 0; i < num_priors_; ++i) {
          for (int j = 0; j < 4; ++j) {
            top_data[count] = variance_[j];
            ++count;
          }
        }
      }
    }
  }
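The two-channel layout can be mimicked with a flat vector. The sketch below is not the Caffe code; `pack_prior_blob` is a hypothetical helper that copies the coordinates into the first half and repeats the 4 variances over the second half.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a buffer of shape (1, 2, dim), flattened.
// Channel 0 holds the anchor coordinates; channel 1 repeats the 4 variances
// once per anchor.
std::vector<float> pack_prior_blob(const std::vector<float>& coords,
                                   const std::array<float, 4>& variance) {
  assert(coords.size() % 4 == 0);
  const std::size_t dim = coords.size();
  std::vector<float> blob(2 * dim);
  for (std::size_t i = 0; i < dim; ++i) {
    blob[i] = coords[i];              // channel 0: xmin, ymin, xmax, ymax
    blob[dim + i] = variance[i % 4];  // channel 1: variances, period 4
  }
  return blob;
}
```

With the prototxt values 0.1, 0.1, 0.2, 0.2, the second half of the buffer reads 0.1, 0.1, 0.2, 0.2, 0.1, 0.1, 0.2, 0.2 and so on.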

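4. http://www.360doc.com/content/17/0810/10/10408243_678091430.shtml

Both snippets below come from the DecodeBBox function in bbox_util.cpp. The prior_variance output by the prior_box layer is just a coefficient that multiplies the bounding-box regression output. In Faster R-CNN the regression is applied to the anchor coordinates without such a coefficient, whereas SSD can scale the regression first. In the first snippet the scaled offset is added directly to the prior's corner; in the second it is additionally multiplied by the prior's width or height. DecodeBBox can also work the Faster R-CNN way, without the variance factor; this is controlled by a parameter.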

else {
      // variance is encoded in bbox, we need to scale the offset accordingly.
      decode_bbox->set_xmin(
          prior_bbox.xmin() + prior_variance[0] * bbox.xmin());
      decode_bbox->set_ymin(
          prior_bbox.ymin() + prior_variance[1] * bbox.ymin());
      decode_bbox->set_xmax(
          prior_bbox.xmax() + prior_variance[2] * bbox.xmax());
      decode_bbox->set_ymax(
          prior_bbox.ymax() + prior_variance[3] * bbox.ymax());
}

else {
      // variance is encoded in bbox, we need to scale the offset accordingly.
      decode_bbox->set_xmin(
          prior_bbox.xmin() + prior_variance[0] * bbox.xmin() * prior_width);
      decode_bbox->set_ymin(
          prior_bbox.ymin() + prior_variance[1] * bbox.ymin() * prior_height);
      decode_bbox->set_xmax(
          prior_bbox.xmax() + prior_variance[2] * bbox.xmax() * prior_width);
      decode_bbox->set_ymax(
          prior_bbox.ymax() + prior_variance[3] * bbox.ymax() * prior_height);
}
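Both variants fit in one small standalone function. This is only a sketch of the arithmetic shown in the two snippets: `Box`, `decode_corner` and the `scale_by_prior_size` flag are names invented here and do not appear in bbox_util.cpp.

```cpp
#include <array>
#include <cassert>

struct Box {
  float xmin, ymin, xmax, ymax;
};

// Hypothetical helper: adds the variance-scaled regression offsets to the
// prior's corners. With scale_by_prior_size the offsets are also multiplied
// by the prior's width (x terms) or height (y terms), as in the second snippet.
Box decode_corner(const Box& prior, const std::array<float, 4>& variance,
                  const Box& offset, bool scale_by_prior_size) {
  const float prior_width = prior.xmax - prior.xmin;
  const float prior_height = prior.ymax - prior.ymin;
  assert(prior_width > 0 && prior_height > 0);
  const float sx = scale_by_prior_size ? prior_width : 1.0f;
  const float sy = scale_by_prior_size ? prior_height : 1.0f;
  return {prior.xmin + variance[0] * offset.xmin * sx,
          prior.ymin + variance[1] * offset.ymin * sy,
          prior.xmax + variance[2] * offset.xmax * sx,
          prior.ymax + variance[3] * offset.ymax * sy};
}
```

For a prior of (0.2, 0.2, 0.6, 0.6), variances 0.1, 0.1, 0.2, 0.2 and a regression output of 1 on every coordinate, the first variant moves xmin to about 0.3 and the second to about 0.24, since the prior is 0.4 wide.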

5. https://zhuanlan.zhihu.com/p/33544892 explains how min_size is determined for each layer's priors.

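For the later feature maps, the prior box scale increases linearly according to the formula in that article, but the scale ratios are first multiplied by 100. The growth step is then $\lfloor \frac{\lfloor s_{max}\times 100\rfloor - \lfloor s_{min}\times 100\rfloor}{m-1}\rfloor=17$, so $s_k$ for these feature maps is 20, 37, 54, 71, 88. Dividing these ratios by 100 and multiplying by the image size gives the scales 60, 111, 162, 213, 264 for those feature maps. This way of computing them follows the SSD Caffe source code. Altogether, the prior box scales of the feature maps are 30, 60, 111, 162, 213, 264.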
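The arithmetic can be reproduced in a few lines. This sketch assumes a 300-pixel input, ratios of 20 and 90 (that is, $s_{min}=0.2$ and $s_{max}=0.9$ after multiplying by 100), six feature maps, and a separate 10% ratio for the first map; the 10% figure is inferred here from the quoted value 30 and is not stated in the text. `prior_scales` is a hypothetical helper name.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: integer scale ladder as described above.
// The first layer uses its own ratio; the remaining layers step from
// min_ratio towards max_ratio in floor((max - min) / (num_layers - 2)) steps.
std::vector<int> prior_scales(int img_size, int min_ratio, int max_ratio,
                              int first_ratio, int num_layers) {
  assert(num_layers > 2);
  // Integer division floors here because both operands are non-negative.
  const int step = (max_ratio - min_ratio) / (num_layers - 2);
  assert(step > 0);
  std::vector<int> scales{img_size * first_ratio / 100};
  for (int ratio = min_ratio; ratio <= max_ratio; ratio += step) {
    scales.push_back(img_size * ratio / 100);
  }
  return scales;
}
```

`prior_scales(300, 20, 90, 10, 6)` steps through the ratios 20, 37, 54, 71, 88 and yields 30, 60, 111, 162, 213, 264, matching the list above.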
