Paper: SSD - Single Shot MultiBox Detector

This article is a repost. 2017-03-21 21:47. Tags: ComputerVision / DeepLearning

SSD: Single Shot MultiBox Detector

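The contributions of SSD are:

1. It is faster and more accurate than YOLO, the previous state-of-the-art single shot detector.

2. The core of SSD is applying small convolutional filters to feature maps to predict category scores and box offsets for a fixed set of default bounding boxes.

3. To achieve high detection accuracy, predictions are made on feature maps at different scales.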

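Network architecture:

The key features of SSD are:

(1) Multi-scale feature maps are used for detection.

(2) Convolutional predictors: no FC layer is used, so the network can in principle handle images of various sizes. For an N x M x P feature map, each prediction is produced by a small 3 x 3 x P kernel.

(3) Filters act as both classifier and regressor: at each location of the feature map, they decide whether an object is present and predict the offsets of the bbox relative to the ground truth. Because the filters combine classification and regression, the output vector at each location has dimension (c + 4) x k, where c is the number of categories, 4 is the number of bbox offsets relative to the ground truth, and k is the number of bboxes with different aspect ratios and scales.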
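The (c + 4) x k bookkeeping in point (3) can be checked with a few lines of arithmetic. This is only a sketch: the function name and the example numbers (a 5 x 5 map, 21 classes, 6 boxes per location) are made up for illustration and are not taken from the notes above.

```python
def predictor_output_size(rows, cols, num_classes, boxes_per_location):
    """Count the outputs of a convolutional predictor on one feature map.

    Each location carries boxes_per_location bboxes, and each bbox needs
    num_classes category scores plus 4 offsets.
    """
    # Number of 3x3 filters, i.e. the length of the output vector per location.
    filters = (num_classes + 4) * boxes_per_location
    # Every one of the rows * cols locations produces that many values.
    total_outputs = rows * cols * filters
    return filters, total_outputs


# Hypothetical example: 5x5 feature map, c = 21, k = 6.
filters, total = predictor_output_size(5, 5, 21, 6)
print(filters, total)  # 150 3750
```

Because the count depends only on the feature map size, the same filters can be slid over a map of any N x M, which is the point made in (2).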

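Training:

Training involves choosing the set of default boxes and the detection scales, hard negative mining, and data augmentation. Hard negative mining is needed because the vast majority of the resulting bboxes are negatives. To keep the classes balanced, the negatives are sorted by confidence loss and only those with the highest loss are kept, so that the negative-to-positive ratio is at most 3:1.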
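The selection step of hard negative mining might look roughly like the sketch below. It assumes the per-box confidence losses and the positive/negative labels are already available as plain lists; the function and variable names are hypothetical, and a real training pipeline would do this on tensors per image.

```python
def hard_negative_mining(conf_losses, is_positive, neg_pos_ratio=3):
    """Pick which boxes contribute to the confidence loss.

    conf_losses: confidence loss of each bbox.
    is_positive: True where the bbox was matched to a ground truth box.
    Returns (positive indices, kept negative indices).
    """
    positives = [i for i, pos in enumerate(is_positive) if pos]
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    # Hardest negatives first: highest confidence loss at the front.
    negatives.sort(key=lambda i: conf_losses[i], reverse=True)
    # Keep at most neg_pos_ratio negatives per positive.
    max_negatives = neg_pos_ratio * len(positives)
    return positives, negatives[:max_negatives]


# Toy example: one positive and six negatives.
losses = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2, 0.05]
matched = [True, False, False, False, False, False, False]
pos, neg = hard_negative_mining(losses, matched)
print(pos, neg)  # [0] [2, 4, 3]
```

With one positive, only the three negatives with the highest loss survive, which gives the 3:1 cap described above.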

For matching, the jaccard overlap is used to measure how well each candidate bounding box fits the ground truth.
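The jaccard overlap is the intersection area of two boxes divided by the area of their union. A minimal sketch follows, assuming boxes are given as (xmin, ymin, xmax, ymax) tuples; that box format is an assumption made here, not something the notes specify.

```python
def jaccard_overlap(box_a, box_b):
    """Intersection over union of two boxes in (xmin, ymin, xmax, ymax) form."""
    # Width and height of the intersection, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # Degenerate boxes with zero area are treated as having no overlap.
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes sharing a 1x1 corner: intersection 1, union 7.
print(jaccard_overlap((0, 0, 2, 2), (1, 1, 3, 3)))
```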

The training loss is a weighted sum of the localization loss and the confidence loss.
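For reference, the weighted sum can be written in the form used in the SSD paper. The symbols are not defined in these notes, so treat them as assumptions taken from the paper: N is the number of matched default boxes, x the matching indicators, c the class confidences, l the predicted boxes, g the ground truth boxes, and \alpha the weight on the localization term.

```latex
L(x, c, l, g) = \frac{1}{N} \left( L_{conf}(x, c) + \alpha \, L_{loc}(x, l, g) \right)
```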

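Choosing aspect ratios and scales:

Suppose we want to use m feature maps for prediction. The scale of the bboxes on each feature map is spaced evenly between s_min and s_max.

Here s_min is 0.2, meaning the lowest layer has a scale of 0.2, and s_max is 0.9.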
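In the SSD paper the scale of the k-th feature map is s_k = s_min + (s_max - s_min) / (m - 1) * (k - 1) for k in [1, m]. The sketch below evaluates that expression; the function name is made up, and the handling of m = 1 is an assumption added here because the formula is undefined in that case.

```python
def default_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """Scales s_1 .. s_m, evenly spaced from s_min to s_max."""
    if num_maps == 1:
        # Assumption: with a single map, fall back to s_min.
        return [s_min]
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * (k - 1) for k in range(1, num_maps + 1)]


# Example with m = 6 feature maps.
print([round(s, 3) for s in default_box_scales(6)])
# [0.2, 0.34, 0.48, 0.62, 0.76, 0.9]
```

The lowest layer gets 0.2 and the highest gets 0.9, with the layers in between stepping by 0.14.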

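Summary:

(1) Using filters for both classification and regression is a novel idea. It avoids FC layers, so the input image can in principle have any shape.

(2) Detection is multi-scale, and the shallower feature maps help with smaller objects.

(3) Detection uses filters at multiple levels, which raises a question. Backpropagation from the different scales affects the network differently: gradients from the shallow scales have a larger effect on the backbone CNN, while gradients from the later scales have a smaller effect on it. Does this imbalance matter?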
