Pan He_ICCV2017_Single Shot Text Detector With Regional Attention

Authors and Code

Caffe code

Keywords

Text detection, multi-oriented, SSD, $(x, y, w, h, \theta)$, one-stage, open source

Method Highlights

An attention mechanism strengthens text features: Text Attentional Module (TAM)
Inception is introduced to make the detector more robust to text size: Hierarchical Inception Module (HIM)

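Method Overview

The method improves on SSD by adding angle information to each box so that it can detect multi-oriented text. It relies mainly on an attention mechanism and Inception modules to make the text features more robust.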
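To make the $(x, y, w, h, \theta)$ representation concrete, here is a minimal sketch that turns one such box into its four corner points. The conventions are assumptions made for illustration and are not taken from the paper: $(x, y)$ is the box centre, $\theta$ is in radians, and rotation follows the standard 2D rotation matrix. The function name `rotated_box_corners` is hypothetical.

```python
import math


def rotated_box_corners(cx, cy, w, h, theta):
    """Return the four corners of a box centred at (cx, cy), rotated by theta radians.

    Assumed convention (not from the paper): (cx, cy) is the centre and
    theta is applied with the standard 2D rotation matrix.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the unrotated box, relative to its centre.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset, then translate it back to the centre.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]


if __name__ == "__main__":
    for corner in rotated_box_corners(50.0, 20.0, 40.0, 10.0, math.radians(15)):
        print(corner)
```

With $\theta = 0$ this reduces to an ordinary axis-aligned SSD box, which is why the angle can be treated as one extra regression target.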

Method Details

Network Architecture

The method modifies SSD's feature fusion layers. It adds a Text Attentional Module and a Hierarchical Inception Module, and fuses features through AIFs.

Aggregated Inception Features (AIFs)

Text Attentional Module

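The idea behind attention is that the original features may describe the whole image globally. By strengthening the features of the text regions (extra supervision is added so that features in text regions are weighted up), the text features become more prominent, which helps both the classification and the regression task. Put simply, the detector previously had to look at the whole image to make a decision; now it only needs to look harder at the text regions.

Judging from the results, attention has two benefits: stronger robustness to noise, and better handling of touching (adjacent) text.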

Figure 3: Text attention module. It computes a text attention map from Aggregated Inception Features (AIFs). The attention map indicates rough text regions and is further encoded into the AIFs. The attention module is trained by using a pixel-wise binary mask of text.
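The weighting step can be sketched in a few lines. This is a simplified illustration, not the authors' Caffe implementation: it assumes the module produces a per-pixel text score and background score, converts them to a text probability with a two-class softmax, and multiplies every feature channel by that probability. The function name `text_attention` and the nested-list layout are hypothetical.

```python
import math


def text_attention(features, text_logits, background_logits):
    """Weight each feature-map location by its softmax text probability.

    features: list of channels, each an H x W nested list of floats.
    text_logits, background_logits: H x W nested lists of per-pixel scores.
    Returns (attention_map, weighted_features).
    """
    height, width = len(text_logits), len(text_logits[0])
    attention = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Two-class softmax, written as a sigmoid of the logit difference.
            diff = background_logits[i][j] - text_logits[i][j]
            attention[i][j] = 1.0 / (1.0 + math.exp(diff))
    # Element-wise product of every channel with the attention map.
    weighted = [
        [[channel[i][j] * attention[i][j] for j in range(width)] for i in range(height)]
        for channel in features
    ]
    return attention, weighted


if __name__ == "__main__":
    feats = [[[2.0, 4.0]]]  # one channel, 1 x 2 map
    att, out = text_attention(feats, [[0.0, 10.0]], [[0.0, -10.0]])
    print(att, out)
```

Locations that look like background get a weight near 0 and are suppressed, while likely text locations keep their features almost unchanged. During training, the attention map would be supervised with the pixel-wise binary text mask mentioned in the caption.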

Figure 4: We compare detection results of the baseline model and the model with our text attention module (TAM), which enables the detector with stronger capability for identifying extremely challenging text with a higher word-level accuracy.

Hierarchical Inception Module

Inception fuses features computed with several different receptive fields, which makes the detector more robust to text size.

Figure 5: Inception module. The convolutional maps are processed through four different convolutional operations, with Dilated convolutions [34] applied.
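Why parallel branches give different receptive fields can be shown with the standard formula for a dilated convolution: a kernel of size $k$ with dilation $d$ covers $k + (k - 1)(d - 1)$ input positions per side. The branch configurations below are hypothetical examples chosen for illustration; the notes do not list the exact kernel sizes and dilation rates used in the paper.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by one dilated convolution kernel (per side)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Hypothetical branch settings (kernel_size, dilation), not taken from the paper.
EXAMPLE_BRANCHES = [(1, 1), (3, 1), (3, 2), (5, 2)]

if __name__ == "__main__":
    for k, d in EXAMPLE_BRANCHES:
        size = effective_kernel_size(k, d)
        print(f"{k}x{k} kernel, dilation {d}: covers {size}x{size}")
```

For instance, a 3x3 kernel with dilation 2 covers a 5x5 window with only nine weights. Concatenating branches like these lets a single layer respond to both small and large text.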

Figure 6: Comparisons of baseline model and Hierarchical Inception Module (HIM) model. The HIM allows the detector to handle extremely challenging text, and also improves word-level detection accuracy.

Other Details

The aspect ratios of the default boxes are changed from 1, 2, 3, 5, 7 to 1, 2, 3, 5, $\frac{1}{2}$, $\frac{1}{3}$, $\frac{1}{5}$.
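As a rough illustration of what these ratios mean, the sketch below computes default box shapes using the usual SSD convention, where a box with scale $s$ and aspect ratio $a$ has width $s\sqrt{a}$ and height $s/\sqrt{a}$. The scale value in the example is an arbitrary assumption, and `default_box_shapes` is a hypothetical helper name.

```python
import math

# Aspect ratios (width / height) listed in the notes above.
ASPECT_RATIOS = [1, 2, 3, 5, 1 / 2, 1 / 3, 1 / 5]


def default_box_shapes(scale, aspect_ratios=ASPECT_RATIOS):
    """Return (width, height) per aspect ratio, following the SSD convention.

    Every box keeps the same area scale**2; only its shape changes.
    """
    return [(scale * math.sqrt(ar), scale / math.sqrt(ar)) for ar in aspect_ratios]


if __name__ == "__main__":
    # scale=0.2 is an arbitrary example value, not one from the paper.
    for ar, (w, h) in zip(ASPECT_RATIOS, default_box_shapes(0.2)):
        print(f"ratio {ar:.3f}: w={w:.3f}, h={h:.3f}")
```

Ratios above 1 give wide boxes that suit horizontal text lines, and the reciprocal ratios give tall boxes that suit vertical or steeply rotated text.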

Experimental Results

Effect of TAM (+3), HIM (+2), and TAM+HIM (+5), validated on the ICDAR13 dataset

ICDAR2013 and ICDAR2015

COCO-Text
Speed
- TITAN X, Caffe, 0.13 s/image

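Summary and Takeaways

This paper mainly modifies the network model, adding attention and Inception modules to make the features more robust. The same idea can be applied to the feature fusion layers of any other object detection framework.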
