The bag-of-words vector \(v_t\) of an image
Assume the vocabulary contains \(W\) words in total; then every image can be represented by a \(W\)-dimensional vector
\((t_1, t_2, t_3, \ldots, t_W)\), where \(t_i = \frac{n_{id}}{n_d}\log\frac{N}{n_i}\).
Here \(n_{id}\) is the number of times word \(i\) appears in the current frame, \(n_d\) is the total number of words in the current image, \(n_i\) is the number of times word \(i\) appears over the whole database, and \(N\) is the total number of descriptors in all images; \(\frac{n_{id}}{n_d}\) is the \(tf\) term and \(\log\frac{N}{n_i}\) is the \(idf\) term, both already available once the visual bag of words has been built.
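A minimal C++ sketch of this weighting (hypothetical container names, not DBoW2's actual classes):
#include <cmath>
#include <map>

// tf-idf weights for one image. Inputs (all assumed precomputed elsewhere):
//   wordCounts   : n_id, occurrences of word i in the current image
//   n_d          : total number of words in the current image
//   dbWordCounts : n_i, occurrences of word i over the whole database
//   N            : total number of descriptors over all images
std::map<int, double> computeTfIdf(const std::map<int, int>& wordCounts, int n_d,
                                   const std::map<int, int>& dbWordCounts, int N)
{
    std::map<int, double> v; // sparse BoW vector: word id -> weight t_i
    for (std::map<int, int>::const_iterator it = wordCounts.begin();
         it != wordCounts.end(); ++it)
    {
        // assume every word seen in an image is also counted in the database
        std::map<int, int>::const_iterator db = dbWordCounts.find(it->first);
        double tf  = double(it->second) / n_d;          // n_id / n_d
        double idf = std::log(double(N) / db->second);  // log(N / n_i)
        v[it->first] = tf * idf;
    }
    return v;
}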
Inverse index: for every word in the vocabulary, the list of images in which that word appears; this speeds up finding images that share words with the query. (What data structure implements it?) For each word it stores a list of pairs \(<I_t, v_t^i>\) (where \(I_t\) is the image index and \(v_t^i\) is the weight of that word in the image). When querying the database, only images that share some word need to be compared, which accelerates the search; in other words, image retrieval only needs (1) the bag of words and (2) the inverse index.
The concrete flow is: extract the descriptors of the current frame, query the vocabulary to obtain the words, then look up the inverse-index table to get all images that contain those words.
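As for the data-structure question above: in DBoW2 the inverse index is essentially an array indexed by word id whose entries are lists of \(<I_t, v_t^i>\) pairs. A minimal sketch of that idea (my own names, not DBoW2's API):
#include <list>
#include <set>
#include <utility>
#include <vector>

// One inverted-file entry: the image an occurrence of the word belongs to,
// and the word's weight in that image.
struct InvertedFileEntry {
    unsigned imageId; // I_t
    double weight;    // v_t^i
};

// One row per word id; size the index to the vocabulary size W up front.
typedef std::vector<std::list<InvertedFileEntry> > InvertedIndex;

// Adding an image: for every word of its BoW vector, append <I_t, v_t^i>
void addImage(InvertedIndex& index, unsigned imageId,
              const std::vector<std::pair<unsigned, double> >& bow) // (word id, weight)
{
    for (size_t i = 0; i < bow.size(); ++i) {
        InvertedFileEntry e = { imageId, bow[i].second };
        index[bow[i].first].push_back(e);
    }
}

// Querying: only images sharing at least one word with the query are candidates
std::set<unsigned> queryCandidates(const InvertedIndex& index,
                                   const std::vector<std::pair<unsigned, double> >& bow)
{
    std::set<unsigned> candidates;
    for (size_t i = 0; i < bow.size(); ++i) {
        const std::list<InvertedFileEntry>& row = index[bow[i].first];
        for (std::list<InvertedFileEntry>::const_iterator it = row.begin();
             it != row.end(); ++it)
            candidates.insert(it->imageId);
    }
    return candidates;
}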
Direct index (stores the features of each image): for every image \(I_t\), store the ancestor nodes (at some level \(l\)) of the words it uses, together with the local features \(f_{tj}\) associated with each node.
The direct index speeds up the geometric verification stage of loop closure detection, because only keyframes that share some word, or that share an ancestor at level \(l\), need to be geometrically verified.
The direct index stores, for every image \(I_t\), the nodes at a pre-specified level \(l\) that its words belong to, as well as all the descriptors of that image that fall under each of those nodes.
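A minimal sketch of such a direct index (hypothetical types; DBoW2 keeps this inside its database class):
#include <map>
#include <vector>

// For each image: the vocabulary-tree nodes at level l that its words descend
// from, mapped to the indices of the image's descriptors under each node.
typedef std::map<unsigned, std::vector<unsigned> > DirectFile; // node id -> feature indices
typedef std::vector<DirectFile> DirectIndex;                   // indexed by image id

// When adding image I_t: for each descriptor j, walk the vocabulary tree down
// to level l (abstracted here as the precomputed array nodeAtLevelL) and
// record j under that node.
void addToDirectIndex(DirectIndex& index, unsigned imageId,
                      const std::vector<unsigned>& nodeAtLevelL) // one node per descriptor
{
    if (index.size() <= imageId)
        index.resize(imageId + 1);
    for (unsigned j = 0; j < nodeAtLevelL.size(); ++j)
        index[imageId][nodeAtLevelL[j]].push_back(j);
}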
The role of DBoW2: it converts an image into a sparse numerical vector via visual words (which makes processing large numbers of images feasible).
The visual vocabulary is built offline by partitioning the descriptor space into \(W\) visual words.
The code is as follows:
#include <iostream>
#include <vector>
#include <string>
// DBoW2, as bundled in ORB_SLAM's Thirdparty directory
#include "Thirdparty/DBoW2/DBoW2/FORB.h"
#include "Thirdparty/DBoW2/DBoW2/TemplatedVocabulary.h"
// OpenCV
#include <opencv2/opencv.hpp>
// ORB_SLAM feature extractor
#include "ORBextractor.h"
using namespace DBoW2;
using namespace std;
using namespace ORB_SLAM;
// ----------------------------------------------------------------------------
/// ORB vocabulary: TemplatedVocabulary instantiated with the ORB descriptor
/// type and the FORB descriptor-manipulation class
typedef DBoW2::TemplatedVocabulary<DBoW2::FORB::TDescriptor, DBoW2::FORB>
  ORBVocabulary;
// ----------------------------------------------------------------------------
void extractORBFeatures(cv::Mat &image, vector<vector<cv::Mat> > &features, ORBextractor* extractor);
void changeStructureORB(const cv::Mat &descriptor, vector<bool> &mask, vector<cv::Mat> &out);
void isInImage(vector<cv::KeyPoint> &keys, float &cx, float &cy, float &rMin, float &rMax, vector<bool> &mask);
void createVocabularyFile(ORBVocabulary &voc, std::string &fileName, const vector<vector<cv::Mat> > &features);
// ----------------------------------------------------------------------------
int main()
{
    // Collect the image files to train on
    vector<std::string> filenames;
    std::string folder = "/home/saodiseng/FRONTAL/";
    cv::glob(folder, filenames);

    // Initialize the ORB extractor; nLevels also serves as the depth L of the
    // vocabulary tree below
    int nLevels = 5;
    ORBextractor* extractor = new ORBextractor(1000, 1.2, nLevels, 1, 20);

    int nImages = filenames.size();
    vector<vector<cv::Mat> > features;
    features.reserve(nImages);

    cv::Mat image;
    cout << "> Extracting Features from " << nImages << " images..." << endl;
    for(int i = 0; i < nImages; ++i)
    {
        std::cout << "Processing the " << i << " image" << std::endl;
        cv::Mat src = cv::imread(filenames[i]);
        if (src.empty())
            continue;
        cv::imshow("View", src);
        cv::waitKey(1);
        // cv::imread loads BGR images, so convert BGR (not RGB) to grayscale
        cv::cvtColor(src, image, CV_BGR2GRAY);
        extractORBFeatures(image, features, extractor);
    }
    cout << "... Extraction done!" << endl;

    // Create the vocabulary: k^L words, tf-idf weights, L1 scoring
    const int k = 10; // branching factor
    const WeightingType weight = TF_IDF;
    const ScoringType score = L1_NORM;
    ORBVocabulary voc(k, nLevels, weight, score);

    std::string vociName = "vociOmni.txt";
    createVocabularyFile(voc, vociName, features);

    cout << "--- THE END ---" << endl;
    delete extractor;
    return 0;
}
// ----------------------------------------------------------------------------
void extractORBFeatures(cv::Mat &image, vector<vector<cv::Mat> > &features, ORBextractor* extractor) {
    vector<cv::KeyPoint> keypoints;
    cv::Mat descriptorORB;

    // Extract ORB keypoints and descriptors (one descriptor per row)
    (*extractor)(image, cv::Mat(), keypoints, descriptorORB);

    // Reject features outside the region of interest; cx, cy, rMin, rMax are
    // currently unused, isInImage only applies a fixed image border
    vector<bool> mask;
    float cx = 0; float cy = 0;
    float rMin = 0; float rMax = 0;
    isInImage(keypoints, cx, cy, rMin, rMax, mask);

    // Append the surviving descriptors as this image's training entry
    features.push_back(vector<cv::Mat>());
    changeStructureORB(descriptorORB, mask, features.back());
}
// ----------------------------------------------------------------------------
void changeStructureORB(const cv::Mat &descriptor, vector<bool> &mask, vector<cv::Mat> &out) {
    // Copy every descriptor row that survived the masking into the output vector
    for (int i = 0; i < descriptor.rows; i++) {
        if(mask[i]) {
            out.push_back(descriptor.row(i));
        }
    }
}
// ----------------------------------------------------------------------------
void isInImage(vector<cv::KeyPoint> &keys, float &cx, float &cy, float &rMin, float &rMax, vector<bool> &mask) {
    int N = keys.size();
    mask = vector<bool>(N, false);

    // Keep only keypoints at least 20 px inside the border of a 320x240 image
    int num = 0;
    for(int i = 0; i < N; i++) {
        cv::KeyPoint kp = keys[i];
        float u = kp.pt.x;
        float v = kp.pt.y;
        if(u > 20 && u < 320-20 && v > 20 && v < 240-20)
        {
            mask[i] = true;
            num++;
        }
    }
    std::cout << "In image number " << num << std::endl;
}
// ----------------------------------------------------------------------------
void createVocabularyFile(ORBVocabulary &voc, std::string &fileName, const vector<vector<cv::Mat> > &features)
{
cout << "> Creating vocabulary. May take some time ..." << endl;
voc.create(features);
cout << "... done!" << endl;
cout << "> Vocabulary information: " << endl
<< voc << endl << endl;
// save the vocabulary to disk
cout << endl << "> Saving vocabulary..." << endl;
voc.saveToTextFile(fileName);
cout << "... saved to file: " << fileName << endl;
}
Loop closure detection based on DBoW2
A. Querying the database
The database stores images so that all those similar to a query can be retrieved. The steps: first convert the query image into its bag-of-words vector \(v_t\) (tf-idf, the formula at the beginning); then search the database for the most similar bag-of-words vectors, yielding scores \(s(v_t, v_{t_j})\) (how many are kept??); normalize these (the case where \(s(v_t, v_{t-\Delta t})\) is very small is treated separately) as \(\eta(v_t, v_{t_j}) = \frac{s(v_t, v_{t_j})}{s(v_t, v_{t-\Delta t})}\); finally discard matches whose normalized score falls below a threshold.
With the L1 scoring used above, the similarity between two bag-of-words vectors \(v_1\) and \(v_2\) (two frames) is computed as \(s(v_1, v_2) = 1 - \frac{1}{2}\left\|\frac{v_1}{\|v_1\|_1} - \frac{v_2}{\|v_2\|_1}\right\|_1\).
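A minimal sketch of this score on dense \(W\)-dimensional vectors (DBoW2 itself stores sparse vectors and iterates only over shared words, but the arithmetic is the same):
#include <cmath>
#include <vector>

// L1 similarity between two bag-of-words vectors:
// s(v1, v2) = 1 - 0.5 * || v1/|v1| - v2/|v2| ||_1
double l1Score(const std::vector<double>& v1, const std::vector<double>& v2)
{
    double n1 = 0, n2 = 0; // L1 norms of each vector
    for (size_t i = 0; i < v1.size(); ++i) { n1 += std::fabs(v1[i]); n2 += std::fabs(v2[i]); }
    double diff = 0;
    for (size_t i = 0; i < v1.size(); ++i)
        diff += std::fabs(v1[i] / n1 - v2[i] / n2);
    return 1.0 - 0.5 * diff; // 1 = identical distributions, 0 = nothing in common
}
The normalized score \(\eta\) from the previous step is then just this score divided by the score against the previous frame, \(s(v_t, v_{t-\Delta t})\).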
B. Grouping matches
To prevent keyframes that are close in time from competing against each other, retrieved frames whose timestamps differ only slightly are grouped into an island and treated as a single match, so a series of matches \(<v_t, v_{t_{n_i}}>, \ldots, <v_t, v_{t_{m_i}}>\) becomes one match \(<v_t, V_{T_i}>\). Islands are also ranked by score and the highest-scoring one is selected. The score of an island is \(H(v_t, V_{T_i}) = \sum_{j=n_i}^{m_i}\eta(v_t, v_{t_j})\).
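A minimal sketch of this grouping and of the island score \(H\) (hypothetical struct; assumes the matches are sorted by timestamp and uses a hand-tuned gap threshold):
#include <vector>

// One retrieved match <v_t, v_tj>: the timestamp of the matched frame and
// its normalized score eta(v_t, v_tj)
struct Match { double timestamp; double eta; };

// Group matches whose consecutive timestamps differ by less than maxGap into
// islands and return the score H of the best island (the sum of its etas).
double bestIslandScore(const std::vector<Match>& matches, double maxGap)
{
    double best = 0;
    for (size_t i = 0; i < matches.size(); ) {
        double H = matches[i].eta;
        size_t j = i + 1;
        while (j < matches.size() && matches[j].timestamp - matches[j-1].timestamp < maxGap)
            H += matches[j++].eta;  // extend the current island
        if (H > best) best = H;     // islands compete through their summed score
        i = j;                      // the next match starts a new island
    }
    return best;
}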
C. Temporal consistency
Check the temporal consistency of \(V_{T_i}\) against the previous query results \(<v_{t-\Delta t}, V_{T_j}>\): \(<v_t, V_{T_i}>\) must be consistent with the \(k\) previous matching queries \(<v_{t-\Delta t}, V_{T_j}>\), i.e. the time intervals of the islands of the \(k\) previous matches must closely overlap \(T_i\). Once the consistency check passes, the entry \(v_{t'}\) with the highest score inside island \(V_{T_i}\) is selected.
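A minimal sketch of this check, representing each island's time span as an interval (hypothetical types; the slack tolerance is my own addition):
#include <vector>

// Time interval T_i = [tmin, tmax] covered by an island
struct Interval { double tmin, tmax; };

// The current island's interval must overlap (up to some slack) the island
// interval of each of the k previous queries for the match to be accepted.
bool temporallyConsistent(const Interval& current,
                          const std::vector<Interval>& previous, // last k results
                          double slack)
{
    for (size_t i = 0; i < previous.size(); ++i) {
        bool overlaps = current.tmin <= previous[i].tmax + slack &&
                        previous[i].tmin <= current.tmax + slack;
        if (!overlaps)
            return false;
    }
    return true;
}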
D. Efficient geometric consistency check
Use RANSAC to estimate the fundamental matrix between \(I_t\) and \(I_{t'}\) (requiring at least 12 corresponding points); the corresponding feature points can be found by brute force or with a k-d tree.
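A sketch of this verification with OpenCV's findFundamentalMat in RANSAC mode (the 12-point requirement comes from the text; the reprojection threshold and confidence values are my own choices):
#include <vector>
#include <opencv2/opencv.hpp>

// Verify a loop candidate geometrically: pts1/pts2 are the locations of the
// matched keypoints in I_t and I_t'; RANSAC rejects wrong correspondences.
bool geometricCheck(const std::vector<cv::Point2f>& pts1,
                    const std::vector<cv::Point2f>& pts2)
{
    if (pts1.size() < 8)
        return false; // too few points to estimate F at all
    std::vector<uchar> inlierMask;
    cv::Mat F = cv::findFundamentalMat(pts1, pts2, cv::FM_RANSAC,
                                       3.0 /* px */, 0.99, inlierMask);
    if (F.empty())
        return false;
    // Accept the loop only with at least 12 supporting correspondences
    return cv::countNonZero(inlierMask) >= 12;
}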
Use the direct index for approximate nearest-neighbor search: only descriptors that fall under the same node at level \(l\) of the vocabulary tree are compared, where \(l\) is fixed in advance as a trade-off between speed and recall. Concretely (see the sketch after this list):
(1) when adding an image to the database, store node-feature pairs in the direct index;
(2) when computing correspondences between two images, look up the direct index and compare only descriptors that belong to the same node at level \(l\). This speeds up the computation of correspondences; \(l\) is fixed beforehand and trades the number of correspondences obtained against the time the operation takes.
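A minimal sketch of step (2), reusing the DirectFile type from the direct-index sketch above (hypothetical helpers; DBoW2 itself exposes this through its database and feature classes):
#include <map>
#include <vector>
#include <opencv2/opencv.hpp>

typedef std::map<unsigned, std::vector<unsigned> > DirectFile; // node id -> descriptor rows

// Match the ORB descriptors of images A and B (one descriptor per cv::Mat row),
// comparing only descriptors whose words share the same node at level l.
void matchViaDirectIndex(const DirectFile& dfA, const DirectFile& dfB,
                         const cv::Mat& descA, const cv::Mat& descB,
                         std::vector<std::pair<unsigned, unsigned> >& matches)
{
    for (DirectFile::const_iterator a = dfA.begin(); a != dfA.end(); ++a) {
        DirectFile::const_iterator b = dfB.find(a->first); // same level-l node?
        if (b == dfB.end())
            continue;
        // Brute-force matching, but only inside this node's small candidate sets
        for (size_t i = 0; i < a->second.size(); ++i) {
            int bestJ = -1;
            double bestDist = 1e9;
            for (size_t j = 0; j < b->second.size(); ++j) {
                double d = cv::norm(descA.row(a->second[i]),
                                    descB.row(b->second[j]), cv::NORM_HAMMING);
                if (d < bestDist) { bestDist = d; bestJ = (int)b->second[j]; }
            }
            if (bestJ >= 0)
                matches.push_back(std::make_pair(a->second[i], (unsigned)bestJ));
        }
    }
}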