【OpenCV】SIFT原理與源碼分析:關鍵點搜索與定位


《SIFT原理與源碼分析》系列文章索引:http://www.cnblogs.com/tianyalu/p/5467813.html

由前一步《DoG尺度空間構造》,我們得到了DoG高斯差分金字塔:

如上圖的金字塔,高斯尺度空間金字塔中每組有五層不同尺度圖像,相鄰兩層相減得到四層DoG結果。關鍵點搜索就在這四層DoG圖像上尋找局部極值點。

DoG局部極值點

尋找DoG極值點時,每一個像素點和它所有的相鄰點比較,當其大於(或小於)它的圖像域和尺度域的所有相鄰點時,即為極值點。如下圖所示,比較的范圍是個3×3的立方體:中間的檢測點和它同尺度的8個相鄰點,以及和上下相鄰尺度對應的9×2個點——共26個點比較,以確保在尺度空間和二維圖像空間都檢測到極值點。 

在一組中,搜索從每組的第二層開始,以第二層為當前層,第一層和第三層分別作為立方體的的上下層;搜索完成后再以第三層為當前層做同樣的搜索。所以每層的點搜索兩次。通常我們將組Octaves索引以-1開始,則在比較時犧牲了-1組的第0層和第N組的最高層

高斯金字塔,DoG圖像及極值計算的相互關系如上圖所示。

關鍵點精確定位

以上極值點的搜索是在離散空間進行搜索的,由下圖可以看到,在離散空間找到的極值點不一定是真正意義上的極值點。可以通過對尺度空間DoG函數進行曲線擬合尋找極值點來減小這種誤差。
利用DoG函數在尺度空間的Taylor展開式:
則極值點為:
程序中還除去了極值小於0.04的點。如下所示:
// Detects features at extrema in DoG scale space.  Bad features are discarded
// based on contrast and ratio of principal curvatures.
// 在DoG尺度空間尋特征點(極值點)
void SIFT::findScaleSpaceExtrema( const vector<Mat>& gauss_pyr, const vector<Mat>& dog_pyr,
                                  vector<KeyPoint>& keypoints ) const
{
    int nOctaves = (int)gauss_pyr.size()/(nOctaveLayers + 3);
    
    // The contrast threshold used to filter out weak features in semi-uniform
    // (low-contrast) regions. The larger the threshold, the less features are produced by the detector.
    // 過濾掉弱特征的閾值 contrastThreshold默認為0.04
    int threshold = cvFloor(0.5 * contrastThreshold / nOctaveLayers * 255 * SIFT_FIXPT_SCALE);
    const int n = SIFT_ORI_HIST_BINS; //36
    float hist[n];
    KeyPoint kpt;

    keypoints.clear();

    for( int o = 0; o < nOctaves; o++ )
        for( int i = 1; i <= nOctaveLayers; i++ )
        {
            int idx = o*(nOctaveLayers+2)+i;
            const Mat& img = dog_pyr[idx];
            const Mat& prev = dog_pyr[idx-1];
            const Mat& next = dog_pyr[idx+1];
            int step = (int)img.step1();
            int rows = img.rows, cols = img.cols;

            for( int r = SIFT_IMG_BORDER; r < rows-SIFT_IMG_BORDER; r++)
            {
                const short* currptr = img.ptr<short>(r);
                const short* prevptr = prev.ptr<short>(r);
                const short* nextptr = next.ptr<short>(r);

                for( int c = SIFT_IMG_BORDER; c < cols-SIFT_IMG_BORDER; c++)
                {
                    int val = currptr[c];

                    // find local extrema with pixel accuracy
                    // 尋找局部極值點,DoG中每個點與其所在的立方體周圍的26個點比較
                    // if (val比所有都大 或者 val比所有都小)
                    if( std::abs(val) > threshold &&
                       ((val > 0 && val >= currptr[c-1] && val >= currptr[c+1] &&
                         val >= currptr[c-step-1] && val >= currptr[c-step] && 
                         val >= currptr[c-step+1] && val >= currptr[c+step-1] && 
                         val >= currptr[c+step] && val >= currptr[c+step+1] &&
                         val >= nextptr[c] && val >= nextptr[c-1] && 
                         val >= nextptr[c+1] && val >= nextptr[c-step-1] && 
                         val >= nextptr[c-step] && val >= nextptr[c-step+1] && 
                         val >= nextptr[c+step-1] && val >= nextptr[c+step] && 
                         val >= nextptr[c+step+1] && val >= prevptr[c] && 
                         val >= prevptr[c-1] && val >= prevptr[c+1] &&
                         val >= prevptr[c-step-1] && val >= prevptr[c-step] && 
                         val >= prevptr[c-step+1] && val >= prevptr[c+step-1] && 
                         val >= prevptr[c+step] && val >= prevptr[c+step+1]) ||
                        (val < 0 && val <= currptr[c-1] && val <= currptr[c+1] &&
                         val <= currptr[c-step-1] && val <= currptr[c-step] && 
                         val <= currptr[c-step+1] && val <= currptr[c+step-1] && 
                         val <= currptr[c+step] && val <= currptr[c+step+1] &&
                         val <= nextptr[c] && val <= nextptr[c-1] && 
                         val <= nextptr[c+1] && val <= nextptr[c-step-1] && 
                         val <= nextptr[c-step] && val <= nextptr[c-step+1] && 
                         val <= nextptr[c+step-1] && val <= nextptr[c+step] && 
                         val <= nextptr[c+step+1] && val <= prevptr[c] && 
                         val <= prevptr[c-1] && val <= prevptr[c+1] &&
                         val <= prevptr[c-step-1] && val <= prevptr[c-step] && 
                         val <= prevptr[c-step+1] && val <= prevptr[c+step-1] && 
                         val <= prevptr[c+step] && val <= prevptr[c+step+1])))
                    {
                        int r1 = r, c1 = c, layer = i;
                        
                        // 關鍵點精確定位
                        if( !adjustLocalExtrema(dog_pyr, kpt, o, layer, r1, c1,
                                                nOctaveLayers, (float)contrastThreshold,
                                                (float)edgeThreshold, (float)sigma) )
                            continue;
                        
                        float scl_octv = kpt.size*0.5f/(1 << o);
                        // 計算梯度直方圖
                        float omax = calcOrientationHist(
                            gauss_pyr[o*(nOctaveLayers+3) + layer],
                            Point(c1, r1),
                            cvRound(SIFT_ORI_RADIUS * scl_octv),
                            SIFT_ORI_SIG_FCTR * scl_octv,
                            hist, n);
                        float mag_thr = (float)(omax * SIFT_ORI_PEAK_RATIO);
                        for( int j = 0; j < n; j++ )
                        {
                            int l = j > 0 ? j - 1 : n - 1;
                            int r2 = j < n-1 ? j + 1 : 0;

                            if( hist[j] > hist[l]  &&  hist[j] > hist[r2]  &&  hist[j] >= mag_thr )
                            {
                                float bin = j + 0.5f * (hist[l]-hist[r2]) / 
                                (hist[l] - 2*hist[j] + hist[r2]);
                                bin = bin < 0 ? n + bin : bin >= n ? bin - n : bin;
                                kpt.angle = (float)((360.f/n) * bin);
                                keypoints.push_back(kpt);
                            }
                        }
                    }
                }
            }
        }
}

刪除邊緣效應

除了DoG響應較低的點,還有一些響應較強的點也不是穩定的特征點。DoG對圖像中的邊緣有較強的響應值,所以落在圖像邊緣的點也不是穩定的特征點。
一個平坦的DoG響應峰值在橫跨邊緣的地方有較大的主曲率,而在垂直邊緣的地方有較小的主曲率。主曲率可以通過2×2的Hessian矩陣H求出:
D值可以通過求臨近點差分得到。H的特征值與D的主曲率成正比,具體可參見Harris角點檢測算法。
為了避免求具體的值,我們可以通過H將特征值的比例表示出來。令 為最大特征值, 為最小特征值,那么:
Tr(H)表示矩陣H的跡,Det(H)表示H的行列式。
表示最大特征值與最小特征值的比值,則有:
上式與兩個特征值的比例有關。隨着主曲率比值的增加, 也會增加。我們只需要去掉比率大於一定值的特征點。Lowe論文中去掉r=10的點。
// Interpolates a scale-space extremum's location and scale to subpixel
// accuracy to form an image feature.  Rejects features with low contrast.
// Based on Section 4 of Lowe's paper.
// 特征點精確定位
static bool adjustLocalExtrema( const vector<Mat>& dog_pyr, KeyPoint& kpt, int octv,
                                int& layer, int& r, int& c, int nOctaveLayers,
                                float contrastThreshold, float edgeThreshold, float sigma )
{
    const float img_scale = 1.f/(255*SIFT_FIXPT_SCALE);
    const float deriv_scale = img_scale*0.5f;
    const float second_deriv_scale = img_scale;
    const float cross_deriv_scale = img_scale*0.25f;

    float xi=0, xr=0, xc=0, contr;
    int i = 0;

    //三維子像元插值
    for( ; i < SIFT_MAX_INTERP_STEPS; i++ )
    {
        int idx = octv*(nOctaveLayers+2) + layer;
        const Mat& img = dog_pyr[idx];
        const Mat& prev = dog_pyr[idx-1];
        const Mat& next = dog_pyr[idx+1];

        Vec3f dD((img.at<short>(r, c+1) - img.at<short>(r, c-1))*deriv_scale,
                 (img.at<short>(r+1, c) - img.at<short>(r-1, c))*deriv_scale,
                 (next.at<short>(r, c) - prev.at<short>(r, c))*deriv_scale);

        float v2 = (float)img.at<short>(r, c)*2;
        float dxx = (img.at<short>(r, c+1) + 
                img.at<short>(r, c-1) - v2)*second_deriv_scale;
        float dyy = (img.at<short>(r+1, c) + 
                img.at<short>(r-1, c) - v2)*second_deriv_scale;
        float dss = (next.at<short>(r, c) + 
                prev.at<short>(r, c) - v2)*second_deriv_scale;
        float dxy = (img.at<short>(r+1, c+1) - 
                img.at<short>(r+1, c-1) - img.at<short>(r-1, c+1) + 
                img.at<short>(r-1, c-1))*cross_deriv_scale;
        float dxs = (next.at<short>(r, c+1) - 
                next.at<short>(r, c-1) - prev.at<short>(r, c+1) + 
                prev.at<short>(r, c-1))*cross_deriv_scale;
        float dys = (next.at<short>(r+1, c) - 
                next.at<short>(r-1, c) - prev.at<short>(r+1, c) + 
                prev.at<short>(r-1, c))*cross_deriv_scale;

        Matx33f H(dxx, dxy, dxs,
                  dxy, dyy, dys,
                  dxs, dys, dss);

        Vec3f X = H.solve(dD, DECOMP_LU);

        xi = -X[2];
        xr = -X[1];
        xc = -X[0];

        if( std::abs( xi ) < 0.5f  &&  std::abs( xr ) < 0.5f  &&  std::abs( xc ) < 0.5f )
            break;

        //將找到的極值點對應成像素(整數)
        c += cvRound( xc );
        r += cvRound( xr );
        layer += cvRound( xi );

        if( layer < 1 || layer > nOctaveLayers ||
           c < SIFT_IMG_BORDER || c >= img.cols - SIFT_IMG_BORDER  ||
           r < SIFT_IMG_BORDER || r >= img.rows - SIFT_IMG_BORDER )
            return false;
    }

    /* ensure convergence of interpolation */
    // SIFT_MAX_INTERP_STEPS:插值最大步數,避免插值不收斂,程序中默認為5
    if( i >= SIFT_MAX_INTERP_STEPS )
        return false;

    {
        int idx = octv*(nOctaveLayers+2) + layer;
        const Mat& img = dog_pyr[idx];
        const Mat& prev = dog_pyr[idx-1];
        const Mat& next = dog_pyr[idx+1];
        Matx31f dD((img.at<short>(r, c+1) - img.at<short>(r, c-1))*deriv_scale,
                   (img.at<short>(r+1, c) - img.at<short>(r-1, c))*deriv_scale,
                   (next.at<short>(r, c) - prev.at<short>(r, c))*deriv_scale);
        float t = dD.dot(Matx31f(xc, xr, xi));

        contr = img.at<short>(r, c)*img_scale + t * 0.5f;
        if( std::abs( contr ) * nOctaveLayers < contrastThreshold )
            return false;

        /* principal curvatures are computed using the trace and det of Hessian */
       //利用Hessian矩陣的跡和行列式計算主曲率的比值
       float v2 = img.at<short>(r, c)*2.f;
        float dxx = (img.at<short>(r, c+1) + 
                img.at<short>(r, c-1) - v2)*second_deriv_scale;
        float dyy = (img.at<short>(r+1, c) + 
                img.at<short>(r-1, c) - v2)*second_deriv_scale;
        float dxy = (img.at<short>(r+1, c+1) - 
                img.at<short>(r+1, c-1) - img.at<short>(r-1, c+1) + 
                img.at<short>(r-1, c-1)) * cross_deriv_scale;
        float tr = dxx + dyy;
        float det = dxx * dyy - dxy * dxy;

        //這里edgeThreshold可以在調用SIFT()時輸入;
        //其實代碼中定義了 static const float SIFT_CURV_THR = 10.f 可以直接使用
        if( det <= 0 || tr*tr*edgeThreshold >= (edgeThreshold + 1)*(edgeThreshold + 1)*det )
            return false;
    }

    kpt.pt.x = (c + xc) * (1 << octv);
    kpt.pt.y = (r + xr) * (1 << octv);
    kpt.octave = octv + (layer << 8) + (cvRound((xi + 0.5)*255) << 16);
    kpt.size = sigma*powf(2.f, (layer + xi) / nOctaveLayers)*(1 << octv)*2;

    return true;
}

至此,SIFT第二步就完成了。參見《SIFT原理與源碼分析

 

本文轉自:http://blog.csdn.net/xiaowei_cqu/article/details/8087239


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM