1. Introduction
Image stitching is a direction where much of classical computer vision comes together. The main steps involved are feature point extraction, feature matching, image registration, and image blending. Figure 1.1 below shows the flow chart of OpenCV's stitching pipeline. Stitching touches many lines of research; for feature extraction alone there are the commonly used SIFT, SURF, and ORB, which are also applied very widely in SLAM. If you have the time, working out how these methods are implemented is well worth the effort for building up your own knowledge of the field.

2. opencv stitcher
OpenCV provides a ready-made stitching class, cv::Stitcher; essentially one interface call performs all of the stitching steps and returns the stitched image. The test images used below are included for reference.
2.1 Example code
The following code shows how to call the interface:
#include "opencv2/opencv.hpp"
#include "logging.hpp" // project-local logging header
#include <string>
#include <vector>

void stitchImg(const std::vector<cv::Mat>& imgs, cv::Mat& pano)
{
    // Warp mode of the stitcher; there are two modes, PANORAMA and SCANS.
    // PANORAMA: images are projected onto a sphere or cylinder before stitching.
    // SCANS: no exposure compensation or cylindrical projection by default;
    //        images are stitched with affine transforms only.
    cv::Stitcher::Mode mode = cv::Stitcher::PANORAMA;
    cv::Ptr<cv::Stitcher> stitcher = cv::Stitcher::create(mode);
    cv::Stitcher::Status status = stitcher->stitch(imgs, pano);
    if (cv::Stitcher::OK != status) {
        LOG(INFO) << "failed to stitch images, err code: " << (int)status;
    }
}

int main(int argc, char* argv[])
{
    std::string pic_path = "data/img/*";
    std::string pic_pattern = ".jpg";
    if (2 == argc) {
        pic_path = std::string(argv[1]);
    } else if (3 == argc) {
        pic_path = std::string(argv[1]);
        pic_pattern = std::string(argv[2]);
    } else {
        LOG(INFO) << "default value";
    }
    std::vector<cv::String> img_names;
    std::vector<cv::Mat> imgs;
    pic_pattern = pic_path + pic_pattern;
    cv::glob(pic_pattern, img_names);
    if (img_names.empty()) {
        LOG(INFO) << "no images";
        return -1;
    }
    for (size_t i = 0; i < img_names.size(); ++i) {
        cv::Mat img = cv::imread(img_names[i]);
        if (img.empty()) {
            LOG(INFO) << "failed to read " << img_names[i];
            continue;
        }
        imgs.push_back(img.clone());
    }
    cv::Mat pano;
    stitchImg(imgs, pano);
    if (!pano.empty()) {
        cv::imshow("pano", pano);
        cv::waitKey(0);
    }
    return 0;
}
2.2 Example results
- mode = PANORAMA
[Figure: CMU scene stitching 1]
- mode = SCANS
[Figure: CMU scene stitching 2]
The two CMU scene comparisons above illustrate the difference between PANORAMA and SCANS: the former projects the images onto a cylinder, so the resulting panorama shows visible bending, while SCANS applies only affine transforms, so straight lines and their parallelism in the source images are largely preserved.
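To reproduce the second result, only the mode passed to the factory changes; everything else in the example above stays the same:

// SCANS mode: affine-only stitching, no cylindrical projection by default.
cv::Ptr<cv::Stitcher> stitcher = cv::Stitcher::create(cv::Stitcher::SCANS);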
3. A simplified stitcher
This section lays some groundwork. Before looking at the internals of the OpenCV stitcher, let's imitate its SCANS mode with a bare-bones implementation and see how well it stitches. The basic idea:
- extract and match features to find the correspondences between the images;
- estimate the transform matrix that aligns the images: pick the ten strongest matches, draw them, and use three visually verified correct matches to estimate the affine transform;
- allocate a canvas whose width is the sum of all image widths and whose height is the maximum of all image heights, initialized to zero;
- project the strongest matching point onto the canvas and use it as the junction between the left and right images;
- take the right image as the reference, i.e. warp the left image and then blend it with the right one.
3.1 Feature extraction
The common feature extractors are SIFT, SURF, and ORB. ORB is the fastest and is also used a lot in other vision tasks, but its accuracy is lower than that of the other two.
#include "opencv2/opencv.hpp"
#include <algorithm>
#include <vector>

void featureExtract(const std::vector<cv::Mat> &imgs,
                    std::vector<std::vector<cv::KeyPoint>> &keyPoints,
                    std::vector<cv::Mat> &imageDescs)
{
    keyPoints.clear();
    imageDescs.clear();
    // Extract feature points; keep at most 800 per image.
    int maxFeatureNum = 800;
    cv::Ptr<cv::ORB> orbDetector = cv::ORB::create(maxFeatureNum);
    for (size_t i = 0; i < imgs.size(); ++i) {
        std::vector<cv::KeyPoint> keyPoint;
        // Convert to grayscale before detection.
        cv::Mat image;
        cv::cvtColor(imgs[i], image, cv::COLOR_BGR2GRAY);
        orbDetector->detect(image, keyPoint);
        keyPoints.push_back(keyPoint);
        cv::Mat imageDesc;
        orbDetector->compute(image, keyPoint, imageDesc);
        /* The descriptors must be converted to float, otherwise matching fails
         * with "Unsupported format or combination of formats in buildIndex"
         * when using the FLANN algorithm. */
        imageDesc.convertTo(imageDesc, CV_32F);
        imageDescs.push_back(imageDesc.clone());
    }
}
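To sanity-check the extraction before matching, the detected keypoints can be drawn onto the source image; a minimal sketch (not part of the pipeline itself):

// Visualize the keypoints of the first image.
cv::Mat vis;
cv::drawKeypoints(imgs[0], keyPoints[0], vis);
cv::imshow("keypoints", vis);
cv::waitKey(0);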
3.2 Feature matching
This step establishes the pairing between the feature points of the two images, from which the transform matrix H is solved. This H transforms the entire image; to deal with parallax, some work divides the image into a grid and computes a separate H for each cell.
// Number of strongest matches to keep; the value 10 follows the plan above.
constexpr int MAX_OPTIMAL_POINT_NUM = 10;

void featureMatching(const std::vector<cv::Mat> &imgs,
                     const std::vector<std::vector<cv::KeyPoint>> &keyPoints,
                     const std::vector<cv::Mat> &imageDescs,
                     std::vector<std::vector<cv::Point2f>> &optimalMatchePoint)
{
    optimalMatchePoint.clear();
    // Match the descriptors and keep the best pairs. The images are assumed
    // to come in left-to-right order; this test code assumes exactly two.
    cv::FlannBasedMatcher matcher;
    std::vector<cv::DMatch> matchePoints;
    matcher.match(imageDescs[0], imageDescs[1], matchePoints, cv::Mat());
    std::sort(matchePoints.begin(), matchePoints.end()); // sort by distance
    // Collect the N best matches.
    std::vector<cv::Point2f> imagePoints1, imagePoints2;
    for (int i = 0; i < MAX_OPTIMAL_POINT_NUM && i < (int)matchePoints.size(); i++) {
        imagePoints1.push_back(keyPoints[0][matchePoints[i].queryIdx].pt);
        imagePoints2.push_back(keyPoints[1][matchePoints[i].trainIdx].pt);
    }
    // Indices 0, 3 and 6 were verified visually to be correct matches.
    optimalMatchePoint.push_back(std::vector<cv::Point2f>{
        imagePoints1[0], imagePoints1[3], imagePoints1[6]});
    optimalMatchePoint.push_back(std::vector<cv::Point2f>{
        imagePoints2[0], imagePoints2[3], imagePoints2[6]});
}
With ORB features there are many mismatches here; the three points above are ones that the drawn matches showed to be correct, and they will be used to estimate the affine transform H. OpenCV internally estimates the transform with RANSAC; I have skipped that step in this simplified version.
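If you would rather not hand-pick inliers, the RANSAC step can be done directly on all of the top-N matches. A minimal sketch, assuming the imagePoints1 and imagePoints2 vectors collected inside featureMatching are made available:

// Robustly estimate a 2x3 affine matrix between the two point sets;
// 'inliers' marks which of the matches RANSAC accepted.
std::vector<uchar> inliers;
cv::Mat H = cv::estimateAffine2D(imagePoints1, imagePoints2, inliers,
                                 cv::RANSAC, 3.0 /* reprojection threshold in px */);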
3.3 Estimating the affine transform
The previous step produced the three strongest matches, from which H can be computed directly. Before computing it, the matching points of the right image are first shifted to the right side of the canvas.
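As a reminder of why exactly three point pairs are needed: a 2D affine transform has six unknowns, and each pair contributes two equations,

$$
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\begin{pmatrix} t_x \\ t_y \end{pmatrix},
$$

so three non-collinear pairs determine the matrix exactly, which is what cv::getAffineTransform solves for.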
void getAffineMat(std::vector<std::vector<cv::Point2f>>& optimalMatchePoint,
                  int left_cols, std::vector<cv::Mat>& Hs)
{
    std::vector<cv::Point2f> newMatchingPt;
    for (size_t i = 0; i < optimalMatchePoint[1].size(); i++) {
        cv::Point2f pt = optimalMatchePoint[1][i];
        pt.x += left_cols;
        newMatchingPt.push_back(pt);
    }
    // Transform of the left image: after the right image's points are shifted,
    // the left image must map onto those shifted point positions on the canvas.
    cv::Mat homo1 = cv::getAffineTransform(optimalMatchePoint[0], newMatchingPt);
    // Transform of the right image: a pure shift to the right side of the canvas.
    cv::Mat homo2 = cv::getAffineTransform(optimalMatchePoint[1], newMatchingPt);
    Hs.push_back(homo1);
    Hs.push_back(homo2);
}
3.4 Compositing the images
Once the transforms are known, the strongest-response feature point is taken as the blend center of the two images; the canvas to its left comes from the left image and to its right from the right image. This is a very crude way to composite: for images captured with pure translation it more or less holds together, but with rotation, or when the optical centers are not aligned during capture, the misalignment becomes severe. Blending is the other weak point: a single hard seam line decides which source image each pixel comes from, so the transition is not smooth and the seam shows.
void getPano2(std::vector<cv::Mat> &imgs, const std::vector<cv::Mat> &H,
              cv::Point2f &optimalPt, cv::Mat &pano)
{
    // Take the right image as the reference: warp the left image so it lines up
    // with the right one, and use the strongest-response feature point as the
    // junction between the two images.
    // Default panorama canvas size:
    //   width  = left.width + right.width
    //   height = max(left.height, right.height)
    int pano_width = imgs[0].cols + imgs[1].cols;
    int pano_height = std::max(imgs[0].rows, imgs[1].rows);
    pano = cv::Mat::zeros(cv::Size(pano_width, pano_height), CV_8UC3);
    cv::Mat img_trans0 = cv::Mat::zeros(pano.size(), CV_8UC3);
    cv::Mat img_trans1 = cv::Mat::zeros(pano.size(), CV_8UC3);
    // After the affine warp, each source image sits at its place on the canvas.
    cv::warpAffine(imgs[0], img_trans0, H[0], pano.size());
    cv::warpAffine(imgs[1], img_trans1, H[1], pano.size());
    // Strongest-response feature point in homogeneous coordinates...
    cv::Mat trans_pt = (cv::Mat_<double>(3, 1) << optimalPt.x, optimalPt.y, 1.0);
    // ...projected onto the canvas (H[0] is 2x3, so the product is 2x1).
    trans_pt = H[0] * trans_pt;
    int cx = static_cast<int>(trans_pt.at<double>(0, 0));
    // The two regions tile the canvas exactly: their widths sum to pano_width.
    cv::Rect left_roi = cv::Rect(0, 0, cx, pano_height);
    cv::Rect right_roi = cv::Rect(cx, 0, pano_width - cx, pano_height);
    // Copy the selected region of each warped image onto the canvas.
    img_trans0(left_roi).copyTo(pano(left_roi));
    img_trans1(right_roi).copyTo(pano(right_roi));
    cv::imshow("pano", pano);
    cv::waitKey(0);
}
int main(int argc, char *argv[])
{
    cv::Mat image01 = cv::imread("data/img/medium11.jpg");
    cv::Mat image02 = cv::imread("data/img/medium12.jpg");
    if (image01.empty() || image02.empty()) {
        return -1;
    }
    // Stretch each input by one row (test-data preparation).
    cv::resize(image01, image01, cv::Size(image01.cols, image01.rows + 1));
    cv::resize(image02, image02, cv::Size(image02.cols, image02.rows + 1));
    std::vector<cv::Mat> imgs = {image01, image02};
    std::vector<std::vector<cv::KeyPoint>> keyPoints;
    std::vector<std::vector<cv::Point2f>> optimalMatchePoint;
    std::vector<cv::Mat> imageDescs;
    featureExtract(imgs, keyPoints, imageDescs);
    featureMatching(imgs, keyPoints, imageDescs, optimalMatchePoint);
    // Estimate the two affine transforms (see getAffineMat above).
    std::vector<cv::Mat> Hs;
    getAffineMat(optimalMatchePoint, imgs[0].cols, Hs);
    cv::Mat pano;
    getPano2(imgs, Hs, optimalMatchePoint[0][0], pano);
    return 0;
}
3.5 Results of the simplified stitcher
- Stitching result with translation-only motion
[Figure 3.5.1]
- Stitching result with rotation
[Figure 3.5.2]
Not much of a result: Figure 3.5.2 clearly shows the misalignment, and the whole left side of the panorama leans visibly. The red box on the left marks the region of the left image, and the red line in the middle marks the boundary between the two images. The misalignment has several causes: there is no proper blending transition, the camera's rotation is not modeled, and the seam position is poorly chosen. The tilted, unnatural look comes from picking a single image as the reference and transforming all the others into its coordinate frame.
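One of the missing pieces called out above is a smooth transition. A minimal feather-style sketch, assuming the warped canvases img_trans0 and img_trans1 from getPano2: instead of cutting hard at the seam column cx, blend linearly over a band around it.

// Linear (feather) blending over a band of 2*half_band columns centered on the
// seam column cx; outside the band the hard cut from getPano2 is kept.
void featherSeam(const cv::Mat &img_trans0, const cv::Mat &img_trans1,
                 int cx, int half_band, cv::Mat &pano)
{
    for (int y = 0; y < pano.rows; ++y) {
        for (int x = std::max(0, cx - half_band);
             x < std::min(pano.cols, cx + half_band); ++x) {
            // Weight for the left image: 1 at the left edge of the band,
            // falling to 0 at the right edge.
            float w = (cx + half_band - x) / (2.0f * half_band);
            const cv::Vec3b &l = img_trans0.at<cv::Vec3b>(y, x);
            const cv::Vec3b &r = img_trans1.at<cv::Vec3b>(y, x);
            for (int c = 0; c < 3; ++c) {
                pano.at<cv::Vec3b>(y, x)[c] =
                    cv::saturate_cast<uchar>(w * l[c] + (1.0f - w) * r[c]);
            }
        }
    }
}

A proper fix would weight by the distance to each image's valid region, as OpenCV's cv::detail::FeatherBlender does, but even this simple band removes the visible hard edge.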
4. The opencv stitcher module
OpenCV's sample code includes stitching_detailed.cpp, which walks through every module of the pipeline. Real applications usually demand real-time stitching, and simply calling the high-level interface cannot meet that, especially on embedded ARM targets, so we need to understand the implementation in order to find optimization points. I am only interested in parts of stitching_detailed.cpp here, so I stripped out the timing statistics and the scaled search for the blend region.
4.1 Parameter overview
stitching_detailed.cpp exposes a large number of configuration parameters. As the flow chart in Figure 1.1 shows, the main stages of the OpenCV stitcher are:
- registration
  - feature extraction
  - feature matching
  - image registration
  - camera parameter estimation
  - wave correction
- compositing
  - image warping
  - exposure compensation
  - seam finding
  - image blending
The registration part obtains the matching relations between the images, estimates the cameras' intrinsic and extrinsic parameters, and refines them with bundle adjustment; in essence it determines the stitching order and the transform matrices. The compositing part then warps and blends the images with those parameters and applies algorithms such as exposure compensation to improve the visual consistency of the result. The parameters are listed below:
static void printUsage(char** argv)
{
cout <<
"Rotation model images stitcher.\n\n"
<< argv[0] << " img1 img2 [...imgN] [flags]\n\n"
"Flags:\n"
" --preview\n"
" Run stitching in the preview mode. Works faster than usual mode,\n"
" but output image will have lower resolution.\n"
" --try_cuda (yes|no)\n"
" Try to use CUDA. The default value is 'no'. All default values\n"
" are for CPU mode.\n"
"\nMotion Estimation Flags:\n"
" --work_megapix <float>\n"
" Resolution for image registration step. The default is 0.6 Mpx.\n"
" --features (surf|orb|sift|akaze)\n"
" Type of features used for images matching.\n"
" The default is surf if available, orb otherwise.\n"
" --matcher (homography|affine)\n"
" Matcher used for pairwise image matching.\n"
" --estimator (homography|affine)\n"
" Type of estimator used for transformation estimation.\n"
" --match_conf <float>\n"
" Confidence for feature matching step. The default is 0.65 for surf and 0.3 for orb.\n"
" --conf_thresh <float>\n"
" Threshold for two images are from the same panorama confidence.\n"
" The default is 1.0.\n"
" --ba (no|reproj|ray|affine)\n"
" Bundle adjustment cost function. The default is ray.\n"
" --ba_refine_mask (mask)\n"
" Set refinement mask for bundle adjustment. It looks like 'x_xxx',\n"
" where 'x' means refine respective parameter and '_' means don't\n"
" refine one, and has the following format:\n"
" <fx><skew><ppx><aspect><ppy>. The default mask is 'xxxxx'. If bundle\n"
" adjustment doesn't support estimation of selected parameter then\n"
" the respective flag is ignored.\n"
" --wave_correct (no|horiz|vert)\n"
" Perform wave effect correction. The default is 'horiz'.\n"
" --save_graph <file_name>\n"
" Save matches graph represented in DOT language to <file_name> file.\n"
" Labels description: Nm is number of matches, Ni is number of inliers,\n"
" C is confidence.\n"
"\nCompositing Flags:\n"
" --warp (affine|plane|cylindrical|spherical|fisheye|stereographic|"
" compressedPlaneA2B1|compressedPlaneA1.5B1|compressedPlanePortraitA2B1|"
" compressedPlanePortraitA1.5B1|paniniA2B1|paniniA1.5B1|paniniPortraitA2B1|"
" paniniPortraitA1.5B1|mercator|transverseMercator)\n"
" Warp surface type. The default is 'spherical'.\n"
" --seam_megapix <float>\n"
" Resolution for seam estimation step. The default is 0.1 Mpx.\n"
" --seam (no|voronoi|gc_color|gc_colorgrad)\n"
" Seam estimation method. The default is 'gc_color'.\n"
" --compose_megapix <float>\n"
" Resolution for compositing step. Use -1 for original resolution.\n"
" The default is -1.\n"
" --expos_comp (no|gain|gain_blocks|channels|channels_blocks)\n"
" Exposure compensation method. The default is 'gain_blocks'.\n"
" --expos_comp_nr_feeds <int>\n"
" Number of exposure compensation feed. The default is 1.\n"
" --expos_comp_nr_filtering <int>\n"
" Number of filtering iterations of the exposure compensation gains.\n"
" Only used when using a block exposure compensation method.\n"
" The default is 2.\n"
" --expos_comp_block_size <int>\n"
" BLock size in pixels used by the exposure compensator.\n"
" Only used when using a block exposure compensation method.\n"
" The default is 32.\n"
" --blend (no|feather|multiband)\n"
" Blending method. The default is 'multiband'.\n"
" --blend_strength <float>\n"
" Blending strength from [0,100] range. The default is 5.\n"
" --output <result_img>\n"
" The default is 'result.jpg'.\n"
" --timelapse (as_is|crop) \n"
" Output warped images separately as frames of a time lapse movie, "
" with 'fixed_' prepended to input file names.\n"
" --rangewidth <int>\n"
" uses range_width to limit number of images to match with.\n";
}
4.2 Motion Estimation Flags
- work_megapix: during registration (feature extraction and the related steps), the images are downscaled to save time; this sets the target resolution for that scale;
- features: the feature type used for matching, (surf|orb|sift|akaze);
- matcher: the pairwise matching method, (homography|affine), corresponding to BestOf2NearestMatcher and AffineBestOf2NearestMatcher respectively; the latter finds the best matches under an affine transform;
- estimator: (homography|affine), the camera parameter estimation method;
- match_conf: float, the confidence threshold for declaring an inlier during matching;
- conf_thresh: the confidence threshold for deciding that two images belong to the same panorama;
- ba: the bundle adjustment cost function used to refine the camera parameters, (no|reproj|ray|affine);
- ba_refine_mask: lets bundle adjustment keep selected parameters fixed via a mask: 'x' means refine, '_' means keep fixed, in the order fx, skew, ppx, aspect, ppy;
- wave_correct: wave correction flag, (no|horiz|vert); constrains the panorama to the horizontal or vertical direction, avoiding the "spread-wings" bowing effect (see the sketch after this list);
- save_graph: saves the match graph between images in DOT format.
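Inside the pipeline, wave correction is a small post-processing step on the rotations estimated by bundle adjustment. A hedged sketch of how it is applied, assuming the camera parameters have already been estimated:

#include "opencv2/stitching.hpp"
#include <vector>

// Straighten the estimated camera rotations so the panorama stays level.
void applyWaveCorrection(std::vector<cv::detail::CameraParams> &cameras)
{
    std::vector<cv::Mat> rmats;
    for (const auto &cam : cameras)
        rmats.push_back(cam.R.clone());
    cv::detail::waveCorrect(rmats, cv::detail::WAVE_CORRECT_HORIZ);
    for (size_t i = 0; i < cameras.size(); ++i)
        cameras[i].R = rmats[i];
}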
4.3 Compositing Flags
- warp: the image warping method, including spherical, cylindrical, and the many other projections OpenCV supports;
- seam_megapix: images are downscaled before seam finding; together with work_scale this controls the scale used;
- seam: the seam finding method;
- compose_megapix: the resolution used during compositing and for the final panorama (-1 keeps the original resolution);
- expos_comp: the exposure compensation method;
- blend: the image blending method, commonly (feather|multiband).
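Most of these flags map directly onto setters of cv::Stitcher, so a stitching_detailed-style configuration can be assembled without the sample's argument parsing. A sketch against the OpenCV 4.x API, with illustrative values rather than the sample's exact defaults:

#include "opencv2/stitching.hpp"

cv::Ptr<cv::Stitcher> makeConfiguredStitcher()
{
    cv::Ptr<cv::Stitcher> st = cv::Stitcher::create(cv::Stitcher::PANORAMA);
    st->setRegistrationResol(0.6);    // --work_megapix 0.6
    st->setSeamEstimationResol(0.1);  // --seam_megapix 0.1
    st->setCompositingResol(cv::Stitcher::ORIG_RESOLUTION); // --compose_megapix -1
    st->setPanoConfidenceThresh(1.0); // --conf_thresh 1.0
    st->setFeaturesFinder(cv::ORB::create()); // --features orb
    st->setFeaturesMatcher(cv::makePtr<cv::detail::BestOf2NearestMatcher>(
        false, 0.3f));                // --matcher homography --match_conf 0.3
    st->setBundleAdjuster(cv::makePtr<cv::detail::BundleAdjusterRay>()); // --ba ray
    st->setWaveCorrection(true);      // --wave_correct horiz
    st->setWaveCorrectKind(cv::detail::WAVE_CORRECT_HORIZ);
    st->setWarper(cv::makePtr<cv::SphericalWarper>()); // --warp spherical
    st->setExposureCompensator(cv::detail::ExposureCompensator::createDefault(
        cv::detail::ExposureCompensator::GAIN_BLOCKS)); // --expos_comp gain_blocks
    st->setSeamFinder(cv::makePtr<cv::detail::GraphCutSeamFinder>(
        cv::detail::GraphCutSeamFinderBase::COST_COLOR)); // --seam gc_color
    st->setBlender(cv::makePtr<cv::detail::MultiBandBlender>()); // --blend multiband
    return st;
}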
4.4 Summary
If the number and resolution of the input images are not too large, the downscaling steps and the timing code in the sample source can be removed to simplify the stitching flow; in practice, real-time stitching applications generally do not use this flow directly anyway. The algorithms behind each configuration parameter are where the interesting details live, and they are what I plan to walk through step by step next.
