OpenCV: The Perspective-n-Point (PnP) Problem


    The Perspective-n-Point problem originates from camera calibration. It is a classic problem in computer vision, widely used in robot localization, SLAM, AR/VR, photogrammetry, and related fields

1  The PnP Problem

1.1  Definition

    Given: the camera intrinsics and distortion coefficients; the coordinates of n spatial points in the world frame, together with their projected pixel coordinates on the image plane

    Find: the camera pose R and t in the world frame, i.e. the transformation matrix $\;^w_c\bm{T} $ from {W} to {C}, as shown in the figure below:

    (figure: pose of the camera frame {C} relative to the world frame {W})

    The relationship between a 3D point in the world frame and its 2D pixel projection on the image plane is:

    $\quad s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}  = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}  \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}  \begin{bmatrix} X_w \\ Y_w \\ Z_w\\ 1 \end{bmatrix} $
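
    As a quick numerical check of this equation, the sketch below projects one world point to pixel coordinates with cv::projectPoints(); the intrinsics, distortion, pose, and point values are illustrative assumptions, not taken from the article.

#include "opencv2/core.hpp"
#include "opencv2/calib3d.hpp"

#include <cstdio>
#include <vector>

int main()
{
    // assumed intrinsics (fx = fy = 800, cx = 320, cy = 240) and zero distortion
    cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 320,
                                             0, 800, 240,
                                             0,   0,   1);
    cv::Mat distCoeffs = cv::Mat::zeros(5, 1, CV_64F);

    // assumed pose of {W} relative to {C}: small rotation about Y, 1 m in front of the camera
    cv::Mat rvec = (cv::Mat_<double>(3, 1) << 0, 0.1, 0);
    cv::Mat tvec = (cv::Mat_<double>(3, 1) << 0, 0, 1.0);

    // one world point (X_w, Y_w, Z_w), projected to a pixel (u, v)
    std::vector<cv::Point3d> objectPoints = { cv::Point3d(0.1, 0.05, 0.0) };
    std::vector<cv::Point2d> imagePoints;
    cv::projectPoints(objectPoints, rvec, tvec, K, distCoeffs, imagePoints);

    std::printf("projected pixel: (%.2f, %.2f)\n", imagePoints[0].x, imagePoints[0].y);
    return 0;
}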

1.2  Classification

    Depending on the number of given spatial points, PnP problems can be divided into two classes:

    Class 1: 3 ≤ n ≤ 5. Few spatial points are used and the problem is solved from simultaneous equations; accuracy is easily degraded by image noise and robustness is poor

    Class 2: n ≥ 6. More spatial points are used and the problem becomes an overdetermined system; solvers generally aim for a balance between robustness and real-time performance

 

2  Solution Methods

2.1  DLT Method

2.1.1  Reduction to Ax = 0

    Let $P = K\;[R\;\, t]$, where $K$ is the camera intrinsic matrix. The PnP problem then reduces to: given n 3D-2D point correspondences, solve for $P_{3\times4}$

    DLT (Direct Linear Transformation) solves for it by using these n correspondences directly to build a system of linear equations

    $\quad s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}  =  \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ p_{31} & p_{32} & p_{33} & p_{34} \end{bmatrix}  \begin{bmatrix} X_w \\ Y_w \\ Z_w\\ 1 \end{bmatrix} $

    Abbreviating $X_w, Y_w, Z_w$ as $X, Y, Z$ and expanding gives:

    $\quad \begin{cases}  su= p_{11}X + p_{12}Y + p_{13}Z + p_{14}\\ \\sv=p_{21}X + p_{22}Y + p_{23}Z + p_{24} \\ \\s\;=p_{31}X + p_{32}Y + p_{33}Z + p_{34}  \end{cases} \;\Rightarrow\; \begin{cases}  Xp_{11} + Yp_{12} + Zp_{13} + p_{14} - uXp_{31} - uYp_{32} - uZp_{33} - up_{34} = 0 \\ \\ Xp_{21} + Yp_{22} + Zp_{23} + p_{24} - vXp_{31} - vYp_{32} - vZp_{33} - vp_{34} = 0 \end{cases}$

    There are 11 unknowns ($P$ is defined only up to scale, so $p_{34}$ can be normalized away), hence at least 6 correspondences are needed. In matrix form:

    $\quad \begin{bmatrix} X_1&Y_1&Z_1&1 &0&0&0&0&-u_1X_1&-u_1Y_1&-u_1Z_1&-u_1 \\ 0&0&0&0& X_1&Y_1&Z_1&1&-v_1X_1&-v_1Y_1&-v_1Z_1&-v_1 \\ \vdots &\vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots \\ \vdots &\vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots\\ X_n&Y_n&Z_n&1 &0&0&0&0&-u_nX_n&-u_nY_n&-u_nZ_n&-u_n \\ 0&0&0&0& X_n&Y_n&Z_n&1&-v_nX_n&-v_nY_n&-v_nZ_n&-v_n\end{bmatrix} \begin{bmatrix}p_{11}\\p_{12}\\p_{13}\\p_{14}\\ \vdots\\p_{32}\\p_{33}\\p_{34}\end{bmatrix}=\begin{bmatrix}0\\ \vdots\\ \vdots\\0\end{bmatrix}$

    Thus, solving for $P_{3\times4}$ has been turned into an $Ax = 0$ problem

2.1.2  Recovering R and t via SVD

    Given the camera intrinsic matrix, we have  $K \begin{bmatrix} R & t \end{bmatrix} = \lambda \begin{bmatrix} p_1 & p_2 &p_3&p_4 \end{bmatrix}$, where $p_i$ are the columns of $P$

    Since $P$ is recovered only up to an unknown scale (whose sign is irrelevant), we may write  $\lambda R = K^{-1}\begin{bmatrix} p_1 & p_2&p_3 \end{bmatrix}$

    Applying the SVD:  $K^{-1}\begin{bmatrix} p_1&p_2&p_3\end{bmatrix}=\bm{U}\begin{bmatrix}d_{11} & & \\ & d_{22} & \\ & & d_{33}\end{bmatrix} \bm{V^T}$

    $\quad\Rightarrow\; \lambda \approx d_{11}$  and  $\begin{cases}\bm{R=UV^T} \\ \bm{t=\dfrac{K^{-1}p_4}{d_{11}}} \end{cases}$
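
    As a concrete illustration of the derivation above, the sketch below builds the 2n x 12 matrix A, solves Ax = 0 with cv::SVD::solveZ(), and then recovers R and t from P = K [R | t] with a second SVD. The function name dltPose() is assumed for this example, and the correspondences are assumed to be reasonably clean and not all coplanar (plain DLT degenerates for coplanar points).

#include "opencv2/core.hpp"
#include <vector>

void dltPose(const std::vector<cv::Point3d>& obj,   // world points, n >= 6, not all coplanar
             const std::vector<cv::Point2d>& img,   // corresponding pixel points
             const cv::Mat& K,                      // 3x3 intrinsic matrix (CV_64F)
             cv::Mat& R, cv::Mat& t)
{
    const int n = static_cast<int>(obj.size());
    cv::Mat A = cv::Mat::zeros(2 * n, 12, CV_64F);
    for (int i = 0; i < n; i++)
    {
        const double X = obj[i].x, Y = obj[i].y, Z = obj[i].z;
        const double u = img[i].x, v = img[i].y;
        double* r0 = A.ptr<double>(2 * i);
        double* r1 = A.ptr<double>(2 * i + 1);
        r0[0] = X;       r0[1] = Y;       r0[2]  = Z;       r0[3]  = 1;
        r0[8] = -u * X;  r0[9] = -u * Y;  r0[10] = -u * Z;  r0[11] = -u;
        r1[4] = X;       r1[5] = Y;       r1[6]  = Z;       r1[7]  = 1;
        r1[8] = -v * X;  r1[9] = -v * Y;  r1[10] = -v * Z;  r1[11] = -v;
    }

    // null space of A: the right singular vector of the smallest singular value
    cv::Mat p;                        // 12x1 solution vector
    cv::SVD::solveZ(A, p);
    cv::Mat P = p.reshape(1, 3);      // 3x4 projection matrix, defined up to scale

    // lambda * R = K^{-1} [p1 p2 p3]; orthogonalize with the SVD as derived above
    cv::Mat M = K.inv() * P.colRange(0, 3);
    cv::SVD svd(M);
    R = svd.u * svd.vt;                          // R = U V^T
    double d11 = svd.w.at<double>(0);            // lambda ~ d11
    t = K.inv() * P.col(3) / d11;                // t = K^{-1} p4 / d11

    // resolve the overall sign so that det(R) = +1
    if (cv::determinant(R) < 0) { R = -R; t = -t; }
}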

2.2  P3P Method

    When n = 3, PnP becomes P3P, which has up to 4 possible solutions; it is solved with the law of cosines plus vector dot products

2.2.1  Law of Cosines

    Using the center of projection and the rays through it, the 3D-2D geometric relationship is set up as follows:

    (figure: P3P geometry, with distances $d_1, d_2, d_3$ from the camera center to the three 3D points, pairwise distances $p_{12}, p_{23}, p_{31}$, and angles $\theta_{12}, \theta_{23}, \theta_{31}$ between the rays)

    By the law of cosines,

    $\begin{cases} d_1^2 + d_2^2 - 2d_1d_2\cos\theta_{12} = p_{12}^2  \\ \\ d_2^2 + d_3^2 - 2d_2d_3\cos\theta_{23} = p_{23}^2 \\ \\ d_3^2 + d_1^2 - 2d_3d_1\cos\theta_{31} = p_{31}^2 \end{cases}$

    Only $d_1,\, d_2,\,d_3$ are unknown, so it suffices to solve this system of equations

    A key implicit point: given the camera intrinsics and the 3D-2D projection relationship, the angles $\theta_{12},\; \theta_{23},\; \theta_{31}$ between the rays can be computed

2.2.2  Vector Dot Product

    In the camera frame, the origin is the center of projection, and the rays from the origin through the 2D image points to the 3D points are the projection rays, as shown in the figure:

    (figure: rays from the camera center O through the image points $U_1, U_2, U_3$ to the corresponding 3D points)

    If the camera-frame coordinates $U_1,\,U_2,\,U_3$ of the 2D image points (the projections of the 3D points) are known, then $\cos\theta_{23}= \dfrac {\overrightarrow{OU_2}\cdot \overrightarrow{OU_3}} {||\overrightarrow{OU_2}||\;||\overrightarrow{OU_3}||} $

    For the actual computation, the world frame {W} and the camera frame {C} can be treated as coincident, with $Z = f$; then

    $\quad \begin{bmatrix} R & t \end{bmatrix} = \begin{bmatrix} 1 &0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \end{bmatrix} \;\Rightarrow\; s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}  =  \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}   \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} $

    $K^{-1}$ can be computed by Gauss-Jordan elimination on the augmented matrix $[\,K \mid I\,]$, and with $Z_c = f$ we have

    $\quad \begin{bmatrix} X_c \\Y_c\\f \end{bmatrix} = s K^{-1}\begin{bmatrix} u\\v\\1 \end{bmatrix}$

    Writing $\vec u_i = \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix}$ for the homogeneous pixel coordinates, $\cos\theta_{12}=\dfrac{(K^{-1}\vec{u_1})^T (K^{-1}\vec{u_2})}{||K^{-1}\vec{u_1}||\,||K^{-1}\vec{u_2}||}$, and similarly for $\cos\theta_{23}$ and $\cos\theta_{31}$
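
    A minimal sketch of this dot-product step (the helper name cosAngleBetweenRays() is assumed, and K is assumed to be a 3x3 CV_64F matrix): the two pixels are back-projected through K^{-1} and the normalized dot product gives the cosine of the angle between the corresponding rays.

#include "opencv2/core.hpp"

double cosAngleBetweenRays(const cv::Mat& K, const cv::Point2d& p1, const cv::Point2d& p2)
{
    cv::Mat Kinv = K.inv();
    cv::Mat x1 = (cv::Mat_<double>(3, 1) << p1.x, p1.y, 1.0);   // homogeneous pixel p1
    cv::Mat x2 = (cv::Mat_<double>(3, 1) << p2.x, p2.y, 1.0);   // homogeneous pixel p2
    cv::Mat u1 = Kinv * x1;                                     // ray direction through p1
    cv::Mat u2 = Kinv * x2;                                     // ray direction through p2
    return u1.dot(u2) / (cv::norm(u1) * cv::norm(u2));          // cos(theta_12)
}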

  

3  OpenCV Functions

    OpenCV defines 9 methods for solving PnP; 7 are currently implemented and 2 are not. The corresponding papers (or descriptions) are:

     -  SOLVEPNP_P3P              Complete Solution Classification for the Perspective-Three-Point Problem

     -  SOLVEPNP_AP3P             An Efficient Algebraic Solution to the Perspective-Three-Point Problem

     -  SOLVEPNP_ITERATIVE        based on Levenberg-Marquardt (L-M) optimization; finds the pose that minimizes the reprojection error

     -  SOLVEPNP_EPNP             EPnP: An Accurate O(n) Solution to the PnP Problem

     -  SOLVEPNP_SQPNP            A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem

     -  SOLVEPNP_IPPE             Infinitesimal Plane-based Pose Estimation (the input 3D points must be coplanar and n ≥ 4)

     -  SOLVEPNP_IPPE_SQUARE      a special case of SOLVEPNP_IPPE that requires exactly 4 coplanar points given in a specific order

     -  SOLVEPNP_DLS (not implemented)     A Direct Least-Squares (DLS) Method for PnP; actually falls back to SOLVEPNP_EPNP

     -  SOLVEPNP_UPNP (not implemented)    Exhaustive Linearization for Robust Camera Pose and Focal Length Estimation; actually falls back to SOLVEPNP_EPNP

3.1  solveP3P()

    solveP3P() takes 3 pairs of 3D-2D correspondences as input; its declaration is:

    // P3P has up to 4 solutions, and the solutions are sorted by reprojection errors(lowest to highest).
    int solveP3P (
        InputArray     objectPoints,     // object points, 3x3 1-channel or 1x3/3x1 3-channel. vector<Point3f> can be also passed
        InputArray      imagePoints,     // corresponding image points, 3x2 1-channel or 1x3/3x1 2-channel. vector<Point2f> can be also passed
        InputArray     cameraMatrix,     // camera intrinsic matrix
        InputArray       distCoeffs,     // distortion coefficients. If NULL/empty, zero distortion coefficients are assumed.
        OutputArrayOfArrays   rvecs,     // rotation vectors
        OutputArrayOfArrays   tvecs,     // translation vectors
        int                   flags      // solving method
    );   
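
    A hedged usage sketch of solveP3P() (the three correspondences below are made-up values, and cameraMatrix/distCoeffs are assumed to be available): up to 4 candidate poses are returned, sorted by reprojection error as noted in the comment above.

#include "opencv2/calib3d.hpp"
#include <iostream>
#include <vector>

void runSolveP3P(const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs)
{
    // three 3D points on the object (meters) and their assumed pixel detections
    std::vector<cv::Point3f> objectPoints = { {0.0f, 0.0f, 0.0f}, {0.1f, 0.0f, 0.0f}, {0.0f, 0.1f, 0.0f} };
    std::vector<cv::Point2f> imagePoints  = { {320.0f, 240.0f}, {400.0f, 238.0f}, {322.0f, 160.0f} };

    std::vector<cv::Mat> rvecs, tvecs;
    int num = cv::solveP3P(objectPoints, imagePoints, cameraMatrix, distCoeffs,
                           rvecs, tvecs, cv::SOLVEPNP_AP3P);

    for (int i = 0; i < num; i++)
        std::cout << "solution " << i << ":\n  rvec = " << rvecs[i]
                  << "\n  tvec = " << tvecs[i] << std::endl;
}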

3.2  solvePnP() and solvePnPGeneric()

    solvePnP() actually calls solvePnPGeneric() internally; its implementation is:

bool solvePnP(InputArray opoints, InputArray ipoints, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess, int flags)
{
    CV_INSTRUMENT_REGION();

    vector<Mat> rvecs, tvecs;
    int solutions = solvePnPGeneric(opoints, ipoints, cameraMatrix, distCoeffs, rvecs, tvecs, useExtrinsicGuess, (SolvePnPMethod)flags, rvec, tvec);

    if (solutions > 0)
    {
        int rdepth = rvec.empty() ? CV_64F : rvec.depth();
        int tdepth = tvec.empty() ? CV_64F : tvec.depth();
        rvecs[0].convertTo(rvec, rdepth);
        tvecs[0].convertTo(tvec, tdepth);
    }

    return solutions > 0;
}  

    Besides solving for the camera pose, solvePnPGeneric() can also return the reprojection error; its declaration is:

    bool solvePnPGeneric (
        InputArray         objectPoints,    // object points, Nx3 1-channel or 1xN/Nx1 3-channel, N is the number of points. vector<Point3d> can be also passed
        InputArray          imagePoints,    // corresponding image points, Nx2 1-channel or 1xN/Nx1 2-channel, N is the number of points. vector<Point2d> can be also passed
        InputArray         cameraMatrix,    // camera intrinsic matrix
        InputArray           distCoeffs,    // distortion coefficients
        OutputArrayOfArrays       rvecs,    // rotation vectors
        OutputArrayOfArrays       tvecs,    // translation vectors
        bool  useExtrinsicGuess = false,    // used for SOLVEPNP_ITERATIVE. If true, use the provided rvec and tvec as initial approximations, and further optimize them.
        SolvePnPMethod  flags = SOLVEPNP_ITERATIVE, // solving method
        InputArray               rvec = noArray(),  // initial rotation vector when using SOLVEPNP_ITERATIVE and useExtrinsicGuess is set to true
        InputArray               tvec = noArray(),  // initial translation vector when using SOLVEPNP_ITERATIVE and useExtrinsicGuess is set to true
        OutputArray reprojectionError = noArray()   // optional vector of reprojection error, that is the RMS error
    );   
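
    A hedged sketch of calling solvePnPGeneric() directly (the wrapper function and the choice of SOLVEPNP_IPPE are assumptions; for this flag the input 3D points must be coplanar with n ≥ 4): besides the candidate poses, the per-solution RMS reprojection errors are returned.

#include "opencv2/calib3d.hpp"
#include <iostream>
#include <vector>

void runSolvePnPGeneric(const std::vector<cv::Point3f>& objectPoints,  // coplanar 3D points, n >= 4
                        const std::vector<cv::Point2f>& imagePoints,
                        const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs)
{
    std::vector<cv::Mat> rvecs, tvecs;
    std::vector<double> reprojErrors;      // one RMS reprojection error per solution
    int num = cv::solvePnPGeneric(objectPoints, imagePoints, cameraMatrix, distCoeffs,
                                  rvecs, tvecs, false, cv::SOLVEPNP_IPPE,
                                  cv::noArray(), cv::noArray(), reprojErrors);

    for (int i = 0; i < num; i++)
        std::cout << "pose " << i << ": RMS reprojection error = " << reprojErrors[i] << std::endl;
}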

3.3  solvePnPRansac()

    One drawback of solvePnP() is its limited robustness: it is sensitive to outliers. This matters little in camera calibration, because the board pattern is known and feature extraction is quite stable

    However, when the camera images an actual object, features are harder to extract reliably and outliers appear, making the pose estimate inaccurate; a way of handling outliers is therefore needed

    RANSAC is an efficient method for rejecting outliers, and it corresponds to solvePnPRansac(). This is an overloaded function with 2 parameter forms; the first is:

    bool solvePnPRansac (
        InputArray         objectPoints,    // object points, Nx3 1-channel or 1xN/Nx1 3-channel, N is the number of points. vector<Point3d> can be also passed
        InputArray          imagePoints,    // corresponding image points, Nx2 1-channel or 1xN/Nx1 2-channel, N is the number of points. vector<Point2d> can be also passed
        InputArray         cameraMatrix,    // camera intrinsic matrix
        InputArray           distCoeffs,    // distortion coefficients
        OutputArray                rvec,    // rotation vector
        OutputArray                tvec,    // translation vector
        bool  useExtrinsicGuess = false,    // used for SOLVEPNP_ITERATIVE. If true, use the provided rvec and tvec as initial approximations, and further optimize them.
        int       iterationsCount = 100,    // number of iterations
        float   reprojectionError = 8.0,    // inlier threshold value. It is the maximum allowed distance between the observed and computed point projections to consider it an inlier
        double        confidence = 0.99,    // the probability that the algorithm produces a useful result
        OutputArray inliers = noArray(),    // output vector that contains indices of inliers in objectPoints and imagePoints
        int  flags = SOLVEPNP_ITERATIVE     // solving method
    );  
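
    A hedged usage sketch (the wrapper name, the SOLVEPNP_EPNP flag, and the RANSAC parameter values are illustrative choices; at least 4 correspondences are assumed): solvePnPRansac() estimates the pose from the consensus set and reports which correspondences were kept as inliers.

#include "opencv2/calib3d.hpp"
#include <iostream>
#include <vector>

void runSolvePnPRansac(const std::vector<cv::Point3f>& objectPoints,
                       const std::vector<cv::Point2f>& imagePoints,
                       const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs)
{
    cv::Mat rvec, tvec;
    std::vector<int> inliers;   // indices of consistent correspondences
    bool ok = cv::solvePnPRansac(objectPoints, imagePoints, cameraMatrix, distCoeffs,
                                 rvec, tvec, false,
                                 200,      // iterationsCount
                                 4.0f,     // reprojectionError threshold in pixels
                                 0.99,     // confidence
                                 inliers, cv::SOLVEPNP_EPNP);
    if (ok)
        std::cout << inliers.size() << " / " << objectPoints.size()
                  << " correspondences kept as inliers" << std::endl;
}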

3.4  solvePnPRefineLM() and solvePnPRefineVVS()

    OpenCV also provides 2 pose-refinement functions that iteratively reduce the reprojection error to obtain a better pose: solvePnPRefineLM() uses the L-M algorithm, while solvePnPRefineVVS() uses Virtual Visual Servoing (VVS)

    The declaration of solvePnPRefineLM() is:

    void solvePnPRefineLM (
        InputArray      objectPoints,  // object points, Nx3 1-channel or 1xN/Nx1 3-channel, N is the number of points
        InputArray       imagePoints,  // corresponding image points, Nx2 1-channel or 1xN/Nx1 2-channel
        InputArray      cameraMatrix,  // camera intrinsic matrix
        InputArray        distCoeffs,  // distortion coefficients
        InputOutputArray        rvec,  // input/output rotation vector
        InputOutputArray        tvec,  // input/output translation vector
        TermCriteria  criteria = TermCriteria(TermCriteria::EPS + TermCriteria::COUNT, 20, FLT_EPSILON)  // criteria to stop the LM iterative algorithm
    );
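
    A hedged refinement sketch (the wrapper name and the choice of EPnP for the initial estimate are assumptions): a closed-form solver provides a coarse rvec/tvec, which solvePnPRefineLM() then polishes in place by minimizing the reprojection error.

#include "opencv2/calib3d.hpp"
#include <cfloat>
#include <vector>

void estimateAndRefine(const std::vector<cv::Point3f>& objectPoints,
                       const std::vector<cv::Point2f>& imagePoints,
                       const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs,
                       cv::Mat& rvec, cv::Mat& tvec)
{
    // coarse initial pose from a closed-form solver
    cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs,
                 rvec, tvec, false, cv::SOLVEPNP_EPNP);

    // Levenberg-Marquardt refinement; rvec and tvec are updated in place
    cv::solvePnPRefineLM(objectPoints, imagePoints, cameraMatrix, distCoeffs,
                         rvec, tvec,
                         cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 20, FLT_EPSILON));
}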

 

4  Application Examples

4.1  Pose Estimation (Static + Calibration Board)

    When a handheld calibration board is rotated to different angles, the camera pose relative to the board can be computed from the camera intrinsics plus solvePnP()

#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/calib3d.hpp"

using namespace std;
using namespace cv;

Size kPatternSize = Size(9, 6);
float kSquareSize = 0.025;
// camera intrinsic parameters and distortion coefficient
const Mat cameraMatrix = (Mat_<double>(3, 3) << 5.3591573396163199e+02, 0.0, 3.4228315473308373e+02,
                                                0.0, 5.3591573396163199e+02, 2.3557082909788173e+02,
                                                0.0, 0.0, 1.0);
const Mat distCoeffs = (Mat_<double>(5, 1) << -2.6637260909660682e-01, -3.8588898922304653e-02, 1.7831947042852964e-03,
                                              -2.8122100441115472e-04, 2.3839153080878486e-01);

int main()
{
    // 1) read image
    Mat src = imread("left07.jpg");
    if (src.empty())
        return -1;
    // prepare for subpixel corner
    Mat src_gray;
    cvtColor(src, src_gray, COLOR_BGR2GRAY);

    // 2) find chessboard corners and subpixel refining
    vector<Point2f> corners;
    bool patternfound = findChessboardCorners(src, kPatternSize, corners);
    if (patternfound) {
        cornerSubPix(src_gray, corners, Size(11, 11), Size(-1, -1), TermCriteria(TermCriteria::EPS + TermCriteria::MAX_ITER, 30, 0.1));
    }
    else {
        return -1;
    }

    // 3) object coordinates
    vector<Point3f> objectPoints;
    for (int i = 0; i < kPatternSize.height; i++)
    {
        for (int j = 0; j < kPatternSize.width; j++)
        {
            objectPoints.push_back(Point3f(float(j * kSquareSize), float(i * kSquareSize), 0));
        }
    }

    // 4) Rotation and Translation vectors
    Mat rvec, tvec;
    solvePnP(objectPoints, corners, cameraMatrix, distCoeffs, rvec, tvec);

    // 5) project estimated pose on the image 
    drawFrameAxes(src, cameraMatrix, distCoeffs, rvec, tvec, 2*kSquareSize);
    imshow("Pose estimation", src);
    waitKey();
}  
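
    As a small follow-up to step 4 above (the helper name is assumed for illustration), the rotation vector can be converted to a matrix with Rodrigues(), and the camera center expressed in the board (world) frame is then C = -R^T t:

#include "opencv2/calib3d.hpp"

// camera center in the board (world) frame, computed from the solvePnP() result
cv::Mat cameraPositionInWorld(const cv::Mat& rvec, const cv::Mat& tvec)
{
    cv::Mat R;
    cv::Rodrigues(rvec, R);       // 3x1 rotation vector -> 3x3 rotation matrix
    cv::Mat Rt = R.t();
    return -Rt * tvec;            // origin of {C} expressed in {W}
}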

    When the board is rotated to different angles, the corresponding camera poses are shown below:

    (figure: estimated coordinate axes drawn on the chessboard for several board orientations)

4.2  Pose Estimation (Real-Time + Arbitrary Object)

    OpenCV ships a real-time object tracking sample, located at "opencv\samples\cpp\tutorial_code\calib3d\real_time_pose_estimation"; its steps are:

    1)  read the target's 3D model and mesh -> 2)  grab the video stream -> 3)  ORB feature detection -> 4)  3D-2D feature matching -> 5)  camera pose estimation -> 6)  Kalman filtering

    The sample implements pose estimation with a PnPProblem class; 2 of its key functions, estimatePoseRANSAC() and backproject3DPoint(), are defined as follows:

    class PnPProblem
    {
    public:
        explicit PnPProblem(const double param[]);  // custom constructor
        virtual ~PnPProblem();

        cv::Point2f backproject3DPoint(const cv::Point3f& point3d);

        void estimatePoseRANSAC(const std::vector<cv::Point3f>& list_points3d, const std::vector<cv::Point2f>& list_points2d,
                                int flags, cv::Mat& inliers, int iterationsCount, float reprojectionError, double confidence);
        // ...
    };

    // Custom constructor given the intrinsic camera parameters
    PnPProblem::PnPProblem(const double params[])
    {
        // intrinsic camera parameters
        _A_matrix = cv::Mat::zeros(3, 3, CV_64FC1);   
        _A_matrix.at<double>(0, 0) = params[0];       //  [ fx   0  cx ]
        _A_matrix.at<double>(1, 1) = params[1];       //  [  0  fy  cy ]
        _A_matrix.at<double>(0, 2) = params[2];       //  [  0   0   1 ]
        _A_matrix.at<double>(1, 2) = params[3];
        _A_matrix.at<double>(2, 2) = 1;
        // rotation matrix, translation matrix, rotation-translation matrix
        _R_matrix = cv::Mat::zeros(3, 3, CV_64FC1);
        _t_matrix = cv::Mat::zeros(3, 1, CV_64FC1);
        _P_matrix = cv::Mat::zeros(3, 4, CV_64FC1);
    }

    // Estimate the pose given a list of 2D/3D correspondences with RANSAC and the method to use
    void PnPProblem::estimatePoseRANSAC (
        const std::vector<Point3f>& list_points3d,        // list with model 3D coordinates
        const std::vector<Point2f>& list_points2d,        // list with scene 2D coordinates
        int flags, Mat& inliers, int iterationsCount,     // PnP method; inliers container
        float reprojectionError, double confidence)       // RANSAC parameters
    {
        // distortion coefficients, rotation vector and translation vector 
        Mat distCoeffs = Mat::zeros(4, 1, CV_64FC1);
        Mat rvec = Mat::zeros(3, 1, CV_64FC1);          
        Mat tvec = Mat::zeros(3, 1, CV_64FC1);   
        // no initial approximations
        bool useExtrinsicGuess = false;           

        // PnP + RANSAC
        solvePnPRansac(list_points3d, list_points2d, _A_matrix, distCoeffs, rvec, tvec, useExtrinsicGuess, iterationsCount, reprojectionError, confidence, inliers, flags);
   
        // converts Rotation Vector to Matrix
        Rodrigues(rvec, _R_matrix);                   
        _t_matrix = tvec;                            // set translation matrix
        this->set_P_matrix(_R_matrix, _t_matrix);    // set rotation-translation matrix
    }

    // Backproject a 3D point to 2D using the estimated pose parameters
    cv::Point2f PnPProblem::backproject3DPoint(const cv::Point3f& point3d)
    {
        // 3D point vector [x y z 1]'
        cv::Mat point3d_vec = cv::Mat(4, 1, CV_64FC1);
        point3d_vec.at<double>(0) = point3d.x;
        point3d_vec.at<double>(1) = point3d.y;
        point3d_vec.at<double>(2) = point3d.z;
        point3d_vec.at<double>(3) = 1;

        // 2D point vector [u v 1]'
        cv::Mat point2d_vec = cv::Mat(4, 1, CV_64FC1);
        point2d_vec = _A_matrix * _P_matrix * point3d_vec;

        // Normalization of [u v]'
        cv::Point2f point2d;
        point2d.x = (float)(point2d_vec.at<double>(0) / point2d_vec.at<double>(2));
        point2d.y = (float)(point2d_vec.at<double>(1) / point2d_vec.at<double>(2));

        return point2d;
    }

    The PnPProblem class is used as follows: instantiate it -> estimatePoseRANSAC() to estimate the pose -> backproject3DPoint() to draw the pose

// Intrinsic camera parameters: UVC WEBCAM
double f = 55;                     // focal length in mm
double sx = 22.3, sy = 14.9;       // sensor size
double width = 640, height = 480;  // image size
double params_WEBCAM[] = { width * f / sx,   // fx
                           height * f / sy,  // fy
                           width / 2,      // cx
                           height / 2 };   // cy
// instantiate PnPProblem class
PnPProblem pnp_detection(params_WEBCAM);

// RANSAC parameters
int iterCount = 500;        // number of Ransac iterations.
float reprojectionError = 2.0;    // maximum allowed distance to consider it an inlier.
float confidence = 0.95;          // RANSAC successful confidence.

// OpenCV's solvePnPRansac() requires at least 4 sets of points
if (good_matches.size() >= 4)
{
    // -- Step 3: Estimate the pose using RANSAC approach
    pnp_detection.estimatePoseRANSAC(list_points3d_model_match, list_points2d_scene_match,
        pnpMethod, inliers_idx, iterCount, reprojectionError, confidence);

    // ... ..
}

// ... ... 

float fp = 5;
vector<Point2f> pose2d;
pose2d.push_back(pnp_detect_est.backproject3DPoint(Point3f(0, 0, 0))); // axis center
pose2d.push_back(pnp_detect_est.backproject3DPoint(Point3f(fp, 0, 0))); // axis x
pose2d.push_back(pnp_detect_est.backproject3DPoint(Point3f(0, fp, 0))); // axis y
pose2d.push_back(pnp_detect_est.backproject3DPoint(Point3f(0, 0, fp))); // axis z

draw3DCoordinateAxes(frame_vis, pose2d); // draw axes

// ... ...

    The real-time tracking result is shown below:

    (figure: real-time tracking result, with the estimated pose axes drawn on the object)

    

 

References

  OpenCV-Python Tutorials: Camera Calibration and 3D Reconstruction, Pose Estimation

  OpenCV Tutorials: Camera calibration and 3D reconstruction (calib3d module), Real time pose estimation of a textured object

  Research on an improved algorithm for solving the PnP problem [J]

  Perspective-n-Point, Hyun Soo Park

 

