Problem Specification
Definition
Given a set of 3D points in a world reference frame and their corresponding 2D image projections as well as the calibrated intrinsic camera parameters, determine the 6Assumptions and Data Characteristics
There are a few preliminary aspects of the problem that are common to all solutions of P''n''P. The assumption made in most solutions is that the camera is already calibrated. Thus, its intrinsic properties are already known, such as the focal length, principal image point, skew parameter, and other parameters. Some methods, such as UP''n''P. or the Direct Linear Transform (DLT) applied to the projection model, are exceptions to this assumption as they estimate these intrinsic parameters as well as the extrinsic parameters which make up the pose of the camera that the original P''n''P problem is trying to find. For each solution to PnP, the chosen point correspondences cannot be colinear. In addition, P''n''P can have multiple solutions, and choosing a particular solution would require post-processing of the solution set. RANSAC is also commonly used with a P''n''P method to make the solution robust to outliers in the set of point correspondences. P3P methods assume that the data is noise free, most PnP methods assume Gaussian noise on the inlier set.Methods
This following section describes two common methods that can be used to solve the P''n''P problem that are also readily available in open source software and how RANSAC can be used to deal with outliers in the data set.P3P
When , the P''n''P problem is in its minimal form of P3P and can be solved with three point correspondences. However, with just three point correspondences, P3P yields up to four real, geometrically feasible solutions. For low noise levels a fourth correspondence can be used to remove ambiguity. The setup for the problem is as follows. Let ''P'' be the center of projection for the camera, ''A'', ''B'', and ''C'' be 3D world points with corresponding images points ''u'', ''v'', and ''w''. Let ''X = , PA, '', ''Y = , PB, '', ''Z = , PC, '', , , , , , , , , . This forms triangles ''PBC'', ''PAC'', and ''PAB'' from which we obtain a sufficient equation system for P3P: :. Solving the P3P system results in up to four geometrically feasible real solutions for and . The oldest published solution dates to 1841. A recent algorithm for solving the problem as well as a solution classification for it is given in the 2003 ''IEEE Transactions on Pattern Analysis and Machine Intelligence'' paper by Gao, et al. An open source implementation of Gao's P3P solver can be found in OpenCV's ''calib3d'' module in the ''solvePnP'' function. Several faster and more accurate versions have been published since, including Lambda Twist P3P which achieved state of the art performance in 2018 with a 50 fold increase in speed and a 400 fold decrease in numerical failures. Lambdatwist is available as open source in OpenMVG and at https://github.com/midjji/pnp.EP''n''P
Efficient P''n''P (EP''n''P) is a method developed by Lepetit, et al. in their 2008 International Journal of Computer Vision paper that solves the general problem of P''n''P for . This method is based on the notion that each of the ''n'' points (which are called reference points) can be expressed as a weighted sum of four virtual control points. Thus, the coordinates of these control points become the unknowns of the problem. It is from these control points that the final pose of the camera is solved for. As an overview of the process, first note that each of the reference points in the world frame, , and their corresponding image points, , are weighted sums of the four controls points, and respectively, and the weights are normalized per reference point as shown below. All points are expressed in homogeneous form. : : : From this, the derivation of the image reference points becomes :. Where is the image reference points with pixel coordinate . The homogeneous image control point has the form . Rearranging the image reference point equation yields the following two linear equations for each reference point: : :. Using these two equations for each of the reference points, the system can be formed where . The solution for the control points exists in theSQPnP
SQPnP was described by Terzakis and Lourakis in an ECCV 2020 paper. It is a non-minimal, non-polynomial solver which casts P''n''P as a non-linear quadratic program. SQPnP identifies regions in the parameter space of 3D rotations (i.e., the 8-sphere) that contain unique minima with guarantees that at least one of them is the global one. Each regional minimum is computed with sequential quadratic programming that is initiated at nearest orthogonal approximation matrices. SQPnP has similar or even higher accuracy compared to state of the art polynomial solvers, is globally optimal and computationally very efficient, being practically linear in the number of supplied points . A C++ implementation is , which has also been ported to OpenCV and included in the Camera Calibration and 3D Reconstruction module (''SolvePnP'' function).Using RANSAC
P''n''P is prone to errors if there are outliers in the set of point correspondences. Thus, RANSAC can be used in conjunction with existing solutions to make the final solution for the camera pose more robust to outliers. An open source implementation of P''n''P methods with RANSAC can be found in OpenCV's Camera Calibration and 3D Reconstruction module in the ''solvePnPRansac'' function.See also
* Camera resectioning * Random sample consensusReferences
{{ReflistExternal links