Camera resectioning is the process of estimating the parameters of a pinhole camera model approximating the camera that produced a given photograph or video; it determines which incoming light ray is associated with each pixel on the resulting image. Basically, the process determines the pose of the pinhole camera.
Usually, the camera parameters are represented in a 3 × 4 projection matrix called the ''camera matrix''.
The extrinsic parameters define the camera ''pose'' (position and orientation) while the intrinsic parameters specify the camera image format (focal length, pixel size, and image origin).
This process is often called geometric camera calibration or simply camera calibration, although that term may also refer to photometric camera calibration or be restricted to the estimation of the intrinsic parameters only. Exterior orientation and interior orientation refer to the determination of only the extrinsic and intrinsic parameters, respectively.
The classic camera calibration requires special objects in the scene, which is not required in ''
camera auto-calibration''.
Camera resectioning is often used in the application of stereo vision, where the camera projection matrices of two cameras are used to calculate the 3D world coordinates of a point viewed by both cameras.
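For illustration, the following sketch (Python with NumPy; the function and variable names are made up for this example) triangulates a single point from two known 3 × 4 projection matrices by the standard linear method:

<syntaxhighlight lang="python">
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Linear triangulation of one point seen by two calibrated cameras.

    P1, P2: 3x4 camera projection matrices.
    uv1, uv2: (u, v) pixel coordinates of the point in each image.
    Returns the 3D point in world coordinates.
    """
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.vstack([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    # The solution of A X = 0 is the right singular vector belonging to the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # de-homogenize
</syntaxhighlight>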
Formulation
The camera projection matrix is derived from the intrinsic and extrinsic parameters of the camera, and is often represented by a series of transformations; e.g., a matrix of camera intrinsic parameters, a 3 × 3 rotation matrix, and a translation vector. The camera projection matrix can be used to associate points in a camera's image space with locations in 3D world space.
Homogeneous coordinates
In this context, we use

: \begin{bmatrix} u & v & 1 \end{bmatrix}^\top

to represent a 2D point position in ''pixel'' coordinates, and

: \begin{bmatrix} x_w & y_w & z_w & 1 \end{bmatrix}^\top

to represent a 3D point position in ''world'' coordinates. In both cases, they are represented in homogeneous coordinates (i.e. they have an additional last component, which is initially, by convention, a 1), which is the most common notation in robotics and rigid body transforms.
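As a small illustration of this convention, a sketch in Python with NumPy (the helper names are made up for this example):

<syntaxhighlight lang="python">
import numpy as np

def to_homogeneous(p):
    """Append the conventional trailing 1 to a 2D or 3D point."""
    return np.append(np.asarray(p, dtype=float), 1.0)

def from_homogeneous(p):
    """Divide by the last component and drop it."""
    p = np.asarray(p, dtype=float)
    return p[:-1] / p[-1]

pixel = to_homogeneous([320, 240])       # 2D pixel position -> [320. 240. 1.]
world = to_homogeneous([1.0, 2.0, 5.0])  # 3D world position -> [1. 2. 5. 1.]
</syntaxhighlight>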
Projection
Referring to the pinhole camera model, a camera matrix

: \mathbf{K} \begin{bmatrix} \mathbf{R} & \mathbf{T} \end{bmatrix}

is used to denote a projective mapping from ''world'' coordinates to ''pixel'' coordinates:

: z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \mathbf{K} \begin{bmatrix} \mathbf{R} & \mathbf{T} \end{bmatrix} \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix}

Here u and v by convention are the x and y coordinates of the pixel in the camera, \mathbf{K} is the intrinsic matrix as described below, and \mathbf{R} and \mathbf{T} form the extrinsic matrix as described below. x_w, y_w and z_w are the coordinates of the source of the light ray which hits the camera sensor, given in world coordinates relative to the origin of the world. By dividing the matrix product by z_c, the depth of the point in the camera coordinate system, the theoretical value for the pixel coordinates can be found.
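A minimal sketch of this mapping in Python with NumPy, assuming the intrinsic matrix K and the extrinsic parameters R and T are already known (the numerical values below are placeholders, not calibration results):

<syntaxhighlight lang="python">
import numpy as np

# Placeholder intrinsics: focal lengths in pixels and principal point.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

R = np.eye(3)                        # placeholder rotation
T = np.array([[0.0], [0.0], [4.0]])  # placeholder translation

P = K @ np.hstack([R, T])            # 3x4 camera projection matrix

X_w = np.array([0.5, -0.2, 6.0, 1.0])  # world point in homogeneous coordinates
x = P @ X_w                            # equals z_c * [u, v, 1]
u, v = x[:2] / x[2]                    # divide by z_c to get pixel coordinates
</syntaxhighlight>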
Intrinsic parameters
: \mathbf{K} = \begin{bmatrix} \alpha_x & \gamma & u_0 \\ 0 & \alpha_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}

The intrinsic matrix \mathbf{K} contains 5 intrinsic parameters of the specific camera model. These parameters encompass focal length, image sensor format, and camera principal point.
The parameters \alpha_x = f \cdot m_x and \alpha_y = f \cdot m_y represent focal length in terms of pixels, where m_x and m_y are the inverses of the width and height of a pixel on the projection plane and f is the focal length in terms of distance. \gamma represents the skew coefficient between the x and the y axis, and is often 0. u_0 and v_0 represent the principal point, which would be ideally in the center of the image.
Nonlinear intrinsic parameters such as lens distortion are also important, although they cannot be included in the linear camera model described by the intrinsic parameter matrix. Many modern camera calibration algorithms estimate these intrinsic parameters as well, using non-linear optimisation techniques in which the camera and distortion parameters are refined jointly, a procedure generally known as bundle adjustment.
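For illustration, the sketch below applies a simple two-term radial distortion model (one common choice; the coefficient values are hypothetical) to normalized camera coordinates before the intrinsic matrix maps them to pixels:

<syntaxhighlight lang="python">
import numpy as np

def distort_normalized(xn, yn, k1, k2):
    """Apply two-term radial distortion to normalized camera coordinates
    (x/z, y/z in the camera frame)."""
    r2 = xn * xn + yn * yn
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return xn * scale, yn * scale

# Hypothetical coefficients and a normalized point.
xd, yd = distort_normalized(0.1, -0.05, k1=-0.28, k2=0.07)
# The distorted coordinates are then mapped to pixels by the intrinsic matrix K.
</syntaxhighlight>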
Extrinsic parameters
\mathbf{R} and \mathbf{T} are the extrinsic parameters which denote the coordinate system transformations from 3D world coordinates to 3D camera coordinates. Equivalently, the extrinsic parameters define the position of the camera center and the camera's heading in world coordinates. \mathbf{T} is the position of the origin of the world coordinate system expressed in coordinates of the camera-centered coordinate system. \mathbf{T} is often mistakenly considered the position of the camera. The position, \mathbf{C}, of the camera expressed in world coordinates is \mathbf{C} = -\mathbf{R}^{-1} \mathbf{T} = -\mathbf{R}^\top \mathbf{T} (since \mathbf{R} is a rotation matrix).
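A one-line consequence of this relation, sketched in Python with NumPy (R and T are assumed already-known extrinsics; the values are placeholders):

<syntaxhighlight lang="python">
import numpy as np

R = np.eye(3)                  # placeholder rotation (world -> camera)
T = np.array([0.0, 0.0, 4.0])  # placeholder translation

# Camera center in world coordinates: C = -R^T T (R^{-1} = R^T for a rotation).
C = -R.T @ T
</syntaxhighlight>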
Camera calibration is often used as an early stage in computer vision.
When a camera is used, light from the environment is focused on an image plane and captured. This process reduces the dimensions of the data taken in by the camera from three to two (light from a 3D scene is stored on a 2D image). Each pixel on the image plane therefore corresponds to a shaft of light from the original scene.
Algorithms
There are many different approaches to calculate the intrinsic and extrinsic parameters for a specific camera setup. The most common ones are:
# Direct linear transformation (DLT) method
# Zhang's method
# Tsai's method
# Selby's method (for X-ray cameras)
Zhang's method
Zhang's model is a camera calibration method that uses traditional calibration techniques (known calibration points) and self-calibration techniques (correspondence between the calibration points when they are in different positions). To perform a full calibration by the Zhang method, at least three different images of the calibration target/gauge are required, either by moving the gauge or the camera itself. If some of the intrinsic parameters are given as data (orthogonality of the image or optical center coordinates), the number of images required can be reduced to two.
In a first step, an approximation of the estimated projection matrix \mathbf{H} between the calibration target and the image plane is determined using the DLT method. Subsequently, self-calibration techniques are applied to obtain the image of the absolute conic matrix. The main contribution of Zhang's method is how to extract a constrained intrinsic matrix \mathbf{K} and n sets of \mathbf{R} and \mathbf{T} calibration parameters from n poses of the calibration target.
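In practice, Zhang-style calibration from several views of a planar checkerboard is available in common libraries; the sketch below uses OpenCV's calibrateCamera (the board geometry, square size and image file names are assumptions made for this example):

<syntaxhighlight lang="python">
import glob
import cv2
import numpy as np

board = (9, 6)   # inner corners per row and column (assumed target)
square = 0.025   # square size in metres (assumed)

# 3D coordinates of the board corners in the target's own plane (z = 0).
objp = np.zeros((board[0] * board[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square

obj_points, img_points = [], []
for fname in glob.glob("calib_*.png"):   # assumed image names
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, board)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Returns the RMS reprojection error, intrinsic matrix K, distortion
# coefficients, and one (R, T) pose of the target per view.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
</syntaxhighlight>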
Derivation
Assume we have a homography \mathbf{H} that maps points x_\pi on a "probe plane" \pi to points x on the image.

The circular points I, J = \begin{bmatrix} 1 & \pm j & 0 \end{bmatrix}^\top lie on both our probe plane \pi and on the absolute conic \Omega_\infty. Lying on \Omega_\infty of course means they are also projected onto the ''image'' of the absolute conic (IAC) \omega, thus x_1^\top \omega x_1 = 0 and x_2^\top \omega x_2 = 0. The circular points project as

: x_1 = \mathbf{H} I = \begin{bmatrix} h_1 & h_2 & h_3 \end{bmatrix} \begin{bmatrix} 1 \\ j \\ 0 \end{bmatrix} = h_1 + j h_2
: x_2 = \mathbf{H} J = \begin{bmatrix} h_1 & h_2 & h_3 \end{bmatrix} \begin{bmatrix} 1 \\ -j \\ 0 \end{bmatrix} = h_1 - j h_2 .

We can actually ignore x_2 while substituting our new expression for x_1 as follows:

: x_1^\top \omega x_1 = \left( h_1 + j h_2 \right)^\top \omega \left( h_1 + j h_2 \right) = \left( h_1^\top \omega h_1 - h_2^\top \omega h_2 \right) + 2 j \, h_1^\top \omega h_2 = 0

Setting the real and imaginary parts to zero yields the two constraints h_1^\top \omega h_2 = 0 and h_1^\top \omega h_1 = h_2^\top \omega h_2 on the image of the absolute conic; stacking them for several poses of the probe plane allows \omega, and hence the intrinsic matrix \mathbf{K}, to be recovered.
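A sketch of how these constraints are typically stacked into a linear system over the six independent entries of \omega (two rows per homography/view; the helper names are made up for this example):

<syntaxhighlight lang="python">
import numpy as np

def v_ij(H, i, j):
    """Coefficient row of h_i^T * omega * h_j over the six independent
    entries [w11, w12, w22, w13, w23, w33] of the symmetric matrix omega."""
    hi, hj = H[:, i], H[:, j]
    return np.array([hi[0] * hj[0],
                     hi[0] * hj[1] + hi[1] * hj[0],
                     hi[1] * hj[1],
                     hi[2] * hj[0] + hi[0] * hj[2],
                     hi[2] * hj[1] + hi[1] * hj[2],
                     hi[2] * hj[2]])

def constraint_rows(H):
    """Two rows per view: h1^T omega h2 = 0 and h1^T omega h1 = h2^T omega h2."""
    return np.vstack([v_ij(H, 0, 1),
                      v_ij(H, 0, 0) - v_ij(H, 1, 1)])

# Stacking constraint_rows(H) for three or more homographies and taking the
# null vector of the stack (e.g. via SVD) gives omega, from which the
# intrinsic matrix K can be extracted.
</syntaxhighlight>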
Tsai's Algorithm
It is a 2-stage algorithm. The first stage calculates the pose (3D orientation, and x-axis and y-axis translation); the second stage computes the focal length, distortion coefficients and the z-axis translation.
Selby's method (for X-ray cameras)
Selby's camera calibration method (Boris Peter Selby et al., "Patient positioning with X-ray detector self-calibration for image guided therapy", Australasian Physical & Engineering Science in Medicine, Vol. 34, No. 3, pages 391–400, 2011) addresses the auto-calibration of X-ray camera systems.
X-ray camera systems, consisting of the X-ray generating tube and a solid state detector, can be modelled as pinhole camera systems, comprising 9 intrinsic and extrinsic camera parameters.
Intensity-based registration of an arbitrary X-ray image and a reference model (as a tomographic dataset) can then be used to determine the relative camera parameters without the need for a special calibration body or any ground-truth data.
See also
* 3D pose estimation
* Augmented reality
* Augmented virtuality
* Eight-point algorithm
* Mixed reality
* Pinhole camera model
* Perspective-n-Point
* Rational polynomial coefficient
References
External links
* Zhang's Camera Calibration and Tsai's Calibration Software on LGPL licence
* Zhang's Camera Calibration Method with Software
* C++ Camera Calibration Toolbox with source code
* Camera Calibration Toolbox for Matlab
* The DLR CalDe and DLR CalLab Camera Calibration Toolbox
* Camera Calibration - Augmented reality lecture at TU Muenchen, Germany (using ARToolKit)
* A Four-step Camera Calibration Procedure with Implicit Image Correction
* mrcal: a high-fidelity calibration toolkit with thorough uncertainty propagation
Geometry in computer vision
Mixed reality
Stereophotogrammetry