Phase correlation is an approach to estimate the relative
translative offset between two similar
image
An image is a visual representation of something. It can be two-dimensional, three-dimensional, or somehow otherwise feed into the visual system to convey information. An image can be an artifact, such as a photograph or other two-dimensio ...
s (
digital image correlation Digital image correlation and tracking is an optical method that employs tracking and image registration techniques for accurate 2D and 3D measurements of changes in images. This method is often used to measure full-field displacement and strains ...
) or other data sets. It is commonly used in
image registration
Image registration is the process of transforming different sets of data into one coordinate system. Data may be multiple photographs, data from different sensors, times, depths, or viewpoints. It is used in computer vision, medical imaging, mili ...
and relies on a
frequency-domain
In physics, electronics, control systems engineering, and statistics, the frequency domain refers to the analysis of mathematical functions or signals with respect to frequency, rather than time. Put simply, a time-domain graph shows how a s ...
representation of the data, usually calculated by
fast Fourier transform
A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain (often time or space) to a representation in t ...
s. The term is applied particularly to a subset of
cross-correlation
In signal processing, cross-correlation is a measure of similarity of two series as a function of the displacement of one relative to the other. This is also known as a ''sliding dot product'' or ''sliding inner-product''. It is commonly used f ...
techniques that isolate the phase information from the Fourier-space representation of the cross-
correlogram
In the analysis of data, a correlogram is a chart of correlation statistics.
For example, in time series analysis, a plot of the sample autocorrelations r_h\, versus h\, (the time lags) is an autocorrelogram.
If cross-correlation is plotte ...
.
Example
The following image demonstrates the usage of phase correlation to determine relative translative movement between two images corrupted by independent Gaussian noise. The image was translated by (30,33) pixels. Accordingly, one can clearly see a peak in the phase-correlation representation at approximately (30,33).
Method
Given two input images
and
:
Apply a
window function
In signal processing and statistics, a window function (also known as an apodization function or tapering function) is a mathematical function that is zero-valued outside of some chosen interval, normally symmetric around the middle of the in ...
(e.g., a
Hamming window) on both images to reduce edge effects (this may be optional depending on the image characteristics). Then, calculate the discrete 2D
Fourier transform
A Fourier transform (FT) is a mathematical transform that decomposes functions into frequency components, which are represented by the output of the transform as a function of frequency. Most commonly functions of time or space are transformed, ...
of both images.
:
Calculate the
cross-power spectrum
The power spectrum S_(f) of a time series x(t) describes the distribution of power into frequency components composing that signal. According to Fourier analysis, any physical signal can be decomposed into a number of discrete frequencies, o ...
by taking the
complex conjugate
In mathematics, the complex conjugate of a complex number is the number with an equal real part and an imaginary part equal in magnitude but opposite in sign. That is, (if a and b are real, then) the complex conjugate of a + bi is equal to a - ...
of the second result, multiplying the
Fourier transform
A Fourier transform (FT) is a mathematical transform that decomposes functions into frequency components, which are represented by the output of the transform as a function of frequency. Most commonly functions of time or space are transformed, ...
s together elementwise, and normalizing this product elementwise.
:
Where
is the
Hadamard product (entry-wise product) and the absolute values are taken entry-wise as well. Written out entry-wise for element index
:
:
Obtain the normalized cross-correlation by applying the inverse Fourier transform.
:
Determine the location of the peak in
.
:
Commonly,
interpolation
In the mathematical field of numerical analysis, interpolation is a type of estimation, a method of constructing (finding) new data points based on the range of a discrete set of known data points.
In engineering and science, one often has ...
methods are used to estimate the peak location in the cross-
correlogram
In the analysis of data, a correlogram is a chart of correlation statistics.
For example, in time series analysis, a plot of the sample autocorrelations r_h\, versus h\, (the time lags) is an autocorrelogram.
If cross-correlation is plotte ...
to non-
integer
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
values, despite the fact that the data are discrete, and this procedure is often termed 'subpixel registration'. A large variety of subpixel interpolation methods are given in the technical literature. Common peak interpolation methods such as parabolic interpolation have been used, and the
OpenCV
OpenCV (''Open Source Computer Vision Library'') is a library of programming functions mainly aimed at real-time computer vision. Originally developed by Intel, it was later supported by Willow Garage then Itseez (which was later acquired by I ...
computer vision package uses a
centroid
In mathematics and physics, the centroid, also known as geometric center or center of figure, of a plane figure or solid figure is the arithmetic mean position of all the points in the surface of the figure. The same definition extends to any ...
-based method, though these generally have inferior accuracy compared to more sophisticated methods.
Because the Fourier representation of the data has already been computed, it is especially convenient to use the
Fourier shift theorem with
real
Real may refer to:
Currencies
* Brazilian real (R$)
* Central American Republic real
* Mexican real
* Portuguese real
* Spanish real
* Spanish colonial real
Music Albums
* ''Real'' (L'Arc-en-Ciel album) (2000)
* ''Real'' (Bright album) (201 ...
-valued (sub-integer) shifts for this purpose, which essentially interpolates using the sinusoidal
basis function
In mathematics, a basis function is an element of a particular basis for a function space. Every function in the function space can be represented as a linear combination of basis functions, just as every vector in a vector space can be repre ...
s of the Fourier transform. An especially popular FT-based estimator is given by Foroosh ''et al.'' In this method, the subpixel peak location is approximated by a simple formula involving peak pixel value and the values of its nearest neighbors, where
is the peak value and
is the nearest neighbor in the x direction (assuming, as in most approaches, that the integer shift has already been found and the comparand images differ only by a subpixel shift).
:
The Foroosh ''et al.'' method is quite fast compared to most methods, though it is not always the most accurate. Some methods shift the peak in Fourier space and apply
non-linear optimization
In mathematics, nonlinear programming (NLP) is the process of solving an optimization problem where some of the constraints or the objective function are nonlinear. An optimization problem is one of calculation of the extrema (maxima, minima or sta ...
to maximize the correlogram peak, but these tend to be very slow since they must apply an inverse Fourier transform or its equivalent in the objective function.
It is also possible to infer the peak location from phase characteristics in Fourier space without the inverse transformation, as noted by Stone. These methods usually use a
linear least squares (LLS) fit of the
phase angles to a planar model. The long latency of the phase angle computation in these methods is a disadvantage, but the speed can sometimes be comparable to the Foroosh ''et al.'' method depending on the image size. They often compare favorably in speed to the multiple iterations of extremely slow objective functions in iterative non-linear methods.
Since all subpixel shift computation methods are fundamentally interpolative, the performance of a particular method depends on how well the underlying data conform to the assumptions in the interpolator. This fact also may limit the usefulness of high numerical accuracy in an algorithm, since the uncertainty due to interpolation method choice may be larger than any numerical or approximation error in the particular method.
Subpixel methods are also particularly sensitive to noise in the images, and the utility of a particular algorithm is distinguished not only by its speed and accuracy but its resilience to the particular types of noise in the application.
Rationale
The method is based on the
Fourier shift theorem.
Let the two images
and
be circularly-shifted versions of each other:
:
(where the images are
in size).
Then, the discrete Fourier transforms of the images will be shifted relatively in
phase:
:
One can then calculate the normalized cross-power spectrum to factor out the phase difference:
:
since the magnitude of an
imaginary exponential always is one, and the phase of
always is zero.
The inverse Fourier transform of a complex exponential is a
Kronecker delta
In mathematics, the Kronecker delta (named after Leopold Kronecker) is a function of two variables, usually just non-negative integers. The function is 1 if the variables are equal, and 0 otherwise:
\delta_ = \begin
0 &\text i \neq j, \\
1 ...
, i.e. a single peak:
:
This result could have been obtained by calculating the
cross correlation directly. The advantage of this method is that the discrete Fourier transform and its inverse can be performed using the
fast Fourier transform
A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain (often time or space) to a representation in t ...
, which is much faster than correlation for large images.
Benefits
Unlike many spatial-domain algorithms, the phase correlation method is resilient to noise, occlusions, and other defects typical of medical or satellite images.
The method can be extended to determine rotation and scaling differences between two images by first converting the images to
log-polar coordinates In mathematics, log-polar coordinates (or logarithmic polar coordinates) is a coordinate system in two dimensions, where a point is identified by two numbers, one for the logarithm of the distance to a certain point, and one for an angle. Log-polar ...
. Due to properties of the
Fourier transform
A Fourier transform (FT) is a mathematical transform that decomposes functions into frequency components, which are represented by the output of the transform as a function of frequency. Most commonly functions of time or space are transformed, ...
, the rotation and scaling parameters can be determined in a manner invariant to translation.
Limitations
In practice, it is more likely that
will be a simple linear shift of
, rather than a circular shift as required by the explanation above. In such cases,
will not be a simple delta function, which will reduce the performance of the method. In such cases, a
window function
In signal processing and statistics, a window function (also known as an apodization function or tapering function) is a mathematical function that is zero-valued outside of some chosen interval, normally symmetric around the middle of the in ...
(such as a Gaussian or Tukey window) should be employed during the Fourier transform to reduce edge effects, or the images should be zero padded so that the edge effects can be ignored. If the images consist of a flat background, with all detail situated away from the edges, then a linear shift will be equivalent to a circular shift, and the above derivation will hold exactly. The peak can be sharpened by using edge or vector correlation.
For
periodic images (such as a chessboard or picket fence), phase correlation may yield ambiguous results with several peaks in the resulting output.
Applications
Phase correlation is the preferred method for
television standards conversion
Television standards conversion is the process of changing a television transmission or recording from one video system to another.
Converting video between different numbers of lines, frame rates, and color models in video pictures is a comple ...
, as it leaves the fewest artifacts.
See also
General
*
Cross correlation
*
Scaled Correlation In statistics, scaled correlation is a form of a coefficient of correlation applicable to data that have a temporal component such as time series. It is the average short-term correlation. If the signals have multiple components (slow and fast), sc ...
*
Phase retrieval
Phase retrieval is the process of algorithmically finding solutions to the phase problem. Given a complex signal F(k), of amplitude , F (k), , and phase \psi(k):
::F(k) = , F(k), e^ =\int_^ f(x)\ e^\,dx
where ''x'' is an ''M''-dimensional spatia ...
Television
*
Television standards conversion
Television standards conversion is the process of changing a television transmission or recording from one video system to another.
Converting video between different numbers of lines, frame rates, and color models in video pictures is a comple ...
*
Reverse Standards Conversion
Reverse Standards Conversion or RSC is a process developed by a team led by James Insell at the BBC for the restoration of video recordings which have already been converted between different video standards using early conversion techniques.
Hi ...
References
External links
Using Matlab to perform normalized cross-correlation on images
{{DEFAULTSORT:Phase Correlation
Computer vision