HOME

TheInfoList



OR:

Phase correlation is an approach to estimate the relative translative offset between two similar
image An image is a visual representation of something. It can be two-dimensional, three-dimensional, or somehow otherwise feed into the visual system to convey information. An image can be an artifact, such as a photograph or other two-dimensio ...
s (
digital image correlation Digital image correlation and tracking is an optical method that employs tracking and image registration techniques for accurate 2D and 3D measurements of changes in images. This method is often used to measure full-field displacement and strains ...
) or other data sets. It is commonly used in
image registration Image registration is the process of transforming different sets of data into one coordinate system. Data may be multiple photographs, data from different sensors, times, depths, or viewpoints. It is used in computer vision, medical imaging, mili ...
and relies on a
frequency-domain In physics, electronics, control systems engineering, and statistics, the frequency domain refers to the analysis of mathematical functions or signals with respect to frequency, rather than time. Put simply, a time-domain graph shows how a s ...
representation of the data, usually calculated by
fast Fourier transform A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain (often time or space) to a representation in t ...
s. The term is applied particularly to a subset of
cross-correlation In signal processing, cross-correlation is a measure of similarity of two series as a function of the displacement of one relative to the other. This is also known as a ''sliding dot product'' or ''sliding inner-product''. It is commonly used f ...
techniques that isolate the phase information from the Fourier-space representation of the cross-
correlogram In the analysis of data, a correlogram is a chart of correlation statistics. For example, in time series analysis, a plot of the sample autocorrelations r_h\, versus h\, (the time lags) is an autocorrelogram. If cross-correlation is plotte ...
.


Example

The following image demonstrates the usage of phase correlation to determine relative translative movement between two images corrupted by independent Gaussian noise. The image was translated by (30,33) pixels. Accordingly, one can clearly see a peak in the phase-correlation representation at approximately (30,33).


Method

Given two input images \ g_a and \ g_b: Apply a
window function In signal processing and statistics, a window function (also known as an apodization function or tapering function) is a mathematical function that is zero-valued outside of some chosen interval, normally symmetric around the middle of the in ...
(e.g., a Hamming window) on both images to reduce edge effects (this may be optional depending on the image characteristics). Then, calculate the discrete 2D
Fourier transform A Fourier transform (FT) is a mathematical transform that decomposes functions into frequency components, which are represented by the output of the transform as a function of frequency. Most commonly functions of time or space are transformed, ...
of both images. :\ \mathbf_a = \mathcal\, \; \mathbf_b = \mathcal\ Calculate the
cross-power spectrum The power spectrum S_(f) of a time series x(t) describes the distribution of power into frequency components composing that signal. According to Fourier analysis, any physical signal can be decomposed into a number of discrete frequencies, o ...
by taking the
complex conjugate In mathematics, the complex conjugate of a complex number is the number with an equal real part and an imaginary part equal in magnitude but opposite in sign. That is, (if a and b are real, then) the complex conjugate of a + bi is equal to a - ...
of the second result, multiplying the
Fourier transform A Fourier transform (FT) is a mathematical transform that decomposes functions into frequency components, which are represented by the output of the transform as a function of frequency. Most commonly functions of time or space are transformed, ...
s together elementwise, and normalizing this product elementwise. :\ R = \frac Where \circ is the Hadamard product (entry-wise product) and the absolute values are taken entry-wise as well. Written out entry-wise for element index (j,k): :\ R_ = \frac Obtain the normalized cross-correlation by applying the inverse Fourier transform. :\ r = \mathcal^\ Determine the location of the peak in \ r. :\ (\Delta x, \Delta y) = \arg \max_\ Commonly,
interpolation In the mathematical field of numerical analysis, interpolation is a type of estimation, a method of constructing (finding) new data points based on the range of a discrete set of known data points. In engineering and science, one often has ...
methods are used to estimate the peak location in the cross-
correlogram In the analysis of data, a correlogram is a chart of correlation statistics. For example, in time series analysis, a plot of the sample autocorrelations r_h\, versus h\, (the time lags) is an autocorrelogram. If cross-correlation is plotte ...
to non-
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
values, despite the fact that the data are discrete, and this procedure is often termed 'subpixel registration'. A large variety of subpixel interpolation methods are given in the technical literature. Common peak interpolation methods such as parabolic interpolation have been used, and the
OpenCV OpenCV (''Open Source Computer Vision Library'') is a library of programming functions mainly aimed at real-time computer vision. Originally developed by Intel, it was later supported by Willow Garage then Itseez (which was later acquired by I ...
computer vision package uses a
centroid In mathematics and physics, the centroid, also known as geometric center or center of figure, of a plane figure or solid figure is the arithmetic mean position of all the points in the surface of the figure. The same definition extends to any ...
-based method, though these generally have inferior accuracy compared to more sophisticated methods. Because the Fourier representation of the data has already been computed, it is especially convenient to use the Fourier shift theorem with
real Real may refer to: Currencies * Brazilian real (R$) * Central American Republic real * Mexican real * Portuguese real * Spanish real * Spanish colonial real Music Albums * ''Real'' (L'Arc-en-Ciel album) (2000) * ''Real'' (Bright album) (201 ...
-valued (sub-integer) shifts for this purpose, which essentially interpolates using the sinusoidal
basis function In mathematics, a basis function is an element of a particular basis for a function space. Every function in the function space can be represented as a linear combination of basis functions, just as every vector in a vector space can be repre ...
s of the Fourier transform. An especially popular FT-based estimator is given by Foroosh ''et al.'' In this method, the subpixel peak location is approximated by a simple formula involving peak pixel value and the values of its nearest neighbors, where r_ is the peak value and r_ is the nearest neighbor in the x direction (assuming, as in most approaches, that the integer shift has already been found and the comparand images differ only by a subpixel shift). :\ \Delta x = \frac The Foroosh ''et al.'' method is quite fast compared to most methods, though it is not always the most accurate. Some methods shift the peak in Fourier space and apply
non-linear optimization In mathematics, nonlinear programming (NLP) is the process of solving an optimization problem where some of the constraints or the objective function are nonlinear. An optimization problem is one of calculation of the extrema (maxima, minima or sta ...
to maximize the correlogram peak, but these tend to be very slow since they must apply an inverse Fourier transform or its equivalent in the objective function. It is also possible to infer the peak location from phase characteristics in Fourier space without the inverse transformation, as noted by Stone. These methods usually use a linear least squares (LLS) fit of the phase angles to a planar model. The long latency of the phase angle computation in these methods is a disadvantage, but the speed can sometimes be comparable to the Foroosh ''et al.'' method depending on the image size. They often compare favorably in speed to the multiple iterations of extremely slow objective functions in iterative non-linear methods. Since all subpixel shift computation methods are fundamentally interpolative, the performance of a particular method depends on how well the underlying data conform to the assumptions in the interpolator. This fact also may limit the usefulness of high numerical accuracy in an algorithm, since the uncertainty due to interpolation method choice may be larger than any numerical or approximation error in the particular method. Subpixel methods are also particularly sensitive to noise in the images, and the utility of a particular algorithm is distinguished not only by its speed and accuracy but its resilience to the particular types of noise in the application.


Rationale

The method is based on the Fourier shift theorem. Let the two images \ g_a and \ g_b be circularly-shifted versions of each other: :\ g_b(x,y) \ \stackrel\ g_a((x - \Delta x) \bmod M, (y - \Delta y) \bmod N) (where the images are \ M \times N in size). Then, the discrete Fourier transforms of the images will be shifted relatively in phase: :\mathbf_b(u,v) = \mathbf_a(u,v) e^ One can then calculate the normalized cross-power spectrum to factor out the phase difference: : \begin R(u,v) &= \frac \\ &= \frac \\ &= \frac \\ &= e^ \end since the magnitude of an imaginary exponential always is one, and the phase of \ \mathbf_a \mathbf_a^* always is zero. The inverse Fourier transform of a complex exponential is a
Kronecker delta In mathematics, the Kronecker delta (named after Leopold Kronecker) is a function of two variables, usually just non-negative integers. The function is 1 if the variables are equal, and 0 otherwise: \delta_ = \begin 0 &\text i \neq j, \\ 1 ...
, i.e. a single peak: :\ r(x,y) = \delta(x + \Delta x, y + \Delta y) This result could have been obtained by calculating the cross correlation directly. The advantage of this method is that the discrete Fourier transform and its inverse can be performed using the
fast Fourier transform A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain (often time or space) to a representation in t ...
, which is much faster than correlation for large images.


Benefits

Unlike many spatial-domain algorithms, the phase correlation method is resilient to noise, occlusions, and other defects typical of medical or satellite images. The method can be extended to determine rotation and scaling differences between two images by first converting the images to
log-polar coordinates In mathematics, log-polar coordinates (or logarithmic polar coordinates) is a coordinate system in two dimensions, where a point is identified by two numbers, one for the logarithm of the distance to a certain point, and one for an angle. Log-polar ...
. Due to properties of the
Fourier transform A Fourier transform (FT) is a mathematical transform that decomposes functions into frequency components, which are represented by the output of the transform as a function of frequency. Most commonly functions of time or space are transformed, ...
, the rotation and scaling parameters can be determined in a manner invariant to translation.


Limitations

In practice, it is more likely that \ g_b will be a simple linear shift of \ g_a, rather than a circular shift as required by the explanation above. In such cases, \ r will not be a simple delta function, which will reduce the performance of the method. In such cases, a
window function In signal processing and statistics, a window function (also known as an apodization function or tapering function) is a mathematical function that is zero-valued outside of some chosen interval, normally symmetric around the middle of the in ...
(such as a Gaussian or Tukey window) should be employed during the Fourier transform to reduce edge effects, or the images should be zero padded so that the edge effects can be ignored. If the images consist of a flat background, with all detail situated away from the edges, then a linear shift will be equivalent to a circular shift, and the above derivation will hold exactly. The peak can be sharpened by using edge or vector correlation. For periodic images (such as a chessboard or picket fence), phase correlation may yield ambiguous results with several peaks in the resulting output.


Applications

Phase correlation is the preferred method for
television standards conversion Television standards conversion is the process of changing a television transmission or recording from one video system to another. Converting video between different numbers of lines, frame rates, and color models in video pictures is a comple ...
, as it leaves the fewest artifacts.


See also

General * Cross correlation *
Scaled Correlation In statistics, scaled correlation is a form of a coefficient of correlation applicable to data that have a temporal component such as time series. It is the average short-term correlation. If the signals have multiple components (slow and fast), sc ...
*
Phase retrieval Phase retrieval is the process of algorithmically finding solutions to the phase problem. Given a complex signal F(k), of amplitude , F (k), , and phase \psi(k): ::F(k) = , F(k), e^ =\int_^ f(x)\ e^\,dx where ''x'' is an ''M''-dimensional spatia ...
Television *
Television standards conversion Television standards conversion is the process of changing a television transmission or recording from one video system to another. Converting video between different numbers of lines, frame rates, and color models in video pictures is a comple ...
*
Reverse Standards Conversion Reverse Standards Conversion or RSC is a process developed by a team led by James Insell at the BBC for the restoration of video recordings which have already been converted between different video standards using early conversion techniques. Hi ...


References


External links


Using Matlab to perform normalized cross-correlation on images
{{DEFAULTSORT:Phase Correlation Computer vision