In digital image processing, the sum of absolute differences (SAD) is a measure of the similarity between image blocks. It is calculated by taking the absolute difference between each pixel in the original block and the corresponding pixel in the block being used for comparison. These differences are summed to create a simple metric of block similarity, the L¹ norm of the difference image or Manhattan distance between two image blocks. The sum of absolute differences may be used for a variety of purposes, such as object recognition, the generation of disparity maps for stereo images, and motion estimation for video compression.
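The definition can be written compactly as a formula. The symbols here are notation chosen for illustration: A and B are the two blocks being compared, and M and N are the block height and width in pixels. The norm on the right is the entrywise L¹ norm, that is, the difference image treated as one long vector.

```latex
\mathrm{SAD}(A, B) \;=\; \sum_{i=1}^{M} \sum_{j=1}^{N} \left| A_{i,j} - B_{i,j} \right| \;=\; \lVert A - B \rVert_{1}
```

A value of 0 means the two blocks are identical, and larger values mean the blocks differ more.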

Example

This example uses the sum of absolute differences to identify which part of a search image is most similar to a template image. In this example, the template image is 3 by 3 pixels in size, while the search image is 3 by 5 pixels in size. Each pixel is represented by a single integer from 0 to 9.

Template    Search image
 2 5 5       2 7 5 8 6
 4 0 7       1 7 4 2 7
 7 5 9       8 4 6 8 5

There are exactly three unique locations within the search image where the template may fit: the left side of the image, the center of the image, and the right side of the image. To calculate the SAD values, the absolute value of the difference between each corresponding pair of pixels is used: the difference between 2 and 2 is 0, 4 and 1 is 3, 7 and 8 is 1, and so forth. Calculating the values of the absolute differences for each pixel, for the three possible template locations, gives the following:

Left    Center   Right
0 2 0   5 0 3    3 3 1
3 7 3   3 4 5    0 2 0
1 1 3   3 1 1    1 3 4

For each of these three image patches, the 9 absolute differences are added together, giving SAD values of 20, 25, and 17, respectively. By this measure, the right side of the search image is the most similar to the template image, because it has the lowest sum of absolute differences of the three locations.
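The calculation above can be reproduced with a short script. This is a minimal sketch in plain Python; the function names `sad` and `sad_at_each_offset` are chosen for illustration, and the sliding is horizontal only because the template and search image in this example have the same height.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def sad_at_each_offset(template, search):
    """Slide the template horizontally across the search image.

    Assumes both images have the same height, as in this example.
    Returns one SAD value per horizontal position, left to right.
    """
    width = len(template[0])
    positions = len(search[0]) - width + 1
    scores = []
    for x in range(positions):
        # Cut out the window of the search image under the template.
        window = [row[x:x + width] for row in search]
        scores.append(sad(template, window))
    return scores


template = [[2, 5, 5],
            [4, 0, 7],
            [7, 5, 9]]
search = [[2, 7, 5, 8, 6],
          [1, 7, 4, 2, 7],
          [8, 4, 6, 8, 5]]

scores = sad_at_each_offset(template, search)
best = min(range(len(scores)), key=scores.__getitem__)
print(scores)  # [20, 25, 17]
print(best)    # 2, the right-hand position
```

The position with the smallest score is taken as the best match, which here is index 2 (the right side).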

Comparison to other metrics

Object recognition

The sum of absolute differences provides a simple way to automate the search for objects inside an image, but may be unreliable due to the effects of contextual factors such as changes in lighting, color, viewing direction, size, or shape. The SAD may be used in conjunction with other object recognition methods, such as edge detection, to improve the reliability of results.
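The sensitivity to lighting can be illustrated with a small sketch. It reuses the template from the Example section; the two comparison patches, `brighter` and `altered`, are invented here purely for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


template = [[2, 5, 5],
            [4, 0, 7],
            [7, 5, 9]]

# Hypothetical patch: the same content, every pixel one level brighter
# (values may now exceed the 0-9 range used in the Example section).
brighter = [[p + 1 for p in row] for row in template]

# Hypothetical patch: two pixels differ strongly, the rest are identical.
altered = [[2, 5, 5],
           [4, 4, 7],
           [7, 5, 5]]

print(sad(template, brighter))  # 9
print(sad(template, altered))   # 8
```

Under these assumptions the uniformly brightened copy of the template scores worse (9) than a patch whose content has actually changed (8), which is the kind of unreliability described above.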

Video compression

SAD is an extremely fast metric due to its simplicity; it is effectively the simplest possible metric that takes into account every pixel in a block. Therefore, it is very effective for a wide motion search of many different blocks. SAD is also easily parallelizable since it analyzes each pixel separately, making it easily implementable with SIMD instruction sets such as ARM NEON or x86 SSE2. For example, SSE has a packed sum of absolute differences instruction (PSADBW) specifically for this purpose. Once candidate blocks are found, the final refinement of the motion estimation process is often done with other slower but more accurate metrics, which better take into account human perception. These include the sum of absolute transformed differences (SATD), the sum of squared differences (SSD), and rate-distortion optimization.
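A wide motion search with SAD can be sketched as an exhaustive block-matching loop. This is an illustrative sketch only: the names `block_sad` and `full_search`, the `radius` parameter, and the two tiny frames are assumptions made here, and a pure-Python loop shows the logic rather than the speed of a SIMD implementation.

```python
def block_sad(ref, cur, ref_y, ref_x, cur_y, cur_x, size):
    """SAD between a size x size block in `cur` and one in `ref`."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cur_y + dy][cur_x + dx]
                         - ref[ref_y + dy][ref_x + dx])
    return total


def full_search(ref, cur, y, x, size, radius):
    """Return (best_sad, (dy, dx)) for the block at (y, x) in `cur`.

    Every displacement within +/- radius that keeps the candidate block
    inside `ref` is tested; on ties the first candidate found is kept.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ry, rx = y + dy, x + dx
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue  # candidate block would fall outside the frame
            score = block_sad(ref, cur, ry, rx, y, x, size)
            if best is None or score < best[0]:
                best = (score, (dy, dx))
    return best


# Hypothetical reference frame, and a current frame in which the
# content has moved one pixel to the right (left column filled with 0).
ref = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 0, 1, 2],
       [3, 4, 5, 6]]
cur = [[0] + row[:-1] for row in ref]

result = full_search(ref, cur, y=1, x=1, size=2, radius=1)
print(result)  # (0, (0, -1))
```

The returned vector (0, -1) says the 2 by 2 block at row 1, column 1 of the current frame is found one pixel to the left in the reference frame, with a SAD of 0. A refinement stage using SATD, SSD, or rate-distortion optimization would then operate on the candidates that such a search produces.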

