A video coding format (or sometimes video compression format) is a
content representation format for storage or transmission of
digital video content
(such as in a data file or
bitstream). It typically uses a standardized
video compression algorithm, most commonly based on
discrete cosine transform (DCT) coding and
motion compensation. A specific software,
firmware, or hardware implementation capable of compression or decompression to/from a specific video coding format is called a
video codec.
Some video coding formats are documented by a detailed
technical specification document known as a video coding specification. Some such specifications are written and approved by
standardization organizations as
technical standards, and are thus known as a video coding standard. The term 'standard' is also sometimes used for
''de facto'' standards as well as formal standards.
Video content encoded using a particular video coding format is normally bundled with an audio stream (encoded using an
audio coding format) inside a
multimedia container format such as
AVI,
MP4,
FLV,
RealMedia, or
Matroska. As such, the user normally doesn't have an H.264 file, but instead has a .mp4 video file, which is an MP4 container containing H.264-encoded video, normally alongside AAC-encoded audio. Multimedia container formats can contain any one of a number of different video coding formats; for example, the MP4 container format can contain video in either the
MPEG-2 Part 2 or the H.264 video coding format, among others. Another example is the initial specification for the file type
WebM, which specified the container format (Matroska), but also exactly which video (VP8) and audio (Vorbis) compression formats are used inside the Matroska container, even though the Matroska container format itself is capable of containing other video coding formats (
VP9 video and
Opus audio support was later added to the
WebM specification).
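The relationship between a container and the coding formats inside it can be inspected with stream-analysis tools. The following is a minimal illustrative sketch, assuming the ffprobe tool from FFmpeg is installed and using ''movie.mp4'' as a placeholder file name; it lists each stream in a container together with the coding format it uses (for example, H.264 video alongside AAC audio inside an MP4 container).

```python
# Illustrative sketch: list the coding formats inside a container file.
# Assumes ffprobe (part of FFmpeg) is on PATH; "movie.mp4" is a placeholder.
import json
import subprocess

result = subprocess.run(
    [
        "ffprobe", "-v", "error",
        "-show_entries", "stream=index,codec_type,codec_name",
        "-of", "json",
        "movie.mp4",
    ],
    capture_output=True, text=True, check=True,
)

for stream in json.loads(result.stdout)["streams"]:
    # e.g. "0 video h264" and "1 audio aac" for a typical MP4 file
    print(stream["index"], stream["codec_type"], stream["codec_name"])
```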
Distinction between ''format'' and ''codec''
A ''format'' is the layout plan for data produced or consumed by a ''codec''.
Although video coding formats such as H.264 are sometimes referred to as ''codecs'', there is a clear conceptual difference between a specification and its implementations. Video coding formats are described in specifications, and software,
firmware, or hardware to encode/decode data in a given video coding format from/to uncompressed video are implementations of those specifications. As an analogy, the video coding format
H.264 (specification) is to the
codec OpenH264 (specific implementation) what the
C Programming Language (specification) is to the compiler
GCC (specific implementation). Note that for each specification (e.g.
H.264), there can be many codecs implementing that specification (e.g.
x264, OpenH264,
and other H.264/MPEG-4 AVC products and implementations).
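For instance, a single system often ships with several independent decoder implementations of the same format. The sketch below is illustrative only; it assumes the ffmpeg command-line tool is installed and simply lists the H.264 decoders exposed by the local FFmpeg build, which may include the native software decoder as well as hardware-accelerated ones, depending on how FFmpeg was compiled.

```python
# Illustrative sketch: list the H.264 decoder implementations available in a
# local FFmpeg build. Assumes the ffmpeg binary is on PATH; the exact decoder
# names depend on how FFmpeg was built.
import subprocess

output = subprocess.run(
    ["ffmpeg", "-hide_banner", "-decoders"],
    capture_output=True, text=True, check=True,
).stdout

for line in output.splitlines():
    if "h264" in line.lower():
        print(line.strip())
```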
This distinction is not consistently reflected terminologically in the literature. The H.264 specification calls
H.261,
H.262,
H.263, and
H.264 ''video coding standards'' and does not contain the word ''codec''.
The Alliance for Open Media clearly distinguishes between the AV1 video coding format and the accompanying codec they are developing, but calls the video coding format itself a ''video codec specification''. The VP9 specification calls the video coding format VP9 itself a ''codec''.
As an example of conflation, Chromium's and Mozilla's pages listing their video format support both call video coding formats such as H.264 ''codecs''. As another example, in Cisco's announcement of a free-as-in-beer video codec, the press release refers to the H.264 video coding format as a ''codec'' ("choice of a common video codec"), but calls Cisco's implementation of an H.264 encoder/decoder a ''codec'' shortly thereafter ("open-source our H.264 codec").
A video coding format does not dictate all algorithms used by a codec implementing the format. For example, a large part of how video compression typically works is by finding similarities between video frames (block-matching), and then achieving compression by copying previously-coded similar subimages (e.g., macroblocks) and adding small differences when necessary. Finding optimal combinations of such predictors and differences is an NP-hard problem, meaning that it is practically impossible to find an optimal solution. While the video coding format must support such compression across frames in the bitstream format, by not needlessly mandating specific algorithms for finding such block-matches and other encoding steps, the codecs implementing the video coding specification have some freedom to optimize and innovate in their choice of algorithms. For example, section 0.5 of the H.264 specification says that encoding algorithms are not part of the specification. Free choice of algorithm also allows different space–time complexity trade-offs for the same video coding format, so a live feed can use a fast but space-inefficient algorithm, while a one-time DVD encoding for later mass production can trade long encoding time for space-efficient encoding.
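For instance, one straightforward (and far from optimal) block-matching strategy is an exhaustive search that minimizes the sum of absolute differences (SAD) over a small window. The sketch below is illustrative only, using NumPy; real encoders use far more sophisticated and faster search strategies, which is exactly the freedom the specification leaves open.

```python
# Illustrative exhaustive block-matching search using the sum of absolute
# differences (SAD). Real encoders replace this with much faster heuristics;
# the bitstream format does not care how the motion vector was found.
import numpy as np

def find_motion_vector(reference, current, top, left, block=16, radius=8):
    """Return the (dy, dx) offset in `reference` that best matches the
    block x block region of `current` at (top, left)."""
    target = current[top:top + block, left:left + block].astype(np.int32)
    best, best_sad = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > reference.shape[0] or x + block > reference.shape[1]:
                continue
            candidate = reference[y:y + block, x:x + block].astype(np.int32)
            sad = np.abs(target - candidate).sum()
            if sad < best_sad:
                best, best_sad = (dy, dx), sad
    return best, best_sad

rng = np.random.default_rng(1)
ref = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))        # simulate motion
print(find_motion_vector(ref, cur, top=24, left=24))  # expect ((-2, 3), 0)
```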
History
The concept of analog video compression dates back to 1929, when R.D. Kell in Britain proposed the concept of transmitting only the portions of the scene that changed from frame-to-frame. The concept of digital video compression dates back to 1952, when Bell Labs
researchers B.M. Oliver and C.W. Harrison proposed the use of differential pulse-code modulation (DPCM) in video coding. In 1959, the concept of inter-frame motion compensation was proposed by NHK researchers Y. Taki, M. Hatori and S. Tanaka, who proposed predictive inter-frame video coding in the temporal dimension. In 1967, University of London researchers A.H. Robinson and C. Cherry proposed run-length encoding (RLE), a lossless compression scheme, to reduce the transmission bandwidth of analog television signals.
The earliest digital video coding algorithms were either for uncompressed video or used lossless compression, both methods inefficient and impractical for digital video coding. Digital video was introduced in the 1970s, initially using uncompressed pulse-code modulation (PCM), requiring high bitrates around 45–200 Mbit/s for standard-definition (SD) video, which was up to 2,000 times greater than the telecommunication bandwidth (up to 100 kbit/s) available until the 1990s. Similarly, uncompressed high-definition (HD) 1080p video requires bitrates exceeding 1 Gbit/s, significantly greater than the bandwidth available in the 2000s.
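Bitrates of this order follow directly from the raw sampling parameters. The short sketch below is a back-of-the-envelope calculation with illustrative frame sizes, frame rates and bit depths chosen here for demonstration, not taken from any particular standard.

```python
# Back-of-the-envelope uncompressed (PCM) video bitrates. The sampling
# parameters below are illustrative choices, not from any specific standard.
def pcm_bitrate(width, height, fps, bits_per_pixel):
    return width * height * fps * bits_per_pixel  # bits per second

sd = pcm_bitrate(720, 576, 25, 16)    # 8-bit 4:2:2 sampling -> 16 bits/pixel
hd = pcm_bitrate(1920, 1080, 30, 24)  # 8-bit 4:4:4/RGB -> 24 bits/pixel

print(f"SD: {sd / 1e6:.0f} Mbit/s")   # ~166 Mbit/s, within the 45-200 Mbit/s range
print(f"HD: {hd / 1e9:.2f} Gbit/s")   # ~1.49 Gbit/s, i.e. above 1 Gbit/s
```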
Motion-compensated DCT
Practical video compression emerged with the development of motion-compensated DCT (MC DCT) coding, also called block motion compensation (BMC) or DCT motion compensation. This is a hybrid coding algorithm, which combines two key data compression techniques: discrete cosine transform (DCT) coding in the spatial dimension, and predictive motion compensation in the temporal dimension.
DCT coding is a lossy block compression transform coding technique that was first proposed by Nasir Ahmed, who initially intended it for image compression, while he was working at Kansas State University in 1972. It was then developed into a practical image compression algorithm by Ahmed with T. Natarajan and K. R. Rao at the University of Texas in 1973, and was published in 1974.
The other key development was motion-compensated hybrid coding. In 1974, Ali Habibi at the University of Southern California introduced hybrid coding, which combines predictive coding with transform coding. He examined several transform coding techniques, including the DCT, Hadamard transform, Fourier transform, slant transform, and Karhunen-Loeve transform. However, his algorithm was initially limited to intra-frame coding in the spatial dimension. In 1975, John A. Roese and Guner S. Robinson extended Habibi's hybrid coding algorithm to the temporal dimension, using transform coding in the spatial dimension and predictive coding in the temporal dimension, developing inter-frame motion-compensated hybrid coding. For the spatial transform coding, they experimented with different transforms, including the DCT and the fast Fourier transform (FFT), developing inter-frame hybrid coders for them, and found that the DCT is the most efficient due to its reduced complexity, capable of compressing image data down to 0.25-bit per pixel for a videotelephone scene with image quality comparable to a typical intra-frame coder requiring 2-bit per pixel.
The DCT was applied to video encoding by Wen-Hsiung Chen, who developed a fast DCT algorithm with C.H. Smith and S.C. Fralick in 1977, and founded Compression Labs to commercialize DCT technology. In 1979, Anil K. Jain and Jaswant R. Jain further developed motion-compensated DCT video compression. This led to Chen developing a practical video compression algorithm, called motion-compensated DCT or adaptive scene coding, in 1981. Motion-compensated DCT later became the standard coding technique for video compression from the late 1980s onwards.
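The spatial half of the hybrid scheme can be illustrated in a few lines of code. The sketch below is an illustration only, not part of any specification; it assumes NumPy and SciPy are available, applies a 2D DCT to a synthetic 8x8 block of pixel values, coarsely quantizes the coefficients, and reconstructs the block, showing how most of the signal energy concentrates in a few low-frequency coefficients.

```python
# Minimal sketch of block DCT coding, assuming NumPy and SciPy are installed.
# This only illustrates the transform/quantization idea; real formats add
# prediction, entropy coding, and standardized quantization rules.
import numpy as np
from scipy.fft import dctn, idctn

# A smooth synthetic 8x8 luma block (values roughly 0-255), like an image patch.
x, y = np.meshgrid(np.arange(8), np.arange(8))
block = (128 + 40 * np.sin(x / 3) + 30 * np.cos(y / 4)).astype(np.float64)

coeffs = dctn(block, norm="ortho")           # forward 2D DCT
step = 16.0
quantized = np.round(coeffs / step)          # coarse uniform quantization
reconstructed = idctn(quantized * step, norm="ortho")

nonzero = np.count_nonzero(quantized)
error = np.abs(block - reconstructed).mean()
print(f"non-zero coefficients kept: {nonzero}/64")
print(f"mean absolute reconstruction error: {error:.2f}")
```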
Video coding standards
The first digital video coding standard was H.120, developed by the CCITT (now ITU-T) in 1984. H.120 was not usable in practice, as its performance was too poor. H.120 used motion-compensated DPCM coding, a lossless compression algorithm that was inefficient for video coding. During the late 1980s, a number of companies began experimenting with discrete cosine transform (DCT) coding, a much more efficient form of compression for video coding. The CCITT received 14 proposals for DCT-based video compression formats, in contrast to a single proposal based on vector quantization (VQ) compression. The H.261 standard was developed based on motion-compensated DCT compression. H.261 was the first practical video coding standard, and uses patents licensed from a number of companies, including Hitachi, PictureTel, NTT, BT, and Toshiba, among others. Since H.261, motion-compensated DCT compression has been adopted by all the major video coding standards (including the H.26x and MPEG formats) that followed.
MPEG-1, developed by the Motion Picture Experts Group (MPEG), followed in 1991, and it was designed to compress VHS-quality video. It was succeeded in 1994 by MPEG-2/H.262, which was developed with patents licensed from a number of companies, primarily Sony, Thomson and Mitsubishi Electric. MPEG-2 became the standard video format for DVD and SD digital television. Its motion-compensated DCT algorithm was able to achieve a compression ratio of up to 100:1, enabling the development of digital media technologies such as video-on-demand (VOD) and high-definition television (HDTV). In 1999, it was followed by MPEG-4/H.263, which was a major leap forward for video compression technology. It uses patents licensed from a number of companies, primarily Mitsubishi, Hitachi and Panasonic.
The most widely used video coding format is H.264/MPEG-4 AVC. It was developed in 2003, and uses patents licensed from a number of organizations, primarily Panasonic, Godo Kaisha IP Bridge and LG Electronics. In contrast to the standard DCT used by its predecessors, AVC uses the integer DCT. H.264 is one of the video encoding standards for Blu-ray Discs; all Blu-ray Disc players must be able to decode H.264. It is also widely used by streaming internet sources, such as videos from YouTube, Netflix, Vimeo, and the iTunes Store, web software such as the Adobe Flash Player and Microsoft Silverlight, and also various HDTV broadcasts over terrestrial (Advanced Television Systems Committee standards, ISDB-T, DVB-T or DVB-T2), cable (DVB-C), and satellite (DVB-S2).
A main problem for many video coding formats has been patents, making them expensive to use or potentially risking a patent lawsuit due to submarine patents. The motivation behind many recently designed video coding formats such as Theora, VP8 and VP9 has been to create a free (libre) video coding standard covered only by royalty-free patents. Patent status has also been a major point of contention for the choice of which video formats the mainstream web browsers will support inside the HTML5 video tag.
The current-generation video coding format is HEVC (H.265), introduced in 2013. While AVC uses the integer DCT with 4x4 and 8x8 block sizes, HEVC uses integer DCT and DST transforms with varied block sizes between 4x4 and 32x32. HEVC is heavily patented, with the majority of patents belonging to Samsung Electronics, GE, NTT and JVC Kenwood. It is currently being challenged by the aiming-to-be-freely-licensed AV1 format. AVC is by far the most commonly used format for the recording, compression and distribution of video content, used by 91% of video developers, followed by HEVC, which is used by 43% of developers.
List of video coding standards
Lossless, lossy, and uncompressed video coding formats
Consumer video is generally compressed using lossy video codecs, since that results in significantly smaller files than lossless compression. While there are video coding formats designed explicitly for either lossy or lossless compression, some video coding formats such as Dirac and H.264 support both.
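For example, the same H.264 format can carry both lossy and lossless video. The sketch below is illustrative only and assumes FFmpeg with the libx264 encoder is installed, with ''input.mp4'' as a placeholder source file; with libx264, a constant rate factor (CRF) of 0 requests lossless encoding, while the default CRF of 23 is lossy.

```python
# Illustrative sketch: the same H.264 coding format used both losslessly and
# lossily via FFmpeg's libx264 encoder. Assumes ffmpeg is on PATH and that
# "input.mp4" is a placeholder for a real source file.
import subprocess

source = "input.mp4"  # hypothetical input file

# With libx264, a constant rate factor (CRF) of 0 requests lossless encoding.
subprocess.run(
    ["ffmpeg", "-y", "-i", source, "-c:v", "libx264", "-crf", "0", "lossless.mkv"],
    check=True,
)

# The libx264 default, CRF 23, produces a much smaller lossy file.
subprocess.run(
    ["ffmpeg", "-y", "-i", source, "-c:v", "libx264", "-crf", "23", "lossy.mp4"],
    check=True,
)
```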
Uncompressed video formats, such as ''Clean HDMI'', are a form of lossless video used in some circumstances, such as when sending video to a display over an HDMI connection. Some high-end cameras can also capture video directly in this format.
Intra-frame video coding formats
Interframe compression complicates editing of an encoded video sequence.
One subclass of relatively simple video coding formats is the intra-frame video formats, such as DV, in which each frame of the video stream is compressed independently without referring to other frames in the stream, and no attempt is made to take advantage of correlations between successive pictures over time for better compression. One example is Motion JPEG, which is simply a sequence of individually JPEG-compressed images. This approach is quick and simple, at the expense of the encoded video being much larger than a video coding format supporting inter-frame coding.
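To make the idea concrete, the sketch below (illustrative only, assuming NumPy and the Pillow imaging library are available, and using synthetic frames) compresses each frame of a short clip independently as a JPEG image, which is essentially what a Motion JPEG encoder does; no frame refers to any other frame.

```python
# Minimal Motion-JPEG-style intra-frame "encoder": every frame is compressed
# on its own as a JPEG image, with no reference to neighbouring frames.
# Assumes NumPy and Pillow are installed; the frames here are synthetic noise.
import io
import numpy as np
from PIL import Image

rng = np.random.default_rng(42)
frames = [rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8) for _ in range(10)]

encoded = []
for frame in frames:
    buf = io.BytesIO()
    Image.fromarray(frame).save(buf, format="JPEG", quality=75)
    encoded.append(buf.getvalue())  # each frame is an independent JPEG

# Any frame can be decoded (or cut out) without touching the others.
middle = Image.open(io.BytesIO(encoded[5]))
print(f"{len(encoded)} frames, {sum(len(e) for e in encoded)} bytes total")
print("decoded frame size:", middle.size)
```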
Because interframe compression copies data from one frame to another, if the original frame is simply cut out (or lost in transmission), the following frames cannot be reconstructed properly. Making 'cuts' in intraframe-compressed video while video editing is almost as easy as editing uncompressed video: one finds the beginning and ending of each frame, and simply copies bit-for-bit each frame that one wants to keep, and discards the frames one doesn't want. Another difference between intraframe and interframe compression is that, with intraframe systems, each frame uses a similar amount of data. In most interframe systems, certain frames (such as "I frames" in MPEG-2) aren't allowed to copy data from other frames, so they require much more data than other frames nearby.
It is possible to build a computer-based video editor that spots problems caused when I frames are edited out while other frames need them. This has allowed newer formats like HDV to be used for editing. However, this process demands a lot more computing power than editing intraframe-compressed video with the same picture quality. This type of compression is also not very effective for audio formats.
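The uneven frame sizes described above can be observed directly. The following sketch is an illustration, assuming the ffprobe tool from FFmpeg is installed and using ''clip.mp4'' as a placeholder for a file encoded with an interframe format; it lists the picture type and compressed size of each video frame, where I frames are typically much larger than P or B frames.

```python
# Illustrative sketch: inspect frame types and sizes in an interframe-coded
# file. Assumes ffprobe (from FFmpeg) is on PATH; "clip.mp4" is a placeholder.
import json
import subprocess

result = subprocess.run(
    [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_frames",
        "-show_entries", "frame=pict_type,pkt_size",
        "-of", "json",
        "clip.mp4",
    ],
    capture_output=True, text=True, check=True,
)

for frame in json.loads(result.stdout)["frames"][:30]:
    # I frames carry a full picture; P and B frames mostly carry differences.
    print(frame.get("pict_type"), frame.get("pkt_size"), "bytes")
```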
Profiles and levels
A video coding format can define optional restrictions to encoded video, called profiles and levels. It is possible to have a decoder which only supports decoding a subset of profiles and levels of a given video format, for example to make the decoder program/hardware smaller, simpler, or faster.
A ''profile'' restricts which encoding techniques are allowed. For example, the H.264 format includes the profiles ''baseline'', ''main'' and ''high'' (and others). While P-slices (which can be predicted based on preceding slices) are supported in all profiles, B-slices (which can be predicted based on both preceding and following slices) are supported in the ''main'' and ''high'' profiles but not in ''baseline''.
A ''level'' is a restriction on parameters such as maximum resolution and data rates.
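On the web, the profile and level of an H.264 stream are often signalled in an RFC 6381 ''codecs'' string such as ''avc1.64001F'', where the first byte after "avc1." is the profile indicator and the last byte is ten times the level number. The sketch below is a simplified, illustrative parser for such strings; the profile table covers only a few common values and ignores special cases.

```python
# Simplified, illustrative parser for H.264 "avc1.PPCCLL" codec strings
# (RFC 6381). PP is the profile_idc, CC the constraint flags, LL the
# level_idc (ten times the level number). Only a few common profiles are
# mapped here; this is a sketch, not a complete implementation.
PROFILES = {66: "Baseline", 77: "Main", 88: "Extended", 100: "High"}

def parse_avc1(codec_string: str) -> tuple[str, float]:
    prefix, hex_part = codec_string.split(".")
    assert prefix == "avc1" and len(hex_part) == 6
    profile_idc = int(hex_part[0:2], 16)
    level_idc = int(hex_part[4:6], 16)
    profile = PROFILES.get(profile_idc, f"profile_idc {profile_idc}")
    return profile, level_idc / 10

print(parse_avc1("avc1.42E01E"))  # ('Baseline', 3.0)
print(parse_avc1("avc1.64001F"))  # ('High', 3.1)
```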
See also
* Comparison of video container formats
* Data compression#Video
* Display resolution
* List of video compression formats
* Video file format