Scalable Video Coding

Scalable Video Coding (SVC) is the name for the Annex G extension of the H.264/MPEG-4 AVC video compression standard. SVC standardizes the encoding of a high-quality video bitstream that also contains one or more subset bitstreams (a form of layered coding). A subset video bitstream is derived by dropping packets from the larger bitstream, which reduces the bandwidth required for the subset bitstream. The subset bitstream can represent a lower spatial resolution (smaller screen), a lower temporal resolution (lower frame rate), or a lower-quality video signal. H.264/MPEG-4 AVC was developed jointly by ITU-T and ISO/IEC JTC 1. These two groups created the Joint Video Team (JVT) to develop the H.264/MPEG-4 AVC standard.


Overview

The objective of the SVC standardization has been to enable the encoding of a high-quality video bitstream that contains one or more subset bitstreams, each of which can itself be decoded with a complexity and reconstruction quality similar to that achieved using the existing H.264/MPEG-4 AVC design with the same quantity of data as in the subset bitstream. The subset bitstream is derived by dropping packets from the larger bitstream. A subset bitstream can represent a lower spatial resolution (smaller screen), a lower temporal resolution (lower frame rate), or a lower-quality video signal (each separately or in combination) compared to the bitstream it is derived from. The following modalities are possible:

* Temporal (frame rate) scalability: the motion compensation dependencies are structured so that complete pictures (i.e. their associated packets) can be dropped from the bitstream. Temporal scalability is already enabled by H.264/MPEG-4 AVC; SVC has only provided supplemental enhancement information to improve its usage. It is also available in some other formats, such as VP8 (P. Westin, H. Lundin, M. Glover, J. Uberti, F. Galligan, "RTP Payload Format for VP8 Video", IETF draft).
* Spatial (picture size) scalability: video is coded at multiple spatial resolutions. The data and decoded samples of lower resolutions can be used to predict data or samples of higher resolutions in order to reduce the bit rate needed to code the higher resolutions.
* SNR/Quality/Fidelity scalability: video is coded at a single spatial resolution but at different qualities. The data and decoded samples of lower qualities can be used to predict data or samples of higher qualities in order to reduce the bit rate needed to code the higher qualities.
* Combined scalability: a combination of the three scalability modalities described above.

SVC enables forward compatibility for older hardware: the same bitstream can be consumed by basic hardware that can only decode a low-resolution subset (e.g. 720p or 1080i), while more advanced hardware will be able to decode the high-quality video stream (1080p).
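
The idea of deriving a subset bitstream by dropping packets can be illustrated with a minimal sketch. It is not the SVC bitstream syntax: the `Packet` class, its three layer fields, `build_demo_stream`, and the dyadic temporal hierarchy used by `temporal_layer` are all assumptions made for illustration. Each packet is tagged with a spatial, temporal, and quality layer, and a subset is obtained by keeping only packets at or below a chosen layer in every dimension.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified stand-in for one coded packet of a layered stream."""
    frame: int     # index of the picture this packet belongs to
    spatial: int   # 0 = base resolution, higher = larger picture
    temporal: int  # 0 = lowest frame rate, higher = extra in-between pictures
    quality: int   # 0 = base fidelity, higher = refinement data


def temporal_layer(frame: int, gop_size: int = 4) -> int:
    """Assign a temporal layer, assuming a dyadic hierarchy (power-of-two gop_size)."""
    pos = frame % gop_size
    if pos == 0:
        return 0  # pictures at group boundaries form the lowest frame rate
    levels = gop_size.bit_length() - 1              # log2(gop_size)
    trailing_zeros = (pos & -pos).bit_length() - 1  # how "even" the position is
    return levels - trailing_zeros


def extract_subset(packets: Iterable[Packet], max_spatial: int,
                   max_temporal: int, max_quality: int) -> List[Packet]:
    """Drop every packet above the requested layer in any dimension."""
    return [p for p in packets
            if p.spatial <= max_spatial
            and p.temporal <= max_temporal
            and p.quality <= max_quality]


def build_demo_stream(frames: int = 8) -> List[Packet]:
    """Illustrative stream: 2 spatial layers x 2 quality layers per frame."""
    return [Packet(f, s, temporal_layer(f), q)
            for f in range(frames)
            for s in (0, 1)
            for q in (0, 1)]
```

With the demo stream, `extract_subset(stream, 0, 1, 0)` keeps only the base-resolution, base-quality packets of frames 0, 2, 4 and 6, i.e. half the frame rate at the smallest picture size. Lower layers are always retained because, as described above, higher layers are predicted from them.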


Background and applications

Bit-stream scalability for video is a desirable feature for many multimedia applications. The need for scalability arises from the requirement for graceful degradation during transmission, or from the need to adapt to different spatial formats, bit rates, or power budgets. To fulfill these requirements, it is beneficial for video to be simultaneously transmitted or stored with a variety of spatial or temporal resolutions or qualities, which is the purpose of video bit-stream scalability. Traditional digital video transmission and storage systems are based on H.222.0/MPEG-2 TS systems for broadcasting services over satellite, cable, and terrestrial transmission channels, and for DVD storage, or on H.320 for conversational video conferencing services. These channels are typically characterized by a fixed spatio-temporal format of the video signal (SDTV or HDTV, or CIF for H.320 video telephony). The application behavior in such systems typically falls into one of two categories: it works or it doesn't work.

Modern video transmission and storage systems using the Internet and mobile networks are typically based on RTP/IP (Real-time Transport Protocol over IP) for real-time services (conversational and streaming) and on computer file formats such as mp4 or 3gp. Most RTP/IP access networks are characterized by a wide range of connection qualities and receiving devices. The varying connection quality results from the adaptive resource-sharing mechanisms of these networks, which address the time-varying data throughput requirements of a varying number of users. The variety of devices with different capabilities, ranging from cell phones with small screens and restricted processing power to high-end PCs with high-definition displays, results from the continuous evolution of these endpoints. Scalable video coding (SVC) is one solution to the problems posed by the characteristics of modern video transmission systems. The following video applications can benefit from SVC:

* Streaming
* Conferencing
* Surveillance
* Broadcast
* Storage
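
The adaptation problem described above can be sketched as choosing, per receiver, the best subset bitstream that fits both the connection and the device. The sketch below is an illustration only: the `OperatingPoint` class, the `pick_operating_point` function, and all resolutions and bit rates in `LADDER` are invented for the example, and "best" is assumed to mean the highest cumulative bit rate that still fits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OperatingPoint:
    """Hypothetical description of one decodable subset bitstream."""
    width: int
    height: int
    fps: float
    kbps: int  # cumulative bit rate, including all lower layers it depends on


# Illustrative ladder; the numbers are assumptions, not taken from any standard.
LADDER: List[OperatingPoint] = [
    OperatingPoint(320, 180, 15.0, 200),
    OperatingPoint(640, 360, 30.0, 600),
    OperatingPoint(1280, 720, 30.0, 1800),
    OperatingPoint(1920, 1080, 30.0, 4000),
]


def pick_operating_point(points: List[OperatingPoint], max_kbps: int,
                         max_width: int, max_height: int) -> Optional[OperatingPoint]:
    """Return the highest-rate point that fits the link and the display, if any."""
    fitting = [p for p in points
               if p.kbps <= max_kbps
               and p.width <= max_width
               and p.height <= max_height]
    # Bit rate is used as a simple proxy for quality (an assumption of this sketch).
    return max(fitting, key=lambda p: p.kbps, default=None)
```

A phone with a 1280x720 screen on an 800 kbit/s link would receive the 640x360 point, while a PC with a full-HD display and 5000 kbit/s would receive the full stream. Because all points come from one scalable bitstream, a network node can serve both receivers by dropping packets rather than re-encoding.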


History and timeline

* October 2003: The Moving Picture Experts Group (MPEG) issued a call for proposals on SVC technology.
* April 2004: Fourteen proposals were submitted; twelve were based on wavelet compression, and two were extensions of H.264/MPEG-4 AVC.
* October 2004: The proposal made by the image communication group of the Heinrich-Hertz-Institute (HHI) was chosen by MPEG as the starting point of its SVC standardization project.
* January 2005: MPEG and the Video Coding Experts Group (VCEG) agreed to standardize the SVC project as an amendment of the H.264/MPEG-4 AVC standard.
* July 2007: The SVC project received final approval.


Profiles and levels

As a result of the Scalable Video Coding extension, the standard contains five additional scalable profiles: Scalable Baseline, Scalable High, Scalable High Intra, Scalable Constrained Baseline and Scalable Constrained High Profile. These profiles are defined as a combination of the H.264/MPEG-4 AVC profile for the base layer (indicated by the second word of the scalable profile name) and the tools that achieve the scalable extension:

* Scalable Baseline Profile: Mainly targeted at conversational, mobile, and surveillance applications.
** A bitstream conforming to the Scalable Baseline profile contains a base layer bitstream that conforms to a restricted version of the Baseline profile of H.264/MPEG-4 AVC.
** Enhancement layers support B slices, weighted prediction, CABAC entropy coding, and the 8×8 luma transform (CABAC and the 8×8 transform are only supported for certain levels), although the base layer has to conform to the restricted Baseline profile, which does not support these tools. Coding tools for interlaced sources are not included.
** Spatial scalable coding is restricted to resolution ratios of 1.5 and 2 between successive spatial layers in both the horizontal and vertical directions, and to macroblock-aligned cropping.
** Quality and temporal scalable coding are supported without any restriction.
* Scalable High Profile: Primarily designed for broadcast, streaming, storage and videoconferencing applications.
** A bitstream conforming to the Scalable High profile contains a base layer bitstream that conforms to the High profile of H.264/MPEG-4 AVC.
** Supports all tools specified in the Scalable Video Coding extension.
** Spatial scalable coding is supported without any restriction, i.e., with arbitrary resolution ratios and cropping parameters.
** Quality and temporal scalable coding are supported without any restriction.
* Scalable High Intra Profile: Mainly designed for professional applications.
** Uses Instantaneous Decoder Refresh (IDR) pictures only. IDR pictures can be decoded without reference to previous frames.
** A bitstream conforming to the Scalable High Intra profile contains a base layer bitstream that conforms to the High profile of H.264/MPEG-4 AVC with only IDR pictures allowed.
** All scalability tools are allowed as in the Scalable High profile, but only IDR pictures are permitted in any layer.
* Scalable Constrained Baseline Profile
* Scalable Constrained High Profile


See also

* Adam7 algorithm, used in PNG interlacing
* Bitrate peeling
* Hierarchical modulation
* JPEG 2000
* Scalability
* MPEG-5 Part 2 / Low Complexity Enhancement Video Coding / LC EVC
* AV1 § Scalable video coding
* HEVC Scalability Extensions


External links


Introduction and overview


Overview paper on SVC by H. Schwarz, D. Marpe, and T. Wiegand (Wayback Machine copy)
MPEG - Technologies - Overview of Scalable Video Coding (chiariglione.org)


Standardization committee


JVT document archive site


Miscellaneous


Open SVC decoder: Open implementation of the SVC standard.

Scalable Video Coding Toolchain: A toolchain for using Scalable Video Coding with Dynamic Adaptive Streaming over HTTP

Scalable Video Coding DASH Dataset: A publicly available dataset with SVC videos