MPEG-7

MPEG-7 is a multimedia content description standard. It was standardized in ISO/IEC 15938 (Multimedia content description interface). The description is associated with the content itself, to allow fast and efficient searching for material that is of interest to the user. MPEG-7 is formally called ''Multimedia Content Description Interface''. Thus, it is ''not'' a standard that deals with the actual encoding of moving pictures and audio, like MPEG-1, MPEG-2 and MPEG-4. It uses XML to store metadata, and descriptions can be attached to timecode in order to tag particular events or, for example, to synchronise lyrics to a song.

It was designed to standardize:
* a set of Description Schemes ("DS") and Descriptors ("D")
* a language to specify these schemes, called the Description Definition Language ("DDL")
* a scheme for coding the description

The combination of MPEG-4 and MPEG-7 has been sometimes referred to as MPEG-47.
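As a rough illustration of attaching XML metadata to timecode, the following Python sketch builds a small XML document that pairs lyric lines with time points and looks up the line active at a given moment. The element names (`LyricsTrack`, `Line`) and the `start` attribute are hypothetical and chosen for readability; they are not taken from the MPEG-7 schema, which defines its own, much richer vocabulary.

```python
import xml.etree.ElementTree as ET


def format_timecode(seconds):
    """Format a whole number of seconds as a zero-padded HH:MM:SS string."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def build_lyrics_description(song_title, timed_lines):
    """Return an XML element that pairs each lyric line with a timecode.

    `timed_lines` is a list of (start_seconds, text) tuples.
    All element and attribute names are illustrative assumptions,
    not names defined by ISO/IEC 15938.
    """
    root = ET.Element("LyricsTrack", {"song": song_title})
    # Sort by start time so that a player can walk the list in order.
    for start_seconds, text in sorted(timed_lines):
        line = ET.SubElement(root, "Line", {"start": format_timecode(start_seconds)})
        line.text = text
    return root


def line_at(root, seconds):
    """Return the lyric line active at `seconds`, or None before the first line."""
    target = format_timecode(seconds)
    current = None
    for line in root.findall("Line"):
        # Zero-padded HH:MM:SS strings compare in chronological order
        # (valid for times below 100 hours).
        if line.get("start") <= target:
            current = line.text
    return current


track = build_lyrics_description(
    "Example Song", [(75, "second line"), (12, "first line")]
)
print(ET.tostring(track, encoding="unicode"))
```

The description lives in its own XML tree and only refers to positions in the song, which is the sense in which the metadata is "about" the content rather than part of its encoding.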


Introduction

MPEG-7 is intended to provide complementary functionality to the previous MPEG standards, representing information about the content, not the content itself ("the bits about the bits"). This functionality is the standardization of multimedia content descriptions. MPEG-7 can be used independently of the other MPEG standards; the description might even be attached to an analog movie. The representation that is defined within MPEG-4, i.e. the representation of audio-visual data in terms of objects, is however very well suited to what will be built on the MPEG-7 standard. This representation is basic to the process of categorization. In addition, MPEG-7 descriptions could be used to improve the functionality of previous MPEG standards.

With these tools, we can build an MPEG-7 Description and deploy it. According to the requirements document, "a Description consists of a Description Scheme (structure) and the set of Descriptor Values (instantiations) that describe the Data." A Descriptor Value is "an instantiation of a Descriptor for a given data set (or subset thereof)." The Descriptor is the syntactic and semantic definition of the content. Extraction algorithms are outside the scope of the standard because their standardization is not required to allow interoperability.
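The quoted definitions can be made concrete with a minimal Python sketch. It models a Descriptor as a definition (name, value type, meaning), a Descriptor Value as one instantiation of that definition, and a Description as a scheme plus the values that fill it. The class layout and the example descriptors (`Title`, `DurationSeconds`) are assumptions made for illustration; the standard itself expresses these concepts in its XML-based DDL, not in Python classes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Descriptor:
    """Syntactic (value_type) and semantic (meaning) definition of a feature."""
    name: str
    value_type: type
    meaning: str


@dataclass(frozen=True)
class DescriptorValue:
    """An instantiation of a Descriptor for a given data set."""
    descriptor: Descriptor
    value: object

    def __post_init__(self):
        # Enforce the syntactic part of the Descriptor definition.
        if not isinstance(self.value, self.descriptor.value_type):
            raise TypeError(
                f"{self.descriptor.name} expects {self.descriptor.value_type.__name__}"
            )


@dataclass
class Description:
    """A scheme (which descriptors are expected) plus the values that fill it."""
    scheme: tuple  # names of the descriptors this hypothetical scheme expects
    values: list = field(default_factory=list)

    def add(self, descriptor_value):
        if descriptor_value.descriptor.name not in self.scheme:
            raise ValueError(f"{descriptor_value.descriptor.name} is not in the scheme")
        self.values.append(descriptor_value)

    def is_complete(self):
        """True once every descriptor named by the scheme has a value."""
        return {v.descriptor.name for v in self.values} == set(self.scheme)


# Hypothetical descriptors for a movie.
title = Descriptor("Title", str, "Name under which the work was released")
duration = Descriptor("DurationSeconds", int, "Running time in seconds")

description = Description(scheme=("Title", "DurationSeconds"))
description.add(DescriptorValue(title, "Example Movie"))
description.add(DescriptorValue(duration, 5400))
```

Note that nothing in this sketch says how the values were obtained: whether `5400` came from a human annotator or an extraction algorithm makes no difference to a consumer of the description.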


Parts

MPEG-7 (ISO/IEC 15938) consists of several parts, each covering a certain aspect of the whole specification.


Relation between description and content

An MPEG-7 architecture requirement is that the description must be separate from the audiovisual content. On the other hand, there must be a relation between the content and the description. Thus the description is multiplexed with the content itself. The accompanying figure shows this relation between description and content.
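The idea of keeping the description separate while still carrying it alongside the content can be sketched as a toy multiplexer in Python. The tagged-tuple stream format and the tag names `"description"` and `"content"` are assumptions made for this example; the actual transport and binary formats defined by the MPEG-7 systems tools are considerably more elaborate.

```python
def multiplex(content_chunks, description_chunks):
    """Interleave content and description chunks into one tagged stream.

    Each stream item is a (tag, payload) tuple, so the two kinds of data
    travel together but remain distinguishable.
    """
    stream = []
    for i in range(max(len(content_chunks), len(description_chunks))):
        # Send the description for a segment just before the segment itself.
        if i < len(description_chunks):
            stream.append(("description", description_chunks[i]))
        if i < len(content_chunks):
            stream.append(("content", content_chunks[i]))
    return stream


def demultiplex(stream):
    """Split a tagged stream back into (content_chunks, description_chunks)."""
    content = [payload for tag, payload in stream if tag == "content"]
    description = [payload for tag, payload in stream if tag == "description"]
    return content, description


video = [b"frame-group-0", b"frame-group-1"]
notes = ["<Segment>opening titles</Segment>", "<Segment>first scene</Segment>"]

stream = multiplex(video, notes)
recovered_video, recovered_notes = demultiplex(stream)
```

Because each chunk keeps its tag, a receiver that only wants to search the descriptions can skip the content payloads entirely, and a player that ignores metadata can do the reverse.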


MPEG-7 tools

MPEG-7 uses the following tools:
* Descriptor (D): A representation of a feature, defined syntactically and semantically. A single object may be described by several descriptors.
* Description Schemes (DS): Specify the structure and semantics of the relations between their components; these components can be descriptors (D) or description schemes (DS).
* Description Definition Language (DDL): An XML-based language used to define the structural relations between descriptors. It allows the creation and modification of description schemes and also the creation of new descriptors (D).
* System tools: These tools deal with binarization, synchronization, transport and storage of descriptors. They also deal with intellectual property protection.

The accompanying figure shows the relation between the MPEG-7 tools.
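The recursive relation between descriptors and description schemes (a DS may contain both Ds and further DSs) can be sketched in Python as a small tree that serializes to XML. The class names `D` and `DS` simply mirror the abbreviations above, and the element names in the example (`Segment`, `Keyword`, `Timing`, `Start`, `Duration`) are hypothetical; real MPEG-7 types are declared in the DDL and are not reproduced here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class D:
    """A descriptor holding one feature value (leaf of the tree)."""
    name: str
    value: str


@dataclass
class DS:
    """A description scheme: a named structure of D and nested DS components."""
    name: str
    components: list = field(default_factory=list)


def to_xml(node):
    """Serialize a D or DS (recursively) to an ElementTree element."""
    element = ET.Element(node.name)
    if isinstance(node, D):
        element.text = node.value
    else:
        for component in node.components:
            element.append(to_xml(component))
    return element


def count_descriptors(node):
    """Count the descriptor leaves reachable from a D or DS."""
    if isinstance(node, D):
        return 1
    return sum(count_descriptors(component) for component in node.components)


# Hypothetical scheme: a video segment with a keyword and nested timing data.
segment = DS("Segment", [
    D("Keyword", "goal"),
    DS("Timing", [D("Start", "00:12:03"), D("Duration", "00:00:08")]),
])

print(ET.tostring(to_xml(segment), encoding="unicode"))
```

The textual XML produced here corresponds to the human-readable form of a description; the binarization handled by the system tools is not modelled.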


MPEG-7 applications

There are many applications and application domains which will benefit from the MPEG-7 standard. A few application examples are:
* Digital library: Image/video catalogue, musical dictionary.
* Multimedia directory services: e.g. yellow pages.
* Broadcast media selection: Radio channel, TV channel.
* Multimedia editing: Personalized electronic news service, media authoring.
* Security services: Traffic control, production chains...
* E-business: Searching process of products.
* Cultural services: Art-galleries, museums...
* Educational applications.
* Biomedical applications.
* Intelligent multimedia applications that leverage low-level multimedia semantics via formal representation and automated reasoning.


Software and demonstrators for MPEG-7


Caliph & Emir
Annotation and retrieval of images based on MPEG-7 (GPL). Creates MPEG-7 XML files. Lux, Mathias. "Caliph & Emir: MPEG-7 photo annotation and retrieval." Proceedings of the 17th ACM international conference on Multimedia. ACM, 2009.
C# Implementation
Open Source implementation of the MPEG-7 descriptors in C#.
Frameline 47 Video Notation
Frameline 47 from Versatile Delivery Systems. The first commercial MPEG-7 application, Frameline 47 uses an advanced content schema based on MPEG-7 so as to be able to notate entire video files, or segments and groups of segments from within that video file according to the MPEG-7 convention (commercial tool)
Eptascape ADS200
Uses a real-time MPEG-7 encoder on an analog camera video signal to identify interesting events, especially in surveillance applications (commercial tool)
IBM VideoAnnEx Annotation Tool
Creating MPEG-7 documents for video streams describing structure and giving keywords from a controlled vocabulary (binary release, restrictive license)
iFinder Medienanalyse- und Retrievalsystem
Metadata extraction and search engine based on MPEG-7 (commercial tool)
MPEG-7 Audio Encoder
Creating MPEG-7 documents for audio documents describing low level audio characteristics (binary & source release, Java, GPL)
MPEG-7 Visual Descriptor Extraction
Software to extract MPEG-7 visual descriptors from images and image regions.
XM Feature Extraction Web Service
The functionalities of the eXperimentation Model (XM) are made available via web service interface to enable automatic MPEG-7 low-level visual description characterization of images.
TU Berlin MPEG-7 Audio Analyzer
(Web-Demo): Creating MPEG-7 documents (XML) for audio documents (WAV, MP3). All 17 MPEG-7 low level audio descriptors are implemented (commercial)
TU Berlin MPEG-7 Spoken Content Demonstrator
(Web-Demo): Creating MPEG-7 documents (XML) with SpokenContent description from an input speech signal (WAV, MP3) (commercial)
MP7JRS C++ Library
Complete MPEG-7 implementation of part 3, 4 and 5 (visual, audio and MDS) by Joanneum Research Institute for Information and Communication Technologies - Audiovisual Media Group.
BilVideo-7
MPEG-7 compatible, distributed video indexing and retrieval system, supporting complex, multimodal, composite queries; developed by the Bilkent University Multimedia Database Group (BILMDG).
UniSay
Sophisticated post-production file analysis and audio processing based on MPEG-7.


See also

* Exif
* ID3
* Metadata standards
* MPEG-4 Part 11 – Scene description and application engine
* Multimedia information retrieval
* Query by humming


Limitations

The MPEG-7 standard was originally written in XML Schema (XSD), which constitutes semi-structured data. For example, the running time of a movie annotated using MPEG-7 in XML is machine-readable data, so software agents will know that the number expressing the running time is a positive integer, but such data is not machine-interpretable (it cannot be understood by agents), because it does not convey semantics (meaning). This problem is known as the "Semantic Gap". To address this issue, there were many attempts to map the MPEG-7 XML Schema to the Web Ontology Language (OWL), which is a structured data equivalent of the terms of the MPEG-7 standard (MPEG-7Ontos, COMM, SWIntO, etc.). However, these mappings did not really bridge the "Semantic Gap", because low-level video features alone are inadequate for representing video semantics. In other words, annotating an automatically extracted video feature, such as color distribution, does not provide the meaning of the actual visual content.
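A small Python sketch illustrates why a low-level feature such as color distribution does not capture meaning. It computes a plain color histogram for two tiny "images" whose content differs but whose color counts are identical. The histogram function is a deliberately simplified stand-in and is not one of the color descriptors defined by MPEG-7; the images and color names are invented for the example.

```python
from collections import Counter


def color_histogram(pixels):
    """Count how often each color occurs; pixel positions are ignored."""
    return Counter(pixels)


# Two hypothetical 2x2 images, given as row-major lists of color names.
sky_over_grass = ["blue", "blue", "green", "green"]   # top row blue, bottom row green
grass_over_sky = ["green", "green", "blue", "blue"]   # the same scene upside down

same_feature = color_histogram(sky_over_grass) == color_histogram(grass_over_sky)
same_image = sky_over_grass == grass_over_sky

print("identical histograms:", same_feature)
print("identical images:", same_image)
```

An agent that sees only the histogram can tell that half of each image is blue and half is green, but it cannot tell which image shows a landscape the right way up, let alone that the blue region is sky. Expressing the same histogram in OWL rather than XML does not change that.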


Compare

* Material Exchange Format (MXF), a container format for professional digital video and audio media defined by SMPTE.




External links


MPEG-7 Overview

MPEG-7/-21 Community Portal