Semantic audio is the extraction of meaning from audio signals. The field of semantic audio is primarily concerned with analysing audio to create meaningful metadata, which can then be used in a variety of ways.
Semantic Analysis
Semantic analysis of audio is performed to reveal a deeper understanding of an audio signal. This typically results in high-level metadata descriptors, such as musical chords and tempo or the identity of the person speaking, which facilitate content-based management of audio recordings. In recent years, the use of automatic data analysis techniques has grown considerably, including:
* Music Information Retrieval
* Sound recognition
* Speech segmentation
* Automatic music transcription
* Blind source separation
* Musical similarity
* Audio indexing, hashing, searching
* Broadcast Monitoring
* Musical performance analysis
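As an illustration of how a high-level descriptor such as tempo might be derived, the following is a minimal sketch that estimates beats per minute by autocorrelating an onset-strength envelope. The function name `estimate_tempo`, its parameters, and the synthetic envelope are assumptions made for this example; they are not taken from any particular library, and real systems typically compute the envelope from the audio signal first.

```python
def estimate_tempo(envelope, frame_rate, min_bpm=60.0, max_bpm=180.0):
    """Estimate tempo in BPM from an onset-strength envelope (hypothetical helper).

    envelope   -- list of non-negative onset strengths, one per frame
    frame_rate -- number of envelope frames per second
    """
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [x - mean for x in envelope]

    # Convert the allowed tempo range into a range of lags (in frames).
    min_lag = max(1, int(round(frame_rate * 60.0 / max_bpm)))
    max_lag = min(n - 1, int(round(frame_rate * 60.0 / min_bpm)))
    if min_lag > max_lag:
        raise ValueError("envelope too short for the requested tempo range")

    # Pick the lag whose autocorrelation is strongest.
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    return 60.0 * frame_rate / best_lag


# Synthetic envelope: one pulse every 0.5 s at 100 frames per second.
frame_rate = 100
envelope = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
tempo = estimate_tempo(envelope, frame_rate)
print(tempo)
```

Restricting the search to a tempo range is a simple way to reduce octave errors (reporting half or double the perceived tempo), although it does not eliminate them.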
Applications
Applications have been developed that use this semantic information to support the user in identifying, organising, and exploring audio signals, and in interacting with them. These applications include music information retrieval, semantic web technologies, audio production, sound reproduction, education, and gaming. Semantic technology involves some understanding of the meaning of the information it deals with, and to this end may incorporate machine learning, digital signal processing, speech processing, source separation, perceptual models of hearing, musicological knowledge, metadata, and ontologies.
Aside from audio retrieval and recommendation technologies, the semantics of audio signals are also becoming increasingly important in, for instance, object-based audio coding and intelligent audio editing and processing. Recent product releases already demonstrate this to a great extent; however, more innovative functionalities relying on semantic audio analysis and management are imminent. These functionalities may utilise, for instance, (informed) audio source separation, speaker segmentation and identification, structural music segmentation, or social and Semantic Web technologies, including ontologies and linked open data.
Speech recognition is an important semantic audio application. For speech, other semantic operations include language identification, speaker identification, and gender identification. For more general audio or music, semantic operations include identifying a piece of music (e.g. Shazam) or a movie soundtrack.
Areas of research in semantic audio include the ability to label an audio waveform with where the harmonies change and what they are, where material is repeated, and what instruments are playing.
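To make the idea of labelling harmony changes concrete, here is a minimal sketch that assigns a chord label to each frame by matching it against binary major and minor triad templates, and then records only the points where the label changes. It assumes the audio has already been reduced to 12-dimensional chroma vectors (one energy value per pitch class); the function names, the label format, and the input data are hypothetical and chosen only for this example. Template matching is one simple approach among many, not a description of any specific system.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chord_templates():
    """Build binary 12-bin templates for the 24 major and minor triads."""
    templates = {}
    for root in range(12):
        for quality, third in (("maj", 4), ("min", 3)):
            template = [0.0] * 12
            for interval in (0, third, 7):  # root, third, fifth
                template[(root + interval) % 12] = 1.0
            templates[f"{NOTE_NAMES[root]}:{quality}"] = template
    return templates


def label_chords(chroma_frames, frame_duration):
    """Return (start_time, chord) pairs, one per harmony change.

    chroma_frames  -- list of 12-element lists (assumed precomputed)
    frame_duration -- seconds covered by each frame
    """
    templates = chord_templates()
    segments = []
    for index, frame in enumerate(chroma_frames):
        # All templates have the same norm, so the dot product ranks them fairly.
        best = max(
            templates,
            key=lambda name: sum(a * b for a, b in zip(frame, templates[name])),
        )
        if not segments or segments[-1][1] != best:
            segments.append((index * frame_duration, best))
    return segments


# Idealised input: three frames of C major followed by two frames of A minor.
c_major = [1.0, 0, 0, 0, 1.0, 0, 0, 1.0, 0, 0, 0, 0]
a_minor = [1.0, 0, 0, 0, 1.0, 0, 0, 0, 0, 1.0, 0, 0]
segments = label_chords([c_major] * 3 + [a_minor] * 2, frame_duration=0.5)
print(segments)
```

With real recordings the per-frame labels are noisy, so a practical system would usually smooth them over time before reporting change points.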
Semantic Audio and the Semantic Web
The Semantic Web provides a powerful framework for the expression and reuse of structured data. Using and storing semantic audio descriptors in the Semantic Web framework allows for a much greater reach and a unifying standard for storing and managing the associated semantic audio metadata. A number of ontologies have been developed for storing and managing audio on the Semantic Web, including the Music Ontology, the Studio Ontology, and the Audio Feature Ontology.
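As a rough illustration of storing descriptors as structured data, the sketch below serialises a track and a few semantic descriptors as RDF in Turtle syntax. It uses the Music Ontology namespace for the track class and Dublin Core for the title; the `ex:` namespace, the descriptor property names (`tempoBpm`, `key`), and the helper functions are hypothetical and exist only for this example. A real deployment would take its properties from the published ontologies and would normally use an RDF library rather than string formatting.

```python
MO = "http://purl.org/ontology/mo/"        # Music Ontology namespace
DC = "http://purl.org/dc/elements/1.1/"    # Dublin Core elements
EX = "http://example.org/semantic-audio/"  # hypothetical namespace for this sketch


def turtle_literal(value):
    """Format a Python bool, number, or string as a Turtle literal."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    escaped = (
        str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def track_to_turtle(track_id, title, descriptors):
    """Serialise one track and its descriptors as a small Turtle document."""
    pairs = [("a", "mo:Track"), ("dc:title", turtle_literal(title))]
    # Descriptor names are mapped to properties in the hypothetical ex: namespace.
    pairs += [(f"ex:{name}", turtle_literal(v)) for name, v in descriptors.items()]
    body = " ;\n".join(f"    {pred} {obj}" for pred, obj in pairs)
    prefixes = "\n".join(
        f"@prefix {prefix}: <{uri}> ."
        for prefix, uri in (("mo", MO), ("dc", DC), ("ex", EX))
    )
    return f"{prefixes}\n\nex:{track_id}\n{body} .\n"


doc = track_to_turtle(
    "track1", "Example Recording", {"tempoBpm": 120.0, "key": "C major"}
)
print(doc)
```

Because the descriptors are expressed as triples with shared vocabularies, metadata produced by different analysis tools can be merged and queried together.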
External links
* Tutorial on Source Separation
* The Audio Engineering Society Technical Committee on Semantic Audio Analysis
* AES 42nd International Conference on Semantic Audio
* AES 53rd International Conference on Semantic Audio
* AES 2017 International Conference on Semantic Audio