Semantic audio is the extraction of meaning from audio signals. The field of semantic audio is primarily based around the analysis of audio to create some meaningful metadata, which can then be used in a variety of different ways.

Semantic Analysis

Semantic analysis of audio is performed to reveal a deeper understanding of an audio signal. This typically results in high-level metadata descriptors, such as musical chords and tempo or the identity of the person speaking, which facilitate content-based management of audio recordings. In recent years, the use of automatic data analysis techniques has grown considerably, in areas including:

* Music information retrieval
* Sound recognition
* Speech segmentation
* Automatic music transcription
* Blind source separation
* Musical similarity
* Audio indexing, hashing, searching
* Broadcast monitoring
* Musical performance analysis
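As an illustration of how a high-level descriptor such as tempo might be derived, the following is a minimal sketch that estimates beats per minute from an onset-strength envelope by autocorrelation. The function name `estimate_tempo`, its parameters, and the synthetic envelope are assumptions made for this example; a practical system would first compute the envelope from real audio and would usually apply more robust beat tracking.

```python
def estimate_tempo(envelope, frame_rate, min_bpm=60.0, max_bpm=180.0):
    """Estimate tempo in BPM from an onset-strength envelope.

    envelope   -- sequence of non-negative onset strengths, one per frame
    frame_rate -- number of envelope frames per second
    """
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [v - mean for v in envelope]

    # Convert the allowed tempo range into a range of lags (in frames).
    min_lag = max(1, int(round(frame_rate * 60.0 / max_bpm)))
    max_lag = min(n - 1, int(round(frame_rate * 60.0 / min_bpm)))
    if min_lag > max_lag:
        raise ValueError("envelope too short for the requested tempo range")

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation: longer lags sum fewer terms,
        # which mildly favours the faster of two tempo candidates.
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    return 60.0 * frame_rate / best_lag


# Synthetic envelope: one onset every 0.5 s at 100 frames per second.
envelope = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
print(estimate_tempo(envelope, frame_rate=100.0))  # 120.0
```

Autocorrelation peaks at every multiple of the beat period, so an estimator of this kind can report half or double the perceived tempo. Restricting the search range with `min_bpm` and `max_bpm` is one simple way to limit that ambiguity.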

Applications

Applications have been developed that use this semantic information to support the user in identifying, organising, exploring, and interacting with audio signals. These applications include music information retrieval, semantic web technologies, audio production, sound reproduction, education, and gaming. Semantic technology involves some kind of understanding of the meaning of the information it deals with, and to this end may incorporate machine learning, digital signal processing, speech processing, source separation, perceptual models of hearing, musicological knowledge, metadata, and ontologies. Aside from audio retrieval and recommendation technologies, the semantics of audio signals are also becoming increasingly important in, for instance, object-based audio coding and intelligent audio editing and processing. Recent product releases already demonstrate this to a great extent; however, more innovative functionalities relying on semantic audio analysis and management are expected. These functionalities may utilise, for instance, (informed) audio source separation, speaker segmentation and identification, structural music segmentation, or social and Semantic Web technologies, including ontologies and linked open data.

Speech recognition is an important semantic audio application. For speech, other semantic operations include language identification, speaker identification, and gender identification. For more general audio or music, semantic operations include identifying a piece of music (e.g. Shazam) or a movie soundtrack. Areas of research in semantic audio include labelling an audio waveform with where the harmonies change and what they are, where material is repeated, and which instruments are playing.
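Identifying a piece of music from a short excerpt is often described in terms of hashing pairs of spectral peaks and voting on a consistent time offset. The sketch below illustrates that general idea only; it is not claimed to be the method of any particular service, and the function names, the `(time_frame, frequency_bin)` peak format, and the toy peak lists are assumptions made for this example. Peak extraction from a spectrogram is omitted.

```python
from collections import Counter, defaultdict


def peak_pair_hashes(peaks, fan_out=3, max_dt=50):
    """Yield ((f1, f2, dt), t1) for each anchor peak and a few later peaks.

    peaks -- iterable of (time_frame, frequency_bin) tuples
    """
    peaks = sorted(peaks)  # order by time, then frequency
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt <= 0:
                continue  # skip peaks in the same frame as the anchor
            if dt > max_dt:
                break     # later peaks are even further away
            yield (f1, f2, dt), t1
            paired += 1
            if paired == fan_out:
                break


def build_index(tracks):
    """Map each hash to the (track name, anchor time) pairs where it occurs."""
    index = defaultdict(list)
    for name, peaks in tracks.items():
        for h, t in peak_pair_hashes(peaks):
            index[h].append((name, t))
    return index


def identify(index, query_peaks):
    """Return the track whose hashes align with the query at one offset."""
    votes = Counter()
    for h, t_query in peak_pair_hashes(query_peaks):
        for name, t_ref in index.get(h, []):
            votes[(name, t_ref - t_query)] += 1
    if not votes:
        return None
    (name, _offset), _count = votes.most_common(1)[0]
    return name


# Toy reference peaks (hypothetical data, not derived from real audio).
tracks = {
    "song_a": [(0, 10), (5, 20), (9, 15), (14, 30), (20, 12)],
    "song_b": [(0, 40), (4, 22), (11, 35), (15, 18), (22, 27)],
}
index = build_index(tracks)

# An excerpt of song_a starting five frames in, with times reset to zero.
excerpt = [(0, 20), (4, 15), (9, 30), (15, 12)]
print(identify(index, excerpt))
```

Because each hash encodes only frequency bins and a time difference, it is unchanged when the excerpt starts at a different point in the recording. The offset vote then separates a true match, where many hashes agree on one offset, from coincidental hash collisions.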

Semantic Audio and the Semantic Web

The Semantic Web provides a powerful framework for the expression and reuse of structured data. Storing semantic audio descriptors in the Semantic Web framework allows for a much greater reach and a unifying standard for storing and managing the associated semantic audio metadata. A number of ontologies have been developed for storing and managing audio on the Semantic Web, including the Music Ontology, the Studio Ontology, and the Audio Feature Ontology.
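To make the idea of storing descriptors as structured data concrete, the following minimal sketch serialises a few audio descriptors as subject-predicate-object statements in N-Triples syntax. The `http://example.org/` namespace and the terms `Track`, `tempoBpm`, and `key` are hypothetical placeholders invented for this example; they are not terms from the ontologies named above, and a real deployment would use the published vocabulary and typically an RDF library.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/semantic-audio#"  # hypothetical namespace


def iri(term):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{term}>"


def literal(value):
    """Render a value as a plain string literal with basic escaping."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_track(track_id, tempo_bpm, key):
    """Return triples describing one track's semantic descriptors."""
    subject = iri(f"http://example.org/track/{track_id}")
    return [
        (subject, iri(RDF_TYPE), iri(EX + "Track")),
        (subject, iri(EX + "tempoBpm"), literal(tempo_bpm)),
        (subject, iri(EX + "key"), literal(key)),
    ]


def to_ntriples(triples):
    """Serialise triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


print(to_ntriples(describe_track("42", 120.0, "C major")))
```

Literals are emitted as plain strings to keep the sketch short. Attaching datatypes, for example a numeric type for the tempo, would let consumers of the data compare and query the values directly.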

External links

Tutorial on Source Separation

The Audio Engineering Society Technical Committee on Semantic Audio Analysis

AES 42nd International Conference on Semantic Audio

AES 53rd International Conference on Semantic Audio

AES 2017 International Conference on Semantic Audio