An acoustic fingerprint is a condensed digital summary, a fingerprint, deterministically generated from an audio signal, that can be used to identify an audio sample or quickly locate similar items in an audio database.
Practical uses of acoustic fingerprinting include identifying songs, melodies, tunes, or advertisements; sound effect library management; and video file identification. Media identification using acoustic fingerprints can be used to monitor the use of specific musical works and performances on radio broadcast, records, CDs, streaming media and peer-to-peer networks. This identification has been used in copyright compliance, licensing, and other monetization schemes.
Attributes
A robust acoustic fingerprint algorithm must take into account the perceptual characteristics of the audio. If two files sound alike to the human ear, their acoustic fingerprints should match, even if their binary representations are quite different. Acoustic fingerprints are not conventional hash functions, which must be sensitive to any small change in the data. They are more analogous to human fingerprints, where small variations that are insignificant to the features the fingerprint uses are tolerated. A smeared human fingerprint impression can still be matched accurately to another fingerprint sample in a reference database; acoustic fingerprints work in a similar way.
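The contrast with ordinary hashing can be illustrated with a minimal sketch. The `toy_fingerprint` function below is a hypothetical stand-in (zero crossings counted per frame), not a real fingerprinting algorithm, and the test tone and frame size are arbitrary assumptions. It only shows that halving the volume changes every byte, and hence the SHA-256 digest, while a summary based on the signal's shape stays the same.

```python
import hashlib
import math
import struct

def tone(freq_hz, n=2048, rate=8000, gain=1.0):
    """Return n samples of a sine tone as floats in [-1, 1]."""
    return [gain * math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

def to_pcm16(samples):
    """Pack float samples as little-endian 16-bit PCM bytes."""
    return b"".join(struct.pack("<h", int(round(s * 32767))) for s in samples)

def toy_fingerprint(samples, frame=256):
    """Hypothetical perceptual summary: zero crossings counted per frame."""
    out = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        crossings = sum(1 for a, b in zip(chunk, chunk[1:]) if (a < 0) != (b < 0))
        out.append(crossings)
    return tuple(out)

original = tone(440.0)
quieter = tone(440.0, gain=0.5)  # same sound at half the volume

sha_original = hashlib.sha256(to_pcm16(original)).hexdigest()
sha_quieter = hashlib.sha256(to_pcm16(quieter)).hexdigest()

# The byte-level digests disagree, but the shape-based summary does not.
print("SHA-256 equal:        ", sha_original == sha_quieter)
print("toy fingerprint equal:", toy_fingerprint(original) == toy_fingerprint(quieter))
```

A practical fingerprint would have to tolerate far more than a gain change (compression, noise, equalization), which is why real systems rely on richer features than this one.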
Perceptual characteristics often exploited by audio fingerprints include average zero crossing rate, estimated tempo, average spectrum, spectral flatness, prominent tones across a set of frequency bands, and bandwidth.
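Two of these features are simple enough to sketch directly. The example below is an illustration under stated assumptions, not a production extractor: it uses a naive O(n²) discrete Fourier transform from the standard library, defines spectral flatness as the ratio of the geometric to the arithmetic mean of the power spectrum, and uses arbitrary test signals (a 1000 Hz tone and seeded uniform noise at an assumed 8000 Hz sample rate).

```python
import cmath
import math
import random

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    pairs = zip(samples, samples[1:])
    crossings = sum(1 for a, b in pairs if (a < 0) != (b < 0))
    return crossings / (len(samples) - 1)

def power_spectrum(samples):
    """Naive DFT power for bins 1 .. n/2 - 1 (DC and Nyquist are skipped)."""
    n = len(samples)
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def spectral_flatness(samples, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the power spectrum."""
    power = [p + eps for p in power_spectrum(samples)]  # eps avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))

n, rate = 256, 8000
sine = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(n)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]

for name, signal in (("sine", sine), ("noise", noise)):
    print(name, round(zero_crossing_rate(signal), 3), spectral_flatness(signal))
```

A pure tone concentrates its energy in one bin, so its flatness is close to zero, while noise spreads energy across all bins and scores much higher. Both values change little when the signal is re-encoded, which is what makes such features useful for fingerprinting.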
Most audio compression techniques will make radical changes to the binary encoding of an audio file, without radically affecting the way it is perceived by the human ear. A robust acoustic fingerprint will allow a recording to be identified after it has gone through such compression, even if the audio quality has been reduced significantly. For use in radio broadcast monitoring, acoustic fingerprints should also be insensitive to analog transmission artifacts.
Spectrogram
Generating a signature from the audio is essential for searching by sound. One common technique is to build a time-frequency graph called a spectrogram.
Any piece of audio can be translated into a spectrogram. The audio is split into segments over time; in some cases adjacent segments share a common time boundary, and in others they overlap. The result is a graph that plots three dimensions of audio: frequency, amplitude (intensity), and time.
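The segmentation described above can be sketched as a short-time Fourier transform. This is a minimal standard-library illustration, assuming a Hann window, a frame of 64 samples, a hop of 32 samples (50% overlap), and a synthetic test signal; none of these choices come from a particular fingerprinting system.

```python
import cmath
import math

def hann(n):
    """Hann window of length n, used to taper each segment."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def spectrogram(samples, frame=64, hop=32):
    """Return a list of frames, each a list of magnitudes per frequency bin.

    hop == frame gives segments that share a boundary; hop < frame gives overlap.
    """
    window = hann(frame)
    frames = []
    for start in range(0, len(samples) - frame + 1, hop):
        seg = [samples[start + i] * window[i] for i in range(frame)]
        mags = []
        for k in range(frame // 2 + 1):
            acc = sum(v * cmath.exp(-2j * math.pi * k * i / frame) for i, v in enumerate(seg))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

rate = 8000
# Test signal (an assumption for the demo): 1000 Hz for the first half, 2000 Hz after.
signal = [math.sin(2 * math.pi * (1000 if i < 512 else 2000) * i / rate) for i in range(1024)]

spec = spectrogram(signal)
# Index of the strongest frequency bin in each time frame.
loudest_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
print(len(spec), "frames x", len(spec[0]), "bins")
print(loudest_bins)
```

With these settings each bin spans 125 Hz, so the strongest bin moves from 8 (1000 Hz) to 16 (2000 Hz) partway through the signal. Time runs along the outer list, frequency along the inner list, and the stored magnitudes are the third dimension.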
Shazam
Shazam's algorithm picks out points where there are peaks in the spectrogram, which represent higher energy content. Focusing on peaks in the audio greatly reduces the impact that background noise has on audio identification. Shazam builds its fingerprint catalog as a hash table keyed on frequency. Rather than marking a single point in the spectrogram, it marks a pair of points: the ''peak intensity'' plus a second ''anchor point''. The database key is therefore not a single frequency but a hash of the frequencies of both points. This leads to fewer hash collisions, improving the performance of the hash table.
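The pairing idea can be sketched as follows. This is a simplified illustration, not Shazam's actual implementation: peaks are assumed to be already extracted as `(time_index, freq_bin)` tuples, and the function names, `fan_out`, `max_dt`, and the sample data are all hypothetical. Including the time difference in the key and voting on time offsets go beyond the description above and are assumptions of this sketch.

```python
from collections import defaultdict

def pair_hashes(peaks, fan_out=3, max_dt=10):
    """Yield (key, first_time) for peaks given as (time_index, freq_bin) tuples.

    Each peak is paired with up to fan_out later peaks within max_dt frames.
    The key combines both frequencies and their time difference.
    """
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt <= 0:
                continue  # skip peaks in the same frame
            yield (f1, f2, dt), t1
            paired += 1
            if paired == fan_out:
                break

def build_index(tracks):
    """Map each key to a list of (track_id, first_time) entries."""
    index = defaultdict(list)
    for track_id, peaks in tracks.items():
        for key, t1 in pair_hashes(peaks):
            index[key].append((track_id, t1))
    return index

def best_match(index, query_peaks):
    """Vote per (track, time offset); a true match piles votes on one offset."""
    votes = defaultdict(int)
    for key, tq in pair_hashes(query_peaks):
        for track_id, tr in index.get(key, ()):
            votes[(track_id, tr - tq)] += 1
    if not votes:
        return None
    (track_id, offset), count = max(votes.items(), key=lambda kv: kv[1])
    return track_id, offset, count

# Made-up peak lists for two reference tracks.
tracks = {
    "song_a": [(0, 10), (2, 14), (3, 22), (5, 10), (7, 30), (9, 18), (12, 25)],
    "song_b": [(0, 12), (1, 20), (4, 16), (6, 28), (8, 11), (11, 19)],
}
index = build_index(tracks)

# Excerpt of song_a starting at frame 3, plus one spurious noise peak at (5, 40).
query = [(0, 22), (2, 10), (4, 30), (5, 40), (6, 18)]
print(best_match(index, query))
```

A key built from two frequencies is far more specific than a single frequency, so each lookup returns fewer unrelated entries. The spurious peak in the query produces keys that match nothing in the index and therefore does not affect the result.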
See also
* Chromaprint
* Automatic content recognition
* Digital video fingerprinting
* Feature extraction
* Parsons code
* Perceptual hashing
* Search by sound
* Sound recognition
External links
* A Review of Algorithms for Audio Fingerprinting (P. Cano et al. In International Workshop on Multimedia Signal Processing, US Virgin Islands, December 2002)
* Content-Based Retrieval of Music and Audio by Jonathan Foote, ISS, National University of Singapore.