An acoustic fingerprint is a condensed digital summary, deterministically generated from an audio signal, that can be used to identify an audio sample or quickly locate similar items in an audio database. Practical uses of acoustic fingerprinting include identifying songs, melodies, tunes, or advertisements; sound effect library management; and video file identification. Media identification using acoustic fingerprints can be used to monitor the use of specific musical works and performances on radio broadcasts, records, CDs, streaming media, and peer-to-peer networks. This identification has been used in copyright compliance, licensing, and other monetization schemes.

Attributes

A robust acoustic fingerprint algorithm must take into account the perceptual characteristics of the audio. If two files sound alike to the human ear, their acoustic fingerprints should match, even if their binary representations are quite different. In this respect acoustic fingerprints differ from conventional (for example cryptographic) hash functions, which must be sensitive to any small change in the data. They are more analogous to human fingerprints, where small variations that are insignificant to the features the fingerprint uses are tolerated. Just as a smeared human fingerprint impression can still be matched accurately to a sample in a reference database, a degraded recording can still be matched to its reference fingerprint.

Perceptual characteristics often exploited by audio fingerprints include average zero crossing rate, estimated tempo, average spectrum, spectral flatness, prominent tones across a set of frequency bands, and bandwidth.

Most audio compression techniques make radical changes to the binary encoding of an audio file without radically affecting the way it is perceived by the human ear. A robust acoustic fingerprint allows a recording to be identified after it has gone through such compression, even if the audio quality has been reduced significantly. For use in radio broadcast monitoring, acoustic fingerprints should also be insensitive to analog transmission artifacts.
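The contrast between a cryptographic hash and a perceptual feature can be illustrated with a minimal sketch. It uses only one of the characteristics listed above, the zero crossing rate, and is not a complete fingerprinting algorithm. The 440 Hz test tone, the 8 kHz sample rate, and the helper names `zero_crossing_rate` and `sha256_of` are assumptions made for illustration.

```python
import hashlib
import math
import struct


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def sha256_of(samples):
    """SHA-256 of the samples packed as 16-bit little-endian PCM."""
    raw = struct.pack(f"<{len(samples)}h", *samples)
    return hashlib.sha256(raw).hexdigest()


# Assumed test signal: one second of a 440 Hz tone sampled at 8 kHz.
RATE = 8000
original = [
    round(20000 * math.sin(2 * math.pi * 440 * n / RATE)) for n in range(RATE)
]
# The same tone at 90% volume: different bytes, same perceived content.
quieter = [round(0.9 * s) for s in original]

zcr_original = zero_crossing_rate(original)
zcr_quieter = zero_crossing_rate(quieter)

print("hashes equal:", sha256_of(original) == sha256_of(quieter))
print("zero crossing rates equal:", zcr_original == zcr_quieter)
```

In this sketch the volume change alters the byte-level hash but leaves the zero crossing rate untouched, because scaling the amplitude does not change where the signal changes sign. A practical fingerprint would combine several such features and tolerate small differences rather than require exact equality.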

Spectrogram

Generating a signature from the audio is essential for searching by sound. One common technique is creating a time-frequency graph called a spectrogram. Any piece of audio can be translated to a spectrogram. The audio is split into segments over time, and the frequency content of each segment is computed. In some cases adjacent segments share a common time boundary; in other cases adjacent segments overlap. The result is a graph that plots three dimensions of audio: frequency, amplitude (intensity), and time.
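The segmentation described above can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than a production implementation: it uses a naive discrete Fourier transform instead of an FFT, and the frame size, hop size, Hann window, and test tone are all assumptions chosen for the example. A hop equal to the frame size gives segments that share a boundary; a smaller hop gives overlapping segments.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame lists one magnitude per frequency bin."""
    # A Hann window tapers each segment to reduce spectral leakage.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
        for n in range(frame_size)
    ]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        # Naive DFT over the non-negative frequency bins.
        for k in range(frame_size // 2 + 1):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Assumed test signal: a 1000 Hz tone sampled at 8 kHz.
RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * n / RATE) for n in range(512)]
spec = spectrogram(tone)

# Index of the strongest frequency bin in each time segment.
strongest = [max(range(len(f)), key=f.__getitem__) for f in spec]
print(len(spec), "segments,", len(spec[0]), "frequency bins each")
print("strongest bin per segment:", strongest)
```

Each bin here spans `RATE / frame_size` = 125 Hz, so the 1000 Hz tone lands in bin 8 of every segment. The nested list `spec[t][k]` holds the three dimensions mentioned above: time index `t`, frequency bin `k`, and the magnitude stored at that position.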

Shazam

Shazam's algorithm picks out points where there are peaks in the spectrogram, which represent higher energy content. Focusing on peaks in the audio greatly reduces the impact that background noise has on audio identification. Shazam builds its fingerprint catalog as a hash table, where the key is derived from frequency. Rather than marking a single point in the spectrogram, it marks a pair of points: the "peak intensity" plus a second "anchor point". The database key is therefore not a single frequency but a hash of the frequencies of both points. This leads to fewer hash collisions, improving the performance of the hash table.
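The idea of keying a hash table on pairs of spectrogram peaks can be sketched as follows. This is a toy illustration under stated assumptions, not Shazam's actual implementation: it takes only one peak per time frame, pairs each peak with the next few peaks, and includes the time offset between the two points in the key. All function names, the `fan_out` parameter, and the hand-built spectrograms are hypothetical.

```python
from collections import defaultdict


def find_peaks(spec):
    """Return (time_index, freq_bin) of the strongest bin in each frame.

    Simplification: one peak per frame instead of true local maxima.
    """
    return [
        (t, max(range(len(frame)), key=frame.__getitem__))
        for t, frame in enumerate(spec)
    ]


def pair_hashes(peaks, fan_out=3):
    """Yield ((f1, f2, dt), t1), pairing each peak with the next few peaks."""
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            yield (f1, f2, t2 - t1), t1


def build_index(tracks):
    """Map each pair key to the tracks (and times) where it occurs."""
    index = defaultdict(list)
    for name, spec in tracks.items():
        for key, t1 in pair_hashes(find_peaks(spec)):
            index[key].append((name, t1))
    return index


def identify(index, spec):
    """Return the track sharing the most pair keys with spec, or None."""
    votes = defaultdict(int)
    for key, _ in pair_hashes(find_peaks(spec)):
        for name, _ in index.get(key, []):
            votes[name] += 1
    return max(votes, key=votes.get) if votes else None


def toy_spectrogram(peak_bins, n_bins=8):
    """Hypothetical helper: frames with a single strong bin each."""
    return [[1.0 if b == p else 0.1 for b in range(n_bins)] for p in peak_bins]


tracks = {
    "song_a": toy_spectrogram([1, 4, 2, 6, 3, 5]),
    "song_b": toy_spectrogram([7, 0, 7, 1, 0, 2]),
}
index = build_index(tracks)

# An excerpt from the middle of song_a with a constant noise floor added.
excerpt = [[v + 0.05 for v in frame] for frame in tracks["song_a"][1:5]]
match = identify(index, excerpt)
print("best match:", match)
```

Because the keys depend only on which bins are strongest and on the spacing between them, the added noise floor does not change them, and the excerpt still votes for the track it came from. Keying on a pair rather than a single frequency makes each key far more specific, which is the collision reduction described above.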

External links

A Review of Algorithms for Audio Fingerprinting (P. Cano et al. In International Workshop on Multimedia Signal Processing, US Virgin Islands, December 2002)

Content-Based Retrieval of Music and Audio by Jonathan Foote, ISS, National University of Singapore.