Perceptual Evaluation of Audio Quality (PEAQ) is a standardized algorithm for objectively measuring perceived audio quality, developed in 1994–1998 by a joint venture of experts within Task Group 6Q of the International Telecommunication Union's Radiocommunication Sector (ITU-R). It was originally released as ITU-R Recommendation BS.1387 in 1998 and last updated in 2023. It uses software to simulate perceptual properties of the human ear and then integrates multiple model output variables into a single metric. PEAQ characterizes the perceived audio quality as subjects would in a listening test according to ITU-R BS.1116. PEAQ results principally model mean opinion scores on a scale from 1 (bad) to 5 (excellent). The ''Subjective Difference Grade'' (SDG), which measures the degree of compression damage (impairment), is defined as the difference between the opinion scores of the tested version and the reference (source). The SDG typically ranges from 0 (no perceived impairment) to -4 (terrible impairment). The ''Objective Difference Grade'' (ODG) is the actual output of the algorithm, designed to match the SDG.
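
The SDG definition can be illustrated with a minimal sketch. The function name and the range check are illustrative assumptions, not part of the Recommendation; the sketch only encodes the stated rule that the SDG is the grade of the tested version minus the grade of the reference, both on the 1 to 5 scale.

```python
def subjective_difference_grade(test_grade: float, reference_grade: float) -> float:
    """Return the SDG: listener grade of the test item minus grade of the reference.

    Both grades are assumed to lie on the 1 (bad) to 5 (excellent) scale.
    """
    for grade in (test_grade, reference_grade):
        if not 1.0 <= grade <= 5.0:
            raise ValueError("grades are expected on the 1 to 5 scale")
    return test_grade - reference_grade


# A listener rates the reference 5.0 and the coded version 3.0.
print(subjective_difference_grade(3.0, 5.0))  # -2.0
```

A reference rated 5 and a test item rated 1 give the lower end of the range, -4; identical grades give 0.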

Motivation

The need to conserve bandwidth has led to developments in the compression of the audio data to be transmitted. Various encoding methods remove both redundancy and perceptual irrelevancy in the audio signal so that the bit rate required to encode the signal is significantly reduced. They take into account knowledge of human auditory perception and typically achieve a reduced bit rate by ignoring audio information that is not likely to be heard by most listeners. Traditional audio measurements, such as frequency response based on sinusoidal sweeps, S/N, and THD+N, do not necessarily correlate well with audio codec quality. A psychoacoustic model must be used to predict how the information is masked by louder audio content adjacent in time and frequency. Since subjective listening tests are time-consuming, expensive and impractical for everyday use, it was beneficial to replace listening tests with objective, computer-based methods. Steered by ITU-R Task Group 6Q, a group of leading sound quality experts developed a new objective model for sound quality: PEAQ. The contributors were:

* OPTICOM GmbH, Erlangen, Germany
* the Fraunhofer Institute for Integrated Circuits, IIS-A, Erlangen, Germany
* Deutsche Telekom Berkom, Berlin, Germany
* the University of Berlin, Berlin, Germany
* the Institut für Rundfunktechnik, IRT, Munich, Germany
* KPN Research, Dr. Neher Laboratorium, Leidschendam, The Netherlands
* Centre commun d'études de télévision et télécommunications, France
* Communications Research Centre, CRC, Ottawa, Canada

Principles

In perceptual coding it is fundamental to determine the level of noise that can be introduced into a signal before it becomes audible. Because the human auditory system is highly non-linear, the tolerable noise level varies with the time and frequency characteristics of the audio signal. Psychoacoustic studies can deliver threshold criteria for various acoustic events and the resulting perceived sounds. The key is masking, the effect that one sound has on the audibility of another, simultaneous sound. Masking depends on the spectral composition of both the masker and the masked signal, and on their variation over time. The basic block diagram of a perceptual coding system is shown in the figure.

[Figure: Basic block diagram of a perceptual coding system]

The input signal is decomposed into subsampled spectral components. For each sample, an estimate of the actual masked threshold is derived using rules known from psychoacoustics. This is the perceptual model of the encoding system. The spectral components are quantized and coded, keeping the quantization noise below the masked threshold. Finally, the bitstream is formed. The analysis of the results is based on the Subjective Difference Grade, which compares the signal under test with the original reference signal.
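
The idea of keeping quantization noise near or below a masked threshold can be sketched as follows. This is an illustration under stated assumptions, not the method of any particular codec: it assumes a plain uniform quantizer and the common approximation that its noise power is step²/12, and the threshold value and function names are hypothetical.

```python
import math
import random


def step_for_threshold(masked_threshold_power: float) -> float:
    """Step size whose modeled noise power (step**2 / 12) equals the threshold.

    Assumes a uniform quantizer with uniformly distributed rounding error.
    """
    return math.sqrt(12.0 * masked_threshold_power)


def quantize(values, step):
    """Round each value to the nearest multiple of the step size."""
    return [step * round(v / step) for v in values]


def noise_power(original, quantized):
    """Mean squared quantization error."""
    return sum((o - q) ** 2 for o, q in zip(original, quantized)) / len(original)


rng = random.Random(0)
band = [rng.uniform(-1.0, 1.0) for _ in range(4096)]  # stand-in spectral samples
threshold = 1e-4  # hypothetical masked threshold (power) for this band

step = step_for_threshold(threshold)
coded = quantize(band, step)
measured = noise_power(band, coded)
print(f"step={step:.4f} noise={measured:.2e} threshold={threshold:.2e}")
```

A louder masker raises the threshold, which permits a coarser step and therefore fewer bits for that band. Because step²/12 is only a statistical model, the measured noise lands near the threshold rather than strictly under it; a margin on the step size would be needed for a hard guarantee.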

Models

The model follows the fundamental properties of the auditory system and distinguishes stages of physiological and psychoacoustic effects. The first part models the construction of the signal with a discrete Fourier transform (DFT) and filter banks. The second part models the cognitive processing performed by the human brain. The figure shows a simple block diagram of the relationship between the human auditory system and an objective psychoacoustic model.

[Figure: Perceptual framework of the psychoacoustic model]

By comparing the model output for the test signal with that for the (original) reference signal, a number of model output variables (MOVs) are derived. Each MOV may measure a different psychoacoustic dimension. In the final stage, the MOVs are combined using a neural network (with weights defined in the standard) to produce a result that corresponds to a subjective quality assessment. There are two variations of the model. The Basic version (less processing-intensive) was developed to be fast enough for real-time monitoring and uses only the FFT. The Advanced version is computationally more demanding and may deliver slightly more accurate results; it uses the FFT and filter banks to produce more MOVs for the neural network to work with.
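
The final combination stage can be pictured as a small feed-forward network. The sketch below is only an assumption about the general shape of such a mapping: a single hidden layer with sigmoid activations and an output squashed into the 0 to -4 range. The number of MOVs, the scaling limits, and every weight are made-up placeholders, not the values defined in the standard.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def combine_movs(movs, mov_min, mov_max, w_hidden, b_hidden, w_out, b_out,
                 odg_min=-4.0, odg_max=0.0):
    """Map model output variables to a single grade (illustrative only)."""
    # Scale each MOV into roughly [0, 1] using per-variable limits.
    scaled = [(m - lo) / (hi - lo) for m, lo, hi in zip(movs, mov_min, mov_max)]
    # One hidden layer: weighted sum plus bias, passed through a sigmoid.
    hidden = [sigmoid(b + sum(w * x for w, x in zip(row, scaled)))
              for row, b in zip(w_hidden, b_hidden)]
    # Linear output node, then squash into the target grade range.
    index = b_out + sum(w * h for w, h in zip(w_out, hidden))
    return odg_min + (odg_max - odg_min) * sigmoid(index)


# Placeholder example: 3 hypothetical MOVs, 2 hidden nodes, invented weights.
odg = combine_movs(
    movs=[0.4, 12.0, -3.0],
    mov_min=[0.0, 0.0, -10.0],
    mov_max=[1.0, 20.0, 0.0],
    w_hidden=[[0.8, -1.2, 0.5], [-0.3, 0.9, 1.1]],
    b_hidden=[0.1, -0.2],
    w_out=[1.5, -2.0],
    b_out=0.3,
)
print(round(odg, 3))
```

Squashing the output with a sigmoid keeps the grade inside the chosen range regardless of the input values, which is why the range limits appear as parameters.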

License

The PEAQ technology as recommended by ITU-R Rec. BS.1387 is protected by several patents and is available under license, together with the original code, for commercial applications according to ITU fair, reasonable, and non-discriminatory (FRAND) terms.

Royalty-free implementations

* An early open-source implementation of the basic model, named EAQUAL, was discontinued in 2002 because of patent infringement claims.
* For educational use, there is a free cross-platform program called Peaqb, which provides the same functions in a limited manner, as it has not been validated with the ITU data. An evaluation by the GstPEAQ authors shows an RMSE of 0.2063 for 16 ITU test vectors.
* Another unvalidated implementation of the PEAQ basic model for educational use, PQevalAudio, is available from the TSP Lab of McGill University. An evaluation by the GstPEAQ authors shows an RMSE of 0.2329 for 16 ITU test vectors.
* GstPEAQ implements both the basic and advanced models, but fails to conform to the BS.1387-1 tolerances. Nevertheless, its deviation from conformance (RMSE 0.2009 in basic mode) is smaller than that of previous open-source implementations. The author also found the difference to be statistically insignificant in terms of using the ODG as an estimate of the SDG.
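
The RMSE figures quoted above summarize how far an implementation's ODG values fall from reference ODG values over a set of test vectors. A minimal sketch of that calculation follows; the sample values are invented for illustration and are not taken from the ITU conformance data.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences of grades."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical ODG values from an implementation and the matching reference values.
implementation_odg = [-0.5, -1.9, -3.1, -0.2]
reference_odg = [-0.4, -2.1, -3.4, -0.2]
print(f"RMSE = {rmse(implementation_odg, reference_odg):.4f}")
```

A lower RMSE means closer agreement with the reference values on average, although a single large outlier can still exceed a per-item tolerance.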

External links

* http://www.peaq.org PEAQ official site
* https://web.archive.org/web/20061207095623/http://www.crc.ca/en/html/aas/home/peaq/peaq PEAQ at the CRC
* https://web.archive.org/web/20090423074959/http://www.opticom.de/technology/technology.html PEAQ information from OPTICOM
* http://elvera.nue.tu-berlin.de/files/0829Thiede1998.pdf PEAQ - der künftige ITU-Standard zur objektiven Messung der wahrgenommenen Audioqualität
* https://ieeexplore.ieee.org/document/1613524 IEEE - Estimating Perceptual Audio System Quality Using PEAQ Algorithm
* http://sourceforge.net/projects/peaqb/ Peaqb project
* http://www-mmsp.ece.mcgill.ca/Documents/Software/index.html PQevalAudio - Matlab and C implementation of PEAQ Basic Model
* http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL source code