In mathematics and signal processing, the constant-Q transform and variable-Q transform, simply known as CQT and VQT, transform a data series to the frequency domain. The CQT is related to the Fourier transform[Judith C. Brown, "Calculation of a constant Q spectral transform", ''J. Acoust. Soc. Am.'', 89(1):425–434, 1991.] and very closely related to the complex Morlet wavelet transform. Its design is suited for musical representation.

The transform can be thought of as a series of filters ''f''<sub>''k''</sub>, logarithmically spaced in frequency, with the ''k''-th filter having a spectral width ''δf''<sub>''k''</sub> equal to a multiple of the previous filter's width:
: <math>\delta f_k = 2^{1/n} \cdot \delta f_{k-1} = \left(2^{1/n}\right)^k \cdot \delta f_\text{min},</math>
where ''δf''<sub>''k''</sub> is the bandwidth of the ''k''-th filter, ''δf''<sub>min</sub> is the bandwidth of the lowest filter, whose central frequency is ''f''<sub>min</sub>, and ''n'' is the number of filters per octave.
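The geometric spacing described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `cqt_filter_bank` is hypothetical, and the choice of taking each bandwidth as the distance to the next centre frequency is an assumption (the text only requires that consecutive bandwidths differ by the factor 2^(1/''n'')).

```python
import numpy as np

def cqt_filter_bank(f_min, n_per_octave, n_filters):
    """Centre frequencies and bandwidths of a constant-Q filter bank (sketch)."""
    k = np.arange(n_filters)
    ratio = 2.0 ** (1.0 / n_per_octave)   # frequency ratio between neighbouring filters
    f_k = f_min * ratio ** k              # logarithmically spaced centre frequencies
    # Assumption: bandwidth = distance to the next centre frequency.
    delta_f = f_k * (ratio - 1.0)
    return f_k, delta_f

# Two octaves above 55 Hz with 12 filters per octave.
f_k, delta_f = cqt_filter_bank(f_min=55.0, n_per_octave=12, n_filters=25)
```

Under this assumption every bandwidth is 2^(1/''n'') times the previous one, and the ratio ''f''<sub>''k''</sub>/''δf''<sub>''k''</sub> is the same for every filter.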
Calculation
The short-time Fourier transform of ''x''[''n''] for a frame shifted to sample ''m'' is calculated as follows:
: <math>X[k, m] = \sum_{n=0}^{N-1} W[n - m] \, x[n] \, e^{-j 2\pi k n / N}.</math>
Given a data series at sampling frequency ''f''<sub>s</sub> = 1/''T'', ''T'' being the sampling period of our data, for each frequency bin we can define the following:
* Filter width, ''δf''<sub>''k''</sub>.
* ''Q'', the "quality factor":
:: <math>Q = \frac{f_k}{\delta f_k}.</math>
: This is shown below to be the number of cycles processed at a center frequency ''f''<sub>''k''</sub>. As such, it partly determines the time complexity of the transform.
* Window length for the ''k''-th bin:
:: <math>N[k] = \frac{f_\text{s}}{\delta f_k} = \frac{f_\text{s}}{f_k} Q.</math>
: Since ''f''<sub>s</sub>/''f''<sub>''k''</sub> is the number of samples processed per cycle at frequency ''f''<sub>''k''</sub>, ''Q'' is the number of cycles processed at this central frequency.
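As a rough numerical illustration of these definitions, the sketch below computes ''Q'' and the window lengths for a bank of bins. The function name is hypothetical, and two choices are assumptions rather than part of the text: ''Q'' is taken as 1/(2^(1/''n'') − 1), which follows if each bandwidth equals the spacing to the next bin, and window lengths are rounded up to whole samples.

```python
import numpy as np

def q_and_window_lengths(f_s, f_min, n_per_octave, n_bins):
    """Quality factor and per-bin window lengths in samples (sketch)."""
    # Assumption: delta_f_k is the distance to the next centre frequency,
    # so Q = f_k / delta_f_k = 1 / (2^(1/n) - 1) for every bin.
    q = 1.0 / (2.0 ** (1.0 / n_per_octave) - 1.0)
    f_k = f_min * 2.0 ** (np.arange(n_bins) / n_per_octave)
    # Window length: samples per cycle (f_s / f_k) times Q cycles, rounded up.
    n_k = np.ceil(q * f_s / f_k).astype(int)
    return q, f_k, n_k

q, f_k, n_k = q_and_window_lengths(f_s=44100.0, f_min=55.0, n_per_octave=12, n_bins=48)
```

The window shrinks as the centre frequency rises: a bin one octave higher uses about half as many samples, while still spanning the same number of cycles.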
The equivalent transform kernel can be found by using the following substitutions:
* The window length of each bin is now a function of the bin number:
:: <math>N = N[k] = Q \frac{f_\text{s}}{f_k}.</math>
* The relative power of each bin will decrease at higher frequencies, as these sum over fewer terms. To compensate for this, we normalize by ''N''[''k''].
* Any windowing function will be a function of window length, and likewise a function of window number. For example, the equivalent Hamming window would be
:: <math>W[k, n] = \alpha - (1 - \alpha) \cos \frac{2\pi n}{N[k] - 1}, \quad \alpha = 25/46, \quad 0 \leqslant n \leqslant N[k] - 1.</math>
* Our digital frequency, <math>\frac{2\pi k}{N}</math>, becomes <math>\frac{2\pi Q}{N[k]}</math>.
After these modifications, we are left with the following expression for a single frame:
: <math>X[k] = \frac{1}{N[k]} \sum_{n=0}^{N[k]-1} W[k, n] \, x[n] \, e^{-j 2\pi Q n / N[k]}.</math>
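The substitutions above lead to a direct, bin-by-bin calculation. The following Python sketch evaluates one frame that way, under stated assumptions: the function names are hypothetical, ''Q'' is taken as 1/(2^(1/''n'') − 1), window lengths are rounded up to whole samples, and the Hamming window uses ''α'' = 25/46. It is meant to show the structure of the sum, not to be efficient.

```python
import numpy as np

def hamming(n_k, alpha=25.0 / 46.0):
    """Hamming window of length n_k, as a function of the window length."""
    n = np.arange(n_k)
    return alpha - (1.0 - alpha) * np.cos(2.0 * np.pi * n / (n_k - 1))

def direct_cqt(x, f_s, f_min, n_per_octave, n_bins):
    """Direct constant-Q transform of the frame starting at x[0] (sketch)."""
    q = 1.0 / (2.0 ** (1.0 / n_per_octave) - 1.0)   # assumed quality factor
    out = np.zeros(n_bins, dtype=complex)
    for k in range(n_bins):
        f_k = f_min * 2.0 ** (k / n_per_octave)
        n_k = int(np.ceil(q * f_s / f_k))           # window length for this bin
        if n_k > len(x):
            raise ValueError("signal too short for bin %d" % k)
        n = np.arange(n_k)
        # Windowed complex exponential with digital frequency 2*pi*Q / N[k].
        kernel = hamming(n_k) * np.exp(-2j * np.pi * q * n / n_k)
        out[k] = np.dot(x[:n_k], kernel) / n_k      # normalise by N[k]
    return out

# One second of a 440 Hz tone; bin 12 is centred on 440 Hz when f_min = 220 Hz.
f_s = 8000.0
t = np.arange(int(f_s)) / f_s
x = np.sin(2.0 * np.pi * 440.0 * t)
X = direct_cqt(x, f_s, f_min=220.0, n_per_octave=12, n_bins=24)
peak_bin = int(np.argmax(np.abs(X)))
```

Each bin uses its own window length, so the loop cannot be replaced by a single fixed-size DFT; this is the cost that the faster methods try to avoid.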
Variable-Q bandwidth calculation
The variable-Q transform is the same as the constant-Q transform, except that the filter Q is variable, hence the name. The variable-Q transform is useful where time resolution at low frequencies is important. There are several ways to calculate the bandwidth of the VQT, one of them using the equivalent rectangular bandwidth as the value for each VQT bin's bandwidth.
The simplest way to implement a variable-Q transform is to add a bandwidth offset, called ''γ'', to the constant-Q bandwidth:
: <math>\delta f_k = \frac{f_k}{Q} + \gamma.</math>
This formula can be modified with an extra parameter that adjusts the sharpness of the transition between constant-Q and constant-bandwidth behaviour:
: <math>\delta f_k = \sqrt[\alpha]{\left(\frac{f_k}{Q}\right)^\alpha + \gamma^\alpha},</math>
with ''α'' as the parameter for transition sharpness, where ''α'' = 2 is equivalent to a hyperbolic sine frequency scale in terms of frequency resolution.
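A short Python sketch of a variable-Q bandwidth rule follows. It assumes one plausible form of the rule described above, namely that the constant-Q bandwidth ''f''<sub>''k''</sub>/''Q'' and the offset ''γ'' are combined as an ''α''-th root of a sum of ''α''-th powers; the function name and argument names are hypothetical.

```python
import numpy as np

def vqt_bandwidths(f_k, q, gamma, alpha=1.0):
    """Variable-Q bandwidths (sketch, assumed form).

    alpha = 1 adds the offset gamma directly to the constant-Q bandwidth;
    larger alpha sharpens the transition between the constant-bandwidth
    region (low frequencies) and the constant-Q region (high frequencies).
    """
    constant_q_bw = np.asarray(f_k, dtype=float) / q
    return (constant_q_bw ** alpha + gamma ** alpha) ** (1.0 / alpha)

q = 1.0 / (2.0 ** (1.0 / 12.0) - 1.0)            # assumed Q for 12 bins per octave
f_k = 27.5 * 2.0 ** (np.arange(96) / 12.0)       # eight octaves above 27.5 Hz
bw_cq = vqt_bandwidths(f_k, q, gamma=0.0)        # gamma = 0 recovers constant Q
bw_vq = vqt_bandwidths(f_k, q, gamma=10.0)       # offset widens the low bins most
bw_sharp = vqt_bandwidths(f_k, q, gamma=10.0, alpha=2.0)
```

With a non-zero ''γ'' the low-frequency bins become wider, which shortens their windows and improves time resolution there, while the high-frequency bins stay close to constant Q.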
Fast calculation
The direct calculation of the constant-Q transform (using either a naive DFT or the slightly faster Goertzel algorithm) is slow compared with the fast Fourier transform (FFT). However, the FFT can itself be employed, in conjunction with a kernel, to perform the equivalent calculation much faster.[Judith C. Brown and Miller S. Puckette, "An efficient algorithm for the calculation of a constant Q transform", ''J. Acoust. Soc. Am.'', 92(5):2698–2701, 1992.] An approximate inverse to such an implementation was proposed in 2006; it works by going back to the DFT, and is only suitable for pitch instruments.
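The idea of combining an FFT with a precomputed kernel can be sketched as follows. This is a simplified illustration based on Parseval's relation, not a reproduction of the published algorithm: the function names, the `threshold` parameter, the use of `np.hamming`, and the choice of ''Q'' = 1/(2^(1/''n'') − 1) are all assumptions. The kernel is built once; each frame then costs one FFT and one matrix-vector product.

```python
import numpy as np

def spectral_kernel(f_s, f_min, n_per_octave, n_bins, threshold=0.0):
    """Precompute a frequency-domain kernel for a constant-Q transform (sketch)."""
    q = 1.0 / (2.0 ** (1.0 / n_per_octave) - 1.0)         # assumed quality factor
    f_k = f_min * 2.0 ** (np.arange(n_bins) / n_per_octave)
    n_k = np.ceil(q * f_s / f_k).astype(int)              # per-bin window lengths
    n_fft = int(2 ** np.ceil(np.log2(n_k.max())))         # FFT size covering the longest window
    temporal = np.zeros((n_bins, n_fft), dtype=complex)
    for k in range(n_bins):
        n = np.arange(n_k[k])
        # Windowed, normalised complex exponential, zero-padded to n_fft samples.
        temporal[k, :n_k[k]] = np.hamming(n_k[k]) * np.exp(2j * np.pi * q * n / n_k[k]) / n_k[k]
    kernel = np.fft.fft(temporal, axis=1)
    # Small entries may be zeroed so the kernel can be stored sparsely (hypothetical option).
    kernel[np.abs(kernel) < threshold] = 0.0
    # Parseval: sum_n x[n] conj(t[n]) = (1 / n_fft) * sum_m X[m] conj(T[m]).
    return np.conj(kernel) / n_fft, n_fft

def fast_cqt(frame, kernel):
    """Constant-Q coefficients of one frame of length n_fft using a single FFT."""
    return kernel @ np.fft.fft(frame)

kernel, n_fft = spectral_kernel(f_s=8000.0, f_min=220.0, n_per_octave=12, n_bins=24)
frame = np.sin(2.0 * np.pi * 440.0 * np.arange(n_fft) / 8000.0)
X = fast_cqt(frame, kernel)
```

With `threshold=0.0` the result matches the direct sum up to rounding error; a small positive threshold trades a little accuracy for a kernel with far fewer non-zero entries.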
A development on this method with improved invertibility involves performing CQT (via FFT) octave-by-octave, using lowpass filtered and downsampled results for consecutively lower pitches. Implementations of this method include the MATLAB implementation and LibROSA's Python implementation.
LibROSA combines the subsampled method with the direct FFT method (which it dubs "pseudo-CQT") by having the latter process higher frequencies as a whole.
The sliding DFT can be used for faster calculation of the constant-Q transform, since the sliding DFT does not require linear frequency spacing or the same window size for every bin.[Bradford, R., ffitch, J. & Dobson, R. (2008), "Sliding with a constant-Q", in ''11th International Conference on Digital Audio Effects (DAFx-08) Proceedings'', September 1-4, 2008, Espoo, Finland, pp. 363-369.]
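To illustrate why a sliding update works with a per-bin window length, the sketch below advances a single constant-Q bin one sample at a time. It assumes a rectangular window and no normalisation, and the function name is hypothetical; it is a generic sliding-DFT recurrence adapted to a non-integer ''Q'', not the algorithm of the cited paper.

```python
import numpy as np

def sliding_cq_bin(x, q, n_k, n_hops):
    """Track one constant-Q bin over successive one-sample hops (sketch).

    Computes X[m] = sum_{n=0}^{n_k-1} x[m + n] * exp(-2j*pi*q*n / n_k)
    for m = 0 .. n_hops - 1, using a recurrence after the first frame.
    Requires len(x) >= n_k + n_hops - 1.
    """
    n = np.arange(n_k)
    out = np.empty(n_hops, dtype=complex)
    out[0] = np.dot(x[:n_k], np.exp(-2j * np.pi * q * n / n_k))  # first frame, direct
    rot = np.exp(2j * np.pi * q / n_k)    # one-sample phase advance
    tail = np.exp(-2j * np.pi * q)        # phase of the sample entering the window
    for m in range(n_hops - 1):
        # Drop the oldest sample, add the newest, then rotate.
        out[m + 1] = rot * (out[m] - x[m] + x[m + n_k] * tail)
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
track = sliding_cq_bin(x, q=16.8, n_k=306, n_hops=200)
```

Each hop costs a constant number of operations per bin regardless of the window length. The factor `tail` equals 1 for integer ''Q'' and is needed here because ''Q'' is generally not an integer.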
Alternatively, the constant-Q transform can be approximated by using multiple FFTs with different window sizes and/or sampling rates for different frequency ranges and then stitching the results together. This is called a multiresolution STFT; however, the window sizes for multiresolution FFTs differ per octave rather than per bin.
Comparison with the Fourier transform
In general, the transform is well suited to musical data, and this can be seen in some of its advantages compared to the fast Fourier transform. As the output of the transform is effectively amplitude/phase against log frequency, fewer frequency bins are required to cover a given range effectively, and this proves useful where frequencies span several octaves. As the range of human hearing covers approximately ten octaves from 20 Hz to around 20 kHz, this reduction in output data is significant.
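The saving in output size can be estimated with a little arithmetic. The sketch below compares the number of logarithmically spaced bins needed to cover a range with the number of linearly spaced bins that would be needed to match the resolution of the lowest log-spaced bin. The function names and the choice of 12 bins per octave are illustrative assumptions.

```python
import math

def log_bin_count(f_lo, f_hi, bins_per_octave):
    """Number of log-spaced bins needed to cover f_lo .. f_hi."""
    return math.ceil(bins_per_octave * math.log2(f_hi / f_lo))

def linear_bin_count(f_lo, f_hi, bins_per_octave):
    """Number of equally spaced bins needed to match the spacing at f_lo."""
    resolution = f_lo * (2.0 ** (1.0 / bins_per_octave) - 1.0)  # bin spacing at the bottom
    return math.ceil((f_hi - f_lo) / resolution)

n_log = log_bin_count(20.0, 20000.0, 12)      # about ten octaves at 12 bins each
n_lin = linear_bin_count(20.0, 20000.0, 12)   # same low-end resolution, linear spacing
```

For 20 Hz to 20 kHz at 12 bins per octave, the logarithmic layout needs 120 bins, whereas a linear layout with the same low-frequency resolution needs more than a hundred times as many.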
The transform exhibits a reduction in frequency resolution with higher frequency bins, which is desirable for auditory applications. The transform mirrors the human auditory system, whereby spectral resolution is better at lower frequencies, whereas temporal resolution improves at higher frequencies. At the bottom of the piano scale (about 30 Hz), a difference of 1 semitone is a difference of approximately 1.5 Hz, whereas at the top of the musical scale (about 5 kHz), a difference of 1 semitone is a difference of approximately 200 Hz.[http://newt.phys.unsw.edu.au/jw/graphics/notes.GIF] So for musical data the exponential frequency resolution of the constant-Q transform is ideal.
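The semitone figures can be checked with a short calculation. The sketch assumes twelve-tone equal temperament, in which adjacent semitones differ by a factor of 2^(1/12), and uses 27.5 Hz and 4186.01 Hz as illustrative frequencies for the lowest and highest piano notes; the function name is hypothetical.

```python
SEMITONE_RATIO = 2.0 ** (1.0 / 12.0)   # assumed: twelve-tone equal temperament

def semitone_step_hz(f):
    """Distance in Hz from frequency f to the note one semitone above it."""
    return f * (SEMITONE_RATIO - 1.0)

low_step = semitone_step_hz(27.5)       # near the bottom of the piano range
high_step = semitone_step_hz(4186.01)   # near the top of the piano range
relative = (low_step / 27.5, high_step / 4186.01)   # same fraction at both ends
```

The absolute step grows from under 2 Hz to a couple of hundred hertz, while the relative step stays at roughly 6% of the note frequency, which is the sense in which a constant-Q resolution matches the musical scale.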
In addition, the harmonics of musical notes form a pattern characteristic of the timbre of the instrument in this transform. Assuming the same relative strengths of each harmonic, as the fundamental frequency changes, the relative position of these harmonics remains constant. This can make identification of instruments much easier. The constant-Q transform can also be used for automatic recognition of musical keys based on accumulated chroma content.[Hendrik Purwins, Benjamin Blankertz and Klaus Obermayer, "A New Method for Tracking Modulations in Tonal Music in Audio Data Format", ''International Joint Conference on Neural Network (IJCNN'00)'', 6:270-275, 2000.]
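The constant relative position of the harmonics follows from the logarithmic frequency axis, as the sketch below illustrates. It computes the (fractional) bin position of each harmonic for two different fundamentals; the function name and the example frequencies are illustrative assumptions.

```python
import numpy as np

def harmonic_bins(f0, f_min, n_harmonics, bins_per_octave):
    """Fractional constant-Q bin positions of the first harmonics of f0."""
    h = np.arange(1, n_harmonics + 1)
    return bins_per_octave * np.log2(h * f0 / f_min)

bins_a = harmonic_bins(220.0, f_min=55.0, n_harmonics=6, bins_per_octave=12)
bins_b = harmonic_bins(330.0, f_min=55.0, n_harmonics=6, bins_per_octave=12)

# Offsets of each harmonic from its own fundamental: 12 * log2(h), independent of f0.
pattern_a = bins_a - bins_a[0]
pattern_b = bins_b - bins_b[0]
```

Changing the fundamental shifts the whole pattern along the bin axis without altering its shape, so a fixed template can be matched at any pitch. On a linear frequency axis the harmonic spacing would instead scale with the fundamental.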
Relative to the Fourier transform, this transform is harder to implement. This is due to the varying number of samples used in the calculation of each frequency bin, which also affects the length of any windowing function implemented.[Benjamin Blankertz, 1999.]
Also note that because the frequency scale is logarithmic, there is no true zero-frequency (DC) term present, which may be a drawback in applications that are interested in the DC term. For applications that are not interested in the DC term, such as audio, this is not a drawback.