Spectral modeling synthesis
   HOME

TheInfoList



OR:

Spectral modeling synthesis (SMS) is an acoustic modeling approach for speech and other signals. SMS considers
sounds In physics, sound is a vibration that propagates as an acoustic wave, through a transmission medium such as a gas, liquid or solid. In human physiology and psychology, sound is the ''reception'' of such waves and their ''perception'' by the ...
as a combination of harmonic content and
noise Noise is unwanted sound considered unpleasant, loud or disruptive to hearing. From a physics standpoint, there is no distinction between noise and desired sound, as both are vibrations through a medium, such as air or water. The difference aris ...
content. Harmonic components are identified based on peaks in the
frequency spectrum The power spectrum S_(f) of a time series x(t) describes the distribution of power into frequency components composing that signal. According to Fourier analysis, any physical signal can be decomposed into a number of discrete frequencies, ...
of the signal, normally as found by the
short-time Fourier transform The short-time Fourier transform (STFT), is a Fourier-related transform used to determine the sinusoidal frequency and phase content of local sections of a signal as it changes over time. In practice, the procedure for computing STFTs is to divi ...
. The signal that remains following removal of the spectral components, sometimes referred to as the residual, is then modeled as
white noise In signal processing, white noise is a random signal having equal intensity at different frequencies, giving it a constant power spectral density. The term is used, with this or similar meanings, in many scientific and technical disciplines ...
passed through a time-varying filter. The output of the model, then, are the frequencies and levels of the detected harmonic components and the
coefficients In mathematics, a coefficient is a multiplicative factor in some term of a polynomial, a series, or an expression; it is usually a number, but may be any expression (including variables such as , and ). When the coefficients are themselves ...
of the time-varying filter. Intuitively, the model can be applied to many types of audio signals. Speech signals, for example, include slowly changing harmonic sounds caused by vibration of the
vocal cords In humans, vocal cords, also known as vocal folds or voice reeds, are folds of throat tissues that are key in creating sounds through vocalization. The size of vocal cords affects the pitch of voice. Open when breathing and vibrating for speec ...
plus wideband, noise-like sounds caused by the lips and mouth. Musical instruments also produce sounds containing both harmonic components and percussive, noise-like sounds when the notes are struck or changed.


See also

*
Speech coding Speech coding is an application of data compression of digital audio signals containing speech. Speech coding uses speech-specific parameter estimation using audio signal processing techniques to model the speech signal, combined with generic d ...
*
CELP Code-excited linear prediction (CELP) is a linear predictive speech coding algorithm originally proposed by Manfred R. Schroeder and Bishnu S. Atal in 1985. At the time, it provided significantly better quality than existing low bit-rate algori ...
* Source-filter model of speech production *
FM synthesis Frequency modulation synthesis (or FM synthesis) is a form of sound synthesis whereby the frequency of a waveform is changed by modulating its frequency with a modulator. The frequency of an oscillator is altered "in accordance with the amplitud ...
*
Sound synthesis A synthesizer (also spelled synthesiser) is an electronic musical instrument that generates audio signals. Synthesizers typically create sounds by generating waveforms through methods including subtractive synthesis, additive synthesis and f ...

SPEAR - Sinusoidal Partial Editing Analysis and Resynthesis


References

* * * * * Speech recognition {{science-software-stub