Spectral modeling synthesis (SMS) is an acoustic modeling approach for speech and other signals. SMS considers sounds as a combination of harmonic content and noise content. Harmonic components are identified based on peaks in the frequency spectrum of the signal, normally as found by the short-time Fourier transform. The signal that remains after the spectral components are removed, sometimes referred to as the residual, is then modeled as white noise passed through a time-varying filter. The outputs of the model are therefore the frequencies and levels of the detected harmonic components and the coefficients of the time-varying filter.

Intuitively, the model can be applied to many types of audio signals. Speech signals, for example, include slowly changing harmonic sounds caused by vibration of the vocal cords plus wideband, noise-like sounds caused by the lips and mouth. Musical instruments also produce sounds containing both harmonic components and percussive, noise-like sounds when the notes are struck or changed.
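The analysis step described above can be illustrated with a minimal single-frame sketch. It is an illustration under stated assumptions, not a reference SMS implementation: the function names (`half_spectrum`, `analyze_frame`), the -30 dB peak threshold, the ±2-bin main-lobe removal and the four-band energy envelope are all choices made for this example. In particular, the band energies are only a crude stand-in for the coefficients of the time-varying filter, and a practical system would track peaks across successive short-time Fourier transform frames rather than analyze one frame in isolation.

```python
import cmath
import math
import random


def half_spectrum(frame):
    """Naive O(N^2) DFT of a real frame, bins 0..N/2 (adequate for a short demo)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def analyze_frame(frame, sample_rate, threshold_db=-30.0, n_bands=4):
    """Return (partials, envelope) for one frame.

    partials: (frequency_hz, amplitude) for each detected spectral peak.
    envelope: mean residual energy in n_bands equal-width bands; an assumed,
              simplified substitute for the noise-shaping filter coefficients.
    """
    n = len(frame)
    # Periodic Hann window: a bin-centred sinusoid of amplitude A peaks at A * n / 4.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    mags = [abs(c) for c in half_spectrum([x * w for x, w in zip(frame, window)])]

    # Harmonic part: local maxima lying within threshold_db of the strongest bin.
    floor = max(mags) * 10 ** (threshold_db / 20)
    peaks = [
        k for k in range(1, len(mags) - 1)
        if mags[k] > floor and mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
    ]
    partials = [(k * sample_rate / n, 4 * mags[k] / n) for k in peaks]

    # Residual: remove the Hann main lobe (+/- 2 bins) around every peak.
    residual = list(mags)
    for k in peaks:
        for j in range(max(0, k - 2), min(len(mags), k + 3)):
            residual[j] = 0.0

    # Coarse spectral envelope of the residual, one energy value per band.
    edges = [b * len(residual) // n_bands for b in range(n_bands + 1)]
    envelope = [
        sum(m * m for m in residual[lo:hi]) / (hi - lo)
        for lo, hi in zip(edges, edges[1:])
    ]
    return partials, envelope


# Hypothetical test signal: two bin-centred sinusoids plus weak white noise.
random.seed(0)
SAMPLE_RATE, FRAME_LEN = 8000, 256
frame = [
    math.sin(2 * math.pi * 500 * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
    + random.gauss(0.0, 0.01)
    for i in range(FRAME_LEN)
]
partials, envelope = analyze_frame(frame, SAMPLE_RATE)
for freq, amp in partials:
    print(f"partial at {freq:.1f} Hz, amplitude {amp:.3f}")
print("residual band energies:", envelope)
```

The test frequencies are chosen to fall exactly on bin centres so that the simple amplitude scaling holds. For arbitrary frequencies, peak interpolation between bins would be needed to estimate frequency and level accurately.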
