Selectable Mode Vocoder (SMV) is a variable bitrate speech coding standard used in CDMA2000 networks.
SMV provides multiple modes of operation that are selected based on input speech characteristics.
Technical specification
Codecs
The SMV for Wideband CDMA is based on four codecs: full rate at 8.5 kbit/s, half rate at 4 kbit/s, quarter rate at 2 kbit/s, and eighth rate at 800 bit/s.
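As a rough illustration, the four bitrates can be converted into a bit budget per frame. The sketch below assumes the 20 ms frame length given later in this article; the names `RATES_BPS` and `bits_per_frame` are hypothetical and not taken from the SMV specification.

```python
# Frame duration in milliseconds (the article describes 160-sample, 20 ms frames).
FRAME_MS = 20

# Bitrates of the four SMV codecs in bit/s, as listed above.
RATES_BPS = {
    "full": 8500,
    "half": 4000,
    "quarter": 2000,
    "eighth": 800,
}


def bits_per_frame(rate_bps, frame_ms=FRAME_MS):
    """Return the number of bits available for one frame at the given bitrate."""
    return rate_bps * frame_ms // 1000


for name, bps in RATES_BPS.items():
    print(f"{name} rate: {bits_per_frame(bps)} bits per {FRAME_MS} ms frame")
```

Under these assumptions, the full-rate codec has 170 bits per frame and the eighth-rate codec has 16.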
The full-rate and half-rate codecs are based on the CELP algorithm that is based on a combined closed-loop and open-loop analysis (COLA). In SMV, the signal frames are first classified as:
* Silence/Background noise
* Non-stationary unvoiced
* Stationary unvoiced
* Onset
* Non-stationary voiced
* Stationary voiced
Algorithm
The algorithm includes voice activity detection (VAD) followed by an elaborate frame classification scheme. Silence/background noise and stationary unvoiced frames are represented by spectrum-modulated noise and coded at 1/4 or 1/8 rate. SMV uses four subframes for full rate and two or three subframes for half rate. The stochastic (fixed) codebook structure is also elaborate and uses sub-codebooks, each tuned for a particular type of speech. The sub-codebooks have different degrees of pulse sparseness (sparser for noise-like excitation). SMV reaches a mean opinion score (MOS) of up to 3.6 at full rate with clean speech.
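The relationship between frame class and coding rate can be sketched as a simple lookup. This is only an illustration of the description above, not the rate-decision logic of the standard: the names `FrameClass` and `candidate_rates` are hypothetical, and the mapping of the remaining four classes to the CELP-based full and half rates is an assumption, since the article states only that noise-like frames use the 1/4 or 1/8 rate.

```python
from enum import Enum


class FrameClass(Enum):
    """The six frame classes listed in this article."""
    SILENCE_BACKGROUND_NOISE = "silence/background noise"
    NONSTATIONARY_UNVOICED = "non-stationary unvoiced"
    STATIONARY_UNVOICED = "stationary unvoiced"
    ONSET = "onset"
    NONSTATIONARY_VOICED = "non-stationary voiced"
    STATIONARY_VOICED = "stationary voiced"


# Classes the article says are represented by spectrum-modulated noise.
NOISE_LIKE = {
    FrameClass.SILENCE_BACKGROUND_NOISE,
    FrameClass.STATIONARY_UNVOICED,
}


def candidate_rates(frame_class):
    """Return the rates a frame of this class might be coded at."""
    if frame_class in NOISE_LIKE:
        return ("quarter", "eighth")
    # Assumption: all other classes use one of the CELP-based rates.
    return ("full", "half")


print(candidate_rates(FrameClass.STATIONARY_UNVOICED))
print(candidate_rates(FrameClass.ONSET))
```

A real encoder would also take the selected operating mode into account when choosing among the candidate rates; that step is omitted here.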
The coder works on a frame of 160 speech samples (20 ms) and requires a look-ahead of 80 samples (10 ms) if noise-suppression option B is used. An additional 24 samples (3 ms) of look-ahead are required if noise-suppression option A is used. The algorithmic delay for the coder is therefore 30 ms with noise-suppression option B and 33 ms with noise-suppression option A.
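The delay figures follow directly from the sample counts. The short calculation below assumes a sampling rate of 8 kHz, which is implied by 160 samples lasting 20 ms; the constant and function names are illustrative only.

```python
# Sampling rate implied by the article: 160 samples correspond to 20 ms.
SAMPLE_RATE_HZ = 8000

FRAME_SAMPLES = 160        # one speech frame
LOOKAHEAD_B_SAMPLES = 80   # look-ahead with noise-suppression option B
EXTRA_A_SAMPLES = 24       # additional look-ahead with option A


def samples_to_ms(num_samples):
    """Convert a sample count to milliseconds at SAMPLE_RATE_HZ."""
    return num_samples * 1000 / SAMPLE_RATE_HZ


delay_option_b = samples_to_ms(FRAME_SAMPLES + LOOKAHEAD_B_SAMPLES)
delay_option_a = samples_to_ms(
    FRAME_SAMPLES + LOOKAHEAD_B_SAMPLES + EXTRA_A_SAMPLES
)

print(f"Option B: {delay_option_b} ms")  # 240 samples -> 30.0 ms
print(f"Option A: {delay_option_a} ms")  # 264 samples -> 33.0 ms
```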
The next evolution of CDMA speech codecs is VMR-WB, which provides much higher speech quality through wideband coding while fitting the same networks.
SMV can also be used in the 3GPP2 container file format, 3G2.
External links
* RTP Payload Format for Enhanced Variable Rate Codecs (EVRC) and Selectable Mode Vocoders (SMV)