This article is about compressed sensing in speech signals.
In communications technology, the technique of compressed sensing (CS) may be applied to the processing of speech signals under certain conditions. In particular, CS can be used to reconstruct a sparse vector from a smaller number of measurements, provided the signal can be represented in a sparse domain. "Sparse domain" refers to a domain in which only a few coefficients have non-zero values.
Theory
Suppose a signal $x \in \mathbb{R}^{N}$ can be represented in a domain where only $K$ coefficients out of $N$ (where $K \ll N$) are non-zero; the signal is then said to be sparse in that domain. The reconstructed sparse vector can be used to recover the original signal if the sparse domain of the signal is known. CS can therefore be applied to a speech signal only if a sparse domain of the speech signal is known.
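Consider a speech signal $x$, which can be represented in a domain $\Psi$ such that $x = \Psi\alpha$, where the speech signal $x \in \mathbb{R}^{N}$, the dictionary matrix $\Psi \in \mathbb{R}^{N \times N}$ and the sparse coefficient vector $\alpha \in \mathbb{R}^{N}$. This speech signal is said to be sparse in the domain $\Psi$ if the number of significant (non-zero) coefficients in the sparse vector $\alpha$ is $K$, where $K \ll N$.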
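The definition of sparsity above can be illustrated with a minimal sketch. It assumes an orthonormal DCT dictionary and a toy coefficient vector standing in for a speech frame; neither choice comes from the article, and the names `dct_dictionary`, `count_significant`, `psi` and `alpha` are hypothetical.

```python
import numpy as np

def dct_dictionary(n):
    """Return an orthonormal n x n dictionary whose columns are DCT-II basis vectors."""
    k = np.arange(n)[None, :]          # column index = frequency
    t = np.arange(n)[:, None]          # row index = time sample
    psi = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    psi[:, 0] /= np.sqrt(2.0)          # rescale the DC column for orthonormality
    return psi

def count_significant(alpha, rel_threshold=1e-6):
    """Count coefficients whose magnitude exceeds rel_threshold * max |alpha|."""
    return int(np.sum(np.abs(alpha) > rel_threshold * np.max(np.abs(alpha))))

N = 256
psi = dct_dictionary(N)

# Assumption: a toy coefficient vector with three non-zero entries,
# used in place of a real speech frame.
alpha = np.zeros(N)
alpha[[5, 17, 40]] = [1.0, -0.6, 0.3]
x = psi @ alpha                        # synthesis: signal = dictionary times coefficients

alpha_back = psi.T @ x                 # analysis step, valid because psi is orthonormal
K = count_significant(alpha_back)      # number of significant coefficients
```

Real speech is only approximately sparse, so in practice the threshold in `count_significant` would need to be chosen far less strictly than in this toy case.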
The observed signal $x$ is of dimension $N \times 1$. To reduce the complexity of solving for $\alpha$ using CS, the speech signal is observed using a measurement matrix $\Phi$ such that

$$y = \Phi x \qquad (1)$$

where $y \in \mathbb{R}^{M}$, and the measurement matrix $\Phi \in \mathbb{R}^{M \times N}$ such that $M \ll N$.
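As a rough sketch of the measurement step, the snippet below multiplies a length-N frame by a wide random matrix to obtain far fewer measurements than samples. The Gaussian matrix, the sizes and the random placeholder frame are assumptions made for illustration, and `gaussian_measurement_matrix` is a hypothetical helper.

```python
import numpy as np

def gaussian_measurement_matrix(m, n, rng):
    """Return an m x n matrix with i.i.d. zero-mean Gaussian entries of variance 1/m."""
    return rng.standard_normal((m, n)) / np.sqrt(m)

rng = np.random.default_rng(0)         # fixed seed so the sketch is repeatable
N, M = 256, 64                         # illustrative sizes with M much smaller than N

x = rng.standard_normal(N)             # placeholder for one length-N speech frame
phi = gaussian_measurement_matrix(M, N, rng)
y = phi @ x                            # compressed measurements: M values instead of N
```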
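The sparse decomposition problem for eq. 1 can be solved as a standard $\ell_1$ minimization:

$$\hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_{1} \quad \text{subject to} \quad y = \Phi\Psi\alpha$$

If the measurement matrix $\Phi$ satisfies the restricted isometry property (RIP) and is incoherent with the dictionary matrix $\Psi$, then the reconstructed signal closely approximates the original speech signal.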
Different types of measurement matrices, such as random matrices, can be used for speech signals.
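To make the incoherence requirement concrete, the following sketch compares two common random constructions (Gaussian and Bernoulli, both assumptions here, not choices named by the article) against an orthonormal DCT dictionary. It uses the usual mutual coherence measure, the largest normalized inner product between a measurement row and a dictionary column scaled by the square root of N, which lies between 1 (incoherent) and the square root of N (fully coherent). All function and variable names are hypothetical.

```python
import numpy as np

def dct_dictionary(n):
    """Return an orthonormal n x n dictionary whose columns are DCT-II basis vectors."""
    k = np.arange(n)[None, :]
    t = np.arange(n)[:, None]
    psi = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    psi[:, 0] /= np.sqrt(2.0)
    return psi

def mutual_coherence(phi, psi):
    """sqrt(N) * max |<row of phi, column of psi>| after normalizing both to unit length."""
    rows = phi / np.linalg.norm(phi, axis=1, keepdims=True)
    cols = psi / np.linalg.norm(psi, axis=0, keepdims=True)
    n = psi.shape[0]
    return float(np.sqrt(n) * np.max(np.abs(rows @ cols)))

rng = np.random.default_rng(0)
N, M = 256, 64
psi = dct_dictionary(N)

gaussian = rng.standard_normal((M, N))            # i.i.d. Gaussian entries
bernoulli = rng.choice([-1.0, 1.0], size=(M, N))  # i.i.d. +/-1 entries

mu_gauss = mutual_coherence(gaussian, psi)
mu_bern = mutual_coherence(bernoulli, psi)

# Worst case for comparison: measuring with rows taken from the dictionary itself.
mu_worst = mutual_coherence(psi.T[:M], psi)
```

Low coherence is a practical proxy only; it does not by itself certify that a particular matrix satisfies the RIP.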
Estimating the sparsity of a speech signal is difficult because the speech signal varies greatly over time, and so does its sparsity. Ideally, the sparsity would be tracked over time at low computational cost. If this is not possible, the worst-case sparsity can be assumed for a given speech signal.
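One simple way to track sparsity over time, and to fall back on a worst-case value, is sketched below. It is an illustrative heuristic rather than a method taken from the article: the signal is cut into non-overlapping frames, each frame is analyzed in an assumed orthonormal DCT dictionary, and the sparsity is taken as the smallest number of coefficients holding 99% of the frame energy. The frame length, the energy fraction and all names are assumptions, and the test signal is synthetic.

```python
import numpy as np

def dct_dictionary(n):
    """Return an orthonormal n x n dictionary whose columns are DCT-II basis vectors."""
    k = np.arange(n)[None, :]
    t = np.arange(n)[:, None]
    psi = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    psi[:, 0] /= np.sqrt(2.0)
    return psi

def frame_sparsity(frame, psi, energy_fraction=0.99):
    """Smallest number of coefficients that hold energy_fraction of the frame energy."""
    alpha = psi.T @ frame                       # analysis in the orthonormal dictionary
    energy = np.sort(alpha ** 2)[::-1]          # coefficient energies, largest first
    total = energy.sum()
    if total == 0.0:
        return 0                                # silent frame
    cumulative = np.cumsum(energy) / total
    return int(np.searchsorted(cumulative, energy_fraction) + 1)

rng = np.random.default_rng(0)
N = 128                                         # assumed frame length in samples
psi = dct_dictionary(N)

# Synthetic stand-in for speech: three frames with different known sparsities.
true_k = [2, 8, 4]
frames = []
for k in true_k:
    alpha = np.zeros(N)
    alpha[rng.choice(N, size=k, replace=False)] = 1.0
    frames.append(psi @ alpha)
signal = np.concatenate(frames)

per_frame = [frame_sparsity(signal[i:i + N], psi) for i in range(0, len(signal), N)]
worst_case = max(per_frame)                     # conservative sparsity for the whole signal
```

Sizing the number of measurements from `worst_case` trades efficiency for safety: frames that are sparser than the worst case are then measured more densely than strictly necessary.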
The sparse vector ($\hat{\alpha}$) for a given speech signal is reconstructed from as small a number of measurements ($y$) as possible using $\ell_1$ minimization. The original speech signal is then reconstructed from the calculated sparse vector $\hat{\alpha}$ using the fixed dictionary matrix $\Psi$ as $\hat{x} = \Psi\hat{\alpha}$.
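The two-step recovery described above can be sketched end to end. The sketch assumes that the minimization meant is the standard ℓ1 (basis pursuit) problem, and solves it as a linear program with SciPy's `linprog`; the DCT dictionary, Gaussian measurement matrix, problem sizes and synthetic sparse signal are likewise illustrative assumptions, and `basis_pursuit` is a hypothetical helper rather than a library function.

```python
import numpy as np
from scipy.fft import idct
from scipy.optimize import linprog

def basis_pursuit(a, y):
    """Solve min ||alpha||_1 subject to a @ alpha = y as a linear program."""
    n = a.shape[1]
    # Split alpha = u - v with u, v >= 0, so that ||alpha||_1 = sum(u) + sum(v).
    cost = np.ones(2 * n)
    a_eq = np.hstack([a, -a])
    result = linprog(cost, A_eq=a_eq, b_eq=y, bounds=(0, None), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.x[:n] - result.x[n:]

rng = np.random.default_rng(0)
N, M, K = 128, 48, 5                            # illustrative sizes with K << M << N

# Fixed dictionary: columns are the orthonormal inverse-DCT basis vectors.
psi = idct(np.eye(N), axis=0, norm="ortho")

# Synthetic K-sparse coefficient vector standing in for a speech frame.
alpha = np.zeros(N)
alpha[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)
x = psi @ alpha

phi = rng.standard_normal((M, N)) / np.sqrt(M)  # random measurement matrix
y = phi @ x                                     # compressed measurements

alpha_hat = basis_pursuit(phi @ psi, y)         # step 1: sparse vector from measurements
x_hat = psi @ alpha_hat                         # step 2: signal from the sparse vector
relative_error = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
```

For noisy or only approximately sparse frames, the equality constraint is usually relaxed to an error tolerance, which turns the problem into a different (though related) convex program than the one solved here.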
Estimation of both the dictionary matrix and the sparse vector from random measurements only has been done iteratively.
The speech signal reconstructed from estimated sparse vector and dictionary matrix is much closer to the original signal.
Further iterative approaches that calculate both the dictionary matrix and the speech signal from only random measurements of the speech signal have also been developed.
Applications
The application of structured sparsity for joint speech localization-separation in reverberant acoustics has been investigated for multiparty speech recognition. Further applications of the concept of sparsity are yet to be studied in the field of speech processing. The idea behind applying CS to speech signals is to formulate algorithms or methods that use only those random measurements ($y$) to carry out various forms of application-based processing such as speaker recognition and speech enhancement.