
Keyword spotting (or more simply, word spotting) is a problem that was historically first defined in the context of speech processing. In speech processing, keyword spotting deals with the identification of keywords in utterances. Keyword spotting is also defined as a separate, but related, problem in the context of document image processing. In document image processing, keyword spotting is the problem of finding all instances of a query word that exist in a scanned document image, without fully recognizing it.


In speech processing

The first works in keyword spotting appeared in the late 1980s. A special case of keyword spotting is wake word (also called hot word) detection, used by personal digital assistants such as Alexa or Siri to activate the dormant speaker, in other words to "wake up" when their name is spoken. In the United States, the National Security Agency has made use of keyword spotting since at least 2006. This technology allows analysts to search through large volumes of recorded conversations and isolate mentions of suspicious keywords. Recordings can be indexed, and analysts can run queries over the database to find conversations of interest.
IARPA funded research into keyword spotting in the Babel program.

Some algorithms used for this task are:
* Sliding window and garbage model
* K-best hypothesis
* Iterative Viterbi decoding
* Convolutional neural network on Mel-frequency cepstrum coefficients
* Transformer-based small-footprint keyword spotting
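
The sliding window and garbage model approach listed above can be illustrated with a toy sketch. It is not taken from any particular system: it assumes each audio frame is reduced to a single number, the keyword model is a sequence of Gaussian means (one per frame), and the garbage (background) model is one broad Gaussian. The function name `spot_keyword` and all parameter values are hypothetical. Each window is scored by the average log-likelihood ratio between the keyword model and the garbage model, and windows above a threshold are reported.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def spot_keyword(frames, keyword_means, keyword_var,
                 garbage_mean, garbage_var, threshold):
    """Slide a window over `frames` and return (start, score) for each
    window whose per-frame log-likelihood ratio exceeds `threshold`.

    Toy assumption: one scalar feature per frame, and a keyword model
    made of one Gaussian mean per frame sharing a single variance.
    """
    width = len(keyword_means)
    hits = []
    for start in range(len(frames) - width + 1):
        window = frames[start:start + width]
        # Log-likelihood of the window under the keyword model.
        keyword_ll = sum(gaussian_logpdf(x, m, keyword_var)
                         for x, m in zip(window, keyword_means))
        # Log-likelihood of the same window under the garbage model.
        garbage_ll = sum(gaussian_logpdf(x, garbage_mean, garbage_var)
                         for x in window)
        score = (keyword_ll - garbage_ll) / width
        if score > threshold:
            hits.append((start, score))
    return hits


# Hypothetical feature stream in which the keyword pattern starts at index 3.
frames = [0.1, -0.2, 0.0, 1.0, 2.0, 3.0, 0.2, -0.1]
hits = spot_keyword(frames, keyword_means=[1.0, 2.0, 3.0], keyword_var=0.1,
                    garbage_mean=0.0, garbage_var=4.0, threshold=0.0)
print(hits)
```

In practice the frames would be feature vectors such as Mel-frequency cepstrum coefficients and the two models would be trained acoustic models; the scalar Gaussians here only stand in for them to show the windowed likelihood-ratio test.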


In document image processing

Keyword spotting in document image processing can be seen as an instance of the more generic problem of content-based image retrieval (CBIR). Given a query, the goal is to retrieve the most relevant instances of words in a collection of scanned documents. The query may be a text string (query-by-string keyword spotting) or a word image (query-by-example keyword spotting).
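
As a rough illustration of query-by-example retrieval, the sketch below ranks candidate word images by their distance to a query word image. It assumes every word image has already been segmented and converted to a fixed-length descriptor vector; the descriptors, identifiers, and the function name `query_by_example` are hypothetical, and plain Euclidean distance stands in for whatever matching method a real system would use.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_example(query_descriptor, candidates, top_k=3):
    """Return the ids of the `top_k` candidates closest to the query.

    `candidates` is a list of (word_id, descriptor) pairs; how the
    descriptors are computed from the word images is left as an assumption.
    """
    ranked = sorted(candidates,
                    key=lambda item: euclidean(query_descriptor, item[1]))
    return [word_id for word_id, _ in ranked[:top_k]]


# Hypothetical descriptors for three segmented word images.
candidates = [
    ("page1_word4", [0.90, 0.10, 0.40]),
    ("page1_word7", [0.20, 0.80, 0.50]),
    ("page2_word2", [0.85, 0.15, 0.35]),
]
top = query_by_example([0.88, 0.12, 0.38], candidates, top_k=2)
print(top)
```

Query-by-string spotting would follow the same ranking step, with the difference that the query descriptor must be derived from the text string rather than from an image.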

