Keyword spotting (or more simply, word spotting) is a problem that was historically first defined in the context of speech processing.
In speech processing, keyword spotting deals with the identification of keywords in utterances.
Keyword spotting is also defined as a separate, but related, problem in the context of document image processing.
In document image processing, keyword spotting is the problem of finding all instances of a query word that exist in a scanned document image, without fully recognizing it.
In speech processing
The first works in keyword spotting appeared in the late 1980s.
A special case of keyword spotting is wake word (also called hot word) detection, used by personal digital assistants such as Alexa or Siri to activate the dormant speaker, in other words to "wake up" when their name is spoken.
In the United States, the National Security Agency has made use of keyword spotting since at least 2006. This technology allows analysts to search through large volumes of recorded conversations and isolate mentions of suspicious keywords. Recordings can be indexed and analysts can run queries over the database to find conversations of interest.
IARPA funded research into keyword spotting in the Babel program.
Some algorithms used for this task are:
* Sliding window and garbage model
* K-best hypothesis
* Iterative Viterbi decoding
* Convolutional neural network on Mel-frequency cepstrum coefficients
* Transformer-based small-footprint keyword spotting
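As an illustration of the sliding window and garbage model idea, the following is a minimal sketch, not a description of any particular published system. It assumes that per-frame log-likelihoods under a keyword model and under a garbage (filler) model have already been computed elsewhere. The function name `spot_keyword` and the parameters `window` and `threshold` are hypothetical and chosen only for this example. Practical systems typically obtain such scores from trained acoustic models rather than from fixed lists.

```python
from typing import List, Tuple


def spot_keyword(
    keyword_ll: List[float],
    garbage_ll: List[float],
    window: int,
    threshold: float,
) -> List[Tuple[int, float]]:
    """Slide a fixed-length window over the frames and report the windows
    whose mean log-likelihood ratio (keyword vs. garbage) exceeds threshold.

    Returns a list of (start_frame, score) pairs.
    """
    if len(keyword_ll) != len(garbage_ll):
        raise ValueError("both score sequences must cover the same frames")
    if window <= 0:
        raise ValueError("window must be positive")

    hits = []
    for start in range(len(keyword_ll) - window + 1):
        end = start + window
        # Per-frame log-likelihood ratio, summed over the window.
        ratio = sum(
            k - g for k, g in zip(keyword_ll[start:end], garbage_ll[start:end])
        )
        score = ratio / window  # normalise by window length
        if score > threshold:
            hits.append((start, score))
    return hits


# Hypothetical scores: the keyword model fits frames 2-4 better than the garbage model.
keyword_scores = [-5.0, -5.0, -1.0, -1.0, -1.0, -5.0]
garbage_scores = [-2.0, -2.0, -2.0, -2.0, -2.0, -2.0]
detections = spot_keyword(keyword_scores, garbage_scores, window=3, threshold=0.5)
print(detections)
```

The threshold trades off missed detections against false alarms. A fixed window length is a simplification, since spoken keywords vary in duration.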
In document image processing
Keyword spotting in document image processing can be seen as an instance of the more generic problem of content-based image retrieval (CBIR).
Given a query, the goal is to retrieve the most relevant instances of words in a collection of scanned documents.
The query may be a text string (query-by-string keyword spotting) or a word image (query-by-example keyword spotting).
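Query-by-example retrieval can be sketched as a nearest-neighbour ranking, under the assumption that every segmented word image has already been converted to a fixed-length feature vector by some descriptor. The descriptor itself, the function name `rank_by_example`, and the word identifiers below are hypothetical and serve only as illustration. Euclidean distance is one possible similarity measure among many.

```python
import math
from typing import Dict, List, Sequence


def rank_by_example(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[str]:
    """Return the identifiers of the top_k candidate word images whose
    feature vectors are closest to the query vector (Euclidean distance)."""
    scored = [
        (math.dist(query, features), word_id)
        for word_id, features in candidates.items()
    ]
    scored.sort()  # smallest distance first
    return [word_id for _, word_id in scored[:top_k]]


# Hypothetical feature vectors for three word images in a scanned collection.
collection = {
    "page1_word3": [0.1, 0.9],
    "page2_word7": [0.9, 0.1],
    "page4_word1": [0.0, 1.0],
}
ranking = rank_by_example([0.0, 1.0], collection, top_k=2)
print(ranking)
```

A query-by-string variant would additionally need a way to map the text string into the same feature space as the word images, which this sketch does not attempt.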