{{Short description, Interactive voice user interface
A voice browser is a
software application
Application software is any computer program that is intended for end-user use not computer operator, operating, system administration, administering or computer programming, programming the computer. An application (app, application program, sof ...
that presents an interactive
voice user interface
A voice-user interface (VUI) enables spoken human interaction with computers, using speech recognition to understand spoken commands and answer questions, and typically text to speech to play a reply. A voice command device is a device controlle ...
to the user in a manner analogous to the functioning of a
web browser
A web browser, often shortened to browser, is an application for accessing websites. When a user requests a web page from a particular website, the browser retrieves its files from a web server and then displays the page on the user's scr ...
interpreting
Hypertext Markup Language
Hypertext Markup Language (HTML) is the standard markup language for documents designed to be displayed in a web browser. It defines the content and structure of web content. It is often assisted by technologies such as Cascading Style Sheet ...
(HTML). Dialog documents interpreted by voice browser are often encoded in standards-based markup languages, such as Voice Dialog Extensible Markup Language (VoiceXML), a standard by the
World Wide Web Consortium
The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. Founded in 1994 by Tim Berners-Lee, the consortium is made up of member organizations that maintain full-time staff working together in ...
.
A voice browser presents information aurally, using pre-recorded audio file playback or
text-to-speech
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or Computer hardware, hardware products. A text-to-speech (TTS) system conv ...
synthesis software. A voice browser obtains information using
speech recognition
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also ...
and keypad entry, such as
DTMF
Dual-tone multi-frequency (DTMF) signaling is a telecommunication signaling system using the voice-frequency band over telephone lines between telephone equipment and other communications devices and switching centers. DTMF was first developed ...
detection.
As speech recognition and web technologies have matured, voice applications are deployed commercially in many industries and voice browsers are supplanting traditional proprietary
interactive voice response
Interactive voice response (IVR) is a technology that allows telephone users to interact with a computer-operated telephone system through the use of voice and DTMF tones input with a keypad. In telephony, IVR allows customers to interact with a ...
(IVR) systems. Voice browser software is delivered in a variety of implementations models.
Systems that present a voice browser to a user, typically provide interfaces to the
public switched telephone network
The public switched telephone network (PSTN) is the aggregate of the world's telephone networks that are operated by national, regional, or local telephony operators. It provides infrastructure and services for public telephony. The PSTN consists o ...
or to a
private branch exchange
A business telephone system is a telephone system typically used in business environments, encompassing the range of technology from the key telephone system (KTS) to the private branch exchange (PBX).
A business telephone system differs from ...
.
See also
*
Call Control eXtensible Markup Language
Call Control eXtensible Markup Language (CCXML) is an XML standard designed to provide asynchronous event-based telephony support to VoiceXML. Its current status is a W3C recommendation, adopted May 10, 2011. Whereas VoiceXML is designed to provid ...
(CCXML)
*
Speech Recognition Grammar Specification
Speech Recognition Grammar Specification (SRGS) is a W3C standard for how ''speech recognition grammars'' are specified. A speech recognition grammar is a set of word patterns, and tells a speech recognition system what to expect a human to say. ...
(SRGS)
*
Semantic Interpretation for Speech Recognition
Semantic Interpretation for Speech Recognition (SISR) defines the syntax and semantics of annotations to grammar rules in the Speech Recognition Grammar Specification (SRGS). Since 5 April 2007, it is a World Wide Web Consortium recommendation.
...
(SISR)
*
Speech Synthesis Markup Language
Speech Synthesis Markup Language (SSML) is an XML-based markup language for speech synthesis applications. It is a recommendation of the W3C's Voice Browser Working Group. SSML is often embedded in VoiceXML scripts to drive interactive telephony sy ...
(SSML)
*
Pronunciation Lexicon Specification
The Pronunciation Lexicon Specification (PLS) is a W3C Recommendation, which is designed to enable interoperable specification of pronunciation information for both speech recognition and speech synthesis engines within voice browsing application ...
(PLS)
*
ECMAScript
ECMAScript (; ES) is a standard for scripting languages, including JavaScript, JScript, and ActionScript. It is best known as a JavaScript standard intended to ensure the interoperability of web pages across different web browsers. It is stan ...
- Scripting language supported by most voice browsers