Speech recognition software is available for many computing platforms, operating systems, use models, and software licenses. Here is a listing of such, grouped in various useful ways.
Acoustic models and speech corpus (compilation)
The following list presents notable speech recognition software engines with a brief synopsis of characteristics.
Macintosh
Cross-platform web apps based on Chrome
The following list presents notable speech recognition software that operates in a Chrome browser as web apps. These apps make use of the HTML5 Web Speech API.
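As a rough illustration of how a web app might call the Web Speech API, here is a minimal TypeScript sketch. It is not taken from any of the listed products. The helper names `findRecognizer` and `listenOnce` and the small structural types are hypothetical, introduced only for this example. The sketch assumes the browser exposes the recognizer constructor as `SpeechRecognition` or, as in Chrome, under the `webkitSpeechRecognition` prefix.

```typescript
// Structural types covering only the parts of the Web Speech API used below.
// They are local assumptions for this sketch, not the full browser typings.
interface RecognitionAlternative {
  transcript: string;
  confidence: number;
}
interface RecognitionEvent {
  results: ArrayLike<ArrayLike<RecognitionAlternative>>;
}
interface Recognizer {
  lang: string;
  interimResults: boolean;
  onresult: ((event: RecognitionEvent) => void) | null;
  onerror: ((event: { error: string }) => void) | null;
  start(): void;
}
type RecognizerConstructor = new () => Recognizer;

// Look up the recognizer constructor. Chrome exposes it under a vendor
// prefix; other browsers may expose it unprefixed or not at all.
function findRecognizer(scope: any = globalThis): RecognizerConstructor | undefined {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
}

// Hypothetical helper: start one recognition session and pass the top
// transcript of the first result to the caller.
function listenOnce(
  Ctor: RecognizerConstructor,
  onText: (text: string) => void,
  onError: (reason: string) => void = () => {},
  lang: string = "en-US",
): Recognizer {
  const recognizer = new Ctor();
  recognizer.lang = lang;
  recognizer.interimResults = false; // deliver final results only
  recognizer.onresult = (event) => onText(event.results[0][0].transcript);
  recognizer.onerror = (event) => onError(event.error);
  recognizer.start();
  return recognizer;
}
```

In a page, one would typically call `findRecognizer()` first and only call `listenOnce` when it returns a constructor, since not every browser implements the API. Passing the constructor in as a parameter also lets the helper be exercised outside a browser with a stand-in class.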
Mobile devices and smartphones
Many mobile phone handsets, including feature phones and smartphones such as iPhones and BlackBerrys, have basic dial-by-voice features built in. Many third-party apps have implemented natural-language speech recognition support, including:
Windows
Windows built-in speech recognition
Windows Speech Recognition version 8.0 by Microsoft comes built into Windows Vista, Windows 7, Windows 8 and Windows 10.
Speech Recognition is available only in English, French, Spanish, German, Japanese, Simplified Chinese, and Traditional Chinese, and only in the corresponding language version of Windows. This means you cannot use the speech recognition engine for one language if your version of Windows is in another language. Windows 7 Ultimate and Windows 8 Pro allow you to change the system language, and therefore change which speech engine is available. Windows Speech Recognition evolved into Cortana, a personal assistant included in Windows 10.
Windows 7, 8, 10, 11 third-party speech recognition
* Braina – Dictate into third-party software and websites, fill web forms and execute vocal commands.
* Dragon NaturallySpeaking from Nuance Communications – Successor to the older DragonDictate product. Focus on dictation. 64-bit Windows support since version 10.1.
* SpeechMagic – Formerly owned by Philips; acquired by Nuance Communications. Medical industry focus according to Frost & Sullivan. Standalone or embedded.
* Tazti – Create speech command profiles to play PC games and control applications and programs. Create speech commands to open files, folders, webpages and applications. Windows 7, Windows 8 and Windows 8.1 versions.
* Voice Finger – Software that extends the Windows Speech Recognition system with several additional features. It enables controlling the mouse and the keyboard using only the voice. It is especially useful for users who have disabilities or are recovering from computer-related injuries.
Windows XP or 2000 only
* Microsoft Speech API – Speech recognition functionality included as part of Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC Edition. It can also be downloaded as part of the Speech SDK 5.1 for Windows applications. Because the SDK is aimed at developers building speech applications, it lacks any user interface in its pure form and is therefore unsuitable for end users.
Built-in software
* Microsoft Kinect includes built-in software which allows speech recognition of commands.
* Older generations of Nokia phones, such as the Nokia N Series (before Nokia adopted Windows Phone 7), used speech recognition with family names from the contact list and a few commands.
* Siri – Apple's personal assistant for iOS, originally implemented in the iPhone 4S; it uses technology from Nuance Communications.
* Cortana – Microsoft's personal assistant built into Windows Phone and Windows 10.
Interactive voice response
The following are interactive voice response (IVR) systems:
* CSLU Toolkit
* Genesys
* HTK – copyrighted by Microsoft, but allows altering software for licensee's internal use
* LumenVox ASR
* Tellme Networks; acquired by Microsoft
Unix-like x86 and x86-64 speech transcription software
* Janus Recognition Toolkit (JRTk)
* Mozilla DeepSpeech – An open-source speech-to-text engine developed by Mozilla, based on Baidu's Deep Speech research paper.
Discontinued software
* IBM VoiceType (formerly IBM Personal Dictation System)
* IBM ViaVoice – Embedded version still maintained by IBM. Not supported on Windows versions later than Windows Vista. Untested above macOS 10.4 or on Macintoshes with an Intel chipset.
* Quack.com – Acquired by AOL; the name has since been reused for an iPad search app.
* SpeechWorks from Nuance Communications.
* Yap Speech Cloud – Speech-to-text platform acquired by Amazon.com.