Speech recognition software is available for many computing platforms, operating systems, use models, and software licenses. Here is a listing of such software, grouped in various useful ways.
Acoustic models and speech corpus (compilation)
The following list presents notable speech recognition software engines with a brief synopsis of characteristics.
Macintosh
Cross-platform web apps based on Chrome
The following list presents notable speech recognition software that operates in a Chrome browser as web apps. These apps use the HTML5 Web Speech API.
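As a rough illustration of how a web app might reach the Web Speech API, the sketch below looks for the recognizer constructor under its standard name (`SpeechRecognition`) or its Chrome-prefixed name (`webkitSpeechRecognition`) and configures it. The helper names `createRecognizer` and `firstTranscript`, and the local interfaces `RecognitionLike`, `ResultEventLike`, and `SpeechHost`, are assumptions made for this example so that it compiles without browser type definitions; they are not part of the API and are not taken from any app in the list.

```typescript
// Minimal local shapes for the parts of the Web Speech API used here.
// These interfaces are assumptions of this sketch, not official typings.
interface ResultEventLike {
  results: ArrayLike<ArrayLike<{ transcript: string }>>;
}

interface RecognitionLike {
  lang: string;
  continuous: boolean;
  interimResults: boolean;
  onresult: ((event: ResultEventLike) => void) | null;
  start(): void;
}

type RecognitionCtor = new () => RecognitionLike;

interface SpeechHost {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor; // prefixed name exposed by Chrome
}

// Hypothetical helper: returns a configured recognizer, or null when the
// host object exposes neither constructor.
function createRecognizer(host: SpeechHost, lang: string): RecognitionLike | null {
  const Ctor = host.SpeechRecognition || host.webkitSpeechRecognition;
  if (!Ctor) {
    return null;
  }
  const recognizer = new Ctor();
  recognizer.lang = lang;            // BCP 47 tag such as "en-US"
  recognizer.continuous = false;     // stop after a single utterance
  recognizer.interimResults = false; // deliver final results only
  return recognizer;
}

// Hypothetical helper: extracts the top-ranked transcript of the first
// result, or an empty string when the event carries no results.
function firstTranscript(event: ResultEventLike): string {
  const first = event.results[0];
  const best = first ? first[0] : undefined;
  return best ? best.transcript : "";
}
```

In a browser, the host object would be `window` (for example `createRecognizer(window as unknown as SpeechHost, "en-US")`), after which the app would assign an `onresult` handler and call `start()`. Passing the host as a parameter keeps the feature detection testable outside a browser.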
Mobile devices and smartphones
Many mobile phone handsets, including feature phones and smartphones such as iPhones and BlackBerrys, have basic dial-by-voice features built in. Many third-party apps have implemented natural-language speech recognition support.
Windows
Windows built-in speech recognition
Windows Speech Recognition version 8.0 by Microsoft comes built into Windows Vista, Windows 7, Windows 8, and Windows 10.
Speech Recognition is available only in English, French, Spanish, German, Japanese, Simplified Chinese, and Traditional Chinese, and only in the corresponding language version of Windows: the speech recognition engine for one language cannot be used on a version of Windows in another language. Windows 7 Ultimate and Windows 8 Pro allow the system language to be changed, which in turn changes which speech engine is available. Windows Speech Recognition evolved into Cortana, a personal assistant included in Windows 10.
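The availability rule described above can be sketched as a small check. This is only an illustration of the rule as stated in this section: the names `ENGINE_LANGUAGES`, `LANGUAGE_SWITCHING_EDITIONS`, and `canUseEngine` are hypothetical, and the edition list contains only the two editions the text names.

```typescript
// Languages listed above as supported by Windows Speech Recognition.
const ENGINE_LANGUAGES: string[] = [
  "English", "French", "Spanish", "German",
  "Japanese", "Simplified Chinese", "Traditional Chinese",
];

// Editions the text names as allowing a change of system language.
const LANGUAGE_SWITCHING_EDITIONS: string[] = ["Windows 7 Ultimate", "Windows 8 Pro"];

// Hypothetical helper: can a user on the given edition, with the given
// system language, end up using the engine for `engineLanguage`?
function canUseEngine(edition: string, systemLanguage: string, engineLanguage: string): boolean {
  // No engine exists for languages outside the supported list.
  if (ENGINE_LANGUAGES.indexOf(engineLanguage) === -1) {
    return false;
  }
  // The engine always matches the current system language.
  if (systemLanguage === engineLanguage) {
    return true;
  }
  // Otherwise the user must first switch the system language,
  // which only the named editions permit.
  return LANGUAGE_SWITCHING_EDITIONS.indexOf(edition) !== -1;
}
```

For instance, under these assumptions a French-language Windows 7 Home Premium system can use only the French engine, while a Windows 7 Ultimate system can reach any of the seven engines after its system language is changed.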
Windows 7, 8, 10, 11 third-party speech recognition
* Braina – Dictate into third-party software and websites, fill web forms, and execute voice commands.
* Dragon NaturallySpeaking from Nuance Communications – Successor to the older DragonDictate product. Focus on dictation. 64-bit Windows support since version 10.1.
* Tazti – Create speech command profiles to play PC games and control applications and programs. Create speech commands to open files, folders, webpages, and applications. Windows 7, Windows 8, and Windows 8.1 versions.
* Voice Finger – Software that improves the Windows Speech Recognition system by adding several extensions to it. The software enables control of the mouse and the keyboard using only the voice. It is especially useful for helping users overcome disabilities or recover from computer-related injuries.
Microsoft Speech API
The first version of the Microsoft Speech API was released for Windows NT 3.51 and Windows 95 in 1995; it was then part of Windows up to Windows Vista. This initial version already contained Direct Speech Recognition and Direct Text To Speech APIs, which applications could use to directly control engines, as well as simplified higher-level Voice Command and Voice Talk APIs. Speech recognition functionality was included as part of Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC Edition. It can also be downloaded as part of the Speech SDK 5.1 for Windows applications. Because the SDK is aimed at developers building speech applications, the pure SDK form lacks any user interface (although numerous applications were available) and is therefore unsuitable for end users.
Built-in software
* Microsoft Kinect includes built-in software which allows speech recognition of commands.
* Older generations of Nokia phones like Nokia N Series (before using Windows 7 mobile technology) used speech recognition with family names from the contact list and a few commands.
* Siri, Apple's personal assistant for iOS, originally implemented in the iPhone 4S, which uses technology from Nuance Communications.
* Cortana, Microsoft's personal assistant built into Windows Phone and Windows 10.
Interactive voice response
The following are interactive voice response (IVR) systems:
* CSLU Toolkit
* Genesys
* HTK – copyrighted by Microsoft, but allows altering software for licensee's internal use
* LumenVox ASR
* Tellme Networks; acquired by Microsoft
Unix-like x86 and x86-64 speech transcription software
* Janus Recognition Toolkit (JRTk)
* Mozilla DeepSpeech – An open-source speech-to-text engine based on Baidu's Deep Speech research paper.
Discontinued software
* IBM VoiceType (formerly IBM Personal Dictation System)
* IBM ViaVoice – Embedded version still maintained by IBM. No longer supported on Windows versions later than Windows Vista. Untested above macOS 10.4 or on Macintoshes with an Intel chipset.
* Quack.com – Acquired by AOL; the name has since been reused for an iPad search app.
* SpeechWorks from Nuance Communications.
* Yap Speech Cloud – Speech-to-text platform acquired by Amazon.com.