List Of Speech Recognition Software
Speech recognition software is available for many computing platforms, operating systems, use models, and software licenses. Here is a listing of such, grouped in various useful ways. Acoustic models and speech corpus (compilation) The following list presents notable speech recognition software engines with a brief synopsis of characteristics. Macintosh Cross-platform web apps based on Chrome The following list presents notable speech recognition software that operate in a Chrome browser as web apps. They make use of HTML5 Web-Speech-API. Mobile devices and smartphones Many mobile phone handsets, including feature phones and smartphones such as iPhones and BlackBerrys, have basic dial-by-voice features built in. Many third-party apps have implemented natural-language speech recognition support, including: Windows Windows built-in speech recognition The Windows Speech Recognition version 8.0 by Microsoft comes built into Windows Vista, Windows 7, Windows 8 and Windows 10. ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Speech Recognition
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as automatic speech recognition (ASR), computer speech recognition or speech-to-text (STT). It incorporates knowledge and research in the computer science, linguistics and computer engineering fields. The reverse process is speech synthesis. Some speech recognition systems require "training" (also called "enrollment") where an individual speaker reads text or isolated vocabulary into the system. The system analyzes the person's specific voice and uses it to fine-tune the recognition of that person's speech, resulting in increased accuracy. Systems that do not use training are called "speaker-independent" systems. Systems that use training are called "speaker dependent". Speech recognition applications include voice user interfaces ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Julius (software)
Julius is a speech recognition engine, specifically a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder software for speech-related researchers and developers. It can perform almost real-time computing (RTC) decoding on most current personal computers (PCs) in 60k word dictation task using word trigram (3-gram) and context-dependent Hidden Markov model (HMM). Major search methods are fully incorporated. It is also modularized carefully to be independent from model structures, and various HMM types are supported such as shared-state triphones and tied-mixture models, with any number of mixtures, states, or phones. Standard formats are adopted to cope with other free modeling toolkit. The main platform is Linux and other Unix workstations, and it works on Windows. Julius is free and open-source software, released under a revised BSD style software license. Julius has been developed as part of a free software toolkit for Japanese LVCSR re ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
IListen
{{lowercase, iListen iListen, developed by MacSpeech, is a speech recognition program for the Apple Macintosh. In 2006, iListen was the only third-party software that allowed inputting text using one's voice that works on newer Macintosh models. Its competitors were Apple's own speech recognition software (built into Mac OS X); Dragon Naturally Speaking by Nuance, running under Windows virtualization software such as Parallels Desktop for Mac or VMware Fusion; and the discontinued speech recognition program ViaVoice by Nuance/IBM. In January 2008, iListen was discontinued by MacSpeech, and was replaced by its new speech recognition program, "Dictate" (released February 15, 2008). "Dictate" uses the same speech-recognition engine as Dragon NaturallySpeaking, having licensed it from Nuance. iListen's speech recognition engine was based on software developed by Philips. See also * MacSpeech_Dictate * MacSpeech MacSpeech, Inc. was a New Hampshire-based technology company that pro ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
MacSpeech Scribe
MacSpeech Scribe is speech recognition software for Mac OS X designed specifically for transcription of recorded voice dictation. It runs on Mac OS X 10.6 Snow Leopard. The software transcribes dictation recorded by an individual speaker. Typically, the speaker will record their dictation using a digital recording device such as a handheld digital recorder, mobile smartphone (e.g. iPhone), or desktop or laptop computer with a suitable microphone. MacSpeech Scribe supports specific audio file formats for recorded dictation: .aif, .aiff, .wav, .mp4, .m4a, and .m4v. MacSpeech Scribe was originally developed by MacSpeech, Inc. and released February 11, 2010, at Macworld Expo in San Francisco. The product is now owned by Nuance Communications which acquired MacSpeech on February 16, 2010. Nuance is the developer of other speech recognition products including Dragon NaturallySpeaking for Windows, Dragon Dictate for Mac (formerly "MacSpeech Dictate MacSpeech, Inc. was a New Hampshire-b ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Dragon Dictate
DragonDictate, Dragon Dictate, or Dragon for Mac is proprietary speech recognition software. The older program, DragonDictate, was originally developed by Dragon Systems for Microsoft Windows, and was replaced by Dragon NaturallySpeaking for Windows. It was later acquired by Nuance Communications. Dragon Dictate for Mac 2.0 (originally named MacSpeech Dictate) is supported only on Mac OS X 10.6 (Snow Leopard). Nuance's other products for Mac include MacSpeech Scribe. Original DragonDictate DragonDictate for Windows was the original speech recognition application from Dragon Systems and used discrete speech where the user must pause between speaking each word. The first version, 1.0 was available only through a few distribution and support partners. It included a Shure cardioid microphone headset. Later it was replaced by Dragon NaturallySpeaking, which allows continuous speech recognition and correction and training of words via the keyboard. NaturallySpeaking remains a Wi ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Python (programming Language)
Python is a high-level programming language, high-level, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation. Python is type system#DYNAMIC, dynamically type-checked and garbage collection (computer science), garbage-collected. It supports multiple programming paradigms, including structured programming, structured (particularly procedural programming, procedural), object-oriented and functional programming. It is often described as a "batteries included" language due to its comprehensive standard library. Guido van Rossum began working on Python in the late 1980s as a successor to the ABC (programming language), ABC programming language, and he first released it in 1991 as Python 0.9.0. Python 2.0 was released in 2000. Python 3.0, released in 2008, was a major revision not completely backward-compatible with earlier versions. Python 2.7.18, released in 2020, was the last release of ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
MIT License
The MIT License is a permissive software license originating at the Massachusetts Institute of Technology (MIT) in the late 1980s. As a permissive license, it puts very few restrictions on reuse and therefore has high license compatibility. Unlike copyleft software licenses, the MIT License also permits reuse within proprietary software, provided that all copies of the software or its substantial portions include a copy of the terms of the MIT License and also a copyright notice. In 2015, the MIT License was the most popular software license on GitHub, and was still the most popular in 2025. Notable projects that use the MIT License include the X Window System, Ruby on Rails, Node.js, Lua (programming language), Lua, jQuery, .NET, Angular (web framework), Angular, and React (JavaScript library), React. License terms The MIT License has the identifier MIT in the SPDX License List. It is also known as the "#Ambiguity and variants, Expat License". It has the following terms: Co ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Transformer (machine Learning Model)
The transformer is a deep learning architecture based on the multi-head attention mechanism, in which text is converted to numerical representations called tokens, and each token is converted into a vector via lookup from a word embedding table. At each layer, each token is then contextualized within the scope of the context window with other (unmasked) tokens via a parallel multi-head attention mechanism, allowing the signal for key tokens to be amplified and less important tokens to be diminished. Transformers have the advantage of having no recurrent units, therefore requiring less training time than earlier recurrent neural architectures (RNNs) such as long short-term memory (LSTM). Later variations have been widely adopted for training large language models (LLM) on large (language) datasets. The modern version of the transformer was proposed in the 2017 paper " Attention Is All You Need" by researchers at Google. Transformers were first developed as an improvement ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Whisper (speech Recognition System)
Whisper is a machine learning model for speech recognition and Transcription (linguistics), transcription, created by OpenAI and first released as open-source software in September 2022. It is capable of transcribing speech in English and several other languages, and is also capable of translating several non-English languages into English. OpenAI claims that the combination of different training data used in its development has led to improved recognition of accents, background noise and jargon compared to previous approaches. Whisper is a Weak_supervision, weakly-supervised deep learning acoustic model, made using an Transformer_(machine_learning_model)#Encoder/decoder_architecture, encoder-decoder transformer architecture. Whisper Large V2 was released on December 8, 2022. Whisper Large V3 was released in November 2023, on the OpenAI Dev Day. Background Speech recognition has had a long history in research; the first approaches made use of statistical methods, such as dyna ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
MacOS
macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. Within the market of Desktop computer, desktop and laptop computers, it is the Usage share of operating systems#Desktop and laptop computers, second most widely used desktop OS, after Microsoft Windows and ahead of all Linux distributions, including ChromeOS and SteamOS. , the most recent release of macOS is MacOS Sequoia, macOS 15 Sequoia, the 21st major version of macOS. Mac OS X succeeded classic Mac OS, the primary Mac operating systems, Macintosh operating system from 1984 to 2001. Its underlying architecture came from NeXT's NeXTSTEP, as a result of NeXT#1997–2006: Acquisition by Apple, Apple's acquisition of NeXT, which also brought Steve Jobs back to Apple. The first desktop version, Mac OS X 10.0, was released on March 24, 2001. Mac ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Linux
Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, packaged as a Linux distribution (distro), which includes the kernel and supporting system software and library (computing), libraries—most of which are provided by third parties—to create a complete operating system, designed as a clone of Unix and released under the copyleft GPL license. List of Linux distributions, Thousands of Linux distributions exist, many based directly or indirectly on other distributions; popular Linux distributions include Debian, Fedora Linux, Linux Mint, Arch Linux, and Ubuntu, while commercial distributions include Red Hat Enterprise Linux, SUSE Linux Enterprise, and ChromeOS. Linux distributions are frequently used in server platforms. Many Linux distributions use the word "Linux" in their name, but the Free ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |