Kaldi (software)

Kaldi is an open-source toolkit written in C++ for speech recognition and signal processing, freely available under the Apache License v2.0. Kaldi aims to provide software that is flexible and extensible, and is intended for use by automatic speech recognition (ASR) researchers building recognition systems. It supports linear transforms, MMI, boosted MMI and MCE discriminative training, feature-space discriminative training, and deep neural networks. Kaldi can generate features such as MFCC, fbank, and fMLLR. As a result, a popular use of Kaldi in recent deep neural network research is to pre-process raw waveforms into acoustic features for end-to-end neural models. Kaldi has been incorporated as part of the CHiME Speech Separation and Recognition Challenge over several successive events.

The software was initially developed as part of a 2009 workshop at Johns Hopkins University. Kaldi is named after the legendary Ethiopian goat herder Kaldi, who was said to have discovered the coffee plant.
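
As a rough illustration of the fbank features mentioned above, the following Python sketch turns a raw waveform into log mel filterbank energies. It is not Kaldi's implementation and will not reproduce the output of Kaldi's own tools. The function names, the Hamming window, the frame sizes, and the naive DFT are all assumptions chosen to keep the example self-contained.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale."""
    return 1127.0 * math.log(1.0 + hz / 700.0)


def mel_filterbank(num_bins, fft_size, sample_rate):
    """Triangular filters spaced evenly on the mel scale, one weight list per filter."""
    num_fft_bins = fft_size // 2 + 1
    mel_low, mel_high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    step = (mel_high - mel_low) / (num_bins + 1)
    filters = []
    for m in range(num_bins):
        left = mel_low + m * step
        center = left + step
        right = center + step
        weights = []
        for k in range(num_fft_bins):
            mel = hz_to_mel(k * sample_rate / fft_size)
            if left < mel <= center:
                weights.append((mel - left) / step)
            elif center < mel < right:
                weights.append((right - mel) / step)
            else:
                weights.append(0.0)
        filters.append(weights)
    return filters


def power_spectrum(frame):
    """Naive O(n^2) DFT; fine for a demo, a real system would use an FFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def fbank(samples, sample_rate, frame_len, frame_shift, num_bins):
    """Return one list of log mel filterbank energies per frame."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    filters = mel_filterbank(num_bins, frame_len, sample_rate)
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_shift):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        power = power_spectrum(frame)
        energies = [sum(w * p for w, p in zip(f, power)) for f in filters]
        # Floor the energies so the logarithm is always defined.
        features.append([math.log(max(e, 1e-10)) for e in energies])
    return features


# Toy input (an assumption, not real speech): 50 ms of a 1 kHz sine at 8 kHz.
sample_rate = 8000
samples = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(400)]
feats = fbank(samples, sample_rate, frame_len=200, frame_shift=80, num_bins=10)
print(len(feats), len(feats[0]))  # prints "3 10": 3 frames, 10 mel bins each
```

Each row of `feats` is the kind of per-frame acoustic feature vector that an end-to-end neural model would consume in place of the raw waveform. MFCCs are conventionally derived from such log mel energies by applying a discrete cosine transform.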


See also

* fMLLR
* List of speech recognition software


External links

* Kaldi – The official GitHub project
* Kaldi paper: The Kaldi Speech Recognition Toolkit
* VOSK – open source and commercial models from Alpha Cephei on Kaldi foundations