Moses is a free software statistical machine translation engine, developed by the University of Edinburgh, that can be used to train statistical models of text translation from a source language to a target language.
Moses then allows new source-language text to be decoded using these models to produce automatic translations in the target language. Training requires a parallel corpus of passages in the two languages, typically manually translated sentence pairs. Moses is released under the LGPL licence and is available both as source code and as binaries for Windows and Linux. Its development is primarily supported by the EuroMatrix project, with funding by the European Commission.
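To make the idea of training on a parallel corpus concrete, the following minimal sketch estimates word translation probabilities from three sentence pairs using IBM Model 1, a classic textbook word-alignment model. It is an illustration only and is not Moses's actual training pipeline; the toy corpus, the function name `train_ibm1`, and the iteration count are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus: (source sentence, target sentence) pairs.
CORPUS = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]

def train_ibm1(pairs, iterations=10):
    """Estimate t(target_word | source_word) with IBM Model 1 (EM)."""
    src_vocab = {f for src, _ in pairs for f in src}
    tgt_vocab = {e for _, tgt in pairs for e in tgt}
    # Start from a uniform distribution over target words.
    t = {(e, f): 1.0 / len(tgt_vocab) for f in src_vocab for e in tgt_vocab}
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of (e, f) links
        total = defaultdict(float)   # expected count of f being linked
        # E-step: distribute each target word over the source words
        # of its sentence pair, in proportion to the current t.
        for src, tgt in pairs:
            for e in tgt:
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalise the expected counts per source word.
        for (e, f) in t:
            t[(e, f)] = count[(e, f)] / total[f]
    return t

def best_translation(t, source_word):
    """Return the target word with the highest t(target | source_word)."""
    candidates = {e: p for (e, f), p in t.items() if f == source_word}
    return max(candidates, key=candidates.get)

table = train_ibm1(CORPUS)
for word in ["das", "haus", "buch", "ein"]:
    print(word, "->", best_translation(table, word))
```

On this toy corpus each German word ends up paired most strongly with its intuitive English counterpart; for instance, `das` pairs with `the`. A real system trains on a far larger corpus and builds phrase-level models on top of such word-level statistics.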
Among its features are:
* A beam search algorithm that quickly finds the highest probability translation within a number of choices
* Phrase-based translation of short text chunks
* Handles words with multiple factored representations to enable the integration of linguistic and other information (e.g., surface form, lemma and morphology, part-of-speech, word class)
* Decodes ambiguous forms of a source sentence, represented as a confusion network, to support integration with upstream tools such as speech recognizers
* Support for large language models (LMs) such as IRSTLM (an exact LM using memory-mapping) and RandLM (an inexact LM based on Bloom filters)
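The beam search and phrase-based translation features listed above can be sketched together as follows. This is a simplified illustration, not Moses's decoder: it translates strictly left to right (no reordering), uses no language model, and the phrase table, its probabilities, and the names `PHRASE_TABLE` and `beam_decode` are all hypothetical.

```python
import math

# Hypothetical toy phrase table: source phrase -> [(target phrase, probability)].
PHRASE_TABLE = {
    ("das",): [("the", 0.7), ("that", 0.3)],
    ("haus",): [("house", 0.8), ("home", 0.2)],
    ("das", "haus"): [("the house", 0.9)],
    ("ist",): [("is", 1.0)],
    ("klein",): [("small", 0.6), ("little", 0.4)],
}

def beam_decode(source, beam_size=3, max_phrase_len=2):
    """Monotone phrase-based beam search.

    Returns (translation, log_probability), or None if the source
    cannot be covered with phrases from the table.
    """
    # stacks[i] holds hypotheses that have translated the first i source words.
    stacks = [[] for _ in range(len(source) + 1)]
    stacks[0].append((0.0, []))
    for i in range(len(source)):
        # Prune: keep only the beam_size best hypotheses in this stack.
        stacks[i] = sorted(stacks[i], key=lambda h: h[0], reverse=True)[:beam_size]
        for score, output in stacks[i]:
            # Extend each hypothesis with every phrase starting at position i.
            for j in range(i + 1, min(i + max_phrase_len, len(source)) + 1):
                for target, prob in PHRASE_TABLE.get(tuple(source[i:j]), []):
                    stacks[j].append((score + math.log(prob), output + [target]))
    if not stacks[-1]:
        return None
    best_score, best_output = max(stacks[-1], key=lambda h: h[0])
    return " ".join(best_output), best_score

print(beam_decode("das haus ist klein".split()))
```

For the input `das haus ist klein` the sketch selects `the house is small`, preferring the two-word phrase `das haus` (probability 0.9) over translating the words separately (0.7 × 0.8). Because each stack is pruned to `beam_size` hypotheses, beam search is fast but in general only approximates the highest-probability translation.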
See also
* Apertium
* OpenLogos
* Comparison of machine translation applications
* Machine translation
Further reading
* Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, Evan Herbst (2007). "Moses: Open Source Toolkit for Statistical Machine Translation". Annual Meeting of the Association for Computational Linguistics (ACL), demonstration session, Prague, Czech Republic, June 2007.
External links
* IRSTLM Homepage
* RandLM Homepage