History Of Machine Translation

Machine translation is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one natural language to another. In the 1950s, machine translation became a reality in research, although references to the subject can be found as early as the 17th century. The Georgetown experiment, which involved successful fully automatic translation of more than sixty Russian sentences into English in 1954, was one of the earliest recorded projects. Researchers of the Georgetown experiment asserted their belief that machine translation would be a solved problem within a few years. In the Soviet Union, similar experiments were performed shortly after. Consequently, the success of the experiment ushered in an era of significant funding for machine translation research in the United States. The achieved progress was much slower than expected; in 1966, the ALPAC report found that ten years of research had not fulfilled the expectations of the Georgetown experiment and resulted in dramatically reduced funding. Interest grew in statistical models for machine translation, which became more common and also less expensive in the 1980s as available computational power increased. Although there exists no autonomous system of "fully automatic high quality translation of unrestricted text", there are many programs now available that are capable of providing useful output within strict constraints. Several of these programs are available online, such as Google Translate and the SYSTRAN system that powers AltaVista's Babel Fish (which was replaced by Microsoft Bing Translator in May 2012).


The beginning

The origins of machine translation can be traced back to the work of Al-Kindi, a 9th-century Arabic cryptographer who developed techniques for systemic language translation, including cryptanalysis, frequency analysis, and probability and statistics, which are used in modern machine translation. The idea of machine translation later appeared in the 17th century. In 1629, René Descartes proposed a universal language, with equivalent ideas in different tongues sharing one symbol. In the mid-1930s the first patents for "translating machines" were applied for by Georges Artsrouni, for an automatic bilingual dictionary using paper tape. The Russian Peter Troyanskii submitted a more detailed proposal that included both the bilingual dictionary and a method for dealing with grammatical roles between languages, based on the grammatical system of Esperanto. This system was separated into three stages: stage one consisted of a native-speaking editor in the source language organizing the words into their logical forms and marking their syntactic functions; stage two required the machine to "translate" these forms into the target language; and stage three required a native-speaking editor in the target language to normalize this output. Troyanskii's proposal remained unknown until the late 1950s, by which time computers were well known and widely used.
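The frequency-analysis technique attributed to Al-Kindi above can be illustrated in a few lines of Python. This is only a toy sketch with an invented sample text: count how often each letter occurs and compare the counts against the known frequency profile of a language, which is the statistical idea that cryptanalysis and, much later, statistical machine translation both build on.

```python
from collections import Counter

# Frequency analysis: tally letter occurrences in a text. In
# cryptanalysis the resulting profile is compared against the
# expected letter frequencies of the suspected plaintext language.
# The sample text here is made up purely for illustration.
text = "the quick brown fox jumps over the lazy dog the end"
letters = [c for c in text if c.isalpha()]   # ignore spaces/punctuation
freq = Counter(letters)
most_common = freq.most_common(3)            # the three commonest letters
print(most_common)
```

In English-language ciphertexts, a high-frequency symbol is a strong candidate for "e", the most common English letter; the same counting idea, applied to words and word pairs in large corpora, underlies the statistical models discussed later in this article.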


The early years

The first set of proposals for computer-based machine translation was presented in 1949 by Warren Weaver, a researcher at the Rockefeller Foundation, in his "Translation" memorandum. These proposals were based on information theory, successes in code breaking during the Second World War, and theories about the universal principles underlying natural language. A few years after Weaver submitted his proposals, research began in earnest at many universities in the United States. On 7 January 1954 the Georgetown–IBM experiment was held in New York at the head office of IBM. This was the first public demonstration of a machine translation system. The demonstration was widely reported in the newspapers and garnered public interest. The system itself, however, was no more than a "toy" system. It had only 250 words and translated 49 carefully selected Russian sentences into English, mainly in the field of chemistry. Nevertheless, it encouraged the idea that machine translation was imminent and stimulated the financing of the research, not only in the US but worldwide.

Early systems used large bilingual dictionaries and hand-coded rules for fixing the word order in the final output, an approach that was eventually considered too restrictive given linguistic developments at the time. For example, generative linguistics and transformational grammar were exploited to improve the quality of translations. During this period operational systems were installed. The United States Air Force used a system produced by IBM and Washington University in St. Louis, while the Atomic Energy Commission and Euratom, in Italy, used a system developed at Georgetown University. While the quality of the output was poor, it met many of the customers' needs, particularly in terms of speed.

At the end of the 1950s, Yehoshua Bar-Hillel was asked by the US government to look into machine translation, to assess the possibility of fully automatic high-quality translation by machines. Bar-Hillel described the problem of semantic ambiguity, or double meaning, as illustrated in the sentence "The box was in the pen." The word ''pen'' may have two meanings: the first, something used to write in ink with; the second, a container of some kind. To a human the meaning is obvious, but Bar-Hillel claimed that without a "universal encyclopedia" a machine would never be able to deal with this problem. At the time, this type of semantic ambiguity could only be solved by writing source texts for machine translation in a controlled language that uses a vocabulary in which each word has exactly one meaning.


The 1960s, the ALPAC report and the seventies

Research in the 1960s in both the
Soviet Union The Union of Soviet Socialist Republics. (USSR), commonly known as the Soviet Union, was a List of former transcontinental countries#Since 1700, transcontinental country that spanned much of Eurasia from 1922 until Dissolution of the Soviet ...
and the United States concentrated mainly on the Russian–English language pair. The objects of translation were chiefly scientific and technical documents, such as articles from
scientific journal In academic publishing, a scientific journal is a periodical publication designed to further the progress of science by disseminating new research findings to the scientific community. These journals serve as a platform for researchers, schola ...
s. The rough translations produced were sufficient to get a basic understanding of the articles. If an article discussed a subject deemed to be confidential, it was sent to a human translator for a complete translation; if not, it was discarded. A great blow came to machine-translation research in 1966 with the publication of the ALPAC report. The report was commissioned by the US government and delivered by ALPAC, the Automatic Language Processing Advisory Committee, a group of seven scientists convened by the US government in 1964. The US government was concerned that there was a lack of progress being made despite significant expenditure. The report concluded that machine translation was more expensive, less accurate and slower than human translation, and that despite the expenditures, machine translation was not likely to reach the quality of a human translator in the near future. The report recommended, however, that tools be developed to aid translators – automatic dictionaries, for example – and that some research in computational linguistics should continue to be supported. The publication of the report had a profound impact on research into machine translation in the United States, and to a lesser extent the
Soviet Union The Union of Soviet Socialist Republics. (USSR), commonly known as the Soviet Union, was a List of former transcontinental countries#Since 1700, transcontinental country that spanned much of Eurasia from 1922 until Dissolution of the Soviet ...
and United Kingdom. Research, at least in the US, was almost completely abandoned for over a decade. In Canada, France and Germany, however, research continued. In the US the main exceptions were the founders of SYSTRAN ( Peter Toma) and
Logos ''Logos'' (, ; ) is a term used in Western philosophy, psychology and rhetoric, as well as religion (notably Logos (Christianity), Christianity); among its connotations is that of a rationality, rational form of discourse that relies on inducti ...
(Bernard Scott), who established their companies in 1968 and 1970 respectively and served the US Department of Defense. In 1970, the SYSTRAN system was installed for the
United States Air Force The United States Air Force (USAF) is the Air force, air service branch of the United States Department of Defense. It is one of the six United States Armed Forces and one of the eight uniformed services of the United States. Tracing its ori ...
, and subsequently by the
Commission of the European Communities The European Communities (EC) were three international organizations that were governed by the same set of institutions. These were the European Coal and Steel Community (ECSC), the European Atomic Energy Community (EAEC or Euratom), and the ...
in 1976. The METEO System, developed at the
Université de Montréal The Université de Montréal (; UdeM; ) is a French-language public research university in Montreal, Quebec, Canada. The university's main campus is located in the Côte-des-Neiges neighborhood of Côte-des-Neiges–Notre-Dame-de-Grâce on M ...
, was installed in Canada in 1977 to translate weather forecasts from English to French, and was translating close to 80,000 words per day or 30 million words per year until it was replaced by a competitor's system on 30 September 2001. While research in the 1960s concentrated on limited language pairs and input, demand in the 1970s was for low-cost systems that could translate a range of technical and commercial documents. This demand was spurred by the increase of
globalisation Globalization is the process of increasing interdependence and integration among the economies, markets, societies, and cultures of different countries worldwide. This is made possible by the reduction of barriers to international trade, th ...
and the demand for translation in Canada, Europe, and Japan.


The 1980s and early 1990s

By the 1980s, both the diversity and the number of installed systems for machine translation had increased. A number of systems relying on mainframe technology were in use, such as SYSTRAN, Logos, Ariane-G5, and METAL. As a result of the improved availability of microcomputers, there was a market for lower-end machine translation systems. Many companies took advantage of this in Europe, Japan, and the USA. Systems were also brought onto the market in China, Eastern Europe, Korea, and the Soviet Union.

During the 1980s there was a great deal of MT activity in Japan especially. With the fifth-generation computer, Japan intended to leap over its competition in computer hardware and software, and one project that many large Japanese electronics firms found themselves involved in was creating software for translating into and from English (Fujitsu, Toshiba, NTT, Brother, Catena, Matsushita, Mitsubishi, Sharp, Sanyo, Hitachi, NEC, Panasonic, Kodensha, Nova, Oki).

Research during the 1980s typically relied on translation through some variety of intermediary linguistic representation involving morphological, syntactic, and semantic analysis. At the end of the 1980s, there was a large surge in novel methods for machine translation. One system, developed at IBM, was based on statistical methods. Makoto Nagao and his group used methods based on large numbers of translation examples, a technique that is now termed example-based machine translation. A defining feature of both of these approaches was the neglect of syntactic and semantic rules and reliance instead on the manipulation of large text corpora.

During the 1990s, encouraged by successes in speech recognition and speech synthesis, research began into speech translation with the development of the German Verbmobil project. The Forward Area Language Converter (FALCon) system, a machine translation technology designed by the Army Research Laboratory, was fielded in 1997 to translate documents for soldiers in Bosnia.

There was significant growth in the use of machine translation as a result of the advent of low-cost and more powerful computers. It was in the early 1990s that machine translation began to make the transition away from large mainframe computers toward personal computers and workstations. Two companies that led the PC market for a time were Globalink and MicroTac, which merged in December 1994 in the corporate interest of both. Intergraph and Systran also began to offer PC versions around this time. Sites also became available on the internet, such as AltaVista's Babel Fish (using Systran technology) and Google Language Tools (also initially using Systran technology exclusively).
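The corpus-driven approaches described above can be illustrated with a toy version of IBM's statistical word-alignment idea (the simplest member of that family is often called IBM Model 1). The sketch below is not the actual IBM system: the two German–English sentence pairs are invented, and a real system trains on millions of pairs. It shows how expectation–maximization (EM) learns word translation probabilities purely from co-occurrence, with no syntactic or semantic rules.

```python
from collections import defaultdict

# Toy IBM-Model-1-style training: learn t(e|f), the probability that
# foreign word f translates to English word e, by EM over a tiny
# (hypothetical) parallel corpus.
corpus = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
]
f_vocab = {f for fs, _ in corpus for f in fs}
e_vocab = {e for _, es in corpus for e in es}

# Uniform initialization of translation probabilities t(e|f).
t = {(e, f): 1.0 / len(e_vocab) for e in e_vocab for f in f_vocab}

for _ in range(15):                          # EM iterations
    count = defaultdict(float)               # expected counts c(e, f)
    total = defaultdict(float)               # normalizers c(f)
    for fs, es in corpus:
        for e in es:
            z = sum(t[(e, f)] for f in fs)   # E-step: soft-align e to each f
            for f in fs:
                p = t[(e, f)] / z
                count[(e, f)] += p
                total[f] += p
    for (e, f), c in count.items():          # M-step: renormalize counts
        t[(e, f)] = c / total[f]

# "das" co-occurs with "the" in both pairs, so EM drives t(the|das)
# toward 1 while t(house|haus) and t(book|buch) also sharpen.
print(round(t[("the", "das")], 2))
```

The key point for the history above is that nothing linguistic was hand-coded: the alignment probabilities emerge entirely from counting co-occurrences in the corpus, which is exactly why these methods became attractive as corpora and computing power grew.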


2000s

The field of machine translation saw major changes in the 2000s. A large amount of research was done into statistical machine translation and example-based machine translation. In the area of speech translation, research focused on moving from domain-limited systems to domain-unlimited translation systems. In different research projects in Europe (like TC-STAR) and in the United States (STR-DUST and the DARPA Global Autonomous Language Exploitation program), solutions for automatically translating parliamentary speeches and broadcast news were developed. In these scenarios the domain of the content was no longer limited to any special area; rather, the speeches to be translated covered a variety of topics. The French–German project Quaero investigated the possibility of making use of machine translations for a multilingual internet. The project sought to translate not only webpages, but also videos and audio files on the internet.


2010s

The past decade witnessed neural machine translation (NMT) methods replacing statistical machine translation. The term neural machine translation was coined by Bahdanau et al. and Sutskever et al., who also published the first research on the topic in 2014. Neural networks needed only a fraction of the memory required by statistical models, and whole sentences could be modeled in an integrated manner. The first large-scale NMT system was launched by Baidu in 2015, followed by Google Neural Machine Translation (GNMT) in 2016. This was followed by other translation services like DeepL Translator and the adoption of NMT technology in older translation services like Microsoft Translator.

NMT systems use a single end-to-end network architecture known as sequence-to-sequence (seq2seq), which pairs two recurrent neural networks (RNNs): an encoder RNN, which compresses the source sentence into an encoding vector, and a decoder RNN, which generates the target sentence from that encoding vector. Further advances in attention layers, transformers, and back-propagation techniques have made NMT flexible, and it has been adopted in most machine translation, summarization, and chatbot technologies.
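The encoder–decoder data flow described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a working translator: the vocabulary sizes, dimensions, and token ids are invented, the weights are random rather than trained, and a real system would add attention and learn the parameters by back-propagation. The point is only to show the two-RNN structure of seq2seq.

```python
import numpy as np

# Minimal seq2seq sketch: an encoder RNN compresses the source token
# sequence into one context vector; a decoder RNN then emits target
# tokens conditioned on that vector. Weights are random (untrained),
# so the "translation" is meaningless -- only the data flow matters.
rng = np.random.default_rng(0)
SRC_VOCAB, TGT_VOCAB, EMB, HID = 10, 12, 8, 16

E_src = rng.normal(size=(SRC_VOCAB, EMB))   # source embeddings
E_tgt = rng.normal(size=(TGT_VOCAB, EMB))   # target embeddings
W_enc = rng.normal(size=(EMB + HID, HID))   # encoder recurrence weights
W_dec = rng.normal(size=(EMB + HID, HID))   # decoder recurrence weights
W_out = rng.normal(size=(HID, TGT_VOCAB))   # hidden state -> target logits

def encode(src_ids):
    """Run the encoder RNN; return its final hidden state (the context)."""
    h = np.zeros(HID)
    for tok in src_ids:
        h = np.tanh(np.concatenate([E_src[tok], h]) @ W_enc)
    return h

def decode(context, bos_id=0, eos_id=1, max_len=5):
    """Greedy decoding: start from the context, emit one token per step."""
    h, tok, out = context, bos_id, []
    for _ in range(max_len):
        h = np.tanh(np.concatenate([E_tgt[tok], h]) @ W_dec)
        tok = int(np.argmax(h @ W_out))      # greedy choice of next token
        if tok == eos_id:
            break
        out.append(tok)
    return out

context = encode([3, 7, 2])   # a toy "source sentence" of token ids
target = decode(context)
print(context.shape, target)
```

Training would adjust all five weight matrices so that `decode(encode(source))` reproduces reference translations; attention, mentioned above, replaces the single fixed context vector with a weighted view over all encoder states.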


See also

* History of natural language processing
* ALPAC report
* Computer-assisted translation
* Lighthill report
* Machine translation

