Arabic is one of the major languages that have received attention from machine translation (MT) researchers since the very early days of MT, and specifically in the U.S. The language has always been considered "due to its morphological, syntactic, phonetic and phonological properties to be one of the most difficult languages for written and spoken language processing."
Arabic "differs tremendously in terms of its characters, morphology and diacritization from other languages."
Accordingly, researchers cannot always import solutions from other languages, and Arabic machine translation still needs further work, mainly in the area of semantic representation systems, which are essential for achieving high-quality translation.
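The role of diacritization mentioned above can be illustrated with a minimal sketch. It is not taken from any particular MT system; the helper name `strip_diacritics` and the two sample words are assumptions chosen for illustration. It uses only Python's standard `unicodedata` module to show how two differently vocalized words collapse to the same consonant string once the short-vowel marks are omitted, as they usually are in everyday written Arabic.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop nonspacing combining marks (Unicode category Mn).

    The Arabic short-vowel signs such as fatha and damma fall in this category.
    Hypothetical helper, written for this illustration only.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Two vocalized words, written with escapes so the code points are explicit.
KATABA = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kaf+fatha, ta+fatha, ba+fatha: "he wrote"
KUTUB = "\u0643\u064F\u062A\u064F\u0628"          # kaf+damma, ta+damma, ba: "books"

# Without diacritics both words are the same three consonants (kaf, ta, ba).
same_skeleton = strip_diacritics(KATABA) == strip_diacritics(KUTUB)
print(same_skeleton)
```

A system that receives only the undiacritized form has to recover the intended reading from context, which is one reason techniques built for languages without this property may not transfer directly.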
Approaches for the study of machine processing of Arabic
Particularistic approaches
Particularistic approaches describe the linguistic features of Arabic and use them for a local processing approach specific to the internal linguistic system of Arabic. They are concerned with the morphological and semantic aspects of Arabic.
Sakhr is one of the Arabic-speaking groups systematically developing machine processing of Arabic.
Universalist approaches
Universalist approaches use methods and systems that have proved useful in other languages, such as English or French, making adaptations where necessary. The focus here is on the syntactic aspects of the linguistic system in general. This approach is followed by most of the companies producing software applications for Arabic.
External links
* Salem, Yasser (April 2009). "MSc Thesis: A generic framework for Arabic to English machine translation of simplex sentences using the Role and Reference Grammar linguistic model". Academia.edu. https://www.academia.edu/530244. Accessed 24 March 2014.