Ḍamma

The Arabic script has numerous diacritics, which include consonant pointing known as ''i‘jām'', and supplementary diacritics known as ''tashkīl''. The latter include the vowel marks termed ''ḥarakāt'' (singular ''ḥarakah''). The Arabic script is a modified abjad, where all letters are consonants, leaving it up to the reader to fill in the vowel sounds. Short consonants and long vowels are represented by letters, but short vowels and consonant length are not generally indicated in writing. ''Tashkīl'' is optional to represent missing vowels and consonant length. Modern Arabic is always written with the ''i‘jām'' (consonant pointing), but only religious texts, children's books and works for learners are written with the full ''tashkīl'' (vowel guides and consonant length). It is, however, not uncommon for authors to add diacritics to a word or letter when the grammatical case or the meaning would otherwise be ambiguous. In addition, classical works and historical documents presented to the general public are often rendered with the full ''tashkīl'', to compensate for the gap in understanding resulting from stylistic changes over the centuries. Moreover, ''tashkīl'' can change the meaning of an entire word: for example, دِين means 'religion' and دَين means 'debt'. The two words have the same letters, and only the ''tashkīl'' distinguishes them. In sentences without ''tashkīl'', readers infer the intended word from context.
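The 'religion' versus 'debt' pair can be illustrated with a small sketch. It assumes the text is stored as Unicode, where the vowel marks are nonspacing combining characters (general category `Mn`); the helper name `strip_tashkil` is hypothetical and chosen only for this illustration.

```python
import unicodedata


def strip_tashkil(text: str) -> str:
    """Drop every nonspacing combining mark (Unicode category Mn).

    In Arabic text this removes the tashkil (vowel marks, shaddah, sukun)
    while keeping the letters. The i'jam dots belong to the letters
    themselves in Unicode, so they are not affected.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# dal + kasrah, ya', nun: 'religion'
RELIGION = "\u062F\u0650\u064A\u0646"
# dal + fathah, ya', nun: 'debt'
DEBT = "\u062F\u064E\u064A\u0646"

if __name__ == "__main__":
    # The vocalised spellings differ ...
    print(RELIGION != DEBT)
    # ... but the unvocalised spellings are the same string.
    print(strip_tashkil(RELIGION) == strip_tashkil(DEBT))
```

Once the marks are removed, only context can tell the two words apart, which is the situation a reader of ordinary unvocalised text is in.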


''Tashkīl''

The literal meaning of ''tashkīl'' is 'formation'. As normal Arabic text does not provide enough information about the correct pronunciation, the main purpose of ''tashkīl'' (and ''ḥarakāt'') is to provide a phonetic guide or aid, i.e. to show the correct pronunciation to children who are learning to read and to foreign learners. The bulk of Arabic script is written without ''ḥarakāt'' (or short vowels). However, they are commonly used in texts that demand strict adherence to exact pronunciation. This is true, primarily, of the Qur'an and poetry. It is also quite common to add ''ḥarakāt'' to hadiths and the Bible. Another use is in children's literature. Moreover, ''ḥarakāt'' are used in ordinary texts in individual words when an ambiguity of pronunciation cannot easily be resolved from context alone. Arabic dictionaries with vowel marks provide information about the correct pronunciation to both native and foreign Arabic speakers. In art and calligraphy, ''ḥarakāt'' might be used simply because their writing is considered aesthetically pleasing. An example of fully ''vocalised'' (''vowelised'' or ''vowelled'') Arabic, from the ''Bismillah'': بِسْمِ ٱللَّٰهِ ٱلرَّحْمَٰنِ ٱلرَّحِيمِ ("In the name of God, the Most Gracious, the Most Merciful"). Some Arabic textbooks for foreigners now use ''ḥarakāt'' as a phonetic guide to make learning to read Arabic easier. The other method used in textbooks is phonetic romanisation of unvocalised texts. Fully vocalised Arabic texts (i.e. Arabic texts with ''ḥarakāt''/diacritics) are sought after by learners of Arabic. Some online bilingual dictionaries also provide ''ḥarakāt'' as a phonetic guide, much as English dictionaries provide a transcription.


Ḥarakāt (short vowel marks)

The ''ḥarakāt'', which literally means 'motions', are the short vowel marks. There is some ambiguity as to which ''tashkīl'' are also ''ḥarakāt''; the ''tanwīn'', for example, are markers for both vowels and consonants.


Fatḥah

The ''fatḥah'' is a small diagonal line placed ''above'' a letter, and represents a short /a/ (like the /a/ sound in the English word "cat"). The word ''fatḥah'' itself means ''opening'' and refers to the opening of the mouth when producing an /a/. For example, with ''dāl'' (henceforth, the base consonant in the following examples): ⟨دَ⟩ /da/. When a ''fatḥah'' is placed before a plain letter ⟨ا⟩ (''alif'') (i.e. one having no hamza or vowel of its own), it represents a long /aː/ (close to the sound of "a" in the English word "dad", with an open front vowel /æː/, not back /ɑː/ as in "father"). For example: ⟨دَا⟩ /daː/. The ''fatḥah'' is not usually written in such cases. When a ''fatḥah'' is placed before the letter ⟨ي⟩ (''yā’''), it creates an /aj/ (as in "lie"); and when placed before the letter ⟨و⟩ (''wāw''), it creates an /aw/ (as in "cow"). Although a ''fatḥah'' paired with a plain letter creates an open front vowel (/a/), often realized as near-open (/æ/), the standard also allows for variations, especially under certain surrounding conditions. Usually, the more central (/ä/) or back (/ɑ/) pronunciation occurs when the word features a nearby back consonant, such as the emphatics, as well as ''qāf'' or ''rā’''. Other vowels also take on a similar "back" quality in the presence of such consonants, though less drastically than the ''fatḥah'' (Karin C. Ryding, ''A Reference Grammar of Modern Standard Arabic'', Cambridge University Press, 2005, pp. 25-34, specifically "Chapter 2, Section 4: Vowels"). ''Fatḥah''s are encoded U+0618 ARABIC SMALL FATHA, U+064E ARABIC FATHA, U+FE76 ARABIC FATHA ISOLATED FORM, or U+FE77 ARABIC FATHA MEDIAL FORM.


Kasrah

A similar diagonal line ''below'' a letter is called a ''kasrah'' and designates a short /i/ (as in "me", "be") and its allophones [i, ɪ, e, e̞, ɛ] (as in "Tim", "sit"). For example: ⟨دِ⟩ /di/. When a ''kasrah'' is placed before a plain letter ⟨ي⟩ (''yā’''), it represents a long /iː/ (as in the English word "steed"). For example: ⟨دِي⟩ /diː/. The ''kasrah'' is usually not written in such cases, but if ''yā’'' is pronounced as a diphthong /aj/, a ''fatḥah'' should be written on the preceding letter to avoid mispronunciation. The word ''kasrah'' means 'breaking'. ''Kasrah''s are encoded U+061A ARABIC SMALL KASRA, U+0650 ARABIC KASRA, U+FE7A ARABIC KASRA ISOLATED FORM, or U+FE7B ARABIC KASRA MEDIAL FORM.


Ḍammah

The ''ḍammah'' is a small curl-like diacritic placed above a letter to represent a short /u/ (as in "duke", a shorter "you") and its allophones [u, ʊ, o, o̞, ɔ] (as in "put" or "bull"). For example: ⟨دُ⟩ /du/. When a ''ḍammah'' is placed before a plain letter ⟨و⟩ (''wāw''), it represents a long /uː/ (like the 'oo' sound in the English word "swoop"). For example: ⟨دُو⟩ /duː/. The ''ḍammah'' is usually not written in such cases, but if ''wāw'' is pronounced as a diphthong /aw/, a ''fatḥah'' should be written on the preceding consonant to avoid mispronunciation. The word ''ḍammah'' (ضَمَّة) in this context means ''rounding'', since it is the only rounded vowel in the vowel inventory of Arabic. ''Ḍammah''s are encoded U+0619 ARABIC SMALL DAMMA, U+064F ARABIC DAMMA, U+FE78 ARABIC DAMMA ISOLATED FORM, or U+FE79 ARABIC DAMMA MEDIAL FORM.
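As a rough illustration of how the three short vowel marks attach to a base consonant in Unicode text, the sketch below builds the syllables /da/, /du/ and /di/ on ''dāl''. It uses the ordinary combining forms from the basic Arabic block; the dictionary `HARAKAT` and the helpers `vocalise` and `describe` are names invented for this example.

```python
import unicodedata

# The ordinary combining forms of the three short vowel marks.
HARAKAT = {
    "fathah": "\u064E",  # short /a/, written above the letter
    "dammah": "\u064F",  # short /u/, written above the letter
    "kasrah": "\u0650",  # short /i/, written below the letter
}

DAL = "\u062F"  # base consonant used in the examples


def vocalise(consonant: str, haraka: str) -> str:
    """Return the consonant followed by the named combining vowel mark."""
    return consonant + HARAKAT[haraka]


def describe(mark: str) -> str:
    """Format a single character as 'U+XXXX UNICODE NAME'."""
    return f"U+{ord(mark):04X} {unicodedata.name(mark)}"


if __name__ == "__main__":
    for name, mark in HARAKAT.items():
        print(name, describe(mark), vocalise(DAL, name))
```

In stored text the mark always follows the letter it belongs to, regardless of whether it is drawn above or below it.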


Alif Khanjarīyah

The superscript (or dagger) ''alif'' (''alif khanjarīyah'') is written as a short vertical stroke on top of a letter. It indicates a long /aː/ sound for which ''alif'' is normally not written. For example: ⟨هَٰذَا⟩ (''hādhā'') or ⟨رَحْمَٰن⟩ (''raḥmān''). The dagger ''alif'' occurs in only a few words, but they include some common ones; it is seldom written, however, even in fully vocalised texts. Most keyboards do not have a dagger ''alif''. The word Allah (''Allāh'', 'God') is usually produced automatically by entering ''alif lām lām hā’''. The word consists of ''alif'' + a ligature of doubled ''lām'' with a ''shaddah'' and a dagger ''alif'' above the ''lām'', followed by ''hā’''.


Maddah

The ''maddah'' is a tilde-shaped diacritic, which can only appear on top of an ''alif'' (آ) and indicates a glottal stop /ʔ/ followed by a long /aː/. In theory, the same sequence /ʔaː/ could also be represented by two ''alif''s, as in *⟨أا⟩, where a hamza above the first ''alif'' represents the /ʔ/ while the second ''alif'' represents the /aː/. However, consecutive ''alif''s are never used in Arabic orthography. Instead, this sequence must always be written as a single ''alif'' with a ''maddah'' above it, the combination known as an ''alif maddah''. For example: ⟨قُرْآن⟩ /qurˈʔaːn/. In Quranic writings, a ''maddah'' is placed on any other letter to denote the name of the letter, though some letters may take on a dagger ''alif''. For example: ''lām''-''mīm''-''ṣād'' or ''yā’''-''sīn''.
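The idea that ''alif'' plus ''maddah'' forms one unit is mirrored in Unicode, where the precomposed letter U+0622 is canonically equivalent to ''alif'' (U+0627) followed by the combining maddah (U+0653). The following sketch, using only Python's standard `unicodedata` module, shows the two forms converting into each other; the constant names are chosen for this illustration.

```python
import unicodedata

ALIF_MADDAH = "\u0622"       # precomposed alif with maddah above
ALIF = "\u0627"              # plain alif
COMBINING_MADDAH = "\u0653"  # combining maddah above

# NFD splits the precomposed letter into base letter + combining mark.
decomposed = unicodedata.normalize("NFD", ALIF_MADDAH)

# NFC puts the two code points back together into the single letter.
recomposed = unicodedata.normalize("NFC", ALIF + COMBINING_MADDAH)

if __name__ == "__main__":
    print([f"U+{ord(ch):04X}" for ch in decomposed])
    print(recomposed == ALIF_MADDAH)
```

Software that compares or searches Arabic text usually normalises both sides first, so that the two spellings of the same letter match.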


Alif waṣlah

The ''waṣlah'', ''alif waṣlah'' or ''hamzat waṣl'' looks like the head of a small ''ṣād'' on top of an ''alif'' ⟨ٱ⟩ (it is also indicated by an ''alif'' ⟨ا⟩ without a ''hamzah''). It means that the ''alif'' is not pronounced when its word does not begin a sentence. For example: ''bismi'', but ''imshū'' (not ''mshū''). This is because in Arabic, the first consonant in a word must always be followed by a vowel sound: if the second letter after the ''waṣlah'' has a ''kasrah'', the ''alif waṣlah'' makes the sound /i/; when the second letter after it has a ''ḍammah'', it makes the sound /u/. It occurs only at the beginning of words, but it can occur after prepositions and the definite article. It is commonly found in imperative verbs, the perfective aspect of verb stems VII to X and their verbal nouns (''maṣdar''). The ''alif'' of the definite article is considered a ''waṣlah''. It occurs in phrases and sentences (connected speech, not isolated/dictionary forms):
* To replace the elided hamza whose alif-seat has assimilated to the previous vowel. For example: ''fi l-Yaman'' 'in Yemen'.
* In hamza-initial imperative forms following a vowel, especially following the conjunction ''wa-'' 'and'. For example: ''qum wa-shrab-i l-mā’'' 'rise and then drink the water'.
Like the superscript ''alif'', it is not written in fully vocalized scripts, except for sacred texts, like the Quran and the Arabized Bible.


Sukūn

The ''sukūn'' is a circle-shaped diacritic placed above a letter (ـْ). It indicates that the letter to which it is attached is not followed by a vowel, i.e., zero-vowel. It is a necessary symbol for writing consonant-vowel-consonant syllables, which are very common in Arabic. For example: ⟨دَدْ⟩ (''dad''). The ''sukūn'' may also be used to help represent a diphthong. A ''fatḥah'' followed by the letter ⟨ي⟩ (''yā’'') with a ''sukūn'' over it (ـَيْ) indicates the diphthong ''ay'' (IPA /aj/). A ''fatḥah'' followed by the letter ⟨و⟩ (''wāw'') with a ''sukūn'' (ـَوْ) indicates /aw/. ''Sukūn''s are encoded U+0652 ARABIC SUKUN, U+FE7E ARABIC SUKUN ISOLATED FORM, or U+FE7F ARABIC SUKUN MEDIAL FORM.
The ''sukūn'' may also have an alternative form, the small high dotless head of ''khā’'' (U+06E1), particularly in some Qurans. Other shapes may exist as well (for example, like a small comma above ⟨ʼ⟩ or like a circumflex ⟨ˆ⟩ in ''nasta‘līq'').


Tanwīn

The three vowel diacritics may be doubled at the end of a word to indicate that the vowel is followed by the consonant ''n''. They may or may not be considered ''ḥarakāt'' and are known as ''tanwīn'', or nunation. The three signs indicate the endings ''-un'', ''-in'' and ''-an''. These endings are used as non-pausal grammatical indefinite case endings in Literary Arabic or classical Arabic (triptotes only). In a vocalised text, they may be written even if they are not pronounced (see pausa). See ''i‘rāb'' for more details. In many spoken Arabic dialects, the endings are absent. Many Arabic textbooks introduce standard Arabic without these endings. The grammatical endings may not be written in some vocalized Arabic texts, as knowledge of ''i‘rāb'' varies from country to country, and there is a trend towards simplifying Arabic grammar. The sign ⟨ـً⟩ is most commonly written in combination with ''alif'' ⟨ـًا⟩, ''tā’ marbūṭah'' ⟨ةً⟩, ''alif hamzah'' ⟨أً⟩, or stand-alone ''hamzah'' ⟨ءً⟩. The ''alif'' should always be written (except for words ending in ''tā’ marbūṭah'' or diptotes) even if ''-an'' is not. Grammatical cases and ''tanwīn'' endings in indefinite triptote forms:
* ''-un'': nominative case;
* ''-an'': accusative case, also serves as an adverbial marker;
* ''-in'': genitive case.
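The mapping between the three ''tanwīn'' signs and the grammatical cases listed above can be written as a small lookup. This is only a sketch for fully vocalised indefinite triptotes: the function name `indefinite_case` is hypothetical, and the check of the last two code points is an assumption made here to cover the common spelling in which the ''-an'' sign is followed by a silent ''alif''.

```python
# Tanwin signs and the endings and cases they mark.
TANWIN_ENDINGS = {
    "\u064C": ("-un", "nominative"),  # ARABIC DAMMATAN
    "\u064B": ("-an", "accusative"),  # ARABIC FATHATAN
    "\u064D": ("-in", "genitive"),    # ARABIC KASRATAN
}


def indefinite_case(word: str):
    """Return (ending, case) if the word ends in a tanwin sign, else None.

    Only the last two code points are inspected, so a trailing silent alif
    after the -an sign is tolerated.
    """
    for ch in reversed(word[-2:]):
        if ch in TANWIN_ENDINGS:
            return TANWIN_ENDINGS[ch]
    return None


# 'a book' in the three cases (kitabun, kitaban, kitabin).
KITABUN = "\u0643\u0650\u062A\u064E\u0627\u0628\u064C"
KITABAN = "\u0643\u0650\u062A\u064E\u0627\u0628\u064B\u0627"
KITABIN = "\u0643\u0650\u062A\u064E\u0627\u0628\u064D"

if __name__ == "__main__":
    for word in (KITABUN, KITABAN, KITABIN):
        print(word, indefinite_case(word))
```

An unvocalised word returns `None`, which reflects the point made above that the endings are often left unwritten.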


Shaddah

The shadda or ''shaddah'', or tashdid (''tashdīd''), is a diacritic shaped like a small written Latin "w". It is used to indicate gemination (consonant doubling or extra length), which is phonemic in Arabic. It is written above the consonant which is to be doubled. It is the only ''ḥarakah'' that is commonly used in ordinary spelling to avoid ambiguity. For example: ⟨دّ⟩ /dd/; ''madrasah'' ('school') vs. ''mudarrisah'' ('teacher', female). Note that when the doubled letter bears a vowel, it is the ''shaddah'' that the vowel is attached to, not the letter itself: ⟨دَّ⟩ /dda/, ⟨دِّ⟩ /ddi/. ''Shaddah''s are encoded U+0651 ARABIC SHADDA, U+FE7C ARABIC SHADDA ISOLATED FORM, or U+FE7D ARABIC SHADDA MEDIAL FORM.
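On the page the vowel is drawn on the ''shaddah'', but in Unicode text both marks are separate combining characters that follow the consonant. The sketch below, which relies only on Python's standard `unicodedata` module, shows one consequence: canonical normalisation sorts the marks by combining class, so the stored order can differ from the order in which they were typed. The variable names are illustrative.

```python
import unicodedata

DAL = "\u062F"
SHADDAH = "\u0651"
FATHAH = "\u064E"

# Typed order: consonant, shaddah, then the vowel that sits on it (/dda/).
typed = DAL + SHADDAH + FATHAH

# Canonical ordering places marks with a lower combining class first.
normalised = unicodedata.normalize("NFC", typed)


def is_geminated(syllable: str) -> bool:
    """True if the syllable carries a shaddah, whatever the mark order."""
    return SHADDAH in syllable


if __name__ == "__main__":
    print(unicodedata.combining(FATHAH), unicodedata.combining(SHADDAH))
    print([f"U+{ord(ch):04X}" for ch in normalised])
    print(is_geminated(typed), is_geminated(normalised))
```

Both orders render the same way, so code that looks for a ''shaddah'' should test for its presence rather than for a fixed position after the letter.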


I‘jām

The ''i‘jām'' (sometimes also called ''nuqaṭ'') are the diacritic points that distinguish various consonants that have the same form (''rasm''), such as ⟨ص⟩ and ⟨ض⟩. Typically ''i‘jām'' are not considered diacritics but part of the letter. Early manuscripts of the Quran did not use diacritics either for vowels or to distinguish the different values of the ''rasm''. Vowel pointing was introduced first, as a red dot placed above, below, or beside the ''rasm'', and later consonant pointing was introduced, as thin, short black single or multiple dashes placed above or below the ''rasm''. These ''i‘jām'' became black dots about the same time as the ''ḥarakāt'' became small black letters or strokes. Typically, Egyptians do not use dots under final ''yā’'' (ي), which then looks exactly like ''alif maqṣūrah'' (ى) in handwriting and in print. This practice is also used in copies of the ''muṣḥaf'' (Qurʾān) scribed by ‘Uthmān Ṭāhā. The same unification of ''yā’'' and ''alif maqṣūrah'' has happened in Persian, resulting in what the Unicode Standard calls "ARABIC LETTER FARSI YEH", which looks exactly the same as ''yā’'' in initial and medial forms, but exactly the same as ''alif maqṣūrah'' in final and isolated forms. At the time when the ''i‘jām'' was optional, unpointed letters were ambiguous. To clarify that a letter would lack ''i‘jām'' in pointed text, the letter could be marked with a small v- or seagull-shaped diacritic above, a superscript semicircle (crescent), a subscript dot (except in the case of ''ḥā’''; three dots were used with ''sīn''), or a subscript miniature of the letter itself. A superscript stroke known as ''jarrah'', resembling a long ''fatḥah'', was used for a contracted (assimilated) ''sīn''. All of these marks were used to indicate that the letter in question was truly ''sīn'' and not ''shīn''. These signs, collectively known as ''‘alāmātu-l-ihmāl'', are still occasionally used in modern Arabic calligraphy, either for their original purpose (i.e. marking letters without ''i‘jām''), or often as purely decorative space-fillers. The small mark above the ''kāf'' in its final and isolated forms was originally an ''‘alāmatu-l-ihmāl'' that became a permanent part of the letter. Previously this sign could also appear above the medial form of ''kāf'', when that letter was written without the stroke on its ascender. When ''kāf'' was written without that stroke, it could be mistaken for ''lām'', so ''kāf'' was distinguished with a superscript ''kāf'' or a small superscript ''hamza'' (''nabrah''), and ''lām'' with a superscript ''l-a-m'' (''lām-alif-mīm'').


''Hamza''

Although not always considered a letter of the alphabet, the hamza (''hamzah'', glottal stop) often stands as a separate letter in writing, is written in unpointed texts and is not considered a ''tashkīl''. It may appear as a letter by itself or as a diacritic over or under an ''alif'', ''wāw'', or ''yā’''. Which letter is used to support the ''hamzah'' depends on the quality of the adjacent vowels and its location in the word:
* If the glottal stop occurs at the beginning of the word:
** It is indicated by hamza on an ''alif'': above if the following vowel is /a/ or /u/, and below if it is /i/.
*** In order to clarify a starting /a/ or /u/, a respective ''fatḥah'' or ''ḍammah'' can be used.
* If the glottal stop occurs in the middle of the word, the following order of priority is used:
** First: if ''hamza'' is preceded or followed by /i/, ''hamza'' sits on a tooth (a dotless ''yā’''), <ئ>; ex: <عَائِلَة>
** Second: if ''hamza'' is preceded or followed by /u/, ''hamza'' sits on ''wāw'', <ؤ>
** Third: otherwise ''hamza'' sits on ''alif'', <أ>
* If the glottal stop occurs at the end of the word (ignoring any grammatical suffixes):
** First: if ''hamza'' follows a short vowel, it is written above ''alif'', ''wāw'', or ''yā’'', the same as for a medial case;
** Second: if it follows a long vowel, diphthong or consonant, ''hamza'' is written on the line, <ء>
* Exception: two ''alif''s in succession are never allowed: /ʔaː/ is written with ''alif maddah'' <آ>, and /aːʔ/ is written with a free ''hamzah'' on the line <اء>.
Consider the following words: ⟨أَخ⟩ /ʔax/ ("brother"), ⟨إِسْمَاعِيل⟩ /ʔismaːʕiːl/ ("Ismael"), ⟨أُمّ⟩ /ʔumm/ ("mother"). All three of the above words "begin" with a vowel opening the syllable, and in each case, ''alif'' is used to designate the initial glottal stop (the ''actual'' beginning). But if we consider ''middle'' syllables "beginning" with a vowel: ⟨نَشْأَة⟩ /naʃʔa/ ("origin"), ⟨أَفْئِدَة⟩ /ʔafʔida/ ("hearts"; notice the /ʔi/ syllable; singular ⟨فُؤَاد⟩ /fuʔaːd/), ⟨رُؤُوس⟩ /ruʔuːs/ ("heads", singular ⟨رَأْس⟩ /raʔs/), the situation is different, as noted above. See the comprehensive article on ''hamzah'' for more details.
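The three-step priority for a hamza in the middle of a word can be sketched as a short function. It covers only the simplified rule stated above (/i/ beats /u/, which beats everything else) and does not attempt the word-final cases or the two-''alif'' exception; the function name `medial_hamza_seat` and the way vowels are passed in (as the plain strings "a", "i", "u", or "" for no vowel) are assumptions made for this example.

```python
YA_SEAT = "\u0626"    # hamza on a tooth (yeh with hamza above)
WAW_SEAT = "\u0624"   # hamza on waw
ALIF_SEAT = "\u0623"  # hamza on alif


def medial_hamza_seat(prev_vowel: str, next_vowel: str) -> str:
    """Pick the seat of a word-medial hamza from the adjacent vowel qualities.

    Each argument is "a", "i", "u", or "" when there is no vowel on that
    side (for example after a consonant with sukun). Vowel length is ignored.
    """
    neighbours = {prev_vowel, next_vowel}
    if "i" in neighbours:   # first priority: any adjacent /i/
        return YA_SEAT
    if "u" in neighbours:   # second priority: any adjacent /u/
        return WAW_SEAT
    return ALIF_SEAT        # otherwise the seat is alif


if __name__ == "__main__":
    print(medial_hamza_seat("a", "i"))  # as in the word for 'family'
    print(medial_hamza_seat("u", "a"))  # as in the singular of 'hearts'
    print(medial_hamza_seat("", "a"))   # as in the word for 'origin'
```

Treating the two neighbouring vowels as an unordered pair keeps the function close to the wording "preceded or followed by" in the rule.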


Diacritics not used in Modern Standard Arabic

Diacritics not used in Modern Standard Arabic but in other languages that use the Arabic script, and sometimes to write Arabic dialects, include (the list is not exhaustive):


Rohingya tone markers

Historically, the Arabic script has been adopted and used by many tonal languages; examples include Xiao'erjing for Mandarin Chinese as well as the Ajami script adopted for writing various languages of Western Africa. However, the Arabic script never had an inherent way of representing tones until it was adapted for the Rohingya language. The ''Rohingya Fonna'' are three tone markers which are part of the standardized and accepted orthographic convention of Rohingya. It remains the only known instance of tone markers within the Arabic script. Tone markers act as "modifiers" of vowel diacritics. In simpler words, they are "diacritics for the diacritics". They are written "outside" of the word, meaning that they are written above the vowel diacritic if the diacritic is written above the word, and below the diacritic if the diacritic is written below the word. They are only ever written where there are vowel diacritics. This is important to note, as without the diacritic present, there is no way to distinguish between tone markers and ''i‘jām'', i.e. the dots that are used to distinguish consonants.

''Hārbāy'': The Hārbāy, as it is called in Rohingya, is a single dot that is placed on top of ''Fatḥah'' and ''Ḍammah'', or ''curly Fatḥah'' and ''curly Ḍammah'' (vowel diacritics unique to Rohingya), or their respective ''Fatḥatan'' and ''Ḍammatan'' versions, and is placed underneath ''Kasrah'' or ''curly Kasrah'', or their respective ''Kasratan'' version. This tone marker indicates a short high tone.

''Ṭelā'': The Ṭelā, as it is called in Rohingya, is two dots that are placed on top of ''Fatḥah'' and ''Ḍammah'', or ''curly Fatḥah'' and ''curly Ḍammah'', or their respective ''Fatḥatan'' and ''Ḍammatan'' versions, and are placed underneath ''Kasrah'' or ''curly Kasrah'', or their respective ''Kasratan'' version. This tone marker indicates a long falling tone.

''Ṭāna'': The Ṭāna, as it is called in Rohingya, is a fish-like looping line that is placed on top of ''Fatḥah'' and ''Ḍammah'', or ''curly Fatḥah'' and ''curly Ḍammah'', or their respective ''Fatḥatan'' and ''Ḍammatan'' versions, and is placed underneath ''Kasrah'' or ''curly Kasrah'', or their respective ''Kasratan'' version. This tone marker indicates a long rising tone.


History

According to tradition, the first to commission a system of ''ḥarakāt'' was Ali, who appointed Abu al-Aswad al-Du'ali for the task. Abu al-Aswad devised a system of dots to signal the three short vowels (along with their respective allophones) of Arabic. This system of dots predates the ''i‘jām'', the dots used to distinguish between different consonants.

Gallery (images not reproduced): early Basmala in Kufic; middle Kufic; modern Kufic in a Qur'an folio (sura 91:14-15; sura 92:1-5).


Abu al-Aswad's system

Abu al-Aswad's system of ''ḥarakāt'' was different from the system we know today. The system used red dots, with each arrangement or position indicating a different short vowel. A dot above a letter indicated the vowel ''a'', a dot below indicated the vowel ''i'', a dot on the side of a letter stood for the vowel ''u'', and two dots stood for the ''tanwīn''. However, the early manuscripts of the Qur'an did not use the vowel signs for every letter requiring them, but only for letters where they were necessary for a correct reading.


Al Farahidi's system

The precursor to the system we know today is Al Farahidi's system. Al Farahidi found that the task of writing using two different colours was tedious and impractical. Another complication was that the ''i‘jām'' had been introduced by then, which, while they were short strokes rather than the round dots seen today, meant that without a colour distinction the two could become confused. Accordingly, he replaced the ''ḥarakāt'' with small superscript letters: small ''alif'', ''yā’'', and ''wāw'' for the short vowels corresponding to the long vowels written with those letters, a small ''s(h)īn'' for ''shaddah'' (geminate), and a small ''khā’'' for ''khafīf'' (short consonant; no longer used). His system is essentially the one we know today.


Automatic diacritization

The process of automatically restoring diacritical marks is called diacritization or diacritic restoration. It is useful for avoiding ambiguity in applications such as Arabic machine translation, text-to-speech, and information retrieval. Automatic diacritization algorithms have been developed. For Modern Standard Arabic, the state-of-the-art algorithm is reported to have a word error rate (WER) of 4.79%. The most common errors involve proper nouns and case endings. Similar algorithms exist for other varieties of Arabic.
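To make the word error rate figure concrete, the following sketch computes a WER for diacritization under a simplifying assumption: because diacritization only adds marks and leaves the number of words unchanged, the reference and the system output can be compared word by word, and a word counts as an error if any of its diacritics differ. The function name `diacritization_wer` is hypothetical, and published evaluations may use different conventions (for instance, ignoring case endings).

```python
def diacritization_wer(reference: str, hypothesis: str) -> float:
    """Fraction of words whose diacritized form differs from the reference.

    Both texts are split on whitespace and must contain the same number of
    words, which holds when the system only inserts diacritics.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if len(ref_words) != len(hyp_words):
        raise ValueError("reference and hypothesis must have the same word count")
    if not ref_words:
        return 0.0
    errors = sum(ref != hyp for ref, hyp in zip(ref_words, hyp_words))
    return errors / len(ref_words)


# Reference: 'religion' followed by 'debt'; the system vocalises both as 'religion'.
REFERENCE = "\u062F\u0650\u064A\u0646 \u062F\u064E\u064A\u0646"
HYPOTHESIS = "\u062F\u0650\u064A\u0646 \u062F\u0650\u064A\u0646"

if __name__ == "__main__":
    print(diacritization_wer(REFERENCE, HYPOTHESIS))
```

One wrong word out of two gives a rate of one half; a WER of 4.79% corresponds to roughly one wrongly diacritized word in every twenty-one.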


See also

* Arabic alphabet:
** ''I‘rāb'', the case system of Arabic
** ''Rasm'', the basic system of Arabic consonants
** ''Tajwīd'', the phonetic rules of recitation of Qur'an in Arabic
* Hebrew:
** Hebrew diacritics, the Hebrew equivalent
** ''Niqqud'', the Hebrew equivalent of ''ḥarakāt''
** ''Dagesh'', the Hebrew diacritic similar to Arabic ''i‘jām'' and ''shaddah''


References


Alexis Neme and Sébastien Paumier (2019), "Restoring Arabic vowels through omission-tolerant dictionary lookup", ''Lang Resources & Evaluation'', Vol. 53, pp. 1-65