Many scripts in Unicode, such as Arabic, have orthographic rules that require certain sequences of letterforms to be joined into special ligature forms. In English, the common ampersand (&) developed from a ligature in which the handwritten Latin letters ''e'' and ''t'' (spelling ''et'', Latin for ''and'') were combined. The rules governing ligature formation in Arabic can be quite complex, requiring special script-shaping technologies such as the Arabic Calligraphic Engine by DecoType.

As of Unicode 15.0, the Arabic script is contained in the following blocks:
* Arabic (0600–06FF, 256 characters)
* Arabic Supplement (0750–077F, 48 characters)
* Arabic Extended-B (0870–089F, 41 characters)
* Arabic Extended-A (08A0–08FF, 96 characters)
* Arabic Presentation Forms-A (FB50–FDFF, 631 characters)
* Arabic Presentation Forms-B (FE70–FEFF, 141 characters)
* Rumi Numeral Symbols (10E60–10E7F, 31 characters)
* Arabic Extended-C (10EC0–10EFF, 3 characters)
* Indic Siyaq Numbers (1EC70–1ECBF, 68 characters)
* Ottoman Siyaq Numbers (1ED00–1ED4F, 61 characters)
* Arabic Mathematical Alphabetic Symbols (1EE00–1EEFF, 143 characters)

The basic Arabic range encodes the standard letters, the most common diacritics and the Arabic-Indic digits, but does not encode contextual forms; U+0621–U+0652 are based directly on ISO 8859-6. The Arabic Supplement range encodes letter variants mostly used for writing African (non-Arabic) languages. The Arabic Extended-B and Arabic Extended-A ranges encode additional Qur'anic annotations and letter variants used for various non-Arabic languages. The Arabic Presentation Forms-A range encodes contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. The Arabic Presentation Forms-B range encodes spacing forms of Arabic diacritics and further contextual letter forms. The presentation forms exist only for compatibility with older standards and are not currently needed for encoding text.

The Arabic Mathematical Alphabetic Symbols block encodes characters used in Arabic mathematical expressions. The Indic Siyaq Numbers block contains a specialized subset of Arabic script that was used for accounting in India under the Mughal Empire by the 17th century and remained in use through the middle of the 20th century. The Ottoman Siyaq Numbers block contains a specialized subset of Arabic script, also known as ''Siyakat'' numbers, used for accounting in Ottoman Turkish documents.
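The block ranges listed above can be turned into a simple lookup. The following is a minimal sketch, not an official API: the table is copied from the Unicode 15.0 ranges given in this section, and the names `ARABIC_BLOCKS` and `arabic_block` are hypothetical.

```python
from bisect import bisect_right

# (first code point, last code point, block name), copied from the
# Unicode 15.0 ranges listed in this section. Sorted by first code point.
ARABIC_BLOCKS = [
    (0x0600, 0x06FF, "Arabic"),
    (0x0750, 0x077F, "Arabic Supplement"),
    (0x0870, 0x089F, "Arabic Extended-B"),
    (0x08A0, 0x08FF, "Arabic Extended-A"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),
    (0xFE70, 0xFEFF, "Arabic Presentation Forms-B"),
    (0x10E60, 0x10E7F, "Rumi Numeral Symbols"),
    (0x10EC0, 0x10EFF, "Arabic Extended-C"),
    (0x1EC70, 0x1ECBF, "Indic Siyaq Numbers"),
    (0x1ED00, 0x1ED4F, "Ottoman Siyaq Numbers"),
    (0x1EE00, 0x1EEFF, "Arabic Mathematical Alphabetic Symbols"),
]
_STARTS = [start for start, _, _ in ARABIC_BLOCKS]


def arabic_block(ch):
    """Return the Arabic-script block containing ch, or None (hypothetical helper)."""
    cp = ord(ch)
    # Find the last block whose first code point is <= cp.
    i = bisect_right(_STARTS, cp) - 1
    if i >= 0:
        start, end, name = ARABIC_BLOCKS[i]
        if cp <= end:
            return name
    return None


if __name__ == "__main__":
    for sample in ("\u0628", "\uFDF2", "\U0001EE00", "A"):
        print(f"U+{ord(sample):04X}", arabic_block(sample))
```

The lookup reports the block a code point falls in, whether or not that code point is assigned; the character counts in the list are not checked here.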


Contextual forms

A demonstration of the contextual forms of the basic alphabet used in Modern Standard Arabic:
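The contextual forms of a single letter can also be looked up programmatically. The sketch below assumes Python's standard `unicodedata` module and scans the Arabic Presentation Forms-B block for characters whose compatibility decomposition is a single base letter. The helper name `contextual_forms` is hypothetical, and in normal text a shaping engine selects these forms automatically from the base letter.

```python
import unicodedata


def contextual_forms(letter):
    """Map a form tag ("isolated", "final", ...) to its presentation character."""
    target = f"{ord(letter):04X}"
    forms = {}
    # Arabic Presentation Forms-B occupies U+FE70..U+FEFF.
    for cp in range(0xFE70, 0xFF00):
        decomp = unicodedata.decomposition(chr(cp)).split()
        # Single-letter forms decompose as e.g. ["<final>", "0628"];
        # ligatures and spacing diacritics have longer decompositions.
        if len(decomp) == 2 and decomp[1] == target:
            forms[decomp[0].strip("<>")] = chr(cp)
    return forms


if __name__ == "__main__":
    beh = "\u0628"  # ARABIC LETTER BEH
    for tag, ch in contextual_forms(beh).items():
        print(tag, f"U+{ord(ch):04X}", ch)
```

Letters that do not join to a following letter, such as alef (U+0627), have only isolated and final forms in this block.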


Punctuation and ornaments

Only the Arabic question mark ⟨؟⟩ and the Arabic comma ⟨،⟩ are used in regular Arabic script typing, and the Arabic comma is often replaced by the Latin script comma (,).
* U+066D ٭ Arabic five pointed star
* U+FD3E ﴾ Arabic ornate left parenthesis
* U+FD3F ﴿ Arabic ornate right parenthesis
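Where Latin punctuation has been typed in place of its Arabic counterpart, the substitution can be reversed mechanically. This is a naive sketch under the assumption that the whole input is Arabic-script text; the names `ARABIC_PUNCTUATION` and `to_arabic_punctuation` are hypothetical, and the mapping covers only the two marks discussed here.

```python
# Map Latin punctuation to the Arabic question mark and Arabic comma.
ARABIC_PUNCTUATION = str.maketrans({
    ",": "\u060C",  # ARABIC COMMA
    "?": "\u061F",  # ARABIC QUESTION MARK
})


def to_arabic_punctuation(text):
    """Replace Latin comma and question mark with the Arabic ones (hypothetical helper)."""
    return text.translate(ARABIC_PUNCTUATION)


if __name__ == "__main__":
    print(to_arabic_punctuation("\u0646\u0639\u0645, \u0644\u0627?"))
```

Because it replaces every comma and question mark, this would also alter embedded Latin-script text or numbers, so it is unsuitable for mixed-script input without further checks.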


Word ligatures

Arabic Presentation Forms-A has a few characters defined as "word ligatures" for terms frequently used in formulaic expressions in Arabic. They are rarely used outside professional liturgical typesetting. Likewise, the rial sign is normally written out in full rather than with the ligature.
* U+FDF3 ﷳ Arabic ligature Akbar, as in the phrase الله أكبر
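Each word ligature carries a compatibility decomposition that spells the word out in ordinary letters. A minimal sketch using Python's standard `unicodedata` module follows; the three code points are picked here as illustrations, and the helper name `spell_out` is hypothetical.

```python
import unicodedata


def spell_out(ligature):
    """Return the base-letter code points that a word ligature stands for."""
    expanded = unicodedata.normalize("NFKD", ligature)
    return [f"U+{ord(c):04X}" for c in expanded]


if __name__ == "__main__":
    # Illustrative picks: the Allah and Akbar ligatures and the rial sign.
    for cp in (0xFDF2, 0xFDF3, 0xFDFC):
        ch = chr(cp)
        print(f"U+{cp:04X}", unicodedata.name(ch), spell_out(ch))
```

This makes concrete why the ligatures are optional: the same words can be typed with letters from the basic Arabic block.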


Code blocks


Arabic


Character table


Compact table


Arabic Supplement


Arabic Extended-B


Arabic Extended-A


Arabic Presentation Forms-A

Most of these characters are ligatures that can be composed from characters in the preceding charts, and some are ligatures of common liturgical phrases. The bracket-like ornate parentheses ﴾ ﴿ are the exception.


Arabic Presentation Forms-B

These can all be created from the basic chart's characters.
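Since these forms map back to basic characters, legacy text stored as presentation forms can be folded to base letters with compatibility normalization. The sketch below assumes Python's standard `unicodedata` module; the function name `fold_presentation_forms` is hypothetical. Note that NFKC also rewrites unrelated compatibility characters (for example full-width Latin letters), so it may change more than the Arabic forms.

```python
import unicodedata


def fold_presentation_forms(text):
    """Replace presentation forms with base letters via NFKC (hypothetical helper)."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # Beh initial form, alef final form, beh isolated form.
    legacy = "\uFE91\uFE8E\uFE8F"
    folded = fold_presentation_forms(legacy)
    print([f"U+{ord(c):04X}" for c in folded])

    # The lam-alef ligature expands to two base letters.
    print([f"U+{ord(c):04X}" for c in fold_presentation_forms("\uFEFB")])
```

After folding, a text-shaping engine chooses the contextual forms again at display time, which is the approach the standard intends for new text.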


Rumi Numeral Symbols


Arabic Extended-C


Indic Siyaq Numbers


Ottoman Siyaq Numbers


Arabic Mathematical Alphabetic Symbols


External links

* /software.sil.org/Scheherazade Scheherazade or /fonts.google.com/specimen/Scheherazade+New?subset=arabic Scheherazade New, an extended Arabic script font designed by SIL International, distributed under the SIL Open Font License (OFL)
* /fonts.google.com/specimen/Harmattan?subset=arabic Harmattan, an extended Arabic script font designed by SIL International for West Africa, distributed under the SIL Open Font License (OFL)