
An alphabet is a standardized set of basic written
graphemes (called
letters) that represent the
phoneme
s of certain
spoken languages. Not all
writing system
s represent language in this way; in a
syllabary
, each character represents a
syllable
, and
logographic systems use
characters to represent words,
morphemes, or other semantic units.
The first fully phonemic script, the
Proto-Sinaitic script, later known as the
Phoenician alphabet
, is considered to be the first alphabet and is the ancestor of most modern alphabets, including
Arabic
,
Cyrillic,
Greek,
Hebrew
,
Latin
, and possibly
Brahmic.
It was created by Semitic-speaking workers and slaves in the
Sinai Peninsula (as the Proto-Sinaitic script), by selecting a small number of
hieroglyphs commonly seen in their
Egyptian surroundings to
describe the sounds, as opposed to the semantic values of the
Canaanite languages. However,
Peter T. Daniels distinguishes an
abugida
, a set of graphemes that represent consonantal base letters that
diacritic
s modify to represent vowels (as in
Devanagari and other South Asian scripts), an
abjad
, in which letters predominantly or exclusively represent
consonant
s (as in the original Phoenician,
Hebrew
or
Arabic
), and an alphabet, a set of graphemes that represent both consonants and
vowels. In this narrow sense of the word, the first true alphabet was the Greek alphabet,
which was based on the earlier Phoenician alphabet.
Of the dozens of alphabets in use today, the most popular is the Latin alphabet, which was derived from the Greek alphabet, and which is now used by many languages worldwide, often with the addition of extra letters or diacritical marks. While most alphabets have
letters composed of lines, there are also
exceptions such as the alphabets used in
Braille
.
Alphabets are usually associated with a standard ordering of letters. This makes them useful for purposes of
collation, specifically by allowing words to be sorted in
alphabetical order. It also means that their letters can be used as an alternative method of "numbering" ordered items, in such contexts as
numbered list
s and number placements.
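The two uses of a standard letter order described above, sorting words and labelling ordered items, can be illustrated with a minimal Python sketch. It assumes the 26-letter lowercase Latin alphabet; the function name letter_label is hypothetical, and extending past "z" with "aa", "ab", and so on is one common convention rather than a universal rule.

```python
import string

ALPHABET = string.ascii_lowercase  # assumed ordering: a, b, c, ..., z


def letter_label(index: int) -> str:
    """Return a letter label for a zero-based position: 0 -> 'a', 25 -> 'z', 26 -> 'aa'."""
    if index < 0:
        raise ValueError("index must be non-negative")
    label = ""
    n = index + 1  # switch to one-based counting for the carry arithmetic
    while n > 0:
        n, remainder = divmod(n - 1, len(ALPHABET))
        label = ALPHABET[remainder] + label
    return label


# For lowercase ASCII words, Python's default string comparison
# coincides with alphabetical order.
words = ["beta", "alpha", "gamma", "aleph"]
sorted_words = sorted(words)

# Use letters instead of numbers to label the sorted items.
labelled = [f"{letter_label(i)}) {word}" for i, word in enumerate(sorted_words)]
```

Real-world collation is language-specific (accented letters, digraphs, and case all need rules), so plain code-point comparison is only adequate for this simplified case.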
Etymology
The English word ''alphabet'' came into
Middle English
from the
Late Latin
word ''alphabetum'', which in turn originated in the Greek ἀλφάβητος (''alphabētos''); it was made from the first two letters of the Greek alphabet, ''
alpha'' (α) and ''
beta'' (β). The names for the Greek letters came from the first two letters of the Phoenician alphabet: ''
aleph'', which also meant ''ox'', and ''
bet'', which also meant ''house''.
History

Ancient Northeast African and Middle Eastern scripts
The history of the alphabet started in ancient Egypt. Egyptian writing had a set of some
24 hieroglyphs that are called uniliterals, representing syllables that begin with a single consonant of their language, plus a vowel (or no vowel) to be supplied by the native speaker. These glyphs were used as pronunciation guides for
logograms, to write grammatical inflections, and, later, to transcribe loan words and foreign names.

In the
Middle Bronze Age
, an apparently "alphabetic" system known as the Proto-Sinaitic script appeared in Egyptian turquoise mines in the
Sinai peninsula dated to circa the 15th century BCE, apparently left by Canaanite workers. In 1999, John and Deborah Darnell discovered an earlier version of this first alphabet at
Wadi el-Hol. It dates to circa 1800 BCE and shows evidence of having been adapted from specific forms of Egyptian hieroglyphs that could be dated to circa 2000 BCE, strongly suggesting that the first alphabet had developed about that time. Based on letter appearances and names, the script is believed to be based on Egyptian hieroglyphs.
This script had no characters representing vowels. Originally, it probably was a syllabary, with symbols that were not needed being removed. An alphabetic
cuneiform
script with 30 signs, including three that indicate the following vowel, was invented in
Ugarit before the 15th century BCE. This script was not used after the destruction of Ugarit.
The Proto-Sinaitic script eventually developed into the Phoenician alphabet, conventionally called "Proto-Canaanite" before c. 1050 BCE.
The oldest text in Phoenician script is an inscription on the sarcophagus of King
Ahiram. This script is the parent script of all western alphabets. By the tenth century BCE, two other forms can be distinguished,
Canaanite and
Aramaic
. The Aramaic gave rise to the
Hebrew
script. The
South Arabian alphabet, a sister script to the Phoenician alphabet, is the script from which the
Ge'ez alphabet (an
abugida
) descended. Vowel-less alphabets are called
abjad
s, currently exemplified by scripts such as Arabic, Hebrew, and
Syriac. The omission of vowels was not always a satisfactory solution, and some "weak"
consonants are sometimes used to indicate the vowel quality of a syllable. These letters have a dual function since they can also be used as pure consonants.
The Proto-Sinaitic script and the Ugaritic script were the first scripts with a limited number of signs, in contrast to the other widely used writing systems at the time,
Cuneiform
,
Egyptian hieroglyphs, and
Linear B
. The Phoenician script was probably the first phonemic script,
and it contained only about two dozen distinct letters, making it a script simple enough for traders to learn. Another advantage of Phoenician was that it could write different languages since it recorded words phonemically.

The script was spread across the Mediterranean by the Phoenicians.
In Greece, the script was modified to add vowels, giving rise to the ancestor of all alphabets in the West. It was the first alphabet in which vowels have independent letter forms separate from those of consonants. The Greeks chose letters representing sounds that did not exist in Greek to represent vowels, which are significant in the Greek language. The syllabic Linear B script that was used by the
Mycenaean Greeks from the 16th century BCE had 87 symbols, including five vowels. In its early years, there were many variants of the Greek alphabet, a situation that caused many different alphabets to evolve from it.
European alphabets

The Greek alphabet, in
Euboean form, was carried over by Greek colonists to the Italian peninsula, giving rise to many different alphabets used to write the
Italic languages
. One of these became the Latin alphabet, which spread across Europe as the Romans expanded their empire. Even after the fall of the Roman state, the alphabet survived in intellectual and religious works. It eventually became used for the descendant languages of Latin (the
Romance languages
) and most of the other languages of western and central Europe.
Some adaptations of the Latin alphabet have
ligatures
, such as
æ in
Danish and
Icelandic and
Ȣ in
Algonquian; borrowings from other alphabets, such as the
thorn þ in
Old English
and
Icelandic, which came from the
Futhark runes; and modified existing letters, such as the
eth ð of Old English and Icelandic, which is a modified ''d''. Other alphabets only use a subset of the Latin alphabet, such as Hawaiian, and
Italian, which uses the letters ''j, k, x, y,'' and ''w'' only in foreign words.
Another notable script is
Elder Futhark
, believed to have evolved out of one of the
Old Italic alphabet
s. Elder Futhark gave rise to other alphabets known collectively as the
Runic alphabet
s. The Runic alphabets were used for Germanic languages from 100 CE to the late Middle Ages, mostly engraved on stone and jewelry, although inscriptions on bone and wood occasionally appear. These alphabets have since been replaced by the Latin alphabet, except for decorative use, where runes remained in use until the 20th century.
The
Old Hungarian script was the writing system of the Hungarians. It was in use during the entire history of Hungary, albeit not as an official writing system. From the 19th century, it once again became more and more popular.
The
Glagolitic alphabet
was the initial script of the liturgical language
Old Church Slavonic
and became, together with the Greek uncial script, the basis of the
Cyrillic script
. Cyrillic is one of the most widely used modern alphabetic scripts, and is notable for its use in Slavic languages and also for other languages within the former
Soviet Union
.
Cyrillic alphabets include
Serbian
,
Macedonian
,
Bulgarian,
Russian,
Belarusian
, and
Ukrainian. The Glagolitic alphabet is believed to have been created by
Saints Cyril and Methodius, while the Cyrillic alphabet was invented by
Clement of Ohrid, their disciple. They feature many letters that appear to have been borrowed from or influenced by Greek and Hebrew.
Asian alphabets
Beyond the logographic
Chinese writing, many phonetic scripts exist in Asia. The
Arabic alphabet,
Hebrew alphabet
,
Syriac alphabet
, and other
abjad
s of the Middle East are developments of the
Aramaic alphabet.
Most alphabetic scripts of India and Eastern Asia descend from the
Brahmi script
, believed to be a descendant of Aramaic.
Hangul
In
Korea
,
Sejong the Great
created the
Hangul
alphabet. Hangul is a unique alphabet: it is a
featural alphabet, where the design of many of the letters comes from a sound's place of articulation (P looks like the widened mouth, L like the tongue pulled in); its creation was planned by the government of the day; and it places individual letters in syllable clusters with equal dimensions, in the same way as
Chinese characters
, to allow for mixed-script writing (one syllable always takes up one type-space no matter how many letters get stacked into building that one sound-block).
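How individual letters stack into one equal-sized syllable block can be sketched in Python. The arithmetic below is an assumption taken from the way Unicode lays out precomposed Hangul syllables (19 initial consonants, 21 vowels, and 27 optional final consonants, starting at code point U+AC00); it is not described in the text above, and the names compose and decompose are hypothetical.

```python
# Assumption: Unicode arranges precomposed Hangul syllable blocks arithmetically,
# starting at U+AC00, ordered by initial consonant, then vowel, then final consonant.
S_BASE = 0xAC00
INITIALS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")                      # 19 letters
MEDIALS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")                   # 21 letters
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 27 letters + none


def compose(initial: str, medial: str, final: str = "") -> str:
    """Stack two or three letters into a single syllable block."""
    i = INITIALS.index(initial)
    m = MEDIALS.index(medial)
    f = FINALS.index(final)
    return chr(S_BASE + (i * len(MEDIALS) + m) * len(FINALS) + f)


def decompose(block: str) -> tuple[str, str, str]:
    """Split one syllable block back into its letters."""
    offset = ord(block) - S_BASE
    if not 0 <= offset < len(INITIALS) * len(MEDIALS) * len(FINALS):
        raise ValueError("not a precomposed Hangul syllable")
    i, rest = divmod(offset, len(MEDIALS) * len(FINALS))
    m, f = divmod(rest, len(FINALS))
    return INITIALS[i], MEDIALS[m], FINALS[f]


# The word "Hangul" is two blocks of three letters each.
word = compose("ㅎ", "ㅏ", "ㄴ") + compose("ㄱ", "ㅡ", "ㄹ")
```

However many letters go into a block, the result is a single character, which mirrors the one-syllable-per-type-space behaviour described above.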
Zhuyin
Zhuyin (sometimes called ''Bopomofo'') is a
semi-syllabary
. It transcribes
Mandarin
phonetically in the Republic of China. After the later establishment of the People's Republic of China and its adoption of
Hanyu Pinyin
, the use of Zhuyin today is limited. However, it is still widely used in Taiwan, where the Republic of China governs. Zhuyin developed from a form of Chinese shorthand based on Chinese characters in the early 1900s and has elements of both an alphabet and a syllabary. Like an alphabet, the phonemes of
syllable initials are represented by individual symbols, but like a syllabary, the phonemes of the
syllable finals are not; each possible final (excluding the
medial glide) has its own character; for example, ''luan'' is written as ㄌㄨㄢ (''l-u-an''), where the last symbol ㄢ stands for the entire final ''-an''. While Zhuyin is not a mainstream writing system, it is still often used in ways similar to a
romanization system, for aiding pronunciation and as an input method for Chinese characters on computers and cellphones.
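The initial, medial, and final structure described above can be sketched as a small Python parser. The symbol inventories are the standard Zhuyin set as encoded in the Unicode Bopomofo block, which is an assumption not taken from the text; the function name split_zhuyin is hypothetical, and tone marks are deliberately left out.

```python
# Assumed symbol inventories (standard Zhuyin, Unicode Bopomofo block).
INITIALS = set("ㄅㄆㄇㄈㄉㄊㄋㄌㄍㄎㄏㄐㄑㄒㄓㄔㄕㄖㄗㄘㄙ")
MEDIALS = set("ㄧㄨㄩ")
FINALS = set("ㄚㄛㄜㄝㄞㄟㄠㄡㄢㄣㄤㄥㄦ")


def split_zhuyin(syllable: str) -> tuple[str, str, str]:
    """Split a toneless Zhuyin syllable into (initial, medial, final).

    Each part is optional and is returned as an empty string when absent.
    """
    rest = syllable
    initial = medial = final = ""
    if rest and rest[0] in INITIALS:
        initial, rest = rest[0], rest[1:]
    if rest and rest[0] in MEDIALS:
        medial, rest = rest[0], rest[1:]
    if rest and rest[0] in FINALS:
        final, rest = rest[0], rest[1:]
    if rest:
        raise ValueError(f"unexpected symbols: {rest!r}")
    return initial, medial, final


# ''luan'': one symbol for the initial l, one for the glide u,
# and a single symbol for the whole final -an.
parts = split_zhuyin("ㄌㄨㄢ")
```

The final ㄢ is never broken into separate symbols for ''a'' and ''n'', which is the syllabary-like half of the system.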
Romanization
European alphabets, especially Latin and Cyrillic, have been adapted for many languages of Asia. Arabic is also widely used, sometimes as an abjad (as with
Urdu and
Persian) and sometimes as a complete alphabet (as with
Kurdish and
Uyghur).
Types

The term "alphabet" is used by
linguists
and
paleographers in both a wide and a narrow sense. In the wider sense, an alphabet is a ''segmental'' script at the
phoneme
level—that is, it has separate glyphs for individual sounds and not for larger units such as syllables or words. In the narrower sense, some scholars distinguish "true" alphabets from two other types of segmental script,
abjad
s, and
abugida
s. These three differ in how they treat vowels. Abjads have letters for consonants and leave most vowels unexpressed. Abugidas are also consonant-based but indicate vowels with
diacritic
s, a systematic graphic modification of the consonants. In alphabets in the narrow sense, on the other hand, consonants and vowels are written as independent letters. The earliest known alphabet in the wider sense is the
Wadi el-Hol script, believed to be an abjad. Its successor,
Phoenician, is the ancestor of modern alphabets, including
Arabic
,
Greek,
Latin
(via the
Old Italic alphabet
),
Cyrillic (via the Greek alphabet), and
Hebrew
(via
Aramaic
).
Examples of present-day abjads are the
Arabic
and
Hebrew scripts; true alphabets include
Latin
, Cyrillic, and Korean
hangul
; and abugidas are used to write
Tigrinya,
Amharic,
Hindi
, and
Thai. The
Canadian Aboriginal syllabics are also an abugida rather than a syllabary as their name would imply, because each glyph stands for a consonant and is modified by rotation to represent the following vowel (in a true syllabary, each consonant-vowel combination is represented by a separate glyph).
All three types may be augmented with syllabic glyphs.
Ugaritic, for example, is basically an abjad but has syllabic letters for /ʔa, ʔi, ʔu/ (these are the only times that vowels are indicated).
Coptic has a letter for /ti/.
Devanagari is typically an abugida augmented with dedicated letters for initial vowels, though some traditions use अ as a
zero consonant as the graphic base for such vowels.
The boundaries between the three types of segmental scripts are not always clear-cut. For example,
Sorani Kurdish is written in the
Arabic script, which when used for other languages is an abjad. In Kurdish, writing the vowels is mandatory, and whole letters are used, so the script is a true alphabet. Other languages may use a Semitic abjad with mandatory vowel diacritics, effectively making them abugidas. On the other hand, the
Phagspa script of the
Mongol Empire
was based closely on the
Tibetan abugida, but vowel marks are written after the preceding consonant rather than as diacritic marks. Although short ''a'' is not written, as in the Indic abugidas, one could argue that the linear arrangement made this a true alphabet. Ironically, the original source of the term "abugida", namely the
Ge'ez abugida now used for
Amharic and
Tigrinya, has had its vowel marks so fully assimilated into the consonants that the modifications are no longer systematic; it must now be learned as a syllabary rather than as a segmental script. Even more extreme, the Pahlavi abjad eventually became
logographic.

Thus the primary
categorisation of alphabets reflects how they treat vowels. For
tonal languages, further classification can be based on their treatment of tone, though names do not yet exist to distinguish the various types. Some alphabets disregard tone entirely, especially when it does not carry a heavy functional load, as in
Somali
and many other languages of Africa and the Americas. Such scripts are to tone what abjads are to vowels. Most commonly, tones are indicated with diacritics, the way vowels are treated in abugidas; this is the case for
Vietnamese (a true alphabet) and
Thai (an abugida). In Thai, the tone is determined primarily by a consonant, with diacritics for disambiguation. In the
Pollard script, an abugida, vowels are indicated by diacritics, and the placement of the diacritic relative to the consonant is modified to indicate the tone. More rarely, a script may have separate letters for tones, as is the case for
Hmong and
Zhuang. For many of these scripts, regardless of whether letters or diacritics are used, the most common tone is not marked, just as the most common vowel is not marked in Indic abugidas. In
Zhuyin, not only is one of the tones unmarked, but there is also a diacritic to indicate a lack of tone, like the
virama of Indic.
Size
The number of letters in an alphabet can be small. The Book
Pahlavi
script, an abjad, had only twelve letters at one point and may have had even fewer. Today the
Rotokas alphabet has only twelve letters (the
Hawaiian alphabet
is sometimes claimed to be that small. However, it consists of 18 letters, including the
ʻokina and five long vowels). While Rotokas has a small alphabet because it has few phonemes to represent (just eleven), Book Pahlavi was small because many letters had been ''conflated''; that is, the graphic distinctions between them had been lost over time. In later Pahlavi
papyri
, up to half of the remaining graphic distinctions of these twelve letters were lost, and the script could no longer be read as a sequence of letters at all. Instead, each word had to be learned as a whole; that is, the words had become
logograms as in Egyptian
Demotic. Moreover, the spellings of some words were
heterograms; that is, those spellings did not reflect the pronunciation of those words in Pahlavi but instead reflected their
Aramaic
equivalents used as logograms (much as English ''e.g.'' is read as ''for example'' but abbreviates Latin ''exempli gratia'').
Abugidas
The largest segmental script is probably
Devanagari. When written in Devanagari, Vedic
Sanskrit
has an alphabet of 53 letters, including the ''visarga'' mark for final aspiration and special letters for ''kš'' and ''jñ.'' However, one of the letters is theoretical and not used. The
Hindi alphabet
must represent both Sanskrit and modern vocabulary, and so has been expanded to 58 with letters that carry an added dot to represent sounds from Persian and English.
Abjads
The largest known abjad is
Sindhi, with 52 letters. The largest alphabets in the narrow sense include
Abkhaz and
Kabardian (for
Cyrillic), with 62 and 60 letters, respectively, and
Slovak (for the
Latin script
), with 46. However, these scripts either count
di- and tri-graphs as separate letters, as Spanish did with ''ch'' and ''ll'' until recently, or use
diacritics like Slovak ''č''.
Alphabets

The
Georgian alphabet
is an alphabetic writing system. The modern Georgian alphabet has 33 letters. The original Georgian alphabet had 38 letters, but five letters were removed in the 19th century by
Ilia Chavchavadze
.
The
Armenian alphabet is an alphabetical writing system used to write the Armenian language. It was created in 405 AD and originally contained 36 letters. Two more letters, օ (o) and ֆ (f), were added in the Middle Ages. During the orthography reform of the 1920s, a new letter և (capital ԵՎ), which had previously been the ligature ե+ւ, was added. The letter Ւ ւ was discarded and reintroduced as part of a new letter ՈՒ ու (which had previously been a digraph).
Syllabaries
Syllabaries typically contain 50 to 400 glyphs. Glyphs of logographic systems typically number from the many hundreds into the thousands. Thus a simple count of the number of distinct symbols is an important clue to the nature of an unknown script.
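The counting heuristic can be sketched in a few lines of Python. This is only an illustration: the function name and the exact cut-offs are assumptions taken from the figures quoted in this section (segmental scripts up to about 62 letters, syllabaries 50 to 400 glyphs, logographic systems above that), and the ranges overlap, so the function returns every type that fits.

```python
def candidate_script_types(distinct_symbols):
    """Return the script types compatible with a count of distinct symbols.

    The cut-offs are illustrative assumptions based on the ranges quoted
    in the text; a count is a clue, not a proof.
    """
    if distinct_symbols < 1:
        raise ValueError("a script needs at least one symbol")
    candidates = []
    if distinct_symbols <= 62:          # largest alphabet cited: Abkhaz, 62
        candidates.append("segmental")
    if 50 <= distinct_symbols <= 400:   # typical syllabary range
        candidates.append("syllabary")
    if distinct_symbols > 400:          # assumed lower bound for logographies
        candidates.append("logographic")
    return candidates


for count in (12, 58, 87, 3000):
    print(count, candidate_script_types(count))
```

A count such as 58 stays ambiguous between a large segmental script and a small syllabary, which is why the count serves only as a first clue.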
Alphabetical order
Alphabets often come to be associated with a standard ordering of their letters, which is used for
collation, namely the listing of words and other items in ''
alphabetical order''.
The basic ordering of the
Latin alphabet (
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z), which derived from the Northwest Semitic "Abgad" order, is well established. However, languages using this alphabet have different conventions for their treatment of modified letters (such as the
French
''é'', ''à'', and ''ô'') and certain combinations of letters (
multigraphs). In French, these are not considered additional letters for the purposes of collation. However, in
Icelandic, accented letters such as ''á'', ''í'', and ''ö'' are considered distinct letters representing different vowel sounds from those represented by their unaccented counterparts. In Spanish, ''ñ'' is considered a separate letter, but accented vowels such as ''á'' and ''é'' are not. The ''ll'' and ''ch'' were also considered single letters, but in 1994 the tenth congress of the
Association of Spanish Language Academies changed the collating order so that ''ll'' falls between ''lk'' and ''lm'' in the dictionary and ''ch'' between ''cg'' and ''ci'', and in 2010 the
Real Academia Española
changed it so they were no longer letters at all.
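The effect of the 1994 change on dictionary order can be illustrated with a small Python sketch. The letter lists are written out by hand and `make_key` is a hypothetical helper, not part of any library; the sketch only models whether ''ch'' and ''ll'' are ranked as single letters, and folds accented vowels because they are not separate letters.

```python
# Sketch only: alphabets are hand-written and make_key is a hypothetical helper.
TRADITIONAL = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
               "l", "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u",
               "v", "w", "x", "y", "z"]
MODERN = [letter for letter in TRADITIONAL if letter not in ("ch", "ll")]

# Accented vowels are not separate letters, so fold them before ranking.
FOLD_ACCENTS = str.maketrans("áéíóúü", "aeiouu")


def make_key(alphabet):
    """Build a sort key that ranks each letter by its position in `alphabet`."""
    rank = {letter: position for position, letter in enumerate(alphabet)}

    def key(word):
        word = word.lower().translate(FOLD_ACCENTS)
        ranks = []
        i = 0
        while i < len(word):
            pair = word[i:i + 2]
            if len(pair) == 2 and pair in rank:
                # The digraph counts as one letter in this alphabet.
                ranks.append(rank[pair])
                i += 2
            else:
                ranks.append(rank[word[i]])
                i += 1
        return ranks

    return key


words = ["cuna", "chico", "cama", "dedo", "luz", "llave", "lago"]
print(sorted(words, key=make_key(TRADITIONAL)))
print(sorted(words, key=make_key(MODERN)))
```

Under the traditional key ''chico'' sorts after ''cuna'' and ''llave'' after ''luz''; under the modern key they fall between ''cama'' and ''cuna'' and between ''lago'' and ''luz'' respectively.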
In German, words starting with ''sch-'' (which spells the German phoneme /ʃ/) are inserted between words with initial ''sca-'' and ''sci-'' (all incidentally loanwords) instead of appearing after initial ''sz'', as they would if ''sch'' were a single letter. This contrasts with several languages such as
Albanian, in which ''dh-'', ''ë-'', ''gj-'', ''ll-'', ''nj-'', ''rr-'', ''th-'', ''xh-'' and ''zh-'' (all representing phonemes and considered separate single letters) follow the letters ''d'', ''e'', ''g'', ''l'', ''n'', ''r'', ''t'', ''x'' and ''z'' respectively, as well as Hungarian and Welsh. Further, German words with an
umlaut are collated ignoring the umlaut as per
DIN
5007-1, in contrast to
Turkish
which adopted the
graphemes ö and ü, and where a word like ''tüfek'' would come after ''tuz'' in the dictionary. An exception is the German telephone directory, where umlauts are sorted as if written out (''ä'' = ''ae''), since names such as ''Jäger'' also appear with the spelling ''Jaeger'' and are not distinguished in the spoken language.
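The three treatments of umlauted letters described above can be compared with a short Python sketch. The function names and the reduced letter table are assumptions made for illustration; the sketch does not implement the full DIN 5007-1 rules (for example tie-breaking between otherwise identical words) or the complete Turkish alphabet.

```python
# Minimal sketch: three ways to order words containing umlauted letters.
# Function names and the reduced letter table are illustrative assumptions.

def dictionary_key(word):
    """Ignore the umlaut: ä sorts as a, ö as o, ü as u."""
    return word.lower().translate(str.maketrans("äöü", "aou"))


def phone_book_key(word):
    """Expand the umlaut: ä sorts as ae, ö as oe, ü as ue."""
    table = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})
    return word.lower().translate(table)


# ö and ü treated as letters in their own right, placed after o and u.
SEPARATE_LETTERS = "abcdefghijklmnoöpqrstuüvwxyz"


def separate_letter_key(word):
    """Rank each letter by its position in SEPARATE_LETTERS.

    Raises ValueError for characters outside this reduced table.
    """
    return [SEPARATE_LETTERS.index(ch) for ch in word.lower()]


names = ["Jäger", "Jaffe", "Jahn"]
print(sorted(names, key=dictionary_key))
print(sorted(names, key=phone_book_key))

turkish_words = ["tüfek", "tuz", "tur"]
print(sorted(turkish_words, key=separate_letter_key))
```

With the dictionary key ''Jäger'' sorts like ''Jager'' and lands after ''Jaffe''; with the telephone-directory key it sorts like ''Jaeger'' and lands before it. With ''ü'' as a separate letter, ''tüfek'' follows ''tuz''.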
The
Danish and
Norwegian alphabets end with ''æ''—''ø''—''å'', whereas the Swedish and
Finnish ones conventionally put ''å''—''ä''—''ö'' at the end.
It is unknown whether the earliest alphabets had a defined sequence. Some alphabets today, such as the
Hanuno'o script
, are learned one letter at a time, in no particular order, and are not used for
collation where a definite order is required. However, a dozen
Ugaritic tablets from the fourteenth century BCE preserve the alphabet in two sequences. One, the ''ABCDE'' order later used in Phoenician, has continued with minor changes in
Hebrew
,
Greek,
Armenian,
Gothic
,
Cyrillic, and
Latin
; the other, ''HMĦLQ,'' was used in southern Arabia and is preserved today in
Ethiopic. Both orders have therefore been stable for at least 3000 years.
Runic used an unrelated
Futhark sequence, which was later
simplified.
Arabic
uses its own sequence, although Arabic retains the traditional
abjadi order for numbering.
The
Brahmic family of alphabets used in India uses a unique order based on
phonology
: The letters are arranged according to how and where they are produced in the mouth. This organization is used in Southeast Asia, Tibet, Korean
hangul
, and even Japanese
kana
, which is not an alphabet.
Names of letters
The Phoenician letter names, in which each letter was associated with a word that begins with that sound (
acrophony), continue to be used to varying degrees in
Samaritan,
Aramaic
,
Syriac,
Hebrew
,
Greek and
Arabic
.
The names were abandoned in
Latin
, which instead referred to the letters by adding a vowel (usually "e") before or after the consonant; the two exceptions were
Y and
Z, which were borrowed from the Greek alphabet rather than Etruscan, and were known as ''Y Graeca'' "Greek Y" and ''zeta'' (from Greek)—this discrepancy was inherited by many European languages, as in the term ''zed'' for Z in all forms of English other than American English. Over time names sometimes shifted or were added, as in ''double U'' for
W ("double V" in French), the English name for Y, and American ''zee'' for Z. Comparing names in English and French gives a clear reflection of the
Great Vowel Shift: A, B, C and D are pronounced /eɪ, biː, siː, diː/ in today's English, but in contemporary French they are /a, be, se, de/. The French names (from which the English names are derived) preserve the qualities of the English vowels from before the Great Vowel Shift. By contrast, the names of F, L, M, N and S (/ɛf, ɛl, ɛm, ɛn, ɛs/) remain the same in both languages, because "short" vowels were largely unaffected by the Shift.
In Cyrillic, the letters were originally given names based on Slavic words; this was later abandoned as well in favor of a system similar to that used in Latin.
The letters of the
Armenian alphabet also have distinct letter names.
Orthography and pronunciation
When an alphabet is adopted or developed to represent a given language, an
orthography generally comes into being, providing rules for the
spelling of words in that language. In accordance with the principle on which alphabets are based, these rules will generally map letters of the alphabet to the
phonemes (significant sounds) of the spoken language. In a perfectly
phonemic orthography there would be a consistent one-to-one correspondence between the letters and the phonemes, so that a writer could predict the spelling of a word given its pronunciation, and a speaker would always know the pronunciation of a word given its spelling. However, this ideal is not usually achieved in practice; some languages (such as Spanish and
Finnish) come close to it, while others (such as English) deviate from it to a much larger degree.
The pronunciation of a language often evolves independently of its writing system, and writing systems have been borrowed for languages they were not designed for, so the degree to which letters of an alphabet correspond to phonemes of a language varies greatly from one language to another and even within a single language.
Languages may fail to achieve a one-to-one correspondence between letters and sounds in any of several ways:
*A language may represent a given phoneme by a combination of letters rather than just a single letter. Two-letter combinations are called
digraphs and three-letter groups are called
trigraphs.
German uses the
tetragraphs (four letters) "tsch" for the phoneme [tʃ] and (in a few borrowed words) "dsch" for [dʒ].
Kabardian also uses a tetragraph for one of its phonemes, namely "кхъу". Multi-letter combinations representing one sound occur in several instances in Hungarian as well (where, for instance, ''cs'' stands for [tʃ], ''sz'' for [s], ''zs'' for [ʒ], and ''dzs'' for [dʒ]).
*A language may represent the same phoneme with two or more different letters or combinations of letters. An example is
modern Greek, which may write the phoneme [i] in six different ways: ⟨ι⟩, ⟨η⟩, ⟨υ⟩, ⟨ει⟩, ⟨οι⟩, and ⟨υι⟩.
*A language may spell some words with unpronounced letters that exist for historical or other reasons. For example, the spelling of the Thai word for "beer"
เบียร์ retains a letter for the final consonant "r" present in the English word it was borrowed from, but silences it.
*Pronunciation of individual words may change according to the presence of surrounding words in a sentence (
sandhi).
*Different dialects of a language may use different phonemes for the same word.
*A language may use different sets of symbols or different rules for distinct sets of vocabulary items, such as the Japanese
hiragana and
katakana syllabaries, or the various rules in English for spelling words from Latin and Greek, or the original
Germanic vocabulary.
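The first point in the list, that one phoneme may be written with several letters, means that a reader has to group letters before mapping them to sounds. A minimal Python sketch of that grouping step, using a greedy longest-match rule, is given here. The table holds only the four Hungarian combinations mentioned above with their commonly cited sound values, and both the table and the function name are assumptions made for illustration, not a complete description of Hungarian orthography.

```python
# Hypothetical, deliberately incomplete table: only the Hungarian
# multigraphs mentioned in the text, with commonly cited sound values.
HUNGARIAN_MULTIGRAPHS = {"dzs": "dʒ", "cs": "tʃ", "sz": "s", "zs": "ʒ"}


def split_graphemes(word, multigraphs):
    """Split `word` into spelling units, preferring the longest multigraph."""
    longest = max(len(m) for m in multigraphs)
    units = []
    i = 0
    while i < len(word):
        # Try the longest possible chunk first, then shorter ones.
        for size in range(longest, 0, -1):
            chunk = word[i:i + size]
            if size == 1 or chunk in multigraphs:
                units.append(chunk)
                i += len(chunk)
                break
    return units


for word in ("dzsem", "csak", "zseb"):
    print(word, split_graphemes(word, HUNGARIAN_MULTIGRAPHS))
```

A greedy matcher of this kind will mis-split words in which the same letters merely stand next to each other across a morpheme boundary, so a real system would need a lexicon or further rules.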
National languages sometimes elect to address the problem of dialects by associating the alphabet with the national standard. Some national languages like
Finnish,
Armenian,
Turkish
,
Russian,
Serbo-Croatian (
Serbian
,
Croatian
and
Bosnian) and
Bulgarian have a very regular spelling system with a nearly one-to-one correspondence between letters and phonemes. Strictly speaking, these national languages lack a word corresponding to the verb "to spell" (meaning to split a word into its letters), the closest match being a verb meaning to split a word into its syllables. Similarly, the
Italian verb corresponding to 'spell (out)', ''compitare'', is unknown to many Italians because spelling is usually trivial, as Italian spelling is highly phonemic. In standard Spanish, one can tell the pronunciation of a word from its spelling, but not vice versa, as certain phonemes can be represented in more than one way, but a given letter is consistently pronounced.
French
, with its
silent letters and its heavy use of
nasal vowels and elision, may seem to lack much correspondence between spelling and pronunciation, but its rules on pronunciation, though complex, are actually consistent and predictable with a fair degree of accuracy.
At the other extreme are languages such as English, where the pronunciations of many words must be memorized as they do not correspond to the spelling in a consistent way. For English, this is partly because the
Great Vowel Shift occurred after the orthography was established, and partly because English has acquired a large number of loanwords at different times, retaining their original spelling to varying degrees. Even English has general, albeit complex, rules that predict pronunciation from spelling, and these rules are successful most of the time; rules to predict spelling from the pronunciation have a higher failure rate.
Sometimes, countries have the written language undergo a
spelling reform to realign the writing with the contemporary spoken language. These can range from simple spelling changes and word forms to switching the entire writing system itself, as when Turkey switched from the Arabic alphabet to a Latin-based
Turkish alphabet, and as when
Kazakh
changed from an Arabic script to a Cyrillic script under Soviet influence and, in 2021, began a transition to a Latin alphabet, as Turkish had done. The Cyrillic script used to be official in Uzbekistan and Turkmenistan before both switched to Latin alphabets; Uzbekistan is additionally reforming its alphabet to use diacritics in place of the letters marked by apostrophes and the letters that are digraphs.
The standard system of symbols used by
linguists to represent sounds in any language, independently of orthography, is called the
International Phonetic Alphabet
.
See also
* Abecedarium
* Acrophony
* Akshara
* Alphabet book
* Alphabet effect
* Alphabet song
* Alphabetical order
* Butterfly Alphabet
* Character encoding
* Constructed script
* Fingerspelling
* NATO phonetic alphabet
* Lipogram
* List of writing systems
* Pangram
* Thoth
* Transliteration
* Unicode
External links
* The Origins of abc
* "Language, Writing and Alphabet: An Interview with Christophe Rico", ''Damqātum 3'' (2007)
* Michael Everson's Alphabets of Europe
* Animation by Prof. Robert Fradkin at the University of Maryland
* How the Alphabet Was Born from Hieroglyphs, Biblical Archaeology Review
* An Early Hellenic Alphabet
* The Alphabet, BBC Radio 4 discussion with Eleanor Robson, Alan Millard and Rosalind Thomas (''In Our Time'', 18 December 2003)