List of Latin-script alphabets

The lists and tables below summarize and compare the letter inventories of some of the Latin-script alphabets. In this article, the scope of the word "alphabet" is broadened to include letters with tone marks and other diacritics used to represent a wide range of orthographic traditions, without regard to whether or how they are sequenced in their alphabet or the table. Parentheses indicate characters not used in modern standard orthographies of the languages, but used in obsolete and/or dialectal forms.


Letters contained in the ISO basic Latin alphabet


Alphabets that contain only ISO basic Latin letters

Among alphabets for natural languages, the English, Indonesian, and Malay alphabets use only the 26 letters, in both upper and lower case. Among alphabets for constructed languages, the Ido and Interlingua alphabets use only the 26 letters.


Extended by ligatures

* German (ß), French (æ, œ)


Extended by diacritical marks

* Spanish (ñ), German (ä, ö, and ü), Dutch (ë)


Extended by multigraphs

* Filipino (ng)


Alphabets that contain all ISO basic Latin letters

Among alphabets for natural languages, the Afrikaans, Aromanian, Azerbaijani (some dialects), Basque, Celtic British, Catalan, Cornish, Czech, Danish, Dutch, Emilian-Romagnol, Filipino, Finnish, French, German, Greenlandic, Hungarian, Javanese, Karakalpak, Kurdish, Modern Latin, Luxembourgish, Norwegian, Oromo, Papiamento, Polish, Portuguese, Quechua, Rhaeto-Romance, Romanian, Slovak, Spanish, Sundanese, Swedish, Tswana, Uyghur, Venda, Võro, Walloon, West Frisian, Xhosa, Zhuang, and Zulu alphabets include all 26 letters, ''at least'' in their largest version. Among alphabets for constructed languages, the Interglossa and Occidental alphabets include all 26 letters. The International Phonetic Alphabet (IPA) includes all 26 letters in their lowercase forms, although ''g'' is always single-storey (''ɡ'') in the IPA and never double-storey.


Alphabets that do not contain all ISO basic Latin letters

This list is based on official definitions of each alphabet. However, excluded letters might occur in non-integrated loan words and place names. The letter I is used in two distinct versions in Turkic languages: dotless (I ı) and dotted (İ i). They are considered different letters, and case conversion must take care to preserve the distinction. Irish traditionally does not write the dot, or tittle, over the small letter ''i'', but the language makes no distinction here if a dot is displayed, so no specific encoding or special case conversion rule is needed, as it is for Turkic alphabets.
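The case-conversion caveat for Turkic alphabets can be illustrated with a minimal Python sketch. The mapping tables and the function names `turkic_lower` and `turkic_upper` are assumptions made up for this illustration; they are not part of any standard library, and a production system would more likely rely on a locale-aware library.

```python
# Illustrative sketch only: these tables and function names are hypothetical.
# Dotless pair: I (U+0049) <-> ı (U+0131); dotted pair: İ (U+0130) <-> i (U+0069).
_TURKIC_LOWER = {"I": "ı", "İ": "i"}
_TURKIC_UPPER = {"ı": "I", "i": "İ"}


def turkic_lower(text: str) -> str:
    """Lowercase text while keeping dotted and dotless I distinct."""
    return "".join(_TURKIC_LOWER.get(ch, ch.lower()) for ch in text)


def turkic_upper(text: str) -> str:
    """Uppercase text while keeping dotted and dotless I distinct."""
    return "".join(_TURKIC_UPPER.get(ch, ch.upper()) for ch in text)


if __name__ == "__main__":
    print(turkic_upper("istanbul"))  # dotted i becomes dotted capital İ
    print(turkic_lower("ISI"))       # dotless capital I becomes dotless ı
    print("ISI".lower())             # default conversion merges the two letters
```

Python's built-in `str.lower()` and `str.upper()` are locale-independent, so the default conversion of "ISI" yields "isi" and the dotless letters are lost; the sketch avoids that by consulting the Turkic pairs first.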


Statistics

The chart above lists a variety of alphabets that do not officially contain all 26 letters of the ISO basic Latin alphabet. In this list, one letter is used by all of them: A. For each of the 26 letters of the ISO basic Latin alphabet, the number of alphabets in the list above using it is as follows:


Letters not contained in the ISO basic Latin alphabet

Some languages have extended the Latin alphabet with ligatures, modified letters, or digraphs. These symbols are listed below.
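As a rough illustration of how these kinds of additional letters differ at the encoding level, the following Python sketch uses the standard `unicodedata` module to check whether a letter is a single code point with no canonical decomposition, a precomposed letter–diacritic combination, or a combination that has no precomposed form. The function name `classify` and its three labels are assumptions for this example, and the Unicode classification does not necessarily match how a given alphabet regards the letter.

```python
import unicodedata


def classify(letter: str) -> str:
    """Roughly classify ONE letter (a base character plus optional marks).

    Hypothetical helper for illustration; multi-letter input such as a
    digraph is out of scope and would be misreported.
    """
    composed = unicodedata.normalize("NFC", letter)
    decomposed = unicodedata.normalize("NFD", letter)
    if len(composed) > 1:
        # Even after composition, a combining mark remains separate.
        return "no precomposed form"
    if len(decomposed) > 1:
        # One code point that canonically splits into base + mark(s).
        return "precomposed"
    # One code point with no canonical decomposition.
    return "atomic"


if __name__ == "__main__":
    for sample in ["ñ", "ø", "ß", "æ", "\u025b\u0308"]:
        print(sample, classify(sample))
```

Under this check, letters such as ø, ß and æ come out as atomic, ñ as precomposed, and ɛ̈ (ɛ followed by a combining diaeresis) as having no precomposed form.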


Additional letters by type


Independent letters and ligatures


Letter–diacritic combinations: connected or overlaid


Other letters in collation order

The tables below are a work in progress. Eventually, table cells with light blue shading will indicate letter forms that do not constitute distinct letters in their associated alphabets. For the order in which the characters are sorted in each alphabet, see collating sequence.


Letters derived from A–H


Letters derived from I–O


Letters derived from P–Z


Notes

# In classical Latin, the digraphs ''CH'', ''PH'', ''RH'', ''TH'' were used in loanwords from Greek, but they were not included in the alphabet. The ligatures ''Æ'', ''Œ'' and ''W'', as well as lowercase letters, were added to the alphabet only in the Middle Ages. The letters ''J'' and ''U'' were used as typographical variants of ''I'' and ''V'', respectively, roughly until the Enlightenment.
# In Afrikaans, ''C'' and ''Q'' are only (and ''X'' and ''Z'' almost only) used in loanwords.
# Albanian officially has the digraphs ''dh, gj, ll, nj, rr, sh, th, xh, zh'', which are sufficient to represent the
Tosk dialect. The Gheg
dialect supplements the official alphabet with 6 nasal vowels, namely â, ê, î, ô, û, ŷ.
# Arbëresh officially has the digraphs ''dh, gj, hj, ll, nj, rr, sh, th, xh, zh''. Arbëresh has the distinctive ''hj'', which is considered a letter in its own right.
# Achomi also has the digraph ''A'''.
# A with diaeresis is used only as a replacement character for schwa if the latter cannot be used (it was replaced by the schwa one year later because it is the most common letter). Such substitutions should be avoided. The letters ''W'', ''Ð'', ''Ŋ'', ''Q̇'', ''Ć'' (or the digraph ''ts''), and the digraph ''dz'' are only used in certain dialects.
# Basque has several digraphs: ''dd, ll, rr, ts, tt, tx, tz''. The ü, which is pronounced as /ø/, is required for various words in its Zuberoan dialect. ''C, Ç, Q, V, W, Y'' are used in foreign words, but are officially considered part of the alphabet (except Ç, which is considered a variant of C).
# Bambara also has the digraphs ''kh'' (only present in loanwords) and ''sh'' (also written as ''ʃ''; only present in some dialects). Historically, ''è'' was used instead of ''ɛ'', ''ny'' was used instead of ''ɲ'', and ''ò'' was used instead of ''ɔ'' in Mali.
# Belarusian also has several digraphs: ''ch, dz, dź, dž''.
# Bislama also has the digraph ''ng''.
# Breton also has the digraphs ''ch, c'h, zh''. ''C, Q, X'' are used in foreign words or digraphs only.
# Catalan also has a large number of digraphs: ''dj, gu, gü, ig, ix, ll, l·l, ny, qu, qü, rr, ss, tg, tj, ts, tx, tz''. The letters ''K'', ''Ñ'', ''Q'', ''W'', and ''Y'' are only used in loanwords or the digraphs mentioned.
# The alphabet of Chad also uses the unique letters ''N̰n̰'' and ''R̰r̰''.
# Chamorro also has the digraphs ''ch, ng''. ''C'' is used only in digraphs.
# Corsican has the trigraphs: ''chj, ghj''.
# Croatian Gaj's alphabet also has the digraphs: ''dž'', ''lj'', ''nj''. There are also four tone markers that are sometimes used on vowels to avoid ambiguity in
homophones, but this is generally uncommon. Gaj's alphabet has been adopted by the Serbian and Bosnian standards and has complete one-to-one congruence with Serbian Cyrillic
, where the three digraphs map to Cyrillic letters ''џ'', ''љ'' and ''њ'', respectively. Rarely and non-standardly, the digraph ''dj'' is used instead of ''đ'' (Cyrillic ''ђ''), as it was previously. The Montenegrin variant additionally uses ''Ś'' and ''Ź'' to indicate dialectal pronunciation.
# Cypriot Arabic also has the letters ''Θ'' and ''Δ''.
# Czech also has the digraph ''ch'', which is considered a separate letter and is sorted between ''h'' and ''i''. While ''á, ď, é, ě, í, ň, ó, ť, ú, ů'', and ''ý'' are considered separate letters, in collation they are treated merely as letters with diacritics. However, ''č, ř, š'', and ''ž'' are actually sorted as separate letters. ''Q, W, X'' occur only in loanwords.
# Dakelh also contains the apostrophe ('), which represents the glottal stop. The letters ''F'', ''P'', ''R'', and ''V'' are only used in loanwords.
# The Norwegian alphabet is currently identical with the Danish alphabet. ''C'' is part of both alphabets and is not used in native Danish or Norwegian words (except some proper names), but occurs quite frequently in well-established loanwords in Danish. Norwegian and Danish use é in some words such as ''én'', although é is considered an ''e'' with a diacritic mark, while å, æ and ø are letters. ''Q, W, X, Z'' are not used except for names and some foreign words.
# Dinka also has the digraphs: ''dh'', ''nh'', ''ny'', ''th''. ''H'' is only present in these digraphs. Dinka also used the letters Ää, Ëë, Ïï, Öö, Ɛ̈ɛ̈, Ɔ̈ɔ̈ (the last two of which do not exist as precomposed characters in Unicode).
# The status of ''ij'' as a letter, ligature or digraph in Dutch is disputed. ''C'' (outside the digraph ''ch''), ''Q'', ''X'', and ''Y'' occur mostly in foreign words. Letters with grave and letters with circumflex occur only in loanwords.
# English generally now uses extended Latin letters only in loan words, such as fiancé, fiancée, and résumé.
A few publication guides may still use the dieresis on words such as "coöperate", rather than the now-more-common "co-operate" (UK) or "cooperate" (US). For a fuller discussion, see articles branching from Lists of English words of international origin
, which was used to determine the diacritics needed for more unambiguous English. However, an ''é'' or ''è'' is sometimes used in poetry to show that a normally silent vowel is to be pronounced, as in "blessèd".
# Filipino, also known as Tagalog, also uses the digraph ''ng'', originally even with a large tilde that spanned both n and g (as in n͠g) when a vowel follows the digraph. (The use of the tilde over the two letters is now rare.) Only ñ is required for everyday use (only in loanwords). The accented vowels are used in dictionaries to indicate pronunciation, and g with tilde is only present in older works.
# Uppercase diacritics in French are often (incorrectly) thought of as being optional, but the official rules of French orthography designate accents on uppercase letters as obligatory in most cases. Many pairs or triplets are read as digraphs or trigraphs depending on context, but are not treated as such lexicographically: consonants ''ph, (ng), th, gu/gü, qu, ce, ch/(sh/sch), rh''; oral vowels ''(ee), ai/ay, ei/ey, eu, au/eau, ou''; nasal vowels ''ain/aim, in/im/ein, un/um/eun, an/am, en/em, om/on''; the half-consonant ''-(i)ll-''; half-consonant and vowel pairs ''oi, oin/ouin, ien, ion''. When rules that govern the French orthography are not observed, they are read as separate letters, or using an approximating phonology of a foreign language for loan words, and there are many exceptions. In addition, most final consonants are silent (including those consonants that are part of feminine, plural, and conjugation endings). Y with diaeresis and U with diaeresis are only used in certain geographical names and proper names plus their derivatives, or, in the case of U with diaeresis, newly proposed reforms. For example, ''capharnaüm'' ('shambles') is derived from the proper name Capharnaüm. ''Æ'' occurs only in Latin or Greek loanwords.
# Fula has ''X'' as part of the alphabet in all countries except Guinea, Guinea-Bissau, Liberia, and Sierra Leone (used only in loanwords in these countries). ''Ɠ'', which is used only in loanwords (but still part of the alphabet), is used in Guinea only. Fula also uses the digraphs ''mb'' (in Guinea spelled ''nb''), ''nd'', ''ng'', and ''nj''. ''aa'', ''ee'', ''ii'', ''oo'', and ''uu'' are part of the alphabet in all countries except Guinea, Guinea-Bissau, Liberia, and Sierra Leone. ''Ƴ'' is used in all countries except for Nigeria, where it is written '''y''. ''Ŋ'' is used in all countries except for Nigeria. ''Ɲ'' is used in Guinea, Mali, and Burkina Faso, ''Ñ'' is used in Senegal, Gambia, Mauritania, Guinea-Bissau, Liberia, and Sierra Leone, and the digraph ''ny'' is used in Niger, Cameroon, Chad, Central African Republic, and Nigeria. The apostrophe is a letter (representing the glottal stop) in Guinea-Bissau, Liberia, and Sierra Leone. ''Q'', ''V'', and ''Z'' are only used in loanwords, and are not part of the alphabet.
# Galician: the standard of 1982 also set the digraphs ''gu'', ''qu'' (both always before ''e'' and ''i''), ''ch, ll, nh'' and ''rr''. In addition, the standard of 2003 added the grapheme ''ao'' as an alternative writing of ''ó''. Although not marked (or forgotten) in the list of digraphs, they are used to represent the same sound, so the sequence ''ao'' should be considered as a digraph. The sequence ''nh'' represents a
velar nasal (not a palatal as in Portuguese) and is restricted to three feminine words, each either a demonstrative or a pronoun
: ''unha'' ('a' and 'one'), ''algunha'' ('some') and ''ningunha'' ('not one'). The Galician ''reintegracionismo'' movement uses it as in Portuguese. ''J'' (outside of the Limia Baixa region), ''K'', ''W'', and ''Y'' are only used in loanwords, and are not part of the alphabet.
# German also retains most original letters in French loan words. Swiss German does not use ''ß'' any more. The
long s
''(ſ)'' was in use until the mid-20th century. ''Sch'' is usually not treated as a true trigraph, nor are ''ch'', ''ck'', ''st'', ''sp'', ''th'', (''ph'', ''rh'') and ''qu'' treated as digraphs. ''Q'' only appears in the sequence ''qu'' and in loanwords, while ''x'' and ''y'' are found almost only in loan words. The capital ''ß'' (''ẞ'') is almost never used. The accented letters (other than the letters ''ä'', ''ö'', ''ü'', and ''ß'') are used only in loanwords.
# Guaraní also uses the digraphs ''ch, mb, nd, ng, nt, rr'' and the glottal stop, written with an apostrophe ('). ''B'', ''C'', and ''D'' are only used in these digraphs.
# Gwich'in also contains the apostrophe ('), which represents the glottal stop. Gwich'in also uses the letters ''Ą̀'', ''Ę̀'', ''Į̀'', ''Ǫ̀'', and ''Ų̀'', which are not available as precomposed characters in Unicode. Gwich'in also uses the digraphs and trigraphs: ''aa, ąą, àà, ą̀ą̀, ch, ch', ddh, dh, dl, dr, dz, ee, ęę, èè, ę̀ę̀, gh, ghw, gw, ii, įį, ìì, į̀į̀, kh, kw, k', nd, nh, nj, oo, ǫǫ, òò, ǫ̀ǫ̀, rh, sh, shr, th, tl, tl', tr, tr', ts, ts', tth, tth', t', uu, ųų, ùù, ų̀ų̀, zh, zhr''. The letter ''C'' is only used in the digraphs above. The letters ''B'', ''F'', and ''M'' are only used in loanwords.
# Hausa has the digraphs: ''sh, ts''. Vowel length and tone are usually not marked. Textbooks usually use a macron or doubled vowel to mark length, a grave to mark the low tone and a circumflex to mark the falling tone. Therefore, in some systems, a macron may be used in combination with a grave or circumflex over a, e, i, o or u. The letter ''P'' is only used in loanwords.
# Hungarian also has the digraphs: ''cs, dz, gy, ly, ny, sz, ty, zs''; and the trigraph: ''dzs''. The letters ''á, é, í, ó, ő, ú'', and ''ű'' are considered separate letters, but are collated as variants of a, e, i, o, ö, u, and ü.
# Irish formerly used the dot diacritic in ''ḃ, ċ, ḋ, ḟ, ġ, ṁ, ṗ, ṡ, ṫ''.
These have been replaced by the digraphs ''bh, ch, dh, fh, gh, mh, ph, sh, th'', except in formal instances. ''V'' only occurs in onomatopoeia, such as ''vácarnach'', ''vác'', or ''vrác'', or in rare alternative spellings ''víog'' and ''vís'' (usually spelled ''bíog'' and ''bís''), or in loanwords. ''Z'' only occurs in the West Muskerry dialect in the digraph ''zs'' (a rare eclipsis of s, spelled s in other dialects and the language proper) or in loanwords.
# Igbo writes Ṅṅ alternatively as N̄n̄. Igbo has the digraphs: ''ch, gb, gh, gw, kp, kw, nw, ny, sh''. ''C'' is only used in the digraph ''ch''. Also, vowels take a grave accent, an acute accent, or no accent, depending on tone.
# Italian also has the digraphs: ''ch, gh, gn, gl, sc''. J, K, W, X, Y are used in foreign words, and are not part of the alphabet. X is also used for native words derived from Latin and Greek; J is also used for just a few native words, mainly names of persons (as in Jacopo) or of places (as in
Jesolo and Jesi
), in which it is always pronounced as the letter I. While it does not occur in ordinary running texts, geographical names on maps are often written only with acute accents. The circumflex is used on an -i ending that was anciently written -ii (or -ji, -ij, -j, etc.) to distinguish homograph plurals and verb forms: e.g. principî from principi, genî from geni.
# Karakalpak also has the digraphs: ''ch, sh''. ''C, F, V'' are used in foreign words.
# Kazakh also has the digraphs: ''ia, io, iu''. ''F, H, V'' and the digraph ''io'' are used in foreign words.
# Latvian also has the digraphs: ''dz, dž, ie''. ''Dz'' and ''dž'' are occasionally considered separate letters of the alphabet in more archaic examples, which have been published as recently as the 1950s; however, modern alphabets and teachings discourage this due to an ongoing effort to set decisive rules for Latvian and eliminate barbaric words accumulated during the Soviet occupation. The digraph "ie" is never considered a separate letter. ''Ō'', ''Ŗ'', and the digraphs ''CH'' (only used in loanwords) and ''UO'' are no longer part of the alphabet, but are still used in certain dialects and newspapers that use the old orthography. ''Y'' is used only in certain dialects and not in the standard language. ''F'' and ''H'' are only used in loanwords.
# A nearby language, Pite Sami, uses Lule Sami orthography but also includes the letters ''Đđ'' and ''Ŧŧ'', which are not in Lule Sami.
# Lithuanian also has the digraphs: ''ch, dz, dž, ie, uo''. However, these are not considered separate letters of the alphabet. ''F'', ''H'', and the digraph ''CH'' are only used in loanwords. Demanding publications such as dictionaries, maps, and schoolbooks need additional diacritical marks to differentiate homographs: the grave accent on A, E, I, O, U, the acute accent on all vowels, and the tilde on all vowels and on L, M, N and R.
Small E and I (also with ogonek) must retain the dot when an additional accent mark is added to the character; the use of ì and í (with missing dot) is considered unacceptable.
# In Livonian, the letters ''Ö, Ȫ, Y, Ȳ'' were used by the older generation, but the younger generation merged these sounds; around the late 1990s, these letters were removed from the alphabet.
# Marshallese often uses the old orthography (because people did not approve of the new orthography), which writes ''ļ'' as l, ''m̧'' as m, ''ņ'' as n, ''p'' as b, ''o̧'' as o at the ends of words or in the word ''yokwe'' (also spelled iakwe under the old orthography; under the new orthography, spelled io̧kwe), but a at other places, and ''d'' as dr before vowels, or r after vowels. The old orthography writes ''ā'' as e in some words, but ā in others; it also writes ''ū'' as i between consonants. The old orthography writes geminates and long vowels as two letters instead. Certain allophones, written only as e, o, ō in the new orthography, are also written as i, u and, very rarely, ū. The letter ''Y'' only occurs in the word ''yokwe'' or the phrase ''yokwe yuk'' (also spelled iakwe iuk in the old orthography or io̧kwe eok in the new orthography).
# Maltese also has the digraphs: ''ie, għ''.
# Māori uses ''g'' only in the ''ng'' digraph. ''Wh'' is also a digraph.
# Some Mohawk speakers use orthographic ''i'' in place of the consonant ''y''. The glottal stop is indicated with an apostrophe ''’'' and long vowels are written with a colon '':''.
# Na'vi uses the letter ''ʼ'' and the digraphs ''aw'', ''ay'', ''ew'', ''ey'', ''kx'', ''ll'', ''ng'' (sometimes written as ''G''), ''px'', ''rr'', ''ts'' (sometimes written as ''C''), ''tx''. ''G'' (in standard orthography) and ''X'' are used only in digraphs.
# Massachusett also uses the digraphs ''ch, ee, sh, ty'' and the letter ''8'' (which was previously written ''oo''). ''C'' is only used in the digraph ''ch''.
# Oromo uses the following digraphs: ''ch, dh, ny, ph, sh''.
''P'' is only used in the digraph ''ph'' and loanwords. ''V'' and ''Z'' are only used in loanwords.
# Papiamento also has the digraphs: ''ch, dj, sh, zj''. ''Q'' and ''X'' are only used in loanwords and proper names. ''J'' is only used in digraphs, loanwords, and proper names. Papiamentu in Bonaire and Curaçao differs from Papiamento in Aruba in the following ways: Papiamento in Aruba uses a more etymological spelling, so it uses ''C'' in native words outside of the digraph ''ch'', but Papiamentu in Bonaire and Curaçao does not. Papiamentu in Bonaire and Curaçao uses ''È'', ''Ò'', ''Ù'', and ''Ü'' for various sounds and ''Á'', ''É'', ''Í'', ''Ó'', and ''Ú'' for stress, but Papiamento in Aruba does not use these letters.
# Piedmontese also uses the letter ''n-'' to indicate a velar nasal N-sound (pronounced like the ending of English ''going''), which usually precedes a vowel, as in ''lun-a'' (moon).
# Pinyin has four tone markers that can go on top of any of the six vowels ''(a, e, i, o, u, ü)'': macron ''(ā, ē, ī, ō, ū, ǖ)'', acute accent ''(á, é, í, ó, ú, ǘ)'', caron ''(ǎ, ě, ǐ, ǒ, ǔ, ǚ)'', grave accent ''(à, è, ì, ò, ù, ǜ)''. It also uses the digraphs: ''ch, sh, zh''.
# Polish also has the digraphs: ''ch, cz, dz, dż, dź, sz, rz''. ''Q, V, X'' occur only in loanwords, and are sometimes not considered part of the alphabet.
# Portuguese also uses the digraphs ''ch, lh, nh, rr, ss''. The trema on ''ü'' was used in
Brazilian Portuguese
before 2009, and in Portuguese in Portugal before 1990. The grave accent was used on e, i, o, and u, until 1973. The letters ''è'' and ''ò'' are used in geographical names outside Europe and are not part of the language proper. The now-abandoned practice was to indicate underlying stress in words ending in -mente (sòmente, ùltimamente, etc.). Neither the digraphs nor accented letters are considered part of the alphabet. ''K'', ''W'', and ''Y'' occur only in loanwords, and were not letters of the alphabet until 2009, but these letters were used before 1911.
# Romanian normally uses a comma diacritic below the letters ''s'' and ''t'' (''ș, ț''), but it is frequently replaced with an attached cedilla below these letters (''ş, ţ'') due to past lack of standardization. ''K'', ''Q'', ''W'', ''X'', and ''Y'' occur only in loanwords.
# Romani has the digraphs: ''čh, dž, kh, ph, th''.
# Slovak also has the digraphs ''dz, dž,'' and ''ch,'' which are considered separate letters. While ''á, ä, ď, é, í, ĺ, ň, ó, ô, ŕ, ť, ú,'' and ''ý'' are considered separate letters, in collation they are treated merely as letters with diacritics. However, ''č, ľ, š,'' and ''ž'', as well as the digraphs, are actually sorted as separate letters. ''Q, W, X, Ö, Ü'' occur only in loanwords.
# ''Sorbian'' also uses the digraphs: ''ch'', ''dź''. ''Ř'' is only used in Upper Sorbian, and ''Ŕ'', ''Ś'', and ''Ź'' (outside the digraph ''dź'') are only used in Lower Sorbian.
# Spanish uses several digraphs to represent single sounds: ''ch'', ''gu'' (preceding ''e'' or ''i''), ''ll'', ''qu'', ''rr''. Of these, the digraphs ''ch'' and ''ll'' were traditionally considered individual letters with their own name (''che'', ''elle'') and place in the alphabet (after ''c'' and ''l'', respectively). To facilitate international compatibility, the Royal Spanish Academy decided to cease this practice in 1994, and all digraphs are now collated as combinations of two separate characters.
While ''cedilla'' is etymologically the Spanish diminutive of ''ceda'' (z) and ''Sancho Pança'' is the original form in Cervantes's books, C with cedilla (''ç'') has now been completely displaced by z in the contemporary language. In poetry, the diaeresis may be used to break a diphthong into separate vowels. Regarding that usage, ''Ortografía de la lengua española'' states that "diaeresis is usually placed over the closed vowel [i.e. 'i' or 'u'] and, when both are closed, generally over the first". In this context, the use of ï is rare, but part of the normative orthography.
# Swedish uses é in well-integrated loanwords like ''idé'' and ''armé'', although é is considered a modified e, while å, ä, ö are letters. The letters á and à are rarely used. W and z are used in some integrated words like
webb
and zon. Q, ü, è, and ë are used for names only, but exist in Swedish names. For foreign names ó, ç, ñ and more are sometimes used, but usually not. Swedish has many digraphs and some trigraphs. ''ch'', ''dj'', ''lj'', ''rl'', ''rn'', ''rs'', ''sj'', ''sk'', ''si'', ''ti'', ''sch'', ''skj'', ''stj'' and others are usually pronounced as one sound.
# Tswana also has the digraphs: ''kg, kh, ng, ph, th, tl, tlh, ts, tsh, tš, tšh''. The letters ''C'', ''Q'', and ''X'' only appear in onomatopoeia and loanwords. The letters ''V'' and ''Z'' only appear in loanwords.
# Turkmen had a slightly different alphabet in 1993–1995, which used some unusual letters: ''Ýý'' was written as ''¥ÿ'', ''Ňň'' as ''Ññ'', ''Şş'' as ''$¢'', and ''Žž'' as ''£⌠'' (so that all characters were available in Code page 437). In the new alphabet, all characters are available in ISO/IEC 8859-2.
# Ulithian also has the digraphs: ''ch'', ''l''', ''mw'', ''ng''. ''C'' is used only in digraphs.
# Uzbek also has the digraphs ''ch, ng, sh'', which are considered letters. ''C'' is used only in digraphs. ''G', O''' and apostrophe (') are considered letters. These letters have preferred typographical variants: Gʻ, Oʻ and ʼ respectively.
# Venda also has the digraphs and trigraphs: ''bv, bw, dz, dzh, dzw, fh, hw, kh, khw, ng, ny, nz, ṅw, ph, pf, pfh, sh, sw, th, ts, tsh, tsw, ty, ṱh, vh, zh, zw''. ''C, J, Q'' are used in foreign words.
# Vietnamese has seven additional base letters: ''ă â đ ê ô ơ ư''. It uses five tone markers that can go on top of (or below) any of the 12 vowels ''(a, ă, â, e, ê, i, o, ô, ơ, u, ư, y)'': grave accent ''(à, ằ, ầ, è, ề, ì, ò, ồ, ờ, ù, ừ, ỳ)'', hook above ''(ả, ẳ, ẩ, ẻ, ể, ỉ, ỏ, ổ, ở, ủ, ử, ỷ)'', tilde ''(ã, ẵ, ẫ, ẽ, ễ, ĩ, õ, ỗ, ỡ, ũ, ữ, ỹ)'', acute accent ''(á, ắ, ấ, é, ế, í, ó, ố, ớ, ú, ứ, ý)'', and dot below ''(ạ, ặ, ậ, ẹ, ệ, ị, ọ, ộ, ợ, ụ, ự, ỵ)''.
It also uses several digraphs and trigraphs (''ch, gh, gi, kh, ng, ngh, nh, ph, th, tr''), but they are no longer considered letters.
# Walloon has the digraphs and trigraphs: ''ae, ch, dj, ea, jh, oe, oen, oi, sch, sh, tch, xh''. The letter ''X'' outside the digraph ''xh'' is in some orthographies, but not the default two. The letter ''Q'' is in some orthographies (including one default orthography), but not in the other default orthography. Also in some orthographies are ''À'', ''Ì'', ''Ù'', ''Ö'', and even ''E̊'' (which is not available as a precomposed character in Unicode, so ''Ë'' is used as a substitute).
# Welsh has the digraphs ''ch'', ''dd'', ''ff'', ''ng'', ''ll'', ''ph'', ''rh'', ''th''. Each of these digraphs is collated as a separate letter, and ''ng'' comes immediately after ''g'' in the alphabet. It also frequently uses circumflexes, and occasionally uses diaereses, acute accents and grave accents, on its seven vowels (''a'', ''e'', ''i'', ''o'', ''u'', ''w'', ''y''), but accented characters are not regarded as separate letters of the alphabet.
# Xhosa uses a large number of digraphs, trigraphs, and even one tetragraph to represent various phonemes: ''bh, ch, dl, dy, dz, gc, gq, gr, gx, hh, hl, kh, kr, lh, mb, mf, mh, nc, ndl, ndz, ng, ng', ngc, ngh, ngq, ngx, nh, nkc, nkq, nkx, nq, nx, ntl, ny, nyh, ph, qh, rh, sh, th, ths, thsh, ts, tsh, ty, tyh, wh, xh, yh, zh''. It also occasionally uses acute accents, grave accents, circumflexes, and diaereses on its five vowels ''(a, e, i, o, u)'', but accented characters are not regarded as separate letters of the alphabet.
# Yapese has the digraphs and trigraphs: ''aa, ae, ch, ea, ee, ii, k', l', m', n', ng, ng', oe, oo, p', t', th, th', uu, w', y'''. ''Q'', representing the glottal stop, is not always used. Often an apostrophe is used to represent the glottal stop instead. ''C'' is used only in digraphs. ''H'' is used only in digraphs and loanwords. ''J'' is used only in loanwords.
# Yoruba uses the digraph ''gb''. Also, vowels take a grave accent, an acute accent, or no accent, depending on tone. Although the "dot below" diacritic is widely used, purists prefer a short vertical underbar (Unicode COMBINING VERTICAL LINE BELOW, U+0329), attached to the base of the letter (E, O or S); this resembles the IPA notation for a syllabic consonant. The seven Yoruba vowels (A, E, E underbar, I, O, O underbar, U) can be uttered in three different tones: high (acute accent), middle (no accent) and low (grave accent). The letters M and N, when written without diacritics, indicate nasalisation of the preceding vowel. M and N also occur as syllabics; in these circumstances, they take acute or grave tonal diacritics, like the vowels. Middle tone is marked with a macron to differentiate it from the unmarked nasalising consonants. A tilde was used in older orthography (and is still occasionally used) to indicate a double vowel. This is tonally ambiguous, and has now been replaced by showing the paired vowels, each marked with the appropriate tones. However, where a double vowel has the tonal sequence high-low or low-high, it may optionally be replaced by a single vowel with a circumflex (high-low) or caron (low-high), e.g. á + à = â; à + á = ǎ.
# Zuni contains the glottal stop ''''' and the digraph ''ch''; ''C'' is only used in that digraph. The other digraphs ''kw'', ''sh'', and ''ts'' are not part of the alphabet.
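
Several of the notes above depend on how accented letters are encoded in Unicode: the Walloon ''E̊'' and the Yoruba underbar vowels have no precomposed character, the Vietnamese tone-marked vowels do, and the Romanian comma-below letters are distinct from their cedilla look-alikes. The following is a minimal Python sketch of those points, not a definitive tool; the helper name nfc and the variable names are illustrative assumptions, and only the standard-library unicodedata module is used.

```python
import unicodedata


def nfc(text: str) -> str:
    """Return the NFC (canonically composed) form of text."""
    return unicodedata.normalize("NFC", text)


# Vietnamese: e + combining circumflex + combining dot below.
# A precomposed letter exists, so NFC yields a single code point.
vietnamese = nfc("e\u0302\u0323")

# Walloon: E + combining ring above. No precomposed letter exists,
# so the sequence stays as two code points after NFC.
walloon = nfc("E\u030A")

# Yoruba: e + combining vertical line below (U+0329), the underbar
# preferred by purists. This also stays as two code points.
yoruba = nfc("e\u0329")

# Romanian: s with comma below versus s with cedilla. These are
# separate characters, and normalization does not unify them.
s_comma = "\u0219"
s_cedilla = "\u015F"

for label, text in [("Vietnamese", vietnamese),
                    ("Walloon", walloon),
                    ("Yoruba", yoruba)]:
    names = [unicodedata.name(ch) for ch in text]
    print(label, len(text), names)

print("cedilla normalizes to comma:", nfc(s_cedilla) == s_comma)
```

Because normalization leaves the Romanian pair distinct, software that wants to treat ''ş'' and ''ș'' as equivalent would need its own explicit mapping.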


Miscellanea

* Africa Alphabet
* African reference alphabet
* Beghilos
* Dinka alphabet
* Gaj's Latin alphabet, the only script of the Croatian and Bosnian standard languages in current use, and one of the two scripts of the Serbian standard language
* Hawaiian alphabet
* Initial Teaching Alphabet
* International Phonetic Alphabet
* Łatynka for Ukrainian
* Leet (1337 alphabet)
* Romanization schemes
* Romani alphabet for most Romani languages
* Sámi Latin alphabet
* Standard Alphabet by Lepsius
* Tatar alphabet, similar to the Turkish alphabet and Jaꞑalif as a part of the Uniform Turkic alphabet
* Uralic Phonetic Alphabet


See also

* Diacritic
* Latin-script alphabet
* Latin-script multigraph
* Latin characters in Unicode
* List of Latin letters
* List of precomposed Latin characters in Unicode
* Romanization
* Typographical ligature
* Writing systems of Africa


Footnotes


External links

* Michael Everson's Alphabets of Europe
* Typo.cz Information on Central European typography and typesetting
* Letter database of the Institute of Estonian Language
* Diacritics Project – All you need to design a font with correct accents