ISO Latin Alphabet

The ISO basic Latin alphabet is an international standard (beginning with ISO/IEC 646) for a Latin-script alphabet that consists of two sets (uppercase and lowercase) of 26 letters, codified in various national and international standards and used widely in international communication. They are the same letters that make up the current English alphabet. Since medieval times, they have also been the letters of the modern Latin alphabet. The order is also important for sorting words into alphabetical order. The two sets contain the following 26 letters each:

Uppercase: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Lowercase: a b c d e f g h i j k l m n o p q r s t u v w x y z


History

By the 1960s it had become apparent to the computer and telecommunications industries in the First World that a non-proprietary method of encoding characters was needed. The International Organization for Standardization (ISO) encapsulated the Latin script in its ISO/IEC 646 7-bit character-encoding standard. To achieve widespread acceptance, this encapsulation was based on popular usage. The standard was based on the already published American Standard Code for Information Interchange, better known as ASCII, which included in its character set the 26 × 2 letters of the English alphabet. Later standards issued by the ISO, for example ISO/IEC 8859 (8-bit character encoding) and ISO/IEC 10646 (Unicode Latin), have continued to define the 26 × 2 letters of the English alphabet as the basic Latin script, with extensions to handle other letters in other languages.


Terminology

The Unicode block that contains the alphabet is called "C0 Controls and Basic Latin". Two subheadings exist:
* "Uppercase Latin alphabet": the letters start at U+0041 and contain the string LATIN CAPITAL LETTER in their descriptions
* "Lowercase Latin alphabet": the letters start at U+0061 and contain the string LATIN SMALL LETTER in their descriptions

There are also another two sets in the Halfwidth and Fullwidth Forms block:
* Uppercase: the letters start at U+FF21 and contain the string FULLWIDTH LATIN CAPITAL LETTER in their descriptions
* Lowercase: the letters start at U+FF41 and contain the string FULLWIDTH LATIN SMALL LETTER in their descriptions
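
The starting code points and description strings given above can be checked against the Unicode character database. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is hypothetical and chosen only for illustration.

```python
import unicodedata


def describe(start, count=26):
    """Return the Unicode names of `count` consecutive code points from `start`."""
    return [unicodedata.name(chr(start + i)) for i in range(count)]


# Starting code points as stated in the text.
upper = describe(0x0041)       # Basic Latin, uppercase
lower = describe(0x0061)       # Basic Latin, lowercase
full_upper = describe(0xFF21)  # Halfwidth and Fullwidth Forms, uppercase
full_lower = describe(0xFF41)  # Halfwidth and Fullwidth Forms, lowercase

print(upper[0])        # LATIN CAPITAL LETTER A
print(full_lower[-1])  # FULLWIDTH LATIN SMALL LETTER Z
```

Each list holds 26 names, one per letter, so the same helper covers all four sets.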


Timeline for encoding standards

* 1865: International Morse Code was standardized at the International Telegraphy Congress in Paris, and was later made the standard by the International Telecommunication Union (ITU)
* 1950s: Radiotelephony Spelling Alphabet by ICAO


Timeline for widely used computer codes supporting the alphabet

* 1963: ASCII (7-bit character-encoding standard from the American Standards Association, which became the American National Standards Institute in 1969)
* 1963/1964: EBCDIC (developed by IBM and supporting the same alphabetic characters as ASCII, but with different code values)
* 1965-04-30: ECMA-6 (ratified by ECMA, based on work that ECMA's Technical Committee TC1 had carried out since December 1960)
* 1972: ISO 646 (ISO 7-bit character-encoding standard, using the same alphabetic code values as ASCII, revised in second edition ISO 646:1983 and third edition ISO/IEC 646:1991 as a joint ISO/IEC standard)
* 1983: ITU-T Rec. T.51, ISO/IEC 6937 (a multi-byte extension of ASCII)
* 1987: ISO/IEC 8859-1:1987 (8-bit character encoding)
** Subsequently, other versions and parts of ISO/IEC 8859 have been published.
* Mid-to-late 1980s: Windows-1250, Windows-1252, and other encodings used in Microsoft Windows (some roughly similar to ISO/IEC 8859-1)
* 1990: Unicode 1.0 (developed by the Unicode Consortium), contained in the block "C0 Controls and Basic Latin" using the same alphabetic code values as ASCII and ISO/IEC 646
** Subsequently, other versions of Unicode have been published and it later became a joint ISO/IEC standard as well, as identified below.
* 1993: ISO/IEC 10646-1:1993, ISO/IEC standard for characters in Unicode 1.1
** Subsequently, other versions of ISO/IEC 10646-1 and one of ISO/IEC 10646-2 have been published. Since 2003, the standards have been published under the name "ISO/IEC 10646" without the separation into two parts.
* 1997: Windows Glyph List 4
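
The timeline notes that EBCDIC supports the same letters as ASCII with different code values, while the later ISO, Windows, and Unicode encodings keep the ASCII values. A rough sketch of that difference follows, using codec names from Python's standard library. Treating `cp037` as a representative EBCDIC code page is an assumption made here, since EBCDIC exists in many variants; the helper `letter_bytes` is likewise hypothetical.

```python
# Codec names are those of Python's standard library.
# "cp037" is one EBCDIC code page, picked for illustration only.
CODECS = ["ascii", "latin-1", "cp1252", "utf-8", "cp037"]


def letter_bytes(letter):
    """Map each codec name to the hex form of `letter` encoded in it."""
    return {codec: letter.encode(codec).hex() for codec in CODECS}


for letter in "AZaz":
    print(letter, letter_bytes(letter))
```

For every basic letter, the first four codecs should agree on a single byte, and only the EBCDIC code page should differ.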


Representation

In ASCII the letters belong to the printable characters, and in Unicode since version 1.0 they belong to the block "C0 Controls and Basic Latin". In both cases, as well as in ISO/IEC 646, ISO/IEC 8859 and ISO/IEC 10646, they occupy the hexadecimal positions 41 to 5A for uppercase and 61 to 7A for lowercase. Regardless of case, all letters have code words in the ICAO spelling alphabet and can be represented with Morse code.
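
Because the two ranges given above are contiguous and lie exactly 0x20 apart, membership tests and alphabet positions reduce to simple arithmetic on code points. The sketch below illustrates this; the function names are hypothetical and not taken from any standard.

```python
UPPER_FIRST, UPPER_LAST = 0x41, 0x5A  # 'A'..'Z'
LOWER_FIRST, LOWER_LAST = 0x61, 0x7A  # 'a'..'z'


def is_basic_latin_letter(ch):
    """True if `ch` is one of the 26 x 2 letters of the ISO basic Latin alphabet."""
    cp = ord(ch)
    return UPPER_FIRST <= cp <= UPPER_LAST or LOWER_FIRST <= cp <= LOWER_LAST


def alphabet_position(ch):
    """Return the 1-based position of a basic Latin letter, ignoring case."""
    if not is_basic_latin_letter(ch):
        raise ValueError(f"not a basic Latin letter: {ch!r}")
    # Setting bit 0x20 maps an uppercase code point onto its lowercase partner.
    return (ord(ch) | 0x20) - LOWER_FIRST + 1


print(alphabet_position("A"), alphabet_position("z"))  # 1 26
```

Letters outside the two ranges, such as "é", are rejected rather than folded onto a base letter.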


Usage

All of the lowercase letters are used in the International Phonetic Alphabet (IPA). In X-SAMPA and SAMPA these letters have the same sound value as in IPA.


Alphabets containing the same set of letters

The list below only includes alphabets that include all the 26 letters but exclude:
* letters whose diacritical marks make them distinct letters.
* multigraphs that constitute distinct letters.
* ligatures that are distinct letters.

Notable omissions due to these rules include Spanish, Esperanto, Filipino and German. The German alphabet is sometimes considered by tradition to contain only 26 letters (with ⟨ä⟩, ⟨ö⟩, ⟨ü⟩ considered variants and ⟨ß⟩ considered a ligature of ⟨ſ⟩ (long s) and ⟨s⟩), but the current German orthographic rules include ⟨ä⟩, ⟨ö⟩, ⟨ü⟩, ⟨ß⟩ in the alphabet, placed after ⟨z⟩. In Spanish orthography, the letters ⟨n⟩ and ⟨ñ⟩ are distinct; the tilde is not considered a diacritic in this case.

* Constructed languages

# English is one of the few modern European languages requiring no diacritics for native words (although a diaeresis is used by some American publishers in words such as "coöperation").
# Interlingua, a constructed language, never uses diacritics except in unassimilated loanwords. However, they can be removed if they are not used to modify the vowel (e.g. cafe, from French café).
# Latino sine flexione, a.k.a. "Peano's Interlingua", allows but does not require the placement of an accent for unusual stress. (It antedates the other "Interlingua" by roughly four decades.)
# Malay and Indonesian (based on Malay) use all the letters of the Latin alphabet and require no diacritics or ligatures. However, Malay and Indonesian learning materials may use ⟨é⟩ (E with acute) to clarify the pronunciation of the letter E; in that case, ⟨e⟩ is pronounced /ə/ while ⟨é⟩ is pronounced /e/ and ⟨è⟩ is pronounced /ɛ/. Many of the 700+ languages of Indonesia also use the Indonesian alphabet to write their languages, some, such as Javanese, adding the diacritics é and è, and some omitting q, x, and z.
# Xhosa is usually written without diacritics, but may optionally use diacritics to mark tones.


Column numbering

The Roman (Latin) alphabet is commonly used for column numbering in a table or chart. This avoids confusion with row numbers using Arabic numerals. For example, a 3-by-3 table would contain columns A, B, and C, set against rows 1, 2, and 3. If more columns are needed beyond Z (normally the final letter of the alphabet), the column immediately after Z is AA, followed by AB, and so on (see bijective base-26 system). This can be seen by scrolling far to the right in a spreadsheet program such as Microsoft Excel or LibreOffice Calc. The letters are often used for indexing nested bullet points. In this case after the 26th it is more common to use AA, BB, CC, ... instead of base-26 numbers.
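
The two labelling schemes described above can be sketched in a few lines. The function names are hypothetical, and this is only an illustration of bijective base-26 numbering, not the code of any particular spreadsheet program. The behaviour of the doubled-letter scheme beyond ZZ (continuing with AAA) is an assumption, since the text only describes AA, BB, CC.

```python
def column_label(n):
    """Convert a 1-based column number to a bijective base-26 label (A, ..., Z, AA, AB, ...)."""
    if n < 1:
        raise ValueError("column numbers start at 1")
    label = ""
    while n > 0:
        # Subtracting 1 first makes the digits run 1..26 instead of 0..25.
        n, rem = divmod(n - 1, 26)
        label = chr(ord("A") + rem) + label
    return label


def column_number(label):
    """Inverse of column_label: 'A' -> 1, 'Z' -> 26, 'AA' -> 27."""
    if not label:
        raise ValueError("empty label")
    n = 0
    for ch in label.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column letter: {ch!r}")
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def bullet_label(n):
    """Doubled-letter scheme for list items: A, ..., Z, AA, BB, ..., ZZ, AAA (assumed), ..."""
    if n < 1:
        raise ValueError("item numbers start at 1")
    repeats, index = divmod(n - 1, 26)
    return chr(ord("A") + index) * (repeats + 1)


print(column_label(27), column_label(28))  # AA AB
print(bullet_label(27), bullet_label(28))  # AA BB
```

The two schemes agree up to item 27 and diverge from item 28 onward, which is the distinction the paragraph draws.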


See also

* Hebrew alphabet
* Greek alphabet
* Latin alphabet
** Latin-script alphabet for the sound correspondence
** List of Latin-script alphabets
* Early Cyrillic alphabet, Cyrillic alphabets
* Windows code pages

