ISO-8859-5

ISO/IEC 8859-5:1999, ''Information technology — 8-bit single-byte coded graphic character sets — Part 5: Latin/Cyrillic alphabet'', is part of the
ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1988. It is informally referred to as Latin/Cyrillic. It was designed to cover languages using a Cyrillic alphabet, such as Bulgarian, Belarusian, Russian, Serbian and Macedonian, but was never widely used. The 8-bit encodings KOI8-R, KOI8-U, IBM-866 and Windows-1251 are far more commonly used. In contrast to the relationship between Windows-1252 and ISO 8859-1, Windows-1251 is not closely related to ISO 8859-5. However, the main Cyrillic block in Unicode uses a layout based on ISO-8859-5.

ISO 8859-5 would also have been usable for Ukrainian in the Soviet Union from 1933 to 1990, but it is missing the Ukrainian letter ''ge'', ґ, which was required in Ukrainian orthography before and after that period, and during it outside Soviet Ukraine. As a result, IBM created Code page 1124.

ISO-8859-5 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. The Windows code page for ISO-8859-5 is code page 28595, a.k.a. Windows-28595. IBM assigned code page 915 to ISO-8859-5 until that code page was extended.
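The relationship to the Unicode Cyrillic block and the missing ґ can be checked with a short Python sketch. It assumes Python's built-in `iso8859_5` codec is a faithful implementation of the standard. The names `OFFSET`, `offset_exceptions` and `can_encode` are illustrative and do not come from the standard.

```python
# Sketch: compare the upper half of ISO-8859-5 with the Unicode Cyrillic block.
CODEC = "iso8859_5"      # Python's built-in codec name for ISO-8859-5
OFFSET = 0x0400 - 0xA0   # assumed constant shift from byte value to code point


def offset_exceptions() -> dict:
    """Return the bytes in 0xA0..0xFF whose code point is not byte + OFFSET."""
    exceptions = {}
    for byte in range(0xA0, 0x100):
        char = bytes([byte]).decode(CODEC)
        if ord(char) != byte + OFFSET:
            exceptions[byte] = char
    return exceptions


def can_encode(text: str) -> bool:
    """Report whether every character of text exists in ISO-8859-5."""
    try:
        text.encode(CODEC)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # List the positions that do not follow the constant shift.
    for byte, char in offset_exceptions().items():
        print(f"0x{byte:02X} -> U+{ord(char):04X}")
    # A word using ї encodes; a word using ґ does not.
    print(can_encode("Україна"), can_encode("ґанок"))
```

Under these assumptions only four positions fall outside the constant shift: the no-break space, the soft hyphen, the numero sign and the section sign. Every letter position maps directly into the U+0400 block.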


Codepage layout

Differences from ISO 8859-1 are shown with their Unicode equivalent code points.
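The upper half of the layout (0xA0 to 0xFF), which is where ISO-8859-5 departs from ISO 8859-1 apart from the no-break space and soft hyphen, can be regenerated as a rough text table. This is a minimal sketch that assumes Python's built-in `iso8859_5` codec matches the standard. The function name `layout_rows` and the `NBSP`/`SHY` labels are illustrative choices.

```python
# Sketch: rebuild rows Ax..Fx of the ISO-8859-5 layout with Unicode code points.
CODEC = "iso8859_5"

# Invisible characters are shown by a short label instead of the glyph.
INVISIBLE = {"\u00a0": "NBSP", "\u00ad": "SHY"}


def layout_rows(start: int = 0xA0, stop: int = 0x100) -> list:
    """Return (row label, 16 cells) tuples; each cell is 'glyph CODEPOINT'."""
    rows = []
    for high in range(start, stop, 16):
        cells = []
        for byte in range(high, high + 16):
            char = bytes([byte]).decode(CODEC)
            label = INVISIBLE.get(char, char)
            cells.append(f"{label} {ord(char):04X}")
        rows.append((f"{high >> 4:X}x", cells))
    return rows


if __name__ == "__main__":
    for prefix, cells in layout_rows():
        print(prefix, "|", " | ".join(cells))
```

Each printed row corresponds to one high nibble, and each cell pairs the character with its Unicode code point in hexadecimal.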


History and related code pages

The ECMA-113 standard has been equivalent to ISO-8859-5 since its second edition. Its first edition (ISO-IR-111) was an extension of the earlier KOI-8 (defined by GOST 19768-74), which lays out the Russian letters in the same way as their ASCII Roman equivalents where possible. The initial draft of ISO-8859-5 (DIS-8859-5:1987) followed ISO-IR-111, but was revised after GOST 19768-74 was replaced by the new ISO-IR-153 in 1987, which re-arranged the Russian letters into alphabetical order (except for Ё). ISO-IR-153 contains the Russian letters, including Ё, and the non-breaking space and soft hyphen, whereas the full Cyrillic set of ISO-8859-5 is also called ISO-IR-144. Possibly as a consequence of this confusion, yet another code page has been erroneously listed as "ISO-IR-111", combining the letter order and case order of ISO-8859-5 with the row order of ISO-IR-111 (and consequently compatible with neither, though in practice partially compatible with Windows-1251).

IBM Code page 915 is an extension of ISO/IEC 8859-5, adding some semigraphic and other symbols in the C1 area. IBM Code page 1124 is mostly identical to ISO-8859-5, but replaces ѓ with ґ for Ukrainian use.

ISO-IR-200, "Uralic Supplementary Cyrillic Set", was registered in 1998 by Everson Gunn Teoranta (of which Michael Everson was a director prior to the founding of Evertype in 2001), and changes several of the non-Russian letters in order to support the Kildin Sami, Komi and Nenets languages, which are not supported by ISO-8859-5 itself. Michael Everson also introduced Mac OS Barents Cyrillic for the same languages on classic Mac OS. FreeDOS calls it code page 59283.

ISO-IR-201, "Volgaic Supplementary Cyrillic Set", was similarly introduced by Everson Gunn Teoranta in order to support the Chuvash, Komi, Mari and Udmurt languages, spoken in the titular republics of Russia. FreeDOS calls it code page 58259.
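The Code page 1124 variant described above can be approximated on top of an ISO-8859-5 decoder. The following is a sketch, not a reference implementation. It assumes that no ready-made Code page 1124 codec is available, and that both the capital letter at 0xA3 and the small letter at 0xF3 are replaced (the text above only states that ѓ gives way to ґ). The function name `decode_cp1124_sketch` is hypothetical.

```python
# Sketch: approximate Code page 1124 decoding via the ISO-8859-5 codec.
# Assumption: 0xA3 (Ѓ, U+0403) becomes Ґ (U+0490) and
#             0xF3 (ѓ, U+0453) becomes ґ (U+0491); all other bytes are unchanged.
_GJE_TO_GHE = str.maketrans({"\u0403": "\u0490", "\u0453": "\u0491"})


def decode_cp1124_sketch(data: bytes) -> str:
    """Decode bytes as ISO-8859-5, then swap the Macedonian gje for ghe."""
    return data.decode("iso8859_5").translate(_GJE_TO_GHE)


if __name__ == "__main__":
    # 0xF3 0xD0 0xDD 0xDE 0xDA spells "ґанок" under the assumed mapping.
    print(decode_cp1124_sketch(b"\xf3\xd0\xdd\xde\xda"))
```

Only decoding is sketched here. An encoder would additionally have to reject ѓ and Ѓ, which have no position in the modified table.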


External links


ISO-IR 144: Cyrillic part of the Latin/Cyrillic Alphabet ''(May 1, 1988, from ISO 8859-5 2nd version)''
ISO/IEC 8859-5:1999
Standard ECMA-113: 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet ''3rd edition (December 1999)''