ISO/IEC 8859-5

ISO/IEC 8859-5:1999, ''Information technology — 8-bit single-byte coded graphic character sets — Part 5: Latin/Cyrillic alphabet'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, with the first edition published in 1988. It is informally referred to as Latin/Cyrillic. It was designed to cover languages using a Cyrillic alphabet such as Bulgarian, Belarusian, Russian, Serbian and Macedonian, but was never widely used. It would also have been usable for Ukrainian in the Soviet Union from 1933 to 1990, but it is missing the Ukrainian letter ''ge'', ґ, which is required in Ukrainian orthography before and since, and during that period outside Soviet Ukraine. As a result, IBM created Code page 1124.

ISO-8859-5 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. The 8-bit encodings KOI8-R and KOI8-U, CP866, and also Windows-1251 are far more commonly used. In contrast to Windows-1252 and ISO 8859-1, Windows-1251 is not closely related to ISO 8859-5. The Windows code page for ISO-8859-5 is code page 28595, a.k.a. Windows-28595. The Unicode main Cyrillic block uses a layout based on ISO-8859-5.
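The relationship to the Unicode Cyrillic block, and the lack of one to the more common Cyrillic encodings, can be checked with a short script. This is a minimal sketch that assumes Python's built-in `iso8859_5`, `cp1251`, `koi8_r` and `cp866` codecs match the respective standards; the names `OFFSET` and `exceptions` are chosen here for illustration only.

```python
# Distance between an ISO-8859-5 byte and its Unicode code point for the
# Cyrillic letters: byte 0xA1 (Ё) maps to U+0401, so the offset is 0x0360.
OFFSET = 0x0401 - 0xA1

# Collect every upper-half byte whose code point does NOT sit at that offset.
exceptions = {}
for byte in range(0xA0, 0x100):
    char = bytes([byte]).decode("iso8859_5")
    if ord(char) != byte + OFFSET:
        exceptions[byte] = char

for byte, char in exceptions.items():
    print(f"0x{byte:02X} -> U+{ord(char):04X}")

# The same word yields different bytes in each 8-bit Cyrillic encoding,
# so text in one cannot be read as another without conversion.
word = "Привет"
for codec in ("iso8859_5", "cp1251", "koi8_r", "cp866"):
    print(f"{codec:10}", word.encode(codec).hex())
```

Only four positions fall outside the fixed offset: the no-break space, the soft hyphen, the numero sign and the section sign. Every Cyrillic letter keeps its relative position in Unicode.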


Codepage layout

Differences from ISO 8859-1 are shown with their Unicode equivalent code points.
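The upper half of the code page (bytes 0xA0 to 0xFF) can be regenerated with a short script. The sketch below assumes Python's built-in `iso8859_5` and `latin_1` codecs follow the standards; the function name `upper_half_rows` and the variable `shared_with_latin1` are made up for this illustration.

```python
def upper_half_rows():
    """Return six (label, cells) rows covering bytes 0xA0 to 0xFF."""
    names = {"\u00a0": "NBSP", "\u00ad": "SHY"}  # label invisible characters
    rows = []
    for high in range(0xA, 0x10):
        cells = []
        for low in range(0x10):
            char = bytes([(high << 4) | low]).decode("iso8859_5")
            cells.append(names.get(char, char))
        rows.append((f"{high:X}x", cells))
    return rows

# Upper-half positions where ISO-8859-5 and ISO 8859-1 hold the same character.
shared_with_latin1 = [
    b for b in range(0xA0, 0x100)
    if bytes([b]).decode("iso8859_5") == bytes([b]).decode("latin_1")
]

# Print a 16-column grid with the low nibble as the column header.
print("    " + " ".join(f"{low:>4X}" for low in range(0x10)))
for label, cells in upper_half_rows():
    print(f"{label:<4}" + " ".join(f"{cell:>4}" for cell in cells))
print("same as ISO 8859-1:", [hex(b) for b in shared_with_latin1])
```

The lower half (0x00 to 0x7F) is identical to ASCII and is omitted from the grid.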


History and related code pages

The ECMA-113 standard has been equivalent to ISO-8859-5 since its second edition. Its first edition (ISO-IR-111) was an extension of the earlier KOI-8 (defined by GOST 19768-74), which lays out the Russian letters in the same way as their ASCII Roman equivalents where possible. The initial draft of ISO-8859-5 (DIS-8859-5:1987) followed ISO-IR-111, but was revised after GOST 19768-74 was replaced by the new ISO-IR-153 in 1987, which re-arranged the Russian letters into alphabetical order (except for Ё). ISO-IR-153 contains the Russian letters, including Ё, and the non-breaking space and soft hyphen, whereas the full Cyrillic set of ISO-8859-5 is also called ISO-IR-144. Possibly as a consequence of this confusion, one source erroneously lists yet another code page as "ISO-IR-111", combining the letter order and case order of ISO-8859-5 with the row order of ISO-IR-111 (and consequently compatible with neither, although in practice partially compatible with Windows-1251).

IBM Code page 915 is an extension of ISO/IEC 8859-5, adding some semigraphic and other symbols in the C1 area. IBM Code page 1124 is mostly identical to ISO-8859-5, but replaces the Macedonian letters Ѓ and ѓ with the Ukrainian Ґ and ґ for Ukrainian use.

ISO-IR-200, "Uralic Supplementary Cyrillic Set", was registered in 1998 by Everson Gunn Teoranta (of which Michael Everson was a director, prior to the founding of Evertype in 2001), and changes several of the non-Russian letters in order to support the Kildin Sami, Komi and Nenets languages, not supported by ISO-8859-5 itself. Michael Everson also introduced Mac OS Barents Cyrillic for the same languages on classic Mac OS. ISO-IR-201, "Volgaic Supplementary Cyrillic Set", was similarly introduced by Everson Gunn Teoranta in order to support the Chuvash, Komi, Mari and Udmurt languages, spoken in the titular republics of Russia.
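Because Code page 1124 differs from ISO-8859-5 only in the ge substitution described above, it can be approximated on top of an ISO-8859-5 codec. The following is a rough sketch, not IBM's definition: it assumes the only differences are at bytes 0xA3 and 0xF3 (the ISO-8859-5 positions of Ѓ and ѓ), and the helper names `decode_cp1124` and `encode_cp1124` are hypothetical.

```python
# Assumed differences from ISO-8859-5: Ѓ (U+0403) -> Ґ (U+0490) at 0xA3,
# and ѓ (U+0453) -> ґ (U+0491) at 0xF3.
TO_CP1124 = str.maketrans({"\u0403": "\u0490", "\u0453": "\u0491"})
FROM_CP1124 = str.maketrans({"\u0490": "\u0403", "\u0491": "\u0453"})

def decode_cp1124(data: bytes) -> str:
    """Decode as ISO-8859-5, then swap in the Ukrainian ge letters."""
    return data.decode("iso8859_5").translate(TO_CP1124)

def encode_cp1124(text: str) -> bytes:
    """Encode text, placing Ґ and ґ where ISO-8859-5 has Ѓ and ѓ."""
    if "\u0403" in text or "\u0453" in text:
        # Under the assumption above, these letters lose their positions.
        raise ValueError("Ѓ and ѓ are not representable in this sketch")
    return text.translate(FROM_CP1124).encode("iso8859_5")

print(encode_cp1124("ґанок").hex())
print(decode_cp1124(b"\xa3\xf3"))
```

The same bytes decoded as plain ISO-8859-5 would show the Macedonian letters instead, which is why the two code pages are not interchangeable for Ukrainian or Macedonian text.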


External links


ISO-IR 144: Cyrillic part of the Latin/Cyrillic Alphabet ''(May 1, 1988, from ISO 8859-5 2nd version)''
ISO/IEC 8859-5:1999: 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet ''3rd edition (December 1999)''