Code page 866 (CCSID 866) (CP 866, "DOS Cyrillic Russian") is a code page used under DOS and OS/2 in Russia to write Cyrillic script.
It is based on the "alternative code page" (Russian: Альтернативная кодировка), developed in 1984 at IHNA AS USSR and published in 1986 by a research group at the Academy of Sciences of the USSR. [Bryabrin V. M., Landau I. Ya., Nemenman M. E. "О системе кодирования для персональных ЭВМ" (On a coding system for personal computers) // Микропроцессорные средства и системы (Microprocessor Tools and Systems). 1986. No. 4. pp. 61–64.]
The code page was widely used during the DOS era because it preserves all of the pseudographic symbols of code page 437 (unlike the "Main code page" or Code page 855) and keeps the Cyrillic letters in alphabetical order, although not contiguously (unlike KOI8-R). Initially, this encoding was available only in the Russian version of MS-DOS 4.01 (1990); since MS-DOS 6.22 it has been available in every language version.
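Both properties can be checked against the codecs that ship with Python. The sketch below assumes that the standard `cp437`, `cp855`, `cp866` and `koi8_r` codecs reflect the respective code pages, treats bytes B0 hex through DF hex as the pseudographic range, and leaves out Ё/ё, which sit outside the main alphabetic run. The helper names are illustrative only.

```python
# Bytes 0xB0-0xDF hold the box-drawing and block characters in code page 437.
PSEUDOGRAPHICS = bytes(range(0xB0, 0xE0))

def same_pseudographics(codec: str) -> bool:
    """True if `codec` decodes the pseudographic range exactly like cp437."""
    return PSEUDOGRAPHICS.decode(codec) == PSEUDOGRAPHICS.decode("cp437")

# Russian lowercase letters U+0430..U+044F in alphabetical order
# (U+0451, the letter yo, is deliberately omitted).
ALPHABET = "".join(chr(cp) for cp in range(0x0430, 0x0450))

def keeps_alphabetical_order(codec: str) -> bool:
    """True if byte values increase strictly along the alphabet."""
    encoded = ALPHABET.encode(codec)
    return all(a < b for a, b in zip(encoded, encoded[1:]))

if __name__ == "__main__":
    for name in ("cp866", "cp855", "koi8_r"):
        print(name, same_pseudographics(name), keeps_alphabetical_order(name))
```

Strictly increasing byte values are what allow a plain byte-wise sort to order Russian words alphabetically, even though the letters occupy two separate runs.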
The WHATWG Encoding Standard, which specifies the character encodings that HTML5-compliant browsers must support, includes Code page 866. It is the only single-byte encoding listed that is not named as an ISO 8859 part, a Mac OS specific encoding, a Microsoft Windows specific encoding (Windows-874 or Windows-125x) or a KOI-8 variant. Authors of new pages and the designers of new protocols are instructed to use UTF-8 instead.
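For legacy text that needs to be migrated, a minimal Python sketch follows. It assumes Python's built-in `cp866` codec, which is also reachable through the alias `ibm866`; the function name and the sample bytes are made up for illustration.

```python
def cp866_to_utf8(data: bytes) -> bytes:
    """Re-encode text stored as code page 866 into UTF-8."""
    return data.decode("cp866").encode("utf-8")

# The Russian word for "hello" as it would be stored in a code page 866 file.
legacy = bytes([0x8F, 0xE0, 0xA8, 0xA2, 0xA5, 0xE2])

if __name__ == "__main__":
    print(legacy.decode("ibm866"))       # same codec, reached via its alias
    print(cp866_to_utf8(legacy).hex())   # the UTF-8 bytes, two per letter
```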
Two very similar, but not identical, encodings are standardised in GOST R 34.303-92 [ГОСТ Р 34.303-92. Наборы 8-битных кодированных символов. 8-битный код обмена и обработки информации (8-bit coded character sets. 8-bit code for information interchange)] as KOI-8 N1 and KOI-8 N2 (not to be confused with the original KOI-8).
Character set
Each character is shown with its equivalent Unicode code point. Only the second half of the table (code points 128–255) is shown, the first half (code points 0–127) being the same as code page 437.
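The second half of the table can be regenerated programmatically. The following Python sketch assumes the standard library's `cp866` codec is an accurate rendering of the code page; the function names are illustrative. Note that Python's `cp437` codec maps bytes 0–127 to plain ASCII code points, so the comparison of the first halves holds at the level of code points, not of the glyphs a DOS display would show for control codes.

```python
import unicodedata

def upper_half(codec: str = "cp866"):
    """Yield (byte, character, 'U+XXXX') for bytes 128-255."""
    for byte in range(0x80, 0x100):
        char = bytes([byte]).decode(codec)
        yield byte, char, f"U+{ord(char):04X}"

def lower_half_matches(codec_a: str, codec_b: str) -> bool:
    """True if both codecs decode bytes 0-127 identically."""
    low = bytes(range(0x80))
    return low.decode(codec_a) == low.decode(codec_b)

if __name__ == "__main__":
    for byte, char, code_point in upper_half():
        name = unicodedata.name(char, "<unnamed>")
        print(f"{byte:3d}  0x{byte:02X}  {code_point}  {name}")
```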
Variants
There existed a few variants of the code page, but the differences were mostly in the last 16 code points (240–255).
Alternative code page
The original version of the code page by Bryabrin et al. (1986) is called the "Alternative code page" (Russian: Альтернативная кодировка), to distinguish it from the "Main code page" (Russian: Основная кодировка) by the same authors. It supports only Russian and Bulgarian. It is mostly the same as code page 866, except for codes F2 hex through F7 hex (which code page 866 changes to Ukrainian and Belarusian letters) and codes F8 hex through FB hex (where code page 866 matches code page 437 instead). The differing row is shown below.
Modified code page 866
An unofficial variant in which code points 240–255 are identical to code page 437, except that the letters Ё and ё are usually placed at 240 and 241. This version supports only Russian and Bulgarian. The differing row is shown below.
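The description of this variant is enough to assemble a decoding table for it. The Python sketch below takes the text literally: bytes 0–239 from the built-in `cp866` codec, Ё/ё at 240 and 241, and bytes 242–255 from `cp437`. Because the variant is unofficial and the placement of Ёё is only "usual", the table and the names `MODIFIED_866` and `decode_modified_866` are illustrative assumptions, not a reference mapping.

```python
ALL_BYTES = bytes(range(256))
CP866 = ALL_BYTES.decode("cp866")
CP437 = ALL_BYTES.decode("cp437")

# 0-239 as in code page 866, U+0401/U+0451 (Ё/ё) at 240-241,
# 242-255 as in code page 437.
MODIFIED_866 = CP866[:0xF0] + "\u0401\u0451" + CP437[0xF2:]

def decode_modified_866(data: bytes) -> str:
    """Decode bytes through the assembled 256-entry table."""
    return "".join(MODIFIED_866[b] for b in data)
```

Indexing a 256-character string by byte value is the simplest form such a table can take; a production codec would more likely be registered through the `codecs` module.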
Lithuanian variants
KBL
The KBL code page, unofficially known as Code page 771, is the earliest DOS character encoding for Lithuanian. It mostly matches code page 866 and the Alternative code page, but replaces the last row and some block characters with letters from the Lithuanian alphabet not otherwise present in ASCII. The Russian Ё/ё is not supported, similarly to KOI-7.
A modified version, Code page 773, which replaces the Cyrillic letters with Latvian and Estonian letters, also exists.
LST 1284
Lithuanian Standard LST 1284:1993, known as Code page 1119 or unofficially as Code page 772, mostly matches the "modified" Code page 866, except for the addition of quotation marks in the last row and the replacement of the mixed single-double box-drawing characters with Lithuanian letters (compare code page 850). Unlike KBL, the Russian Ё/ё is retained.
It accompanies LST 1283 (Code page 774/1118), which encodes the additional Lithuanian letters at the same locations as LST 1284, but is based on Code page 437 instead. It was later superseded by LST 1590-1 (Code page 775), which encodes these Lithuanian letters in the same locations, but does not include Cyrillic letters, replacing them with Latvian and Estonian letters.
Ukrainian and Belarusian variants
Ukrainian standard RST 2018-91 is designated by IBM as Code page 1125 (CCSID 1125), abbreviated CP1125, and also known as CP866U, CP866NAV or RUSCII. It matches the original Alternative code page for all points except for F2 hex through F9 hex inclusive, which are replaced with Ukrainian letters.
Code page/CCSID 1131 matches code page 866 for all points except for F8 hex, F9 hex, and FC hex through FE hex inclusive, which are replaced with otherwise-missing Ukrainian and Belarusian letters, in the process displacing the bullet character (∙) from F9 hex to FE hex. The differing rows are shown below.
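Differences between two single-byte code pages can be listed mechanically. The Python sketch below compares the built-in `cp866` and `cp1125` codecs with each other, not with the Alternative code page, and it does not cover code page 1131; it assumes only those two standard-library codecs, and the function name is illustrative.

```python
def differing_bytes(codec_a: str, codec_b: str) -> list:
    """Return the byte values that two single-byte codecs decode differently."""
    return [
        b for b in range(256)
        if bytes([b]).decode(codec_a, "replace")
        != bytes([b]).decode(codec_b, "replace")
    ]

if __name__ == "__main__":
    for b in differing_bytes("cp866", "cp1125"):
        a = bytes([b]).decode("cp866")
        u = bytes([b]).decode("cp1125")
        print(f"{b:02X}: cp866 U+{ord(a):04X} {a!r}   cp1125 U+{ord(u):04X} {u!r}")
```

The `"replace"` error handler keeps the comparison from failing on codecs that leave some byte values undefined.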
Euro sign updates
IBM code page/CCSID 808 is a variant of code page/CCSID 866 with the euro sign (€, U+20AC) in position FD hex, replacing the universal currency sign (¤).
IBM code page/CCSID 848 is a variant of code page/CCSID 1125 with the euro sign at FD hex, replacing ¤.
IBM code page/CCSID 849 is a variant of code page/CCSID 1131 with the euro sign at FB hex, replacing ¤.
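Since each euro update changes a single position, the tables for 808 and 848 can be derived from their parents. The Python sketch below does so from the built-in `cp866` and `cp1125` codecs, using the positions given above; code page 849 is omitted because the sketch assumes no codec for 1131 is at hand. The names `with_euro`, `CP808`, `CP848` and `encode` are illustrative, not part of any library.

```python
EURO = "\u20ac"      # euro sign
CURRENCY = "\u00a4"  # universal currency sign

def with_euro(base_codec: str, position: int) -> str:
    """Return a 256-character decoding table: the base codec with the
    euro sign at `position`, which must currently hold the currency sign."""
    table = bytes(range(256)).decode(base_codec)
    if table[position] != CURRENCY:
        raise ValueError(f"{base_codec} has no currency sign at 0x{position:02X}")
    return table[:position] + EURO + table[position + 1:]

CP808 = with_euro("cp866", 0xFD)   # per the text: 866 with the euro sign at FD
CP848 = with_euro("cp1125", 0xFD)  # per the text: 1125 with the euro sign at FD

def encode(text: str, table: str) -> bytes:
    """Encode text with a decoding table; ValueError on unmapped characters."""
    return bytes(table.index(ch) for ch in text)
```

The check inside `with_euro` guards against applying the substitution to a parent code page whose currency sign sits elsewhere.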
GOST R 34.303-92
The GOST R 34.303-92 standard defines two variants. The more extensive variant, KOI-8 N2 (not to be confused with the KOI-8 encoding, which it does not follow), matches code page 866 and the Alternative code page until the last row (codes 240 through 255, or F0 hex through FF hex). For the last row, it supports letters for Belarusian and Ukrainian in addition to Russian, but in a layout unrelated to code page 866 or 1125. Notably, even the Russian Ё/ё (which was unchanged between the Alternative code page and code page 866) is in a different location. The differing row is shown below.
The other variant, KOI-8 N1, is a subset of KOI-8 N2 which omits the non-Russian Cyrillic letters and the mixed single/double lined box-drawing characters, leaving them empty for further internationalization (compare with code page 850). The affected rows are shown below.
Lehner–Czech modification
An unofficial modification used in software developed by Michael Lehner and Peter R. Czech. It replaces three mathematical symbols with guillemets and the section sign, which are commonly used in the Russian language. (Lehner and Czech created a number of alternative character sets for other European languages as well, including one based on CWI-2 for Hungarian, a Kamenicky-based one for Czech and Slovak, a Mazovia variant for Polish and a seemingly unique encoding for Lithuanian.) The modified row is shown below.
Latvian variant
A Latvian variant, supported by Star printers and FreeDOS, is code page 3012. This encoding is nicknamed "RusLat".
FreeDOS
FreeDOS provides additional unofficial extensions of code page 866 for various non-Slavic languages:
* 30002 – Cyrillic Tajik
* 30008 – Cyrillic Abkhaz and Ossetian
* 30010 – Cyrillic Gagauz and Moldovan
* 30011 – Cyrillic Russian Southern District (Kalmyk, Karachay-Balkar, Ossetian, North Caucasian)
* 30012 – Cyrillic Russian Siberian and Far Eastern Districts (Altai, Buryat, Khakas, Tuvan, Yakut, Tungusic, Paleo-Siberian)
* 30013 – Cyrillic Volga District – Turkic languages (Bashkir, Chuvash, Tatar)
* 30014 – Cyrillic Volga District – Finno-Ugric languages (Mari, Udmurt)
* 30015 – Cyrillic Khanty
* 30016 – Cyrillic Mansi
* 30017 – Cyrillic Northwestern District (Cyrillic Nenets, Latin Karelian, Latin Veps)
* 30018 – Latin Tatar and Cyrillic Russian
* 30019 – Latin Chechen and Cyrillic Russian
* 58152 – Cyrillic Kazakh with euro
* 58210 – Cyrillic Azeri
* 59234 – Cyrillic Tatar
* 60258 – Latin Azeri and Cyrillic Russian
* 62306 – Cyrillic Uzbek
Code page 900
Before Microsoft's final code page for Russian MS-DOS 4.01 was registered with IBM by Franz Rau of Microsoft as CP866 in January 1990, draft versions of it developed by Yuri Starikov (Юрий Стариков) of Dialogue were still called code page 900 internally. While the documentation was corrected to reflect the new name before the release of the product, sketches of earlier draft versions, still named code page 900 and lacking the Ukrainian and Belarusian letters that had been added in autumn 1989, were published in the Russian press in 1990.
Code page 900 slipped through into the distribution of the Russian MS-DOS 5.0 LCD.CPI codepage information file.