KOI8-R (RFC 1489) is an 8-bit
character encoding, derived from the
KOI-8 encoding by the programmer
Andrei Chernov
in 1993 and designed to cover
Russian, which uses a
Cyrillic alphabet. KOI8-R was based on
Russian Morse code, which was created from a
phonetic version of Latin
Morse code. As a result, Russian Cyrillic letters are in pseudo-Roman order rather than the normal Cyrillic alphabetical order. Although this may seem unnatural, if the 8th bit is stripped, the text is partially readable in ASCII and may convert to syntactically correct
KOI-7. For example, "Русский Текст" in KOI8-R becomes "rUSSKIJ tEKST" ("Russian Text").
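The bit-stripping behaviour described above can be checked with a short Python sketch. It assumes Python's built-in `koi8_r` codec; the helper name `strip_high_bit` is illustrative only and is not part of any standard or library.

```python
def strip_high_bit(text: str) -> str:
    """Encode text as KOI8-R, clear the 8th bit of every byte, read the result as ASCII."""
    raw = text.encode("koi8_r")
    # Lowercase Cyrillic letters occupy 0xC0-0xDF and uppercase 0xE0-0xFF,
    # so clearing the top bit lands them on uppercase and lowercase Latin
    # letters respectively. This is why the letter case comes out inverted.
    stripped = bytes(b & 0x7F for b in raw)
    return stripped.decode("ascii")


print(strip_high_bit("Русский Текст"))  # rUSSKIJ tEKST
```

The mapping is lossy: once the 8th bit is gone, a Cyrillic letter and the Latin letter it lands on can no longer be told apart.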
KOI8 stands for Kod Obmena Informatsiey, 8 bit (Russian: Код Обмена Информацией, 8 бит), which means "Code for Information Exchange, 8 bit". In
Microsoft Windows, KOI8-R is assigned the code page number 20866.
IBM assigns KOI8-R code page 878.
KOI8-R also happens to cover
Bulgarian, but has not been used for that purpose since
CP1251
was accepted. The use of these older code pages is being replaced with
Unicode as a more common way to represent Cyrillic together with other languages.
Unicode is preferred to
KOI-8 and its variants or other Cyrillic encodings in modern applications, especially on the Internet, making
UTF-8 the dominant encoding for web pages. KOI8-R, the most popular KOI-8 variant, is used by less than 0.004% of websites; both Russian and Bulgarian sites generally prefer other encodings. (For further discussion of Unicode's complete coverage of 436 Cyrillic letters/code points, including those for
Old Cyrillic, and how single-byte character encodings, such as
Windows-1251
and KOI8 variants, cannot provide this, see
Cyrillic script in Unicode
.)
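The coverage difference between KOI8-R and Unicode can be illustrated with a small Python sketch. It assumes Python's built-in `koi8_r` and `utf-8` codecs; the helper name `encodable` and the sample strings are chosen here for illustration.

```python
def encodable(text: str, codec: str) -> bool:
    """Return True if every character of text can be represented in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


russian = "Русский"
bulgarian = "България"
old_cyrillic = "ѣ"  # yat (U+0463), a letter that KOI8-R has no byte for

print(encodable(russian, "koi8_r"))        # True
print(encodable(bulgarian, "koi8_r"))      # True
print(encodable(old_cyrillic, "koi8_r"))   # False
print(encodable(old_cyrillic, "utf-8"))    # True

# KOI8-R uses one byte per character; UTF-8 uses two bytes per Cyrillic letter.
print(len(russian.encode("koi8_r")), len(russian.encode("utf-8")))  # 7 14
```

The single-byte form is more compact for Russian text, but only UTF-8 can carry Cyrillic letters outside the Russian repertoire in the same document.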
Character set
The following table shows the KOI8-R encoding. Each character is shown with its equivalent
Unicode code point.
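As a rough sketch, the upper half of such a table (bytes 0x80 to 0xFF) can be regenerated from Python's built-in `koi8_r` codec. The function names `koi8r_upper_half` and `format_row` are hypothetical helpers introduced here, and the row layout is an assumption rather than the layout of the original table.

```python
def koi8r_upper_half() -> dict:
    """Map each byte value 0x80-0xFF to the character KOI8-R assigns to it."""
    return {b: bytes([b]).decode("koi8_r") for b in range(0x80, 0x100)}


def format_row(first_byte: int, table: dict) -> str:
    """Format 16 consecutive bytes as 'char U+XXXX' cells, prefixed by the row's first byte."""
    cells = (
        f"{table[b]} U+{ord(table[b]):04X}"
        for b in range(first_byte, first_byte + 16)
    )
    return f"{first_byte:02X}: " + "  ".join(cells)
```

Calling `format_row(0xC0, koi8r_upper_half())` yields the row that begins with the lowercase letter ю. Bytes below 0x80 are identical to ASCII and are omitted.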
See also
* KOI8-B, a derivation of KOI8-R with only the letter subset implemented
* KOI8-U, another derivative encoding which adds Ukrainian characters
* KOI character encodings
* RELCOM
* Windows-1251, another common Cyrillic character encoding
External links
* Universal Cyrillic decoder, an online program that may help recover Cyrillic texts with broken KOI8-R or other character encodings.