KOI8-R (RFC 1489) is an 8-bit character encoding, derived from the KOI-8 encoding by the programmer Andrei Chernov in 1993 and designed to cover Russian, which uses a Cyrillic alphabet. KOI8-R was based on Russian Morse code, which was created from a phonetic version of Latin Morse code. As a result, Russian Cyrillic letters are in pseudo-Roman order rather than the normal Cyrillic alphabetical order. Although this may seem unnatural, if the 8th bit is stripped, the text is partially readable in ASCII and may convert to syntactically correct KOI-7. For example, "Русский Текст" in KOI8-R becomes "rUSSKIJ tEKST" ("Russian Text").

KOI8 stands for Kod Obmena Informatsiey, 8 bit (Russian: Код Обмена Информацией, 8 бит), which means "Code for Information Exchange, 8 bit". In Microsoft Windows, KOI8-R is assigned the code page number 20866. At IBM, KOI8-R is assigned code page 878. KOI8-R also happens to cover Bulgarian, but has not been used for that purpose since CP1251 was accepted.

The use of these older code pages is being replaced with Unicode as a more common way to represent Cyrillic together with other languages.
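The 8th-bit property described in the introduction can be checked with a few lines of Python. This is only a minimal sketch: it assumes Python's built-in `koi8_r` codec as the encoder, and the variable names are illustrative.

```python
# Sketch: clear the high (8th) bit of every KOI8-R byte and read the
# result as plain ASCII. Uses Python's built-in "koi8_r" codec.
text = "Русский Текст"

koi8_bytes = text.encode("koi8_r")               # one byte per character
stripped = bytes(b & 0x7F for b in koi8_bytes)   # drop the 8th bit
ascii_view = stripped.decode("ascii")

print(ascii_view)  # rUSSKIJ tEKST
```

The letter case comes out inverted because lowercase Cyrillic letters occupy 0xC0 to 0xDF, which map onto the uppercase Latin range 0x40 to 0x5F once the high bit is cleared, while uppercase Cyrillic letters (0xE0 to 0xFF) land on lowercase Latin letters.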

Unicode is preferred to KOI8 and its variants or other Cyrillic encodings in modern applications, especially on the Internet, making UTF-8 the dominant encoding for web pages. KOI8-R, the most popular variant, is used by less than 0.004% of websites, mainly for Russian; Russian and Bulgarian users alike generally prefer other encodings. (For further discussion of Unicode's complete coverage of 436 Cyrillic letters/code points, including for Old Cyrillic, and how single-byte character encodings, such as Windows-1251 and KOI8 variants, cannot provide this, see Cyrillic script in Unicode.)
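As an illustration of the move from KOI8-R to UTF-8, the following sketch transcodes KOI8-R bytes to UTF-8 and shows the coverage limit of a single-byte encoding. It assumes Python's built-in `koi8_r` and `utf-8` codecs; the helper `fits_in_koi8_r` is a hypothetical name introduced here for illustration.

```python
# Sketch: transcode KOI8-R data to UTF-8 and probe KOI8-R's coverage.
koi8_data = "Русский Текст".encode("koi8_r")

# Decode the legacy bytes to a Unicode string, then re-encode as UTF-8.
text = koi8_data.decode("koi8_r")
utf8_data = text.encode("utf-8")

# KOI8-R uses one byte per character; UTF-8 uses two bytes for each
# of these Cyrillic letters and one byte for the ASCII space.
print(len(koi8_data), len(utf8_data))


def fits_in_koi8_r(s: str) -> bool:
    """Hypothetical helper: True if every character of s exists in KOI8-R."""
    try:
        s.encode("koi8_r")
        return True
    except UnicodeEncodeError:
        return False


print(fits_in_koi8_r("Текст"))   # modern Russian letters
print(fits_in_koi8_r("ѣ"))       # yat (U+0463), an Old Cyrillic letter
```

The yat has no slot among the 256 values of KOI8-R, so the second check fails, whereas UTF-8 can encode any Unicode code point.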

Character set

The following table shows the KOI8-R encoding. Each character is shown with its equivalent Unicode code point.
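The byte-to-code-point mapping for the upper half of the encoding (0x80 to 0xFF) can be listed programmatically. The sketch below relies on Python's built-in `koi8_r` codec as the source of the mapping, which is an assumption of this example rather than a normative reference; `koi8_r_table` is a hypothetical helper name.

```python
# Sketch: build the upper half of the KOI8-R table from Python's codec.
def koi8_r_table() -> dict[int, tuple[str, str]]:
    """Map each byte 0x80..0xFF to (character, 'U+XXXX' code point)."""
    table = {}
    for byte in range(0x80, 0x100):
        char = bytes([byte]).decode("koi8_r")
        table[byte] = (char, f"U+{ord(char):04X}")
    return table


table = koi8_r_table()

# Print the lowercase letter rows; the byte order follows the Latin
# alphabet (a, b, c, d, ...) rather than the Cyrillic one.
for byte in range(0xC0, 0xE0):
    char, code_point = table[byte]
    print(f"0x{byte:02X}  {char}  {code_point}")
```

The first printed rows show the pseudo-Roman ordering mentioned in the introduction: 0xC1 is а, 0xC2 is б, 0xC3 is ц and 0xC4 is д, mirroring A, B, C, D at 0x41 to 0x44.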

References

External links

Universal Cyrillic decoder, an online program that may help recover Cyrillic texts with broken KOI8-R or other encodings.
