MIK (МИК) is an 8-bit Cyrillic code page used with DOS. It is based on the character set used in the Bulgarian Pravetz 16 IBM PC compatible system.
Kermit calls this character set "BULGARIA-PC" / "bulgaria-pc".
In Bulgaria, it was sometimes incorrectly referred to as code page 856 (which clashes with IBM's definition for a Hebrew code page). Star printers and FreeDOS know this code page as code page 3021. FreeDOS earlier called it code page 30033, but renumbered it to match the Star printer code page; 30033 is now used for a code page 857 variant which contains the Crimean Tatar hryvnia sign.
This is the most widespread DOS/OEM code page used in Bulgaria, rather than CP 808, CP 855, CP 866 or CP 872.
Almost every DOS program created in Bulgaria that contains Bulgarian strings used MIK as its encoding, and many such programs are still in use.
Character set
Each character is shown with its equivalent Unicode code point and its decimal code point. Only the second half of the table (code points 128–255) is shown, the first half (code points 0–127) being the same as ASCII.
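As an illustration of how the two halves fit together, the following is a minimal Python sketch of a partial MIK decoder. It assumes that the 64 Cyrillic letters at bytes 0x80–0xBF map consecutively onto U+0410–U+044F (А through я); bytes 0xC0–0xFF are deliberately rejected rather than guessed, since they need a full mapping table. The function name `decode_mik_partial` is hypothetical and is not part of any standard library.

```python
def decode_mik_partial(data: bytes) -> str:
    """Decode MIK bytes in the range 0x00-0xBF to a Unicode string.

    Assumption: bytes 0x80-0xBF are the Cyrillic letters in alphabetical
    order and map one-to-one onto U+0410-U+044F. Bytes 0xC0-0xFF are not
    handled by this sketch.
    """
    out = []
    for b in data:
        if b < 0x80:
            # Lower half: identical to ASCII.
            out.append(chr(b))
        elif b < 0xC0:
            # Cyrillic block: constant offset into the Unicode Cyrillic range.
            out.append(chr(0x0410 + (b - 0x80)))
        else:
            raise ValueError(f"byte 0x{b:02X} needs a full MIK mapping table")
    return "".join(out)


# The bytes 0x8C 0x88 0x8A spell the name of the code page itself.
print(decode_mik_partial(bytes([0x8C, 0x88, 0x8A])))  # МИК
```

A complete converter would replace the `ValueError` branch with a lookup table covering the upper 64 positions.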
Notes for implementors of mapping tables to Unicode
Implementors of mapping tables to Unicode should note that the MIK Code page unifies some characters:
Binary character manipulations
The MIK code page keeps all Cyrillic letters in alphabetical order, which enables very easy character manipulation in binary form:
10xx xxxx - is a Cyrillic Letter
100x xxxx - is an Upper-case Cyrillic Letter
101x xxxx - is a Lower-case Cyrillic Letter
In this case, character testing and manipulation functions such as IsAlpha(), IsUpper(), IsLower(), ToUpper() and ToLower() reduce to bit operations, and sorting is by simple comparison of character values.
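The bit patterns above can be sketched in Python as follows. This is an illustrative sketch only: the function and constant names are hypothetical, the functions operate on single byte values (integers 0–255), and they cover only the Cyrillic block, leaving ASCII Latin letters untouched.

```python
CYRILLIC_MASK = 0b1100_0000  # top two bits identify the Cyrillic block
CYRILLIC_BITS = 0b1000_0000  # 10xx xxxx
CASE_MASK = 0b1110_0000      # top three bits identify block and case
UPPER_BITS = 0b1000_0000     # 100x xxxx
LOWER_BITS = 0b1010_0000     # 101x xxxx
CASE_BIT = 0b0010_0000       # set for lower case, clear for upper case


def is_cyrillic(b: int) -> bool:
    return (b & CYRILLIC_MASK) == CYRILLIC_BITS


def is_upper(b: int) -> bool:
    return (b & CASE_MASK) == UPPER_BITS


def is_lower(b: int) -> bool:
    return (b & CASE_MASK) == LOWER_BITS


def to_upper(b: int) -> int:
    # Clear the case bit only for lower-case Cyrillic letters.
    return (b & ~CASE_BIT & 0xFF) if is_lower(b) else b


def to_lower(b: int) -> int:
    # Set the case bit only for upper-case Cyrillic letters.
    return (b | CASE_BIT) if is_upper(b) else b


# Sorting MIK-encoded words needs no collation table: comparing raw
# byte values already follows alphabetical order within each case.
words = [bytes([0x82, 0x80]), bytes([0x80, 0x81]), bytes([0x81, 0x80])]
print(sorted(words) == [bytes([0x80, 0x81]), bytes([0x81, 0x80]), bytes([0x82, 0x80])])  # True
```

Note that a plain byte comparison places every upper-case letter before every lower-case one; a case-insensitive sort could map each byte through `to_upper` first.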
See also
* Hardware code page
External links
* https://www.unicode.org/Public/MAPPINGS/VENDORS/IBM/IBM_conversions.html Unicode Consortium's mappings between IBM's code pages and Unicode
* http://www.cl.cam.ac.uk/~mgk25/unicode.html#conv UTF-8 and Unicode FAQ for Unix/Linux by Markus Kuhn