Windows code page 936 (abbreviated MS936, Windows-936 or, ambiguously, CP936) is Microsoft's legacy (pre-Unicode) character encoding for representing simplified Chinese text on computers. It is one of the four Windows DBCSs for East Asian languages, accompanying code pages 932 (Japanese), 949 (Korean) and 950 (Traditional Chinese). It is a variant of the Mainland Chinese ''Guójiā Biāozhǔn Kuòzhǎn'' (GBK) encoding, and roughly corresponds to IBM code page 1386 (CP1386 or IBM-1386).
History
Originally, Windows-936 covered GB 2312 (in its EUC-CN form), but it was expanded to cover most of GBK with the release of Windows 95. The Euro sign (€), not defined in GBK, is encoded as 0x80 in Windows-936 and IBM-1386. On the other hand, 95 characters defined in GBK 1.0 were initially not encoded in Windows-936. This was partly resolved in later versions of Windows: in Windows 7, all GBK characters not in the Unicode BMP Private Use Area can be displayed using code page 936, but encoding the 95 characters was still not supported.
Windows code page 936 was superseded by code page 54936 (GB 18030), but remained in widespread use. The Windows console uses code page 936 as the default code page for simplified Chinese installations, although part of GB 18030 was made mandatory for all software products sold in China. In 2002, the IANA Internet name GBK was registered with Windows-936's mapping (see "Application of IANA Charset Registration for GBK"), making it the ''de facto'' GBK definition on the Internet.
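The single-byte euro mapping described above can be illustrated with a small decoder. This is a minimal sketch, not Microsoft's implementation: the function name `decode_windows936` is hypothetical, and the double-byte work is delegated to Python's built-in `gbk` codec, whose coverage may not match every Windows release exactly. The byte 0x80 is handled explicitly rather than relying on how any particular codec treats it.

```python
def decode_windows936(data: bytes) -> str:
    """Decode bytes as Windows-936 (sketch): ASCII, 0x80 as the euro sign,
    and everything else as a two-byte sequence handled by the gbk codec."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Single-byte ASCII range
            out.append(chr(b))
            i += 1
        elif b == 0x80:
            # Windows-936 extension: 0x80 is the euro sign (U+20AC)
            out.append("\u20ac")
            i += 1
        else:
            # Lead byte of a double-byte sequence; a truncated or invalid
            # pair makes the gbk codec raise UnicodeDecodeError
            out.append(data[i:i + 2].decode("gbk"))
            i += 2
    return "".join(out)


if __name__ == "__main__":
    print(decode_windows936(b"\x80\xd6\xd0A"))
```

Handling 0x80 before the lead-byte branch keeps the extension separate from the GBK table, which mirrors the article's point that the euro sign is an addition to GBK rather than part of it.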
Terminology
The name "code page 936" is ambiguous. IBM's code page 936, an obsolete IBM 5550 encoding, is also a Simplified Chinese encoding, but uses a different encoding method for GB 2312 (Shift GB), and so is entirely incompatible with Windows code page 936 (in contrast to IBM code page 932 being, to a first approximation, a subset of Windows code page 932). International Components for Unicode, however, does not include an IBM-936 codec, and uses the Windows code page for the label. IBM's code page for GBK coverage is code page 1386, which is defined as a combination of the single-byte Code page 1114 and the double-byte Code page 1385.
The concepts of "Windows-936", "GBK", "GB2312" and "EUC-CN" are sometimes conflated in various software products. EUC-CN is registered with the IANA as GB2312, although it is a specific, variable-width, 8-bit, stateless encoding format of GB 2312 (which also has other, less widely used, encoding formats such as HZ-GB-2312, ISO-2022-CN or the aforementioned Shift GB).
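The structural difference between the EUC-CN form and GBK can be sketched as two byte-range checks. The ranges below are not stated in this article; they are the commonly documented ones (EUC-CN places the 94×94 GB 2312 grid at 0xA1–0xFE for both bytes, while GBK allows lead bytes 0x81–0xFE and trail bytes 0x40–0xFE except 0x7F). The function names are hypothetical, and the checks test only whether a byte pair is structurally possible, not whether a character is actually assigned there.

```python
def is_euc_cn_pair(lead: int, trail: int) -> bool:
    """True if (lead, trail) falls in the EUC-CN double-byte area."""
    return 0xA1 <= lead <= 0xFE and 0xA1 <= trail <= 0xFE


def is_gbk_pair(lead: int, trail: int) -> bool:
    """True if (lead, trail) falls in the GBK double-byte area."""
    return 0x81 <= lead <= 0xFE and 0x40 <= trail <= 0xFE and trail != 0x7F


if __name__ == "__main__":
    # 0xD6 0xD0 is valid in both; 0x81 0x40 is only possible in GBK
    for pair in [(0xD6, 0xD0), (0x81, 0x40)]:
        print([hex(b) for b in pair], is_euc_cn_pair(*pair), is_gbk_pair(*pair))
```

Because GBK trail bytes can fall below 0x80, GBK is not an EUC code even though every EUC-CN pair is also a structurally valid GBK pair.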
Since EUC-CN was superseded long ago by GBK, which is a superset of it (although not itself an EUC code), and since Microsoft software continued to assign the encoding label GB2312 to code page 936 even after extending it to implement GBK rather than EUC-CN, most modern-day Windows-based software products mean partial support for GBK via Windows-936, rather than EUC-CN or other encoding formats of GB 2312, when they use the term "GB 2312" as a character encoding option. This can be observed in products such as Microsoft Internet Explorer and Notepad++.
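The practical effect of the label choice can be seen by encoding a character that is in GBK but not in GB 2312. The following sketch uses Python's standard codecs, which, unlike the Windows labelling described above, keep `gb2312` (EUC-CN) and `gbk` separate and treat `cp936` as an alias of `gbk`. The helper `can_encode` and the sample characters are chosen for illustration only.

```python
import codecs


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text is encodable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


COMMON = "\u4e2d"    # 中, present in both GB 2312 and GBK
GBK_ONLY = "\u9555"  # 镕, present in GBK but absent from GB 2312

if __name__ == "__main__":
    # Python resolves the cp936 label to its gbk codec
    print(codecs.lookup("cp936").name)
    for label in ("gb2312", "gbk", "cp936"):
        print(label, can_encode(COMMON, label), can_encode(GBK_ONLY, label))
```

A program that takes "GB 2312" literally will reject text that a Windows application saved under the same label, which is why the distinction matters when exchanging files.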
External links
Windows-936:
Microsoft's reference for Windows-936
Code page file for Windows-936
Mapping of Windows-936 to Unicode
ICU demonstration of Windows-936
International Components for Unicode (ICU), windows-936-2000.ucm
IBM-1386:
ICU demonstration of IBM-1386
ICU mapping of IBM-1386 to Unicode