Windows code page 936 (abbreviated MS936, Windows-936 or, ambiguously, CP936) is Microsoft's character encoding for Simplified Chinese, one of the four DBCSs (double-byte character sets) for East Asian languages. Originally, Windows-936 covered GB 2312 (in its EUC-CN form), but it was expanded to cover most of GBK with the release of Windows 95.
IBM's Code page 936 is a different encoding for Simplified Chinese, although International Components for Unicode (ICU) does not include an IBM-936 codec and uses the Windows code page for the "cp936" label.
IBM's code page for GBK coverage is Code page 1386 (CP1386 or IBM-1386), which is defined as a combination of the single-byte Code page 1114 and the double-byte Code page 1385.
Windows-936 was superseded by code page 54936 (GB 18030), but it remained in widespread use. The Windows command prompt
uses CP936 as the default code page for Simplified Chinese installations, although support for part of GB 18030 was made mandatory for all software products sold in China. In 2002, the IANA Internet name GBK was registered with Windows-936's mapping (see "Application of IANA Charset Registration for GBK"), making it the de facto GBK definition on the Internet.
The concepts of "Windows-936", "GBK", "GB2312" and "EUC-CN" are sometimes confused in various software products. Code pages MS936 and 1386 are not identical to GBK because a code page encodes characters, whereas GBK only defines code points. In addition, the Euro sign
(€), encoded as 0x80 in both Windows-936 and IBM-1386, is not defined in GBK. On the other hand, 95 characters defined in GBK were initially not encoded into Windows-936.
This was partly resolved in later versions of Windows: as of Windows 7, all GBK characters not in the Unicode BMP Private Use Area can be displayed using code page 936, but encoding the 95 characters was still not supported. However, "CP936" and "GBK" are often used interchangeably because of the popularity of Microsoft products on the Chinese market when GBK was published.
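Because the treatment of byte 0x80 is one of the points where implementations labelled "CP936" or "GBK" can differ, it may be worth probing a given library rather than assuming a result. The following is a minimal sketch using Python's standard codec names; the helper name `probe_byte` is hypothetical, and no particular outcome is claimed for any codec.

```python
def probe_byte(byte: bytes, codecs=("gb2312", "gbk", "cp936", "gb18030")):
    """Try to decode the given bytes with each codec.

    Returns a dict mapping codec name to the decoded string,
    or None when that codec rejects the input.
    """
    results = {}
    for name in codecs:
        try:
            results[name] = byte.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


if __name__ == "__main__":
    # 0x80 is where Windows-936 and IBM-1386 place the euro sign.
    for name, value in probe_byte(b"\x80").items():
        print(name, repr(value))
```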
Since GBK superseded GB 2312 long ago, these two terms have also become virtually equivalent to many users, so "Windows-936", "GBK" and "GB 2312" are misunderstood by many to mean the same thing while they actually differ significantly. Instead of supporting precisely EUC-CN / GB 2312, most modern-day Windows-based software products mean partial support for GBK via Windows-936 when they use the term "GB 2312" as a character encoding option. This can be observed in products such as Microsoft Internet Explorer and Notepad++.
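The difference between strict GB 2312 and GBK can be seen with a short sketch, assuming Python's built-in `gb2312` and `gbk` codecs as stand-ins for the two repertoires. The helper name `fits` is hypothetical. The simplified character 中 is in both sets, while the traditional character 龍 is in GBK but not in GB 2312.

```python
def fits(text: str, codec: str) -> bool:
    """Return True if every character of text can be encoded with codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for ch in ("中", "龍"):
        print(ch, "gb2312:", fits(ch, "gb2312"), "gbk:", fits(ch, "gbk"))
```

A file labelled "GB 2312" that contains characters such as 龍 is therefore likely to be GBK (or Windows-936) data under a GB 2312 label.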
External links
Windows-936:
Microsoft's reference for Windows-936
Code page file for Windows-936
Mapping of Windows-936 to Unicode
ICU demonstration of Windows-936
International Components for Unicode (ICU), windows-936-2000.ucm
IBM-1386:
ICU demonstration of IBM-1386
ICU mapping of IBM-1386 to Unicode
Encodings of Asian languages