Big5
Big-5 or Big5 (Chinese: 大五碼) is a Chinese character encoding method used in Taiwan, Hong Kong, and Macau for traditional Chinese characters. The People's Republic of China (PRC), which uses simplified Chinese characters, uses the GB 18030 character set instead (though it can also substitute Big-5 or UTF-8). Big5 gets its name from the consortium of five companies in Taiwan that developed it. Encoding: The original Big5 character set is sorted first by usage frequency, second by stroke count, and lastly by Kangxi radical. It lacked many commonly used characters, so each vendor developed its own extension. The ETen extension became part of the current Big5 standard through popularity. The structure of Big5 does not conform to the ISO 2022 standard, but rather bears a certain similarity to the Shift JIS encoding. It is a double-byte character set (DBCS) with the following structure: (the prefix 0x signifying hexadecimal numbers). Sta ...
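As a rough illustration of the double-byte layout described above, the following sketch uses Python's built-in "big5" codec. The sample strings are chosen only for this example, and the codec's exact repertoire may differ from any particular vendor's Big5 variant.

```python
# Illustrative sketch: encode a short traditional Chinese string with
# Python's built-in "big5" codec and look at the resulting bytes.
text = "中文"  # sample string, an assumption of this example

encoded = text.encode("big5")

# Each Chinese character occupies two bytes, shown here in 0x-style hex.
hex_pairs = [encoded[i:i + 2].hex().upper() for i in range(0, len(encoded), 2)]
print(hex_pairs)  # ['A4A4', 'A4E5']

# ASCII characters stay single-byte, so mixed text has mixed widths.
mixed = "Big5 中文".encode("big5")
print(len(mixed))  # 9 bytes: 5 single-byte characters + 2 double-byte characters

# Decoding with the same codec restores the original string.
round_trip = encoded.decode("big5")
```

The two-byte groups printed here correspond to the hexadecimal code values that Big5 tables usually list with the 0x prefix.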
HKSCS
The Hong Kong Supplementary Character Set (commonly abbreviated to HKSCS) is a set of Chinese characters (4,702 in total in the initial release) used in Cantonese, as well as when writing the names of some places in Hong Kong (whether in written Cantonese or standard written Chinese sentences). It evolved from the preceding Government Chinese Character Set (GCCS), a set of supplementary Chinese characters coded in the user-defined areas of the Big5 character set. GCCS was originally used within the Hong Kong Government and later by the public. It became the Hong Kong Supplementary Character Set when the characters in the set were submitted to ISO 10646 for coding. History and versions: The HKSCS has gone through a few iterations. Big-5 extensions (1995–2009): HKSCS versions up to HKSCS-2008 are encoded in Big5 (Big5-HKSCS, big5hk) and ISO 10646 (Unicode). GCCS: Due to the inherent differences between standard written Chinese and writt ...
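A minimal sketch of how one might probe the difference between plain Big5 and the Big5-HKSCS encoding, using Python's built-in "big5" and "big5hkscs" codecs. The helper name try_encode and the sample characters are assumptions of this example; 咗 is picked as a typical Cantonese character, and the script reports what each codec does rather than presuming the result.

```python
from typing import Optional


def try_encode(text: str, codec: str) -> Optional[bytes]:
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


# Sample characters (an assumption of this sketch): a common character
# and a Cantonese-specific one.
for ch in ["中", "咗"]:
    plain = try_encode(ch, "big5")
    hkscs = try_encode(ch, "big5hkscs")
    # Print code points and bytes in ASCII form to avoid console encoding issues.
    print(f"U+{ord(ch):04X}", "big5:", plain, "big5hkscs:", hkscs)
```

A None in the "big5" column alongside bytes in the "big5hkscs" column indicates a character that only the supplementary set covers in this implementation.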
CNS 11643
The CNS 11643 character set (Chinese National Standard 11643), also officially known as the Chinese Standard Interchange Code or CSIC (Chinese: 中文標準交換碼), is officially the standard character set of Taiwan (Republic of China). Published and draft editions of CNS 11643 remain the source standards for Unicode reference glyphs for CJK Unified Ideographs submitted for use in Taiwan, and the character repertoire of CNS 11643 continues to be updated and used for administrative purposes in Taiwan. EUC-TW is an encoded representation of CNS 11643 and ASCII in Extended Unix Code (EUC) form. In practice, variants of the Big5 character set, which is closely related to the first two planes of CNS 11643, served as the de facto standard encoding for Traditional Chinese before the introduction of Unicode. Other encodings capable of representing certain CSIC planes include ISO-2022-CN (planes 1 and 2) and ISO-2022-CN-EXT (planes 1 through 7). Structure: CNS 11643 is designed ...
Chinese Character Encoding
In computing, Chinese character encodings can be used to represent text written in the CJK languages (Chinese, Japanese, and Korean) and, rarely, obsolete Vietnamese, all of which use Chinese characters. Several general-purpose character encodings accommodate Chinese characters, and some of them were developed specifically for Chinese. In addition to Unicode (with the set of CJK Unified Ideographs), local encoding systems exist. The Chinese Guobiao (or GB, "national standard") system is used in mainland China and Singapore, and the (mainly) Taiwanese Big5 system is used in Taiwan, Hong Kong and Macau as the two primary "legacy" local encoding systems. Guobiao is usually displayed using simplified characters and Big5 is usually displayed using traditional characters. There is however no mandated connection between the encoding sy ...
Windows-950
Code page 950 is the code page used on Microsoft Windows for Traditional Chinese. It is Microsoft's implementation of the de facto standard Big5 character encoding. The code page is not registered with IANA, and hence is not a standard for communicating information over the internet, although it is usually labelled simply as big5, including by Microsoft library functions. Terminology and variants: The major difference between Windows code page 950 and "common" (non-vendor-specific) Big5 is the incorporation of a subset of the ETEN extensions to Big5 at 0xF9D6 through 0xF9FE (comprising the seven Chinese characters 碁, 銹, 裏, 墻, 恒, 粧, and 嫺, followed by 34 box drawing characters and block elements). The ranges used by some of the other ETEN extended characters are instead defined as end-user defined (private use) characters. IBM's CCSID 950 comprises single byte code page 1114 (CCSID 1114) and double byte code page 947 (CCSID 947), and, while also a Big5 varian ...
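The range arithmetic and the code page difference described above can be checked with a short sketch. It relies on Python's built-in "cp950" and "big5" codecs; how Python's "big5" codec treats the ETEN characters is not stated in the text, so the script reports the result instead of assuming it. The helper name try_encode is hypothetical.

```python
from typing import Optional

# The seven Chinese characters listed in the text above.
ETEN_HANZI = "碁銹裏墻恒粧嫺"

# Size of the range 0xF9D6..0xF9FE, inclusive.
range_size = 0xF9FE - 0xF9D6 + 1
print(range_size)  # 41 = 7 Chinese characters + 34 box drawing/block elements


def try_encode(text: str, codec: str) -> Optional[bytes]:
    """Return the encoded bytes, or None if the codec has no mapping."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


for ch in ETEN_HANZI:
    cp950_bytes = try_encode(ch, "cp950")
    big5_bytes = try_encode(ch, "big5")
    # Show code point, cp950 bytes, and whether plain "big5" maps it at all.
    print(f"U+{ord(ch):04X}", cp950_bytes, big5_bytes)
```

If the "big5" column shows None while "cp950" yields two bytes starting with 0xF9, that reflects the extension subset the paragraph describes.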
Simplified Chinese Characters
Simplified Chinese characters are one of two standardized character sets widely used to write the Chinese language, with the other being traditional characters. Their mass standardization during the 20th century was part of an initiative by the People's Republic of China (PRC) to promote literacy, and their use in ordinary circumstances on the mainland has been encouraged by the Chinese government since the 1950s. They are the official forms used in mainland China, Malaysia, and Singapore, while traditional characters are officially used in Hong Kong, Macau, and Taiwan. Simplification of a component (either a character or a sub-component called a radical) usually involves either a reduction in its total number of strokes, or an apparent streamlining of which strokes are chosen in what places; for example, the radical used in the traditional character is simplified to to form the simplified charac ...
ISO 2022
ISO/IEC 2022, Information technology—Character code structure and extension techniques, is an ISO/IEC standard in the field of character encoding. It is equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202. Originating in 1971, it was most recently revised in 1994. ISO 2022 specifies a general structure which character encodings can conform to, dedicating particular ranges of bytes (0x00–1F and 0x7F–9F) to be used for non-printing control codes for formatting and in-band instructions (such as line breaks or formatting instructions for text terminals), rather than graphical characters. It also specifies a syntax for escape sequences, multiple-byte sequences beginning with the ESC control code, which can likewise be used for in-band instructions. Specific sets of control codes and escape sequences designed to be used with ISO 2022 include ISO/IEC 6429, portions of which are implemented by ANSI.SYS and te ...
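The byte ranges reserved for control codes can be expressed as a small predicate. This is a sketch, not part of the standard: the function name is hypothetical, and the sample bytes are a terminal-style escape sequence (ESC, 0x1B, followed by "[1m") chosen here purely for illustration.

```python
ESC = 0x1B  # the ESC control code that begins an escape sequence


def is_control_byte(b: int) -> bool:
    """True if b lies in the ranges reserved for control codes
    (0x00-0x1F or 0x7F-0x9F) rather than graphical characters."""
    return 0x00 <= b <= 0x1F or 0x7F <= b <= 0x9F


# Sample input (an assumption of this sketch): bold on, "Hi", reset, newline.
sample = b"\x1b[1mHi\x1b[0m\n"

control_positions = [i for i, b in enumerate(sample) if is_control_byte(b)]
escape_positions = [i for i, b in enumerate(sample) if b == ESC]

print(control_positions)  # [0, 6, 10]
print(escape_positions)   # [0, 6]
```

Only the ESC bytes and the trailing line break are classified as control codes; the bytes that follow each ESC are ordinary graphical bytes that take on a special meaning because of their position in the escape sequence.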
Variable-width Encoding
A variable-width encoding is a type of character encoding scheme in which codes of differing lengths are used to encode a character set (a repertoire of symbols) for representation, usually in a computer. The most common variable-width encodings are multibyte encodings (also known as MBCS, multi-byte character set), which use varying numbers of bytes (octets) to encode different characters. (Some authors, notably in Microsoft documentation, use the term "multibyte character set", which is a misnomer, because representation size is an attribute of the encoding, not of the character set.) Early variable-width encodings using less than a byte per character were sometimes used to pack English text into fewer bytes in adventure games for early microcomputers. However, disks (which, unlike tapes, allow random access, so text can be loaded on demand), increases in computer memory, and general-purpose compression algorithms have rendered such tricks largely obsolete. Multibyte encodings are ...
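To make "codes of differing lengths" concrete, here is a brief sketch using UTF-8, a widely used multibyte encoding picked for this example. The sample characters are arbitrary choices, written as escapes so the snippet does not depend on the file's own encoding.

```python
# Four sample characters (chosen for illustration) that need 1 to 4 bytes in UTF-8.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

widths = []
for ch in samples:
    encoded = ch.encode("utf-8")
    widths.append(len(encoded))
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")

print(widths)  # [1, 2, 3, 4]
```

The same character repertoire could be encoded with a fixed width instead (for instance four bytes per character), which is why width is a property of the encoding rather than of the character set.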
Traditional Chinese Characters
Traditional Chinese characters are a standard set of Chinese character forms used to write Chinese languages. In Taiwan, the set of traditional characters is regulated by the Ministry of Education and standardized in the Standard Form of National Characters. These forms were predominant in written Chinese until the middle of the 20th century, when various countries that use Chinese characters began standardizing simplified sets of characters, often with characters that existed before as well-known variants of the predominant forms. Simplified characters as codified by the People's Republic of China are predominantly used in mainland China, Malaysia, and Singapore. "Traditional" as such is a retronym applied to non-simplified character sets in the wake of widespread use of simplified characters. Traditional characters are commonly used in Taiwan, Hong Kong, and Macau, as ...
Double-byte Character Set
A double-byte character set (DBCS) is a character encoding in which either all characters (including control characters) are encoded in two bytes, or merely every graphic character not representable by an accompanying single-byte character set (SBCS) is encoded in two bytes (Han characters would generally comprise most of these two-byte characters). A DBCS supports national languages that contain many unique characters or symbols (the maximum number of characters that can be represented with one byte is 256 characters, while two bytes can represent up to 65,536 characters). Examples of such languages include Japanese and Chinese. Hangul does not contain as many characters, but KS X 1001 supports both Hangul and Hanja, and uses two bytes per character. In CJK computing: The term DBCS traditionally refers to a character encoding where each graphic character is encoded in two bytes. In an 8-bit code, such as Big-5 or Shift JIS, a character from the DBCS is represented with a le ...
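The mix of one-byte and two-byte characters in an 8-bit DBCS can be sketched as a simple scanner. The rule used here is a deliberate simplification and an assumption of this example: any byte with the high bit set is treated as the first byte of a two-byte character, whereas real encodings such as Big-5 or Shift JIS restrict the valid ranges further. The function name and sample text are hypothetical.

```python
def count_characters(data: bytes) -> int:
    """Count characters in DBCS-encoded bytes.

    Simplifying assumption: a byte >= 0x80 starts a two-byte character;
    any other byte is a single-byte (SBCS) character.
    """
    count = 0
    i = 0
    while i < len(data):
        i += 2 if data[i] >= 0x80 else 1
        count += 1
    return count


# Sample text (chosen for illustration), encoded with Python's "big5" codec.
sample = "DBCS 編碼".encode("big5")
print(len(sample), count_characters(sample))  # 9 bytes, 7 characters

# Capacity figures quoted in the text: one byte versus two bytes.
print(2 ** 8, 2 ** 16)  # 256 65536
```

Because the second byte of a pair may fall in the ASCII range, a scanner has to walk the data from the start of a character; jumping into the middle of the byte string can misread a second byte as a standalone character.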
CJK Characters
In internationalization, CJK characters is a collective term for graphemes used in the Chinese, Japanese, and Korean writing systems, which each include Chinese characters. It can also go by CJKV to include Chữ Nôm, the Chinese-origin logographic script formerly used for the Vietnamese language, or CJKVZ to also include Sawndip, used to write the Zhuang languages. Character repertoire: Standard Mandarin Chinese and Standard Cantonese are written almost exclusively in Chinese characters. Over 3,000 characters are required for general literacy, with up to 40,000 characters for reasonably complete coverage. Japanese uses fewer characters: general literacy in Japanese can be expected with 2,136 characters. The use of Chinese characters in Korea is increasingly rare, although idiosyncratic use of Chinese characters in proper names requires knowledge (and therefore availability) of many more characters. Even today, however, some South Korean students learn 1,800 character ...
ETen Chinese System
ETen Chinese System (倚天中文系統) was the most popular DOS-compatible traditional Chinese operating system before Chinese Windows 95. DOS did not support Chinese characters, which are not in Extended ASCII. Many companies in Taiwan, such as Kuo Chiao (國喬) and Acer, developed their own IBM PC compatible traditional Chinese operating systems running on DOS, and these systems were mutually incompatible. The developer of the ETen OS, E-TEN, earned its early profits from sales of its hardware-based (plug-in card) Chinese system products. Its software-only Chinese systems were widely copied by traditional Chinese users and software pirates, which was difficult for E-TEN to control. Most traditional Chinese products were compatible with the ETen OS at that time. When Microsoft ...