Thai Industrial Standard 620-2533, commonly referred to as TIS-620, is the most common single-byte character encoding for the Thai language. The standard is published by the Thai Industrial Standards Institute (TISI), an organ of the Ministry of Industry under the Royal Thai Government, and is the sole official standard for encoding Thai in Thailand. The descriptive name of the standard is "Standard for Thai Character Codes for Computers" (Thai: รหัสสำหรับอักขระไทยที่ใช้กับคอมพิวเตอร์). "2533" refers to year 2533 of the Buddhist Era (1990), the year the present version of the standard was published; a previous revision, TIS 620-2529 (1986), is now obsolete. The code page layout is the same in both editions. "TIS-620" is also the IANA preferred charset name for the encoding, and the same charset name is used for ISO/IEC 8859-11 (which adds a no-break space character at 0xA0, a position that is unassigned in TIS-620). When the IANA name is used, the codes are supplemented with the C0 and C1 control codes from ISO/IEC 6429.


Structure

TIS-620 is a conventionally structured extended ASCII national character set that retains full compatibility with 7-bit ASCII and uses the 8-bit range hex A1 to FB for encoding the Thai alphabet. Due to the complex combining nature of Thai vowels and diacritics, TIS-620 is intended for information interchange only, and an additional display engine is required to compose characters correctly.


Variants

A nearly identical version of TIS-620 was adopted as ISO/IEC 8859-11 in 2001, the sole difference being that ISO/IEC 8859-11 defines hex A0 as a non-breaking space, while TIS-620 leaves it undefined but reserved. (In practice, this small distinction is usually ignored.) The ISO/IEC 8859-11 set has also been registered as ISO-IR-166 by Ecma International, but this variation adds explicit escape codes for signaling the beginning and end of Thai character sequences. The TIS-620 character set ordering has been used essentially as is within Unicode (ISO/IEC 10646) as well. Unicode's Thai block is U+0E00 through U+0E7F, with the Thai characters starting at U+0E01, and TIS-620 Thai characters can be converted to UTF-16 simply by subtracting hex A0 from the byte value and prefixing the result with 0E (for example, byte A1 becomes U+0E01).
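The offset rule in the paragraph above can be sketched in a few lines of Python. This is a minimal illustration, not a reference decoder: the function name `tis620_to_unicode` is hypothetical, the choice to pass bytes below hex 80 through as ASCII (including control codes) is an assumption, and the rejected byte ranges are the unassigned positions listed in the Character set section.

```python
def tis620_to_unicode(data: bytes) -> str:
    """Decode TIS-620 bytes with the 'subtract A0, prefix 0E' rule (sketch)."""
    chars = []
    for byte in data:
        if byte < 0x80:
            # 7-bit range is ASCII-compatible.
            # Assumption: control codes and DEL pass through unchanged.
            chars.append(chr(byte))
        elif 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB:
            # Thai range: subtract hex A0, then place the result in row 0E.
            chars.append(chr(0x0E00 + (byte - 0xA0)))
        else:
            # 80-A0, DB-DE and FC-FF are not assigned by TIS-620.
            raise ValueError(f"byte 0x{byte:02X} is not assigned in TIS-620")
    return "".join(chars)


# Show the resulting code points rather than the glyphs themselves.
print([f"U+{ord(c):04X}" for c in tis620_to_unicode(b"\xa1\xa2")])
# → ['U+0E01', 'U+0E02']
```

A decoder meant for ISO/IEC 8859-11 input would additionally map byte A0 to the no-break space U+00A0 instead of rejecting it.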


Character set

In the TIS-620 code table, 20 is the regular SPACE character. Code values 00-1F, 7F, 80-9F, A0, DB-DE and FC-FF are not assigned to characters by TIS-620. Code values D1, D4-DA and E7-EE are combining characters.
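The ranges listed above translate directly into a small lookup. The following Python sketch classifies each byte value using only those ranges; the names `classify`, `UNASSIGNED` and `COMBINING`, and the four category labels, are illustrative choices rather than anything defined by the standard.

```python
# Inclusive byte ranges taken from the paragraph above.
UNASSIGNED = [(0x00, 0x1F), (0x7F, 0x7F), (0x80, 0x9F),
              (0xA0, 0xA0), (0xDB, 0xDE), (0xFC, 0xFF)]
COMBINING = [(0xD1, 0xD1), (0xD4, 0xDA), (0xE7, 0xEE)]


def in_ranges(value: int, ranges) -> bool:
    """Return True if value falls inside any inclusive (low, high) range."""
    return any(low <= value <= high for low, high in ranges)


def classify(byte: int) -> str:
    """Label a byte value as 'unassigned', 'combining', 'ascii' or 'thai'."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("TIS-620 is a single-byte encoding")
    if in_ranges(byte, UNASSIGNED):
        return "unassigned"
    if in_ranges(byte, COMBINING):
        return "combining"
    if byte < 0x80:
        return "ascii"   # printable ASCII, 20-7E
    return "thai"        # remaining (non-combining) Thai characters


# Tally all 256 byte values by category.
counts = {}
for b in range(256):
    kind = classify(b)
    counts[kind] = counts.get(kind, 0) + 1
print(counts)
# → {'unassigned': 74, 'ascii': 95, 'thai': 71, 'combining': 16}
```

The 71 non-combining and 16 combining positions together give the 87 Thai code points in the range A1 to FB.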



External links


* Official reference (in Thai)
* Announcement in Royal Gazette of TIS 620-2533 and TIS 620-2529