VSCII (Vietnamese Standard Code for Information Interchange), also known as TCVN 5712, ISO-IR-180, .VN, ABC or simply the TCVN encodings, is a set of three closely related Vietnamese national standard character encodings for using the Vietnamese language with computers, developed by the TCVN Technical Committee on Information Technology (TCVN/TC1) and first adopted in 1993 (as TCVN 5712:1993). It should not be confused with the similarly named unofficial VISCII encoding, which was sometimes used by overseas Vietnamese speakers. VISCII was also intended to stand for ''Vietnamese Standard Code for Information Interchange'', but is not related to VSCII. VSCII (TCVN) was used extensively in the north of Vietnam, while VNI was popular in the south. Unicode and the Windows-1258 code page are now used for virtually all Vietnamese computer data, but legacy files or archived messages may need conversion.


Encodings

All three forms of VSCII keep the 95 printable characters of ASCII unmodified.

VSCII-3, also known as TCVN 5712-3, VN3 or simply TCVN3, includes the fewest assignments. It is an extended ASCII, because it keeps all 128 codes of ASCII unmodified. It does not reassign any of the C0 and C1 control codes. Compared to ASCII, it adds 75 characters:
* 67 lowercase characters, allowing full lowercase support.
* 7 uppercase characters, allowing uppercase support for the 29 base letters without tone marks.
* The non-breaking space.
Tone marks on uppercase vowels are obtained in TCVN3 by switching to an all-capital font.

VSCII-2, also known as TCVN 5712-2 and VN2, is a superset of VSCII-3. It is an extended ASCII, because it keeps all 128 codes of ASCII unmodified. It does not reassign any of the C0 and C1 control codes, making it conformant with ISO 2022 as a 96-set. Compared to VSCII-3, it adds the following, for a total of 96 non-ASCII characters:
* 16 more uppercase characters with pre-composed tone marks, for a total of 23 non-ASCII uppercase characters.
* 5 combining diacritics for tone marks, allowing other combinations of uppercase letters and tone marks to be represented. Combining marks follow the base letter as in VNI (rather than preceding it as in ANSEL).

VSCII-1, also known as TCVN 5712-1 and VN1, is an extension of VSCII-2, and is a modified ASCII, since it replaces 12 of the 33 control characters with precomposed characters. Compared to VSCII-2, it makes the following changes, for a total of 140 non-ASCII characters:
* It adds 44 more pre-composed uppercase letters, bringing them to the same count as the lowercase.
* It does this by replacing 12 ASCII control characters and allocating 32 graphical characters to the C1 control area, breaking ISO 2022 compatibility.

Conversion from VSCII-3 to VSCII-2 or VSCII-1, and from VSCII-2 to VSCII-1, is not necessary, but can result in smaller files. Conversion in the opposite direction (from VSCII-1 to VSCII-2 or VSCII-3, and from VSCII-2 to VSCII-3) requires expansion of some pre-composed characters.
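
The expansion of pre-composed uppercase letters into a base letter followed by a combining tone mark can be illustrated at the Unicode level. The following is only a minimal sketch, not a TCVN codec: it works on Unicode strings rather than VSCII bytes, the function names `split_tone` and `expand_uppercase` are hypothetical, and it expands every toned uppercase letter, whereas a real converter would expand only the letters missing from the target table.

```python
import unicodedata

# The five Vietnamese tone marks as Unicode combining characters.
TONE_MARKS = {
    "\u0300",  # grave
    "\u0301",  # acute
    "\u0303",  # tilde
    "\u0309",  # hook above
    "\u0323",  # dot below
}


def split_tone(ch: str) -> str:
    """Return ch as its base letter followed by its tone mark, if any.

    The vowel-quality diacritic (breve, circumflex, horn) stays fused
    to the base letter; only the tone mark is separated and moved to
    the end, so the mark follows the base letter.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    tones = [c for c in decomposed if c in TONE_MARKS]
    rest = "".join(c for c in decomposed if c not in TONE_MARKS)
    base = unicodedata.normalize("NFC", rest)
    return base + "".join(tones)


def expand_uppercase(text: str) -> str:
    """Expand toned uppercase letters; leave all other characters as-is.

    Assumption for illustration: lowercase letters stay pre-composed,
    while every uppercase letter carrying a tone mark is split.
    """
    text = unicodedata.normalize("NFC", text)
    return "".join(split_tone(c) if c.isupper() else c for c in text)
```

Calling `expand_uppercase` on an all-capital word lengthens it by one combining character for each toned vowel, which is the reverse of the file-size saving described above. Normalizing the result back to NFC restores the pre-composed form.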


Character set


References


External links


tables with Unicode points and names