VISCII is an unofficially-defined modified ASCII character encoding for using the Vietnamese language with computers. It should not be confused with the similarly-named officially registered VSCII encoding. VISCII keeps the 95 printable characters of ASCII unmodified, but it replaces 6 of the 33 control characters with printable characters. It adds 128 precomposed characters. Unicode and the Windows-1258 code page are now used for virtually all Vietnamese computer data, but legacy VSCII and VISCII files may need conversion.
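Converting legacy VISCII bytes to Unicode amounts to a table lookup, since every byte maps to exactly one character. The following Python sketch assumes no built-in VISCII codec is available in the runtime and instead reads a mapping table written as `0xNN 0xNNNN` pairs with `#` comments. That line format, the helper names `load_mapping` and `decode_with_table`, and the three-line `SAMPLE_TABLE` excerpt are assumptions made for illustration; a real conversion needs the complete 256-entry table from an authoritative source.

```python
# Illustrative excerpt only, not the full VISCII table.
SAMPLE_TABLE = """
# byte   Unicode
0x02     0x1EB2    # one of the six reassigned control positions
0xE0     0x00E0    # same position as in ISO-8859-1
0xE1     0x00E1    # same position as in ISO-8859-1
"""

def load_mapping(table_text):
    """Parse '0xNN 0xNNNN' lines into a {byte value: character} dict."""
    mapping = {}
    for line in table_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        if len(fields) < 2:
            continue
        mapping[int(fields[0], 16)] = chr(int(fields[1], 16))
    return mapping

def decode_with_table(data, mapping):
    """Decode bytes using the table.

    Bytes absent from the table pass through unchanged if they are in the
    7-bit range, and become U+FFFD otherwise.
    """
    out = []
    for b in data:
        if b in mapping:
            out.append(mapping[b])
        elif b < 0x80:
            out.append(chr(b))
        else:
            out.append("\ufffd")
    return "".join(out)

table = load_mapping(SAMPLE_TABLE)
print(decode_with_table(b"l\xe0 m\xe1", table))
```

With only a partial table loaded, unmapped high bytes show up as U+FFFD, which makes gaps in the table easy to spot in the output.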
History and naming
VISCII was designed by the Vietnamese Standardization Working Group (Viet-Std Group), led by Christopher Cuong T. Nguyen, Cuong M. Bui, and Hoc D. Ngo and based in Silicon Valley, California, in 1992, while they were working with the Unicode Consortium to include precomposed Vietnamese characters in the Unicode standard. VISCII, along with VIQR, was first published in a bilingual report in September 1992, in which it was dubbed the "Vietnamese Standard Code for Information Interchange".
The report noted a proliferation in computer usage in Vietnam and an increasing volume of computer-based communication among Vietnamese abroad. It observed that existing applications used vendor-specific encodings which were unable to interoperate with one another, and concluded that standardisation between vendors was therefore necessary. The successful inclusion of composed and precomposed Vietnamese in Unicode 1.0 was the result of the lessons learned from the development of 8-bit VISCII and 7-bit VIQR.
The next year, in 1993, Vietnam adopted TCVN 5712, its first national standard in the information technology domain. This defined a character encoding named VSCII, which had been developed by the TCVN Technical Committee on Information Technology (TCVN/TC1) and whose name stands for "Vietnamese Standard Code for Information Interchange". VSCII is incompatible with, and otherwise unrelated to, the earlier-published VISCII. Unlike VISCII, VSCII is a "Vietnamese Standard" in the sense of a national standard.
VISCII and VIQR were approved as the informational-status RFC 1456, attributed to the Viet-Std group and dated May 1993. As is usual for informational RFCs, RFC 1456 notes them to be "conventions" used by overseas Vietnamese speakers on Usenet, and that it "specifies no level of standard". In spite of this, it continues to call VISCII the "VIetnamese Standard Code for Information Interchange" (the same name taken by VSCII). The labels VISCII and csVISCII are registered with the IANA for VISCII, with reference to RFC 1456. (There is, on the other hand, no official IANA label for TCVN 5712 / VSCII, although x-viet-tcvn5712 was previously supported by Mozilla Firefox.)
Design
A traditional extended ASCII character set consists of the ASCII set plus up to 128 characters. Vietnamese requires 134 additional letter-diacritic combinations, which is six too many. There are (short of dropping tone mark support for capital letters, as in VSCII-3) essentially four different ways to handle this problem:
# Use variable-width encoding (as does UTF-8)
# Include combining diacritical marks for tone marks (as do VSCII-2 and Windows-1258) or for diacritics in general (as do ANSEL and VNI)
# Replace some ASCII punctuation, preferably punctuation which is not invariant in ISO 646 (as does VNI for DOS)
# Replace at least six of the basic ASCII control characters (as do VPS and VSCII-1)
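The figure of 134 combinations that motivates these options can be reproduced by enumeration. The Python sketch below assumes the usual breakdown of the Vietnamese alphabet, which is not spelled out in the text above: 12 vowel letters, each taking 5 tone marks, in two cases, plus 7 toneless modified letters (ă, â, ê, ô, ơ, ư, đ) in two cases. It relies on Unicode NFC normalization to confirm that each combination has a single precomposed code point.

```python
import unicodedata

# Normalize the literals so the code does not depend on how this file was saved.
VOWELS = unicodedata.normalize("NFC", "aăâeêioôơuưy")   # 12 vowel letters
MODIFIED = unicodedata.normalize("NFC", "ăâêôơưđ")       # 7 toneless non-ASCII letters
TONES = "\u0300\u0301\u0309\u0303\u0323"  # grave, acute, hook above, tilde, dot below

def vietnamese_non_ascii_letters():
    """Return the set of precomposed Vietnamese letters that ASCII lacks."""
    letters = set()
    for base in VOWELS + VOWELS.upper():
        for tone in TONES:
            letters.add(unicodedata.normalize("NFC", base + tone))
    letters.update(MODIFIED + MODIFIED.upper())
    return letters

letters = vietnamese_non_ascii_letters()
assert all(len(ch) == 1 for ch in letters)  # every combination is one code point
needed = len(letters)
available = 128                              # upper half of an 8-bit code page
print(needed, needed - available)            # 134 6
```

The count works out as 12 × 5 × 2 = 120 tone-marked vowels plus 7 × 2 = 14 modified letters, which is six more than the 128 positions above ASCII.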
VISCII took the last option, replacing six of the least problematic C0 control codes, that is, those least likely to be recognised by an application and acted on specially (STX, ENQ, ACK, DC4, EM, and RS), with six of the least-used uppercase letter-diacritic combinations. While this option may cause programs that use those control codes to malfunction when handling VISCII text, it creates fewer complications than the other three options (the designers note that non-8-bit clean transmission had been found to pose more difficulty in practice than the control character re-use).
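One practical consequence is that a byte stream containing any of these six values may be VISCII text rather than ASCII with embedded control codes. The Python sketch below lists the six positions by their standard ASCII byte values and reports where they occur in a buffer. The letters paired with each position follow the commonly published VISCII table as recalled here and should be checked against RFC 1456 before being relied on; the helper name `find_reassigned` is hypothetical.

```python
# byte value -> (ASCII control name, VISCII letter)
# The letter assignments are an assumption to verify against RFC 1456.
REASSIGNED = {
    0x02: ("STX", "\u1eb2"),  # A with breve and hook above
    0x05: ("ENQ", "\u1eb4"),  # A with breve and tilde
    0x06: ("ACK", "\u1eaa"),  # A with circumflex and tilde
    0x14: ("DC4", "\u1ef6"),  # Y with hook above
    0x19: ("EM",  "\u1ef8"),  # Y with tilde
    0x1E: ("RS",  "\u1ef4"),  # Y with dot below
}

def find_reassigned(data):
    """Return (offset, control name, letter) for each reassigned byte in data."""
    return [(i, *REASSIGNED[b]) for i, b in enumerate(data) if b in REASSIGNED]

for offset, name, letter in find_reassigned(b"AB\x02C\x1e"):
    print(f"offset {offset}: ASCII {name}, VISCII {letter}")
```

A scan like this does not prove that a file is VISCII, but it flags the bytes that a control-code-aware program would be most likely to mishandle.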
Nonetheless, the locations of both C0 and C1 control characters, and the codes used for the non-breaking space in ISO-8859-1, Mac OS Roman and OEM-US, were deliberately assigned to uppercase letters. The intention was that using the lowercase code points with an all-capital font would be a serviceable workaround if graphical characters could not be displayed for those codes.
However, using up all the extended code points for accented letters left no room for the useful symbols, superscripted numbers, curved quotes, proper dashes, etc., that most other extended ASCII character sets provide.
The location of characters deliberately mostly follows ISO-8859-1 where there are characters in common between the two code pages (the uppercase Õ being noted as an exception), motivated by user-friendliness concerns.
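The set of characters that the two code pages have in common can be listed without a VISCII table, by intersecting the Vietnamese letters with the upper half of ISO-8859-1. The Python sketch below does this under the same assumed breakdown of the Vietnamese alphabet (12 vowels, 5 tone marks, 7 modified letters); it shows only the ISO-8859-1 positions and makes no claim about where VISCII itself places each letter.

```python
import unicodedata

VOWELS = unicodedata.normalize("NFC", "aăâeêioôơuưy")
MODIFIED = unicodedata.normalize("NFC", "ăâêôơưđ")
TONES = "\u0300\u0301\u0309\u0303\u0323"  # grave, acute, hook above, tilde, dot below

def shared_with_latin1():
    """Map each Vietnamese letter found in ISO-8859-1 to its byte value there."""
    viet = {
        unicodedata.normalize("NFC", vowel + tone)
        for vowel in VOWELS + VOWELS.upper()
        for tone in TONES
    }
    viet |= set(MODIFIED + MODIFIED.upper())
    # In ISO-8859-1 the byte value equals the Unicode code point.
    return {ch: ord(ch) for ch in viet if 0xA0 <= ord(ch) <= 0xFF}

shared = shared_with_latin1()
for ch, byte in sorted(shared.items(), key=lambda item: item[1]):
    print(f"0x{byte:02X} {ch}")
```

Under these assumptions the intersection has 32 members, 16 uppercase and 16 lowercase, with uppercase Õ sitting at 0xD5 in ISO-8859-1.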
Support
VISCII is partially supported by the TriChlor Software Group in California, which has released various VISCII-compliant software packages, libraries, and fonts for MS-DOS and Windows, Unix, and Macintosh. VISCII-compliant software is available at many FTP sites.
VISCII was historically offered as an encoding for outgoing email by Mozilla Thunderbird. It was also supported by the Windows Vietnamese keyboard software, WinVNKey, created by Christopher Cuong T. Nguyen and later upgraded through various Windows versions by Hoc D. Ngo and others.
VISCII was mostly used by overseas Vietnamese speakers, with VSCII (TCVN) being more popular in northern Vietnam and VNI being more popular in southern Vietnam.
Character set
See also
* ASCII
* Vietnamese Quoted-Readable (VIQR)
* Vietnamese Standard Code for Information Interchange (VSCII)
* Windows-1258
References
Further reading
*https://www.math.nmsu.edu/~mleisher/Software/csets/VISCII.TXT
External links
* RFC 1456 - Conventions for Encoding the Vietnamese Language
* Vietnamese-Standardization Working Group, based in California
* VISCII-compliant software and fonts for MS-DOS and Windows
* VISCII-compliant software, libraries, and fonts for Unix
* WinVNKey Vietnamese keyboard driver for Windows supporting multinational character sets, including VISCII
* MacVNKey VISCII-compliant keyboard driver for Macintosh classic