Vietnamese Language And Computers

The Vietnamese language is written with a Latin script with diacritics (including tone marks), which requires several accommodations when typing on phones or computers. Software-based input systems, either built into the device or installed as third-party software such as UniKey, make it possible to type Vietnamese on these devices. Telex is the oldest input method devised to encode the Vietnamese language with its tones. Other input methods include VNI (based on the number keys) and VIQR. The VNI input method is not to be confused with the VNI code page.

Historically, Vietnamese was also written in chữ Nôm, which in recent times is used mainly for ceremonial and traditional purposes and otherwise remains the domain of historians and philologists. There have been attempts to type chữ Hán and chữ Nôm with existing Vietnamese input methods, but they are not widespread. Vietnamese is sometimes typed without tone marks, which Vietnamese speakers can usually infer from context.


Fonts and character encodings


Vietnamese alphabet


Character encodings

There are as many as 46 character encodings for representing the Vietnamese alphabet. Unicode has become the most popular encoding for many of the world's writing systems, due to its broad compatibility and software support. Diacritics may be encoded either as combining characters or as precomposed characters, which are scattered throughout the Latin-1 Supplement, Latin Extended-A, Latin Extended-B, and Latin Extended Additional blocks. The Vietnamese đồng symbol (₫) is encoded in the Currency Symbols block.

Unicode's coverage of Vietnamese has been subject to several changes since the 1990s. Early versions of Unicode encoded the Vietnamese grave (huyền) and acute (sắc) tone marks as U+0340 COMBINING GRAVE TONE MARK and U+0341 COMBINING ACUTE TONE MARK, respectively. In 2001, these two characters were deprecated as duplicate encodings of U+0300 COMBINING GRAVE ACCENT and U+0301 COMBINING ACUTE ACCENT; this change was incorporated into Unicode 3.2, released in 2002. With the 2009 release of Unicode 5.2, U+0340 and U+0341 were undeprecated but discouraged.

Historically, the Vietnamese language used other characters beyond the modern alphabet. The Middle Vietnamese letter B with flourish (ꞗ) is included in the Latin Extended-D block. The apex is not separately encoded in Unicode, because it derives from the Portuguese tilde, whereas the ngã tone mark, which derives from the Greek perispomeni, has always been misencoded as a tilde. As a workaround, a substitute character represents the apex on Wikisource and Wiktionary.

For systems that lack support for Unicode, dozens of 8-bit Vietnamese code pages have been designed. The most commonly used of them were VISCII, VSCII (TCVN 5712:1993), VNI, VPS and Windows-1258. Where ASCII is required, such as when ensuring readability in plain-text e-mail, Vietnamese letters are often encoded according to Vietnamese Quoted-Readable (VIQR) or VSCII Mnemonic (VSCII-MNEM), though usage of either variable-width scheme has declined dramatically following the adoption of Unicode on the World Wide Web. For instance, support for all of the above-mentioned 8-bit encodings, with the exception of Windows-1258, was dropped from Mozilla software in 2014.

Many Vietnamese fonts intended for desktop publishing are encoded in VNI or TCVN3 (VSCII). Such fonts are known as "ABC fonts". Popular web browsers lack support for specialty Vietnamese encodings, so any webpage that uses these fonts appears as unintelligible mojibake on systems without them installed.

Vietnamese often stacks diacritics, so typeface designers must take care to prevent stacked diacritics from colliding with adjacent letters or lines. When a tone mark is used together with another diacritic, offsetting the tone mark to the right preserves consistency and avoids slowing down saccades. In advertising signage and in cursive handwriting, diacritics often take forms unfamiliar to other Latin alphabets. For example, the lowercase letter I retains its tittle in ì, ỉ, ĩ, and í. These nuances are rarely accounted for in computing environments.
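The difference between the combining and precomposed representations described in this section can be observed with Python's standard `unicodedata` module. The following is a minimal sketch only; the sample word and the helper names `code_points` and `same_text` are chosen here for illustration and do not come from any Vietnamese standard.

```python
import unicodedata

word = "Vi\u1ec7t"  # "Việt", written with the precomposed letter U+1EC7

# NFC prefers precomposed characters; NFD splits them into a base letter
# followed by combining marks.
nfc = unicodedata.normalize("NFC", word)
nfd = unicodedata.normalize("NFD", word)


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def same_text(a, b):
    """Compare two strings regardless of how their diacritics are encoded."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(code_points(nfc))  # one code point per letter
print(code_points(nfd))  # "e" is followed by two combining marks
print(nfc == nfd, same_text(nfc, nfd))
```

Because the two forms look identical on screen but differ code point by code point, software that searches or compares Vietnamese text usually normalizes both sides first, as `same_text` does.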


Approaches to character encoding

Vietnamese writing requires 134 additional letters (between both cases) besides the 52 already present in ASCII. This exceeds the 128 additional characters available in a conventional extended ASCII encoding. Although this can be solved by using a variable-width encoding (as is done by UTF-8), other encodings have used a number of approaches to support Vietnamese without doing so:

* Replace at least six ASCII characters, selected for being uncommon in Vietnamese and/or for being non-invariant in ISO 646 or DEC NRCS (as in VNI for DOS).
* Drop the uppercase letters which are least frequently used, or all uppercase letters with tone marks (as in VSCII-3 (TCVN3)). These letters may still be supplied by means of all-capital fonts.
* Drop forms of the letter Y with tone marks, necessitating use of the letter I in those circumstances. This approach was rejected by the designers of VISCII on the basis that a character encoding should not attempt to settle a spelling reform issue.
* Replace at least six C0 control characters (as in VISCII, VSCII-1 (TCVN1) and VPS).
* Use combining characters, allowing one vowel with accents to be fully represented using a sequence of characters (as in VNI, VSCII-2 (TCVN2), Windows-1258 and ANSEL).
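The last approach in the list, combining characters, can be sketched with Python's built-in `cp1258` codec for Windows-1258. The codec does not decompose letters by itself, so this sketch first separates each tone mark from its letter. The helper names `split_tone_marks`, `to_cp1258` and `from_cp1258` are hypothetical, and the sketch assumes that only the five tone marks need to be split off while vowel-quality marks (circumflex, breve, horn) stay attached to their base letter.

```python
import unicodedata

# The five Vietnamese tone marks as Unicode combining characters:
# grave, acute, tilde, hook above, dot below.
TONE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}


def split_tone_marks(text):
    """Rewrite each letter as (letter without tone) + (combining tone mark)."""
    pieces = []
    for ch in unicodedata.normalize("NFC", text):
        decomposed = unicodedata.normalize("NFD", ch)
        tones = "".join(c for c in decomposed if c in TONE_MARKS)
        rest = "".join(c for c in decomposed if c not in TONE_MARKS)
        # Recompose the base letter with its quality mark, e.g. e + ^ -> ê.
        pieces.append(unicodedata.normalize("NFC", rest) + tones)
    return "".join(pieces)


def to_cp1258(text):
    """Encode Vietnamese text as Windows-1258 bytes (assumed helper)."""
    return split_tone_marks(text).encode("cp1258")


def from_cp1258(data):
    """Decode Windows-1258 bytes and recompose the letters."""
    return unicodedata.normalize("NFC", data.decode("cp1258"))


original = "Ti\u1ebfng Vi\u1ec7t"  # "Tiếng Việt"
encoded = to_cp1258(original)
print(len(original), "characters ->", len(encoded), "bytes")
print(from_cp1258(encoded) == original)
```

Each toned vowel costs two bytes in this scheme, which is how an 8-bit code page can represent all the letters without assigning a separate byte value to each one.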


Unicode code points

The following table provides Unicode code points for all non-ASCII Vietnamese letters.
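The code points in question can also be enumerated programmatically. The sketch below, which relies only on Python's standard `unicodedata` module, combines the twelve vowel bases with the six tones, adds Đ and đ, and keeps every resulting letter that lies outside ASCII. The function names and the layout of the printed listing are assumptions made for this illustration.

```python
import unicodedata

# Twelve vowel bases (some carrying a breve, circumflex or horn) and six
# tones; the first tone is the unmarked one.
VOWELS = ["a", "a\u0306", "a\u0302", "e", "e\u0302", "i",
          "o", "o\u0302", "o\u031b", "u", "u\u031b", "y"]
TONES = ["", "\u0300", "\u0309", "\u0303", "\u0301", "\u0323"]

# Unicode blocks named earlier in this article, as inclusive ranges.
BLOCKS = [
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0180, 0x024F, "Latin Extended-B"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
]


def non_ascii_vietnamese_letters():
    """Return every precomposed Vietnamese letter that is not in ASCII."""
    letters = []
    for vowel in VOWELS:
        for tone in TONES:
            for base in (vowel.upper(), vowel):
                letter = unicodedata.normalize("NFC", base + tone)
                if not letter.isascii():
                    letters.append(letter)
    letters += ["\u0110", "\u0111"]  # Đ and đ
    return letters


def block_of(letter):
    """Name the Unicode block that contains a single-character letter."""
    cp = ord(letter)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "other"


letters = non_ascii_vietnamese_letters()
for letter in letters:
    print(f"U+{ord(letter):04X}  {letter}  {block_of(letter)}")
print(len(letters), "letters")
```

The total agrees with the figure of 134 additional letters given above, and the per-block counts show how the precomposed letters are scattered across four blocks.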


Font substitution

Many fonts support a subset of the Latin writing system that omits much of the Vietnamese alphabet. Due to the high density of Vietnamese-specific characters in Vietnamese text, Web browsers that implement font substitution reliably produce a ransom note effect when the webpage specifies an inadequate font.


Chữ Nôm

Unicode includes over 10,000 chữ Nôm characters as part of its repertoire of CJK Unified Ideographs. Of these characters, 10,082 can be found in the CJK Unified Ideographs Extension B block, while the rest are distributed between the CJK Unified Ideographs, CJK Unified Ideographs Extension A, and CJK Unified Ideographs Extension C blocks. A further 1,028 characters, including over 400 characters specific to the Tày language, are encoded in the CJK Unified Ideographs Extension E block.

The characters are taken from the Vietnamese standards TCVN 5773:1993 and TCVN 6909:2001 [error for TCVN 6056:1995?] as well as from research by the Han-Nom Research Institute and other groups. All the characters in TCVN 5773:1993 and about 95% of the characters in TCVN 6909:2001 [error for TCVN 6056:1995?] have corresponding code points in Unicode 5.1, though TCVN 5773:1993 itself mapped most of its characters to the Private Use Area of Unicode. Unicode 13.0 added two diacritical characters to the Ideographic Symbols and Punctuation block that were commonly used to indicate borrowed characters in chữ Nôm.

The two most comprehensive chữ Nôm fonts are the Vietnamese Nôm Preservation Foundation's Nom Na Tong Light and the community-developed HAN NOM A/HAN NOM B, both of which place a large number of unstandardized characters in the Private Use Areas. The Unicode Consortium's Unihan database includes Vietnamese readings of some characters but does not distinguish between Sino-Vietnamese and chữ Nôm readings.

Like other CJKV writing systems, chữ Nôm is traditionally written vertically, from top to bottom and right to left. Chữ Hán and chữ Nôm may also be annotated using ruby characters written in chữ Quốc ngữ.
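To check which of the blocks named above a given chữ Nôm text draws on, or whether it depends on Private Use Area code points that only a particular font can display, each character can be classified by code point range. This is a rough sketch: the block ranges are those published in the Unicode block list, while the function names and the sample code points are assumptions made for this illustration.

```python
from collections import Counter

# Inclusive code point ranges of the blocks mentioned in this section.
CJK_BLOCKS = [
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B"),
    (0x2A700, 0x2B73F, "CJK Unified Ideographs Extension C"),
    (0x2B820, 0x2CEAF, "CJK Unified Ideographs Extension E"),
    (0xE000, 0xF8FF, "Private Use Area"),
    (0xF0000, 0xFFFFD, "Supplementary Private Use Area-A"),
    (0x100000, 0x10FFFD, "Supplementary Private Use Area-B"),
]


def block_of_code_point(cp):
    """Return the name of the listed block containing cp, or 'other'."""
    for start, end, name in CJK_BLOCKS:
        if start <= cp <= end:
            return name
    return "other"


def count_by_block(text):
    """Count how many characters of text fall into each listed block."""
    return Counter(block_of_code_point(ord(ch)) for ch in text)


# Sample text built from arbitrary code points, one per kind of block.
sample = chr(0x5583) + chr(0x21A38) + chr(0xE000)
for name, count in count_by_block(sample).items():
    print(f"{count}  {name}")
```

A nonzero count for any Private Use Area suggests the text will only render as intended with the font it was prepared for.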


Text input

A purely physical Vietnamese keyboard would be impractical, due to the sheer number of letter-diacritic-diacritic combinations in the alphabet, e.g. ờ, ị. Instead, Vietnamese input relies on formulaic software-based keyboard layouts, virtual keyboards, or input methods (also known as IMEs).
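As one illustration of such a formulaic scheme, VIQR writes each diacritic as an ASCII punctuation mark placed after its base letter, and "dd" for đ. The sketch below converts that notation to Unicode. It is deliberately simplified and is not a complete VIQR implementation: the function name `viqr_to_unicode` is hypothetical, escape sequences are not handled, and a full stop that directly follows a vowel is always read as a dot-below tone mark.

```python
import unicodedata

# VIQR punctuation and the combining character each one stands for.
VIQR_MARKS = {
    "`": "\u0300",  # grave
    "'": "\u0301",  # acute
    "?": "\u0309",  # hook above
    "~": "\u0303",  # tilde
    ".": "\u0323",  # dot below
    "^": "\u0302",  # circumflex
    "(": "\u0306",  # breve
    "+": "\u031b",  # horn
}
VOWELS = set("aeiouyAEIOUY")


def viqr_to_unicode(text):
    """Convert simplified VIQR notation to precomposed Unicode text."""
    out = []
    after_vowel = False  # True while marks may still attach to a vowel
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in "dD" and i + 1 < len(text) and text[i + 1] in "dD":
            out.append("\u0111" if ch == "d" else "\u0110")  # đ or Đ
            after_vowel = False
            i += 2
            continue
        if after_vowel and ch in VIQR_MARKS:
            out.append(VIQR_MARKS[ch])  # stack the mark on the vowel
        else:
            out.append(ch)
            after_vowel = ch in VOWELS
        i += 1
    # NFC reorders the stacked marks and merges them into single letters.
    return unicodedata.normalize("NFC", "".join(out))


print(viqr_to_unicode("Tie^'ng Vie^.t"))
print(viqr_to_unicode("ddo^`ng"))
```

The same table-driven structure would apply to other key-sequence conventions; only the mapping from keystrokes to combining marks changes.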


Keyboard layouts

Vietnamese keyboard layouts rely on dead keys to compose letters with diacritics. Most desktop operating systems include a Vietnamese keyboard layout similar to TCVN 6064:1995, a Vietnamese national standard. Previously, typewriters used an AZERTY-based Vietnamese layout (AĐERTY).


Input methods

The three most common Vietnamese input methods are Telex, VNI, and VIQR. Telex indicates diacritics using letters that are unlikely to appear at the end of a word, while VNI repurposes the number keys or function keys and VIQR repurposes various punctuation marks. The Telex and VIQR conventions originated in an earlier era of telex machines and typewriters, respectively.

Support for these input methods is provided by input method editors (IMEs), which are known in Vietnamese as bộ gõ, literally "peckers", "typing sets" or, in more general terms, "percussion". IMEs may be provided by the operating system, installed as a third-party application, installed as a browser extension, or provided by an individual website in the form of a script. Common third-party applications include GoTiengViet, UniKey, VietKey, VPSKeys, WinVNKey, and xvnkb. On Unix-like operating systems, the IBus and SCIM frameworks both support Vietnamese. IME scripts such as AVIM, Mudim, and VietTyping can be found on most Vietnamese message boards, the Vietnamese Wikipedia, and other text-intensive websites. The Vietnamese Web browser Cốc Cốc comes with a built-in input method.

Input methods allow words to be composed in a more flexible order than keyboard layouts allow. With the TCVN 6064:1995 keyboard layout, the keys for a word's letters and diacritics must be typed in a fixed order. By contrast, most IMEs for Telex, VNI, or VIQR permit the user to insert diacritics at the end of the word. Some IMEs even allow diacritics to be entered before their base letters. Depending on an IME's implementation, it may also be possible to edit an existing word's diacritics without retyping the word.

Some virtual keyboards supplement the standard dead keys with dedicated shortcut keys. For example, with the VIQR keyboard built into iOS, it is possible to add a horn to "U" by tapping either the standard dead key or a dedicated key, which has no analogue on a physical keyboard.

Borrowing a feature common amongst Chinese input methods, some Vietnamese IMEs allow the user to skip diacritics altogether and, after typing the base letters, select the accented word from a candidate list. In order to provide this autocomplete list, the IME may need to communicate with a Web service. Some IMEs also use candidate lists to allow the user to convert text from the Vietnamese alphabet to chữ Nôm, because there is no one-to-one correspondence between alphabetic words and chữ Nôm characters.


Other considerations

Typical Vietnamese text contains a high proportion of compound words. Compound words are never hyphenated in contemporary usage, so spell checkers are limited to checking individual syllables unless a statistical language model is consulted.

Vietnamese has rigid spelling rules and few exceptions, so text-to-speech (TTS) engines may avoid dictionary lookups except when encountering a foreign loanword. TTS engines must account for tones, which are essential to the meaning of any Vietnamese word; for example, má (mother) is a different word from mà (but).

Internationalized user interfaces are generally unable to use the full complement of Vietnamese pronouns that would be expected in a traditional social setting, even when much is known about the user. Instead, user interfaces typically use generic pronouns, some of which make potentially incorrect assumptions about the user's age and relationship to other users. For example, when a social media platform notifies a user about a younger user, it may refer to the latter in the third person with a generic pronoun rather than the one expected for a younger person, leading the user to misinterpret the notification as a reference to someone else.


See also

* Chinese input methods for computers
* Japanese language and computers
* Korean language and computers


External links


* Computing in Vietnamese: Progress & Challenges, 2005 International Macintosh Users Group presentation
* Vietnamese Conversions, online tool for recovering Vietnamese mojibake