Vietnamese Phonology

The phonology of Vietnamese features 19 consonant phonemes, with 5 additional consonant phonemes used in Vietnamese's Southern dialect, and 4 exclusive to the Northern dialect. Vietnamese also has 14 vowel nuclei, and 6 tones that are integral to the interpretation of the language. Older interpretations of Vietnamese tones differentiated between "sharp" and "heavy" entering and departing tones. This article is a technical description of the sound system of the Vietnamese language, including phonetics and phonology. Two main varieties of Vietnamese, Hanoi and Saigon, which are slightly different to each other, are described below.


Initial consonants

Initial consonants which exist only in the Northern dialect are in red, while those that exist only in the Southern dialect are in blue.
* /w/ is the only initial consonant permitted to form consonant clusters with other consonants.
* In many regions of Northern Vietnam, the pair /n/ and /l/ have merged into one; they are no longer two opposing phonemes. Some native Vietnamese speakers who lack linguistic knowledge believe that pronouncing the initial consonant of a word whose orthographic form begins with the letter ''l'' as /n/, or ''n'' as /l/, is ''nói ngọng''. The phenomenon of no longer distinguishing /n/ from /l/ in words whose orthographic form begins with the letter ''n'' or ''l'' has three manifestations:
# The initial consonant of all words whose orthographic form begins with ''n'' or ''l'' is .
# The initial consonant of all words is .
# In some words, the initial consonant corresponding to the letter ''n'' at the beginning of the spelling form of the word is , with ''l'' being ; in some other words the sound corresponding to ''n'' is , with ''l'' being .
* In Northern dialects, some words whose initial consonant is the voiced palatal nasal, such as ''nhuộm'', ''nhức'', ''nhỏ'' (''nhỏ'' in ''nhỏ giọt'', not ''nhỏ'' in ''nhỏ bé''), ''nhổ'', ''nhốt'', have phonetic variants with the initial consonant . This sound is written with the letter ''d'' or ''gi'' or ''r'' depending on the word (at least one of those three letters, sometimes two, or even all three).
* Some words whose initial consonant is the voiced velar nasal also have phonetic variants with the voiced velar fricative as the initial consonant, which are used in some places in the North. For example, the words ''ngáy'' (''ngáy'' in ''ngáy ngủ'') and ''ngẫm'' (''ngẫm'' in ''suy ngẫm'') also have the phonetic variants ''gáy'', ''gẫm''.
* In Northern dialects, the voiceless bilabial plosive /p/ is the initial consonant only in a few loanwords from other languages, mainly from French. In writing, the sound is written with the letter ''p'', as in ''sâm panh'', derived from French ''champagne''. Not every foreign word that has the initial consonant /p/ has a corresponding Vietnamese loanword with the initial consonant /p/. In some words, the sound /p/ is replaced by the sound /ɓ/. For example, both syllables of the word ''búp bê'' (derived from the French word ''poupée'' /pu.pe/) have the initial consonant /ɓ/, not /p/. In Southern dialects, the initial consonant of words whose spelling form begins with the letter ''p'' is in many speakers.
* The glottalized stops are preglottalized and voiced (the glottis is always closed before the oral closure). This glottal closure is often not released before the release of the oral closure, resulting in the characteristic implosive pronunciation. However, sometimes the glottal closure is released prior to the oral release, in which case the stops are pronounced . Therefore, the primary characteristic is preglottalization, with implosion being secondary.
* are bilabial, while are labiodental.
* are denti-alveolar (), while are apico-alveolar.
* are phonetically lamino-alveolar.
* are often slightly affricated, but they are unaspirated.
* A glottal stop is inserted before words that begin with a vowel or in Northern dialects:


Hanoi initials

* are denti-lamino-alveolar: .
* is apico-alveolar: .
* ''d'', ''gi'' and ''r'' are all pronounced .
* ''ch'' and ''tr'' are both pronounced , while ''x'' and ''s'' are both pronounced .
* The highly salient (and socially stigmatized) merger of /n/ and /l/ mentioned above, characteristic of the speech of many lower- and working-class Vietnamese in the Red River Delta, is sometimes consciously manipulated to humorous and/or pejorative effect in colloquial Hanoi speech.
* occur in a small number of foreign (mainly French) loans, e.g. < ''panne'' 'breakdown', < ''garage'', < ''billiard''. For many speakers, however, is realized as and as .
* There are no retroflex consonants , , ; instead there are palato-alveolar consonants , , in spelling pronunciations taught in schools.
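The orthographic mergers listed above (''d'', ''gi'' and ''r''; ''ch'' and ''tr''; ''x'' and ''s'') can be illustrated with a minimal Python sketch that groups written syllables by the initial they share in Hanoi speech. The onset inventory and the helper names `onset_of`, `hanoi_class` and `same_hanoi_initial` are assumptions made for this illustration; the sketch works on spelling only and does not model the actual phonetic values.

```python
# Orthographic onsets that the list above describes as pronounced alike in Hanoi.
HANOI_MERGERS = [{"d", "gi", "r"}, {"ch", "tr"}, {"x", "s"}]

# Assumed inventory of written onsets; longer spellings are tried first
# so that "tr" is not misread as "t" and "gi" is not misread as "g".
ONSETS = sorted(
    ["ngh", "ng", "nh", "gh", "gi", "kh", "ph", "th", "tr", "ch", "qu",
     "b", "c", "d", "đ", "g", "h", "k", "l", "m", "n", "p", "r", "s",
     "t", "v", "x"],
    key=len,
    reverse=True,
)


def onset_of(syllable: str) -> str:
    """Return the written onset of a syllable, or '' if it starts with a vowel."""
    lowered = syllable.lower()
    for onset in ONSETS:
        if lowered.startswith(onset):
            return onset
    return ""


def hanoi_class(syllable: str) -> str:
    """Return a label for the merged initial class of a written syllable."""
    onset = onset_of(syllable)
    for group in HANOI_MERGERS:
        if onset in group:
            return "/".join(sorted(group))
    return onset


def same_hanoi_initial(a: str, b: str) -> bool:
    """True if two written syllables share an initial under the Hanoi mergers."""
    return hanoi_class(a) == hanoi_class(b)
```

For example, `same_hanoi_initial("da", "ra")` holds under these mergers, while ''d'' and ''đ'' stay distinct because they are different letters with different sounds.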


Saigon initials

* is apico-alveolar .
* is palatalized lamino-alveolar: .
* Some people pronounce ''d'' as , and ''gi'' as in situations where the distinction is necessary; most people pronounce both as .
* Historically, ''v'' is pronounced in common speech, merging with ''d'' and ''gi''. However, it is becoming distinct and pronounced as , especially in careful speech or when reading a text. In traditional performance, including Cải lương, Đờn ca tài tử and Hát bội (Tuồng), and among some older Overseas Vietnamese speakers, it is pronounced as the consonant cluster or . In loanwords, it is pronounced , or ; for example, ''va li'' is pronounced , or .
* Historically, a distinction is made between ''ch'' and ''tr'' , as well as between ''x'' and ''s'' . However, in many speakers, these two pairs are becoming merged as and respectively.
* In southern speech, the phoneme , generally represented in Vietnamese linguistics by the letter , has a number of variant pronunciations depending on the speaker. A single speaker can also use several pronunciations. It may occur as a retroflex fricative , an alveolar approximant , an alveolar flap , a trill , or a tapped fricative/fricative trill . In the border area between Ho Chi Minh City and Long An province (Bình Chánh, Cần Giuộc, Cần Đước), it is pronounced as a palatal approximant. In many areas in the Mekong Delta, the letter is pronounced as a velar fricative.


Simplification of consonant clusters in Southern dialects

As mentioned above, the only cluster in Vietnamese is a consonant followed by /w/. Southern dialects do not retain this cluster, although it tends to be retained by many young urban people in southern Vietnam, especially in Ho Chi Minh City and surrounding areas. The cluster is reduced to one element. Depending on which consonant forms the cluster, there are two patterns in this simplification process. In one pattern the consonant is deleted and /w/ remains. In the other, /w/ is deleted while the consonant remains:
* In informal speech, , , and are usually pronounced . The cluster ''go'' is very rare, seen only in ''goá'' ‘widowed’. ''ngw'' shows greatest loss in rural varieties. However, they are becoming distinct and pronounced as or , , , , and respectively, especially in formal speech or when reading a text.
* In informal speech, the voiceless velar fricative (represented by the letter ''kh'') is often transformed into the corresponding voiceless bilabial and labiodental consonants , and the prevocalic /w/ is deleted, for example: ''cá khoai'' is pronounced as ''cá phai'', ''khóa máy'' is pronounced as ''phá máy'', ''khỏe không?'' is pronounced as ''phẻ không?''. This pronunciation is observed only in rural Southern dialects, and it does not occur in the speech of educated speakers (Nguyễn 2005, Cao and Lê 2005).
* Only a few words have a bilabial or labiodental consonant followed by the prevocalic /w/, and most of them are French loanwords, for example: ''tiền boa'' (''pourboire''), ''đậu pơ-ti-poa'' (''petit pois''), ''xe buýt'' (''bus''), ''vải voan'' (''voile''). The initial consonant is kept and the prevocalic /w/ is lost, giving ''tiền bo'', ''đậu bo'', ''xe bít'', ''vải von''.
* In clusters whose first consonant has one of the remaining places of articulation (alveolar, postalveolar, palatal), the initial consonant is kept and the prevocalic /w/ is lost as above, for example: ''vô duyên'' is pronounced as ''vô diên'', ''cái loa (hát)'' is pronounced as ''cái lo''.


Comparison of initials

The table below summarizes these sound correspondences: :


Vowels


Vowel nuclei

The IPA chart of vowel nuclei above is based on the sounds in Hanoi Vietnamese; other regions may have slightly different inventories. Vowel nuclei consist of monophthongs (simple vowels) and three centering diphthongs.
* All vowels are unrounded except for the four back rounded vowels: .
* In the South, the high vowels are all diphthongized in open syllables: , ''Ba Vì'' ().
* and are pronounced shorter than the other vowels. These short vowels only occur in closed syllables.
* The vowels and are marginal. As with the other short/long vowel pairs, short and long and are only distinguished in closed syllables. For some speakers the distinction may be one of vowel quality or of the articulation of the syllable coda in addition to or instead of vowel quantity.
* : Many descriptions, such as Thompson, , , consider this vowel to be close back unrounded: . However, Han's instrumental analysis indicates that it is more central than back. , and also transcribe this vowel as central.


Closing sequences

In Vietnamese, vowel nuclei are able to combine with offglides or to form closing diphthongs and triphthongs. Below is a chart listing the closing sequences of general northern speech.

: says that in Hanoi, words spelled with ''ưu'' and ''ươu'' are pronounced , respectively, whereas other dialects in the Tonkin delta pronounce them as and . This observation is also made by and .


Finals

When stops occur at the end of words, they have no audible release (): : When the velar consonants are after , they are articulated with a simultaneous bilabial closure (i.e. doubly articulated) or are strongly labialized . :


Hanoi finals


Analysis of final ''ch'', ''nh''

The pronunciation of syllable-final ''ch'' and ''nh'' in Hanoi Vietnamese has had different analyses. One analysis, that of has them as being phonemes , where contrasts with both syllable-final ''t'' and ''c'' , and contrasts with syllable-final ''n'' and ''ng'' . Final is, then, identified with syllable-initial . Another analysis has final and as representing different spellings of the velar phonemes and that occur after upper front vowels (orthographic ) and (orthographic ). This analysis interprets orthographic ⟨ach⟩ and ⟨anh⟩ as an underlying , which becomes phonetically open and diphthongized: → , → . This diphthongization also affects ⟨êch⟩ and ⟨ênh⟩: → , → . Arguments for the second analysis include the limited distribution of final and , the gap in the distribution of and which do not occur after and , the pronunciation of ⟨ach⟩ and ⟨anh⟩ as and in certain conservative central dialects, and the patterning of ~ and ~ in certain reduplicated words. Additionally, final is not articulated as far forward as the initial : and are pre-velar with no alveolar contact. The first analysis closely follows the surface pronunciation of a slightly different Hanoi dialect than the second. In this dialect, the in and is not diphthongized but is actually articulated more forward, approaching a front vowel . This results in a three-way contrast between the rimes ''ăn'' vs. ''anh'' vs. ''ăng'' . For this reason, a separate phonemic is posited.


Table of Hanoi finals

The following rimes ending with velar consonants have been diphthongized in the Hanoi dialect, but , and are more open:

With the above phonemic analyses, the following is a table of rimes ending in in the Hanoi dialect:


Saigon finals


Merger of finals

While the variety of Vietnamese spoken in Hanoi has retained finals faithfully from Middle Vietnamese, the variety spoken in Ho Chi Minh City has drastically changed its finals. Rimes ending in merged with those ending in , respectively, so they are always pronounced , respectively, after the short front vowels (only when is before "nh"). However, they are always pronounced after the other vowels . After rounded vowels , many speakers close their lips, i.e. they pronounce as . Subsequently, vowels of rimes ending in labiovelars have been diphthongized, while vowels of rimes ending in alveolar have been centralized. Otherwise, some Southern speakers distinguish and after in formal speech, but there are no Southern speakers who pronounce "ch" and "nh" at the end of syllables as .


Table of Saigon finals

The short back vowels in the rimes have been diphthongized and centralized; meanwhile, the consonants have been labialized. Similarly, the short front vowels have been centralized and are realized as central vowels , and the "unspecified" consonants have been affected by coronal spreading from the preceding front vowels and surface as coronals (alveolar) .

The other closed dialects (Huế, Quảng Nam, Bình Định) have also undergone mergers in codas, but some vowels are pronounced differently in some dialects:

The ''ông'', ''ôc'' rimes are merged into ''ong'', ''oc'' as , in many Southern speakers, but not with ''ôn'', ''ôt'', which are pronounced , . The ''oong'', ''ooc'' and ''eng'', ''ec'' rimes are few and are mostly loanwords or onomatopoeia. The ''ôông'', ''ôôc'' rimes (as well as ''oong'', ''ooc'', ''eng'', ''ec'', ''êng'', ''êc'') are the "archaic" form before becoming ''ông'', ''ôc'' by diphthongization, and still exist in the North Central dialect in many placenames. In the North Central dialect these rimes are articulated as , without a simultaneous bilabial closure or labialization.

With the above phonemic analyses, the following is a table of rimes ending in in the Ho Chi Minh City dialect:


Tone

Vietnamese vowels are all pronounced with an inherent tone. Tones differ in
* pitch
* length
* contour melody
* intensity
* phonation (with or without accompanying constricted vocal cords)

Unlike many Native American, African, and Chinese languages, Vietnamese tones do not rely solely on pitch contour. Vietnamese often uses instead a register complex (which is a combination of phonation type, pitch, length, vowel quality, etc.). Thus, it may be more accurate to categorize Vietnamese as a register language rather than a "pure" tonal language. In Vietnamese orthography, tone is indicated by diacritics written above or below the vowel.
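Because tone is written with diacritics, the tone of a written syllable can be read off its Unicode decomposition. The following is a minimal Python sketch, assuming the standard orthographic tone marks (grave, acute, hook above, tilde, dot below, with the unmarked tone being ''ngang''); the function names `tone_of` and `strip_tone` are hypothetical helpers introduced for this illustration.

```python
import unicodedata

# Combining marks that write the five marked tones in Vietnamese orthography.
# Vowel-quality marks (circumflex, breve, horn) are deliberately not listed.
TONE_MARKS = {
    "\u0300": "huyền",  # combining grave accent
    "\u0301": "sắc",    # combining acute accent
    "\u0309": "hỏi",    # combining hook above
    "\u0303": "ngã",    # combining tilde
    "\u0323": "nặng",   # combining dot below
}


def tone_of(syllable: str) -> str:
    """Return the tone name of one written syllable; unmarked means ngang."""
    # NFD splits precomposed letters into a base letter plus combining marks.
    for char in unicodedata.normalize("NFD", syllable):
        if char in TONE_MARKS:
            return TONE_MARKS[char]
    return "ngang"


def strip_tone(syllable: str) -> str:
    """Remove the tone mark but keep vowel-quality marks such as the breve."""
    decomposed = unicodedata.normalize("NFD", syllable)
    kept = "".join(char for char in decomposed if char not in TONE_MARKS)
    # Recompose so that, for example, a + breve becomes a single letter again.
    return unicodedata.normalize("NFC", kept)
```

The sketch only inspects spelling; it says nothing about how a given tone is realized phonetically, which varies by dialect and speaker.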


Six-tone analysis

There is much variation among speakers concerning how tone is realized phonetically. There are differences between varieties of Vietnamese spoken in the major geographic areas (northern, central, southern) and smaller differences within the major areas (e.g. Hanoi vs. other northern varieties). In addition, there seems to be variation among individuals. More research is needed to determine the remaining details of tone realization and the variation among speakers.


Northern varieties

The six tones in the Hanoi and other northern varieties are: :


Ngang tone

* The ngang tone is level at around the mid level (33) and is produced with modal voice phonation (i.e. with "normal" phonation). Alexandre de Rhodes (1651) describes this as "level"; it has also been described as "high (or mid) level".


Huyền tone

* The huyền tone starts low-mid and falls (21). Some Hanoi speakers start at a somewhat higher point (31). It is sometimes accompanied by breathy voice (or lax) phonation in some speakers, but this is lacking in other speakers. Alexandre de Rhodes (1651) describes this as "grave-lowering"; it has also been described as "low falling".


Hỏi tone

* The hỏi tone starts at a mid level and falls. It starts with modal voice phonation, which moves increasingly toward tense voice with accompanying harsh voice (although the harsh voice seems to vary according to speaker). In Hanoi, the tone is mid falling (31). In other northern speakers, the tone is mid falling and then rises back to the mid level (313 or 323). This characteristic gives this tone its traditional description as "dipping". However, the falling-rising contour is most obvious in citation forms or when syllable-final; in other positions and in fast speech, the rising contour is negligible. The hỏi tone is also relatively short compared with the other tones, but not as short as the nặng tone. Alexandre de Rhodes (1651) describes this as "smooth-rising"; it has also been described as "dipping-rising".


Ngã tone

* The ngã tone is mid rising (35). Many speakers begin the vowel with modal voice, followed by strong creaky voice starting toward the middle of the vowel, which then lessens as the end of the syllable is approached. Some speakers with more dramatic glottalization have a glottal stop closure in the middle of the vowel. In Hanoi Vietnamese, the tone starts at a higher pitch (45) than in other northern speakers. Alexandre de Rhodes (1651) describes this as "chesty-raised"; it has also been described as "creaking-rising".


Sắc tone

* The sắc tone starts as mid and then rises (35) in much the same way as the ngã tone. It is accompanied by tense voice phonation throughout the duration of the vowel. In some Hanoi speakers, the ngã tone is noticeably higher than the sắc tone, for example: sắc = (34); ngã = (45). Alexandre de Rhodes (1651) describes this as "acute-angry"; it has also been described as "high (or mid) rising".


Nặng tone

* The nặng tone starts mid or low-mid and rapidly falls in pitch (32 or 21). It starts with tense voice that becomes increasingly tense until the vowel ends in a glottal stop closure. This tone is noticeably shorter than the other tones. Alexandre de Rhodes (1651) describes this as "chesty-heavy"; it has also been described as "constricted".
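The bracketed figures in the descriptions above (33, 21, 313, 35 and so on) are tone numerals on a five-point scale, where 1 is the lowest pitch and 5 the highest, read left to right through the syllable. As a rough illustration, the sketch below turns such a numeral into a contour label. The function name `contour_shape` and the label set are assumptions made for this example, and the sketch deliberately ignores phonation and length, which the descriptions above treat as equally important.

```python
def contour_shape(numeral: str) -> str:
    """Classify a tone numeral such as '313' by the direction of its pitch movement."""
    if not numeral:
        raise ValueError("empty tone numeral")
    levels = [int(digit) for digit in numeral]  # int() rejects non-digit characters
    if any(level < 1 or level > 5 for level in levels):
        raise ValueError("pitch levels must be between 1 and 5")

    # Differences between successive pitch targets, ignoring flat stretches.
    moves = [b - a for a, b in zip(levels, levels[1:]) if b != a]

    if not moves:
        return "level"
    if all(move > 0 for move in moves):
        return "rising"
    if all(move < 0 for move in moves):
        return "falling"
    if moves[0] < 0 and moves[-1] > 0:
        return "falling-rising"
    return "other"
```

Applied to the numerals quoted in this section, 33 comes out as level, 21 and 31 as falling, 35 and 45 as rising, and 313 and 323 as falling-rising, which matches the prose labels used above.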


Southern varieties

In Southern varieties, the ngang, sắc and huyền tones have similar contours to the Northern tones; however, these tones are produced with normal voice instead of breathy voice. The nặng tone is pronounced as a low rising tone (12) in fast speech or a low falling-rising tone (212) in more careful utterance. The hỏi and ngã tones are merged into a mid falling-rising tone (214), which is somewhat similar to the hỏi tone of the non-Hanoi Northern accent mentioned above. This merged tone is characteristic of Southern Vietnamese accents.


North-central and Central varieties

North-central and Central Vietnamese varieties are fairly similar with respect to tone although within the North-central dialect region there is considerable internal variation. It is sometimes said (by people from other provinces) that people from Nghệ An pronounce every tone as a nặng tone.


Eight-tone analysis

An older analysis assumes eight tones rather than six. This follows the lead of traditional Chinese phonology. In Middle Chinese, syllables ending in a vowel or nasal allowed for three tonal distinctions, but syllables ending with /p/, /t/ or /k/ had no tonal distinctions. Rather, they were consistently pronounced with a short high tone, which was called the entering tone and considered a fourth tone. Similar considerations lead to the identification of two additional tones in Vietnamese for syllables ending in /p/, /t/, /c/ and /k/. These are not phonemically distinct from the sắc and nặng tones, however, and hence are not considered separate tones by modern linguists and are not distinguished in the orthography.


Syllables and phonotactics

By one estimate, there are 4,500 to 4,800 possible spoken syllables (depending on dialect), and the standard national orthography (''Quốc Ngữ'') can represent 6,200 syllables (''Quốc Ngữ'' orthography represents more phonemic distinctions than are made by any one dialect). A description of syllable structure and exploration of its patterning according to the Prosodic Analysis approach of J.R. Firth is given in Henderson (1966). The Vietnamese syllable structure follows the scheme:

: (C1)(w)V(G, C2)+T

In other words, a syllable has an obligatory nucleus and tone, and can have an optional consonant onset, an optional on-glide /w/, and an optional coda or off-glide. More explicitly, the syllable types are as follows:

C1: Any consonant may occur as an onset, with the following exception:
* /p/ does not occur in native Vietnamese words

w: the onglide /w/ (sometimes transcribed instead as labialization on a preceding consonant):
* /w/ does not occur after labial consonants
* /w/ does not occur after /n/ in native Vietnamese words (it occurs in uncommon Sino-Vietnamese borrowings, such as ''noãn'' "ovule")

V: The vowel nucleus V may be any of the following 14 monophthongs or diphthongs: .

G: The offglide may be /j/ or /w/. Together, V and G must form one of the diphthongs or triphthongs listed in the section on Vowels.
* offglide /j/ does not follow the front vowels
* offglide /w/ does not follow the rounded vowels
* with some exceptions (such as ''khuỷu tay'' "elbow"), the offglide /w/ cannot occur if the syllable contains a /w/ onglide

C2: The optional coda C2 is restricted to labial, coronal, and velar stops and nasals , which cannot cooccur with the offglides .

T: Syllables are spoken with an inherent tone contour:
* Six tone contours are possible for syllables with offglides , closed syllables with nasal codas , and open syllables (i.e., those without consonant codas ).
* If the syllable is closed with one of the oral stops , only two contours are possible: the ''sắc'' and the ''nặng'' tones.

Notes on the table of rimes:
* Less common rimes may not be represented in this table.
* The ''nặng'' tone mark (dot below) has been added to all rimes in this table for illustration purposes only. It indicates which letter tone marks in general are added to, largely according to the "new style" rules of Vietnamese orthography as stated in Quy tắc đặt dấu thanh trong chữ quốc ngữ. In practice, not all these rimes have real words or syllables that have the ''nặng'' tone.
* The IPA representations are based on Wikipedia's conventions. Different dialects may have different pronunciations.
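The tone restriction on stop-final syllables described above can be checked mechanically on written forms. The following is a minimal Python sketch under two assumptions that are not spelled out in the text: that the oral-stop codas are written ''p'', ''t'', ''c'' and ''ch'', and that tones are written with the usual diacritics. The names `tone_of` and `tone_allowed` are hypothetical helpers, and the check covers only this one constraint, not the full (C1)(w)V(G, C2)+T scheme.

```python
import unicodedata

SAC = "sắc"
NANG = "nặng"

# Combining marks for the five marked tones; the unmarked tone is ngang.
TONE_MARKS = {
    "\u0300": "huyền",  # combining grave accent
    "\u0301": SAC,      # combining acute accent
    "\u0309": "hỏi",    # combining hook above
    "\u0303": "ngã",    # combining tilde
    "\u0323": NANG,     # combining dot below
}

# Assumed spellings of the oral-stop codas.
STOP_CODAS = ("p", "t", "c", "ch")


def tone_of(syllable: str) -> str:
    """Return the tone name of one written syllable; unmarked means ngang."""
    for char in unicodedata.normalize("NFD", syllable):
        if char in TONE_MARKS:
            return TONE_MARKS[char]
    return "ngang"


def tone_allowed(syllable: str) -> bool:
    """Check the rule that stop-final syllables carry only sắc or nặng."""
    if syllable.lower().endswith(STOP_CODAS):
        return tone_of(syllable) in (SAC, NANG)
    # Open syllables and those ending in a glide or nasal allow all six tones.
    return True
```

Under this rule a written form such as ''mát'' passes, whereas a stop-final form with no tone mark or with a grave accent is rejected.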


Notes

Below is a table comparing four linguists' different transcriptions of Vietnamese vowels as well as the orthographic representation. Notice that this article mostly follows , with the exception of marking short vowels short.

Thompson says that the vowels (orthographic ''â'') and (orthographic ''ă'') are shorter than all of the other vowels, which is shown here with the length mark added to the other vowels. His vowels above are only the basic vowel phonemes. Thompson gives a very detailed description of each vowel's various allophonic realizations.

Han uses acoustic analysis, including spectrograms and formant measuring and plotting, to describe the vowels. She states that the primary difference between orthographic ''ơ'' & ''â'' and ''a'' & ''ă'' is a difference of length (a ratio of 2:1). ''ơ'' = , ''â'' = ; ''a'' = , ''ă'' = . Her formant plots also seem to show that may be slightly higher than in some contexts (but this would be secondary to the main difference of length). Another point to note about Han's studies is that she uses a rather small number of participants and, additionally, although her participants are native speakers of the Hanoi variety, they all have lived outside of Hanoi for a significant period of their lives (e.g. in France or Ho Chi Minh City).

Another of these descriptions is simpler and more symmetrical. Its author says that his work is not a "complete grammar" but rather a "descriptive introduction." So, his chart above is more a phonological vowel chart than a phonetic one.

