Esperanto is a constructed international auxiliary language designed to have a simple phonology. The creator of Esperanto, L. L. Zamenhof, described Esperanto pronunciation by comparing the sounds of Esperanto with the sounds of several major European languages. With over a century of use, Esperanto has developed a phonological norm, including accepted details of phonetics, phonotactics, and intonation, so that it is now possible to speak of proper Esperanto pronunciation and of properly formed words independently of the languages originally used to describe it. This norm accepts only minor allophonic variation.

Inventory

The original Esperanto lexicon contains:

* 23 consonants (including ''ĥ'', which has become rare, and 4 affricates)
* 11 vowels (5 simple vowels and 6 diphthongs)

A few additional sounds found in loan words are not stable.

Consonants

The uncommon affricate /dz/ does not have a distinct letter in the orthography, but is written with the digraph ''dz'', as in ''edzo'' ('husband'). Not everyone agrees with Kalocsay & Waringhien that ''edzo'' and ''peco'' are a near rhyme, differing only in voicing, or on the status of /dz/ as a phoneme; Wennergren considers it to be a simple sequence of /d/ + /z/. The phoneme written ''ĥ'' has been largely replaced with /k/ and is now found mostly in loanwords and a very few established words such as ''ĉeĥo'' ('a Czech'; cf. ''ĉeko'' 'a check'). The letter ''ŭ'' is sometimes used as a consonant in onomatopoeia and unassimilated foreign names, in addition to the second element in diphthongs, which some argue is consonantal /w/ rather than vocalic (see below).

Vowels

Esperanto has between 5 and 11 vowels, depending on analysis: 5 monophthongs and up to 6 diphthongs. There are six historically stable diphthongs: ''aj'', ''oj'', ''uj'', ''ej'' and ''aŭ'', ''eŭ''. However, some authors such as John C. Wells regard them as vowel–consonant sequences (/aj/, /oj/, /uj/, /ej/, /aw/, /ew/), while Wennergren regards ''aj'', ''oj'', ''uj'', ''ej'' as vowel–consonant sequences and only ''aŭ'', ''eŭ'' as diphthongs, there otherwise being no /w/ in Esperanto.
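The range "between 5 and 11" follows from simple counting under each analysis. The sketch below is illustrative only: it assumes the six diphthongs are the conventional ''aj'', ''ej'', ''oj'', ''uj'', ''aŭ'', ''eŭ'', and the analysis labels are hypothetical names chosen for this example.

```python
# Illustrative tally of Esperanto vowel nuclei under three analyses.
# Assumption: the six diphthongs are aj, ej, oj, uj, aŭ, eŭ.
MONOPHTHONGS = ["a", "e", "i", "o", "u"]
J_SEQUENCES = ["aj", "ej", "oj", "uj"]   # vowel + j
U_SEQUENCES = ["aŭ", "eŭ"]               # vowel + ŭ


def vowel_count(analysis: str) -> int:
    """Count vowel phonemes under a named analysis (labels are hypothetical)."""
    if analysis == "traditional":      # all six treated as diphthongs
        diphthongs = J_SEQUENCES + U_SEQUENCES
    elif analysis == "wells":          # all six treated as vowel + consonant
        diphthongs = []
    elif analysis == "wennergren":     # only aŭ, eŭ treated as diphthongs
        diphthongs = U_SEQUENCES
    else:
        raise ValueError(f"unknown analysis: {analysis}")
    return len(MONOPHTHONGS) + len(diphthongs)


for name in ("traditional", "wells", "wennergren"):
    print(name, vowel_count(name))
```

The three analyses differ only in which vowel-plus-glide sequences they count as single vowel units; the five monophthongs are common to all of them.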

Origins

The Esperanto sound inventory and phonotactics are very close to those of Yiddish, Belarusian and Polish, which were personally important to Zamenhof

, the creator of Esperanto. The primary difference is the absence of palatalization, although this was present in Proto-Esperanto (, now 'nations'; , now 'family') and arguably survives marginally in the affectionate suffixes and , and in the interjection Apart from this, the consonant inventory is identical to that of Eastern Yiddish. Minor differences from Belarusian are that ''g'' is pronounced as a stop, , rather than as a fricative, (in Belarusian, the stop pronunciation is found in recent loan words), and that Esperanto distinguishes and , a distinction that Yiddish makes but that Belarusian (and Polish) do not. As in Belarusian, Esperanto is found in syllable onsets and in syllable codas; however, unlike Belarusian, does not become if forced into coda position through compounding. According to Kalocsay & Waringhien, if Esperanto does appear before a voiceless consonant, it will devoice to , as in Yiddish. However, Zamenhof avoided such situations by adding an

epenthetic vowel: ''lavobaseno'' ('washbasin'), not ''lavbaseno'' or ''laŭbaseno''. The Esperanto vowel inventory is essentially that of Belarusian. Zamenhof's Litvish dialect of Yiddish (that of Białystok) has an additional schwa and diphthong ''oŭ'' but no ''uj''.

Orthography and pronunciation

The Esperanto alphabet is nearly phonemic. The letters, along with the IPA and nearest English equivalent of their principal allophones, are:

Minimal pairs

Esperanto has many minimal pairs between the voiced and voiceless plosives, ''b d g'' and ''p t k''; for example, ''pagi'' "pay" vs. ''paki'' "pack", ''baro'' "bar" vs. ''paro'' "pair", ''teko'' "briefcase" vs. ''deko'' "group of ten". On the other hand, the distinctions between several Esperanto consonants carry very light functional loads, though they are not in complementary distribution and therefore not allophones. The practical effect of this is that people who do not control these distinctions are still able to communicate without difficulty. These minor distinctions are ''ĵ'' vs. ''ĝ'', contrasted in ''aĵo'' ('concrete thing') vs. ''aĝo'' ('age'); ''k'' vs. ''ĥ'' vs. ''h'', contrasted in ''koro'' ('heart') vs. ''ĥoro'' ('chorus') vs. ''horo'' ('hour'), and in the prefix ''ek-'' (inchoative) vs. ''eĥo'' ('echo'); ''dz'' vs. ''z'', not contrasted in basic vocabulary; and ''c'' vs. ''ĉ'', found in a few minimal pairs such as ''caro'' ('tzar'), ''ĉar'' ('because'); ''ci'' ('thou'), ''ĉi'' (proximate particle used with deictics); ''celo'' ('goal'), ''ĉelo'' ('cell'); ''-eco'' ('-ness'), ''eĉ'' ('even'); etc.

Belarusian seems to have provided the model for Esperanto's diphthongs, as well as the complementary distribution of ''v'' (restricted to the

onset

of a syllable), and ''ŭ'' (occurring only as a vocalic offglide), although this was modified slightly, with Belarusian ''oŭ'' corresponding to Esperanto ''ov'' (as in ), and ''ŭ'' being restricted to the sequences in Esperanto. Although ''v'' and ''ŭ'' may both occur between vowels, as in ('ninth') and ('of naves'), the diphthongal distinction holds: vs. . (However, Zamenhof did allow initial ''ŭ'' in onomatopoeic words such as 'wah!'.) The semivowel ''j'' likewise does not occur after the vowel ''i'', but is also restricted from occurring before ''i'' in the same morpheme, whereas the Belarusian letter ''i'' represents . Later exceptions to these patterns, such as ('poop deck'), ('watt'), East Asian proper names beginning with , and ('Yiddish'), are marginal. The distinction between ''e'' and ''ej'' carries a light functional load, in the core vocabulary perhaps only distinctive before alveolar

sonorants

, such as ('peg'), ('cellar'); ('mile'), ('badger'); ('Rhine'), ('kidney'). The recent borrowing ('homosexual') could contrast with the ambisexual prefix if used in compounds with a following consonant, and also creating possible confusion between ('homosexual couple') and ('heterosexual couple'), which are both pronounceable as . is also uncommon, and very seldom contrastive: ('a euro') vs. ('a bit').

Stress and prosody

Within a word, stress is on the syllable with the second-to-last vowel, such as the ''li'' in ''familio'' ('family'). An exception is when the final ''-o'' of a noun is elided

, usually for poetic reasons, because this does not affect the placement of the stress: . On the rare occasions that stress needed to be specified, as in explanatory material or with proper names, Zamenhof used an acute accent. The most common such proper name is Zamenhof's own: . If the stress falls on the last syllable, it is common for an apostrophe to be used, as in poetic elision: . There is no set rule for which other syllables might receive stress in a polysyllabic word, or which monosyllabic words are stressed in a clause. Morphology, semantic load, and rhythm all play a role. By default, Esperanto is trochaic; stress tends to hit alternate syllables: . However, derivation tends to leave such "secondary" stress unchanged, at least for many speakers: or (or for some just ) Similarly, compound words generally retain their original stress. They never stress an epenthetic vowel: thus , not . Within a clause, rhythm also plays a role. However, referential words (

lexical words and pronouns) attract stress, whereas "connecting" words such as

preposition

s tend not to: or ('give to me'), not . In ('Do you see the dog that's running past the house?'), the

function word

s do not take stress, not even two-syllable ('which') or ('beyond'). The verb ('to be') behaves similarly, as can be seen by the occasional elision of the ''e'' in poetry or rapid speech: ('I'm not here!') Phonological words do not necessarily match orthographic words. Pronouns, prepositions, the article, and other monosyllabic function words are generally pronounced as a unit with the following word: ('I have'), ('the boy'), ('of the word'), ('at table'). Exceptions include 'and', which may be pronounced more distinctly when it has a larger scope than the following word or phrase. Within poetry, of course, the meter determines stress: ('Oh my heart, do not beat uneasily'). Emphasis and contrast may override normal stress. Pronouns frequently take stress because of this. In a simple question like ('Did you see?'), the pronoun hardly needs to be said and is unstressed; compare and ('No, give ''me''). Within a word, a prefix that wasn't heard correctly may be stressed upon repetition: ('No, not over ''there!'' Go ''left'', I said!'). Because stress doesn't distinguish words in Esperanto, shifting it to an unexpected syllable calls attention to that syllable, but doesn't cause confusion as it might in English. As in many languages,

initialism

s behave unusually. When grammatical, they may be unstressed: ''k.t.p.'' ('et cetera'); when used as proper names, they tend to be idiosyncratic: or but rarely . This seems to be a way of indicating that the term is not a normal word. However, full

acronym

s tend to have regular stress: . Lexical tone is not phonemic. Nor is clausal intonation, as question particles and changes in word order serve many of the functions that intonation performs in English.
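The basic word-stress rule stated at the start of this section (stress on the second-to-last vowel, unchanged when a final ''-o'' is elided) can be sketched in a few lines. This is a minimal illustration, not a full prosody model: it assumes the elision is written with a trailing apostrophe, treats only ''a e i o u'' as vowels (so ''j'' and ''ŭ'' never carry stress), and ignores secondary stress, clause rhythm, and emphasis. The function names are hypothetical.

```python
# Minimal sketch of Esperanto primary word stress.
# Assumptions: only a, e, i, o, u count as vowels; a trailing apostrophe
# marks an elided final -o; secondary stress and emphasis are ignored.
VOWELS = set("aeiou")


def stressed_vowel_index(word: str) -> int:
    """Return the index of the vowel that carries primary stress."""
    w = word.lower()
    positions = [i for i, ch in enumerate(w) if ch in VOWELS]
    if not positions:
        raise ValueError(f"no vowel in {word!r}")
    elided = w.endswith(("'", "\u2019"))
    # With an elided -o the stress stays put, so it now falls on the
    # last written vowel; monosyllables have only one candidate.
    if elided or len(positions) == 1:
        return positions[-1]
    return positions[-2]


def mark_stress(word: str) -> str:
    """Return the word in lower case with the stressed vowel capitalised."""
    w = word.lower()
    i = stressed_vowel_index(w)
    return w[:i] + w[i].upper() + w[i + 1:]


print(mark_stress("familio"))   # stress on the "li" syllable
print(mark_stress("famili'"))   # elided -o: same syllable stays stressed
```

Because ''ŭ'' is excluded from the vowel set, a word such as ''hodiaŭ'' is stressed on ''di'', not on the diphthong.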

Phonotactics

A syllable in Esperanto is generally of the form (s/ŝ)(C)(C)V(C)(C). That is, it ''may'' have an onset of up to three consonants; ''must'' have a nucleus of a single vowel or diphthong (except in onomatopoeic words such as ''zzz!''); and ''may'' have a coda of zero to one (occasionally two) consonants. Any consonant may occur initially, with the exception of ''j'' before ''i'' (though there is now one word that violates this restriction, ''jida'' ('Yiddish'), which contrasts with ''ida'' 'of an offspring'). Any consonant except ''h'' may close a syllable, though coda ''ĝ'' and ''ĵ'' are rare in monomorphemes (they contrast in aĝ' 'age' vs. aĵ' 'thing'). Within a morpheme, there may be a maximum of four sequential consonants, as for example in ''instruas'' ('teaches'), ''dekstren'' ('to the right'). Long clusters generally include a sibilant such as ''s'' or one of the liquids ''l'' or ''r''.
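The syllable template (s/ŝ)(C)(C)V(C)(C) can be expressed as a regular expression. The sketch below only checks whether an already-isolated syllable fits the shape of the template; it is an assumption-laden illustration, not a syllabifier, and it does not check which particular consonant clusters are actually permitted. Treating a vowel plus ''j'' or ''ŭ'' as a diphthong nucleus is one of the analyses discussed above, and the function name is hypothetical.

```python
import re

# Sketch: test a single, pre-split syllable against (s/ŝ)(C)(C)V(C)(C).
# Assumptions: each letter is one segment (the digraph dz is not treated
# specially); V may be followed by j or ŭ to form a diphthong nucleus.
VOWEL = "aeiou"
CONSONANT = "bcĉdfgĝhĥjĵklmnprsŝtŭvz"

SYLLABLE = re.compile(
    rf"[sŝ]?[{CONSONANT}]{{0,2}}"   # onset: optional sibilant + up to two C
    rf"[{VOWEL}][jŭ]?"              # nucleus: vowel or diphthong
    rf"[{CONSONANT}]{{0,2}}"        # coda: up to two C
)


def fits_template(syllable: str) -> bool:
    """True if the syllable has the shape allowed by the template."""
    return SYLLABLE.fullmatch(syllable.lower()) is not None


for s in ("tro", "stra", "kvar", "cent", "post", "pstra", "zzz"):
    print(s, fits_template(s))
```

A three-consonant onset is accepted only when it starts with ''s'' or ''ŝ'', which is why ''stra'' fits and the made-up ''pstra'' does not; ''zzz'' fails because the template requires a vowel nucleus.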

Geminate

consonants generally only occur in polymorphemic words, such as ('short'), ('to flop down'), ('to mis-write'); in

ethnonym

s such as ('a Finn'), ('a Gaul') (now more commonly ); in

proper name

s such as ('Schiller'), ('Buddha', now more commonly ); and in a handful of unstable borrowings such as ('a sports match'). In compounds of

lexemes, Zamenhof separated identical consonants with an epenthetic vowel, as in ''vivovespero'' ('the evening of life'), never ''vivvespero''. Word-final consonants occur, though final voiced

obstruent

s are generally rejected. For example, Latin ''ad'' ('to') became Esperanto ''al'', and Polish ''od'' ('than') morphed into Esperanto ''ol'' ('than').

Sonorant

s and voiceless obstruents, on the other hand, are found in many of the numerals: ''cent'' ('hundred'), ''ok'' ('eight'), ''sep'' ('seven'), ''ses'' ('six'), ''kvin'' ('five'), ''kvar'' ('four'); also ''dum'' ('during'), ''eĉ'' ('even'). Even the poetic elision of final ''-o'' is rarely seen if it would leave a final voiced obstruent. A very few words with final voiced obstruents do occur, such as ''sed'' ('but') and ''apud'' ('next to'), but in such cases there is no minimal-pair contrast with a voiceless counterpart (that is, there is no ''set'' or ''aput'' to cause confusion). This is because many people, including the Slavs and Germans, do not contrast voicing in final obstruents. For similar reasons, sequences of

s with mixed voicing are not found in Zamenhofian compounds, apart from numerals and grammatical forms, thus 'for a long time', not . (Note that is an exception to this rule, like in the Slavic languages. It is effectively ambiguous between fricative and approximant. The other exception is , which is commonly treated as .) Syllabic consonants occur only as

interjections and onomatopoeia

: . All triconsonantal onsets begin with a sibilant, ''s'' or ''ŝ''. Disregarding proper names, such as , the following initial consonant clusters occur: *Stop + liquid – ''bl, br; pl, pr; dr; tr; gl, gr; kl, kr'' *Voiceless fricative + liquid – ''fl, fr; sl; ŝl, ŝr'' *Voiceless sibilant + voiceless stop (+ liquid) – ''sc'' , ''sp, spl, spr; st, str; sk, skl, skr; ŝp, ŝpr; ŝt, ŝtr'' *Obstruent + nasal – ''gn, kn, sm, sn, ŝm, ŝn'' *Obstruent + – ''gv, kv, sv, ŝv'' And more marginally, :Consonant + – ''(tj), ĉj, fj, vj, nj'' Although it does not occur initially, the sequence is pronounced as an affricate, as in ('a husband') with an open first syllable not as . In addition, initial occurs in German-derived ('penny'), in

Sanskrit

kshatriya

'), and several additional uncommon initial clusters occur in technical words of

Greek

origin, such as ''mn-, pn-, ks-, ps-, sf-, ft-, kt-, pt-, bd-'', such as ('a sphincter' which also has the coda ). Quite a few more clusters turn up in sufficiently obscure words, such as in "Thlaspi" (a

genus

of herb), and

Aztec

deities such as ('Tlaloc'). (The phonemes are presumably devoiced in these words.) As this might suggest, greater phonotactic diversity and complexity is tolerated in learnèd than in quotidian words, almost as if "difficult" phonotactics were an iconic indication of "difficult" vocabulary. Diconsonantal codas, for example, generally only occur in technical terms, proper names, and in geographical and ethnic terms: ('a conjunction'), ('Arctic'), ('isthmus'). However, there is a strong tendency for more basic terms to avoid coda clusters, although ('hundred'), ('after'), ('holy'), and the prefix ('ex-') (which can be used as an interjection: 'Down with the king!') are exceptions. Even when coda clusters occur in the source languages, they are often eliminated in Esperanto. For instance, many European languages have words relating to "body" with a root of . This root gave rise to two words in Esperanto, neither of which keep the full cluster: ('a military corps') (retaining the original Latin ''u''), and ('a biological body') (losing the ''s''). Many ordinary roots end in two or three consonants, such as ('a bicycle'), ('a shoulder'), ('a needle'), ('to cut'). However, these roots do not normally entail coda clusters except when followed by another consonant in compounds, or with poetic elision of the final ''-o''. Even then, only sequences with decreasing sonority are possible, so although poetic occurs, *, *, and * do not. (Note that the humorous jargon does not follow this restriction, because it elides the grammatical suffix of all nouns no matter how awkward the result.) Within compounds, an

epenthetic vowel is added to break up what would otherwise be unacceptable clusters of consonants. This vowel is most commonly the nominal affix ''-o'', regardless of number or case, as in ''kantobirdo'' ('a songbird') (the root ''kant-'', 'to sing', is inherently a verb), but other part-of-speech endings may be used when ''-o-'' is judged to be grammatically inappropriate, as in ''multekosta'' ('expensive'). There is a great deal of personal variation as to when an epenthetic vowel is used.

Allophonic variation

With only five oral and no nasal or long vowels, Esperanto allows a fair amount of allophonic variation, though the distinction between and , and arguably and , is phonemic. The may be a labiodental fricative or a labiodental approximant , again in free variation; or , especially in the sequences ''kv'' and ''gv'' ( and , like English "qu" and "gu"), but with considered normative. Alveolar consonants ''t, d, n, l'' are acceptably either

apical (as in English) or laminal (as in French, generally but incorrectly called "dental"). Postalveolars ''ĉ, ĝ, ŝ, ĵ'' may be ''palato-alveolar'' (semi-palatalized) as in English and French, or ''retroflex'' (non-palatalized) as in Polish, Russian, and Mandarin Chinese. ''H'' and ''ĥ'' may be voiced, especially between vowels.

Rhotics

The consonant ''r'' can be realised in many ways, as it was defined differently in each language version of the

Fundamento de Esperanto ''Fundamento de Esperanto'' (English: ''Foundation of Esperanto'') is a 1905 book by L. L. Zamenhof, in which the author explains the basic grammar rules and vocabulary that constitute the basis of the constructed language Esperanto. On August ...

: * In the French Fundamento, it is defined as ''r''. The French rhotic has a wide range of realizations: both the voiced uvular fricative or approximant and the

voiceless uvular fricative The voiceless uvular fricative is a type of consonantal sound used in some spoken languages. The symbol in the International Phonetic Alphabet that represents this sound is , the Greek chi. The sound is represented by (ex with underdot) in Am ...

, the

uvular trill The voiced uvular trill is a type of consonantal sound, used in some spoken languages. The symbol in the International Phonetic Alphabet that represents this sound is , a small capital letter ''R''. This consonant is one of several collectively ...

, the

alveolar trill The voiced alveolar trill is a type of consonantal sound used in some spoken languages. The symbol in the International Phonetic Alphabet that represents dental, alveolar, and postalveolar trills is , and the equivalent X-SAMPA symbol is ...

, and the

alveolar tap Alveolus (; pl. alveoli, adj. alveolar) is a general anatomical term for a concave cavity or pit. Uses in anatomy and zoology * Pulmonary alveolus, an air sac in the lungs ** Alveolar cell or pneumocyte ** Alveolar duct ** Alveolar macrophage * M ...

. These are all recognized as the

phoneme In phonology and linguistics, a phoneme () is a unit of sound that can distinguish one word from another in a particular language. For example, in most dialects of English, with the notable exception of the West Midlands and the north-wes ...

, but the trills and the tap are considered dialectal. * In the English Fundamento, it is defined as in ''rare'', which is an

alveolar approximant The voiced alveolar approximant is a type of consonantal sound used in some spoken languages. The symbol in the International Phonetic Alphabet that represents the alveolar and postalveolar approximants is , a lowercase letter ''r'' rotated 180 ...

. * In the German Fundamento, it is defined as ''r''. Most varieties of

Standard German Standard High German (SHG), less precisely Standard German or High German (not to be confused with High German dialects, more precisely Upper German dialects) (german: Standardhochdeutsch, , or, in Switzerland, ), is the standardized variety ...

are spoken with a uvular rhotic, now usually a fricative or approximant , rather than . The alveolar pronunciation is used in some standard German varieties of Germany, Austria, and Switzerland. * In the Russian and Polish Fundamento, it is defined as ''р (cyrillic)'', which is most commonly an

. The most common realization depends on the region and native language of the Esperanto speaker. For example, a very common realisation in English speaking countries is the

alveolar flap The voiced alveolar tap or flap is a type of consonantal sound, used in some spoken languages. The symbol in the International Phonetic Alphabet that represents a dental, alveolar, or postalveolar tap or flap is . The terms ''tap'' and ''flap' ...

. Worldwide, the most common realisation is probably the alveolar trill , which makes some people think it is the most desirable pronunciation. However, it is a common misconception to believe that the alveolar trill is the only correct form. The grammar reference

''Plena Manlibro de Esperanta Gramatiko'' (PMEG)

considers the velar form to be fully acceptable if it is trilled, and considers the other realisations acceptable as well. In practice, the different forms are well understood and accepted by experienced Esperanto speakers.

Vowel length and quality

Vowel length is not phonemic in Esperanto. Vowels tend to be long in open stressed syllables and short otherwise. Adjacent stressed syllables are not allowed in compound words, and when stress disappears in such situations, it may leave behind a residue of vowel length. Vowel length is sometimes presented as an argument for the phonemic status of the affricates, because vowels tend to be short before most

consonant cluster

s (excepting stops plus ''l'' or ''r'', as in many European languages), but long before /ĉ/, /ĝ/, /c/, and /dz/, though again this varies by speaker, with some speakers pronouncing a short vowel before /ĝ/, /c/, /dz/ and a long vowel only before /ĉ/. Vowel quality has never been an issue for /a/, /i/ and /u/, but has been much discussed for /e/ and /o/. Zamenhof recommended pronouncing the vowels /e/ and /o/ as mid at all times. Kalocsay and Waringhien gave more complicated recommendations. For example, they recommended pronouncing stressed /e/, /o/ as short

open-mid

in closed syllables and long

close-mid

in open syllables. However, this is widely considered unduly elaborate, and Zamenhof's recommendation of using mid qualities is considered the norm. For many speakers, however, the pronunciation of /e/ and /o/ reflects the details of their native language.
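The basic length tendency described above (long in open stressed syllables, short otherwise) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the text: the word is passed in already split into syllables, stress falls on the penultimate syllable (the only syllable of a monosyllable is treated as stressed), a syllable counts as open when it ends in one of the five vowel letters, and length is marked with "ː". The function name `mark_vowel_length` is hypothetical, and the sketch ignores the speaker-dependent behaviour before clusters and affricates.

```python
# Illustrative sketch only: marks allophonic vowel length on a
# pre-syllabified Esperanto word. Syllabification is the caller's job.
VOWELS = set("aeiou")


def mark_vowel_length(syllables):
    """Return the word with 'ː' after a vowel that ends the stressed syllable.

    Assumptions: stress is penultimate; a lone syllable is treated as
    stressed; a syllable is open if its last letter is a vowel.
    """
    if not syllables:
        return ""
    stressed = max(len(syllables) - 2, 0)
    marked = []
    for index, syllable in enumerate(syllables):
        is_open = bool(syllable) and syllable[-1] in VOWELS
        if index == stressed and is_open:
            syllable += "ː"  # long vowel in an open stressed syllable
        marked.append(syllable)
    return "".join(marked)


print(mark_vowel_length(["do", "mo"]))   # open stressed syllable
print(mark_vowel_length(["ban", "ko"]))  # closed stressed syllable
```

With the syllabified inputs shown, `domo` ('house') receives a long stressed vowel, while `banko` ('bank') does not, because its stressed syllable is closed by ''n''.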

Epenthesis

Zamenhof noted that

glides may be inserted between dissimilar vowels, especially after

high vowel

s as in for ('my'), for ('honey') and for ('further'). This is quite common, and there is no possibility of confusion, because /ij/ and /uŭ/ do not occur in Esperanto (though more general epenthesis could cause confusion between and , as mentioned above). However, Zamenhof stated that in "severely regular" speech such epenthesis would not occur. Epenthetic glottal stops in vowel sequences such as ('boa') are non-phonemic detail, allowed for the comfort of the speaker. Glottal stop is especially common in sequences of identical vowels, such as ('hero'), and ('great-grandfather'). Other speakers, however, mark the hiatus by a change of intonation, such as by raising the pitch of the stressed vowel: . As in many languages,

fricative

s may become

affricate

s after a nasal, via an epenthetic stop. Thus, the neologism ('sense', as in the five senses) may be pronounced the same as the fundamental word ('sense, meaning'), and the older term for the former, , may be preferable.

Poetic elision

Vowel elision is allowed with the grammatical suffix ''-o'' of singular nominative nouns, and the ''a'' of the article ''la'', though this rarely occurs outside of poetry: ('from the heart'). Normally semivowels are restricted to offglides in diphthongs. However, poetic meter may force the reduction of unstressed and to semivowels before a stressed vowel: ; .

Assimilation

Zamenhof recognized place-assimilation of

nasal

s before another consonant, such as ''n'' before a velar, as in ('bank') and ('blood'), or before palatal , as in ('mommy') and ('sir'). However, he stated that "severely regular" speech would not have such variation from his ideal of 'one letter, one sound' (§17). Nonetheless, although the desirability of such allophony may be debated, the question almost never arises as to whether the ''m'' in should remain bilabial or should assimilate to labiodental ''f'' (), because this assimilation is nearly universal in human language. Indeed, where the orthography allows (e.g. 'bonbon'), we see that assimilation can occur. In addition, speakers of many languages (including Zamenhof's, though not always English) have regressive

voicing assimilation

, when two

obstruents (consonants that occur in voiced-voiceless pairs) occur next to each other. Zamenhof did not mention this directly, but did indicate it indirectly, in that he did not create compound words with adjacent obstruents that have mixed voicing. For example, by the phonotactics of both of Zamenhof's mother tongues, Yiddish and (Belo)Russian, ('rose-colored', 'pink') would be pronounced the same as ('dew-colored'), and so the preferred form for the former is . Indeed, Kalocsay & Waringhien state that when voiced and voiceless consonants are adjacent, the assimilation of one of them is "inevitable". Thus one pronounces ('eighty') as , as if it were spelled ""; ('exist') as , as if it were spelled ""; ('for example') as , ('support') as , ('for a long time') as , ('ringing of a sword') as , etc. (Miroslav Malovec, 1999, §2.9.) Such assimilation likewise occurs in words that maintain Latinate orthography, such as ('absolutely'), pronounced , and ('obtuse'), pronounced , despite the superficially contrastive sequences in the words ('apsis') and ('optics'). Instead, the debate centers on the non-Latinate orthographic sequence ''kz'', frequently found in Latinate words like and above. It is sometimes claimed that ''kz'' is properly pronounced exactly as written, with mixed voicing, , despite the fact that assimilation to occurs in Russian, English (including the words 'example' and 'exist'), Polish (where it is even spelled ), French and many other languages. These two positions are called and in Esperanto. In practice, most Esperanto speakers assimilate ''kz'' to and pronounce ''nk'' as when speaking fluently.

In compound

lexical words

, Zamenhof himself inserted an epenthetic vowel between obstruents with different voicing, as in above, never , and , never as with some later writers; mixed voicing only occurred with grammatical words, for example with compound numbers and with prepositions used as prefixes, as in and above. ''V'' is never found before any consonant in Zamenhof's writing, because that would force it to contrast with ''ŭ''. Similarly, mixed

sequences, as in the polymorphemic ('to scatter'), tend to assimilate in rapid speech, sometimes completely (). Like the generally ignored regressive devoicing in words such as , progressive devoicing tends to go unnoticed within obstruent–

sonorant

clusters, as in ('additional'; contrasts with 'blue') and ('boy'; the ''kn-'' contrasts with ''gn-'', as in 'gnome'). Partial to full devoicing of the sonorant is probably the norm for most speakers. Voicing assimilation of affricates and fricatives before nasals, as in ('a detachment') and the suffix ('-ism'), is both more noticeable and easier for most speakers to avoid, so for is less tolerated than for .
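As a rough illustration of the regressive voicing assimilation described in this section, the following Python sketch rewrites a spelling so that an obstruent takes on the voicing of the obstruent that follows it. Everything in it is an assumption made for illustration: the function name `assimilate_voicing` is hypothetical, the rule operates on letters rather than on phonetic transcription, ''c'' and ''ĥ'' are treated as voiceless triggers with no voiced partner, and ''v'' is deliberately excluded as a trigger (so that sequences such as ''kv'' are left alone) while still being allowed to devoice. The example words are supplied here and are not quoted from the text.

```python
# Illustrative sketch only: letter-level regressive voicing assimilation.
VOICED_TO_VOICELESS = {
    "b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "ĵ": "ŝ", "ĝ": "ĉ",
}
VOICELESS_TO_VOICED = {
    voiceless: voiced for voiced, voiceless in VOICED_TO_VOICELESS.items()
}

# Assumption: 'v' does not trigger voicing of a preceding obstruent.
VOICED_TRIGGERS = set("bdgzĵĝ")
# Assumption: 'c' and 'ĥ' trigger devoicing although they lack voiced letters.
VOICELESS_TRIGGERS = set("ptkfsŝĉcĥ")


def assimilate_voicing(word):
    """Return `word` respelled with regressive voicing assimilation applied."""
    letters = list(word)
    # Walk right to left so that assimilation can spread through a cluster.
    for i in range(len(letters) - 2, -1, -1):
        current, following = letters[i], letters[i + 1]
        if following in VOICED_TRIGGERS and current in VOICELESS_TO_VOICED:
            letters[i] = VOICELESS_TO_VOICED[current]
        elif following in VOICELESS_TRIGGERS and current in VOICED_TO_VOICELESS:
            letters[i] = VOICED_TO_VOICELESS[current]
    return "".join(letters)


for example in ["okdek", "ekzisti", "absolute", "obtuza", "kvar"]:
    print(example, "->", assimilate_voicing(example))
```

Under these assumptions the sketch voices ''k'' before ''d'' and ''z'', devoices ''b'' before ''s'' and ''t'', and leaves ''kvar'' ('four') unchanged. It does not model the progressive devoicing of sonorants or the speaker-dependent tolerance discussed above.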

Loss of phonemic ''ĥ''

The sound of , , was always somewhat marginal in Esperanto, and there has been a strong move to merge it into , starting with suggestions from Zamenhof himself. Dictionaries generally cross-reference and , but the sequence (as in 'architecture') was replaced by () so completely by the early 20th century that few dictionaries even list as an option. The central/eastern European form for 'Chinese', , has been completely replaced with the western European form, , a unique exception to the general pattern, perhaps because the word ('cinematography') already existed. Other words, such as ('chemistry') and ('monk'), still vary but are more commonly found with (). In a few cases, such as with words of Russian origin, may instead be replaced by . This merger has had only a few complications. Zamenhof gave ('chorus') the alternative form , because both ('heart') and ('hour') were taken. The two words still almost universally seen with are ('echo') and ('a Czech'). (

perfective aspect

) and ('check') already exist, though for is occasionally seen.

Proper names and borrowings

A common source of allophonic variation is borrowed words, especially proper names, when non-Esperantized remnants of the source-language orthography remain, or when novel sequences are created in order to avoid duplicating existing roots. For example, it is doubtful that many people fully pronounce the ''g'' in ('Washington') as either or , or pronounce the in ('Buddha') at all. Such situations are unstable, and in many cases dictionaries recognize that certain spellings (and therefore pronunciations) are inadvisable. For example, the physical unit "watt" was first borrowed as , to distinguish it from ('cotton-wool'), and this is the only form found in dictionaries in 1930. However, initial violates Esperanto phonotactics, and by 1970 there was an alternative spelling, . This was also unsatisfactory, however, because of the geminate , and by 2000 the effort had been given up, with now the advised spelling for both 'watt' and 'cotton-wool'. Some recent dictionaries no longer even list initial in their index. (For instance, the ''Reta Vortaro'' didn't list for year until it added an entry for 'wow!' in 201.) Likewise, several dictionaries now list the spellings for 'Washington' and for 'Buddha'.

Violations

Before Esperanto

became fixed, foreign words were adopted with spellings that violated the apparent intentions of Zamenhof and the norms that would develop later, such as ('

poop deck

'), ('

watt

'), and (' sports match'). Many of these coinages have proven to be unstable, and have either fallen out of use or been replaced with pronunciations more in keeping with the developing norms, such as for , for , and for . On the other hand, ('

') was also sometimes criticized on phonotactical grounds, but was used by Zamenhof after its introduction in the ''Plena Vortaro'' as a replacement for ''novjuda'' and ''judgermana'' and is well established.
