
Estonian orthography is the system used for writing the Estonian language and is based on the Latin alphabet. The Estonian orthography is generally guided by phonemic principles, with each grapheme corresponding to one phoneme.
Alphabet
Due to German and Swedish influence, the Estonian alphabet has the letters Ä, Ö, and Ü (A, O, and U with diaeresis), which represent the vowel sounds /æ/, /ø/, and /y/, respectively. Unlike German umlauts, they are considered, and alphabetised as, separate letters. The most distinctive letter in the Estonian alphabet, however, is Õ (O with tilde), which was added to the alphabet in the 19th century by Otto Wilhelm Masing and stands for the vowel /ɤ/. The alphabet also differs from the basic Latin alphabet by the addition of the letters Š and Ž (S and Z with caron/háček), and by the position of Z: it has been moved from the end to between S and T (more precisely, between Š and Ž).
The official Estonian alphabet has 27 letters: A, B, D, E, F, G, H, I, J, K, L, M, N, O, P, R, S, Š, Z, Ž, T, U, V, Õ, Ä, Ö, Ü. The letters F, Š, Z, Ž are so-called "foreign letters" and occur only in loanwords and proper names.
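The alphabetical order described above (Z and Ž placed between Š and T, and Õ, Ä, Ö, Ü at the end) differs from plain code-point order, so sorting Estonian words needs a custom key. The following is a minimal sketch, assuming the 27-letter official alphabet listed above; the function name `estonian_sort_key`, the sample words, and the handling of characters outside the alphabet (sorted after all listed letters) are illustrative assumptions, not part of any standard library.

```python
# Official 27-letter order as listed in the text (Z and Ž sit between Š and T).
ESTONIAN_ALPHABET = "abdefghijklmnoprsšzžtuvõäöü"
RANK = {letter: index for index, letter in enumerate(ESTONIAN_ALPHABET)}


def estonian_sort_key(word):
    """Return a sort key that follows the Estonian alphabetical order.

    Assumption for illustration: characters outside the 27-letter alphabet
    are ranked after every listed letter and then compared by code point.
    """
    return [(RANK.get(ch, len(RANK)), ch) for ch in word.lower()]


# Illustrative sample words (not taken from the article).
words = ["tuba", "zooloogia", "saal", "öö", "õun", "uks", "šokolaad", "äri"]
sorted_words = sorted(words, key=estonian_sort_key)
print(sorted_words)
```

For production use, a locale-aware collation library would be a more robust choice than a hand-written key; this sketch only makes the letter order explicit.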
Additionally, ''C'', ''Q'', ''W'', ''X'', and ''Y'' are "foreign letters" used only in writing foreign proper names. These letters do not occur in any Estonian words, and thus are not usually considered part of the "Estonian proper" alphabet. Including all the foreign letters, the entire alphabet consists of the following 32 letters: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, Š, Z, Ž, T, U, V, W, Õ, Ä, Ö, Ü, X, Y.
In Blackletter script, W was used instead of V. In some reference works in the past, V and W were sorted as if they were one and the same letter.
Johannes Aavik suggested that the letter Ü be replaced by Y, as it has been in the Finnish alphabet.
Double letters are used to write half-long and overlong vowels and consonants, e.g. ''aa'', ''nn'', ''kk''. For more information, see below.
As the distinction between voiced and voiceless plosives is not native to Estonian, the names of the letters 'b', 'd', 'g' may be pronounced like the names of 'p', 't', 'k', so the letters 'b' and 'd' also have alternative names that distinguish them from 'p' and 't'. For the usage of these letters, see below.
Orthographic principles
Although Estonian orthography is generally guided by phonemic principles, with each grapheme corresponding to one phoneme, there are some historical and morphological deviations from this: for example, the initial letter ''h'' in words, the preservation of the morpheme in the declension of a word (writing ''b'', ''g'', ''d'' in places where ''p'', ''k'', ''t'' is pronounced), and the use of ''i'' and ''j''. Where it is impractical or impossible to type ''š'' and ''ž'', they are substituted with ''sh'' and ''zh'' in some written texts, although this is considered incorrect. Otherwise, the ''h'' in ''sh'' represents a voiceless glottal fricative, as in ''pasha'' (''pas-ha''); this also applies to some foreign names.
Some features of the modern Estonian orthography are:
* Word-initial ''b, d, g'' occur only in loanwords and are normally pronounced as voiceless [p], [t], [k]. Some old loanwords are spelled with ''p, t, k'' instead of etymological ''b, d, g'': ''pank'' 'bank'. Word-medially and word-finally, ''b, d, g'' represent short plosives (which may be pronounced as partially voiced consonants), ''p, t, k'' represent half-long plosives, and ''pp, tt, kk'' represent overlong plosives; for example: ''kabi'' 'hoof', ''kapi'' 'wardrobe' (genitive singular), ''kappi'' 'wardrobe' (illative singular).
* Before and after ''b, p, d, t, g, k, s, h, f, š, z, ž'', the sounds [p], [t], [k] are written as ''p, t, k'', with some exceptions due to morphology or etymology. For example, the suffixed particle ''-gi'' 'too, also' may become ''-ki'', but does not alter the spelling of the stem, so ''kõrb'' 'desert' + ''-gi'' becomes ''kõrbki''.
* Word-initial ''h'' is usually dropped in spontaneous speech, but should be represented in writing.
* The letter ''j'' is used at the beginning of syllables, but ''i'' is used at the end of diphthongs. Double ''j'' is used only in some illative case forms. The spelling ''üü'' before vowels corresponds to the pronunciation : 'he sells' (from 'to sell'). The spelling ''üi'' is used only in the loanwords , , . Between ''i'' and vowels, the epenthetic sound [j] is pronounced but not written. It is, however, written in the suffix ''-ja''.
* Vowels and the consonants ''h, j, l, m, n, r, s, v'' are written single when they are short, double when they are half-long or overlong: for example 'blood' vs. 'edge' vs. 'roll', and ''lina'' 'sheet' vs. ''linna'' 'town' (genitive singular) vs. ''linna'' 'town' (illative singular).
* Diphthongs and consonant combinations are written as combinations of single letters, regardless of whether they are pronounced short or long. Only ''s'' after ''l, m, n, r'' may be doubled if not followed by another consonant (''valss'' 'waltz'); otherwise the combinations "consonant+double consonant" and "double consonant+consonant" occur only at morpheme boundaries, e.g. ''modernne'' 'modern' (''-ne'' is a suffix), ''pappkarp'' 'cardboard box' (from ''papp'' 'cardboard' and ''karp'' 'box'). However, a double consonant at the end of a root is simplified before a suffix beginning with a consonant (except ''-gi/-ki''): ''linlane'' 'townsman' (from ''linn'' 'town').
* The single word-medial or word-final letters ''f'' and ''š'' represent half-long consonants; the double letters ''ff'' and ''šš'' represent overlong consonants. After consonants, ''f'' and ''š'' are always written single, regardless of whether they are pronounced half-long or overlong.
* Palatalization is not indicated in writing, e.g. ''kann'' 'jug' vs. ''kann'' 'toy'. It occurs in words that have ''i'' in declension: ''kanni'' 'toy' (genitive singular).
* Stress is not indicated in writing. Usually it falls on the first syllable, but there are a few exceptions with the stress on the second syllable: ''aitäh'' 'thanks', ''sõbranna'' 'female friend'. Often the original stress is preserved in loanwords, such as ''ideaal'' 'ideal', ''professor'' 'professor'; the presence of long vowels (as in ''ideaal'') also shows stress.
Syllabification
One consonant between two vowels belongs to the following syllable: ''kala'' 'fish' is syllabified ''ka-la''. Consonant combinations are syllabified before the last consonant: ''linna'' 'town' (genitive singular) is syllabified ''lin-na'', and ''tuttav'' 'acquaintance' is syllabified ''tut-tav''. Consonant digraphs and trigraphs in foreign names are regarded as single consonants: ''Manchester'' is syllabified ''Man-ches-ter''. Two vowels usually form a long vowel or a diphthong, e.g. ''laulu'' 'song' is syllabified ''lau-lu''. However, a hiatus is formed at morpheme boundaries, e.g. ''avaus'' 'opening' is syllabified ''a-va-us'', as the word is composed of the root ''ava-'' and the suffix ''-us''. Combinations of three vowel letters represent a hiatus of a long vowel or a diphthong and another vowel, e.g. ''põuane'' 'dry, droughty, arid (lacking rain)' is syllabified ''põu-a-ne''; but some loanwords have a hiatus of a short vowel followed by a long vowel: ''oaas'' 'oasis' is syllabified ''o-aas''. Compound words are syllabified as combinations of their parts: ''vanaema'' 'grandmother' is syllabified ''va-na-e-ma'', as the word is composed of ''vana'' 'old' and ''ema'' 'mother'. Etymologically compound loanwords and foreign names may be syllabified as compound or simple words: ''fotograaf'' 'photographer' is syllabified ''fo-to-graaf'' or ''fo-tog-raaf'', and ''Petrograd'' is syllabified ''Pet-ro-grad'' or ''Pet-rog-rad''.
These syllabification rules are used for hyphenating words at the end of a line, with the additional rule that a single letter is not left on a line.
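The two basic rules above (a single consonant between vowels goes with the following syllable; a consonant combination is split before its last consonant) can be sketched in a few lines. This is a simplified illustration, not a complete hyphenator: the function name `syllabify` is hypothetical, and the sketch deliberately ignores morpheme boundaries, compound words, three-vowel sequences, and foreign digraphs, so it will not reproduce forms such as ''va-na-e-ma'' or ''põu-a-ne''.

```python
import re

VOWELS = "aeiouõäöü"


def syllabify(word):
    """Split a simple (non-compound) word using the two basic rules only.

    Vowel runs are kept together (long vowel or diphthong); a consonant run
    between two vowel runs is split before its last consonant.
    """
    # Break the word into alternating runs of vowels and non-vowels.
    runs = re.findall(f"[{VOWELS}]+|[^{VOWELS}]+", word.lower())
    syllables = []
    current = ""
    for index, run in enumerate(runs):
        if run[0] in VOWELS:
            current += run
        elif 0 < index < len(runs) - 1:
            # Medial consonants: all but the last close the current syllable,
            # the last one opens the next syllable.
            current += run[:-1]
            syllables.append(current)
            current = run[-1]
        else:
            # Word-initial or word-final consonants stay with their syllable.
            current += run
    if current:
        syllables.append(current)
    return "-".join(syllables)


for example in ["kala", "linna", "laulu"]:
    print(example, "->", syllabify(example))
```

Handling the remaining rules would require morphological information (where roots, suffixes, and compound parts begin), which cannot be read off the spelling alone.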
Foreign words
Loanwords are normally adapted to Estonian spelling: ''veeb'' 'web', ''džäss'' 'jazz'. However, foreign words and phrases may sometimes be used in the original spelling, such as Latin phrases, Italian musical terms, and exotic words. Such citations are typographically emphasized using italics and declined using an apostrophe: ''croissant'id'' 'croissants'.
Foreign proper names from Latin-script languages are written in their original spelling: ''Margaret Thatcher'', ''Bordeaux''. Names from non-Latin-script languages are written using either Estonian orthographic transcription or established romanization systems. Some geographical names (and some names of historical personalities, such as monarchs) have traditional Estonian forms (including some adapted spellings such as ''Viin'' for German ''Wien'' 'Vienna').
Derivations from foreign proper names with the suffixes ''-lik, -lane, -lus, -ism, -ist'' usually conserve the spelling of names, but a few are adapted by established tradition. Derivations without suffixes or with other suffixes are adapted to Estonian spelling: ''njuuton'' 'newton' (unit of force), ''haimoriit'' (maxillary sinusitis, inflammation of the antrum of Highmore), ''üterbium'' 'ytterbium' (chemical element), etc.
Expressions such as 'degree Celsius', 'Cheddar cheese' conserve the spelling of proper names (adding case endings). However, names of plants and animals are usually written in adapted forms, e.g. 'Colorado beetle'.
The apostrophe is used when adding case endings to proper names with unusual grapheme-to-phoneme correspondences (such as names ending in a consonant orthographically but in a vowel phonetically, or vice versa), e.g. ''Provence'i'' (genitive of ''Provence'').
Capitalization
Capital letters are written at the beginning of the first word in a sentence, in proper names, and in official names functioning as proper names. They may also be used in the pronouns ''Sina'' 'you (singular)' and ''Teie'' 'you (plural, also used as formal singular)' to show respect.
Names of months, days of the week, holidays, Chinese zodiac years, and titles of people are not capitalized.
Titles of books, films, etc. are written in quotation marks with only the first word and proper names capitalized.
Compound words
Compound words are written as one word, but they are often composed of genitive+nominative and are hard to distinguish from simple word combinations. A compound is considered a single word and written together when: 1) it has a separate meaning, e.g. ''peatükk'' 'chapter' but ''pea tükk'' 'part of a head'; 2) it differs from the genitive+nominative combination, e.g. ''vesiveski'' (nominative+nominative) 'watermill'; 3) some combinations may be written together or separately, but writing them together is preferred in more complex phrases: 'member of a party' vs. 'every member of the party'. Rare and long word combinations are typically written separately.
The hyphen is used: 1) in compounds where one of the parts is a letter ('vitamin C'), an initialism ('text TV'), a foreign citation ('joke show') or a word part ('word containing' that part); 2) in compound adjectives where the first part is a proper name; 3) in compound geographical names such as 'South Estonia'; 4) as a suspended hyphen, e.g. 'gold and silver things' (also in compound words such as 'export-import bank'); 5) in "nominative+ablative" adverbs, e.g. 'day after day'; 6) in dvandva compounds, e.g. 'father and mother'; 7) in compound adjectives from word phrases, e.g. 'related to tentative phonetics'; 8) in compound adjectives with coordinating meaning, e.g. 'Estonian-English dictionary'; 9) in double names such as . It can be optionally used in unusual compounds such as 'gooseberry disease'; in compounds with three or four identical letters in a row (e.g. 'yearly', 'bone groove'); in compounds with numbers (see below) or with signs (e.g. '+ sign'); in the construction 'genitive of a proper name + nominative' after another genitive (e.g. 'European part of Russia'); in the colloquial construction 'genitive of a proper name + noun' instead of 'noun + proper name', e.g. instead of 'Uncle Kuusk'; in ad hoc compounds such as ; in words from two-or-more-component proper names, e.g. .
Abbreviations
The abbreviation period (full stop) may be used, but it is not mandatory. Commonly used abbreviations are usually written without the abbreviation period: , , or for 'street'; for 'see'; for 'and many others'. Using the abbreviation period is recommended when an abbreviation may be misread as another word: for 'figure, draft' but 'line'. If an abbreviation of a word phrase may be mistaken for a word or for another abbreviation, periods are used after every letter but the last one, and spaces are not used: for but 'mother', for but for 'economy'.
The hyphen is used in some abbreviations of compound words, e.g. for 'doctor of pedagogy', for 'captain lieutenant', especially in the construction ''abbreviation + complete word'', such as for 'reinforced concrete panels'.
Numerals
Numerals may be written in words (''üks'' 'one', ''kaks'' 'two', ''kolm'' 'three', ...) or in figures (1, 2, 3, ...). In Estonian texts, the comma is used as the decimal separator, and the space is used as the thousands separator (in financial documents, the point can be used as the thousands separator to prevent an extra digit from being inserted). The point is used as a separator in dates, times of day, prices, and sports results in meters and centimeters. For prices in euros and cents, both ''€ 84.95'' and ''84,95 €'' are accepted. The time of day in hours and minutes (24-hour format) may be written using the point or the colon (without spaces): ''16.15'' or ''16:15''; but seconds are separated by the point: ''16:15.25''. The colon with spaces is used for ratios, e.g. ''2 : 3''.
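The separator conventions above (comma as decimal separator, space as thousands separator) can be produced from Python's standard number formatting by swapping the separators. This is a minimal sketch; the function name `format_estonian_number` and the default of two decimal places are illustrative assumptions, and a plain space is used where typeset text might prefer a non-breaking space.

```python
def format_estonian_number(value, decimals=2):
    """Format a number with a space as thousands separator and a comma as decimal mark."""
    # Start from the English-style form, e.g. '1,234,567.89'.
    english_style = f"{value:,.{decimals}f}"
    # Swap separators: thousands commas become spaces, the decimal point becomes a comma.
    return english_style.replace(",", " ").replace(".", ",")


print(format_estonian_number(1234567.891))
print(format_estonian_number(84.95))
```

The order of the two replacements matters: the thousands commas must be turned into spaces before the decimal point is turned into a comma.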
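When written in words, numerals with ''-teist'' or ''-teistkümmend'' (11 to 19), ''-kümmend'' (tens) and ''-sada'' (hundreds) are written together, e.g. ''viisteist'' 'fifteen', ''viiskümmend'' 'fifty', ''viissada'' 'five hundred'. Other compound numerals are written separately: ''kakskümmend viis'' 'twenty-five'.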
For writing ordinal numbers in figures, the ordinal dot is used: ''16.'' for 'the sixteenth'. In some cases, ordinals are written as Roman numerals (without the ordinal dot). Roman numerals followed by a dot may be used in numbered lists.
Case forms of cardinal and ordinal numerals may be written in the form "figures+case ending" with or without a hyphen: ''16s'' or ''16-s'' for 'sixteen' (inessive), or ''16ndas'' or ''16-ndas'' for 'the sixteenth' (inessive). For case endings beginning with the letter ''l'', the hyphen is mandatory to avoid confusion with the digit 1: ''16-le'' for 'sixteen' (allative). Case endings after figures are not used when a cardinal or ordinal numeral agrees in case with a following noun. Likewise, compound words with numbers written in figures may be written with or without the hyphen: or for '60-watt light'.
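The hyphenation rule for case endings after figures can be expressed as a small helper. This is a sketch under stated assumptions: the function name `attach_case_ending` and the `prefer_hyphen` flag are hypothetical, and the helper only applies the rule given above (the hyphen is optional in general but mandatory before endings that begin with ''l''); it does not decide whether an ending should be written at all.

```python
def attach_case_ending(number, ending, prefer_hyphen=False):
    """Join a figure and a case ending, e.g. 16 + 's' -> '16s' or '16-s'.

    The hyphen is forced when the ending starts with 'l', so that the
    letter is not misread as the digit 1 (e.g. '16-le', never '16le').
    """
    if ending.startswith("l") or prefer_hyphen:
        return f"{number}-{ending}"
    return f"{number}{ending}"


print(attach_case_ending(16, "s"))
print(attach_case_ending(16, "s", prefer_hyphen=True))
print(attach_case_ending(16, "le"))
```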
Punctuation
The period (full stop) is used at the end of sentences, as an ordinal mark and sometimes as an abbreviation mark and as a number separator (see above).
The comma is used for appositions (but appositions in genitive require the comma only before them), for more than one attribute after a determined word, for enumerations (but the serial comma is not used), between coordinated or subordinated clauses, between direct speech and author's words, before and after parenthetic or vocative phrases, and before and after some other constructions. It is also used between placenames and dates in the nominative case (but not in locative cases); between a surname and a given name, if they are written in this order; before parts of an address; and as a decimal mark.
The colon is used before lists, before direct speech, before explanations, and also in writing daytime and ratios (see above).
The semicolon is used between weakly related parts of sentences, especially containing commas.
The hyphen is used for writing compound words (see above). It is also used for hyphenating words at the end of a line, for declining letters and abbreviations, and optionally for declining acronyms/initialisms, numbers, and symbols.
The dash is used when there appears a generalizing word after an enumeration; instead of the comma for accenting clauses and appositions or for relatively long parenthetical constructions; before words indicating surprise; for slight pauses (interchangeably with the ellipsis); in the meaning "from...to" (instead of the word ''kuni''); for indicating lines or routes (when in attributive function, the hyphen is also accepted); between coordinated attributes if at least one attribute has a hyphen or a space; between remarks of a dialogue written as one line without author's words; as a marker before enumeration items. The dash is not used to indicate omission of a word that would be repeated.
The exclamation and question marks are used at the end of exclamative and interrogative sentences. Occasionally, they may be parenthesized and written after words within sentences to show doubt or surprise. The exclamation mark is also used for addressing people in letters, e.g. . Using the comma or the colon in this case is considered inappropriate.
The quotation marks, written as „ ”, are used for direct speech, citations, scare quotes, and names of books, documents, episodes, enterprises, etc. Names of plant sorts may be written in double or in single quotation marks (looking like apostrophes: ’ ’) and are normally italicized. For cited words and phrases, including words in a linguistic context, quotation marks or italics may be used. Quotation marks are not used in the names of institutions, periodicals, awards, wares, and vehicles.
The apostrophe is used for adding case endings and suffixes to foreign names with unusual grapheme-to-phoneme correspondences and to foreign citations in the original spelling (see above). Sometimes the apostrophe is used for adding case endings and suffixes to Estonian names, to make the original form clear: (allative of the surname ), (the apostrophe is used to conserve the spelling of the surname , otherwise the double consonant would become a single consonant). Also, the apostrophe is sometimes used in poetry to indicate omission of a sound: , , instead of , , are found in Lydia Koidula's poems. Single quotation marks (' ') are used for word meanings in a linguistic context.
The parentheses are used for parenthetical words or sentences, and also for optional parts of words in a linguistic context.
The square brackets are used for citer's notes to citations and for showing pronunciation in linguistic and reference works.
The slash is used for division in fractions and unit symbols, for connecting alternatives, to show line breaks when citing poetry in the single-line format, and for non-calendar years. In practice, it occasionally appears in abbreviations made of more than one word (e. g. for 'school year'), but this usage is considered nonstandard (correct abbreviation: ). Spaces are used before and after the slash only if it separates text fragments of more than one word.
The ellipsis is used for slight pauses and for unfinished thoughts. It is surrounded by spaces. Also, the ellipsis is used for bowdlerizing obscene words.
History
Modern Estonian orthography is based on the ''Newer Orthography'' created by Eduard Ahrens in the second half of the 19th century based on Finnish orthography. The ''Older Orthography'' it replaced was created in the 17th century by Bengt Gottfried Forselius and Johann Hornung based on standard German orthography. In the old orthography, single consonants following short vowels were written double even if they were short (''kala'' 'fish' was written as ''kalla'') and long vowels in an open syllable were written single (''looma'' 'to create' was written as ''loma''). Before Otto Wilhelm Masing introduced the letter ''õ'' in the early 19th century, its sound had not been distinguished in writing from ''ö''. Earlier writing in Estonian had by and large used an ad hoc orthography based on Latin and Middle Low German orthography. Some influences of the standard German orthography, for example writing ''W''/''w'' instead of ''V''/''v'', persisted well into the 1930s.
In Fraktur typesetting (which was common in Estonian publications until the first half of the 20th century), two kinds of the small letter ''s'' were distinguished: the short ''s'' and the long ''ſ''. The long ''ſ'' was used at the beginning and in the middle of syllables, and the short ''s'' was used at the end of syllables. For example: 'cat' vs. 'cat' (genitive singular, partitive singular).
In the second half of the 20th century, some Estonian words and names were quoted in international publications from Soviet sources and were often spelled as incorrect back-transliterations from Russian Cyrillic. Examples of such Russian transliteration include the use of я ("ya") for ä (e.g. Pyarnu (Пярну) for Pärnu), ы ("y") for õ (e.g. Pylva (Пылва) for Põlva) and ю ("yu") for ü (e.g. Pyussi (Пюсси) for Püssi).
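The three correspondences mentioned above can be undone mechanically for the cited examples. The sketch below is deliberately naive and only illustrates the mapping: the names `BACK_TRANSLITERATION` and `restore_estonian` are hypothetical, only lowercase sequences are handled, and blindly replacing every "y" would corrupt names that legitimately contain that letter, so this is not a general-purpose converter.

```python
# Romanized sequence -> Estonian letter, per the examples in the text.
# Digraphs come first so that "yu" and "ya" are not split by the bare "y" rule.
BACK_TRANSLITERATION = [("yu", "ü"), ("ya", "ä"), ("y", "õ")]


def restore_estonian(romanized):
    """Undo the ya/y/yu back-transliteration for simple cases (illustrative only)."""
    result = romanized
    for latin, estonian in BACK_TRANSLITERATION:
        result = result.replace(latin, estonian)
    return result


for name in ["Pyarnu", "Pylva", "Pyussi"]:
    print(name, "->", restore_estonian(name))
```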
See also
* Estonian Braille
* Finnish alphabet