The orthographic depth of an alphabetic orthography indicates the degree to which a written language deviates from simple one-to-one letter–phoneme correspondence. It depends on how easy it is to predict the pronunciation of a word from its spelling: shallow orthographies are easy to pronounce from the written word, and deep orthographies are difficult to pronounce from how they are written.

In shallow orthographies, the spelling-sound correspondence is direct: given the rules of pronunciation, one can pronounce a word correctly. In other words, shallow (transparent) orthographies, also called phonemic orthographies, have a one-to-one relationship between their graphemes and phonemes, and the spelling of words is very consistent. Examples include Japanese kana, Hindi, Lao (since 1975), Spanish, Finnish, Turkish, Georgian, Latin, Italian, Serbo-Croatian, Ukrainian, and Welsh.

In contrast, in deep (opaque) orthographies, the relationship is less direct, and the reader must learn the arbitrary or unusual pronunciations of irregular words. That is, deep orthographies are writing systems that do not have a one-to-one correspondence between sounds (phonemes) and the letters (graphemes) that represent them. They may reflect etymology. Examples include English, Danish, Swedish, Faroese, Chinese, Tibetan, Mongolian, Thai, Khmer, Burmese, Lao (until 1975; now only used overseas), French, and Franco-Provençal.

Orthographies such as those of German, Hungarian (mainly phonemic, except that ''ly'' and ''j'' represent the same sound, consonant and vowel length are not always marked accurately, and various spellings reflect etymology rather than pronunciation), Portuguese, modern Greek, Icelandic, Korean, Tamil, and Russian are considered to be of intermediate depth, as they include many morphophonemic features (see §Comparison between languages).

By language

Written Korean represents an unusual hybrid: each phoneme in the language is represented by a letter, but the letters are packaged into "square" units of two to four phonemes, each square representing a syllable. Korean has very complex phonological variation rules, especially for consonants rather than vowels, in contrast to English. As a result, the actual pronunciation of a word can differ from the pronunciation predicted by the standard values of the letters that make it up. Among the consonants of the Korean language, only one is always pronounced exactly as it is written.

Italian offers clear examples of differential directionality in depth. Even in a very shallow orthographic system, spelling-to-pronunciation and pronunciation-to-spelling may not be equally clear. There are two major imperfect matches of vowels to letters: in stressed syllables, ''e'' can represent either open /ɛ/ or closed /e/, and ''o'' stands for either open /ɔ/ or closed /o/. According to the orthographic principles used for the language, ''setta'' 'sect', for example, with open /ɛ/ can be spelled only ''setta'', and ''vetta'' 'summit' with closed /e/ can be spelled only ''vetta'': a listener who can hear the word can spell it. But since the letter ''e'' is assigned to represent both /ɛ/ and /e/, there is no principled way to know whether to pronounce the written words ''setta'' and ''vetta'' with /ɛ/ or /e/; the spelling does not present the information needed for accurate pronunciation.

A second lacuna in Italian's shallow orthography is that, although stress position in words is only very partially predictable, it is normally not indicated in writing. For purposes of spelling, it makes no difference which syllable is stressed in the place names ''Arsoli'' and ''Carsoli'', but the spellings offer no clue that they are ''ARsoli'' and ''CarSOli'' (and, as with the letter ''e'' above, the quality of the stressed ''o'' of ''Carsoli'' cannot be determined from the spelling).
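The directional asymmetry described for the Italian stressed vowels can be illustrated with a small sketch. The table of letter–sound pairs below is a hypothetical toy inventory (standard IPA symbols for the open and closed mid vowels, plus ''a'', ''i'', ''u''), and the "mean number of choices" score is an illustrative measure chosen for this example, not an established metric of orthographic depth.

```python
from collections import defaultdict

# Hypothetical toy inventory: (letter, phoneme) pairs for Italian stressed vowels.
# This is an illustrative assumption, not a complete description of Italian.
PAIRS = [
    ("e", "ɛ"),  # open e
    ("e", "e"),  # closed e
    ("o", "ɔ"),  # open o
    ("o", "o"),  # closed o
    ("a", "a"),
    ("i", "i"),
    ("u", "u"),
]


def build_maps(pairs):
    """Return (reading, writing): letter -> phonemes and phoneme -> letters."""
    reading = defaultdict(set)  # spelling-to-pronunciation direction
    writing = defaultdict(set)  # pronunciation-to-spelling direction
    for letter, phoneme in pairs:
        reading[letter].add(phoneme)
        writing[phoneme].add(letter)
    return dict(reading), dict(writing)


def mean_choices(mapping):
    """Average number of alternatives per key; 1.0 means strictly one-to-one."""
    return sum(len(options) for options in mapping.values()) / len(mapping)


reading, writing = build_maps(PAIRS)
reading_score = mean_choices(reading)  # 5 letters share 7 readings
writing_score = mean_choices(writing)  # each of 7 phonemes has one spelling

print(f"reading: {reading_score:.2f}, writing: {writing_score:.2f}")
```

In this toy inventory every phoneme has exactly one spelling, so the writing direction is unambiguous, while the letters ''e'' and ''o'' each have two possible readings, so the reading direction is not.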

Orthographic depth hypothesis

According to the orthographic depth hypothesis, shallow orthographies are more easily able to support a word recognition process that involves the language's phonology. In contrast, deep orthographies encourage a reader to process printed words by referring to their morphology via the printed word's visual-orthographic structure (see also Ram Frost). For languages with relatively deep orthographies, such as English, French, unvocalised Arabic or Hebrew, new readers have much more difficulty learning to decode words. As a result, children learn to read more slowly (Goswami, Usha (2005-09-06). "Chapter 28: Orthography, Phonology, and Reading Development: A Cross-Linguistic Perspective". In Malatesha, Joshi. ''Handbook of Orthography and Literacy''. Lawrence Erlbaum Assoc Inc. pp. 463–464). For languages with relatively shallow orthographies, such as Italian and Finnish, new readers have few problems learning to decode words. As a result, children learn to read relatively quickly.

Van den Bosch et al. consider orthographic depth to be the composition of at least two separate components. The first relates to the complexity of the relations between the elements at the graphemic level (graphemes) and those at the phonemic level (phonemes), i.e., how difficult it is to convert graphemic strings (words) to phonemic strings. The second relates to the diversity at the graphemic level and to the complexity of determining the graphemic elements of a word (graphemic parsing), i.e., how to align a phonemic transcription with its spelling counterpart.

In 2021, Xavier Marjou used an artificial neural network to rank 17 orthographies according to their level of transparency. Among the tested orthographies, Chinese and French, followed by English and Russian, are the most opaque for writing (i.e., in the phoneme-to-grapheme direction), and English, followed by Dutch, is the most opaque for reading (i.e., in the grapheme-to-phoneme direction). Esperanto, Arabic, Finnish, Korean, Serbo-Croatian, and Turkish are very shallow both to read and to write; Italian is shallow to read and very shallow to write; and Breton, German, Portuguese, and Spanish are shallow both to read and to write.
