An orthography is a set of conventions for writing a language. It includes norms of spelling, hyphenation, capitalization, word breaks, emphasis, and punctuation.
Most transnational languages in the modern period have a writing system, and for most such languages a standard orthography has been developed, often based on a standard variety of the language, and thus exhibiting less dialect variation than the spoken language. A language's orthography sometimes varies, as between American and British spelling in the case of English orthography. In some languages orthography is regulated by language academies, although for many languages (including English) there are no such authorities. Even in the latter languages, a significant amount of consensus arises naturally, although maximal consistency or standardization occurs only when it is prescriptively imposed according to style guides.
Etymology and meaning
The English word ''orthography'' dates from the 15th century. It comes from the French ''orthographie'', from Latin ''orthographia'', which derives from Greek ὀρθός ''orthós'', "correct", and γράφειν ''gráphein'', "to write".
Orthography is largely concerned with matters of spelling, and in particular the relationship between phonemes and graphemes in a language. Other elements that may be considered part of orthography include hyphenation, capitalization, word breaks, emphasis, and punctuation. Orthography thus describes or defines the set of symbols used in writing a language, and the rules regarding how to use those symbols.
Most natural languages developed as oral languages, and writing systems have usually been crafted or adapted as ways of representing the spoken language. The rules for doing this tend to become standardized for a given language, leading to the development of an orthography that is generally considered "correct". In linguistics the term ''orthography'' is often used to refer to any method of writing a language, without judgment as to right and wrong, on the understanding that orthographic standardization exists on a spectrum of strength of convention. The original sense of the word, though, implies a dichotomy of correct and incorrect, and the word is still most often used to refer specifically to a thoroughly standardized, prescriptively correct way of writing a language. A distinction may be made here between ''etic'' and ''emic'' viewpoints: the purely descriptive (etic) approach, which simply considers any system that is actually used, and the emic view, which takes account of language users' perceptions of correctness.
Units and notation
Orthographic units, such as letters of an alphabet, are technically called graphemes. These are a type of abstraction, analogous to the phonemes of spoken languages; different physical forms of written symbols are considered to represent the same grapheme if the differences between them are not significant for meaning. For example, different forms of the letter "b" are all considered to represent a single grapheme in the orthography of, say, English.
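As a loose computational analogy (an illustration, not a claim about how linguists define graphemes), Unicode encodes several visually distinct forms of "b" as compatibility variants of one base letter. The sketch below uses Python's standard `unicodedata` module to fold such variants together; the names `VARIANT_FORMS` and `fold_to_base_letter` are hypothetical and introduced only for this example.

```python
import unicodedata

# Visually different forms that Unicode treats as compatibility variants
# of the plain Latin letter "b" (an analogy for different physical forms
# of one grapheme).
VARIANT_FORMS = {
    "plain": "b",
    "fullwidth": "\uff42",
    "mathematical bold": "\U0001d41b",
    "circled": "\u24d1",
}


def fold_to_base_letter(symbol: str) -> str:
    """Map a compatibility variant onto its base character via NFKC."""
    return unicodedata.normalize("NFKC", symbol)


folded = {name: fold_to_base_letter(form) for name, form in VARIANT_FORMS.items()}

# True when every variant collapses to the same underlying letter.
all_same_letter = len(set(folded.values())) == 1
```

Normalization only covers variants that Unicode happens to encode separately; handwriting or typeface differences are never distinct code points in the first place, so the analogy is partial.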
Graphemes or sequences of them are sometimes placed between angle brackets, as in ⟨b⟩. This distinguishes them from phonemic transcription, which is placed between slashes (/b/), and from phonetic transcription, which is placed between square brackets ([b]).
Types
The writing systems on which orthographies are based can be divided into a number of types, depending on what type of unit each symbol serves to represent. The principal types are ''logographic'' (with symbols representing words or morphemes), ''syllabic'' (with symbols representing syllables), and ''alphabetic'' (with symbols roughly representing phonemes). Many writing systems combine features of more than one of these types, and a number of detailed classifications have been proposed. Japanese is an example of a language that can be written using a combination of logographic kanji characters and syllabic hiragana and katakana characters; as with many languages whose writing systems are not alphabetic, alphabetic romaji characters may also be used as needed.
Correspondence with pronunciation
Orthographies that use alphabets and syllabaries are based on the principle that the written symbols (graphemes) correspond to units of sound of the spoken language: phonemes in the former case, and syllables in the latter. However, in virtually all cases, this correspondence is not exact. Different languages' orthographies offer different degrees of correspondence between spelling and pronunciation. English orthography, French orthography and Danish orthography, for example, are highly irregular, whereas the orthographies of languages such as Russian, German and Spanish represent pronunciation much more faithfully, although the correspondence between letters and phonemes is still not exact. Finnish, Turkish and Serbo-Croatian orthographies are remarkably consistent, closely approximating the principle of "one letter per sound".
An orthography in which the correspondences between spelling and pronunciation are highly complex or inconsistent is called a ''deep orthography'' (or less formally, the language is said to have ''irregular spelling''). An orthography with relatively simple and consistent correspondences is called ''shallow'' (and the language has ''regular spelling'').
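One rough way to make the deep/shallow contrast concrete is to count how many distinct sounds each grapheme is attested with. The sketch below is only an illustration: the function name `mean_readings_per_grapheme` is hypothetical, the two samples are hand-picked toy alignments rather than corpus measurements, and real orthographic-depth metrics are considerably more involved.

```python
from collections import defaultdict


def mean_readings_per_grapheme(pairs):
    """Average number of distinct sounds attested for each grapheme."""
    readings = defaultdict(set)
    for grapheme, phoneme in pairs:
        readings[grapheme].add(phoneme)
    if not readings:
        raise ValueError("need at least one (grapheme, phoneme) pair")
    return sum(len(sounds) for sounds in readings.values()) / len(readings)


# Toy English-like sample: the same letter is read in several ways.
DEEP_SAMPLE = [
    ("c", "k"),   # as in "cat"
    ("c", "s"),   # as in "city"
    ("a", "æ"),   # as in "cat"
    ("a", "eɪ"),  # as in "late"
    ("a", "ɑː"),  # as in "father"
    ("t", "t"),   # as in "cat"
]

# Idealised shallow sample: every letter has exactly one reading.
SHALLOW_SAMPLE = [("k", "k"), ("a", "ɑ"), ("t", "t"), ("k", "k"), ("a", "ɑ")]

deep_score = mean_readings_per_grapheme(DEEP_SAMPLE)        # (2 + 3 + 1) / 3
shallow_score = mean_readings_per_grapheme(SHALLOW_SAMPLE)  # (1 + 1 + 1) / 3
```

A score of 1.0 means every grapheme in the sample has a single reading; higher values indicate a deeper mapping in the reading direction. The measure ignores the spelling direction (one sound, many spellings) and multi-letter graphemes.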
One of the main reasons why spelling and pronunciation diverge is that sound changes taking place in the spoken language are not always reflected in the orthography, and hence spellings correspond to historical rather than present-day pronunciation. One consequence of this is that many spellings come to reflect a word's morphophonemic structure rather than its purely phonemic structure (for example, the English regular past tense morpheme is consistently spelled ''-ed'' in spite of its different pronunciations in various words).
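The ''-ed'' example can be sketched in code: a single spelling, with the pronunciation chosen by the final sound of the stem. This is a simplified textbook rule, assuming the stem's final sound is supplied in IPA; the names `VOICELESS_FINALS` and `past_tense_suffix_sound` are hypothetical, and the vowel of the syllabic variant differs between accents.

```python
# Voiceless consonants (other than /t/) that can end an English stem.
VOICELESS_FINALS = {"p", "k", "f", "s", "ʃ", "tʃ", "θ"}


def past_tense_suffix_sound(stem_final_sound: str) -> str:
    """Return how the regular past-tense suffix spelled -ed is pronounced."""
    if stem_final_sound in {"t", "d"}:
        return "ɪd"  # extra syllable, as in "wanted"
    if stem_final_sound in VOICELESS_FINALS:
        return "t"   # as in "walked"
    return "d"       # after vowels and voiced consonants, as in "played"


# Written suffix is the same in all three words; the sound is not.
examples = {
    "wanted": past_tense_suffix_sound("t"),
    "walked": past_tense_suffix_sound("k"),
    "played": past_tense_suffix_sound("eɪ"),
}
```

A strictly phonemic spelling would need three different suffixes here; the orthography instead keeps one written form for one morpheme.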
The syllabary systems of Japanese (hiragana and katakana) are examples of almost perfectly shallow orthographies: the kana correspond with almost perfect consistency to the spoken syllables, although with a few exceptions where symbols reflect historical or morphophonemic features. Notable among these are the use of ぢ ''ji'' and づ ''zu'' (rather than じ ''ji'' and ず ''zu'', their pronunciation in standard Tokyo dialect) when the character is a voicing of an underlying ち or つ (see rendaku), and the use of は, を, and へ to represent the sounds わ, お, and え, as relics of historical kana usage.
The Korean ''hangul'' system was also originally an extremely shallow orthography, but as a representation of the modern language it frequently also reflects morphophonemic features.
For a full discussion of degrees of correspondence between spelling and pronunciation in alphabetic orthographies, including reasons why such correspondence may break down, see Phonemic orthography.
Defective orthographies
An orthography based on the principle that symbols correspond to phonemes may, in some cases, lack characters to represent all the phonemes or all the phonemic distinctions in the language. This is called a defective orthography. An example in English is the lack of any indication of stress. Another is the digraph ''th'', which represents two different phonemes (as in ''then'' and ''thin'') and replaced the old letters ''ð'' and ''þ''. A more systematic example is that of abjads like the Arabic and Hebrew alphabets, in which the short vowels are normally left unwritten and must be inferred by the reader.
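The abjad case can be illustrated with Unicode, where Arabic short vowels are encoded as optional combining marks on top of the consonant letters. The sketch below is a crude approximation: it removes every nonspacing mark (Unicode category `Mn`), which also discards marks that are not vowels, and the name `strip_combining_marks` is hypothetical.

```python
import unicodedata


def strip_combining_marks(text: str) -> str:
    """Drop nonspacing combining marks (category Mn), keeping base letters."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# "kataba" (he wrote), fully vocalised: kaf, ta and ba, each followed by
# the short-vowel mark fatha (U+064E).
VOCALISED = "\u0643\u064e\u062a\u064e\u0628\u064e"

# What is normally written: only the three consonants k-t-b.
unvocalised = strip_combining_marks(VOCALISED)
```

Going the other way, from the bare consonants back to the vowels, cannot be done by a simple rule, since the same consonant skeleton can correspond to several vocalised words; that is the inference left to the reader.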
When an alphabet is borrowed from its original language for use with a new language (as has been done with the Latin alphabet for many languages, or Japanese katakana for non-Japanese words), it often proves defective in representing the new language's phonemes. Sometimes this problem is addressed by the use of such devices as digraphs (such as ''sh'' and ''ch'' in English, where pairs of letters represent single sounds), diacritics (like the caron on the letters ''š'' and ''č'', which represent those same sounds in Czech), or the addition of completely new symbols (as some languages have introduced the letter ''w'' to the Latin alphabet) or of symbols from another alphabet, such as the rune ''þ'' in Icelandic.
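The difference between the digraph and diacritic strategies shows up when a word is split into orthographic units. The sketch below assumes a greedy longest-match segmenter over a small, illustrative digraph list; `segment_graphemes` and `ENGLISH_DIGRAPHS` are hypothetical names, and the Czech word is assumed to be stored with precomposed letters.

```python
def segment_graphemes(word, multigraphs):
    """Split a word into units, preferring the longest listed multigraph."""
    # Ignore empty strings so that every step consumes at least one character.
    ordered = sorted((m for m in multigraphs if m), key=len, reverse=True)
    units = []
    i = 0
    while i < len(word):
        # Take the first (longest) multigraph starting here, else one letter.
        match = next((m for m in ordered if word.startswith(m, i)), word[i])
        units.append(match)
        i += len(match)
    return units


ENGLISH_DIGRAPHS = {"sh", "ch", "th"}  # illustrative subset only

# Digraph strategy: two letters form one unit.
english_units = segment_graphemes("church", ENGLISH_DIGRAPHS)

# Diacritic strategy: each sound already has its own single letter.
czech_word = "\u010d" "e" "\u0161" "tina"  # "čeština"
czech_units = segment_graphemes(czech_word, set())
```

Greedy matching is a simplification: in a word such as ''mishap'' the letters ''s'' and ''h'' belong to separate units, which a purely string-based segmenter cannot know.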
After the classical period, Greek developed a lowercase letter system that introduced diacritic marks to enable foreigners to learn pronunciation and, in some cases, grammatical features. However, as the pronunciation of letters changed over time, the diacritic marks were reduced to marking the stressed syllable. In Modern Greek typesetting, this system has been simplified to a single accent that indicates which syllable is stressed.
[Bulley, Michael. 2011. "Spelling Reform: A Lesson from the Greeks". ''English Today'', 27(4), p. 71.]
See also
*
Cursive
*
Grapheme
*
Keyboard layout
*
Lateral masking
*
Leet
*
List of language disorders
*
Palaeography
*
Penmanship
*
Prescription and description
*
Romanization
*
Writing system
References
Further reading
*
Smalley, W. A. (ed.) 1964. ''Orthography studies: articles on new writing systems'' (United Bible Society, London).
External links
Videos: The History and Impact of Writing in the WestPhonemic awarenesspage of the CTER
wiki
lonestar.texas.net/~jebbo/learn-as/orthography of
Old English