Grapheme

In linguistics, a grapheme is the smallest functional unit of a writing system. The word ''grapheme'' is derived from an Ancient Greek word meaning 'write' and the suffix ''-eme'', by analogy with ''phoneme'' and other emic units. The study of graphemes is called ''graphemics''. The concept of graphemes is abstract and similar to the notion in computing of a character. (A specific geometric shape that represents any particular grapheme in a given typeface is called a glyph.)


Conceptualization

There are two main opposing grapheme concepts. In the so-called ''referential conception'', graphemes are interpreted as the smallest units of writing that correspond with sounds (more accurately, phonemes). In this concept, the ''sh'' in the written English word ''shake'' would be a grapheme because it represents the phoneme /ʃ/. This referential concept is linked to the ''dependency hypothesis'', which claims that writing merely depicts speech. By contrast, the ''analogical concept'' defines graphemes analogously to phonemes, i.e. via written minimal pairs such as ''shake'' vs. ''snake''. In this example, ''h'' and ''n'' are graphemes because they distinguish two words. This analogical concept is associated with the ''autonomy hypothesis'', which holds that writing is a system in its own right and should be studied independently from speech. Both concepts have weaknesses. Some models adhere to both concepts simultaneously by including two individual units, which are given names such as ''graphemic grapheme'' for the grapheme according to the analogical conception (''h'' in ''shake''), and ''phonological-fit grapheme'' for the grapheme according to the referential concept (''sh'' in ''shake''). In newer concepts, in which the grapheme is interpreted semiotically as a dyadic linguistic sign, it is defined as a minimal unit of writing that is both lexically distinctive and corresponds with a linguistic unit (phoneme, syllable, or morpheme).
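The analogical concept can be illustrated with a small search for written minimal pairs. The following is only a sketch: the function name `written_minimal_pairs` and the word list are made up for illustration, and it compares spellings letter by letter, which is a simplification of how graphemic analyses are actually carried out.

```python
from itertools import combinations


def written_minimal_pairs(words):
    """Yield (word_a, word_b, unit_a, unit_b) for equal-length words
    whose spellings differ in exactly one position."""
    for a, b in combinations(words, 2):
        if len(a) != len(b):
            continue
        # Collect the positions where the two spellings disagree.
        diffs = [(x, y) for x, y in zip(a, b) if x != y]
        if len(diffs) == 1:
            yield a, b, diffs[0][0], diffs[0][1]


# Hypothetical sample vocabulary, chosen to include the pair from the text.
pairs = list(written_minimal_pairs(["shake", "snake", "ship", "shop"]))
for a, b, unit_a, unit_b in pairs:
    print(f"{a} / {b}: {unit_a} vs {unit_b}")
```

Under this letter-by-letter comparison, ''shake'' and ''snake'' contrast only in ''h'' versus ''n'', which is the distinction the analogical concept treats as graphemic.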


Notation

Graphemes are often notated within angle brackets (''The Cambridge Encyclopedia of Language'', second edition, Cambridge University Press, 1997, p. 196). This is analogous to the slash notation used for phonemes. Analogous to the square bracket notation used for phones, glyphs are sometimes denoted with vertical lines.


Glyphs

In the same way that the surface forms of phonemes are speech sounds or phones (and different phones representing the same phoneme are called allophones), the surface forms of graphemes are glyphs (sometimes ''graphs''), namely concrete written representations of symbols (and different glyphs representing the same grapheme are called allographs). Thus, a grapheme can be regarded as an abstraction of a collection of glyphs that are all functionally equivalent. For example, in written English (or other languages using the Latin alphabet), there are two different physical representations of the lowercase Latin letter "a": "a" and "ɑ". Since, however, the substitution of either of them for the other cannot change the meaning of a word, they are considered to be allographs of the same grapheme, which can be written ⟨a⟩. Similarly, the grapheme corresponding to "Arabic numeral zero" has a unique semantic identity and Unicode value but exhibits variation in the form of the slashed zero. Italic and bold face forms are also allographic, as is the variation seen in serif (as in Times New Roman) versus sans-serif (as in Helvetica) forms.

There is some disagreement as to whether capital and lower case letters are allographs or distinct graphemes. Capitals are generally found in certain triggering contexts that do not change the meaning of a word: a proper name, for example, or at the beginning of a sentence, or all caps in a newspaper headline. In other contexts, capitalization can determine meaning: compare, for example, ''Polish'' and ''polish'': the former is a language, the latter is for shining shoes. Some linguists consider digraphs like the ⟨sh⟩ in ''ship'' to be distinct graphemes, but these are generally analyzed as sequences of graphemes. Non-stylistic ligatures, however, are distinct graphemes, as are various letters with distinctive diacritics.

Identical glyphs may not always represent the same grapheme. For example, the three letters A, А and Α appear identical but each has a different meaning: in order, they are the Latin letter A, the Cyrillic letter Azǔ/Азъ and the Greek letter Alpha. Each has its own code point in Unicode: U+0041, U+0410 and U+0391.
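The point about look-alike letters can be checked with Python's standard `unicodedata` module. This is a minimal sketch: the helper name `describe` is illustrative, and the three characters are written as escapes so that the code does not depend on how the letters render in a given typeface.

```python
import unicodedata

# Latin A, Cyrillic A and Greek Alpha: visually near-identical capitals.
lookalikes = ["\u0041", "\u0410", "\u0391"]


def describe(ch):
    """Return the code point (as U+XXXX) and the Unicode name of a character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


info = [describe(ch) for ch in lookalikes]
for code_point, name in info:
    print(code_point, name)

# The three characters compare as different strings despite looking alike.
print(len(set(lookalikes)))
```

Because the three characters are distinct code points, a plain string comparison treats them as different even when a typeface draws them with the same glyph.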


Types of grapheme

The principal types of graphemes are logograms (more accurately termed morphograms), which represent words or morphemes (for example Chinese characters, the ampersand "&" representing the word ''and'', Arabic numerals); syllabic characters, representing syllables (as in Japanese kana); and alphabetic letters, corresponding roughly to phonemes (see next section). There are additional graphemic components used in writing, such as punctuation marks, mathematical symbols, word dividers such as the space, and other typographic symbols. Ancient logographic scripts often used silent determinatives to disambiguate the meaning of a neighboring (non-silent) word.


Relationship with phonemes

As mentioned in the previous section, in languages that use alphabetic writing systems, many of the graphemes stand in principle for the phonemes (significant sounds) of the language. In practice, however, the orthographies of such languages entail at least a certain amount of deviation from the ideal of exact grapheme–phoneme correspondence. A phoneme may be represented by a multigraph (a sequence of more than one grapheme), as the digraph ''sh'' represents a single sound in English. Conversely, a single grapheme may sometimes represent more than one phoneme, as with the Russian letter я or the Spanish c. Some graphemes may not represent any sound at all (like the ''b'' in English ''debt'' or the ''h'' in all Spanish words containing that letter), and often the rules of correspondence between graphemes and phonemes become complex or irregular, particularly as a result of historical sound changes that are not necessarily reflected in spelling. "Shallow" orthographies such as those of standard Spanish and Finnish have relatively regular (though not always one-to-one) correspondence between graphemes and phonemes, while those of French and English have much less regular correspondence, and are known as deep orthographies.

Multigraphs representing a single phoneme are normally treated as combinations of separate letters, not as graphemes in their own right. However, in some languages a multigraph may be treated as a single unit for the purposes of collation; for example, in a Czech dictionary, the section for words that start with ⟨ch⟩ comes after that for ⟨h⟩.
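Treating a multigraph as a single collation unit can be sketched as a custom sort key. The example below is a deliberately simplified illustration, not a full Czech collation: it assumes the convention that ''ch'' sorts as one unit after ''h'', it handles only unaccented lowercase Latin letters, and the names `ORDER`, `collation_units` and `sort_key` are invented for this sketch.

```python
from string import ascii_lowercase

# Simplified alphabet: the 26 basic letters with "ch" inserted after "h".
# Letters with diacritics are deliberately left out of this sketch.
letters = list(ascii_lowercase)
ORDER = letters[:8] + ["ch"] + letters[8:]
RANK = {unit: position for position, unit in enumerate(ORDER)}


def collation_units(word):
    """Split a word into collation units, keeping "ch" together."""
    word = word.lower()
    units = []
    i = 0
    while i < len(word):
        if word[i:i + 2] == "ch":
            units.append("ch")
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def sort_key(word):
    """Map a word to a list of ranks (raises KeyError for unsupported letters)."""
    return [RANK[unit] for unit in collation_units(word)]


words = ["chata", "cena", "hrad", "den", "ihned", "had"]
by_code_point = sorted(words)
by_units = sorted(words, key=sort_key)
print(by_code_point)
print(by_units)
```

With the default string ordering, ''chata'' lands among the words beginning with ''c''; with the unit-based key it moves after the words beginning with ''h'', which mirrors the dictionary arrangement described above. Production code would normally rely on a locale-aware collation library rather than a hand-written table.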

