Written Chinese is a
writing system
that uses
Chinese characters
and other symbols to represent the
Chinese languages. Chinese characters do not directly represent pronunciation, unlike letters in an
alphabet
or syllabograms in a
syllabary
. Rather, the writing system is ''
morphosyllabic'': characters are one spoken
syllable
in length, but generally correspond to
morphemes in the language, which may either be independent words or parts of polysyllabic words. Most characters are constructed from smaller components that may reflect the character's meaning or pronunciation. Literacy requires the memorization of thousands of characters; college-educated Chinese speakers know approximately 4,000. This has led in part to the adoption of complementary transliteration systems (generally
Pinyin
) as a means of representing the pronunciation of Chinese.
Chinese writing is first attested during the late
Shang dynasty
(),
but the process of creating characters is thought to have begun centuries earlier during the Late Neolithic and early Bronze Age (). After a period of variation and evolution, Chinese characters were standardized under the
Qin dynasty
(221–206 BCE). Over the millennia, these characters have evolved into well-developed styles of
Chinese calligraphy
. As the
varieties of Chinese
diverged, a situation of
diglossia
developed, with speakers of mutually unintelligible varieties able to communicate through writing using
Literary Chinese
. In the early 20th century, Literary Chinese was replaced in large part with
written vernacular Chinese
, largely corresponding to
Standard Chinese
, a form based on the Beijing dialect of Mandarin. Although most other
varieties of Chinese
are not written, there are traditions of
written Cantonese,
written Shanghainese and
written Hokkien, among others.
Structure
Written Chinese is not based on an alphabet or syllabary. Most characters can be analyzed as compounds of smaller components, which may be assembled according to several different principles. Characters and components may reflect aspects of meaning or pronunciation. The best known exposition of Chinese character composition is the ''
Shuowen Jiezi
'', compiled by
Xu Shen
. Xu did not have access to the earliest forms of Chinese characters, and his analysis is not considered to fully capture the nature of the writing system. Nevertheless, no later work has supplanted the ''Shuowen Jiezi'' in terms of breadth, and it is still relevant to etymological research today.
Derivation of characters
According to the ''Shuowen Jiezi'', Chinese characters are developed on six basic principles. (These principles, though popularized by the ''Shuowen Jiezi'', were developed earlier; the oldest known mention of them is in the ''
Rites of Zhou
'', a text from .
) The first two principles produce simple characters, known as :
- Pictographs (): in which the character is a graphical depiction of the object it denotes.
:''Examples'': , , .
- Indicatives (): in which the character represents an abstract notion.
:''Examples'': , , .
The remaining four principles produce complex characters historically called , though this term is now generally used to refer to all characters, whether simple or complex. Of these four, two construct characters from simpler parts:
- Ideographic compounds (): in which two or more parts are used for their meaning. This yields a composite meaning, which is then applied to the new character.
:''Example'': , which represents a sun rising in the trees.
- Phono-semantic compounds (): in which one part, often called the radical, indicates the general semantic category of the character, such as being related to ''water'' or ''eyes'', with the other part being another character used for its phonetic value.
:''Example'': , which is composed of , and , which is used for its pronunciation.
The last two principles do not produce new written forms; they instead transfer new meanings to existing forms:
- Transference (): in which a character, often with a simple, concrete meaning, takes on an extended, more abstract meaning.
:''Example'': , which was originally a pictograph depicting a fishing net. Over time, it has taken on an extended meaning, covering any kind of lattice: for instance, it is the word used to refer to computer networks.
- Loangraphs (): in which a character is used, either intentionally or accidentally, for some entirely different purpose.
:''Example'': is not attested in formal writing prior to the Tang dynasty, and was created from the leftmost component of the more ancient character . The ancient character meaning 'elder brother' continues to be used in idioms and formal writing, whereas is used in daily conversation in most Chinese dialects. Some dialects such as Minnan which retain features of spoken Old Chinese continue to use exclusively for 'elder brother' in daily conversation.
In contrast to the popular conception of written Chinese as
ideographic
, the vast majority of characters (about 95% of those in the ''Shuowen Jiezi'') either reflect elements of pronunciation or are logical aggregates. In fact, some phonetic complexes were originally simple pictographs that were later augmented by the addition of a semantic root. An example is 炷, now archaic, which was originally written 主, a pictograph of a lamp stand and a character that is now pronounced ''zhǔ'' and means 'host'. The character 火 ('fire') was added to indicate that the meaning is fire-related.
Chinese characters are written to fit into a square, even when composed of two simpler forms written side-by-side or top-to-bottom. In such cases, each form is compressed to fit the entire character into a square.
Strokes
Character components can be further subdivided into individual written strokes. The strokes of Chinese characters fall into eight main categories: "horizontal", "vertical", "left-falling", "right-falling", "rising", "dot", "hook", and "turning".
There are eight basic rules of stroke order in writing a Chinese character, which apply only generally and are sometimes violated:
# Horizontal strokes are written before vertical ones.
# Left-falling strokes are written before right-falling ones.
# Characters are written from top to bottom.
# Characters are written from left to right.
# If a character is framed from above, the frame is written first.
# If a character is framed from below, the frame is written last.
# Frames are closed last.
# In a symmetrical character, the middle is drawn first, then the sides.
Layout
As characters are essentially rectilinear and are not joined with one another, written Chinese does not require a set orientation. Chinese texts were traditionally written in columns from top to bottom, which were laid out from right to left. Prior to the 20th century, Literary Chinese used little to no punctuation, with the breaks between sentences and phrases determined largely by context and the rhythms implied by patterns of syllables.
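The traditional layout described above can be sketched in a few lines of Python. This is only an illustration: the function name `vertical_layout`, the fixed column height `rows`, and the use of an ideographic space (U+3000) as padding are assumptions made for the example, not part of any standard.

```python
def vertical_layout(text: str, rows: int, pad: str = "\u3000") -> list[str]:
    """Arrange text in top-to-bottom columns that are read right to left."""
    # Cut the text into columns of `rows` characters, in reading order.
    columns = [text[i:i + rows] for i in range(0, len(text), rows)]
    # Pad the final column so that every column has the same height.
    columns = [column.ljust(rows, pad) for column in columns]
    # Each printed line takes one character from every column; the first
    # column in reading order is placed at the far right.
    return ["".join(column[r] for column in reversed(columns))
            for r in range(rows)]


# Latin letters are used here only to make the reading order easy to follow.
for line in vertical_layout("ABCDEF", rows=3):
    print(line)
```

Because each character occupies its own cell and is not joined to its neighbours, the same sequence can be laid out in either orientation simply by changing how the cells are indexed.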
In the 20th century, the layout used in Western scripts—where text is written in rows from left to right, which are laid out from top to bottom—became predominant in mainland China, where it was mandated by the Chinese government in 1955. Vertical layouts are still used for aesthetic effect, or when space limitations require it, such as on signage or book spines. The government of
Taiwan
followed suit in 2004 for official documents, but vertical layouts have persisted in some books and newspapers.
Less frequently, Chinese is written in rows from right to left, usually on signage or banners, though a left to right orientation remains more common.
The use of punctuation has also become more common. In general, punctuation occupies the width of a full character, such that text remains visually well-aligned in a grid. Punctuation used in simplified Chinese shows clear influence from that used in Western scripts, though some marks are particular to Asian languages. For example, there are double and single quotation marks (『 』 and 「 」), and a hollow full stop (。), which is used to separate sentences in an identical manner to a Western full stop. A special mark called an ''
enumeration comma'' (、) is used to separate items in a list, as opposed to the clauses in a sentence.
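The statement that punctuation occupies a full character cell can be checked against the Unicode character database. The sketch below uses Python's standard `unicodedata` module; the helper name `is_full_cell` is hypothetical, and treating the East Asian Width classes `W` (wide) and `F` (fullwidth) as "one full cell" is an assumption about how a typical CJK font renders them.

```python
import unicodedata


def is_full_cell(ch: str) -> bool:
    """Return True if Unicode classifies ch as wide or fullwidth."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


# The marks mentioned in the text: hollow full stop, enumeration comma,
# and the single and double corner-bracket quotation marks.
marks = ["。", "、", "「", "」", "『", "』"]

for mark in marks:
    print(mark, unicodedata.name(mark), is_full_cell(mark))
```

A Western full stop (`.`) is classified as narrow by the same test, which is one way to see why the two kinds of punctuation align differently in a grid.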
History
Written Chinese is one of the oldest continuously used writing systems. The earliest examples universally accepted as Chinese writing are the
oracle bone inscriptions
made during the reign of the
Shang king
Wu Ding (). These inscriptions were made primarily on ox scapulae and turtle shells in order to record the results of divinations conducted by the Shang royal family. Characters posing a question were first carved into the bones. The question's answer was then divined by heating the bones over a fire and interpreting the resulting cracks that formed. The interpretation was then carved into the same
oracle bone
.
In 2003, 11 isolated symbols carved on tortoise shells were found at the
Jiahu archaeological site in
Henan
—with some bearing a striking resemblance to certain modern characters, such as . The Jiahu site dates from , predating the earliest attested Chinese writing by more than 5,000 years. Garman Harbottle, who had headed a team of archaeologists at the
University of Science and Technology of China
in Anhui, has suggested that these symbols were precursors to Chinese writing. However, the palaeographer
David Keightley argues instead that the time gap is too great to establish any connection.
From the
Late Shang period (), Chinese writing evolved into the form found in
cast inscriptions on ritual bronzes made during the
Western Zhou
dynasty (771 BCE) and the
Spring and Autumn period
(771–476 BCE), a form of writing called bronze script (). Bronze script characters are less angular than their oracle bone script counterparts. The script became increasingly regularized during the
Warring States period
(475–221 BCE), settling into what is called , that Xu Shen used as source material in the ''Shuowen Jiezi''. These characters were later embellished and stylized to yield the
seal script
, which represents the oldest form of Chinese characters still in modern use. They are used principally for
signature seals, or chops, which are often used in place of a signature for Chinese documents and artwork.
Li Si
promulgated the seal script as the standard throughout China, which had been recently united under the imperial
Qin dynasty
(221–206 BCE).
The initial adaptation of seal script into
clerical script
can be attributed to scribes in the
state of Qin
working prior to the wars of unification. Clerical script forms generally have a "flat" appearance, being wider than their seal script equivalents. In the
semi-cursive script
that evolved from clerical script, character elements begin to run into each other, though the characters themselves generally remain discrete. This is contrasted with fully
cursive script, where characters are often unrecognizable from their canonical forms.
Regular script
is the most widely recognized script, and was considerably influenced by semi-cursive. In regular script, each stroke of each character is clearly drawn out from the others.
Regular script is considered the archetypal Chinese writing and forms the basis for most printed forms. In addition, regular script
imposes a stroke order, which must be followed in order for the characters to be written correctly. Strictly speaking, this stroke order applies to the clerical, semi-cursive (running), and cursive (grass) scripts as well, but it is occasionally deviated from, especially in the latter two. Thus, for instance, the character 木 must be written starting with the horizontal stroke, drawn from left to right; next, the vertical stroke, from top to bottom; next, the left diagonal stroke, from top to bottom; and lastly the right diagonal stroke, from top to bottom.
Simplification and standardization
Beginning in the mid-20th century, Chinese has primarily been written using either
simplified or
traditional character forms. Simplified characters, which merge some character forms and reduce the average stroke count per character, were developed by the Chinese government with the stated goal of increasing literacy among the population. During this time, literacy rates did increase rapidly, but some observers instead attribute this to other education reforms and a general increase in the standard of living. Little systematic research has been conducted to support the conclusion that the use of simplified characters has affected literacy rates; studies conducted in China have instead focused on arbitrary statistics, such as quantifying the number of strokes saved on average in a given text sample. Simplified characters are standard in mainland China,
Singapore
and
Malaysia
, while traditional characters are standard in
Hong Kong
,
Macau
,
Taiwan
and some
overseas Chinese
communities.
Simplified forms have also been characterized as being inconsistent. For instance, the traditional is simplified to , in which the phonetic on the right side is reduced from 17 strokes to 3, and the radical on the left is also simplified. However, the same phonetic component is not reduced in simplified characters such as and ; these characters are relatively uncommon, and simplifying them would therefore yield a negligible stroke reduction. Other simplified forms derive from long-standing calligraphic abbreviations, as with , which has the traditional form of .
Function
Chinese characters have always been used to represent individual spoken syllables. While writing was being invented in the Yellow River valley, words in spoken Chinese were largely monosyllabic, and each written character corresponded to a monosyllabic word. Spoken Chinese varieties have since acquired much more polysyllabic vocabulary, usually compound words composed of morphemes corresponding to older monosyllabic words.
For over two thousand years, the predominant form of written Chinese was
Literary Chinese
, which had vocabulary and syntax rooted in the language of the
Chinese classics
, as spoken around the time of
Confucius
(). Over time, Literary Chinese acquired some elements of grammar and vocabulary from various varieties of vernacular Chinese that had since diverged. By the 20th century, Literary Chinese was distinctly different from any spoken vernacular, and had to be learned separately. Once learned, it was a common medium for communication between people speaking different dialects, many of which were mutually unintelligible by the end of the first millennium CE.
Varieties of Chinese vary in pronunciation, and to a lesser extent in vocabulary and grammar. Modern written Chinese, which replaced Classical Chinese as the written standard as an indirect result of the 1919
May Fourth Movement
, is not technically bound to any single variety; however, it most nearly represents the vocabulary and syntax of Mandarin, by far the most widespread Chinese dialectal family in terms of both geographical area and number of speakers. This form is known as
written vernacular Chinese
. While written vernacular Chinese expressions are often ungrammatical or unidiomatic outside of Mandarin, its use permits some communication between speakers of different dialects. This function may be considered analogous to that of
linguae francae, such as
Latin
. For literate speakers, it serves as a common medium; however, the forms of individual characters generally provide little insight into their meaning if not already known. Ghil'ad Zuckermann's exploration of
phono-semantic matching
in
Standard Chinese
concludes that the Chinese writing system is multifunctional, conveying both semantic and phonetic content.
The variation in vocabulary among varieties has also led to informal use of "dialectal characters", which may include characters previously used in Literary Chinese that are considered archaic in written Standard Chinese. Cantonese is unique among non-Mandarin regional languages in having a written colloquial standard, used in Hong Kong and overseas, with a large number of unofficial characters for words particular to this language.
Written Cantonese has become quite popular on the Internet, while Standard Chinese is still normally used in formal written communications.
To a lesser degree,
Hokkien
is used similarly in Taiwan and elsewhere, though it lacks the level of standardization seen in Cantonese. However, Taiwan's Ministry of Education has promulgated a standard character set for Hokkien, which is taught in schools and encouraged for use by the general population.
Media
Over the history of written Chinese, a variety of media have been used for writing. They include:
*
Bamboo and wooden slips
, from at least the 13th century BCE
*
Paper
, invented no later than the 2nd century BCE
*
Silk
, since at least the Han dynasty
* Stone, metal, wood, bamboo, plastic and ivory on
seals.
Since at least the Han dynasty, such media have been used to create
hanging scrolls and handscrolls.
Literacy
Because the majority of modern Chinese words contain more than one character, there are at least two measuring sticks for Chinese literacy: the number of characters known, and the number of words known.
John DeFrancis
, in the introduction to his ''Advanced Chinese Reader'', estimates that a typical Chinese college graduate recognizes 4,000 to 5,000 characters, and 40,000 to 60,000 words.
Jerry Norman, in ''Chinese'', places the number of characters somewhat lower, at 3,000 to 4,000. These counts are complicated by the tangled development of Chinese characters. In many cases, a single character came to have multiple
variants
. This development was restrained to an extent by the standardization of the seal script during the Qin dynasty, but soon started again. Although the ''Shuowen Jiezi'' lists 10,516 characters—9,353 of them unique (some of which may already have been out of use by the time it was compiled) plus 1,163 graphic variants—the ''
Jiyun
'' of the Northern
Song dynasty
The Song dynasty ( ) was an Dynasties of China, imperial dynasty of China that ruled from 960 to 1279. The dynasty was founded by Emperor Taizu of Song, who usurped the throne of the Later Zhou dynasty and went on to conquer the rest of the Fiv ...
, compiled less than a thousand years later in 1039, contains 53,525 characters, most of them graphic variants.
Dictionaries
Written Chinese is not based on an alphabet or syllabary, so Chinese dictionaries, as well as dictionaries that define Chinese characters in other languages, cannot easily be alphabetized or otherwise lexically ordered, as English dictionaries are. The need to arrange Chinese characters in order to permit efficient lookup has given rise to a considerable variety of ways to organize and index the characters.
A traditional mechanism is the method of radicals, which uses a set of character roots. These roots, or radicals, generally but imperfectly align with the parts used to compose characters by means of logical aggregation and phonetic complex. A canonical set of 214 radicals was developed during the rule of the Kangxi Emperor (around the year 1700); these are sometimes called the Kangxi radicals. The radicals are ordered first by stroke count (that is, the number of strokes required to write the radical); within a given stroke count, the radicals also have a prescribed order.
Every Chinese character falls (sometimes arbitrarily or incorrectly) under the heading of exactly one of these 214 radicals. In many cases, the radicals are themselves characters, which naturally come first under their own heading. All other characters under a given radical are ordered by the stroke count of the character. Usually, however, there are still many characters with a given stroke count under a given radical. At this point, characters are not given in any recognizable order; the user must locate the character by going through all the characters with that stroke count, typically listed for convenience at the top of the page on which they occur.
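The ordering described above can be sketched as a sort key. This is a minimal illustration, not how any particular dictionary is implemented: the small table of radical numbers and stroke counts is hand-entered for nine characters, and the names `ENTRIES`, `RADICAL_CHARS`, `radical_sort_key` and `dictionary_order` are hypothetical.

```python
# Illustrative, hand-entered data: (Kangxi radical number, total stroke count).
# A real dictionary would take these values from a full character database.
ENTRIES = {
    "人": (9, 2),
    "休": (9, 6),
    "口": (30, 3),
    "木": (75, 4),
    "林": (75, 8),
    "森": (75, 12),
    "水": (85, 4),
    "河": (85, 8),
    "海": (85, 10),
}

# Characters that are themselves radicals come first under their own heading.
RADICAL_CHARS = {9: "人", 30: "口", 75: "木", 85: "水"}


def radical_sort_key(char):
    """Order by radical, then the radical itself first, then stroke count."""
    radical, strokes = ENTRIES[char]
    is_radical_itself = RADICAL_CHARS.get(radical) == char
    return (radical, not is_radical_itself, strokes)


def dictionary_order(chars):
    """Sort characters the way a radical-ordered dictionary would list them."""
    return sorted(chars, key=radical_sort_key)


print("".join(dictionary_order("海林人河森口休木水")))  # prints 人休口木林森水河海
```

Characters that share both a radical and a stroke count get equal keys, which mirrors the point made above: within such a group the method itself prescribes no further order.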
Because the method of radicals is applied only to the written character, one need not know how to pronounce a character before looking it up; the entry, once located, usually gives the pronunciation. However, it is not always easy to identify which of the various roots of a character is the proper radical. Accordingly, dictionaries often include a list of hard-to-locate characters, indexed by total stroke count, near the beginning of the dictionary. Some dictionaries include almost one-seventh of all characters in this list. Alternatively, some dictionaries list "difficult" characters under more than one radical, with all but one of those entries redirecting the reader to the "canonical" location of the character according to Kangxi.
Other methods of organization exist, often in an attempt to address the shortcomings of the radical method, but are less common. For instance, a dictionary ordered principally by the Kangxi radicals often has an auxiliary index by pronunciation, typically expressed in either pinyin or bopomofo. This index points to the page in the main dictionary where the desired character can be found. Other methods use only the structure of the characters, such as the four-corner method, in which characters are indexed according to the kinds of strokes located nearest the four corners (hence the name of the method), or the Cangjie method, in which characters are broken down into a set of 24 basic components.
Neither the four-corner method nor the Cangjie method requires the user to identify the proper radical, although many strokes or components have alternate forms, which must be memorized in order to use these methods effectively.
The availability of computerized Chinese dictionaries now makes it possible to look characters up by any of the indexing schemes described, thereby shortening the search process.
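A computerized dictionary can keep several of these indexes side by side over the same entries. The sketch below assumes a tiny hand-entered record set; the field names, the `build_index` helper, and the tone-number spelling of pinyin (`ming2` for míng) are illustrative choices, not taken from any real dictionary software.

```python
from collections import defaultdict

# Illustrative, hand-entered records; the field names are hypothetical.
RECORDS = [
    {"char": "明", "radical": 72, "strokes": 8, "pinyin": "ming2", "cangjie": "AB"},
    {"char": "林", "radical": 75, "strokes": 8, "pinyin": "lin2", "cangjie": "DD"},
    {"char": "好", "radical": 38, "strokes": 6, "pinyin": "hao3", "cangjie": "VND"},
]


def build_index(records, *fields):
    """Map each tuple of field values to the characters that share it."""
    index = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in fields)
        index[key].append(record["char"])
    return index


# One index per lookup scheme, all built from the same records.
by_radical = build_index(RECORDS, "radical", "strokes")
by_sound = build_index(RECORDS, "pinyin")
by_cangjie = build_index(RECORDS, "cangjie")

print(by_radical[(75, 8)])    # prints ['林']
print(by_sound[("ming2",)])   # prints ['明']
print(by_cangjie[("VND",)])   # prints ['好']
```

Each index is just a different key over the same entries, so a user can start from whichever property of the character is already known: its shape, its sound, or its component code.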
Transliteration
Chinese characters do not reliably indicate their pronunciation. Therefore, many transliteration systems have been developed to write the sounds of different varieties of Chinese. While many use the Latin alphabet, systems using the Cyrillic and Perso-Arabic alphabets have also been designed. Among other purposes, these systems are used by students learning the corresponding varieties.
The replacement of Chinese characters with a phonetic writing system was first prominently proposed during the May Fourth Movement, partly motivated by a desire to increase the country's literacy rate. The idea gained further support following the victory of the Communists in 1949, who immediately began two parallel programs regarding written Chinese. The first was the development of an alphabet to write the sounds of Mandarin, the variety spoken by around two-thirds of the Chinese population. The other program investigated the simplification of the standard character forms. Initially, character simplification was not competing with the idea of a phonetic script; rather, simplification was intended to make the transition to a fully phonetic writing system easier.
By 1958, official priorities had shifted towards character simplification. The Hanyu Pinyin (or simply 'pinyin') alphabet had been developed, but plans to replace Chinese characters with it were deferred, and the idea is no longer actively pursued. This change in priorities may have been due in part to pinyin's design being specific to Mandarin, to the exclusion of other dialects.
Pinyin uses the Latin alphabet with diacritics to represent the phonology of Standard Chinese. For the most part, pinyin uses phonetic values for letters that reflect their existing pronunciations in Romance languages and the International Phonetic Alphabet (IPA). However, pairs of letters such as ''b'' and ''p'' that correspond to a voicing distinction in languages such as French instead represent the aspiration distinction that is more abundant in Mandarin. Pinyin also uses several consonantal letters to represent markedly different sounds from their assignments in other languages. For example, pinyin ''q'' and ''x'' correspond to sounds similar to English ''ch'' and ''sh'', respectively. While pinyin has become the predominant transliteration system for Mandarin, others include
bopomofo, Wade–Giles, Yale, EFEO and Gwoyeu Romatzyh.
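The aspiration contrast in pinyin can be made concrete with a small lookup table. The sketch below lists approximate IPA values for a handful of pinyin initials; the table is a simplified, hand-entered illustration, and the names `PINYIN_INITIAL_IPA`, `ASPIRATION_PAIRS` and `is_aspiration_pair` are hypothetical.

```python
# Approximate IPA values for a few pinyin initials in Standard Chinese.
# Hand-entered and simplified for illustration only.
PINYIN_INITIAL_IPA = {
    "b": "p",
    "p": "pʰ",
    "d": "t",
    "t": "tʰ",
    "g": "k",
    "k": "kʰ",
    "q": "tɕʰ",
    "x": "ɕ",
}

# Letter pairs that mark voicing in French but aspiration in pinyin.
ASPIRATION_PAIRS = [("b", "p"), ("d", "t"), ("g", "k")]


def is_aspiration_pair(plain, aspirated):
    """True when the two initials differ only by the aspiration mark."""
    return PINYIN_INITIAL_IPA[aspirated] == PINYIN_INITIAL_IPA[plain] + "ʰ"


for plain, aspirated in ASPIRATION_PAIRS:
    print(plain, aspirated, is_aspiration_pair(plain, aspirated))
```

In each listed pair both letters stand for voiceless stops, and only the aspiration mark separates them, whereas a French reader would expect the first letter of each pair to be voiced.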
External links
Yue E Li and Christopher Upward. "Review of the process of reform in the simplification of Chinese Characters". (Journal of Simplified Spelling Society, 1992/2 pp. 14–16, later designated J13)