Chinese characters are logographs used to write the Chinese languages and others from regions historically influenced by Chinese culture. Of the four independently invented writing systems accepted by scholars, they represent the only one that has remained in continuous use. Over a documented history spanning more than three millennia, the function, style, and means of writing characters have changed greatly. Unlike letters in alphabets that reflect the sounds of speech, Chinese characters generally represent morphemes, the units of meaning in a language. Writing all of the frequently used vocabulary in a language requires roughly 2000–3000 characters; far more have been identified and included in ''The Unicode Standard''. Characters are created according to several principles, where aspects of shape and pronunciation may be used to indicate the character's meaning.

The first attested characters are oracle bone inscriptions made during the 13th century BCE in what is now Anyang, Henan, as part of divinations conducted by the Shang dynasty royal house. Character forms were originally ideographic or pictographic in style, but evolved as writing spread across China. Numerous attempts have been made to reform the script, including the promotion of small seal script by the Qin dynasty (221–206 BCE). Clerical script, which had matured by the early Han dynasty (202 BCE – 220 CE), abstracted the forms of characters, obscuring their pictographic origins in favour of making them easier to write. Following the Han, regular script emerged as the result of cursive influence on clerical script, and has been the primary style used for characters since. Informed by a long tradition of lexicography, states using Chinese characters have standardized their forms: broadly, simplified characters are used to write Chinese in mainland China, Singapore, and Malaysia, while traditional characters are used in Taiwan, Hong Kong, and Macau.

Where the use of characters spread beyond China, they were initially used to write Literary Chinese; they were then often adapted to write local languages spoken throughout the Sinosphere. In Japanese, Korean, and Vietnamese, Chinese characters are known as ''kanji'', ''hanja'', and ''chữ Hán'' respectively. Writing traditions also emerged for some of the other languages of China, like the script used to write the Zhuang languages of Guangxi. Each of these written vernaculars used existing characters to write the language's native vocabulary, as well as the loanwords it borrowed from Chinese. In addition, each invented characters for local use. In written Korean and Vietnamese, Chinese characters have largely been replaced with alphabets, leaving Japanese as the only major non-Chinese language still written using them, alongside the other elements of the Japanese writing system.

At the most basic level, characters are composed of strokes that are written in a fixed order. Historically, methods of writing characters have included inscribing stone, bone, or bronze; brushing ink onto silk, bamboo, or paper; and printing with woodblocks or moveable type. Technologies invented since the 19th century to facilitate the use of characters include telegraph codes and typewriters, as well as input methods and text encodings on computers.
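As a small illustration of how a character is handled by a text encoding, the sketch below uses only Python's standard library to report a character's Unicode code point, its formal name, and its UTF-8 byte sequence. The helper name `describe` and the sample character 水 ('water') are assumptions chosen for illustration, not something taken from the article.

```python
import unicodedata


def describe(char: str) -> dict:
    """Return the code point, Unicode name, and UTF-8 bytes of one character."""
    return {
        # Code points are conventionally written as U+ followed by hex digits.
        "code_point": f"U+{ord(char):04X}",
        # Fall back to a placeholder if the character has no formal name.
        "name": unicodedata.name(char, "<unnamed>"),
        # UTF-8 is one of several encodings able to represent the character.
        "utf8_bytes": char.encode("utf-8"),
    }


info = describe("水")  # hypothetical sample character
print(info["code_point"])
print(info["name"])
print(len(info["utf8_bytes"]), "bytes in UTF-8")
```

The same function works for any single character, so it can be used to compare how characters from different scripts are stored.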


Development

Chinese characters are accepted as representing one of four independent inventions of writing in human history. In each instance, writing evolved from a system using two distinct types of ideographs: either pictographs visually depicting objects or concepts, or fixed signs representing concepts only by shared convention. These systems are classified as proto-writing, because the techniques they used were insufficient to carry the meaning of spoken language by themselves.

Various innovations were required for Chinese characters to emerge from proto-writing. Firstly, pictographs became distinct from simple pictures in use and appearance. For example, the pictograph 大, meaning 'large', was originally a picture of a large man, but one would need to be aware of its specific meaning in order to interpret the sequence 大鹿 as signifying 'large deer', rather than being a picture of a large man and a deer next to one another. Due to this process of abstraction, as well as to make characters easier to write, pictographs gradually became more simplified and regularized, often to the extent that the original objects represented are no longer obvious.

This proto-writing system was limited to representing a relatively narrow range of ideas with a comparatively small library of symbols. This compelled innovations that allowed for symbols which indicated elements of spoken language directly. In each historical case, this was accomplished by some form of the rebus technique, where the symbol for a word is used to indicate a different word with a similar pronunciation, depending on context. This allowed for words that lacked a plausible pictographic representation to be written down for the first time. This technique prefigured more sophisticated methods of character creation that would further expand the lexicon. The process whereby writing emerged from proto-writing took place over a long period; when the purely pictorial use of symbols disappeared, leaving only those representing spoken words, the process was complete.


Classification

Chinese characters have been used in several different writing systems throughout history. A writing system is most commonly defined to include the written symbols themselves, called ''graphemes'' (which may include characters, numerals, or punctuation), as well as the rules by which they are used to record language. Chinese characters are logographs, which are graphemes that represent units of meaning in a language. Specifically, characters represent a language's morphemes, its most basic units of meaning. Morphemes in Chinese, and therefore the characters used to write them, are nearly always a single syllable in length. In some special cases, characters may denote non-morphemic syllables as well; due to this, written Chinese is often characterized as morphosyllabic. Logographs may be contrasted with letters in an alphabet, which generally represent phonemes, the distinct units of sound used by speakers of a language. Despite their origins in picture-writing, Chinese characters are no longer ideographs capable of representing ideas directly; their comprehension relies on the reader's knowledge of the particular language being written.

The areas where Chinese characters were historically used, sometimes collectively termed the ''Sinosphere'', have a long tradition of lexicography attempting to explain and refine their use; for most of history, analysis revolved around a model first popularized in the 2nd-century ''Shuowen Jiezi'' dictionary. More recent models have analysed the methods used to create characters, how characters are structured, and how they function in a given writing system.


Structural analysis

Most characters can be analysed structurally as compounds made of smaller components, which are often independent characters in their own right, adjusted to occupy a given position in the compound. Components within a character may serve a specific function: phonetic components provide a hint for the character's pronunciation, and semantic components indicate some element of the character's meaning. Components that serve neither function may be classified as pure signs with no particular meaning, other than their presence distinguishing one character from another.

A straightforward structural classification scheme may consist of three pure classes of semantographs, phonographs, and signs (having only semantic, phonetic, and form components respectively), as well as classes corresponding to each combination of component types. Of the characters that are frequently used in Standard Chinese, pure semantographs are estimated to be the rarest, accounting for about 5% of the lexicon, followed by pure signs with 18%, and semantic–form and phonetic–form compounds together accounting for 19%. The remaining 58% are phono-semantic compounds.

The 20th-century Chinese palaeographer Qiu Xigui presents three principles of character function adapted from earlier proposals by Tang Lan and Chen Mengjia, with ''semantographs'' describing all characters with forms wholly related to their meaning, regardless of the method by which the meaning was originally depicted; ''phonographs'' that include a phonetic component; and ''loangraphs'' encompassing existing characters that have been borrowed to write other words. Qiu also acknowledges the existence of character classes that fall outside of these principles, such as pure signs.
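The structural scheme described above, with three component types and one class for each combination of them, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `COMPONENT_TYPES`, `structural_classes`, and `shares` are hypothetical, and the percentages are simply the estimates quoted in the text, grouped the same way the text groups them.

```python
from itertools import combinations

# The three functional component types named in the text.
COMPONENT_TYPES = ("semantic", "phonetic", "form")


def structural_classes(types=COMPONENT_TYPES):
    """Enumerate every non-empty combination of component types.

    Single-type combinations are the three "pure" classes; the rest are
    the mixed classes (for example, semantic + phonetic compounds).
    """
    return [
        combo
        for size in range(1, len(types) + 1)
        for combo in combinations(types, size)
    ]


classes = structural_classes()
for combo in classes:
    print(" + ".join(combo))

# Estimated shares (in percent) of frequently used characters, as quoted above.
shares = {
    "pure semantographs": 5,
    "pure signs": 18,
    "semantic-form and phonetic-form compounds": 19,
    "phono-semantic compounds": 58,
}
total = sum(shares.values())
print(f"{len(classes)} possible classes; quoted shares sum to {total}%")
```

Enumerating the combinations makes it explicit that the scheme allows seven classes in principle, while the quoted estimates cover the frequently used lexicon with only a handful of them.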


Semantographs


Pictographs

Most of the oldest characters are pictographs, representational pictures of physical objects. Over time, the forms of pictographs have been simplified in order to make them easier to write. As a result, modern readers generally cannot deduce what many pictographs were originally meant to resemble; without knowing the context of their origin in picture-writing, they may be interpreted instead as pure signs. However, if a pictograph's use in compounds still reflects its original meaning, it can still be analysed as a semantic component.

Pictographs have often been extended from their original meanings to take on additional layers of metaphor and synecdoche, which sometimes displace the character's original sense. When this process results in excessive ambiguity between distinct senses written with the same character, it is usually resolved by new compounds being derived to represent particular senses.


Indicatives

Indicatives, also called ''simple ideographs'' or ''self-explanatory characters'', are visual representations of abstract concepts that lack any tangible form. Examples include 上 ('up') and 下 ('down'): these characters were originally written as dots placed above and below a line, and later evolved into their present forms with less potential for graphical ambiguity in context. More complex indicatives also exist.


Compound ideographs

Compound ideographs, also called ''logical aggregates'', ''associative idea characters'', or ''syssemantographs'', combine other characters to convey a new, synthetic meaning. A canonical example is 明 ('bright'), interpreted as the juxtaposition of the two brightest objects in the sky: 日 ('Sun') and 月 ('Moon'), together expressing their shared quality of brightness.

Many traditional examples of compound ideographs are now believed to have actually originated as phono-semantic compounds, made obscure by subsequent changes in pronunciation. For example, a character that the ''Shuowen Jiezi'' describes as an ideographic compound may instead be identified by modern analyses as a phono-semantic compound, though with disagreement as to which component is phonetic. Peter A. Boodberg and William G. Boltz go so far as to deny that any compound ideographs were devised in antiquity, maintaining that secondary readings that are now lost are responsible for the apparent absence of phonetic indicators, but their arguments have been rejected by other scholars.


Phonographs


Phono-semantic compounds

Phono-semantic compounds are composed of at least one semantic component and one phonetic component. They may be formed by one of several methods, often by adding a phonetic component to disambiguate a loangraph, or by adding a semantic component to represent a specific extension of a character's meaning. Examples include characters written with three short strokes on their left-hand side: 氵, a simplified combining form of 水 ('water'). This component serves a semantic function, indicating that the character has some meaning related to water. The remainder of each such character is its phonetic component, which may be pronounced identically or only similarly to the compound in Standard Chinese.

The phonetic components of most compounds may only provide an approximate pronunciation, even before subsequent sound shifts in the spoken language. Some characters may only have the same initial or final sound of a syllable in common with phonetic components. A phonetic series comprises all the characters created using the same phonetic component, which may have diverged significantly in their pronunciations over time. For example, in the phonetic series based on a literary first-person pronoun, the Old Chinese pronunciations of the characters were similar, but the phonetic component no longer serves as a useful hint for their pronunciation in modern varieties of Chinese due to subsequent sound shifts, as seen in both their Mandarin and Cantonese readings.


Loangraphs

The phenomenon of existing characters being adapted to write other words with similar pronunciations was necessary in the initial development of Chinese writing, and has remained common throughout its subsequent history. Some loangraphs are introduced to represent words previously lacking a written form; this is often the case with abstract grammatical particles. The process of characters being borrowed as loangraphs should not be conflated with the distinct process of semantic extension, where a word acquires additional senses, which often remain written with the same character. As both processes often result in a single character form being used to write several distinct meanings, loangraphs are often misidentified as being the result of semantic extension, and vice versa.

Loangraphs are also used to write words borrowed from other languages, such as the Buddhist terminology introduced to China in antiquity, as well as contemporary non-Chinese words and names. For example, each character in a transcribed name is often used as a loangraph for its respective syllable. However, the barrier between a character's pronunciation and meaning is never total; when transcribing into Chinese, loangraphs are often chosen deliberately so as to create certain connotations. This is regularly done with corporate brand names: for example, Coca-Cola's Chinese name is 可口可乐 (''Kěkǒu Kělè''), roughly 'tasty and enjoyable'.


Signs

Some characters and components are pure signs, with meanings merely stemming from their having a fixed and distinct form. Basic examples of pure signs are found with the numerals beyond four, whose forms do not give visual hints to the quantities they represent.


Traditional ''Shuowen Jiezi'' classification

The ''Shuowen Jiezi'' is a character dictionary authored by the scholar Xu Shen. In its postface, Xu analyses what he sees as all the methods by which characters are created. Later authors iterated upon Xu's analysis, developing a categorization scheme known as the 'six writings', which identifies every character with one of six categories that had previously been mentioned in the ''Shuowen Jiezi''. For nearly two millennia, this scheme was the primary framework for character analysis used throughout the Sinosphere. Xu based most of his analysis on examples of Qin seal script that were written down several centuries before his time; these were usually the oldest specimens available to him, though he stated he was aware of the existence of even older forms. The first five categories are pictographs, indicatives, compound ideographs, phono-semantic compounds, and loangraphs. The definition of the sixth category given by Xu is unclear, and it is generally disregarded by modern scholars.

Modern scholars agree that the theory presented in the ''Shuowen Jiezi'' is problematic, failing to fully capture the nature of Chinese writing, both in the present and at the time Xu was writing. Traditional Chinese lexicography as embodied in the ''Shuowen Jiezi'' has suggested implausible etymologies for some characters. Moreover, several categories are considered to be ill-defined: for example, it is unclear whether certain characters should be classified as pictographs or indicatives. However, awareness of the 'six writings' model has remained a common component of character literacy, and often serves as a tool for students memorizing characters.


History

The broadest trend in the evolution of Chinese characters over their history has been simplification, both in graphical shape, the "external appearances of individual graphs", and in graphical form, "overall changes in the distinguishing features of graphic shape and calligraphic style, ... in most cases referring to rather obvious and rather substantial changes". The traditional notion of an orderly procession of script styles, each suddenly appearing and displacing the one previous, has been disproven by later scholarship and archaeological work. Instead, scripts evolved gradually, with several distinct styles often coexisting within a given area.


Traditional invention narrative

Several of the
Chinese classics The Chinese classics or canonical texts are the works of Chinese literature authored prior to the establishment of the imperial Qin dynasty in 221 BC. Prominent examples include the Four Books and Five Classics in the Neo-Confucian traditi ...
indicate that knotted cords were used to keep records prior to the invention of writing. Works that reference the practice include chapter 80 of the ' and the " II" commentary to the '. According to one tradition, Chinese characters were invented during the 3rd millennium BCE by Cangjie, a scribe of the legendary
Yellow Emperor
. Cangjie is said to have invented symbols called due to his frustration with the limitations of knotting, taking inspiration from his study of the tracks of animals, landscapes, and the stars in the sky. On the day that these first characters were created, grain rained down from the sky; that night, the people heard the wailing of ghosts and demons, lamenting that humans could no longer be cheated.


Neolithic precursors

Collections of graphs and pictures have been discovered at the sites of several
Neolithic
settlements throughout the
Yellow River
valley, including (), and (6th millennium BCE), and (5th millennium BCE). Symbols at each site were inscribed or drawn onto artefacts, appearing one at a time and without indicating any greater context. Qiu concludes, "We simply possess no basis for saying that they were already being used to record language." A historical connection with the symbols used by the late Neolithic culture () in Shandong has been deemed possible by palaeographers, with Qiu concluding that they "cannot be definitively treated as primitive writing, nevertheless they are symbols which resemble most the ancient pictographic script discovered thus far in China... They undoubtedly can be viewed as the forerunners of primitive writing."


Oracle bone script

The oldest attested Chinese writing comprises a body of inscriptions produced during the Late Shang period (1050 BCE), with the very earliest examples from the reign of Wu Ding dated between 1250 and 1200 BCE. Many of these inscriptions were made on
oracle bones (usually either ox scapulae or turtle plastrons) and recorded official divinations carried out by the Shang royal house. Contemporaneous inscriptions in a related but distinct style were also made on ritual bronze vessels. This
oracle bone script
() was first documented in 1899, after specimens were discovered being sold as "dragon bones" for medicinal purposes, with the symbols carved into them identified as early character forms. By 1928, the source of the bones had been traced to a village near
Anyang
in
Henan
—discovered to be the site of Yin, the final Shang capital—which was excavated by a team led by Li Ji from the
Academia Sinica
between 1928 and 1937. To date, over oracle bone fragments have been found. Oracle bone inscriptions recorded divinations undertaken to communicate with the spirits of royal ancestors. The inscriptions range from a few characters in length at their shortest, to several dozen at their longest. The Shang king would communicate with his ancestors by means of scapulimancy, inquiring about subjects such as the royal family, military success, and the weather. Inscriptions were made in the divination material itself before and after it had been cracked by exposure to heat; they generally include a record of the questions posed, as well as the answers as interpreted in the cracks. A minority of bones feature characters that were inked with a brush before their strokes were incised; the evidence of this also shows that the conventional stroke orders used by later calligraphers had already been established for many characters by this point. Oracle bone script is the direct ancestor of later forms of written Chinese. The oldest known inscriptions already represent a well-developed writing system, which suggests an initial emergence predating the late 2nd millennium BCE. Although written Chinese is first attested in official divinations, it is widely believed that writing was also used for other purposes during the Shang, but that the media used in other contexts—likely
bamboo and wooden slips
—were less durable than bronzes or oracle bones, and have not been preserved.


Zhou scripts

As early as the Shang, the oracle bone script existed as a simplified form alongside another that was used in bamboo books, in addition to elaborate pictorial forms often used in clan emblems. These other forms have been preserved in
bronze script
(), where inscriptions were made using a stylus in a clay mould, which was then used to cast ritual bronzes. These differences in technique generally resulted in character forms that were less angular in appearance than their oracle bone script counterparts. Study of these bronze inscriptions has revealed that the mainstream script underwent slow, gradual evolution during the late Shang, which continued during the
Zhou dynasty
(256 BCE) until assuming the form now known as ''
small seal script
'' () within the Zhou
state of Qin
. Other scripts in use during the late Zhou include the
bird-worm seal script
(), as well as the regional forms used in non-Qin states. Examples of these styles were preserved as variants in the '. Historically, Zhou forms were collectively known as ''
large seal script
'' (), though Qiu refrains from using this term due to its lack of precision.


Qin unification and small seal script

Following Qin's conquest of the other Chinese states that culminated in the founding of the imperial
Qin dynasty
in 221 BCE, the Qin small seal script was standardized for use throughout the entire country under the direction of Chancellor
Li Si
. It was traditionally believed that Qin scribes only used small seal script, and the later clerical script was a sudden invention during the early Han. However, more than one script was used by Qin scribes—a rectilinear vulgar style had also been in use in Qin for centuries prior to the wars of unification. The popularity of this form grew as writing became more widespread.


Clerical script

By the
Warring States period
(221 BCE), an immature form of
clerical script
() had emerged based on the vulgar form developed within Qin, often called "early clerical" or "proto-clerical". The proto-clerical script evolved gradually; by the
Han dynasty (202 BCE – 220 CE), it had arrived at a mature form, also called . Bamboo slips discovered during the late 20th century point to this maturation being completed during the reign of
Emperor Wu of Han
(). This process, called (), involved character forms being mutated and simplified, with many components being consolidated, substituted, or omitted. In turn, the components themselves were regularized to use fewer, straighter, and more well-defined strokes. As a result, clerical script largely lacks the pictorial qualities still evident in seal script. Around the midpoint of the
Eastern Han
(25–220 CE), a simplified and easier form of clerical script appeared, which Qiu terms . By the end of the Han, this had become the dominant script used by scribes, though clerical script remained in use for formal works, such as engraved
stelae
. Qiu describes neo-clerical as a transitional form between clerical and
regular script
which remained in use through the
Three Kingdoms
period (220–280 CE) and beyond.


Cursive and semi-cursive

Cursive script () was in use as early as 24 BCE, synthesizing elements of the vulgar writing that had originated in Qin with flowing cursive brushwork. By the Jin dynasty (266–420), the Han cursive style became known as , sometimes known in English as 'clerical cursive', 'ancient cursive', or 'draft cursive'. Some attribute this name to the fact that the style was considered more orderly than a later form referred to as , which had first emerged during the Jin and was influenced by semi-cursive and regular script. This later form was exemplified by the work of figures like Wang Xizhi (), who is often regarded as the most important calligrapher in Chinese history. An early form of
semi-cursive script
() can be identified during the late Han, with its development stemming from a cursive form of neo-clerical script. Liu Desheng (; ) is traditionally recognized as the inventor of the semi-cursive style, though accreditations of this kind often indicate a given style's early masters, rather than its earliest practitioners. Later analysis has suggested popular origins for semi-cursive, as opposed to it being an invention of Liu. It can be characterized partly as the result of clerical forms being written more quickly, without formal rules of technique or composition—what would be discrete strokes in clerical script frequently flow together instead. The semi-cursive style is commonly adopted in contemporary handwriting.


Regular script

Regular script
(), based on clerical and semi-cursive forms, is the predominant form in which characters are written and printed. Its innovations have traditionally been credited to the calligrapher
Zhong Yao
, who lived in the state of
Cao Wei
(extant 220–266); he is often called the "father of regular script". The earliest surviving writing in regular script comprises copies of Zhong Yao's work, including at least one copy by Wang Xizhi. Characteristics of regular script include the technique used to end horizontal strokes, as well as heavy tails on diagonal strokes made going down and to the right. It developed further during the Eastern Jin (317–420) in the hands of Wang Xizhi and his son Wang Xianzhi. However, most Jin-era writers continued to use neo-clerical and semi-cursive styles in their daily writing. It was not until the Northern and Southern period (420–589) that regular script became the predominant form. The system of
imperial examinations for the civil service established during the
Sui dynasty
(581–618) required test takers to write in
Literary Chinese
using regular script, which contributed to the prevalence of both throughout later Chinese history.


Structure

Each character of a text is written within a uniform square allotted for it. As part of the evolution from seal script into clerical script, character components became regularized as discrete series of strokes (). Strokes can be considered both the basic unit of handwriting, as well as the writing system's basic unit of graphemic organization. In clerical and regular script, individual strokes traditionally belong to one of eight categories according to their technique and graphemic function. In what is known as the '' Eight Principles of '', calligraphers practise their technique using the character , which can be written with one stroke of each type. In ordinary writing, is now written with five strokes instead of eight, and a system of five basic stroke types is commonly employed in analysis—with certain compound strokes treated as sequences of basic strokes made in a single motion. Characters are constructed according to predictable visual patterns. Some components have distinct combining forms when occupying specific positions within a character—for example, the ('knife') component appears as on the right side of characters, but as at the top of characters. The order in which components are drawn within a character is fixed. The order in which the strokes of a component are drawn is also largely fixed, but may vary according to several different standards. This is summed up in practice with a few rules of thumb, including that characters are generally assembled from left to right, then from top to bottom, with "enclosing" components started before, then closed after, the components they enclose. For example, is drawn in the following order:


Variant characters

Over a character's history,
variant character forms emerge via several processes. Variant forms have distinct structures, but represent the same morpheme; as such, they can be considered instances of the same underlying character. This is comparable to visually distinct double-storey and single-storey forms both representing the Latin letter . Variants also emerge for aesthetic reasons, to make handwriting easier, or to correct what the writer perceives to be errors in a character's form. Individual components may be replaced with visually, phonetically, or semantically similar alternatives. The boundary between character structure and style (and thus whether forms represent different characters, or are merely variants of the same character) is often non-trivial or unclear. For example, prior to the Qin dynasty the character meaning 'bright' was written as either or , with either ('Sun') or ('window') on the left, and ('Moon') on the right. As part of the Qin programme to standardize small seal script across China, the form was promoted. Some scribes ignored this, and continued to write the character as . However, the increased usage of was followed by the proliferation of a third variant: , with ('eye') on the left, likely derived as a contraction of . Ultimately, became the character's standard form.


Layout

From the earliest inscriptions until the 20th century, texts were generally laid out vertically—with characters written from top to bottom in columns, arranged from right to left. Word boundaries are generally not indicated with
spaces. A horizontal writing direction (with characters written from left to right in rows, arranged from top to bottom) only became predominant in the Sinosphere during the 20th century as a result of Western influence. Many publications outside mainland China continue to use the traditional vertical writing direction. Western influence also resulted in punctuation being widely adopted in print during the 19th and 20th centuries. Prior to this, the context of a passage was considered adequate to guide readers; this was enabled by characters being easier to read than alphabets when written without spaces or punctuation due to their more discretized shapes.


Methods of writing

The earliest attested Chinese characters were carved into bone, or marked using a stylus in clay moulds used to cast ritual bronzes. Characters have also been incised into stone, or written in ink onto slips of silk, wood, and bamboo. The invention of paper for use as a writing medium occurred during the 1st century CE, and is traditionally credited to
Cai Lun
. There are numerous styles, or ''scripts'' () in which characters can be written, including the historical forms like seal script and clerical script. Most styles used throughout the Sinosphere originated within China, though they may display regional variation. Styles that have been created outside of China tend to remain localized in their use—these include the Japanese and Vietnamese scripts.


Calligraphy

Calligraphy was traditionally one of the
four arts
to be mastered by Chinese scholars, considered to be an artful means of expressing thoughts and teachings. Chinese calligraphy typically makes use of an ink brush to write characters. Strict regularity is not required, and character forms may be accentuated to evoke a variety of aesthetic effects. Traditional ideals of calligraphic beauty often tie into broader philosophical concepts native to East Asia. For example, aesthetics can be conceptualized using the framework of
yin and yang
, where the extremes of any number of mutually reinforcing dualities are balanced by the calligrapher—such as the duality between strokes made quickly or slowly, between applying ink heavily or lightly, between characters written with symmetrical or asymmetrical forms, and between characters representing concrete or abstract concepts.


Printing and typefaces

Woodblock printing
was invented in China between the 6th and 9th centuries, followed by the invention of
moveable type
by
Bi Sheng
during the 11th century. The increasing use of print during the Ming (1368–1644) and
Qing
dynasties (1644–1912) led to considerable standardization in character forms, which prefigured later script reforms during the 20th century. This print
orthography
, exemplified by the 1716 ''
Kangxi Dictionary
'', was later dubbed the ' ('old character shapes'). Printed Chinese characters may use different typefaces, of which there are four broad classes in use:
* Song or Ming typefaces (with "Song" generally used with simplified Chinese typefaces, and "Ming" with others) broadly correspond to Western serif styles. Song typefaces are broadly within the tradition of historical Chinese print; both names for the style refer to eras regarded as high points for printing in the Sinosphere. While type during the Song dynasty (960–1279) generally resembled the regular script style of a particular calligrapher, most modern Song typefaces are intended for general purpose use and emphasize neutrality in their design.
* Sans-serif typefaces are called in Chinese and 'Gothic' in Japanese. Sans-serif strokes are rendered as simple lines of even thickness.
* "Kai" typefaces imitate handwritten regular script.
* Fangsong typefaces, called "Song" in Japan, correspond to semi-script styles in the Western paradigm.


Use with computers

Before computers became ubiquitous, electro-mechanical communications devices like telegraphs and typewriters were originally designed for use with alphabets, often by means of alphabetic text encodings like Morse code and ASCII. Adapting these technologies for a writing system that uses thousands of distinct characters was non-trivial.


Input methods

Chinese characters are predominantly input on computers using a standard keyboard. Many input methods (IMEs) are phonetic, where typists enter characters according to schemes like
pinyin
or
bopomofo
for Mandarin,
Jyutping
for Cantonese, or Hepburn for Japanese. For example, could be input as using pinyin, or as using Jyutping. Character input methods may also be based on form, using the shape of characters and existing rules of handwriting to assign unique codes to each character, potentially increasing the speed of typing. Popular form-based input methods include Wubi on the mainland, and Cangjie—named after the mythological inventor of writing—in Taiwan and Hong Kong. Often, unnecessary parts are omitted from the encoding according to predictable rules. For example, is encoded using the Cangjie method as , which corresponds to the components . Contextual constraints may be used to improve candidate character selection. When ignoring
tone Tone may refer to: Visual arts and color-related * Tone (color theory), a mix of tint and shade, in painting and color theory * Tone (color), the lightness or brightness (as well as darkness) of a color * Toning (coin), color change in coins * ...
s, and are both transcribed as ; the system may prioritize which candidate appears first based on context.
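The context-based prioritization described above can be sketched as follows. This is a minimal illustration, not the behaviour of any particular IME: the candidate table, the frequency counts, and the function name `rank_candidates` are all hypothetical, and the toneless code `daxue` (shared by 大学 'university' and 大雪 'heavy snow') is chosen here only as a convenient example.

```python
# Toy data, invented for illustration only.
# Toneless pinyin code -> candidate words sharing that code.
CANDIDATES = {"daxue": ["大学", "大雪"]}

# Hypothetical standalone frequencies for each candidate.
UNIGRAM = {"大学": 50, "大雪": 20}

# Hypothetical counts of (previous word, candidate) pairs.
BIGRAM = {("上", "大学"): 30, ("下", "大雪"): 40}


def rank_candidates(code, previous_word=None):
    """Order candidates for a toneless code, preferring those seen after previous_word."""
    candidates = CANDIDATES.get(code, [])

    def score(word):
        # Context evidence is compared first; standalone frequency breaks ties.
        return (BIGRAM.get((previous_word, word), 0), UNIGRAM.get(word, 0))

    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    print(rank_candidates("daxue"))        # no context: frequency decides
    print(rank_candidates("daxue", "下"))  # context favours 大雪
```

A real input method would draw on far larger language models, but the ordering principle is the same: context is consulted before falling back on overall frequency.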


Encoding and interchange

While special text encodings for Chinese characters were introduced prior to its popularization, ''
The Unicode Standard
'' is the predominant text encoding worldwide. According to the philosophy of the
Unicode Consortium
, each distinct graph is assigned a number in the standard, but specifying its appearance or the particular
allograph
used is a choice made by the engine rendering the text. Unicode's
Basic Multilingual Plane
(BMP) comprises the standard's 2^16 (65,536) smallest code points. A portion of these are assigned to
CJK Unified Ideographs
, a designation comprising characters used in each of the
Chinese family of scripts
. As of version 16.0, published in 2024, Unicode defines a total of Chinese characters.
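The idea that each character is assigned a number, independent of how it is drawn, can be made concrete with a short sketch. The helper name `describe` is hypothetical. The two sample characters are chosen here for illustration: one lies inside the BMP and one lies in a supplementary plane.

```python
import unicodedata

BMP_SIZE = 2 ** 16  # the BMP spans U+0000 through U+FFFF


def describe(ch):
    """Report the code point of a single character and how it is stored."""
    cp = ord(ch)
    return {
        "code_point": f"U+{cp:04X}",
        "in_bmp": cp < BMP_SIZE,
        "name": unicodedata.name(ch, "<unnamed>"),
        # Number of bytes when serialized as UTF-8.
        "utf8_bytes": len(ch.encode("utf-8")),
        # Number of 16-bit units in UTF-16 (2 means a surrogate pair).
        "utf16_units": len(ch.encode("utf-16-be")) // 2,
    }


if __name__ == "__main__":
    print(describe("中"))          # a character inside the BMP
    print(describe(chr(0x20000)))  # a character outside the BMP
```

Characters outside the BMP need four bytes in UTF-8 and a surrogate pair in UTF-16, which is one reason software that assumes one 16-bit unit per character can mishandle rarer characters.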


Vocabulary and adaptation

Writing first emerged during the historical stage of the Chinese language known as ''
Old Chinese
''. Most characters correspond to morphemes that originally functioned as stand-alone Old Chinese words.
Classical Chinese
is the form of written Chinese used in the classic works of Chinese literature from roughly the 5th century BCE until the 2nd century CE. This form of the language was imitated by later authors, even as it began to diverge from the language they spoke. This later form, referred to as ''Literary Chinese'', remained the predominant written language in China until the 20th century. Its use in the
Sinosphere
was loosely analogous to that of
Latin
in pre-modern Europe. While it was not static over time, Literary Chinese retained many properties of spoken Old Chinese. Informed by the local spoken
vernacular
s, texts were read aloud using literary and colloquial readings that varied by region. Over time, sound mergers created ambiguities in vernacular speech as more words became homophonic. This ambiguity was often reduced through the introduction of multi-syllable
compound words
, which comprise much of the vocabulary in modern varieties of Chinese. Over time, use of Literary Chinese spread to neighbouring countries, including Vietnam, Korea, and Japan. Alongside other aspects of Chinese culture, local elites adopted writing for record-keeping, histories, and official communications. Setting aside hypotheses by some linguists that the latter two share a common ancestor, Chinese, Vietnamese, Korean, and Japanese each belong to different
language families
, and tend to function differently from one another. Reading systems were devised to enable non-Chinese speakers to interpret Literary Chinese texts in terms of their native language, a phenomenon that has been variously described as either a form of diglossia, as ''reading by gloss'', or as a process of translation into and out of Chinese. Compared to other traditions that wrote using alphabets or syllabaries, the literary culture that developed in this context was less directly tied to a specific spoken language. This is exemplified by the cross-linguistic phenomenon of
brushtalk
, where mutual literacy allowed speakers of different languages to engage in face-to-face conversations. Following the introduction of Literary Chinese, characters were later adapted to write many non-Chinese languages spoken throughout the Sinosphere. These new writing systems used characters to write both native vocabulary and the numerous
loanword
s each language had borrowed from Chinese, collectively termed '' Sino-Xenic vocabulary''. Characters may have native readings, Sino-Xenic readings, or both. Comparison of Sino-Xenic vocabulary across the Sinosphere has been useful in the reconstruction of Middle Chinese phonology. Literary Chinese was used in Vietnam during the millennium of Chinese rule that began in 111 BCE. By the 15th century, a system that adapted characters to write Vietnamese called ' had fully matured. The 2nd century BCE is the earliest possible period for the introduction of writing to Korea; the oldest surviving manuscripts in the country date to the early 5th century CE. Also during the 5th century, writing spread from Korea to Japan. Characters were being used to write both Korean and Japanese by the 6th century. By the late 20th century, characters had largely been replaced with alphabets designed to write Vietnamese and Korean. This leaves Japanese as the only major non-Sinitic language typically written using Chinese characters.


Literary and vernacular Chinese

Words in Classical Chinese were generally a single character in length. An estimated 25–30% of the vocabulary used in Classical Chinese texts consists of two-character words. Over time, the introduction of multi-syllable vocabulary into vernacular varieties of Chinese was encouraged by phonetic shifts that increased the number of homophones. The most common process of Chinese word formation after the Classical period has been to create compounds of existing words. Words have also been created by appending
affix
es to words, by
reduplication
, and by borrowing words from other languages. While multi-syllable words are generally written with one character per syllable, abbreviations are occasionally used. For example, may be written as the contracted form . Sometimes, different morphemes come to be represented by characters with identical shapes. For example, may represent either or the extended sense of —these morphemes are ultimately
cognate
s that diverged in pronunciation but remained written with the same character. However, Qiu reserves the term ''homograph'' to describe identically shaped characters with different meanings that emerge via processes other than semantic extension. An example homograph is ; , which originally meant . In the 20th century, this character was created again with the meaning . Both of these characters are phono-semantic compounds with ('gold') as the semantic component and as the phonetic component, but the words represented by each are not related. There are a number of that are not used in standard
written vernacular Chinese
, but reflect the vocabulary of other spoken varieties. The most complete example of an orthography based on a variety other than
Standard Chinese
is Written Cantonese. A common Cantonese character is , derived by removing two strokes from . It is common to use standard characters to transcribe previously unwritten words in Chinese dialects when obvious cognates exist. When no obvious cognate exists due to factors like irregular sound changes, semantic drift, or an origin in a non-Chinese language, characters are often borrowed or invented to transcribe the word—either ad hoc, or according to existing principles. These new characters are generally phono-semantic compounds.


Japanese

In Japanese, Chinese characters are referred to as . During the
Nara period
(710–794), readers and writers of —the Japanese term for Literary Chinese writing—began utilizing a system of reading techniques and annotations called . When reading, Japanese speakers would adapt the syntax and vocabulary of Literary Chinese texts to reflect their Japanese-language equivalents. Writing essentially involved the inverse of this process, and resulted in ordinary Literary Chinese. When adapted to write Japanese, characters were used to represent both
Sino-Japanese vocabulary
loaned from Chinese, as well as the corresponding native synonyms. Most kanji were subject to both borrowing processes, and as a result have both Sino-Japanese and native readings, known as and respectively. Moreover, kanji may have multiple readings of either kind. Distinct classes of were borrowed into Japanese at different points in time from different varieties of Chinese. The
Japanese writing system
is a mixed script, and has also incorporated syllabaries called to represent phonetic units called '' moras'', rather than morphemes. Prior to the
Meiji era
(1868–1912), writers used certain kanji to represent their sound values instead, in a system known as . Starting in the 9th century, specific were graphically simplified to create two distinct syllabaries called and , which slowly replaced the earlier convention. Modern Japanese retains the use of kanji to represent most
word stem
s, while syllabograms are generally used for grammatical affixes, particles, and loanwords. The forms of and are visually distinct from one another, owing in large part to different methods of simplification— were derived from smaller components of each , while were derived from the cursive forms of in their entirety. In addition, the and for some moras were derived from different . Characters invented for Japanese-language use are called . The methods employed to create are equivalent to those used by Chinese-original characters, though most are ideographic compounds. For example, (; 'mountain pass') is a compound composed of ('mountain'), ('above'), and ('below'). While characters used to write Chinese are monosyllabic, many kanji have multi-syllable readings. For example, the kanji has a native reading of . In different contexts, it can also be read with the reading , such as in the Chinese loanword (; 'Japanese sword'), with a pronunciation corresponding to that in Chinese at the time of borrowing. Prior to the universal adoption of , loanwords were typically written with unrelated kanji with readings matching the syllables in the loanword. These spellings are called —for example, was the spelling of 'America', now rendered as . As opposed to used solely for their pronunciation, still corresponded to specific Japanese words. Some are still in use, with the official list of kanji including 106 readings.


Korean

In Korean, Chinese characters are referred to as ''hanja''. Literary Chinese may have been written in Korea as early as the 2nd century BCE. During Korea's
Three Kingdoms
period (57 BCE – 668 CE), characters were also used to write , a form of Korean-language literature that mostly made use of
Sino-Korean vocabulary
. During the
Goryeo
period (918–1392), Korean writers developed a system of phonetic annotations for Literary Chinese called , comparable to in Japan, though it only entered widespread use during the later
Joseon
period (1392–1897). While the
hangul
alphabet was invented by the Joseon king
Sejong
in 1443, it was not adopted by the Korean literati and was relegated to use in glosses for Literary Chinese texts until the late 19th century. Much of the Korean lexicon consists of Chinese loanwords, especially technical and academic vocabulary. While hanja were usually only used to write this Sino-Korean vocabulary, there is evidence that vernacular readings were sometimes used. Compared to the other written vernaculars, very few characters were invented to write Korean words; these are called '' gukja''. During the late 19th and early 20th centuries, Korean was written either using a mixed script of hangul and hanja, or only using hangul. Following the end of the
Empire of Japan
's occupation of Korea in 1945, the total replacement of hanja with hangul was advocated throughout the country as part of a broader "purification movement" of the national language and culture. However, due to the lack of tones in spoken Korean, there are many Sino-Korean words that are homophones with identical hangul spellings. For example, the phonetic dictionary entry for () yields more than 30 different entries. This ambiguity had historically been resolved by also including the associated hanja. While still sometimes used for Sino-Korean vocabulary, it is much rarer for native Korean words to be written using hanja. When learning new characters, Korean students are instructed to associate each one with both its Sino-Korean pronunciation, as well as a native Korean synonym. Examples include:


Vietnamese

In Vietnamese, Chinese characters are referred to as ' (), ' (; '
Confucian
characters'), or ' (). Literary Chinese was used for all formal writing in Vietnam until the modern era, having first acquired official status in 1010. Literary Chinese written by Vietnamese authors is first attested in the late 10th century, though the local practice of writing is likely several centuries older. Characters used to write Vietnamese called ' () are first attested in an inscription dated to 1209 made at the site of a pagoda. A mature script had likely emerged by the 13th century, and was initially used to record Vietnamese folk literature. Some characters are phono-semantic compounds corresponding to spoken Vietnamese syllables. Another technique with no equivalent in China created compounds using two phonetic components. This was done because Vietnamese phonology included consonant clusters that were not found in Chinese and were thus poorly approximated by the sound values of borrowed characters. Compounds used components with two distinct consonant sounds to specify the cluster, e.g. (; 'Moon') was created as a compound of () and (). As a system, was highly complex, and the literacy rate among the Vietnamese population never exceeded 5%. Both Literary Chinese and fell out of use during the French colonial period, and were gradually replaced by the Latin-based
Vietnamese alphabet
. Since the end of colonial rule in 1954, the Vietnamese alphabet has been the sole official writing system in Vietnam, and is used exclusively in Vietnamese-language media.


Other languages

Several minority languages of
South
and
Southwestern China
have been written with scripts using both borrowed and locally created characters. The most well-documented of these is the script for the
Zhuang languages
of
Guangxi
. While little is known about its early development, a tradition of vernacular Zhuang writing likely first emerged during the
Tang dynasty
(618–907). Modern scholarship characterizes writing as a network of regional traditions that have mutually influenced one another while maintaining their local characteristics. Like Vietnamese, some invented Zhuang characters are phonetic–phonetic compounds, though not primarily ones intended to describe consonant clusters. Despite the Chinese government encouraging its replacement with a Latin-based
Zhuang alphabet
, remains in use. Other non-Sinitic
languages of China
historically written with Chinese characters include
Miao
, Yao, Bouyei, Bai, and Hani; each of these is now written with a Latin-based alphabet designed for that language.


Graphically derived scripts

Between the 10th and 13th centuries, dynasties founded by non-Han peoples in northern China also created scripts for their languages that were inspired by Chinese characters, but did not use them directly—these included the
Khitan large script
,
Khitan small script
, Tangut script, and
Jurchen script
. This has occurred in other contexts as well: Nüshu was a script used by Yao women to write the
Xiangnan Tuhua
language, and
bopomofo
() is a semi-syllabary first invented in 1907 to represent the sounds of Standard Chinese; both use forms graphically derived from Chinese characters. Other scripts within China that have adapted some characters but are otherwise distinct include the
Geba syllabary
used to write the
Naxi language
,
the script
for the Sui language,
the script
for the Yi languages, and the syllabary for the Lisu language. Chinese characters have also been repurposed phonetically to transcribe the sounds of non-Chinese languages. For example, the only manuscripts of the 13th-century ''
Secret History of the Mongols
'' that have survived from the medieval era use characters in this manner to write the
Mongolian language
.


Literacy and lexicography

The memorization of thousands of different characters is required to achieve literacy in languages written with them, in contrast to the relatively small inventory of graphemes used in phonetic writing. Historically, character literacy was often acquired via Chinese primers like the 6th-century '' Thousand Character Classic'' and 13th-century '' Three Character Classic'', as well as surname dictionaries like the Song-era ''
Hundred Family Surnames
''. Studies of Chinese-language literacy suggest that literate individuals generally have an
active vocabulary
of three to four thousand characters; for specialists in fields like literature or history, this figure may be between five and six thousand.


Dictionaries

According to analyses of mainland Chinese, Taiwanese, Hong Kong, Japanese, and Korean sources, the total number of characters in the modern lexicon is around . Dozens of schemes have been devised for indexing Chinese characters and arranging them in dictionaries, though relatively few have achieved widespread use. Characters may be ordered according to methods based on their meaning, visual structure, or pronunciation. The ' () organized the Chinese lexicon into 19 sections according to character meaning, with 3 dealing with everyday vocabulary, and each of the remaining 16 dedicated to specialized vocabulary related to a specific topic. The ' () introduced what would ultimately become the predominant method of organization used by later character dictionaries, whereby characters are grouped according to certain visually prominent components called '' radicals'' (). The ' used a system of 540 radicals, while subsequent dictionaries have generally used fewer. The set of 214
Kangxi radicals
was popularized by the ''
Kangxi Dictionary
'' (1716), but originally appeared in the earlier ' (1615). Character dictionaries have historically been indexed using
radical-and-stroke sorting
, where characters are grouped by radical and sorted within each group by stroke number. Some modern dictionaries arrange character entries alphabetically according to their pinyin spelling, while also providing a traditional radical-based index. Before the invention of romanization systems for Chinese, the pronunciation of characters was transmitted via rhyme dictionaries. These used the () method, where each entry lists a common character with the same initial sound as the character in question, alongside one with the same final sound.
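Radical-and-stroke sorting, as described above, amounts to ordering entries by a two-part key. The sketch below is illustrative only: the entry list and the function name `radical_stroke_sort` are hypothetical, and it assumes each entry records its Kangxi radical number together with the number of strokes remaining once the radical is set aside.

```python
# Illustrative entries: (character, Kangxi radical number, strokes beyond the radical).
# Radical 9 is 人, radical 75 is 木, radical 85 is 水.
ENTRIES = [
    ("森", 75, 8),
    ("河", 85, 5),
    ("休", 9, 4),
    ("木", 75, 0),
    ("他", 9, 3),
    ("林", 75, 4),
    ("水", 85, 0),
]


def radical_stroke_sort(entries):
    """Group entries by radical number, then order each group by stroke count."""
    return sorted(entries, key=lambda entry: (entry[1], entry[2]))


if __name__ == "__main__":
    for char, radical, strokes in radical_stroke_sort(ENTRIES):
        print(radical, strokes, char)
```

Because the radical is fixed within a group, sorting by remaining strokes gives the same order within that group as sorting by total strokes. Characters that tie on both keys would need a further tiebreaker, which this sketch does not specify.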


Neurolinguistics

Using functional magnetic resonance imaging
(fMRI), neurolinguists have studied the brain activity associated with literacy. Compared to phonetic systems, reading and writing with characters involves additional areas of the brain—including those associated with visual processing. While the level of memorization required for character literacy is significant, identification of the phonetic and semantic components in compounds—which constitute the vast majority of characters—also plays a key role in reading comprehension. The ease of recognition for a given character is impacted by how regular the positioning of its components is, as well as how reliable its phonetic component is in indicating a specific pronunciation. Moreover, due to the high level of homophony in Chinese languages and the more irregular correspondences between writing and the sounds of speech, it has been suggested that knowledge of orthography plays a greater role in speech recognition for literate Chinese speakers. Developmental dyslexia in readers of character-based languages appears to involve independent visuospatial and phonological disorders co-occurring. This seems to be a distinct phenomenon from dyslexia as experienced with phonetic orthographies, which can result from only one of the aforementioned disorders.


Reform and standardization

Attempts to reform and standardize the use of characters—including aspects of form, stroke order, and pronunciation—have been undertaken by states throughout history. Thousands of
simplified characters
were standardized and adopted in mainland China during the 1950s and 1960s, with most either already existing as common variants, or being produced via the systematic simplification of their components. After World War II, the Japanese government also simplified hundreds of character forms, including some simplifications distinct from those adopted in China. Orthodox forms that have not undergone simplification are referred to as ''
traditional characters
''. Across Chinese-speaking polities, mainland China, Malaysia, and Singapore use simplified characters, while Taiwan, Hong Kong, and Macau use traditional characters. In general, Chinese and Japanese readers can successfully identify characters from all three standards. Prior to the 20th century, reforms were generally conservative and sought to reduce the use of simplified variants. During the late 19th and early 20th centuries, an increasing number of intellectuals in China came to see both the Chinese writing system and the lack of a national spoken dialect as serious impediments to achieving the mass literacy and mutual intelligibility required for the country's successful modernization. Many began advocating for the replacement of Literary Chinese with a written language that more closely reflected speech, as well as for a mass simplification of character forms, or even the total replacement of characters with an alphabet tailored to a specific spoken variety. In 1909, the educator and linguist
Lufei Kui formally proposed the adoption of simplified characters in education for the first time. In 1911, the Xinhai Revolution toppled the Qing dynasty and resulted in the establishment of the Republic of China the following year. The early Republican era (1912–1949) was characterized by growing social and political discontent that erupted into the 1919
May Fourth Movement, catalysing the replacement of Literary Chinese with written vernacular Chinese over the subsequent decades. Alongside the corresponding spoken variety of Standard Chinese, this written vernacular was promoted by intellectuals and writers such as Lu Xun and Hu Shih. It was based on the Beijing dialect of Mandarin, as well as on the existing body of vernacular literature authored over the preceding centuries, which included classic novels such as ''Journey to the West'' (16th century) and ''Dream of the Red Chamber'' (mid-18th century). At this time, character simplification and phonetic writing were being discussed within both the ruling
Kuomintang (KMT) and the Chinese Communist Party (CCP). In 1935, the Republican government published the first official list of simplified characters, comprising 324 forms collated by Peking University professor Qian Xuantong. However, strong opposition within the party resulted in the list being rescinded in 1936.


People's Republic of China

The project of script reform in China was ultimately inherited by the Communists, who resumed work following the proclamation of the People's Republic of China in 1949. In 1951, Premier Zhou Enlai
ordered the formation of a Script Reform Committee, with subgroups investigating both simplification and alphabetization. The simplification subgroup began surveying and collating simplified forms the following year, ultimately publishing a draft scheme of simplified characters and components in 1956. In 1958, Zhou Enlai announced the government's intent to focus on simplification, as opposed to replacing characters with
Hanyu Pinyin
, which had been introduced earlier that year. The 1956 scheme was largely ratified by a revised list of characters promulgated in 1964. The majority of these characters were drawn from conventional abbreviations or ancient forms with fewer strokes. The committee also sought to reduce the total number of characters in use by merging some forms together. For example, 雲 ('cloud') was written as 云 in oracle bone script. The simpler form remained in use as a loangraph meaning 'to say'; it was replaced in its original sense of 'cloud' with a form that added a semantic 雨 ('rain') component. The simplified forms of these two characters have been merged into 云. A second round of simplified characters was promulgated in 1977, but was poorly received by the public and quickly fell out of official use. It was formally rescinded in 1986. The second-round simplifications were unpopular in large part because most of the forms were completely new, in contrast to the familiar variants comprising the majority of the first round. With the rescission of the second round, work toward further character simplification largely came to an end. The ''Chart of Generally Utilized Characters of Modern Chinese'' was published in 1988 and included simplified and unsimplified characters. Of these, half were also included in the revised ''
List of Commonly Used Characters in Modern Chinese'', which specified common characters and less common characters. In 2013, the ''List of Commonly Used Standard Chinese Characters'' was published as a revision of the 1988 lists; it includes a total of 8,105 characters.


Japan

After World War II, the Japanese government instituted its own program of orthographic reforms. Some characters were assigned simplified forms called ''shinjitai''; the older forms were then labelled ''kyūjitai''. Inconsistent use of different variant forms was discouraged, and lists of characters to be taught to students at each grade level were developed. The first of these was the 1,850-character ''tōyō kanji'' list published in 1946, later replaced by the 1,945-character ''jōyō kanji'' list in 1981. In 2010, the ''jōyō kanji'' were expanded to include a total of 2,136 characters. The Japanese government restricts characters that may be used in names to the ''jōyō kanji'', plus an additional list of 983 ''jinmeiyō kanji'' whose use is historically prevalent in names.


South Korea

Hanja are still used in South Korea, though not to the extent that kanji are used in Japan. In general, there is a trend toward the exclusive use of hangul in ordinary contexts. Characters remain in use in place names and newspapers, and to disambiguate homophones. They are also used in the practice of calligraphy. Use of hanja in education is politically contentious, with official policy regarding the prominence of hanja in curricula having vacillated since the country's independence. Some support the total abandonment of hanja, while others advocate an increase in use to levels previously seen during the 1970s and 1980s. Students in grades 7–12 are presently taught with a principal focus on simple recognition and attaining sufficient literacy to read a newspaper. The South Korean Ministry of Education published the ''Basic Hanja for Educational Use'' in 1972, which specified 1,800 characters meant to be learned by secondary school students. In 1991, the
Supreme Court of Korea published the ''Table of Hanja for Use in Personal Names'', which initially included characters. The list has been expanded several times since; , it includes characters.


North Korea

In the years following its establishment, the North Korean government sought to eliminate the use of hanja in standard writing; by 1949, characters had been almost entirely replaced with hangul in North Korean publications. While mostly unused in writing, hanja remain an important part of North Korean education. A 1971 textbook for university history departments contained distinct characters, and in the 1990s North Korean schoolchildren were still expected to learn characters. A 2013 textbook appears to integrate the use of hanja in secondary school education. It has been estimated that North Korean students learn around hanja by the time they graduate university.


Taiwan

The ''Chart of Standard Forms of Common National Characters'' was published by Taiwan's Ministry of Education in 1982, and lists traditional characters. The Ministry of Education also compiles dictionaries of characters used in Taiwanese Hokkien and Hakka.


Other regional standards

Singapore's Ministry of Education promulgated three successive rounds of simplifications. The first round in 1969 included 502 simplified characters, and the second round in 1974 included simplified characters—including 49 that differed from those in the PRC, which were ultimately removed in the final round in 1976. In 1993, Singapore adopted the revisions made in mainland China in 1986. The Hong Kong Education and Manpower Bureau's ''
List of Graphemes of Commonly-Used Chinese Characters'' includes 4,762 traditional characters used in elementary and junior secondary education.


External links


Unihan Database: Reference glyphs, readings, and meanings for characters in ''The Unicode Standard'', with information about the history of Han unification
Chinese Text Project Dictionary: Comprehensive character dictionary, including examples of Classical Chinese usage
zi.tools: Character lookup by orthography, phonology, and etymology
Chinese Etymology, by Richard Sears