The CEDICT project was started by Paul Denisowski in 1997 and is maintained by a team on mdbg.net under the name CC-CEDICT, with the aim of providing a complete Chinese-to-English dictionary with pronunciation in pinyin for the Chinese characters.
Content
CEDICT is a text file; other programs (or simply Notepad, egrep, or an equivalent) are needed to search and display it. The project is used by several other Chinese-English projects. The Unihan Database uses CEDICT data for most of its information about character compounds, but this is auxiliary and is explicitly not a part of the main Unicode database.
Features:
* Traditional Chinese and Simplified Chinese
* Pinyin (several pronunciations)
* American English (several)
* , it had 122,444 entries in UTF-8.
The basic format of a CEDICT entry is:
Traditional Simplified [pin1 yin1] /American English equivalent 1/equivalent 2/
漢字 汉字 [han4 zi4] /Chinese character/CL:個, 个/
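The entry format can be split into its fields with a short regular expression. The following is a minimal sketch, not the project's own tooling. It assumes the usual CC-CEDICT layout of two whitespace-separated headwords, pinyin in square brackets, and slash-delimited English equivalents, and it assumes that lines starting with `#` are comments. The names `Entry`, `LINE_RE`, and `parse_line` are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    """One dictionary entry (hypothetical container for illustration)."""
    traditional: str
    simplified: str
    pinyin: str
    glosses: List[str]


# Assumed layout: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
LINE_RE = re.compile(r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")


def parse_line(line: str) -> Optional[Entry]:
    """Parse one line; return None for blanks, comments, or malformed lines."""
    line = line.strip()
    if not line or line.startswith("#"):  # assumption: '#' marks a comment
        return None
    match = LINE_RE.match(line)
    if match is None:
        return None
    traditional, simplified, pinyin, body = match.groups()
    # The English equivalents are separated by slashes.
    return Entry(traditional, simplified, pinyin, body.split("/"))


entry = parse_line("漢字 汉字 [han4 zi4] /Chinese character/CL:個, 个/")
```

Returning `None` for lines that do not match keeps the sketch tolerant of header comments and of any variation in the file that it does not anticipate.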
Example of a simple egrep search:
$ egrep -i 有勇無謀 cedict.txt
有勇無謀 有勇无谋 [you3 yong3 wu2 mou2] /bold but not very astute/
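The same kind of lookup as the egrep search above can be sketched in a few lines of Python for systems where egrep is not available. This is an illustrative equivalent only: the function name `search_lines` and the sample data are hypothetical, and the case-insensitive flag is chosen to mirror `-i`.

```python
import re
from typing import Iterable, Iterator


def search_lines(lines: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield each line that matches the regular expression, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    for line in lines:
        if regex.search(line):
            yield line.rstrip("\n")


# Hypothetical in-memory sample standing in for the dictionary file.
sample = [
    "# hypothetical header comment\n",
    "有勇無謀 有勇无谋 [you3 yong3 wu2 mou2] /bold but not very astute/\n",
    "漢字 汉字 [han4 zi4] /Chinese character/CL:個, 个/\n",
]

hits = list(search_lines(sample, "有勇無謀"))
for hit in hits:
    print(hit)
```

Because `search_lines` accepts any iterable of strings, a file opened with `open("cedict.txt", encoding="utf-8")` can be passed in place of `sample`; the file name here is the one used in the egrep example.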
History
Related projects
CEDICT has served as a model for several other projects:
* HanDeDict (~156,000 Chinese entries)
* CFDICT (~44,000 entries) for French
* Some older CEDICT data is also found in the Adsotrans dictionary.
* February 2012: ChE-DICC, a free Spanish-Chinese dictionary, started (currently in beta)
* May 2017: CHDICT (11,000 entries) for Hungarian
* CC-Canto is Pleco Software's addition of Cantonese language readings in Jyutping transcription to CC-CEDICT
* Cantonese CEDICT features Cantonese language readings in Yale transcription and has Cantonese-specific words, many of which were taken from "A Dictionary of Cantonese Slang" in possible copyright infringement.
External links
* CC-CEDICT Editor
* Project home page
* More information on the formatting of CC-CEDICT