Ghost Characters

Ghost characters are erroneous kanji included in the Japanese Industrial Standard JIS X 0208. Twelve of its 6,355 kanji are ghost characters.


Overview

In 1978, the Ministry of International Trade and Industry established the standard JIS C 6226 (later JIS X 0208). This standard defined 6,349 characters as JIS Level 1 and Level 2 kanji, a set known as the "JIS Basic Kanji". The following four lists of kanji were used as sources:

# Kanji Table for Standard Codes (Draft): IPSJ Kanji Code Committee (1971)
# National Land Administrative Districts Directory: Geographical Society of Japan (1972)
# Nippon Seimei's family name table: Nippon Life (1973, no longer extant)
# Basic Kanji for Administrative Information Processing: Administrative Management Agency (1975)

When the standard was established, the authority for each character was not clearly recorded, and it was later pointed out that the meanings and attested uses of some characters were unknown. The term "ghost character" was coined by analogy with "ghost word", a word that is included in dictionaries but has no practical use. The best-known examples are "妛" and "彁". These characters appear in neither the ''Kangxi Dictionary'' nor the ''Dai Kan-Wa Jiten'', a comprehensive compilation that draws on ancient Chinese character dictionaries.

In 1997, the drafting committee for the revised standard, led by its chairman, Koji Shibano, and Hiroyuki Sasahara of the National Institute for Japanese Language and Linguistics, investigated the literature consulted when the 1978 standard was drafted. The investigation revealed that many of the characters that had been considered ghost characters were in fact kanji used in place names.

According to the survey, before the 1978 standard was drafted, the Administrative Management Agency had in 1974 collated eight lists of kanji, including lists 1 to 3 above, in a report entitled "Frequency of Use and Correspondence Analysis Results of Kanji Characters for Selection of Standard Kanji Characters for Administrative Information Processing." The report includes a list of kanji together with their original sources. When the JIS Basic Kanji were selected, the drafters consulted the results of this correspondence analysis rather than the original sources. Many ghost characters were found among the entries attributed to the ''National Land Administrative Districts Directory'' and the Nippon Life name table. The original of the Nippon Life table, in particular, was already unavailable when the first standard was drafted, and its contents have been criticized as inadequate.

In response to these findings, the revision committee reconstructed the 1972 edition of the ''National Land Administrative Districts Directory'' from its revision history and checked every page for every kanji in the book to confirm attested uses. To substitute for the Nippon Life table, which no longer exists, the committee conducted an exhaustive literature search, including a comparison of the NTT and Nippon Telegraph and Telephone Public Corporation telephone directory databases and a survey of more than 30 ancient and modern character dictionaries.

Twelve kanji remain unidentified. Three of them appear to be typos. Eight are listed, perhaps by coincidence, in old Japanese or Chinese dictionaries. For "彁" (read "ka"), no concrete source has been found.

Ghost characters have already been adopted into international standards such as Unicode. Because changing these standards would likely cause compatibility problems, modifying or removing ghost characters is virtually impossible.
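
To make the dual encoding concrete, the following sketch looks up a few of the ghost characters named in this article in both JIS X 0208 and Unicode. It assumes that Python's built-in `euc_jp` codec can stand in for the JIS X 0208 table (in EUC-JP, a JIS X 0208 character is two bytes, each equal to 0xA0 plus the row or cell number). The helper name `kuten` is hypothetical and not part of any standard.

```python
def kuten(ch):
    """Return the JIS X 0208 (row, cell) position of one character.

    Assumption: Python's euc_jp codec is used as the JIS X 0208 mapping.
    Raises ValueError if the character is not a two-byte JIS X 0208 code.
    """
    try:
        raw = ch.encode("euc_jp")
    except UnicodeEncodeError:
        raise ValueError(f"{ch!r} is not encodable in EUC-JP") from None
    # JIS X 0208 characters are exactly two bytes, each in 0xA1-0xFE.
    # ASCII is one byte, JIS X 0212 is three, half-width kana start with 0x8E.
    if len(raw) != 2 or raw[0] < 0xA1 or raw[1] < 0xA1:
        raise ValueError(f"{ch!r} is not a JIS X 0208 character")
    return raw[0] - 0xA0, raw[1] - 0xA0


# Ghost characters mentioned in this article.
for ch in "妛彁壥椦槞閠挧暃":
    row, cell = kuten(ch)
    print(f"{ch}  JIS X 0208 {row:02d}-{cell:02d}  Unicode U+{ord(ch):04X}")
```

Each character has a position in both code tables, so any text already stored in either encoding would be affected if a ghost character were withdrawn.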


List of ghost characters


Identified sources

The results of the aforementioned survey by Hiroyuki Sasahara et al. are summarized in Annex 7 of JIS X 0208:1997, "Detailed Description of Row-Cell Positions". This section excerpts some of them. JIS X 0208:1997 details the sources of 72 characters whose sources have been identified, mainly characters listed in neither Morohashi's ''Dai Kan-Wa Jiten'' nor Kadokawa's ''Shin Jigen''. These include characters found to have been miswritten in the original sources. The list of row-cell positions marked "source authority" in Annex 7 contains 72 characters, but the detailed text omits "鰛 (82-60)", while "幤 (54-82)", although marked "source authority" in the text, does not appear in the list of 72. Some of these characters are listed in the table below, including some known to be typos in the original text.


Unknown sources

JIS X 0208:1997 treats the 12 characters in the table below as "Authority unknown", "Unknown", or "Unidentifiable" because it is not certain which of the four aforementioned lists of kanji is the source of the characters. Since ghost characters are "kanji that do not exist", the readings are given "for convenience".


Possible typos

Some of the characters of unknown authority are believed to have been miswritten by the standard's drafters.
* "壥" may be a miswriting of the similar "㕓", which is not included in the JIS Basic Kanji. "㕓" is not included in JIS X 0213 either.
* "妛" may be a miswriting of the similar "𡚴", which is not included in the JIS Basic Kanji. In the National Land Administrative Districts Directory, the source of this character, the printing block appears to have been made by cutting and pasting together parts of different characters, which left a shadow-like mark at the join. This mark is assumed to have been mistakenly transcribed as a horizontal stroke.
* "椦" may be a miswriting of the similar "橳", which is not included in the JIS Basic Kanji.


Treatment in dictionaries

Since the standard was established, kanji dictionaries have generally been compiled on the premise that they cover all of the JIS Basic Kanji. For ghost characters, however, there are no earlier sources to consult, so their treatment has varied by dictionary and by character, as follows.


Makeshift readings assigned

Equipment that implements the JIS Basic Kanji often assigns ghost characters a phonetic reading, and some dictionaries list these makeshift readings as well. Hiroyuki Sasahara points out that the readings may derive from a research report by the Japan Electronics and Information Technology Industries Association (1982) and from materials published by NEC (1982) and IBM Japan (1983).


Regarded as variations of similar characters

Some dictionaries treat "駲" as a variant of "馴" and "軅" as a variant of "軈". None of them cites a source. The character "妛" may be a typo of a very similar character in which the upper "山" is replaced by "屮"; that character is found in the ''Dai Kan-Wa Jiten'' and the ''Kangxi Dictionary''. The JIS X 0208:1997 survey also presents this as an example of implicit unification with an attested character. The two characters are likewise unified under a single code point in Unicode.


Individual interpretations

Since "竜" is a variant of "龍", there is an interpretation that "槞" is a variant of "櫳". Some dictionaries consider "鵈" = "耳" (ear) + "鳥" (bird) to be the character for the black kite.


Explained as unknown

After the results of the research above were published, dictionaries generally adopted its findings. The ''Dai Kan-Wa Jiten'' published a supplementary volume in 2000, which records the characters 垈, 垉, 岾, 橸, 汢, 粭, 糘, 膤, 軅 and 鵈. Kadokawa's ''Shin Jigen'' (New Character Source) was revised in 2017 to include all JIS Level 1 through Level 4 kanji, including the ghost characters.


Character's remains

Chinese characters (including Japanese kanji) have been used in East Asian countries since ancient times and were handed down mainly by handwriting. As a result, characters came to be written slightly differently from country to country, or even within a single country. These are the so-called variant Chinese characters. Unicode did not encode every variant separately; characters that differ only slightly were unified under a single code point.

Combining simple components of Chinese characters to create new characters has also been practiced in different countries and regions. The same character may therefore be invented independently in different countries, with different (or sometimes identical) meanings.

As mentioned above, the Japanese ghost character "妛" is presumed to have originally been just "𡚴", a combination of "山" and "女", with an accidental "一" in between. China, however, has a character composed of "屮", "一", and "女", which is a variant of "媸". In Unicode, "妛", which did not originally exist in Japan, was unified with that Chinese character because the two happen to look alike. Moreover, it is the erroneous Japanese "妛" that was registered in Unicode.

Similarly, the Japanese ghost character "閠" (whose lower part is "玉") is thought to be a miswriting of "閏" (whose lower part is "王"). A 16th-century manuscript of the 15th-century Japanese dictionary Wagokuhen also contains "閠", but it is an isolated example. In China, by contrast, "閠" is a variant of "閏" and not a miswriting. These were also unified in Unicode.

Some believe that the Japanese ghost character "岾" is a kokuji (a kanji created in Japan) meaning "bald mountain" that originated as a miswriting of "岵". In Korea, however, the same character was created to mean "mountain pass". These were also unified in Unicode.


Contemporary use

Since the standard was published and widely implemented, ghost characters have turned up in actual use. "祢宜", the title of the deputy manager of a Japanese shrine, is sometimes miswritten as "袮宜" (using the radical 衤 instead of the correct 礻). In some cases, the Japanese surname "栩谷" is mistaken for "挧谷" (using the hand radical, Radical 64, instead of the correct tree radical, Radical 75). The Japanese folklorist Motoji Niwa introduced the surname "妛芸凡" (Akiōshi) in his book. In the database of ''The Asahi Shimbun'', the name "埼玉自彊会", printed in the February 23, 1923 edition, was incorrectly rendered as "埼玉自彁会" when the edition was digitized. The error has since been corrected.


Examples of use in fiction

The Japanese tokusatsu television series ''Gosei Sentai Dairanger'' features a character named "嘉挧" (Kaku). The name is taken from the ancient Chinese statesman Jia Xu (賈詡), but "詡" was replaced with the ghost character "挧" because "詡" is not included in JIS X 0208.

The book ''5A73'', by the Japanese mystery writer Yuji Yomisaka, begins with a series of murders in which the ghost character "暃" is written on the bodies of the victims.

The music game ''Beatmania IIDX'' includes a song titled "閠槞彁の願い", which uses ghost characters. According to the comments on the song, the title is "unpronounceable to humans" and is tentatively read "Gyokurōka no Negai" (ぎょくろうかのねがい), the ateji reading of the ghost characters.
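
The substitution of "挧" for "詡" described above can be checked mechanically. The sketch below assumes that Python's built-in `iso2022_jp` codec, which the Python documentation describes as covering ASCII, JIS X 0201 Roman and JIS X 0208 only, is an acceptable proxy for "included in JIS X 0208". The function name `in_jis_x_0208` is hypothetical.

```python
def in_jis_x_0208(ch):
    """Report whether one non-ASCII character is encodable in JIS X 0208.

    Assumption: Python's iso2022_jp codec is used as the JIS X 0208 table.
    """
    if ch.isascii():
        raise ValueError("expects a single non-ASCII character")
    try:
        ch.encode("iso2022_jp")
    except UnicodeEncodeError:
        return False
    return True


# The historical name (賈詡) and the name used in the series (嘉挧).
for ch in "賈詡嘉挧":
    print(ch, in_jis_x_0208(ch))
```

A check like this only reflects the codec's mapping table; it says nothing about whether a character has an attested source.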


Similar cases in Unicode

Unicode's CJK Unified Ideographs also include characters whose history of inclusion is unknown, and these are sometimes called "ghost characters" as well. For example, it has been pointed out that the character "螀" (U+8780), which was registered in Unicode because it was included in the CCITT Chinese Primary Set, may be a typographical error that was adopted without sufficient checking of the source.

Another symbol was added by Monotype Imaging to its mathematical sets in 1972 for unknown reasons and has since been included in Unicode.

The CJK Compatibility block of Unicode 1.0 contains a square form intended to represent the Japanese word for "baht", written in katakana. However, the reference glyph and the character name correspond not to the Japanese word for "baht" but to a different katakana word, derived from English "parts". Subsequent versions of the standard document this code point as "a mistaken, unused representation" and direct users to another character instead. Consequently, only a few computer fonts have any content for this code point, and its use is deprecated.
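
To software, such a code point looks like any other ideograph, which is part of why it persists once encoded. A minimal sketch, using Python's standard `unicodedata` module to inspect U+8780, the code point cited above; the helper name `describe` is hypothetical.

```python
import unicodedata


def describe(code_point):
    """Collect basic Unicode properties for a single code point."""
    ch = chr(code_point)
    return {
        "char": ch,
        "name": unicodedata.name(ch),          # algorithmic name for ideographs
        "category": unicodedata.category(ch),  # general category, e.g. "Lo"
    }


info = describe(0x8780)
print(info["char"], info["name"], info["category"])
```

Nothing in these properties marks the character as doubtful; that information lives only in the research literature.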


References

* Sasahara, Hiroyuki (2007). ''国字の位相と展開'' [Phases and Development of the National Script] (in Japanese). Sanseidō. ISBN 978-4-385-36263-2.