The writing system of the Korean language is a syllabic alphabet of character parts (''jamo'') organized into character blocks representing syllables. Unlike the letters of many Western languages, the character parts cannot simply be written from left to right on a computer. Either every possible Korean syllable has to be rendered as a syllable block by a font, or each character part has to be encoded separately. Unicode has both options; the character parts ㅎ (h) and ㅏ (a), and the combined syllable 하 (ha), are all encoded.
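
The two options can be observed with a minimal Python sketch that uses only the standard library's unicodedata module. One assumption to note: the separately encoded parts shown here are the conjoining jamo (U+1112 and U+1161) that Unicode normalization produces, which are different code points from the stand-alone letters ㅎ and ㅏ.

```python
import unicodedata

syllable = "하"  # precomposed syllable block, a single code point

# NFD splits the block into separately encoded (conjoining) character parts.
parts = unicodedata.normalize("NFD", syllable)
for ch in parts:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# NFC recombines the parts into the single precomposed syllable.
recomposed = unicodedata.normalize("NFC", parts)
print(len(parts), len(recomposed), recomposed == syllable)
```

Both forms represent the same text; which one a program stores is a design choice, and comparing strings reliably usually means normalizing both sides first.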


Character encoding

RFC 1557 describes ISO-2022-KR, a method for seven-bit encoding of Korean characters in email. Where eight bits are allowed, EUC-KR encoding is preferred. These two encodings combine US-ASCII (ISO 646) with the Korean standard KS X 1001:1992 (previously named KS C 5601:1987). Another character set, KPS 9566 (similar to KS X 1001), is used in North Korea.

The international Unicode standard contains special characters for the Korean language in the hangul phonetic system. Unicode supports two methods. The method used by Microsoft Windows gives each of the 11,172 syllable combinations its own code and a preformed font character. The other method encodes the letters (''jamo'') and lets the software combine them correctly. The Windows method requires more font memory but allows better shapes (preferable for documents), since it is complicated to create stylistically correct combinations on the fly.

Another possibility is stacking a sequence of medials (''jungseong'') and a sequence of finals (''jongseong'') or a Middle Korean pitch mark (if needed) on top of the sequence of initials (''choseong''). This works if the font has medial and final ''jamo'' with zero-width spacing that are inserted to the left of the cursor or caret, so that they appear in the right place below (or to the right of) the initial. If a syllable has a horizontal medial (ㅗ, ㅛ, ㅜ, ㅠ or ㅡ), the initial will probably appear further left in a complete syllable than in preformed syllables, due to the space that must be reserved for a vertical medial. The result is aesthetically poor, yet it may be the only way to display Middle Korean hangul text without resorting to images, romanization, replacement of obsolete jamo or non-standard encodings. However, most current fonts do not support this.

The Unicode standard has also attempted to create a unified CJK character set which can represent Chinese characters (Hanzi) and their Japanese (Kanji) and Korean (Hanja) derivatives through Han unification, which does not discriminate by language or region in rendering Chinese characters if the typographic traditions have not resulted in major differences in what a character looks like. Han unification has been criticized.
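
The relationship between the seven-bit and eight-bit encodings described above can be sketched with Python's built-in codecs. This is only an illustration: the codec names `euc_kr` and `iso2022_kr` are Python's own labels for these encodings, and the sample word 한글 is chosen here for demonstration.

```python
text = "한글"

# Eight-bit form: each hangul syllable becomes two bytes with the high bit set.
euc = text.encode("euc_kr")
print(euc.hex())

# Seven-bit form for email: the same KS X 1001 codes with the high bit
# cleared, wrapped in escape and shift control characters.
iso = text.encode("iso2022_kr")
print(iso)

# Clearing the high bit of the EUC-KR bytes gives the seven-bit payload.
seven_bit_payload = bytes(b & 0x7F for b in euc)
print(seven_bit_payload in iso, max(iso) < 0x80)

# Both encodings decode back to the same text.
print(euc.decode("euc_kr") == iso.decode("iso2022_kr") == text)
```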


Text input

On a Korean computer keyboard, text is typically entered by pressing a key for the appropriate jamo; the operating system creates each composite character on the fly. Depending on the input method editor and keyboard layout, double consonants can be entered by holding the Shift key. When all jamo making up a syllabic block have been entered, the user may initiate a conversion to hanja (or other special characters) using a keyboard shortcut or interface button; South Korean keyboards have a key for this. Subsequent semi-automated hanja conversion is supported in varying degrees by word processors.

When using a keyboard made for another language, most operating systems require the user to type with an original Korean keyboard layout; the most common is Dubeolsik. This differs from languages such as Japanese, where text can be entered on non-native keyboards with romanization. Operating systems such as Linux allow the setting engine/hangul/hangul-keyboard='ro', resulting in a romaja keyboard; typing "seonggye" results in 성계. In this configuration, ㄲ is obtained by typing "gg" rather than by using the Shift key. This allows keying "jasanGun" to obtain 자산군, instead of keying "jasangun" (which would provide 자상운).
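
How a system might build composite characters on the fly from a stream of jamo key presses can be sketched as follows. This is a deliberately simplified illustration, not the algorithm of any particular input method editor: the function name `compose_typed` is hypothetical, compound vowels and compound finals are not combined, and the three jamo tables are assumed to follow the standard Unicode ordering.

```python
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def syllable(ini, med, fin=""):
    """Map jamo to the precomposed syllable code point."""
    return chr(0xAC00 + INITIALS.index(ini) * 588
               + MEDIALS.index(med) * 28 + FINALS.index(fin))


def compose_typed(keys):
    """Group a sequence of typed jamo into syllable blocks (simplified)."""
    out = []
    ini = med = fin = ""

    def flush():
        nonlocal ini, med, fin
        # A complete block needs an initial and a medial; lone jamo pass through.
        out.append(syllable(ini, med, fin) if ini and med else ini + med)
        ini = med = fin = ""

    for k in keys:
        if k in MEDIALS:
            if ini and not med:
                med = k
            elif fin:
                # A vowel after a final: the final becomes the next initial.
                nxt, fin = fin, ""
                flush()
                ini, med = nxt, k
            else:
                flush()
                out.append(k)
        elif k in INITIALS:
            if not ini:
                ini = k
            elif med and not fin and k in FINALS:
                fin = k
            else:
                flush()
                ini = k
        else:
            # Anything that is not a jamo key ends the current block.
            flush()
            out.append(k)
    flush()
    return "".join(out)


print(compose_typed("ㅅㅓㅇㄱㅖ"))
print(compose_typed("ㅈㅏㅅㅏㄴㄱㅜㄴ"))
```

Because each jamo key is either a consonant or a vowel, the syllable boundary in the second example is unambiguous, which is the ambiguity that the capital "G" resolves in the romaja configuration.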


Korean typewriters


Before Korean division

Korean text input on computers is related to the Korean typewriters (타자기) that preceded them. It is unclear who made the first Korean typewriter; according to Jang Bong Seon, Horace Grant Underwood made a Korean typewriter during the first decade of the 20th century. Lee Won Ik, living in the United States, has been credited with developing the first Korean typewriter in 1914. In 1927, Song Ki Joo invented the first Dubeolsik typewriter in Chicago; however, it no longer exists. Song's 1934 typewriter is stored in the Hangul museum as the oldest existing Korean typewriter. These inventions led to the development of other typewriters, in 1945 by Kim Joon Sung and in 1950 by Kong Byung Woo.


After division

South Korea originally had a Nebeolsik standard, but Dubeolsik became standard in 1985.


Hanja

Some Korean fonts do not include hanja, and word processors do not allow a user to specify which font to use as a fallback for any hanja in a text; each hanja sequence must be manually formatted for a desired font.


Pitch marks and vertical text

Vertical text is supported poorly (or not at all) by HTML and most word processors. This is not an issue for modern Korean, which is usually written horizontally; until the second half of the 20th century, however, Korean was often written vertically. Fifteenth-century texts written in hangul had pitch marks to the left of syllables which are included in Unicode, although current fonts do not support them.


Programs

Programs designed for Korean language-related use include:
* Language recognition
** A North Korean speech recognition program is said to recognize 100,000 words, with a success rate of over 90 percent.
** ''Mongnan'' (Korea Computer Center, North Korea): optical character recognition software, with a reported success rate of 99 percent for printed text and 95 percent for handwriting recognition.
* Input method editors
** ''Tan'gun'' (Pyongyang Information Center, North Korea): allows hangul on English versions of Windows.
** Nalgaeset Hangul Input Method Editor (날개셋 한글 입력기; Kim Yongmook, South Korea): a hangul input method developed for the 3(se)-beolsik Windows keyboard layout.
** ''Nabi'', ''ami'' (South Korea): permit hangul on Linux.
** m17n: permits revised romanization for hangul input on Unix.
** SCIM and IBus: permit hangul and hanja input on POSIX operating systems (including Linux and BSD).
* Word processors. The following programs include domestic hangul fonts, non-hangul fonts and a hangul-hanja conversion utility.
** Hangul (Hancom, South Korea)
** Changdok (PIC, North Korea): an MS-DOS program developed in April 1990; a Windows version was developed in 1996. It has a personality-cult feature in which dedicated key combinations produce titles praising Kim Il-sung and Kim Jong-il, respectively.


Hangul in Unicode

Hangul letters are detailed in several parts of Unicode:
* Hangul Syllables (AC00–D7A3)
* Hangul Jamo (1100–11FF)
* Hangul Compatibility Jamo (3130–318F)
* Hangul Jamo Extended-A (A960–A97F)
* Hangul Jamo Extended-B (D7B0–D7FF)
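
The ranges listed above can be turned into a small lookup, sketched here in Python. The table `HANGUL_BLOCKS` and the function `hangul_block` are hypothetical names introduced for illustration, and the ranges are taken as given in the list rather than from any library.

```python
# (block name, first code point, last code point), as listed above.
HANGUL_BLOCKS = [
    ("Hangul Jamo", 0x1100, 0x11FF),
    ("Hangul Compatibility Jamo", 0x3130, 0x318F),
    ("Hangul Jamo Extended-A", 0xA960, 0xA97F),
    ("Hangul Syllables", 0xAC00, 0xD7A3),
    ("Hangul Jamo Extended-B", 0xD7B0, 0xD7FF),
]


def hangul_block(ch):
    """Return the name of the hangul range containing ch, or None."""
    cp = ord(ch)
    for name, first, last in HANGUL_BLOCKS:
        if first <= cp <= last:
            return name
    return None


for ch in ["한", "ㅎ", "\u1112", "A"]:
    print(f"U+{ord(ch):04X}", hangul_block(ch))
```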


Hangul syllables block

Pre-composed hangul syllables in the Unicode hangul syllables block are algorithmically defined with the following formula:

: [(initial) × 588 + (medial) × 28 + (final)] + 44032

Here, initial, medial and final are the zero-based positions of the jamo in the ordered lists of initial consonants, medial vowels and final consonants; a syllable without a final consonant has a final value of 0.

To find the code point of "한" in Unicode:
* The value of the initial consonant (ㅎ) is 18.
* The value of the medial vowel (ㅏ) is 0.
* The value of the final consonant (ㄴ) is 4.

Substituting these values in the formula above yields [(18 × 588) + (0 × 28) + 4] + 44032 = 54620. The Unicode value of 한 is 54620 in decimal, &#54620; as a numeric character reference, and U+D55C in hexadecimal Unicode notation.
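
The formula and its inverse can be written as a short Python sketch. The function names `compose` and `decompose` are hypothetical, and the arguments are the numeric jamo values used in the worked example rather than the jamo themselves.

```python
SYLLABLE_BASE = 44032  # 0xAC00, the first precomposed syllable


def compose(initial, medial, final=0):
    """Apply the formula: initial * 588 + medial * 28 + final + 44032."""
    return initial * 588 + medial * 28 + final + SYLLABLE_BASE


def decompose(code_point):
    """Invert the formula, returning (initial, medial, final)."""
    offset = code_point - SYLLABLE_BASE
    initial, rest = divmod(offset, 588)
    medial, final = divmod(rest, 28)
    return initial, medial, final


cp = compose(18, 0, 4)
print(cp, f"U+{cp:04X}", chr(cp))
print(decompose(ord("한")))
```

The constants follow from the size of the lists: 28 possible final values (including none) per medial, and 21 medials × 28 = 588 combinations per initial.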


Hangul Compatibility Jamo block

The Unicode Hangul Compatibility Jamo block has been allocated for compatibility with the KS X 1001 character set. It is usually used to represent hangul without distinguishing initials and finals.
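
One visible consequence of not distinguishing initials and finals can be shown with Python's unicodedata module. This is a sketch of standard normalization behaviour on one sample string, not a recommended way of composing text: under NFKC, the compatibility letter ㄴ in this example is mapped to an initial form, so it does not attach to the preceding syllable as a final.

```python
import unicodedata

typed = "ㅎㅏㄴ"  # three compatibility jamo, position-neutral
for ch in typed:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# NFKC replaces each compatibility jamo with a positional jamo and then
# composes where it can. ㅎ and ㅏ combine, but ㄴ becomes an initial form.
folded = unicodedata.normalize("NFKC", typed)
for ch in folded:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Using the final-position jamo explicitly does yield the full syllable.
explicit = unicodedata.normalize("NFC", "\u1112\u1161\u11ab")
print(explicit, folded == explicit)
```

Software that needs real syllables from compatibility jamo therefore has to decide the position of each consonant itself, as an input method does.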


Hangul Jamo blocks

The Hangul Jamo, Hangul Jamo Extended-A and Hangul Jamo Extended-B blocks contain initial, medial and final jamo, including obsolete jamo.


Hanyang Private Use Area code

The Hangul word processor shipped with fonts from Hanyang Information and Communication, which map obsolete hangul characters to Unicode's Private Use Areas (PUAs). Despite the use of PUAs instead of dedicated code points, Hanyang's mapping was the most popular way to represent obsolete hangul in South Korea in 2007. With its Hangul 2010, however, Hancom deprecated Hanyang PUA code and began representing obsolete hangul characters with Unicode hangul jamo.


See also

* Japanese language and computers
* Vietnamese language and computers
* List of CJK fonts
* Chinese input methods for computers
* McCune–Reischauer
* Yale romanization of Korean
* Revised Romanization of Korean
* New Korean Orthography


External links


Online Korean Virtual Keyboard

InputKing Online Input System, an online tool for typing Korean

An online tool for converting Korean text into various coding formats and vice versa