The writing system of the Korean language is a syllabic alphabet of character parts (''jamo'') organized into character blocks representing syllables. The character parts cannot simply be written one after another from left to right on a computer, as letters are in many Western languages. Either every possible Korean syllable must be rendered as a syllable block by a font, or each character part must be encoded separately. Unicode has both options: the character parts ㅎ (h) and ㅏ (a), and the combined syllable 하 (ha), are all encoded.
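As a rough illustration of these two options, Python's standard `unicodedata` module can convert between them: normalization form NFD splits a precomposed syllable into conjoining jamo, and NFC recombines them. The variable names are illustrative only.

```python
import unicodedata

# Conjoining jamo: U+1112 (initial h) followed by U+1161 (medial a).
jamo_sequence = "\u1112\u1161"
# The precomposed syllable block "ha", U+D558.
syllable = "\ud558"

# NFC composes the jamo pair into a single code point;
# NFD decomposes the syllable back into its jamo.
composed = unicodedata.normalize("NFC", jamo_sequence)
decomposed = unicodedata.normalize("NFD", syllable)

print(composed == syllable)
print(decomposed == jamo_sequence)
print(len(jamo_sequence), len(syllable))
```

Both spellings usually render as the same block, but they are different code point sequences, so text that mixes them generally needs to be normalized before comparison.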


Character encoding

RFC 1557 describes a method known as ISO-2022-KR for seven-bit encoding of Korean characters in email. Where eight bits are allowed, the EUC-KR encoding is preferred. Both encodings combine US-ASCII (ISO 646) with the Korean standard KS X 1001:1992 (previously named KS C 5601:1987). Another character set, KPS 9566 (similar to KS X 1001), is used in North Korea.

The international Unicode standard contains characters for the Korean language in the Hangul phonetic system, and supports two methods of encoding them. The method used by Microsoft Windows assigns each of the 11,172 syllable combinations its own code and a preformed font character. The other method encodes the letters (''jamo'') and lets the software combine them correctly. The Windows method requires more font memory but allows better shapes, which is preferable for documents, since it is complicated to create stylistically correct combinations.

Another possibility is stacking a sequence of medials (''jungseong'') and a sequence of finals (''jongseong''), or a Middle Korean pitch mark if needed, on top of the sequence of initials (''choseong''). This works if the font has medial and final ''jamo'' with zero-width spacing that are inserted to the left of the cursor or caret, so that they appear in the right place below (or to the right of) the initial. If a syllable has a horizontal medial (such as ㅗ or ㅜ), the initial will probably appear further left in the complete syllable than it does in preformed syllables, because space must be reserved for a vertical medial. The result is aesthetically poor, yet it may be the only way to display Middle Korean hangul text without resorting to images, romanization, replacement of obsolete jamo or non-standard encodings. However, most current fonts do not support this.

The Unicode standard has also attempted to create a unified CJK character set that can represent Chinese characters (hanzi) and their Japanese (kanji) and Korean (hanja) derivatives through Han unification, which does not discriminate by language or region in rendering Chinese characters if the typographic traditions have not resulted in major differences in what a character looks like. Han unification has been criticized.
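The legacy encodings described at the start of this section can be compared with a short sketch. It assumes Python's built-in `euc_kr` and `iso2022_kr` codecs and simply encodes one syllable in each; the variable names are illustrative.

```python
text = "한"

# EUC-KR: eight-bit encoding, two bytes per KS X 1001 character.
euc_kr_bytes = text.encode("euc_kr")
# ISO-2022-KR: seven-bit encoding that uses escape and shift sequences.
iso_2022_kr_bytes = text.encode("iso2022_kr")
# UTF-8 shown for comparison with the Unicode code point U+D55C.
utf8_bytes = text.encode("utf-8")

print(euc_kr_bytes.hex(" "))
print(iso_2022_kr_bytes)
print(utf8_bytes.hex(" "))

# Every ISO-2022-KR byte fits in seven bits, which is what made it
# suitable for mail transports that were not eight-bit clean.
print(all(byte < 0x80 for byte in iso_2022_kr_bytes))
```

The ISO-2022-KR output is longer than the EUC-KR output because it carries the designation and shift sequences alongside the two character bytes.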


Hangul type, Korean typewriters

While it is unclear which was the first Korean typewriter (한글 타자기, ''Hangeul tajagi''), the first ''moa-sseugi'' style typewriter (모아쓰기, the form of Hangul in which consonants and vowels are grouped into syllable blocks, and the standard form of Hangul used today) is thought to have been invented in 1914 by the Korean-American ''gyopo'' Lee Won-Ik (이원익), who modified the type of a Smith Premier 10 typewriter into Hangul. Alongside Lee Won-Ik's machine, Horace Grant Underwood's Hangul type, patented in the US in 1913 (''the Underwood''), and the Hangul type of another Korean-American, Kim Jun-Sung, are also brought up when discussing the first ''moa-sseugi'' type. In 1929, the first Dubeolsik typewriter was made by Song Ki-Ju, a student studying abroad in the US; it gained attention from the ''Dong-a Ilbo'' but no longer exists. In 1934 he showcased another type, a modification of the ''Underwood portable''. Song's 1934 typewriter is stored in the Hangul museum as the oldest existing Korean typewriter. The invention led to the development of other typewriters, in 1945 by Kim Jun-Sung and in 1950 by Kong Byung-Woo. In 1949, the eye doctor Kong Byung-Woo made the first practical Hangul type able to write both in ''moa-sseugi'' and horizontally.


Modern text input

On a Korean computer keyboard, text is typically entered by pressing a key for the appropriate jamo; the operating system creates each composite character on the fly. Depending on the input method editor and keyboard layout, double consonants can be entered by holding the Shift key. When all the jamo making up a syllabic block have been entered, the user may initiate a conversion to hanja (or other special characters) using a keyboard shortcut or interface button; South Korean keyboards have a key for this. Subsequent semi-automated hanja conversion is supported to varying degrees by word processors.

When using a keyboard made for another language, most operating systems require the user to type with an original Korean keyboard layout; the most common is Dubeolsik. In other languages, such as Japanese, text can be entered on non-native keyboards with romanization. Operating systems such as Linux allow the setting engine/hangul/hangul-keyboard='ro', which results in a ''romaja'' keyboard; typing "seonggye" results in 성계. In this configuration, ㄲ is obtained by typing "gg" rather than with the Shift key. This allows keying "jasanGun" to obtain 자산군, instead of keying "jasangun" (which would provide 자상운).
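The on-the-fly composition described at the start of this section can be sketched for a single syllable. This is a simplified illustration, not how any particular input method editor is implemented: it assumes the common Dubeolsik key assignments, handles only one syllable with at most one single-key final, and the function name `compose_keys` is hypothetical.

```python
# Common Dubeolsik assignments for a QWERTY keyboard (assumed here).
DUBEOLSIK = {
    "q": "ㅂ", "w": "ㅈ", "e": "ㄷ", "r": "ㄱ", "t": "ㅅ",
    "y": "ㅛ", "u": "ㅕ", "i": "ㅑ", "o": "ㅐ", "p": "ㅔ",
    "a": "ㅁ", "s": "ㄴ", "d": "ㅇ", "f": "ㄹ", "g": "ㅎ",
    "h": "ㅗ", "j": "ㅓ", "k": "ㅏ", "l": "ㅣ",
    "z": "ㅋ", "x": "ㅌ", "c": "ㅊ", "v": "ㅍ",
    "b": "ㅠ", "n": "ㅜ", "m": "ㅡ",
    # Shifted keys give the double consonants and two extra vowels.
    "Q": "ㅃ", "W": "ㅉ", "E": "ㄸ", "R": "ㄲ", "T": "ㅆ",
    "O": "ㅒ", "P": "ㅖ",
}

# Jamo in the order used by the Unicode Hangul Syllables block.
INITIALS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
MEDIALS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def compose_keys(keys: str) -> str:
    """Turn two or three keystrokes into one precomposed syllable."""
    jamo = [DUBEOLSIK[key] for key in keys]
    if len(jamo) not in (2, 3):
        raise ValueError("expected initial + medial, optionally a final")
    initial = INITIALS.index(jamo[0])
    medial = MEDIALS.index(jamo[1])
    final = FINALS.index(jamo[2]) if len(jamo) == 3 else 0
    return chr(0xAC00 + initial * 588 + medial * 28 + final)


print(compose_keys("gks"))  # keys for ㅎ, ㅏ, ㄴ
print(compose_keys("Rk"))   # Shift+r gives the double consonant ㄲ
```

A real input method also has to decide where one syllable ends and the next begins, and to combine compound vowels and finals typed as two keys, which this sketch leaves out.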


Before Korean division

Korean text input is related to the Korean typewriters that preceded computers. According to Jang Bong Seon, Horace Grant Underwood made a Korean typewriter during the first decade of the 20th century. In 1927, Song Ki Joo invented the first Dubeolsik typewriter in Chicago.


After division

South Korea originally had a Nebeolsik standard, but Dubeolsik became standard in 1985.


Hanja

Some Korean fonts do not include hanja, and word processors do not allow a user to specify which font to use as a fallback for any hanja in a text; each hanja sequence must be manually formatted for a desired font.


Pitch marks and vertical text

Vertical text is supported poorly (or not at all) by HTML and most word processors. This is not an issue for modern Korean, which is usually written horizontally; until the second half of the 20th century, however, Korean was often written vertically. Fifteenth-century texts written in hangul had pitch marks to the left of syllables. These marks are included in Unicode, although current fonts do not support them.


Programs

Programs designed for Korean language-related use include:
* Language recognition
** A North Korean speech recognition program is said to recognize 100,000 words, with a success rate of over 90 percent.
** ''Mongnan'' (Korea Computer Center, North Korea): optical character recognition software, with a reported success rate of 99 percent for printed text and 95 percent for handwriting recognition.
* Input method editors
** ''Tan'gun'' (Pyongyang Information Center, North Korea): allows hangul on English versions of Windows.
** Korean IME (Microsoft): allows hangul on all versions of Windows.
** Nalgaeset Hangul Input Method Editor (날개셋 한글 입력기; Kim Yongmook, South Korea): a Hangul input method developed for the sebeolsik (3-set style) Windows keyboard layout.
** ''Nabi'', ''ami'' (South Korea): permit hangul on Linux.
** m17n: permits revised romanization for hangul input on Unix.
** SCIM and IBus: permit Hangul and hanja input on POSIX operating systems (including Linux and BSD).
* Word processors. The following programs include domestic Hangul fonts, non-Hangul fonts and a Hangul-hanja conversion utility.
** Hangul (Hancom, South Korea)
** Changdok (PIC, North Korea): an MS-DOS program developed in April 1990; a Windows version was developed in 1996. It has a personality-cult feature in which dedicated key presses produce titles praising Kim Il Sung and Kim Jong Il, respectively.


Hangul in Unicode

Hangul letters are detailed in several parts of Unicode:
* Hangul Syllables (AC00–D7A3)
* Hangul Jamo (1100–11FF)
* Hangul Compatibility Jamo (3130–318F)
* Hangul Jamo Extended-A (A960–A97F)
* Hangul Jamo Extended-B (D7B0–D7FF)
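The ranges listed above can be turned into a small lookup. The sketch below uses the ranges exactly as given in this section; the table name `HANGUL_BLOCKS` and the function `hangul_block` are illustrative and not part of any library.

```python
from typing import Optional

# (block name, first code point, last code point), as listed in this section.
HANGUL_BLOCKS = [
    ("Hangul Syllables", 0xAC00, 0xD7A3),
    ("Hangul Jamo", 0x1100, 0x11FF),
    ("Hangul Compatibility Jamo", 0x3130, 0x318F),
    ("Hangul Jamo Extended-A", 0xA960, 0xA97F),
    ("Hangul Jamo Extended-B", 0xD7B0, 0xD7FF),
]


def hangul_block(char: str) -> Optional[str]:
    """Return the name of the Hangul block containing char, or None."""
    code_point = ord(char)
    for name, first, last in HANGUL_BLOCKS:
        if first <= code_point <= last:
            return name
    return None


for sample in ("한", "ㅎ", "\u1112", "A"):
    print(f"U+{ord(sample):04X}", hangul_block(sample))
```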


Hangul Syllables block

Pre-composed Hangul syllables in the Unicode Hangul Syllables block are algorithmically defined with the following formula:
: (initial × 588) + (medial × 28) + final + 44032
Here initial, medial and final are the zero-based positions of the jamo in the standard lists of initial consonants, medial vowels and final consonants; a final of 0 means that the syllable has no final consonant.

To find the code point of "한" in Unicode:
* The value of the initial consonant (ㅎ) is 18.
* The value of the medial vowel (ㅏ) is 0.
* The value of the final consonant (ㄴ) is 4.
Substituting these values in the formula above yields (18 × 588) + (0 × 28) + 4 + 44032 = 54620. The Unicode value of 한 is 54620 in decimal, &#54620; as a numeric character reference, and U+D55C in hexadecimal Unicode notation.
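The formula and the worked example can be checked with a minimal sketch. The function names `compose` and `decompose` are illustrative; the constants 588, 28 and 44032 come from the formula in this section, and the index ranges (19 initials, 21 medials, 28 finals including "no final") are the ones that give the 11,172 syllables of the block.

```python
SYLLABLE_BASE = 44032             # 0xAC00, the first precomposed syllable
SYLLABLE_COUNT = 19 * 21 * 28     # initials x medials x finals = 11,172


def compose(initial: int, medial: int, final: int = 0) -> str:
    """Apply the formula to zero-based jamo indices."""
    return chr(initial * 588 + medial * 28 + final + SYLLABLE_BASE)


def decompose(syllable: str) -> tuple:
    """Invert the formula: recover (initial, medial, final) indices."""
    offset = ord(syllable) - SYLLABLE_BASE
    if not 0 <= offset < SYLLABLE_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    initial, remainder = divmod(offset, 588)
    medial, final = divmod(remainder, 28)
    return initial, medial, final


han = compose(18, 0, 4)
print(han, ord(han), f"U+{ord(han):04X}", f"&#{ord(han)};")
print(decompose("한"))
```

Because 588 = 21 × 28, integer division recovers each index exactly, which is why the block can be mapped to jamo sequences without a lookup table.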


Hangul Compatibility Jamo block

The Unicode Hangul Compatibility Jamo block has been allocated for compatibility with the KS X 1001 character set. It is usually used to represent hangul without distinguishing initials and finals.
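The lack of an initial/final distinction can be seen in the character names. The sketch below uses Python's standard `unicodedata` module to compare the compatibility letter ㄴ with the two position-specific forms of the same consonant in the Hangul Jamo block; the variable names are illustrative.

```python
import unicodedata

compat_nieun = "\u3134"    # Hangul Compatibility Jamo block
initial_nieun = "\u1102"   # Hangul Jamo block, initial position
final_nieun = "\u11ab"     # Hangul Jamo block, final position

for char in (compat_nieun, initial_nieun, final_nieun):
    print(f"U+{ord(char):04X}", unicodedata.name(char))

# Compatibility normalization (NFKC) maps the compatibility letter
# to the initial-position jamo, so the positional role is a guess
# made by the normalization data, not information stored in the text.
print(unicodedata.normalize("NFKC", compat_nieun) == initial_nieun)
```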


Hangul Jamo blocks

The Hangul Jamo, Hangul Jamo Extended-A and Hangul Jamo Extended-B blocks contain initial, medial and final jamo, including obsolete jamo.


Hanyang Private Use Area code

The Hangul word processor shipped with fonts from Hanyang Information and Communication, which map obsolete Hangul characters to Unicode's Private Use Areas (PUAs). Despite the use of PUAs instead of dedicated code points, Hanyang's mapping was the most popular way to represent obsolete Hangul in South Korea in 2007. With its Hangul 2010, however, Hancom deprecated Hanyang PUA code and began representing obsolete Hangul characters with Unicode Hangul jamo.
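Text that still relies on a Private Use Area mapping can at least be detected, since Unicode assigns private-use code points the general category `Co`. The sketch below only locates such characters; it does not know the Hanyang mapping itself, and the function name `find_private_use` and the sample string are hypothetical.

```python
import unicodedata


def find_private_use(text: str) -> list:
    """Return (index, code point) pairs for private-use characters."""
    return [
        (index, f"U+{ord(char):04X}")
        for index, char in enumerate(text)
        if unicodedata.category(char) == "Co"
    ]


# A made-up sample: one private-use code point between two syllables.
sample = "한\ue0bc글"
print(find_private_use(sample))
```

Converting the characters found this way into standard Hangul jamo would require the vendor's mapping table, which is outside the scope of this sketch.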


See also

* Japanese language and computers
* Vietnamese language and computers
* List of CJK fonts
* Chinese input methods for computers
* McCune–Reischauer
* Yale romanization of Korean
* Revised Romanization of Korean
* New Korean Orthography


External links


* Online Korean Virtual Keyboard
* InputKing Online Input System, an online tool for typing Korean