Half-width kana are katakana characters displayed compressed at half their normal width (a 1:2 aspect ratio), instead of the usual square (1:1) aspect ratio. For example, the usual (full-width) form of the katakana ''ka'' is カ while the half-width form is ｶ. Additionally, half-width hiragana is included in Unicode, and it is usable on the Web or in e-books via CSS's font-feature-settings: "hwid" 1 with Adobe-Japan1-6 based OpenType fonts. Finally, half-width kanji is usable on modern computers, and is used in some receipt printers, electric bulletin boards and old computers.

Half-width kana were used in the early days of Japanese computing, to allow Japanese characters to be displayed on the same grid as monospaced fonts of Latin characters. Half-width kanji were not used. Half-width kana characters are not generally used today, but find some use in specific settings, such as cash register displays, shop receipts, Japanese digital television and DVD subtitles, and mailing address labels. Their usage is sometimes also a stylistic choice, particularly frequent in certain Internet slang.

The term "half-width kana", which strictly refers only to how kana are ''displayed'', not how they are ''stored'', is also used loosely to refer to the A0–DF (hexadecimal) block where katakana are stored in some character encodings, such as JIS X 0201 (1969); see Encoding, below. This is formally incorrect, however: this JIS standard simply specifies that katakana ''can'' be stored in these locations, without specifying ''how'' they should be displayed. The confusion arises because in early computing the characters stored here were in fact displayed as half-width kana; see Confusion, below.

History

Half-width kana and 2/3-width kana have been in use since the pre-computer era. In the early computer era, ASCII was defined as a 7-bit character set with room for 128 characters. However, since this standard was designed for the United States, it did not contain symbols such as the yen sign (¥) needed to represent Japanese currency, nor did it include space for characters from other writing systems, such as kana or kanji; thus Japanese characters could not be ''encoded''. Further, Japanese characters, both kana and kanji, are drawn on a square grid, while Latin characters are generally written more narrowly; thus Japanese characters could not be ''displayed'' either.

JIS X 0201 was developed in 1969, a time when computers were generally incapable, both by software design and hardware resources, of representing the thousands of Chinese-based kanji characters used in Japanese. As a compromise, this standard encoded katakana (only, not hiragana or kanji) as a small set of characters, assigned in the upper byte value range of 0x80–0xFF. This allowed 8-bit processors to encode and process Japanese text phonetically (as katakana), though without being able to process hiragana or kanji. These katakana characters were in turn ''displayed'' as "half-width kana": a new, unorthodox, narrower form factor that fit the same width as the monospaced Latin letters that machines were capable of printing and displaying. Encoding-wise, JIS X 0201 is a variant extension of ASCII: it includes additional characters, and does not exactly agree with ASCII on the overlapping part (the Latin character section).

Bankbook description written in half-width kana

Half-width kana were developed as "... the first Japanese characters encoded on computers because they are used for Japanese telegrams." The Zengin System, the largest funds transfer system in Japan, was established in 1973. Transaction messages between banks could only use Latin letters, numerals, and half-width katakana within 20 characters. The system was superseded in 2018 by ZEDI (the Nationwide Banking Electronic Data Interchange System), which can handle hiragana and kanji with variable-length characters.

To make katakana fit into the narrower cell area allowed, some compromises were made. For example, the diacritical marks ''dakuten'' and ''handakuten'' are treated as separate characters instead of being part of the preceding character. This compromise led many to consider "half-width kana" visually unattractive, and it causes problems for many computer programs today.

Another use of half-width kana is to save space. The Japanese version of Windows 3.1 used both half-width and full-width katakana of MS Gothic in its user interface. The Japanese version of Windows 95 used half-width katakana of MS P Gothic in its user interface. These were replaced by the full-width kana of MS UI Gothic, present in the Japanese version of Windows 98 and later, which are a little narrower than those of MS P Gothic.
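The separate treatment of ''dakuten'' described above can be observed directly in Unicode, which carries the half-width forms over as distinct code points. The following is a minimal sketch using only the Python standard library; the helper name `describe` is illustrative and not taken from any standard.

```python
import unicodedata

# Half-width "ga": the base letter and the voiced sound mark are two code points.
half_ga = "\uff76\uff9e"   # ｶ followed by ﾞ
# Full-width "ga": a single precomposed code point.
full_ga = "\u30ac"         # ガ

def describe(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]

print(len(half_ga), describe(half_ga))
print(len(full_ga), describe(full_ga))

# NFKC compatibility normalization folds the half-width pair into the
# single full-width character, so the two spellings then compare equal.
print(unicodedata.normalize("NFKC", half_ga) == full_ga)
```

Programs that count characters, truncate fields, or compare strings without such normalization will treat the two spellings of the same syllable differently, which is one concrete form the problems mentioned above can take.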

Encoding

In the JIS X 0201 specification (1969), katakana are encoded in the A0–DF (hexadecimal) block; how they are displayed is not specified, and there is no separate encoding of full-width and half-width kana.

In JIS X 0208, katakana, hiragana, and kanji are all encoded (and displayed as full-width characters; there are no half-width characters), though the ordering of the kana is different; see the hiragana and katakana section of JIS X 0208.

In Shift JIS, which combines JIS X 0201 and JIS X 0208, these encodings (both of which can encode Latin characters and katakana) are stored separately, with the JIS X 0201 characters all being displayed as half-width (thus the JIS X 0201 katakana are displayed as half-width kana), while the JIS X 0208 characters are all displayed as full-width (thus the JIS X 0208 Latin characters are all displayed as full-width Latin characters). Thus in Shift JIS, Latin characters and katakana have two encodings with two separate display forms, half-width and full-width.

In Unicode, katakana and hiragana are primarily used as normal, full-width characters (the Katakana and Hiragana blocks are displayed as full-width characters); a separate block, the Halfwidth and Fullwidth Forms block, is used to store variant characters, including half-width kana and full-width Latin characters. Thus, the katakana in JIS X 0201 and the corresponding part of derived encodings (the JIS X 0201 part of Shift JIS) are displayed as half-width, while in Unicode half-width forms are specified separately.
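The double encoding in Shift JIS can be illustrated with Python's built-in `shift_jis` codec. This is a small sketch, not a full description of the encoding; the variable names are illustrative.

```python
# Each pair is the same letter in its two Unicode forms.
full_ka, half_ka = "\u30ab", "\uff76"   # カ and ｶ
full_a, half_a = "\uff21", "A"          # Ａ (full-width Latin A) and A

for ch in (full_ka, half_ka, full_a, half_a):
    encoded = ch.encode("shift_jis")
    # Two bytes: the JIS X 0208 part. One byte: the JIS X 0201 part.
    print(f"U+{ord(ch):04X} -> {encoded.hex()} ({len(encoded)} byte(s))")
```

The full-width forms come out as two-byte JIS X 0208 codes (`834a`, `8260`), while the half-width forms come out as single JIS X 0201 bytes (`b6`, `41`).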

Half-width table

"J" indicates the first four bits in JIS X 0201 (though see below, these do not ''necessarily'' indicate half-width) and in other sets such as Shift JIS; "U" indicates the row in Unicode in the Halfwidth and Fullwidth Forms block. Note that the blank first cell represents a non-existent character in JIS, A0, but the fullwidth right white parenthesis ｠ in Unicode, U+FF60.
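The correspondence between the "J" and "U" rows is a fixed offset: JIS X 0201 bytes A1–DF map in order onto U+FF61–U+FF9F, so rows A, B, C and D line up with Unicode rows FF6, FF7, FF8 and FF9. A minimal sketch of that arithmetic follows; the function and constant names are hypothetical, chosen for illustration.

```python
JIS_FIRST, JIS_LAST = 0xA1, 0xDF   # katakana range in JIS X 0201
UNICODE_FIRST = 0xFF61             # first half-width form that has a JIS counterpart

def jis_x_0201_to_unicode(byte):
    """Map a JIS X 0201 katakana byte to its Unicode half-width character."""
    if not JIS_FIRST <= byte <= JIS_LAST:
        raise ValueError(f"0x{byte:02X} is outside the katakana range A1-DF")
    return chr(UNICODE_FIRST + (byte - JIS_FIRST))

# Print the table one "J" row at a time; A0 is skipped because it is unassigned.
for high in (0xA, 0xB, 0xC, 0xD):
    cells = [
        jis_x_0201_to_unicode(high << 4 | low)
        for low in range(16)
        if JIS_FIRST <= (high << 4 | low) <= JIS_LAST
    ]
    print(f"J={high:X}", " ".join(cells))
```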

Half-width kana on the Internet

E-mail

Since the SMTP and NNTP protocols (used to deliver e-mail and Usenet, respectively) were formerly only able to transmit 7-bit data, it was then the convention to use ISO-2022-JP for sending e-mail in Japanese. Half-width kana are not contained in ISO-2022-JP: it includes the Roman set of JIS X 0201, and all of JIS X 0208, but not the katakana set of JIS X 0201 (which is used for half-width kana in Shift JIS, for instance). Both sets of JIS X 0201 have ISO 2022 codes, but the ISO-2022-JP profile only includes the Roman set: this means that the format for including half-width katakana in ISO-2022-JP is both well-defined and a violation of the ISO-2022-JP format. For this reason, if half-width kana were accidentally included in a message, it could become garbled during transmission (see mojibake). The WHATWG encoding standard used by HTML5 permits decoding, but not encoding, of JIS X 0201 katakana in ISO-2022-JP as an extension to the format, and converts half-width katakana to their JIS X 0208 equivalents upon encoding.

This is no longer such a problem, since most e-mail servers today support the 8BITMIME extension and hence understand 8-bit characters. Alternatively, an encoding system such as Base64 can be used and specified in the message using MIME.
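The restriction in ISO-2022-JP, and the approach of converting half-width katakana to full-width before encoding, can be sketched with Python's `iso2022_jp` codec. The helper names below are hypothetical. NFKC normalization is used here as a stand-in for a dedicated half-width-to-full-width conversion table; it is an assumption of this sketch, and it also rewrites other compatibility characters (for example, full-width Latin letters become ASCII), so it may be too aggressive for some applications.

```python
import unicodedata

def can_encode_iso2022jp(text):
    """Return True if text is representable in strict ISO-2022-JP."""
    try:
        text.encode("iso2022_jp")
        return True
    except UnicodeEncodeError:
        return False

def encode_for_mail(text):
    """Fold half-width katakana to full-width, then encode as ISO-2022-JP."""
    folded = unicodedata.normalize("NFKC", text)
    return folded.encode("iso2022_jp")

half_word = "\uff76\uff80\uff76\uff85"   # ｶﾀｶﾅ in half-width katakana

print(can_encode_iso2022jp(half_word))   # half-width kana are rejected
payload = encode_for_mail(half_word)
print(payload)                           # 7-bit bytes with escape sequences
print(payload.decode("iso2022_jp"))      # full-width カタカナ
```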

Web pages

The problem that exists in e-mail does not exist with Web pages, since HTTP accepts 8-bit characters. However, one problem that does exist is that computer programs have difficulty determining whether to treat a sequence of bytes as Shift JIS, EUC-JP, or UTF-8; hence the character encoding should be specified with an HTTP response header or a meta tag.
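The detection problem is easy to reproduce with half-width kana: the bytes that spell a word in Shift JIS are also well-formed EUC-JP, where they decode to unrelated characters. The sketch below uses Python's built-in codecs; the helper `try_decode` is a hypothetical name.

```python
def try_decode(raw, encoding):
    """Decode raw with the given encoding, or return None if it is invalid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

# "ｶﾀｶﾅ" in half-width katakana, stored as Shift JIS (one byte per character).
raw = "\uff76\uff80\uff76\uff85".encode("shift_jis")

for encoding in ("shift_jis", "euc_jp", "utf-8"):
    print(encoding, repr(try_decode(raw, encoding)))
```

Both the Shift JIS and EUC-JP interpretations succeed without error, so the bytes alone cannot tell a program which one was intended; only the declared encoding can.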

Confusion

Strictly speaking, describing the JIS X 0201 encoding as "half-width katakana" is incorrect, as the standard does not define character widths; it defines only the code representation of katakana characters. In the JIS X 0201 standard, katakana characters are printed in normal (full) width, not half-width. Half-width forms were used only during the period when displays rendered every character in a half-width cell (and single-byte encodings were used), before full-width character displays (and associated double-byte encodings such as JIS X 0208) became widespread.

However, in the Shift JIS standard, which combines the JIS X 0201 standard (whose characters, Latin and katakana, were displayed as half-width) and the JIS X 0208 standard (whose characters, katakana, hiragana, kanji, and Latin, were displayed as full-width), katakana and Latin characters are encoded twice, in both JIS X 0201 and JIS X 0208, and are displayed as half-width or full-width according to which section they are in (0201 or 0208). Thus the 0201 katakana block can be thought of as corresponding to "half-width kana", and the misunderstanding that the 0201 standard defines "half-width" characters is widespread.

Further, though JIS X 0201 is a single-byte encoding (and displayed at half-width) and JIS X 0208 is a double-byte encoding (and displayed at full-width), there is no connection between number of bytes and width (other than the correspondence in Shift JIS, as above). For example, Unicode can be encoded with four bytes per character (UTF-32) to represent both full-width and half-width characters.
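That byte count and display width are independent can be checked with the East Asian Width property that Unicode assigns to each character, exposed in Python as `unicodedata.east_asian_width`. This is a brief sketch; the sample characters are chosen for illustration.

```python
import unicodedata

samples = ["A", "\uff21", "\u30ab", "\uff76"]   # A, Ａ, カ, ｶ

for ch in samples:
    width = unicodedata.east_asian_width(ch)    # Na, F, W or H for these samples
    utf8_len = len(ch.encode("utf-8"))
    utf32_len = len(ch.encode("utf-32-be"))
    print(f"U+{ord(ch):04X} width={width} utf-8={utf8_len} utf-32={utf32_len}")
```

Full-width カ and half-width ｶ both occupy three bytes in UTF-8 and four in UTF-32, yet carry different width properties (`W` and `H`), so the width cannot be inferred from the encoded length.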

References

* Lunde, Ken. ''CJKV Information Processing''. O'Reilly, 2nd ed., 2009, pp. 224–226 (also 1st ed., 1999, pp. 144–145).