Volume 1 of the Association of Radio Industries and Businesses (ARIB) STD-B24 standard for Broadcast Markup Language specifies, amongst other details, a character encoding for use in Japanese-language broadcasting. The latest revision is version 6.3. The encoding includes a number of characters not found in the base standards (JIS X 0208 and JIS X 0201). It was the source standard for many symbol characters which were added to Unicode, including portions of the Miscellaneous Symbols, Enclosed Alphanumeric Supplement and Enclosed Ideographic Supplement blocks. Its contributions partially overlap the Unicode emoji, but were added a year earlier, in Unicode 5.2.

Fascicle 1 of the ARIB STD-B62 standard, published in 2014, defines Unicode mappings for a selection of the B24 extended characters (excluding, for example, those duplicated by JIS X 0213), as well as a few extended Kanji. It also includes a mapping of utilised characters outside the Basic Multilingual Plane to the BMP's private use area.


Sets and codes

The ARIB STD-B24 standard defines multiple character sets and a method of switching between them. These include a Kanji set (an extension of JIS X 0208), an Alphanumeric set, a Hiragana set, Katakana sets of two distinct layouts and four mosaic sets. The sets are selected using ISO 2022 mechanisms for 94-sets, each set being identified by its own code (proportional sets use the same layout as the corresponding non-proportional ones).
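As a rough illustration of how ISO 2022 designates a 94-set, the sketch below builds the generic escape sequence for a given final byte. The function name `designate_94` is hypothetical, and the final bytes in the usage note are the well-known ISO-2022-JP ones, not values taken from the B24 code table. B24 uses a modified ISO 2022 encoding, so its actual sequences may differ.

```python
ESC = 0x1B

# ISO/IEC 2022 intermediate bytes that pick the target element (G0..G3)
# when designating a 94-character set.
_G_INTERMEDIATE_94 = {0: 0x28, 1: 0x29, 2: 0x2A, 3: 0x2B}


def designate_94(final_byte: int, g: int = 0, multibyte: bool = False) -> bytes:
    """Build a generic ISO 2022 designation sequence for a 94-set.

    final_byte identifies the character set; g selects G0..G3;
    multibyte=True inserts 0x24 for a multi-byte (e.g. 94x94) set.
    """
    if g not in _G_INTERMEDIATE_94:
        raise ValueError("g must be 0, 1, 2 or 3")
    if not 0x30 <= final_byte <= 0x7E:
        raise ValueError("final byte must be in 0x30..0x7E")
    seq = [ESC]
    if multibyte:
        seq.append(0x24)
    seq.append(_G_INTERMEDIATE_94[g])
    seq.append(final_byte)
    return bytes(seq)
```

For example, with the ISO-2022-JP final bytes, `designate_94(0x4A)` yields `ESC ( J` (JIS X 0201 Roman to G0) and `designate_94(0x42, multibyte=True)` yields `ESC $ ( B` (JIS X 0208 to G0, in the general form rather than the legacy short form `ESC $ B`).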


Code charts


Kanji (double-byte) set

This is a double-byte character set extending JIS X 0208.


Lead byte

The lead byte is the row number plus 0x20 (32 in decimal), and the continuation byte is the cell number plus 0x20 (see below). Hence, the row with lead byte 0x21 has a row number of 1, and its cell 1 has a continuation byte of 0x21 (or 33), and so forth. Most of the code corresponds to JIS X 0208.
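The row/cell arithmetic described above can be sketched as follows. This is a minimal illustration of the offset rule only; the function names are hypothetical, and no mapping to actual characters is attempted.

```python
def rowcell_to_bytes(row: int, cell: int) -> bytes:
    """Encode a (row, cell) pair as a lead byte and a continuation byte."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must be in 1..94")
    # Each byte is the row or cell number plus 0x20.
    return bytes([row + 0x20, cell + 0x20])


def bytes_to_rowcell(data: bytes) -> tuple[int, int]:
    """Decode a two-byte sequence back into its (row, cell) pair."""
    if len(data) != 2:
        raise ValueError("expected exactly two bytes")
    lead, trail = data
    if not (0x21 <= lead <= 0x7E and 0x21 <= trail <= 0x7E):
        raise ValueError("bytes must be in 0x21..0x7E")
    return lead - 0x20, trail - 0x20
```

For instance, row 90 gives a lead byte of 0x7A, which matches the heading for the traffic symbols row, and `bytes_to_rowcell(b"\x7e\x7e")` returns `(94, 94)`.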


Character sets 0x21-0x74 (row numbers 1-84: punctuation, alphabets, numbers, Kana, Kanji)


Character set 0x7A (row number 90, traffic symbols)

Characters 90-45 through 90-63 and 90-66 through 90-84 (shown below shaded) are listed in the B24 standard only in table 7-10 (the list of extension characters), and are also the only characters in rows 90 through 91 which are not transport-related symbols; this is noted in the B24 standard in an endnote to table 7-10. The remainder of the extensions are listed in both table 7-4 (the double-byte code chart) and table 7-10.


Character set 0x7B (row number 91, map symbols)

Characters from ARIB STD-B24 which were not retained in ARIB STD-B62 are shown shaded.


Character set 0x7C (row number 92, units, enclosed forms, list markers, arrows)

Characters from ARIB STD-B24 which were not retained in ARIB STD-B62 are shown shaded.


Character set 0x7D (row number 93, game and weather symbols, fractions, units, enclosed forms)

Characters from ARIB STD-B24 which were not retained in ARIB STD-B62 are shown shaded.


Character set 0x7E (row number 94, list markers)

Characters from ARIB STD-B24 which were not retained in ARIB STD-B62 are shown shaded.


Single-byte sets


Alphanumeric set


Hiragana set


Katakana set


JIS X 0201 Katakana set


Mosaic sets


Shift_JIS variant

In addition to the modified ISO 2022 encoding, the B24 standard also specifies a Shift JIS encoding following JIS X 0208:1997, but with the addition of the extended characters in the kanji set.
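For orientation, the standard Shift JIS transformation from a row/cell pair to two bytes can be sketched as below. The function name `rowcell_to_sjis` is hypothetical. The sketch also assumes that the extension rows (90 through 94) are placed by the same arithmetic as ordinary JIS X 0208 rows, which is not confirmed here.

```python
def rowcell_to_sjis(row: int, cell: int) -> bytes:
    """Convert a (row, cell) pair to Shift JIS bytes (standard algorithm)."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must be in 1..94")
    # Two rows share one lead byte; lead bytes skip the 0xA0..0xDF range.
    lead = (row + 257) // 2 if row <= 62 else (row + 385) // 2
    if row % 2 == 1:
        # Odd rows use trail bytes 0x40..0x9E, skipping 0x7F.
        trail = cell + 63 if cell <= 63 else cell + 64
    else:
        # Even rows use trail bytes 0x9F..0xFC.
        trail = cell + 158
    return bytes([lead, trail])
```

Under that assumption, row 90 would share the lead byte 0xED with row 89 and use the upper trail-byte range, so `rowcell_to_sjis(90, 1)` gives `0xED 0x9F`.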



External links


Official changelog for ARIB STD-B24

