The character sets used by Videotex are based, to greater or lesser extents, on ISO/IEC 2022. Three Data Syntax systems are defined by ITU T.101, corresponding to the Videotex systems of different countries.

Data Syntax 1

Data Syntax 1 is defined in Annex B of T.101:1994. It is based on the CAPTAIN system used in Japan. Its graphical sets include JIS X 0201 and JIS X 0208. The following G-sets are available through ISO/IEC 2022-based designation escapes:
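As a rough illustration of how designation escapes are formed under ISO/IEC 2022, the sketch below builds the escape sequence that designates a 94-character set to G0 through G3. The function name `designate_94` is hypothetical, and the final bytes in the usage note are the ones registered for JIS X 0201 and JIS X 0208 in the ISO-IR registry. Whether Data Syntax 1 uses exactly these final bytes is an assumption, not something stated here.

```python
ESC = 0x1B

# ISO/IEC 2022 intermediate bytes for designating a 94-character set
# to G0, G1, G2 and G3 respectively.
_INTERMEDIATE_94 = {0: 0x28, 1: 0x29, 2: 0x2A, 3: 0x2B}


def designate_94(g: int, final: str, multibyte: bool = False) -> bytes:
    """Build an ISO/IEC 2022 escape designating a 94-set to G0..G3.

    `final` is the registered final byte of the set (one character in
    the range 0x30..0x7E). For multi-byte sets, 0x24 ('$') is inserted
    after ESC. Older registrations also allow a short G0 form without
    the intermediate byte; this sketch always emits the long form.
    """
    if g not in _INTERMEDIATE_94:
        raise ValueError("g must be 0, 1, 2 or 3")
    if len(final) != 1 or not 0x30 <= ord(final) <= 0x7E:
        raise ValueError("final byte must be a single character in 0x30..0x7E")
    seq = [ESC]
    if multibyte:
        seq.append(0x24)
    seq.append(_INTERMEDIATE_94[g])
    seq.append(ord(final))
    return bytes(seq)
```

For example, `designate_94(0, "J")` yields the sequence registered for JIS X 0201 Roman in G0, `designate_94(1, "I")` the one for JIS X 0201 Katakana in G1, and `designate_94(0, "B", multibyte=True)` the long form for JIS X 0208 in G0.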

Mosaic sets for Data Syntax 1

The mosaic sets supply characters for use in semigraphics.

Data Syntax 2

Data Syntax 2 is defined in Annex C of T.101:1994. It corresponds to some European Videotex systems such as CEPT T/CD 06-01. The graphical character coding of Data Syntax 2 is based on T.51. The default G2 set of Data Syntax 2 is based on an older version of T.51, lacking the non-breaking space, soft hyphen, not sign (¬) and broken bar (¦) present in the current version, but adding a dialytika tonos (΅; the combining form is U+0344) at the beginning of the row of diacritical marks for combination with codes from a Greek primary set. An umlaut diacritic code distinct from the diaeresis code, as included in some versions of T.61, is also sometimes included.

The default G1 set is the second mosaic set, corresponding roughly to the second mosaic set of Data Syntax 1. The default G3 set is the third mosaic set, matching the first mosaic set of Data Syntax 1 for 0x60 through 0x6D and 0x70 through 0x7D, and otherwise differing. The first mosaic set matches the second except for 0x40 through 0x5E: 0x40 through 0x5A follow ASCII (supplying uppercase letters), whereas the remainder are national variant characters; the displaced full block is placed at 0x7F.

* Representation of 0x5B through 0x5E is not guaranteed in international communication and may be replaced by national application oriented variants.
* 0x5F may be displayed either as ⌗ (square) or _ (lower bar) to represent the terminator function required by Videotex services.
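The code ranges described above for the Data Syntax 2 mosaic sets can be summarised in a small classifier. This is only a sketch of the prose description: the function names and the region labels are hypothetical, and the accepted range of 0x20 through 0x7F is an assumption made for illustration.

```python
def first_mosaic_region(code: int) -> str:
    """Classify a code position of the Data Syntax 2 first mosaic set.

    Labels are illustrative only:
      "ascii"             0x40..0x5A, which follow ASCII
      "national-variant"  0x5B..0x5E, national variant characters
      "full-block"        0x7F, where the displaced full block is placed
      "as-second-mosaic"  everything else, shared with the second mosaic set
    """
    if not 0x20 <= code <= 0x7F:  # assumed range for this sketch
        raise ValueError("code outside 0x20..0x7F")
    if 0x40 <= code <= 0x5A:
        return "ascii"
    if 0x5B <= code <= 0x5E:
        return "national-variant"
    if code == 0x7F:
        return "full-block"
    return "as-second-mosaic"


def third_mosaic_matches_ds1_first(code: int) -> bool:
    """True where the third mosaic set matches the Data Syntax 1 first mosaic set."""
    return 0x60 <= code <= 0x6D or 0x70 <= code <= 0x7D
```

For instance, `first_mosaic_region(0x41)` reports the ASCII region, while `first_mosaic_region(0x5C)` falls in the national variant range.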

Data Syntax 3

Data Syntax 3 is defined in Annex D of T.101:1994. The graphical character coding of Data Syntax 3 is based on T.51. The supplementary set for Data Syntax 3 is based on an older version of T.51, lacking the non-breaking space, not sign (¬) and broken bar (¦) present in the current version, and allocating non-spacing marks for a "vector overbar" and solidus and several semigraphic characters to unallocated space in that set. See the comments in the T.51 article for caveats about the combining mark Unicode mappings shown below. Unlike Unicode combining characters, T.51 diacritic codes precede the base character.
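The ordering difference matters when converting to Unicode: a diacritic code that arrives before its base letter has to be emitted after it. The sketch below shows that reordering for a small, assumed subset of diacritic codes (grave, acute, circumflex and diaeresis at 0xC1, 0xC2, 0xC3 and 0xC8, following the usual ISO 6937-style layout). The mapping, the function name `reorder_diacritics`, and the treatment of every byte below 0x80 as ASCII are simplifications for illustration, not a full Data Syntax 3 decoder.

```python
import unicodedata

# Assumed, partial mapping from non-spacing diacritic bytes to Unicode
# combining marks. Illustrative only; not the complete T.51 row.
DIACRITICS = {
    0xC1: "\u0300",  # combining grave accent
    0xC2: "\u0301",  # combining acute accent
    0xC3: "\u0302",  # combining circumflex accent
    0xC8: "\u0308",  # combining diaeresis
}


def reorder_diacritics(data: bytes) -> str:
    """Move each diacritic after the base character that follows it."""
    out = []
    pending = ""  # combining marks waiting for their base character
    for byte in data:
        if byte in DIACRITICS:
            pending += DIACRITICS[byte]
        elif byte < 0x80:
            # Simplification: treat the byte as an ASCII base character.
            out.append(chr(byte) + pending)
            pending = ""
        else:
            raise ValueError(f"unsupported byte 0x{byte:02X} in this sketch")
    if pending:
        raise ValueError("diacritic code without a following base character")
    # Compose base + combining mark into a precomposed form where one exists.
    return unicodedata.normalize("NFC", "".join(out))
```

With this mapping, the byte pair 0xC2 0x65 (acute, then "e") becomes the single character "é".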

C0 control codes

C0 control codes for Videotex differ from ASCII as shown in the table below. Several other codes, including SO (LS1) and SI (LS0), are also available in some or all data syntaxes, but without change in name or semantics from ASCII.
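The LS1 and LS0 codes are the ISO/IEC 2022 locking shifts: LS1 (0x0E) invokes the G1 set into the 7-bit graphic area and LS0 (0x0F) restores G0. A minimal sketch of tracking that state follows. The function name `tag_gl_sets` is hypothetical, and the sketch ignores escape sequences, single shifts and every other control code.

```python
LS1 = 0x0E  # Shift Out: invoke G1 into GL
LS0 = 0x0F  # Shift In: invoke G0 into GL


def tag_gl_sets(data: bytes) -> list[tuple[str, int]]:
    """Pair each graphic byte (0x21..0x7E) with the G-set active in GL.

    Only the two locking shifts are interpreted; all other bytes outside
    0x21..0x7E are skipped in this sketch.
    """
    active = "G0"  # G0 is assumed to be invoked initially
    tagged = []
    for byte in data:
        if byte == LS1:
            active = "G1"
        elif byte == LS0:
            active = "G0"
        elif 0x21 <= byte <= 0x7E:
            tagged.append((active, byte))
    return tagged
```

The same byte value can therefore denote different characters depending on the most recent shift, which is why a decoder has to carry this state across the whole stream.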

C1 control codes

The following specialised C1 control codes are used in Videotex. There are four registered sets, with some differences between them.
