A Unicode block is one of several contiguous ranges of numeric character codes (code points) of the Unicode character set that are defined by the Unicode Consortium for administrative and documentation purposes. Typically, proposals such as the addition of new glyphs are discussed and evaluated by considering the relevant block or blocks as a whole. Each block is generally, but not always, meant to supply glyphs used by one or more specific languages, or in some general application area such as mathematics, surveying, decorative typesetting, social forums, etc.


Design and implementation

Unicode blocks are identified by unique names, which use only ASCII characters and are usually descriptive, in English, of the nature of the symbols; examples are "Tibetan" and "Supplemental Arrows-A". When comparing block names, one is supposed to equate uppercase with lowercase letters and to ignore any whitespace, hyphens, and underscores, so the last name is equivalent to "supplemental_arrows__a" and "SUPPLEMENTALARROWSA".

Blocks are pairwise disjoint; that is, they do not overlap. The starting code point and the size (number of code points) of each block are always multiples of 16; therefore, in hexadecimal notation, the starting (smallest) point is U+xxx0 and the ending (largest) point is U+yyyF, where xxx and yyy are three or more hexadecimal digits. (These constraints are intended to simplify the display of glyphs in Unicode Consortium documents, as tables with 16 columns labeled with the last hexadecimal digit of the code point.) The size of a block may range from a minimum of 16 to a maximum of 65,536 code points.

Every assigned code point has a property called "Block", whose value is a character string naming the unique block that owns that point. However, a block may also contain unassigned code points, usually reserved for future additions of characters that "logically" should belong to that block. Code points not belonging to any of the named blocks, e.g. in the unassigned planes 4–13, have the value block="No_Block".
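The name-matching rule and the multiple-of-16 constraint described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `normalize_block_name`, `is_valid_block_range`, and `block_of` are hypothetical, and `SAMPLE_BLOCKS` is a hand-written excerpt of three blocks rather than the full list published by the Unicode Consortium.

```python
import re
from typing import Optional

# Hand-written sample of (first code point, last code point, block name).
# Assumption: a real program would load the complete block list instead.
SAMPLE_BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0F00, 0x0FFF, "Tibetan"),
    (0x27F0, 0x27FF, "Supplemental Arrows-A"),
]

def normalize_block_name(name: str) -> str:
    """Fold case and drop whitespace, hyphens, and underscores."""
    return re.sub(r"[\s\-_]+", "", name).lower()

def is_valid_block_range(start: int, end: int) -> bool:
    """Check the constraints stated in the text: the start and the size are
    multiples of 16, and the size lies between 16 and 65,536 code points."""
    size = end - start + 1
    return start % 16 == 0 and size % 16 == 0 and 16 <= size <= 65536

def block_of(code_point: int) -> Optional[str]:
    """Return the block name from the sample table, or None if the code
    point is not covered by this (deliberately incomplete) sample."""
    for start, end, name in SAMPLE_BLOCKS:
        if start <= code_point <= end:
            return name
    return None

if __name__ == "__main__":
    print(normalize_block_name("Supplemental Arrows-A"))
    print(block_of(0x0F40))
```

Because blocks never overlap, a lookup over a sorted table could also use binary search; the linear scan here is kept for readability.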


Other classifications

Each Unicode code point also has a property called "General Category", which attempts to describe the role of the corresponding symbol in the languages or applications for whose sake it was included in the system. Examples of General Categories are "Lu" (uppercase letter), "Nd" (decimal digit), "Pi" (initial quote punctuation), and "Mn" (non-spacing mark, i.e. a diacritic for the preceding glyph). This division is completely independent of blocks: the code points with a given General Category generally span many blocks, and do not have to be consecutive, not even within each block.

Each code point also has a script property, specifying which writing system it is intended for, or whether it is intended for multiple writing systems. This, too, is independent of block.

In descriptions of the Unicode system, a block may be subdivided into more specific subgroups, such as the "Chess symbols" in the Miscellaneous Symbols block (not to be confused with the separate Chess Symbols block). Those subgroups are not "blocks" in the technical sense used by the Unicode Consortium, and are named only for the convenience of users.
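The independence of General Category and block can be seen with Python's standard `unicodedata` module, which exposes the General Category of a character through `unicodedata.category`. The module does not expose the Block property, so the block names in `SAMPLES` are written in by hand, and the helper `categories_by_block` is a hypothetical name used only for this sketch.

```python
import unicodedata

# (character, block) pairs; the block names are hand-written annotations,
# not values returned by the standard library.
SAMPLES = [
    ("A", "Basic Latin"),
    ("7", "Basic Latin"),
    ("\u00c9", "Latin-1 Supplement"),           # LATIN CAPITAL LETTER E WITH ACUTE
    ("\u00ab", "Latin-1 Supplement"),           # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
    ("\u0391", "Greek and Coptic"),             # GREEK CAPITAL LETTER ALPHA
    ("\u0301", "Combining Diacritical Marks"),  # COMBINING ACUTE ACCENT
]

def categories_by_block(samples):
    """Group the General Category of each sample character by its block."""
    result = {}
    for ch, block in samples:
        result.setdefault(block, set()).add(unicodedata.category(ch))
    return result

if __name__ == "__main__":
    for block, cats in categories_by_block(SAMPLES).items():
        print(block, sorted(cats))
```

In this sample the category "Lu" turns up in three different blocks, while the Basic Latin block alone contributes both "Lu" and "Nd".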


List of blocks

Unicode 15.0 defines 327 blocks:
* 164 in plane 0, the Basic Multilingual Plane
* 151 in plane 1, the Supplementary Multilingual Plane
* 6 in plane 2, the Supplementary Ideographic Plane
* 2 in plane 3, the Tertiary Ideographic Plane
* 2 in plane 14 (E in hexadecimal), the Supplementary Special-purpose Plane
* One each in planes 15 (F in hexadecimal) and 16 (10 in hexadecimal), called Supplementary Private Use Area-A and -B
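As a quick check of the arithmetic, the per-plane counts above can be summed, and the plane of a code point can be computed from its numeric value. The sketch below assumes the usual layout of 17 planes of 65,536 code points each; `BLOCKS_PER_PLANE` simply transcribes the figures listed above, and `plane_of` is a hypothetical helper name.

```python
# Block counts per plane for Unicode 15.0, transcribed from the list above.
BLOCKS_PER_PLANE = {0: 164, 1: 151, 2: 6, 3: 2, 14: 2, 15: 1, 16: 1}

def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) of a code point.

    Each plane holds 0x10000 (65,536) code points, so the plane is the
    code point's value shifted right by 16 bits.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    return code_point >> 16

if __name__ == "__main__":
    print(sum(BLOCKS_PER_PLANE.values()))
    print(plane_of(0x1F600))
```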


Deleted blocks

The Unicode Stability Policy requires that a character, once assigned, may not be moved or removed, although it may be deprecated. This applies to Unicode 2.0 and all subsequent versions. Prior to this, the following former blocks were removed:


External links


Official web site of the Unicode Consortium (English)