Unicode Alias Names And Abbreviations
In Unicode, each character can have a unique name. A character can also have one or more alias names. An alias name can be an abbreviation, a C0 or C1 control name, a correction, an alternate name or a figment. An alias is also unique across all names and aliases, and therefore identifies exactly one character.

Background
The formal, primary Unicode name is unique over all names, uses only a restricted set of characters and a fixed format, and is guaranteed never to change. The formal name consists of the characters A–Z (uppercase), 0–9, " " (space), and "-" (hyphen). In addition to this name, a character can have one or more formal (normative) alias names. An alias name follows the same rules as a name: it uses the same characters (A–Z, 0–9, space, hyphen) and excludes all others (a–z, %, $, etc.). Alias names are also unique within the full name set; that is, all names and alias names together contain no duplicates. Alias names are formally described in the Unicode Standard. In this sense, an abbreviation is also considered a Unicode ...
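The naming rules described above can be checked with a short sketch. It assumes Python's standard `unicodedata` module as the name database, which is a choice made here for illustration and not something the entry prescribes; the helper `has_name_syntax` is a hypothetical name. The module's `lookup()` has resolved formal aliases since Python 3.3, which is how a control character without a formal name can still be found by an alias.

```python
import re
import unicodedata

# Characters permitted in a formal name or alias, per the rules above:
# uppercase A-Z, digits 0-9, space, and hyphen.
NAME_SYNTAX = re.compile(r"[A-Z0-9 -]+")


def has_name_syntax(name: str) -> bool:
    """Return True if `name` uses only the permitted characters.

    Hypothetical helper: it checks the character set only, not the
    finer formatting rules of the standard.
    """
    return NAME_SYNTAX.fullmatch(name) is not None


# A character with a formal name.
formal = unicodedata.name("\u203b")
print(formal, has_name_syntax(formal))

# Lowercase letters are not permitted in names or aliases.
print(has_name_syntax("reference mark"))

# U+000A has no formal name in this module (the default is returned),
# but it can be found through one of its control-name aliases.
print(unicodedata.name("\n", None))
print(unicodedata.lookup("LINE FEED") == "\n")
```

Because names and aliases share one namespace with no duplicates, a single lookup function can serve both without ambiguity.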


Unicode
Unicode, formally The Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines 149,186 characters as of version 15.0, covering 161 modern and historic scripts, as well as symbols, emoji (including in colors), and non-visual control and formatting codes. Unicode's success at unifying character sets has led to its widespread and predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, and most modern programming languages. The Unicode character repertoire is synchronized with ISO/IEC 10646, each being code-for-code identical with the other. The Unicode Standard, however, includes more than just the base code. Alon ...


Unicode Character Property
The Unicode Standard assigns various properties to each Unicode character and code point. The properties can be used to handle characters (code points) in processes such as line breaking, right-to-left script direction, or applying controls. Some "character properties" are also defined for code points that have no character assigned and for code points that are labeled like "<not a character>". The character properties are described in Standard Annex #44. Properties have levels of forcefulness: normative, informative, contributory, or provisional. For simplicity of specification, a character property can be assigned by specifying a contiguous range of code points that have the same property.

Semantic elements
Properties are displayed in the following order: code;name;gc;cc;bc;decomposition;;;nv;bm;alias;;;
* 'alias' = corrected name
* 'bc' = bidi (bidirectional) category (L, R, etc.)
* 'bm' = bidi mirrored (N or Y)
* 'cc' = combining class (position of diacritic)
* decomposition = letter + diacritic, ...
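Several of the properties listed above can be read programmatically. The following is a minimal sketch that assumes Python's standard `unicodedata` module as the property source; the function name `describe` and the dictionary keys are illustrative choices, with `gc` standing for the general category.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few per-character properties into a dictionary.

    Hypothetical helper; the keys reuse the abbreviations from the
    field list above.
    """
    return {
        "code": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<no name>"),
        "gc": unicodedata.category(ch),           # general category
        "cc": unicodedata.combining(ch),          # combining class
        "bc": unicodedata.bidirectional(ch),      # bidi category
        "decomposition": unicodedata.decomposition(ch),
        "bm": "Y" if unicodedata.mirrored(ch) else "N",  # bidi mirrored
    }


# A precomposed letter, a combining diacritic, a right-to-left letter,
# and a mirrored punctuation mark.
for ch in ("\u00e9", "\u0301", "\u05d0", "("):
    print(describe(ch))
```

The precomposed letter U+00E9 reports a decomposition into a base letter plus a diacritic, which matches the "letter + diacritic" description in the list.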


ISO 6429
The C0 and C1 control code (or control character) sets define control codes for use in text by computer systems that use ASCII and derivatives of ASCII. The codes represent additional information about the text, such as the position of a cursor, an instruction to start a new line, or a message that the text has been received. C0 codes are the range 00–1F hex, and the default C0 set was originally defined in ISO 646 (ASCII). C1 codes are the range 80–9F hex, and the default C1 set was originally defined in ECMA-48 (harmonized later with ISO 6429). The ISO/IEC 2022 system of specifying control and graphic characters allows other C0 and C1 sets to be available for specialized applications, but they are rarely used.

C0 controls
ASCII defined 32 control characters, plus one necessary extra character, DEL, at 7F hex or 01111111 binary (needed to punch out all the holes on a paper tape and erase it). This large number of codes was desirable at the time, as multi ...
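The ranges given above translate directly into a classification function. This is a minimal sketch; the function name `control_class` and the returned labels are illustrative choices, not terminology from the standards.

```python
def control_class(code_point: int) -> str:
    """Classify a code point using the ranges described above."""
    if 0x00 <= code_point <= 0x1F:
        return "C0"
    if code_point == 0x7F:
        return "DEL"
    if 0x80 <= code_point <= 0x9F:
        return "C1"
    return "not a control code"


# Line feed, escape, the letter A, DEL, and the first C1 code.
for cp in (0x0A, 0x1B, 0x41, 0x7F, 0x80):
    print(f"{cp:02X}: {control_class(cp)}")

# The C0 range holds 32 codes, matching the count given for ASCII.
print(sum(control_class(cp) == "C0" for cp in range(0x100)))
```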


Reference Mark
The reference mark or reference symbol "※" is a typographic mark or word used in Chinese, Japanese and Korean (CJK) writing. The symbol was used historically to call attention to an important sentence or idea, such as a prologue or footnote. As an indicator of a note, the mark serves the same purpose as the asterisk in English. However, in Japanese usage, the note text is placed directly into the main text immediately after the komejirushi symbol, rather than at the bottom of the page or end of chapter as is the case in English writing.

Names
The Japanese name, komejirushi (こめじるし), refers to the symbol's visual similarity to the character for "rice". In Korean, the symbol's name (参考表; 참고표) simply means "reference mark". Informally, the symbol also goes by a second name, as it is often used to indicate the presence of pool halls, due to its visual similarity to two crossed cue sticks and four billiard balls. In Chinese, the symbol is called ...




Control Pictures
Control Pictures is a Unicode block containing characters for graphically representing the C0 control codes, and other control characters. (A Unicode block is one of several contiguous ranges of numeric character codes, or code points, of the Unicode character set that are defined by the Unicode Consortium for administrative and documentation purposes.) Its block name in Unicode 1.0 was Pictures for Control Codes.

See also
* ISO 2047
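One common use of this block is making invisible control codes visible in debugging output. The sketch below relies on two facts that the entry does not spell out and that are stated here as assumptions of the example: the pictures for the C0 codes 00–1F sit at U+2400–U+241F in the same order, and the picture for DEL is U+2421. The function name `to_control_pictures` is hypothetical.

```python
CONTROL_PICTURES_BASE = 0x2400  # assumed: U+2400 is the picture for code 00
PICTURE_FOR_DELETE = "\u2421"   # assumed: picture for DEL (7F)


def to_control_pictures(text: str) -> str:
    """Replace C0 control codes and DEL with their visible pictures."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0x1F:
            # C0 codes map onto the block by a fixed offset.
            out.append(chr(CONTROL_PICTURES_BASE + cp))
        elif cp == 0x7F:
            out.append(PICTURE_FOR_DELETE)
        else:
            out.append(ch)
    return "".join(out)


# Tab and line feed become visible symbols; ordinary text is unchanged.
print(to_control_pictures("name\tvalue\n"))
```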


Regional Indicator Symbol
The regional indicator symbols are a set of 26 alphabetic Unicode characters (A–Z) intended to be used to encode ISO 3166-1 alpha-2 two-letter country codes in a way that allows optional special treatment. They were defined by October 2010 as part of the Unicode 6.0 support for emoji, as an alternative to encoding separate characters for each country flag. Although they can be displayed as Roman letters, it is intended that implementations may choose to display them in other ways, such as by using national flags. The Unicode FAQ indicates that this mechanism should be used and that symbols for national flags will not be directly encoded. They are encoded within the Enclosed Alphanumeric Supplement block in the Supplementary Multilingual Plane.

Emoji flag sequences
A pair of regional indicator symbols is referred to as an emoji flag sequence (although it represents a specific region, not a specific flag for that region). Out of the 676 possible pairs ...
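Encoding a two-letter country code as a pair of regional indicator symbols is a fixed-offset mapping, sketched below. The starting code point U+1F1E6 (the symbol for the letter A) is supplied here as an assumption of the example, and the function names are hypothetical. Whether the resulting pair is drawn as a flag or as two letters depends on the display implementation, as noted above.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # assumed: regional indicator for letter A


def to_regional_indicators(alpha2: str) -> str:
    """Encode a two-letter code as a pair of regional indicator symbols."""
    code = alpha2.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError(f"expected two letters A-Z, got {alpha2!r}")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code
    )


def from_regional_indicators(pair: str) -> str:
    """Decode a pair of regional indicator symbols back to letters."""
    return "".join(
        chr(ord(c) - REGIONAL_INDICATOR_A + ord("A")) for c in pair
    )


sequence = to_regional_indicators("JP")
print([f"U+{ord(c):04X}" for c in sequence])
print(from_regional_indicators(sequence))

# 26 symbols taken two at a time give the 676 possible pairs.
print(26 * 26)
```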


Enclosed Alphanumeric Supplement
Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane. The block is mostly an extension of the Enclosed Alphanumerics block, containing further enclosed alphanumeric characters which are not included in that block or Enclosed CJK Letters and Months. Most of the characters are single alphanumerics in boxes or circles, or with trailing commas. Two of the symbols are identified as dingbats. A number of multiple-letter enclosed abbreviations are also included, mostly to provide compatibility with Broadcast Markup Language standards (see ARIB STD B24 character set) and Japanese telecommunications networks' emoji sets. The block also includes the regional indicator symbols to be used for emoji country flag support.

Emoji
The Enclosed Alphanumeric Supplement block conta ...


Tags (Unicode Block)
Tags is a Unicode block containing formatting tag characters. The block is designed to mirror ASCII. It was originally intended for language tags, but has now been repurposed as emoji modifiers, specifically for region flags.

Legacy use
U+E0001 and U+E0020–U+E007F were originally intended for invisibly tagging texts by language, but that use is no longer recommended. All of those characters were deprecated in Unicode 5.1. With the release of Unicode 8.0, U+E0020–U+E007E are no longer deprecated characters. The change was made "to clear the way for the potential future use of tag characters for a purpose other than to represent language tags". Unicode states that "the use of tag characters to represent language tags in a plain text stream is still a deprecated mechanism for conveying language information about text".

Current use
With the release of Unicode 9.0, U+E007F is no longer a deprecated character. (U+E0001 LANGUAGE TAG remains deprecated.) The release of Emoji 5.0 in ...
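Because the block mirrors ASCII, converting between printable ASCII and tag characters amounts to adding or subtracting a constant. The sketch below covers only that mirroring for U+E0020–U+E007E; the offset 0xE0000 is stated here as an assumption of the example, the function names are hypothetical, and how tag characters are combined into complete emoji sequences is outside its scope.

```python
TAG_OFFSET = 0xE0000  # assumed: tag character = ASCII code + 0xE0000


def to_tag_characters(ascii_text: str) -> str:
    """Map printable ASCII (20-7E hex) onto U+E0020-U+E007E."""
    for c in ascii_text:
        if not 0x20 <= ord(c) <= 0x7E:
            raise ValueError(f"not printable ASCII: {c!r}")
    return "".join(chr(TAG_OFFSET + ord(c)) for c in ascii_text)


def from_tag_characters(tagged: str) -> str:
    """Map tag characters in U+E0020-U+E007E back to ASCII."""
    for c in tagged:
        if not 0xE0020 <= ord(c) <= 0xE007E:
            raise ValueError(f"not a tag character: U+{ord(c):04X}")
    return "".join(chr(ord(c) - TAG_OFFSET) for c in tagged)


tagged = to_tag_characters("en")
print([f"U+{ord(c):05X}" for c in tagged])
print(from_tag_characters(tagged))
```

Tag characters are invisible in most renderers, so printing their code points, as done here, is usually more informative than printing the string itself.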