
Unicode has subscripted and superscripted versions of a number of characters, including a full set of Arabic numerals. These characters allow any polynomial, chemical and certain other equations to be represented in plain text without using any form of markup such as HTML or TeX. The World Wide Web Consortium and the Unicode Consortium have made recommendations on the choice between using markup and using superscript and subscript characters:

"When used in mathematical context (MathML) it is recommended to consistently use style markup for superscripts and subscripts.... However, when super and sub-scripts are to reflect semantic distinctions, it is easier to work with these meanings encoded in text rather than markup, for example, in phonetic or phonemic transcription."


Uses

The intended use when these characters were added to Unicode was to allow chemical and algebraic formulas and phonetics to be written without markup while still producing true superscripts and subscripts. Thus "H₂O" (using a subscript character) is supposed to be identical to "H2O" (with subscript markup). In reality, most fonts that include these characters ignore the Unicode definition and instead design the digits as mathematical numerator and denominator glyphs, which are smaller than normal characters but are aligned with the cap line and the baseline, respectively. When used with the solidus, these glyphs are useful for making arbitrary diagonal fractions (similar to the ½ glyph). Making fractions with the superscript and subscript features of existing software requires many characters and does not look like the rendered fraction (example: 1/2, set with a superscript 1 and a subscript 2), so font designers provided this alternative. This also makes the superscript letters useful for ordinal indicators, more closely matching the ª and º characters. However, it makes them incorrect for normal superscripts and subscripts, and formulas are rendered correctly by using markup rather than these characters.

Unicode intended diagonal fractions to be produced through a different mechanism, but it is very poorly supported. The fraction slash U+2044 is visually similar to the solidus, but when used with the ordinary digits (not the superscripts and subscripts) it is intended to tell a layout system that a fraction such as ¾ should be rendered using automatic glyph substitution for the digits. (For a general overview and technical information on glyph substitution, though not specifically for fractions, see the GSUB (Glyph Substitution Table) documentation on the Microsoft Typography site.) Some browsers support this, but not in all fonts. A selection of fonts is shown in the table below.
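The two ways of writing a diagonal fraction described above can be compared in a short Python sketch. The function names `glyph_fraction` and `slash_fraction` are hypothetical helpers introduced only for illustration, and the sketch assumes non-negative integers. It only builds the strings; whether either form is drawn as a proper fraction still depends on the font and the layout system.

```python
FRACTION_SLASH = "\u2044"  # U+2044 FRACTION SLASH
SOLIDUS = "/"              # U+002F, the ordinary slash

# Superscript digits 0-9 (note that 1, 2 and 3 come from the Latin-1 range).
SUPERSCRIPTS = "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
# Subscript digits 0-9 are contiguous from U+2080.
SUBSCRIPTS = "".join(chr(0x2080 + d) for d in range(10))


def glyph_fraction(numerator: int, denominator: int) -> str:
    """Superscript digits + solidus + subscript digits (the font-driven style)."""
    top = "".join(SUPERSCRIPTS[int(c)] for c in str(numerator))
    bottom = "".join(SUBSCRIPTS[int(c)] for c in str(denominator))
    return top + SOLIDUS + bottom


def slash_fraction(numerator: int, denominator: int) -> str:
    """Ordinary digits around U+2044, leaving the rendering to glyph substitution."""
    return f"{numerator}{FRACTION_SLASH}{denominator}"


if __name__ == "__main__":
    print(glyph_fraction(3, 4))
    print(slash_fraction(3, 4))
```

The second form keeps the digits as ordinary, searchable digits, which is one reason the fraction slash mechanism is attractive where it is supported.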


Superscripts and subscripts block

The most common superscript digits (1, 2, and 3) were in ISO-8859-1 and were therefore carried over into those positions in the Latin-1 range of Unicode. The rest were placed in a dedicated section of Unicode at U+2070 to U+209F. The two tables below show these characters. Each superscript or subscript character is preceded by a normal x to show the subscripting or superscripting. The table on the left contains the actual Unicode characters; the one on the right contains the equivalents using HTML markup for the subscript or superscript.
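Because superscript 1, 2 and 3 sit in the Latin-1 range while the other digits sit in the dedicated block, a digit-to-superscript mapping cannot be computed with a single offset. The following Python sketch shows one way to handle this; the names `to_superscript` and `to_subscript` are hypothetical helpers, not part of any standard library.

```python
import unicodedata

# Superscript 1, 2 and 3 are in the Latin-1 range (inherited from ISO-8859-1);
# the remaining superscript digits are in the block that ends at U+209F.
SUPERSCRIPT_DIGITS = {
    "0": "\u2070", "1": "\u00b9", "2": "\u00b2", "3": "\u00b3", "4": "\u2074",
    "5": "\u2075", "6": "\u2076", "7": "\u2077", "8": "\u2078", "9": "\u2079",
}
# Subscript digits are contiguous: U+2080 (zero) through U+2089 (nine).
SUBSCRIPT_DIGITS = {str(d): chr(0x2080 + d) for d in range(10)}

_SUP_TABLE = str.maketrans(SUPERSCRIPT_DIGITS)
_SUB_TABLE = str.maketrans(SUBSCRIPT_DIGITS)


def to_superscript(text: str) -> str:
    """Replace ASCII digits with superscript digits; other characters pass through."""
    return text.translate(_SUP_TABLE)


def to_subscript(text: str) -> str:
    """Replace ASCII digits with subscript digits; other characters pass through."""
    return text.translate(_SUB_TABLE)


if __name__ == "__main__":
    water = "H" + to_subscript("2") + "O"
    print("x" + to_superscript("2"))
    print(water)
    # Compatibility normalization (NFKC) folds the subscript back to a plain digit.
    print(unicodedata.normalize("NFKC", water))
```

The NFKC step at the end illustrates that these characters carry compatibility mappings to the ordinary digits, so normalized text loses the superscript or subscript distinction.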


Other superscript and subscript characters

Unicode version 15.0 also includes subscript and superscript characters that are intended for semantic usage, in the following blocks:

Superscript
* The Latin-1 Supplement block contains the feminine and masculine ordinal indicators ª and º.
* The Latin Extended-C block contains one additional superscript, ⱽ.
* The Latin Extended-D block contains six superscripts: ꝰ ꟲ ꟳ ꟴ ꟸ ꟹ.
* The Latin Extended-E block contains five superscripts: ꭜ ꭝ ꭞ ꭟ ꭩ.
* The Latin Extended-F block is entirely superscript IPA letters: 𐞁 𐞂 𐞃 𐞄 𐞅 𐞇 𐞈 𐞉 𐞊 𐞋 𐞌 𐞍 𐞎 𐞏 𐞐 𐞑 𐞒 𐞓 𐞔 𐞕 𐞖 𐞗 𐞘 𐞙 𐞚 𐞛 𐞜 𐞝 𐞞 𐞟 𐞠 𐞡 𐞢 𐞣 𐞤 𐞥 𐞦 𐞧 𐞨 𐞩 𐞪 𐞫 𐞬 𐞭 𐞮 𐞯 𐞰 𐞲 𐞳 𐞴 𐞵 𐞶 𐞷 𐞸 𐞹 𐞺.
* The Spacing Modifier Letters block has superscripted letters and symbols used for phonetic transcription: ʰ ʱ ʲ ʳ ʴ ʵ ʶ ʷ ʸ ˀ ˁ ˠ ˡ ˢ ˣ ˤ.
* The Phonetic Extensions block has several superscripted letters and symbols: Latin/IPA ᴬ ᴭ ᴮ ᴯ ᴰ ᴱ ᴲ ᴳ ᴴ ᴵ ᴶ ᴷ ᴸ ᴹ ᴺ ᴻ ᴼ ᴽ ᴾ ᴿ ᵀ ᵁ ᵂ ᵃ ᵄ ᵅ ᵆ ᵇ ᵈ ᵉ ᵊ ᵋ ᵌ ᵍ ᵏ ᵐ ᵑ ᵒ ᵓ ᵖ ᵗ ᵘ ᵚ ᵛ, Greek ᵝ ᵞ ᵟ ᵠ, Cyrillic ᵸ, other ᵎ ᵔ ᵕ ᵙ ᵜ. These are intended to indicate secondary articulation.
* The Phonetic Extensions Supplement block has several more: Latin/IPA ᶛ ᶜ ᶝ ᶞ ᶟ ᶠ ᶡ ᶢ ᶣ ᶤ ᶥ ᶦ ᶧ ᶨ ᶩ ᶪ ᶫ ᶬ ᶭ ᶮ ᶯ ᶰ ᶱ ᶲ ᶳ ᶴ ᶵ ᶶ ᶷ ᶸ ᶹ ᶺ ᶻ ᶼ ᶽ ᶾ, Greek ᶿ.
* The IPA Extensions block has several Latin superscripts: 𐞄 𐞒 𐞖 𐞪 𐞲.
* The Cyrillic Extended-B block contains two Cyrillic superscripts: ꚜ ꚝ.
* The Cyrillic Extended-D block contains many Cyrillic superscripts: 𞀰 𞀱 𞀲 𞀳 𞀴 𞀵 𞀶 𞀷 𞀸 𞀹 𞀺 𞀻 𞀼 𞀽 𞀾 𞀿 𞁀 𞁁 𞁂 𞁃 𞁄 𞁅 𞁆 𞁇 𞁈 𞁉 𞁊 𞁋 𞁌 𞁍 𞁎 𞁏 𞁐 𞁫 𞁬 𞁭.
* The Georgian block contains one superscripted Mkhedruli letter: ჼ.
* The Kanbun block has superscripted annotation characters used in Japanese copies of Classical Chinese texts: ㆒ ㆓ ㆔ ㆕ ㆖ ㆗ ㆘ ㆙ ㆚ ㆛ ㆜ ㆝ ㆞ ㆟.
* The Tifinagh block has one superscript letter: ⵯ.
* The Unified Canadian Aboriginal Syllabics block and its Extended block contain several mostly consonant-only letters that indicate the syllable coda, called Finals, along with some characters that indicate the syllable medial, known as Medials. Main block: ᐜ ᐝ ᐞ ᐟ ᐠ ᐡ ᐢ ᐣ ᐤ ᐥ ᐦ ᐧ ᐨ ᐩ ᐪ ᑉ ᑊ ᑋ ᒃ ᒄ ᒡ ᒢ ᒻ ᒼ ᒽ ᒾ ᓐ ᓑ ᓒ ᓪ ᓫ ᔅ ᔆ ᔇ ᔈ ᔉ ᔊ ᔋ ᔥ ᔾ ᔿ ᕀ ᕁ ᕐ ᕑ ᕝ ᕪ ᕻ ᕯ ᕽ ᖅ ᖕ ᖖ ᖟ ᖦ ᖮ ᗮ ᘁ ᙆ ᙇ ᙚ ᙾ ᙿ; Extended block: ᣔ ᣕ ᣖ ᣗ ᣘ ᣙ ᣚ ᣛ ᣜ ᣝ ᣞ ᣟ ᣳ ᣴ ᣵ.

Combining superscript
* The Combining Diacritical Marks block contains medieval superscript letter diacritics. These letters are written directly above other letters in medieval Germanic manuscripts, so the glyphs do not include spacing, for example uͤ. They are shown here over the dotted circle placeholder ◌: ◌ͣ ◌ͤ ◌ͥ ◌ͦ ◌ͧ ◌ͨ ◌ͩ ◌ͪ ◌ͫ ◌ͬ ◌ͭ ◌ͮ ◌ͯ.
* The Combining Diacritical Marks Extended block contains two combining letters for linguistic transcriptions of Scots (◌ᪿ ◌ᫀ) and three combining insular letters for the Middle English Ormulum (◌ᫌ ◌ᫍ ◌ᫎ).
* The Combining Diacritical Marks Supplement block contains additional medieval superscript letter diacritics, enough to complete the basic lowercase Latin alphabet except for j, q and y, as well as a few small capitals and ligatures (ae, ao, av) and additional letters: ◌᷒ ◌ᷓ ◌ᷔ ◌ᷕ ◌ᷖ ◌ᷗ ◌ᷘ ◌ᷙ ◌ᷚ ◌ᷛ ◌ᷜ ◌ᷝ ◌ᷞ ◌ᷟ ◌ᷠ ◌ᷡ ◌ᷢ ◌ᷣ ◌ᷤ ◌ᷥ ◌ᷦ ◌ᷧ ◌ᷨ ◌ᷪ ◌ᷫ ◌ᷬ ◌ᷭ ◌ᷮ ◌ᷯ ◌ᷰ ◌ᷱ ◌ᷲ ◌ᷳ ◌ᷴ, Greek ◌ᷩ.
* The Cyrillic Extended-A and Cyrillic Extended-B blocks contain multiple medieval superscript letter diacritics, enough to complete the basic lowercase Cyrillic alphabet used in Church Slavonic texts, and also include an additional ligature (ст): ◌ⷠ ◌ⷡ ◌ⷢ ◌ⷣ ◌ⷤ ◌ⷥ ◌ⷦ ◌ⷧ ◌ⷨ ◌ⷩ ◌ⷪ ◌ⷫ ◌ⷬ ◌ⷭ ◌ⷮ ◌ⷯ ◌ⷰ ◌ⷱ ◌ⷲ ◌ⷳ ◌ⷴ ◌ⷵ ◌ⷶ ◌ⷷ ◌ⷸ ◌ⷹ ◌ⷺ ◌ⷻ ◌ⷼ ◌ⷽ ◌ⷾ ◌ⷿ ◌ꙴ ◌ꙵ ◌ꙶ ◌ꙷ ◌ꙸ ◌ꙹ ◌ꙺ ◌ꙻ ◌ꚞ ◌ꚟ.

Subscript
* The Latin Extended-C block contains one additional subscript, ⱼ.
* The Phonetic Extensions block has several subscripted letters and symbols: Latin/IPA ᵢ ᵣ ᵤ ᵥ and Greek ᵦ ᵧ ᵨ ᵩ ᵪ.
* The Cyrillic Extended-D block also contains many Cyrillic subscripts: 𞁑 𞁒 𞁓 𞁔 𞁕 𞁖 𞁗 𞁘 𞁙 𞁚 𞁛 𞁜 𞁝 𞁞 𞁟 𞁠 𞁡 𞁢 𞁣 𞁤 𞁥 𞁦 𞁧 𞁨 𞁩 𞁪.

Combining subscript
* The Combining Diacritical Marks Supplement block contains a combining subscript: ◌᷊.
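Many of the spacing superscript and subscript letters listed above carry a compatibility decomposition tagged `<super>` or `<sub>` in the Unicode Character Database, which gives a rough programmatic way to recognize them. The Python sketch below uses the standard `unicodedata` module; `script_kind` is a hypothetical helper name. It is not a complete classifier: characters without such a tagged mapping (which, as far as I know, includes the combining letters) are reported as `None`, and results for recently added characters depend on the Unicode version bundled with the Python build.

```python
import unicodedata
from typing import Optional


def script_kind(ch: str) -> Optional[str]:
    """Return 'super' or 'sub' if ch has a so-tagged compatibility mapping, else None."""
    decomposition = unicodedata.decomposition(ch)
    if decomposition.startswith("<super>"):
        return "super"
    if decomposition.startswith("<sub>"):
        return "sub"
    return None


if __name__ == "__main__":
    samples = [
        "\u00aa",  # feminine ordinal indicator (Latin-1 Supplement)
        "\u02b0",  # modifier letter small h (Spacing Modifier Letters)
        "\u2c7d",  # modifier letter capital V (Latin Extended-C)
        "\u1d62",  # Latin subscript small letter i (Phonetic Extensions)
        "\u2c7c",  # Latin subscript small letter j (Latin Extended-C)
        "x",       # ordinary letter, no compatibility mapping
    ]
    for ch in samples:
        base = unicodedata.normalize("NFKC", ch)
        print(f"U+{ord(ch):04X} {ch} kind={script_kind(ch)} base={base}")
```

Note that applying NFKC to phonetic text discards exactly the semantic distinction these characters exist to encode, so it is better suited to searching and matching than to storage.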


Latin, Greek and Cyrillic tables

Consolidated, the Unicode standard contains superscript and subscript versions of a subset of Latin, Greek and Cyrillic letters. Here they are arranged in alphabetical order for comparison (or for copy-and-paste convenience). Because these characters come from different Unicode ranges, they may not appear to be the same size or in the same position, owing to font substitution in the browser. Shaded cells mark small capitals that are not very distinct from minuscules, and Greek letters that are indistinguishable from Latin, and so would not be expected to be supported by Unicode. Many of these characters were added in Unicode 15, published in 2022, in the Cyrillic Extended-D block. See also small caps in Unicode.


Composite characters

Primarily for compatibility with earlier character sets, Unicode contains a number of characters that compose superscripts and subscripts with other symbols. In most fonts these render much better than attempts to construct the same symbols from the above characters or by using markup.
* The Latin-1 Supplement block contains the precomposed fractions ½, ¼, and ¾. The copyright sign © and the registered trademark sign ® are also in this block.
* The General Punctuation block contains the permille sign ‰ and the per-ten-thousand sign ‱, and Basic Latin has the percent sign %.
* The Number Forms block contains several precomposed fractions: ⅐ ⅑ ⅒ ⅓ ⅔ ⅕ ⅖ ⅗ ⅘ ⅙ ⅚ ⅛ ⅜ ⅝ ⅞ ⅟ ↉.
* The Letterlike Symbols block contains a few symbols composed of subscript and superscript characters: ℀ ℁ ℅ ℆ № ℠ ™ ⅍.
* The Enclosed Alphanumeric Supplement block contains three superscript abbreviations, 🅪 🅫 🅬: MC for marque de commerce (trademark) and MD for marque déposée (registered trademark), both used in Canada, and MR for marca registrada (registered trademark), used in Spanish- and Portuguese-speaking countries.
* The Miscellaneous Technical block has one additional subscript, a subscript 10 (⏨), for the purpose of scientific notation.
* The Unified Canadian Aboriginal Syllabics block and its Extended block contain several letters composed with superscripted letters to indicate extended sound values. Main block: ᐂ ᐫ ᐬ ᐭ ᐮ ᐰ ᑍ ᑧ ᑨ ᑩ ᑪ ᑬ ᒅ ᒆ ᒇ ᒈ ᒊ ᒤ ᓁ ᓔ ᓮ ᔌ ᔍ ᔎ ᔏ ᔧ ᕅ ᕔ ᕿ ᖀ ᖁ ᖂ ᖃ ᖄ ᖎ ᖏ ᖐ ᖑ ᖒ ᖓ ᖔ ᙯ ᙰ ᙱ ᙲ ᙳ ᙴ ᙵ ᙶ; Extended block: ᢰ ᢱ ᢲ ᢳ ᢴ ᢵ ᢶ ᢷ ᢸ ᢹ ᢺ ᢻ ᢼ ᢽ ᢾ ᢿ ᣀ ᣁ ᣂ ᣃ ᣄ ᣅ.
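Several of the composite characters above have compatibility decompositions that spell out their parts, and the precomposed fractions decompose into ordinary digits around the fraction slash U+2044. The Python sketch below relies on this using the standard `unicodedata` and `fractions` modules; `expand` and `fraction_value` are hypothetical helper names, and `fraction_value` assumes its argument is a precomposed fraction with both a numerator and a denominator.

```python
import unicodedata
from fractions import Fraction

FRACTION_SLASH = "\u2044"


def expand(text: str) -> str:
    """Apply compatibility normalization (NFKC) to spell out composite characters."""
    return unicodedata.normalize("NFKC", text)


def fraction_value(ch: str) -> Fraction:
    """Read a precomposed fraction such as U+00BD through its decomposition."""
    numerator, _, denominator = expand(ch).partition(FRACTION_SLASH)
    return Fraction(int(numerator), int(denominator))


if __name__ == "__main__":
    for ch in ["\u00bd", "\u00be", "\u215e"]:  # one half, three quarters, seven eighths
        print(ch, expand(ch), fraction_value(ch))
    for ch in ["\u2122", "\u2105", "\u2116"]:  # trade mark, care of, numero
        print(ch, expand(ch))
```

Characters such as ⅟ (fraction numerator one) have no denominator in their decomposition and would need separate handling.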


