Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with combining diacritics, as well as some ligatures and distinct letters, used for example in the orthographies of various African languages (including click symbols in Latin Extended-B) and the Vietnamese alphabet (Latin Extended Additional). Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista). Latin Extended-F and -G contain characters for phonetic transcription.
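The equivalence between precomposed letters and base letters followed by combining diacritics can be checked with Unicode normalization. The following is a minimal sketch using Python's standard `unicodedata` module; the sample character (a Vietnamese letter from Latin Extended Additional) and the helper name `code_points` are chosen here for illustration only.

```python
import unicodedata

def code_points(text):
    """Return the code points of text in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(c):04X}" for c in text]

# U+1EC7 LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW (Latin Extended Additional)
precomposed = "\u1EC7"

# NFD splits the precomposed letter into a base letter plus combining marks.
decomposed = unicodedata.normalize("NFD", precomposed)

# NFC recombines the sequence into the single precomposed code point.
recomposed = unicodedata.normalize("NFC", decomposed)

print(code_points(precomposed))   # one code point
print(code_points(decomposed))    # base letter followed by two combining marks
print(recomposed == precomposed)  # the two encodings are canonically equivalent
```

The two strings compare unequal as raw code point sequences, which is why text is usually normalized to one form before comparison.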
Blocks
As of version 15.0 of the Unicode Standard, 1,481 characters in the following 19 blocks are classified as belonging to the Latin script.
* Basic Latin, 0000–007F. This block corresponds to ASCII.
* Latin-1 Supplement, 0080–00FF
* Latin Extended-A, 0100–017F
* Latin Extended-B, 0180–024F
* IPA Extensions, 0250–02AF
* Spacing Modifier Letters, 02B0–02FF
* Phonetic Extensions, 1D00–1D7F
* Phonetic Extensions Supplement, 1D80–1DBF
* Latin Extended Additional, 1E00–1EFF
* Superscripts and Subscripts, 2070–209F
* Letterlike Symbols, 2100–214F
* Number Forms, 2150–218F
* Latin Extended-C, 2C60–2C7F
* Latin Extended-D, A720–A7FF
* Latin Extended-E, AB30–AB6F
* Alphabetic Presentation Forms (Latin ligatures), FB00–FB4F
* Halfwidth and Fullwidth Forms, FF00–FFEF
* Latin Extended-F, 10780–107BF
* Latin Extended-G, 1DF00–1DFFF
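The block ranges listed above can be turned into a simple lookup table. The sketch below is illustrative only: the names `LATIN_BLOCKS` and `latin_block` are hypothetical, and the ranges are copied from the list rather than read from the Unicode Character Database. Block membership is not the same as script membership, since several of these blocks also hold digits, punctuation, or letters of other scripts.

```python
from typing import Optional

# (first code point, last code point, block name), copied from the list above.
LATIN_BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0180, 0x024F, "Latin Extended-B"),
    (0x0250, 0x02AF, "IPA Extensions"),
    (0x02B0, 0x02FF, "Spacing Modifier Letters"),
    (0x1D00, 0x1D7F, "Phonetic Extensions"),
    (0x1D80, 0x1DBF, "Phonetic Extensions Supplement"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
    (0x2070, 0x209F, "Superscripts and Subscripts"),
    (0x2100, 0x214F, "Letterlike Symbols"),
    (0x2150, 0x218F, "Number Forms"),
    (0x2C60, 0x2C7F, "Latin Extended-C"),
    (0xA720, 0xA7FF, "Latin Extended-D"),
    (0xAB30, 0xAB6F, "Latin Extended-E"),
    (0xFB00, 0xFB4F, "Alphabetic Presentation Forms"),
    (0xFF00, 0xFFEF, "Halfwidth and Fullwidth Forms"),
    (0x10780, 0x107BF, "Latin Extended-F"),
    (0x1DF00, 0x1DFFF, "Latin Extended-G"),
]

def latin_block(ch: str) -> Optional[str]:
    """Return the name of the listed block containing ch, or None if there is none."""
    cp = ord(ch)
    for first, last, name in LATIN_BLOCKS:
        if first <= cp <= last:
            return name
    return None

print(latin_block("A"))        # Basic Latin
print(latin_block("\u1EC7"))   # Latin Extended Additional
print(latin_block("\u044F"))   # None (a Cyrillic letter, outside the listed blocks)
```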
In addition, a number of Latin-like characters are encoded in the Currency Symbols, Control Pictures, CJK Compatibility, Enclosed Alphanumerics, Enclosed CJK Letters and Months, Mathematical Alphanumeric Symbols, and Enclosed Alphanumeric Supplement blocks. Although they are Latin letters graphically, they have the script property Common and so do not belong to the Latin script in Unicode terms.
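Python's standard `unicodedata` module does not expose the script property, so the Common classification cannot be read directly with it. As a rough illustration of the graphical relationship only, the sketch below shows that two such characters carry compatibility decompositions to Basic Latin letters. The helper name `fold_to_basic` is hypothetical, and compatibility folding is a different property from script membership.

```python
import unicodedata

def fold_to_basic(ch: str) -> str:
    """Apply NFKC compatibility normalization to a single character."""
    return unicodedata.normalize("NFKC", ch)

samples = [
    "\u24B6",      # CIRCLED LATIN CAPITAL LETTER A (Enclosed Alphanumerics)
    "\U0001D400",  # MATHEMATICAL BOLD CAPITAL A (Mathematical Alphanumeric Symbols)
]

for ch in samples:
    # Print the character name, its code point, and its compatibility folding.
    print(unicodedata.name(ch), f"U+{ord(ch):04X}", "->", fold_to_basic(ch))
```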
Lisu also consists almost entirely of Latin forms, but uses its own script property.
Table of characters
In this table, those characters with the Unicode script property of Latin are highlighted in colour, indicating the version of Unicode they were introduced in. Reserved code points (which may be assigned as characters at a future date) have a grey background. All characters that do not belong to the Latin script have a white background (and the version of Unicode they were introduced in is therefore not indicated).
See also
* Universal Character Set characters
* Letterlike Symbols (Unicode block)
* List of Latin-script letters
* List of Latin letters by shape
* Mathematical Alphanumeric Symbols
* European Latin Unicode subset (DIN 91379)