In Unicode, the Cyrillic script is encoded across several blocks:
* Cyrillic: U+0400–U+04FF, 256 characters
* Cyrillic Supplement: U+0500–U+052F, 48 characters
* Cyrillic Extended-A: U+2DE0–U+2DFF, 32 characters
* Cyrillic Extended-B: U+A640–U+A69F, 96 characters
* Cyrillic Extended-C: U+1C80–U+1C8F, 11 characters
* Cyrillic Extended-D: U+1E030–U+1E08F, 63 characters
* Phonetic Extensions: U+1D2B, U+1D78, 2 Cyrillic characters
* Combining Half Marks: U+FE2E–U+FE2F, 2 Cyrillic characters
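The block layout above can be turned into a simple lookup. The following is a minimal sketch, not an official API: the names `CYRILLIC_BLOCKS`, `CYRILLIC_OUTLIERS`, and `cyrillic_block` are hypothetical, and the ranges are copied from the list above. A range match only says that a code point lies inside a block, not that a character is assigned there.

```python
# Block ranges as listed above (inclusive code point bounds).
CYRILLIC_BLOCKS = [
    ("Cyrillic", 0x0400, 0x04FF),
    ("Cyrillic Supplement", 0x0500, 0x052F),
    ("Cyrillic Extended-C", 0x1C80, 0x1C8F),
    ("Cyrillic Extended-A", 0x2DE0, 0x2DFF),
    ("Cyrillic Extended-B", 0xA640, 0xA69F),
    ("Cyrillic Extended-D", 0x1E030, 0x1E08F),
]

# Individual Cyrillic characters that live in non-Cyrillic blocks.
CYRILLIC_OUTLIERS = {
    0x1D2B: "Phonetic Extensions",
    0x1D78: "Phonetic Extensions",
    0xFE2E: "Combining Half Marks",
    0xFE2F: "Combining Half Marks",
}


def cyrillic_block(ch):
    """Return the name of the block holding ch, or None if it is not listed above."""
    cp = ord(ch)
    if cp in CYRILLIC_OUTLIERS:
        return CYRILLIC_OUTLIERS[cp]
    for name, first, last in CYRILLIC_BLOCKS:
        if first <= cp <= last:
            return name
    return None


if __name__ == "__main__":
    for sample in ("\u0416", "\u0501", "\u1D2B", "A"):
        print(f"U+{ord(sample):04X}", cyrillic_block(sample))
```

A linear scan is enough here because the table has only six ranges; a sorted list with `bisect` would be the natural choice for a full block table.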
The characters in the range U+0400–U+045F are essentially the characters of ISO 8859-5 moved upward by 864 positions. The next characters in the Cyrillic block, in the range U+0460–U+0489, are historical letters, some of which are still used for Church Slavonic. The characters in the range U+048A–U+04FF and the entire Cyrillic Supplement block (U+0500–U+052F) are additional letters for various languages written in the Cyrillic script. Two characters are in the Phonetic Extensions block: U+1D2B, from the Uralic Phonetic Alphabet, and U+1D78, for transcribing nasal vowels.
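The 864-position shift (0x360 in hexadecimal) can be checked against Python's built-in `iso8859_5` codec. This is a small sketch rather than a formal mapping table; the helper name `split_by_offset` is hypothetical. It decodes each byte in the upper half of ISO 8859-5 and separates the bytes that follow the shift from those that do not.

```python
OFFSET = 864  # 0x360, the shift between ISO 8859-5 bytes and Unicode code points


def split_by_offset():
    """Split bytes 0xA1..0xFF into those that map to byte + OFFSET and those that do not."""
    shifted, exceptions = [], []
    for byte in range(0xA1, 0x100):
        ch = bytes([byte]).decode("iso8859_5")
        if ord(ch) == byte + OFFSET:
            shifted.append(byte)
        else:
            exceptions.append(byte)
    return shifted, exceptions


shifted, exceptions = split_by_offset()

if __name__ == "__main__":
    print(len(shifted), "bytes follow the shift")
    for byte in exceptions:
        ch = bytes([byte]).decode("iso8859_5")
        print(f"0x{byte:02X} -> U+{ord(ch):04X}")
```

The bytes that do not follow the shift are non-letter symbols in ISO 8859-5 (soft hyphen, numero sign, section sign), which map to code points outside the Cyrillic block.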
Unicode includes few precomposed accented Cyrillic letters; the others can be formed by adding the combining acute accent (U+0301) after the accented vowel (e.g., е́ у́ э́); see below.
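The difference between a precomposed letter and a combining sequence shows up under normalization. The sketch below uses Python's standard `unicodedata` module and assumes the stress mark is U+0301 COMBINING ACUTE ACCENT; the helper name `add_stress` is hypothetical.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT (assumed stress mark)


def add_stress(vowel):
    """Append the combining acute accent to a single vowel."""
    return vowel + ACUTE


# CYRILLIC SMALL LETTER IE + acute: no precomposed form exists,
# so NFC leaves the sequence as two code points.
stressed_ie = unicodedata.normalize("NFC", add_stress("\u0435"))

# CYRILLIC SMALL LETTER I + COMBINING BREVE: a precomposed letter exists
# (U+0439 SHORT I), so NFC composes the pair into one code point.
short_i = unicodedata.normalize("NFC", "\u0438\u0306")

if __name__ == "__main__":
    print([f"U+{ord(c):04X}" for c in stressed_ie])
    print([f"U+{ord(c):04X}" for c in short_i])
```

One practical consequence is that the length of stressed text in code points differs from its length in visible letters, so string comparisons and searches may need to strip or normalize the marks first.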
Several diacritical marks not specific to Cyrillic can be used with Cyrillic text, including:
* in the Combining Diacritical Marks block (U+0300–U+036F):
** (as common Cyrillic stress mark)
** (as stress mark in Bulgarian)
** (in non-Slavic languages)
** (in non-Slavic languages)
** (with й but also other letters in non-Slavic languages)
** (in transliterations of other writing systems)
** (in non-Slavic languages)
** (in non-Slavic languages)
** (in non-Slavic languages)
** (in non-Slavic languages)
** (with ѷ in old spelling)
** (in 19th-century Aleut alphabet)
** (in transliterations of other writing systems)
** (in 19th-century Lithuanian or Polish Cyrillic alphabets)
** (in transliterations of other writing systems)
** (in 19th-century Polish Cyrillic alphabet)
* in the Combining Diacritical Marks for Symbols block (U+20D0–U+20FF):
** (as Cyrillic ten thousands sign)
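A program can detect such marks in Cyrillic text by their general category instead of their script. The following is an illustrative sketch using the standard `unicodedata` module. The names `MARK_BLOCKS` and `combining_marks` are hypothetical, and the two sample marks (U+0301 and U+20DD) are assumptions chosen for the demonstration, since the list above does not name the individual marks.

```python
import unicodedata

# Inclusive ranges of the two blocks named in the list above.
MARK_BLOCKS = {
    "Combining Diacritical Marks": (0x0300, 0x036F),
    "Combining Diacritical Marks for Symbols": (0x20D0, 0x20FF),
}


def combining_marks(text):
    """Return (code point, Unicode name, block) for each nonspacing or enclosing mark in text."""
    found = []
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me"):
            block = next(
                (name for name, (first, last) in MARK_BLOCKS.items()
                 if first <= ord(ch) <= last),
                None,  # mark from some other block
            )
            found.append((f"U+{ord(ch):04X}", unicodedata.name(ch), block))
    return found


if __name__ == "__main__":
    # Cyrillic letters followed by the assumed sample marks.
    print(combining_marks("\u043E\u0301"))
    print(combining_marks("\u0430\u20DD"))
```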
In the table below, small letters are ordered according to their Unicode numbers; capital letters are placed immediately before the corresponding small letters. Standard Unicode names and canonical decompositions are included.
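The ordering described above can be reproduced from the Unicode Character Database that ships with Python. This is only a sketch of one way to generate such rows, with a hypothetical helper `table_rows`; it walks the small letters in code point order and emits each capital immediately before its small letter, together with the standard name and canonical decomposition.

```python
import unicodedata


def table_rows(first, last):
    """Build (code point, character, name, decomposition) rows for small letters in a range."""
    rows = []
    for cp in range(first, last + 1):
        small = chr(cp)
        if unicodedata.category(small) != "Ll":
            continue  # skip anything that is not a lowercase letter
        pair = [small]
        capital = small.upper()
        if capital != small and len(capital) == 1:
            pair.insert(0, capital)  # capital goes right before its small letter
        for ch in pair:
            rows.append((
                f"U+{ord(ch):04X}",
                ch,
                unicodedata.name(ch),
                unicodedata.decomposition(ch),  # empty string if there is none
            ))
    return rows


rows = table_rows(0x0430, 0x045F)

if __name__ == "__main__":
    for row in rows[:4]:
        print(row)
```

For letters without a canonical decomposition, `unicodedata.decomposition` returns an empty string, so those rows simply have a blank last column.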
Table of characters
Blocks
The Cyrillic block (U+0400–U+04FF) was added to the Unicode Standard in October 1991 with the release of version 1.0.
The Cyrillic Supplement block (U+0500–U+052F) was added to the Unicode Standard in March 2002 with the release of version 3.2.
The Cyrillic Extended-A (U+2DE0–U+2DFF) and Cyrillic Extended-B (U+A640–U+A69F) blocks were added to the Unicode Standard in April 2008 with the release of version 5.1.
The Cyrillic Extended-C block (U+1C80–U+1C8F) was added to the Unicode Standard in June 2016 with the release of version 9.0.
The Cyrillic Extended-D block (U+1E030–U+1E08F) was added to the Unicode Standard in September 2022 with the release of version 15.0.
See also
* List of Cyrillic letters
* Cyrillic script
* Cyrillic alphabets