The Slovene alphabet or Slovenian alphabet (, or ''slovenska gajica'' ) is an extension of the
Latin script
The Latin script, also known as the Roman script, is a writing system based on the letters of the classical Latin alphabet, derived from a form of the Greek alphabet which was in use in the ancient Greek city of Cumae in Magna Graecia. The Gree ...
used to write
Slovene. The standard language uses a
Latin alphabet
The Latin alphabet, also known as the Roman alphabet, is the collection of letters originally used by the Ancient Rome, ancient Romans to write the Latin language. Largely unaltered except several letters splitting—i.e. from , and from � ...
which is a slight modification of the Croatian
Gaj's Latin alphabet
Gaj's Latin alphabet ( sh-Latn-Cyrl, Gajeva latinica, separator=" / ", Гајева латиница}, ), also known as ( sr-Cyrl, абецеда, ) or ( sr-Cyrl, гајица, link=no, ), is the form of the Latin script used for writing all ...
, consisting of 25 lower- and upper-case letters:
Characters
The following Latin letters are also found separately alphabetized in words of non-Slovene origin:
Ć (mehki č),
Đ (dže),
Q (ku),
W (dvojni ve),
X (iks), and
Y (ipsilon).
Diacritics
To compensate for the shortcomings of the standard orthography, Slovenian also uses standardized
diacritic
A diacritic (also diacritical mark, diacritical point, diacritical sign, or accent) is a glyph added to a letter or to a basic glyph. The term derives from the Ancient Greek (, "distinguishing"), from (, "to distinguish"). The word ''diacrit ...
s or accent marks to denote
stress,
vowel length
In linguistics, vowel length is the perceived or actual length (phonetics), duration of a vowel sound when pronounced. Vowels perceived as shorter are often called short vowels and those perceived as longer called long vowels.
On one hand, many ...
and
pitch accent
A pitch-accent language is a type of language that, when spoken, has certain syllables in words or morphemes that are prominent, as indicated by a distinct contrasting pitch (music), pitch (tone (linguistics), linguistic tone) rather than by vol ...
, much like the closely related
Serbo-Croatian
Serbo-Croatian ( / ), also known as Bosnian-Croatian-Montenegrin-Serbian (BCMS), is a South Slavic language and the primary language of Serbia, Croatia, Bosnia and Herzegovina, and Montenegro. It is a pluricentric language with four mutually i ...
. However, as in Serbo-Croatian, use of such accent marks is restricted to dictionaries, language textbooks and linguistic publications. In normal writing, the diacritics are almost never used, except in a few minimal pairs where real ambiguity could arise.
Two different and mutually incompatible systems of diacritics are used. The first is the simpler non-tonemic system, which can be applied to all Slovene dialects. It is more widely used and is the standard representation in dictionaries such as SSKJ. The tonemic system also includes tone as part of the representation. However, neither system reliably distinguishes schwa from the front mid-vowels, nor vocalised l from regular l . Some sources write these as ''ə'' and ''ł'', respectively, but this is not as common.
Non-tonemic diacritics
In the non-tonemic system, the distinction between the two mid-vowels is indicated, as well as the placement of stress and length of vowels:
* Long stressed vowels are notated with an acute diacritic: ''á é í ó ú ŕ'' (IPA: ).
* However, the rarer long stressed low-mid vowels and are notated with a circumflex: ''ê ô''.
* Short stressed vowels are notated with a grave: ''à è ì ò ù'' (IPA: ). Some systems may also include ''ə̀'' for .
Tonemic diacritics
The tonemic system uses the diacritics somewhat differently from the non-tonemic system. The high-mid vowels and are written ''ẹ ọ'' with a subscript dot, while the low-mid vowels and are written as plain ''e o''.
Pitch accent and length is indicated by four diacritical marks:
* The
acute ( ´ ) indicates long and low pitch: ''á é ẹ́ í ó ọ́ ú ŕ'' (IPA: ).
* The
inverted breve
Inverse or invert may refer to:
Science and mathematics
* Inverse (logic), a type of conditional sentence which is an immediate inference made from another conditional sentence
* Additive inverse, the inverse of a number that, when added to the ...
( ̑ ) indicates long and high pitch: ''ȃ ȇ ẹ̑ ȋ ȏ ọ̑ ȗ ȓ'' (IPA: ).
* The
grave
A grave is a location where a cadaver, dead body (typically that of a human, although sometimes that of an animal) is burial, buried or interred after a funeral. Graves are usually located in special areas set aside for the purpose of buria ...
( ` ) indicates short and low pitch. This occurs only on ''è'' (IPA: ), optionally written as ''ə̀''.
* The
double grave ( ̏ ) indicates short and high pitch: ''ȁ ȅ ȉ ȍ ȕ'' (IPA: á ɛ́ í ɔ́ ú). ''ȅ'' is also used for , optionally written as ''ə̏''.
The schwa vowel is written ambiguously as ''e'', but its accentuation will sometimes distinguish it: a long vowel mark can never appear on a schwa, while a grave accent can appear only on a schwa. Thus, only ''ȅ'' and unstressed ''e'' are truly ambiguous.
Others
The writing in its usual form uses additional accentual marks, which are used to disambiguate similar words with different meanings. For example:
*gòl (''naked'') , gól (''goal''),
*jêsen (''ash (tree)'') , jesén (''autumn''),
*kót (''angle'', ''corner'') , kot (''as'', ''like''),
*kózjak (''goat's dung'') , kozják (''goat-shed''),
*med (''between'') , méd (''brass'') , méd (''honey''),
*pól (''pole'') , pól (''half (of)'') , pôl (''expresses a half an hour before the given hour''),
*prècej (''at once'') , precéj (''a great deal (of)'')),
*remí (''draw'') , rémi (''rummy (- a card game)''),
*je (''he/she is'') , jé (''he/she eats'').
Foreign words
There are 5 letters for
vowel
A vowel is a speech sound pronounced without any stricture in the vocal tract, forming the nucleus of a syllable. Vowels are one of the two principal classes of speech sounds, the other being the consonant. Vowels vary in quality, in loudness a ...
s (a, e, i, o, u) and 20 for
consonant
In articulatory phonetics, a consonant is a speech sound that is articulated with complete or partial closure of the vocal tract, except for the h sound, which is pronounced without any stricture in the vocal tract. Examples are and pronou ...
s. The letters q, w, x, y are excluded from the standard spelling, as are some
Serbo-Croatian
Serbo-Croatian ( / ), also known as Bosnian-Croatian-Montenegrin-Serbian (BCMS), is a South Slavic language and the primary language of Serbia, Croatia, Bosnia and Herzegovina, and Montenegro. It is a pluricentric language with four mutually i ...
graphemes (ć, đ), however they are collated as independent letters in some encyclopedias and dictionary listings; foreign
proper noun
A proper noun is a noun that identifies a single entity and is used to refer to that entity ('' Africa''; ''Jupiter''; '' Sarah''; ''Walmart'') as distinguished from a common noun, which is a noun that refers to a class of entities (''continent, ...
s or
toponym
Toponymy, toponymics, or toponomastics is the study of ''wikt:toponym, toponyms'' (proper names of places, also known as place names and geographic names), including their origins, meanings, usage, and types. ''Toponym'' is the general term for ...
s are often not adapted to Slovene orthography as they are in some other Slavic languages, such as partly in
Russian
Russian(s) may refer to:
*Russians (), an ethnic group of the East Slavic peoples, primarily living in Russia and neighboring countries
*A citizen of Russia
*Russian language, the most widely spoken of the Slavic languages
*''The Russians'', a b ...
or entirely in the Serbian standard of Serbo-Croatian.
In addition, the graphemes ö and ü are used in certain non-standard dialect spellings (usually representing loanwords from German, Hungarian or Turkish) – for example, ''dödöli'' (Prekmurje potato dumplings) and ''
Danilo Türk
Danilo Türk (; born 19 February 1952) is a Slovenian diplomat, professor of international law, human rights expert, and political figure who served as President of Slovenia from 2007 to 2012. He was the first Slovene ambassador to the United Nat ...
'' (a politician).
Encyclopedic listings (such as in the 2001 ''Slovenski pravopis'' and the 2006 ''Leksikon SOVA'') use this alphabet:
: a, b, c, č, ć, d, đ, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, š, t, u, v, w, x, y, z, ž.
Therefore, ''Newton'' and ''New York'' remain the same and are not transliterated to ''Njuton'' or ''Njujork''; transliterated forms would seem very odd to a Slovene. However, the unit of force is written as ''njuton'' as well as ''newton''. Some place names are transliterated (e.g. Philadelphia – Filadelfija; Hawaii – Havaji). Other names from non-Latin languages are transliterated in a fashion similar to that used by other European languages, albeit with some adaptations.
Japanese,
Indonesia
Indonesia, officially the Republic of Indonesia, is a country in Southeast Asia and Oceania, between the Indian Ocean, Indian and Pacific Ocean, Pacific oceans. Comprising over List of islands of Indonesia, 17,000 islands, including Sumatra, ...
n and
Arabic
Arabic (, , or , ) is a Central Semitic languages, Central Semitic language of the Afroasiatic languages, Afroasiatic language family spoken primarily in the Arab world. The International Organization for Standardization (ISO) assigns lang ...
names such as ''Kajibumi'', ''Jakarta'' and ''Jabar'' are written as ''Kadžibumi'', ''Džakarta'' and ''Džabar'', where j is replaced with dž. Except for ć and đ, graphemes with
diacritical marks from other foreign alphabets (e.g., ä, å, æ, ç, ë, ï, ń, ö, ß, ş, ü) are not used as independent letters.
History
The modern alphabet (''abeceda'') was standardised in the mid-1840s from an arrangement of the
Croatian national reviver and leader
Ljudevit Gaj
Ljudevit Gaj (; born Ludwig Gay; ; 8 August 1809 – 20 April 1872) was a Croatian linguist, politician, journalist and writer. He was one of the central figures of the pan-Slavist Illyrian movement.
Biography
Origin
He was born in Krapina ( ...
which would become the
Croatian alphabet
Croatian may refer to:
*Croatia
*Croatian language
*Croatian people
*Croatians (demonym)
See also
*
*
* Croatan (disambiguation)
* Croatia (disambiguation)
* Croatoan (disambiguation)
* Hrvatski (disambiguation)
* Hrvatsko (disambiguation)
...
, and was in turn patterned on the
Czech alphabet
Czech orthography is a system of rules for proper formal writing (orthography) in Czech language, Czech. The earliest form of separate Latin script specifically designed to suit Czech was devised by Czech theologian and church reformist Jan Hus, ...
. Before the current alphabet became standard, š was, for example, written as ʃ, ʃʃ or ſ; č as tʃch, cz, tʃcz or tcz; i sometimes as y as a relic of the letter now rendered as Ы (''
yery
Yeru or Eru (Ы ы; italics: ''Ы'' ''ы''), usually called Y in modern Russian language, Russian or Yery or Ery historically and in modern Church Slavonic, is a letter in the Cyrillic script. It represents the close central unround ...
'') in modern Russian; j as y; l as ll; v as w; ž as ʃ, ʃʃ or ʃz.
In the old alphabet used by most distinguished writers, the
Bohorič alphabet (''bohoričica''), developed by
Adam Bohorič, the characters č, š and ž would be spelt as zh, ſh and sh respectively, and c, s and z would be spelt as z, ſ and s respectively. To remedy this, so that there was a one-to-one correspondence between sounds and letters,
Jernej Kopitar
Jernej Kopitar, also known as Bartholomeus Kopitar (21 August 1780 – 11 August 1844), was a Slovene linguist and philologist working in Vienna. He also worked as the Imperial censor for Slovene literature in Vienna. He is perhaps best known ...
urged the development of a new alphabet.
In 1825,
Franc Serafin Metelko proposed his version of the alphabet (the
Metelko alphabet, ''metelčica''). However, it was banned in 1833 in favour of the Bohorič alphabet after the so-called "Suit of the Letters" (''Črkarska pravda'') (1830–1833), which was won by
France Prešeren
France Prešeren () (3 December 1800 – 8 February 1849) was a 19th-century Romantic Slovene poet whose poems have been translated into many languages. and
Matija Čop. Another alphabet, the
Dajnko alphabet (''dajnčica''), was developed by
Peter Dajnko in 1824, but did not catch on as widely as the Metelko alphabet; it was banned in 1838 because it mixed Latin and Cyrillic characters, which was seen as a poor way to handle missing characters.
Gaj's Latin alphabet
Gaj's Latin alphabet ( sh-Latn-Cyrl, Gajeva latinica, separator=" / ", Гајева латиница}, ), also known as ( sr-Cyrl, абецеда, ) or ( sr-Cyrl, гајица, link=no, ), is the form of the Latin script used for writing all ...
(''gajica'') was adopted afterwards, although it still fails to distinguish all the phonemes of Slovene.
Computer encoding
The preferred
character encoding
Character encoding is the process of assigning numbers to graphical character (computing), characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using computers. The numerical v ...
s (writing codes) for Slovene texts are
UTF-8
UTF-8 is a character encoding standard used for electronic communication. Defined by the Unicode Standard, the name is derived from ''Unicode Transformation Format 8-bit''. Almost every webpage is transmitted as UTF-8.
UTF-8 supports all 1,112,0 ...
(
Unicode
Unicode or ''The Unicode Standard'' or TUS is a character encoding standard maintained by the Unicode Consortium designed to support the use of text in all of the world's writing systems that can be digitized. Version 16.0 defines 154,998 Char ...
),
UTF-16
UTF-16 (16-bit Unicode Transformation Format) is a character encoding that supports all 1,112,064 valid code points of Unicode. The encoding is variable-length as code points are encoded with one or two ''code units''. UTF-16 arose from an earli ...
, and
ISO/IEC 8859-2
ISO/IEC 8859-2:1999, ''Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. I ...
(Latin-2), which generally supports Central and Eastern European languages that are written in the
Latin script
The Latin script, also known as the Roman script, is a writing system based on the letters of the classical Latin alphabet, derived from a form of the Greek alphabet which was in use in the ancient Greek city of Cumae in Magna Graecia. The Gree ...
.
In the original
ASCII
ASCII ( ), an acronym for American Standard Code for Information Interchange, is a character encoding standard for representing a particular set of 95 (English language focused) printable character, printable and 33 control character, control c ...
frame of 1 to 126 characters one can find these examples of writing text in Slovene:
: a, b, c, *c, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, *s, t, u, v, z, *z
: a, b, c, "c, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, "s, t, u, v, z, "z
: a, b, c, c(, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, s(, t, u, v, z, z(
: a, b, c, c^, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, s^, t, u, v, z, z^
: a, b, c, cx, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, sx, t, u, v, z, zx
In
ISO/IEC 8859-1
ISO/IEC 8859-1:1998, ''Information technology—8-bit single-byte coded graphic character sets—Part 1: Latin alphabet No. 1'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987 ...
(Latin-1) typical workarounds for missing characters Č (č), Š (š), and Ž (ž) can be C~ (c~), S~ (s~), Z~ (z~) or similar as for ASCII encoding.
For usage under
DOS
DOS (, ) is a family of disk-based operating systems for IBM PC compatible computers. The DOS family primarily consists of IBM PC DOS and a rebranded version, Microsoft's MS-DOS, both of which were introduced in 1981. Later compatible syste ...
and
Microsoft Windows
Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
also
code page
In computing, a code page is a character encoding and as such it is a specific association of a set of printable character (computing), characters and control characters with unique numbers. Typically each number represents the binary value in a s ...
s 852 and
Windows-1250
Windows-1250 is a code page used under Microsoft Windows to represent texts in Central European and Eastern European languages that use the Latin script. It is primarily used by Czech. It is also used for Polish (as can Windows-1257), Slovak, H ...
respectively fully supported Slovene alphabet.
In
TeX
Tex, TeX, TEX, may refer to:
People and fictional characters
* Tex (nickname), a list of people and fictional characters with the nickname
* Tex Earnhardt (1930–2020), U.S. businessman
* Joe Tex (1933–1982), stage name of American soul singer ...
notation, č, š and ž become \v c, \v s, \v z, \v, \v, \v or in their macro versions, "c, "s and "z, or in other representations as \~, \{, \' for lowercase and \^, \
, \@ for uppercase.
The IETF language tag">/nowiki>, \@ for uppercase.
The IETF language tags have assigned variants to the different orthographies of Slovene:
* (Bohoric alphabet)
* (Dajnko alphabet)
* (Metelko alphabet)
* (Standardized Resian orthography).
See also
*Gaj's Latin alphabet
Gaj's Latin alphabet ( sh-Latn-Cyrl, Gajeva latinica, separator=" / ", Гајева латиница}, ), also known as ( sr-Cyrl, абецеда, ) or ( sr-Cyrl, гајица, link=no, ), is the form of the Latin script used for writing all ...
* Slovenian braille
* Yugoslav manual alphabet
References
External links
Slovene alphabet and pronunciation
{{DEFAULTSORT:Slovene Alphabet
Latin alphabets