Polish orthography

Polish orthography is the system of writing the Polish language. The language is written using the Polish alphabet, which derives from the Latin alphabet but includes some additional letters with diacritics. The orthography is mostly phonetic, or rather phonemic: the written letters (or combinations of them) correspond in a consistent manner to the sounds, or rather the phonemes, of spoken Polish. For detailed information about the system of phonemes, see Polish phonology.


Polish alphabet

The diacritics used in the Polish alphabet are the ''kreska'' (graphically similar to the acute accent) in the letters ''ć, ń, ó, ś, ź''; the ''kreska ukośna'' (stroke) in the letter ''ł''; the ''kropka'' (overdot) in the letter ''ż''; and the ''ogonek'' ("little tail") in the letters ''ą, ę''. There are 32 letters (or 35 letters, if the foreign letters ''q, v, x'' are included) in the Polish alphabet: 9 vowels and 23 or 26 consonants.

The letters ''q'' (named ''ku''), ''v'' (named ''fau'' or rarely ''we''), and ''x'' (named ''iks'') are used in some foreign words and commercial names. In loanwords they are often replaced by ''kw'', ''w'', and ''ks'' or ''gz'', respectively (as in ''kwarc'' "quartz", ''weranda'' "veranda", ''ekstra'' "extra", ''egzosfera'' "exosphere").

When giving the spelling of words, certain letters may be said in more emphatic ways to distinguish them from other identically pronounced characters. For example, H may be referred to as ''samo h'' ("h alone") to distinguish it from CH (''ce ha''). The letter Ż may be called "''żet'' (or ''zet'') ''z kropką''" ("Ż with a dot") to distinguish it from RZ (''er zet''). The letter U may be called ''u otwarte'' ("open u", a reference to its graphical form) or ''u zwykłe'' ("regular u"), to distinguish it from Ó, which is sometimes called ''ó zamknięte'' ("closed ó"), ''ó kreskowane'' or ''ó z kreską'' ("ó with a stroke accent"), alternatively ''o kreskowane'' or ''o z kreską'' ("o with a stroke accent").

The letter ''ó'' is a relic from hundreds of years ago, when there was a length distinction in Polish similar to that in Czech, with ''á'' and ''é'' also being common at the time. Subsequently, the length distinction disappeared and ''á'' and ''é'' were abolished, but ''ó'' came to be pronounced the same as ''u''.

Note that Polish letters with diacritics are treated as fully independent letters in alphabetical ordering (unlike in languages such as French, Spanish, and German). For example, ''być'' comes after ''bycie''. The diacritic letters also have their own sections in dictionaries (words beginning with ''ć'' are not usually listed under ''c''). However, there are no regular words that begin with ''ą'' or ''ń''.


Digraphs

Polish additionally uses the digraphs ch, cz, dz, dź, dż, rz, and sz. Combinations of certain consonants with the letter ''i'' before a vowel can be considered digraphs: ci as a positional variant of ć, si as a positional variant of ś, zi as a positional variant of ź, and ni as a positional variant of ń (but see a special remark on ni below); and there is also one trigraph, dzi, as a positional variant of dź. These are not given any special treatment in alphabetical ordering. For example, ''ch'' is treated simply as ''c'' followed by ''h'', and not as a single letter as in Czech or Slovak (e.g. ''Chojnice'' only has its first letter capitalised, and is sorted after ''Canki'' and before ''Cieszyn'').
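
The ordering rules above (diacritic letters as independent letters, digraphs as plain letter sequences) can be sketched as a per-letter sort key. This is a minimal illustration, not a full collation algorithm; the name `polish_sort_key` and the placement of ''q, v, x'' at their usual Latin positions are assumptions made for the example.

```python
# Polish alphabet in dictionary order. The foreign letters q, v, x are
# slotted in at their usual Latin positions (an assumption of this sketch).
POLISH_ALPHABET = "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż"
_RANK = {letter: i for i, letter in enumerate(POLISH_ALPHABET)}


def polish_sort_key(word: str) -> list:
    """Rank each letter separately; unknown characters sort last.

    Digraphs such as "ch" get no special treatment: they are ranked
    as "c" followed by "h".
    """
    return [_RANK.get(ch, len(POLISH_ALPHABET)) for ch in word.lower()]


words = ["mama", "łoś", "lato", "być", "bycie"]
print(sorted(words, key=polish_sort_key))

towns = ["Cieszyn", "Chojnice", "Canki"]
print(sorted(towns, key=polish_sort_key))
```

A plain code-point sort would place ''łoś'' after ''mama'', because ''ł'' has a higher code point than every unaccented letter; the explicit rank table avoids that.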


Spelling rules

See below for rules regarding spelling of alveolo-palatal consonants. ''H'' may be glottal in a small number of dialects. Rarely, ''rz'' is not a digraph and represents two separate sounds:
* in various forms of the verb ''zamarzać'' – "to freeze"
* in various forms of the verb ''mierzić'' – "to disgust"
* in the place name Murzasichle
* in borrowings, for example ''erzac'' (from German ''Ersatz''), ''Tarzan''


Voicing and devoicing

Voiced consonant letters frequently come to represent voiceless sounds (as shown in the above tables). This is due to the neutralization that occurs at the end of words and in certain consonant clusters; for example, the ''b'' in ''klub'' ("club") is pronounced like a ''p'', and the ''rz'' in ''prze-'' sounds like ''sz''. Less frequently, voiceless consonant letters can represent voiced sounds; for example, the ''k'' in ''także'' ("also") is pronounced like a ''g''. The conditions for this neutralization are described under ''Voicing and devoicing'' in the article on Polish phonology.
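
Word-final devoicing can be illustrated with a small respelling helper. This is a rough sketch only: the function name `devoice_final` and the pairing table are assumptions for the example, it looks at the last consonant spelling only, and it ignores devoicing inside clusters and across word boundaries.

```python
# Voiced spellings paired with the voiceless spellings they sound like
# at the end of a word. Multi-letter spellings come first so that "rz"
# is not misread as "r" followed by "z".
FINAL_DEVOICING = [
    ("rz", "sz"), ("dż", "cz"), ("dź", "ć"), ("dz", "c"),
    ("b", "p"), ("d", "t"), ("g", "k"), ("w", "f"),
    ("z", "s"), ("ż", "sz"), ("ź", "ś"),
]


def devoice_final(word: str) -> str:
    """Respell the final consonant of a word the way it is pronounced."""
    for voiced, voiceless in FINAL_DEVOICING:
        if word.endswith(voiced):
            return word[: -len(voiced)] + voiceless
    return word  # ends in a vowel, a sonorant, or an already voiceless sound


print(devoice_final("klub"))    # the b is pronounced like p
print(devoice_final("lekarz"))  # the rz is pronounced like sz
```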


Palatal and palatalized consonants

The spelling rule for the alveolo-palatal sounds /ɕ/, /ʑ/, /tɕ/, /dʑ/ and /ɲ/ is as follows: before the vowel ''i'' the plain letters ''s, z, c, dz, n'' are used; before other vowels the combinations ''si, zi, ci, dzi, ni'' are used; when not followed by a vowel the diacritic forms ''ś, ź, ć, dź, ń'' are used. For example, the ''s'' in ''siwy'' ("grey-haired"), the ''si'' in ''siarka'' ("sulphur") and the ''ś'' in ''święty'' ("holy") all represent the sound /ɕ/.

Special attention should be paid to ''n'' before ''i'' plus a vowel. In words of foreign origin the ''i'' causes the palatalization of the preceding consonant ''n'' to /ɲ/, and it is pronounced as /j/. This situation occurs when the corresponding genitive form ends in ''-nii'', pronounced as /ɲji/, not in ''-ni'', pronounced as /ɲi/ (which is the situation typical of words of Polish origin). For examples, see the table in the next section.

According to one system, similar principles apply to the palatalized consonants /kʲ/, /ɡʲ/ and /xʲ/, except that these can only occur before vowels. The spellings are thus ''k, g, (c)h'' before ''i'', and ''ki, gi, (c)hi'' otherwise. For example, the ''k'' in ''kim'' ("whom", instr.) and the ''ki'' in ''kiedy'' both represent /kʲ/. In the system without the palatalized velars, they are analyzed as /k/, /ɡ/ and /x/ before /i/ and /kj/, /ɡj/ and /xj/ before other vowels.
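
The three-way spelling rule for the alveolo-palatal sounds can be written as a short function. This is a minimal sketch: the name `spell_soft` and its interface (the sound given in its diacritic form, followed by the rest of the word) are assumptions for the example, and the special case of foreign words with ''n'' before ''i'' plus a vowel is not modelled.

```python
# Diacritic (pre-consonantal / final) form -> plain letter(s).
SOFT_TO_PLAIN = {"ś": "s", "ź": "z", "ć": "c", "dź": "dz", "ń": "n"}
VOWELS = set("aąeęioóuy")


def spell_soft(consonant: str, rest: str) -> str:
    """Spell an alveolo-palatal consonant followed by `rest` of the word."""
    plain = SOFT_TO_PLAIN[consonant]
    if rest[:1] == "i":
        return plain + rest            # before i: plain letter
    if rest[:1] in VOWELS:
        return plain + "i" + rest      # before another vowel: letter + i
    return consonant + rest            # otherwise: diacritic form


print(spell_soft("ś", "iwy"))    # siwy
print(spell_soft("ś", "arka"))   # siarka
print(spell_soft("ś", "więty"))  # święty
```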


Other issues with ''i'' and ''j''

Except in the cases mentioned in the previous paragraph, the letter ''i'', if followed by another vowel in the same word, usually represents /j/, but it also has a palatalizing effect on the previous consonant. For example, ''pies'' ("dog") is pronounced [pʲjɛs] (/pjɛs/). Some words with ''n'' before ''i'' plus a vowel also follow this pattern (see below). In fact ''i'' is the usual spelling of /j/ between a preceding consonant and a following vowel. The letter ''j'' normally appears in this position only after ''c'', ''s'' and ''z'', if the palatalization effect described above has to be avoided (as in ''presja'' "pressure", ''Azja'' "Asia", ''lekcja'' "lesson", and the common suffixes ''-cja'' "-tion", ''-zja'' "-sion": ''stacja'' "station", ''wizja'' "vision"). The letter ''j'' after consonants is also used in the concatenation of two words if the second word in the pair starts with ''j'', e.g. ''wjazd'' "entrance" originates from ''w'' + ''jazd''. The pronunciation of the sequence ''wja'' (in ''wjazd'') is the same as the pronunciation of ''wia'' (in ''wiadro'' "bucket").

The ending ''-ii'', which appears in the inflected forms of some nouns of foreign origin that have ''-ia'' in the nominative case (always after ''g'', ''k'', ''l'', and ''r''; sometimes after ''m'', ''n'', and other consonants), is pronounced as [ji], with the palatalization of the preceding consonant. For example, ''dalii'' (genitive of ''dalia'' "dahlia"), ''Bułgarii'' (genitive of ''Bułgaria'' "Bulgaria"), ''chemii'' (genitive of ''chemia'' "chemistry"), ''religii'' (genitive of ''religia'' "religion"), ''amfibii'' (genitive of ''amfibia'' "amphibian"). The common pronunciation is [i]. This is why children commonly misspell such forms, writing ''-i'' in inflected forms such as ''armii'' and ''Danii'', or hypercorrectly writing ''ziemii'' instead of ''ziemi'' (words of Polish origin do not have the ending ''-ii'' but simple ''-i'', e.g. ''ziemi'', genitive of ''ziemia'').

In some rare cases, however, when the consonant is preceded by another consonant, ''-ii'' may be pronounced as [i], but the preceding consonant is still palatalized; for example, ''Anglii'' (genitive of ''Anglia'' "England") is pronounced [ˈaŋɡlʲi]. (The spelling ''Angli'', very frequently met with on the Internet, is simply an error in orthography caused by this pronunciation.)

A special situation applies to ''n'': it has the full palatalization to [ɲ] before ''-ii'', which is pronounced as [ji], and such a situation occurs only when the corresponding nominative form in ''-nia'' is pronounced as [ɲja], not as [ɲa]. For example (pay attention to the upper- and lower-case letters):

The ending ''-ji'' is always pronounced as [ji]. It appears only after ''c'', ''s'' and ''z''. Pronouncing it as a simple [i] is considered a pronunciation error. For example, ''presji'' (genitive of ''presja'' "pressure") is [ˈprɛsji]; ''poezji'' (genitive of ''poezja'' "poetry") is [pɔˈɛzji]; ''racji'' (genitive of ''racja'' "reason") is [ˈratsji].


Nasal vowels

The letters ''ą'' and ''ę'', when followed by plosives and affricates, represent an oral vowel followed by a nasal consonant, rather than a nasal vowel. For example, ''ą'' in ''dąb'' ("oak") is pronounced [ɔm], and ''ę'' in ''tęcza'' ("rainbow") is pronounced [ɛn] (the nasal assimilates with the following consonant). When followed by ''l'' or ''ł'', and in the case of ''ę'', always at the end of words, these letters are pronounced as just [ɔ] or [ɛ].
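
The behaviour of the nasal letters can be approximated in code. This is a hedged sketch: the function name `nasal_letter_value`, the respelling notation (''o''/''e'' plus ''m'', ''n'', ''ń'' or ''ŋ''), and the exact place-of-articulation table are assumptions made for illustration; the text above only states that the nasal assimilates to the following consonant.

```python
ORAL = {"ą": "o", "ę": "e"}


def nasal_letter_value(word: str, index: int) -> str:
    """Return a rough respelling of the ą/ę found at word[index]."""
    letter = word[index]
    oral = ORAL[letter]
    rest = word[index + 1:]
    if not rest:
        # Word-final ę is denasalized; word-final ą is left as written.
        return oral if letter == "ę" else letter
    if rest[0] in "lł":
        return oral                      # plain oral vowel before l, ł
    if rest[0] in "pb":
        return oral + "m"                # labial plosive
    if rest.startswith(("ć", "ci", "dź", "dzi")):
        return oral + "ń"                # alveolo-palatal affricate
    if rest[0] in "tdc":
        return oral + "n"                # t, d, c, cz, dz, dż
    if rest[0] in "kg":
        return oral + "ŋ"                # velar plosive
    return letter                        # nasal vowel kept elsewhere


print(nasal_letter_value("dąb", 1))    # om
print(nasal_letter_value("tęcza", 1))  # en
```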


Homophonic spellings

Apart from the cases in the sections above, there are three sounds in Polish that can be spelt in two different ways, depending on the word. These result from historical sound changes. The correct spelling can often be deduced from the spelling of other morphological forms of the word or cognates in Polish or in other Slavic languages.
* /x/ can be spelt either ''h'' or ''ch''.
** ''h'' only occurs in loanwords; however, many of them have been nativized and are not perceived as loanwords. ''h'' is used:
*** when cognate words have the letter ''g'', ''ż'' or ''z'', e.g.:
***: ''wahadło – waga''
***: ''druh – drużyna''
***: ''błahy – błazen''
*** when the same letter is used in the language from which the word was borrowed, e.g. Greek prefixes ''hekto-, hetero-, homo-, hipo-, hiper-, hydro-'', also ''honor, historia, herbata'', etc.
** ''ch'' is used:
*** in all native words, e.g. ''chyba, chrust, chrapać, chować, chcieć''
*** when the same digraph is used in the language from which the word was borrowed, e.g. ''chór, echo, charakter, chronologia'', etc.
* /u/ can be spelt ''u'' or ''ó''; the spelling ''ó'' indicates that the sound developed from the historical long /oː/.
** ''u'' is used:
*** usually at the beginning of a word (except for ''ósemka, ósmy, ów, ówczesny, ówdzie'')
*** always at the end of a word
*** in the endings ''-uch, -ucha, -uchna, -uchny, -uga, -ula, -ulec, -ulek, -uleńka, -ulka, -ulo, -un, -unek, -uni, -unia, -unio, -ur, -us, -usi, -usieńki, -usia, -uszek, -uszka, -uszko, -uś, -utki''
** ''ó'' is used:
*** when cognate words or other morphological forms have the letter ''o'', ''e'' or ''a'', e.g.:
***: ''mróz – mrozu''
***: ''wiózł – wieźć''
***: ''skrócić – skracać''
*** in the endings ''-ów, -ówka, -ówna'' (except for ''zasuwka, skuwka, wsuwka'')
* /ʐ/ can be spelt either ''ż'' or ''rz''; the spelling ''rz'' indicates that the sound developed from /r̝/ (cf. Czech ''ř'').
** ''ż'' is used:
*** when cognate words or other morphological forms have the letter/digraph ''g'', ''dz'', ''h'', ''z'', ''ź'', ''s'', e.g.:
***: ''może – mogę''
***: ''mosiężny – mosiądz''
***: ''drużyna – druh''
***: ''każe – kazać''
***: ''wożę – woźnica''
***: ''bliżej – blisko''
*** in the particle ''że'', e.g. ''skądże, tenże, także''
*** after ''l'', ''ł'', ''r'', e.g.:
***: ''lżej''
***: ''łże''
***: ''rżysko''
*** in loanwords, especially from French, e.g.:
***: ''rewanż''
***: ''żakiet''
***: ''garaż''
*** when cognates in other Slavic languages contain the sound /ʐ/ or /ʒ/, e.g. ''żuraw'' – Russian журавль
** ''rz'' is used:
*** when cognate words or other morphological forms have the letter ''r'', e.g. ''morze – morski, karze – kara''
*** usually after ''p'', ''b'', ''t'', ''d'', ''k'', ''g'', ''ch'', ''j'', ''w'', e.g.:
***: ''przygoda''
***: ''brzeg''
***: ''trzy''
***: ''drzewo''
***: ''krzywy''
***: ''grzywa''
***: ''chrzest''
***: ''ujrzeć''
***: ''wrzeć''
*** when cognates in other Slavic languages contain the sound /r/ or /rʲ/, e.g. ''rzeka'' – Russian река
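
The purely positional part of the ''ż'' versus ''rz'' rules can be sketched as a lookup on the preceding letters. The function name `rz_or_z_dot` is an assumption for the example, and the result is only a default guess: the rule after ''p, b, t, d, k, g, ch, j, w'' holds "usually", and the cognate-based rules need a dictionary that this sketch does not have.

```python
from typing import Optional

RZ_AFTER = ("ch", "p", "b", "t", "d", "k", "g", "j", "w")
Z_DOT_AFTER = ("l", "ł", "r")


def rz_or_z_dot(preceding: str) -> Optional[str]:
    """Guess the spelling of the ż/rz sound from the letters before it.

    Returns "ż", "rz", or None when position alone does not decide and
    cognates or other word forms would have to be consulted.
    """
    if preceding.endswith(Z_DOT_AFTER):
        return "ż"
    if preceding.endswith(RZ_AFTER):
        return "rz"
    return None


print("p" + rz_or_z_dot("p") + "ygoda")   # przygoda
print("l" + rz_or_z_dot("l") + "ej")      # lżej
print(rz_or_z_dot("mo"))                  # None: może and morze both exist
```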


Other points

The letter ''u'' represents /w/ in the digraphs ''au'' and ''eu'' in loanwords, for example ''autor, Europa''; but not in native words, like ''nauka'', pronounced [naˈuka]. There are certain clusters where a written consonant would not normally be pronounced. For example, the ''ł'' in the words ''mógł'' ("could") and ''jabłko'' ("apple") is omitted in ordinary speech.


Capitalization

Names are generally capitalized in Polish as in English. Polish does not capitalize the months and days of the week, nor adjectives and other forms derived from proper nouns (for example, ''angielski'' "English"). Titles such as ''pan'' ("Mr"), ''pani'' ("Mrs/Ms"), ''lekarz'' ("doctor"), etc. and their abbreviations are not capitalized, except in written polite address. Second-person pronouns are traditionally capitalized in formal writing (e.g. letters or official emails); so may be other words used to refer to someone directly in a formal setting, like ''Czytelnik'' ("reader", in newspapers or books). Third-person pronouns are capitalized to show reverence, most often in a sacred context.


Punctuation

Polish punctuation is similar to that of English. However, there are more rigid rules concerning the use of commas: subordinate clauses are almost always marked off with a comma, while it is normally considered incorrect to use a comma before a coordinating conjunction with the meaning "and" (''i'', ''a'' or ''oraz'').

Abbreviations (but not acronyms or initialisms) are followed by a period when they end with a letter other than the one which ends the full word. For example, ''dr'' has no period when it stands for ''doktor'', but takes one when it stands for an inflected form such as ''doktora'', and ''prof.'' has a period because it comes from ''profesor'' (professor).

Apostrophes are used to mark the elision of the final sound of foreign words not pronounced before Polish inflectional endings, as in ''Harry'ego'' (genitive of ''Harry''; the final /i/ is elided in the genitive). However, the apostrophe is often erroneously used to separate a loanword stem from any inflectional ending, for example, ''*John'a'', which should be ''Johna'' (genitive of ''John''; no sound is elided).

Quotation marks are used in different ways: either „ordinary Polish quotes” or «French quotes» (without space) for first level, and ‚single Polish quotes’ or «French quotes» for second level, which gives three styles of nested quotes:
# „Quote ‚inside’ quote”
# „Quote «inside» quote”
# «Quote ‚inside’ quote»
Some older prints have used „such Polish quotes“.
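
The period rule for abbreviations reduces to comparing final letters, which the following minimal sketch shows. The function name `abbreviate` and its two-argument interface are assumptions made for the example; it covers only abbreviations, not acronyms or initialisms.

```python
def abbreviate(full_form: str, short: str) -> str:
    """Add a period unless the abbreviation ends with the same letter
    as the full (possibly inflected) word it stands for."""
    return short if short[-1] == full_form[-1] else short + "."


print(abbreviate("doktor", "dr"))      # dr
print(abbreviate("doktora", "dr"))     # dr.
print(abbreviate("profesor", "prof"))  # prof.
```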


History

Poles adopted the Latin alphabet in the 12th century. However, that alphabet was ill-equipped to represent certain Polish sounds, such as the palatal consonants and nasal vowels. Consequently, Polish spelling in the Middle Ages was highly inconsistent, as different writers used different systems to represent these sounds. For example, in early documents the letter ''c'' could signify the sounds now written ''c, cz, k'', while the letter ''z'' was used for the sounds now written ''z, ż, ś, ź''. Writers soon began to experiment with digraphs (combinations of letters), new letters (φ and ſ, no longer used), and eventually diacritics.

The Polish alphabet was one of two major forms of Latin-based orthography developed for Slavic languages, the other being Czech orthography, characterized by carons (háčeks), as in the letter ''č''. The other major Slavic languages which are now written in Latin-based alphabets (Slovak, Slovene, and Serbo-Croatian) use systems similar to the Czech. However, a Polish-based orthography is used for Kashubian and usually for Silesian, while the Sorbian languages use elements of both systems.


Computer encoding

There are several different systems for encoding the Polish alphabet for computers. All letters of the Polish alphabet are included in Unicode, and thus Unicode-based encodings such as UTF-8 and UTF-16 can be used. The Polish alphabet is completely included in the Basic Multilingual Plane of Unicode. ISO 8859-2 (Latin-2), ISO 8859-13 (Latin-7), ISO 8859-16 (Latin-10) and Windows-1250 are popular 8-bit encodings that support the Polish alphabet.

The Polish letters which are not present in the English alphabet use the following HTML character entities and Unicode codepoints:

For other encodings, see the following table. Numbers in the table are hexadecimal.

A common test sentence containing all the Polish diacritic letters is the nonsensical "''Zażółć gęślą jaźń''".
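
The code points and byte values of the Polish diacritic letters can be computed rather than looked up. The sketch below uses Python's standard codecs (`iso8859_2`, `iso8859_13`, `iso8859_16`, `cp1250`); the helper name `describe` and the choice of numeric HTML references instead of named entities are assumptions made for the example.

```python
POLISH_LETTERS = "ĄąĆćĘęŁłŃńÓóŚśŹźŻż"
EIGHT_BIT = ["iso8859_2", "iso8859_13", "iso8859_16", "cp1250"]


def describe(letter: str) -> dict:
    """Collect the code point, a numeric HTML reference, and the
    hexadecimal byte values of one letter in several encodings."""
    row = {
        "codepoint": f"U+{ord(letter):04X}",
        "html": f"&#x{ord(letter):X};",
        "utf-8": letter.encode("utf-8").hex().upper(),
    }
    for name in EIGHT_BIT:
        row[name] = letter.encode(name).hex().upper()
    return row


for letter in POLISH_LETTERS:
    print(letter, describe(letter))

# The test sentence contains every lowercase diacritic letter once or more.
PANGRAM = "Zażółć gęślą jaźń"
print(sorted({ch for ch in PANGRAM if ord(ch) > 127}))
```

Each letter occupies a single byte in the four 8-bit encodings and two bytes in UTF-8, which is why the 8-bit columns show two hexadecimal digits and the UTF-8 column shows four.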


See also

* Polish Braille
* Polish manual alphabet


External links


Polish Pronunciation Audio and Grammar Charts

Online editor for typing Polish characters