Polish orthography is the system of writing the Polish language. The language is written using the Polish alphabet, which derives from the Latin alphabet but includes some additional letters with diacritics. The orthography is mostly phonetic, or rather phonemic: the written letters (or combinations of them) correspond in a consistent manner to the sounds, or rather the phonemes, of spoken Polish. For detailed information about the system of phonemes, see Polish phonology.


Polish alphabet

The diacritics used in the Polish alphabet are the ''kreska'' (graphically similar to the acute accent) in the letters ''ć, ń, ó, ś, ź''; the ''kreska ukośna'' (stroke) in the letter ''ł''; the ''kropka'' (overdot) in the letter ''ż''; and the ''ogonek'' ("little tail") in the letters ''ą, ę''. There are 32 letters (or 35 letters, if the foreign letters ''q, v, x'' are included) in the Polish alphabet: 9 vowels and 23 or 26 consonants.

The letters ''q'' (named ''ku''), ''v'' (named ''fau'' or rarely ''we''), and ''x'' (named ''iks'') are used in some foreign words and commercial names. In loanwords they are often replaced by ''kw'', ''w'', and ''ks'' or ''gz'', respectively (as in ''kwarc'' "quartz", ''weranda'' "veranda", ''ekstra'' "extra", ''egzosfera'' "exosphere").

When giving the spelling of words, certain letters may be said in more emphatic ways to distinguish them from other identically pronounced characters. For example, H may be referred to as ''samo h'' ("h alone") to distinguish it from CH ''(ce ha)''. The letter Ż may be called "''żet'' (or ''zet'') ''z kropką''" ("Ż with a dot") to distinguish it from RZ ''(er zet)''. The letter U may be called ''u otwarte'' ("open u", a reference to its graphical form) or ''u zwykłe'' ("regular u"), to distinguish it from Ó, which is sometimes called ''ó zamknięte'' ("closed ó"), ''ó kreskowane'' or ''ó z kreską'' ("ó with a stroke accent"), alternatively ''o kreskowane'' or ''o z kreską'' ("o with a stroke accent").

The letter ''ó'' is a relic from hundreds of years ago, when there was a length distinction in Polish similar to that in Czech, with ''á'' and ''é'' also being common at the time. Subsequently, the length distinction disappeared and ''á'' and ''é'' were abolished, but ''ó'' came to be pronounced the same as ''u''.

Polish letters with diacritics are treated as fully independent letters in alphabetical ordering (unlike in languages such as French, Spanish, and German). For example, ''być'' comes after ''bycie''. The diacritic letters also have their own sections in dictionaries (words beginning with ''ć'' are not usually listed under ''c''). However, there are no regular words that begin with ''ą'' or ''ń''.
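The ordering rule above can be sketched in a few lines of Python. This is a minimal illustration, not a full collation algorithm: the names `POLISH_ORDER` and `polish_sort_key` are made up for this example, and the placement of the foreign letters ''q, v, x'' at their usual Latin positions is an assumption.

```python
# Assumed letter order: the 32 Polish letters, with the foreign letters
# q, v, x placed at their usual Latin positions (35 letters in total).
POLISH_ORDER = "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż"
RANK = {letter: index for index, letter in enumerate(POLISH_ORDER)}


def polish_sort_key(word: str) -> list[int]:
    """Map each character to its rank; unknown characters sort last."""
    return [RANK.get(ch, len(POLISH_ORDER)) for ch in word.lower()]


words = ["być", "bycie", "cześć", "ćma", "cud"]
print(sorted(words, key=polish_sort_key))
# ['bycie', 'być', 'cud', 'cześć', 'ćma']
```

Because ''ć'' has its own rank directly after ''c'', ''być'' sorts after ''bycie'' and ''ćma'' sorts after every word beginning with ''c''. Production code would more likely rely on a locale-aware collation library than on a hand-written table.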


Digraphs

Polish additionally uses the digraphs ''ch'', ''cz'', ''dz'', ''dź'', ''dż'', ''rz'', and ''sz''. Combinations of certain consonants with the letter ''i'' before a vowel can be considered digraphs: ''ci'' as a positional variant of ''ć'', ''si'' as a positional variant of ''ś'', ''zi'' as a positional variant of ''ź'', and ''ni'' as a positional variant of ''ń'' (but see the special remark on ''ni'' below); and there is also one trigraph, ''dzi'', as a positional variant of ''dź''. These are not given any special treatment in alphabetical ordering. For example, ''ch'' is treated simply as ''c'' followed by ''h'', and not as a single letter as in Czech or Slovak (e.g. ''Chojnice'' only has its first letter capitalised, and is sorted after ''Canki'' and before ''Cieszyn'').
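As a rough illustration of how the digraphs relate to the written string, the sketch below splits a word into spelling units, treating the seven digraphs ''ch, cz, dz, dź, dż, rz, sz'' as single units. The names `DIGRAPHS` and `split_graphemes` are hypothetical, and the approach is deliberately naive: it always treats ''rz'' as a digraph, which is wrong for the rare words in which ''r'' and ''z'' are separate sounds.

```python
DIGRAPHS = {"ch", "cz", "dz", "dź", "dż", "rz", "sz"}


def split_graphemes(word: str) -> list[str]:
    """Split a word into letters, keeping each digraph together as one unit."""
    units = []
    lower = word.lower()
    i = 0
    while i < len(word):
        if lower[i:i + 2] in DIGRAPHS:
            units.append(word[i:i + 2])
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


print(split_graphemes("chrząszcz"))
# ['ch', 'rz', 'ą', 'sz', 'cz']

# Sorting ignores digraphs entirely: plain letter-by-letter comparison
# already places Chojnice between Canki and Cieszyn.
print(sorted(["Cieszyn", "Chojnice", "Canki"]))
# ['Canki', 'Chojnice', 'Cieszyn']
```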


Spelling rules

See below for rules regarding spelling of alveolo-palatal consonants. ''H'' may be glottal in a small number of dialects.

Rarely, ''rz'' is not a digraph and represents two separate sounds:
* in various forms of the verb ''zamarzać'' – "to freeze"
* in various forms of the verb ''mierzić'' – "to disgust"
* in the place name Murzasichle
* in borrowings, for example ''erzac'' (from German ''Ersatz''), ''Tarzan''


Voicing and devoicing

Voiced consonant letters frequently come to represent voiceless sounds (as shown in the above tables). This is due to the neutralization that occurs at the end of words and in certain consonant clusters; for example, the ''b'' in ''klub'' ("club") is pronounced like a ''p'', and the ''rz'' in ''prze-'' sounds like ''sz''. Less frequently, voiceless consonant letters can represent voiced sounds; for example, the ''k'' in ''także'' ("also") is pronounced like a ''g''. The conditions for this neutralization are described under ''Voicing and devoicing'' in the article on Polish phonology.
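Word-final devoicing can be sketched as a simple respelling function. This is a simplified illustration under stated assumptions: the mapping table `FINAL_DEVOICING` and the function `devoice_final` are invented for this example, only the last consonant (or digraph) of the word is handled, and devoicing inside consonant clusters is not modelled at all.

```python
# Voiced spelling -> voiceless respelling. Digraphs are listed before
# single letters so that "dz" or "rz" is matched before a bare "z".
FINAL_DEVOICING = [
    ("dż", "cz"), ("dź", "ć"), ("dz", "c"), ("rz", "sz"),
    ("b", "p"), ("d", "t"), ("g", "k"), ("w", "f"),
    ("z", "s"), ("ż", "sz"), ("ź", "ś"),
]


def devoice_final(word: str) -> str:
    """Respell a word so that its final consonant is written as voiceless."""
    for voiced, voiceless in FINAL_DEVOICING:
        if word.endswith(voiced):
            return word[: len(word) - len(voiced)] + voiceless
    return word


print(devoice_final("klub"))    # klup
print(devoice_final("lekarz"))  # lekasz
print(devoice_final("kot"))     # kot (already voiceless, unchanged)
```

The output is only an approximate respelling in ordinary Polish letters, meant to show which written consonants stop matching their pronunciation at the end of a word.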


Palatal and palatalized consonants

The spelling rule for the alveolo-palatal sounds /ɕ/, /ʑ/, /tɕ/, /dʑ/ and /ɲ/ is as follows: before the vowel ''i'' the plain letters ''s, z, c, dz, n'' are used; before other vowels the combinations ''si, zi, ci, dzi, ni'' are used; when not followed by a vowel the diacritic forms ''ś, ź, ć, dź, ń'' are used. For example, the ''s'' in ''siwy'' ("grey-haired"), the ''si'' in ''siarka'' ("sulphur") and the ''ś'' in ''święty'' ("holy") all represent the sound /ɕ/.

Special attention should be paid to ''n'' before ''i'' plus a vowel. In words of foreign origin the ''i'' causes the palatalization of the preceding consonant ''n'' to /ɲ/, and it is itself pronounced as /j/. This situation occurs when the corresponding genitive form ends in ''-nii'', pronounced as /ɲji/, and not in ''-ni'', pronounced as /ɲi/ (which is the situation typical of words of Polish origin). For examples, see the next section.

According to one system, similar principles apply to the palatalized consonants /kʲ/, /ɡʲ/ and /xʲ/, except that these can only occur before vowels. The spellings are thus ''k, g, (c)h'' before ''i'', and ''ki, gi, (c)hi'' otherwise. For example, the ''k'' in ''kim'' ("whom", instr.) and the ''ki'' in ''kiedy'' both represent /kʲ/. In the system without the palatalized velars, they are analyzed as /k/, /ɡ/ and /x/ before /i/ and /kj/, /ɡj/ and /xj/ before other vowels.
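The three-way rule for the alveolo-palatal consonants (plain letter before ''i'', letter plus ''i'' before other vowels, diacritic form elsewhere) lends itself to a short sketch. The names `SOFT`, `VOWELS` and `spell_soft` are hypothetical, and the function assumes native words only; the foreign-word pattern described for ''ni'' is not covered.

```python
VOWELS = set("aąeęioóuy")
# Diacritic form -> plain form; the combination form is the plain form + "i".
SOFT = {"ś": "s", "ź": "z", "ć": "c", "dź": "dz", "ń": "n"}


def spell_soft(consonant: str, rest: str) -> str:
    """Spell an alveolo-palatal consonant according to what follows it."""
    plain = SOFT[consonant]
    following = rest[:1].lower()
    if following == "i":
        return plain + rest            # before i: plain letter
    if following in VOWELS:
        return plain + "i" + rest      # before other vowels: letter + i
    return consonant + rest            # otherwise: diacritic form


print(spell_soft("ś", "iwy"))    # siwy
print(spell_soft("ś", "arka"))   # siarka
print(spell_soft("ś", "więty"))  # święty
```

All three calls start from the same sound, written ''ś'' in isolation, and produce the three spellings used in the examples above.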


Other issues with ''i'' and ''j''

Except in the cases mentioned in the previous paragraph, the letter ''i'', if followed by another vowel in the same word, usually represents /j/, but it also has a palatalizing effect on the previous consonant. For example, ''pies'' ("dog") is pronounced [pʲjɛs] (/pjɛs/). Some words with ''n'' before ''i'' plus a vowel also follow this pattern (see below). In fact ''i'' is the usual spelling of /j/ between a preceding consonant and a following vowel. The letter ''j'' normally appears in this position only after ''c'', ''s'' and ''z'', if the palatalization effect described above has to be avoided (as in ''presja'' "pressure", ''Azja'' "Asia", ''lekcja'' "lesson", and the common suffixes ''-cja'' "-tion", ''-zja'' "-sion": ''stacja'' "station", ''wizja'' "vision"). The letter ''j'' after consonants is also used in concatenation of two words if the second word in the pair starts with ''j'', e.g. ''wjazd'' "entrance" originates from ''w'' + ''jazd''. The pronunciation of the sequence ''wja'' (in ''wjazd'') is the same as the pronunciation of ''wia'' (in ''wiadro'' "bucket").

The ending ''-ii'', which appears in the inflected forms of some nouns of foreign origin that have ''-ia'' in the nominative case (always after ''g'', ''k'', ''l'', and ''r''; sometimes after ''m'', ''n'', and other consonants), is pronounced as [ji], with the palatalization of the preceding consonant. For example, ''dalii'' (genitive of ''dalia'' "dahlia"), ''Bułgarii'' (genitive of ''Bułgaria'' "Bulgaria"), ''chemii'' (genitive of ''chemia'' "chemistry"), ''religii'' (genitive of ''religia'' "religion"), ''amfibii'' (genitive of ''amfibia'' "amphibia"). The common pronunciation is [i]. This is why children commonly misspell forms such as ''armii'' and ''Danii'' with a single ''-i'', or hypercorrectly write ''ziemii'' instead of ''ziemi'' (words of Polish origin do not have the ending ''-ii'' but simple ''-i'', e.g. ''ziemi'', genitive of ''ziemia'').

In some rare cases, however, when the consonant is preceded by another consonant, ''-ii'' may be pronounced as [i], but the preceding consonant is still palatalized; for example, ''Anglii'' (genitive of ''Anglia'' "England") is pronounced [ˈaŋɡlʲi]. (The spelling ''Angli'', very frequently met with on the Internet, is simply an error in orthography, caused by this pronunciation.)

A special situation applies to ''n'': it has the full palatalization to [ɲ] before ''-ii'', which is pronounced as [ji], and such a situation occurs only when the corresponding nominative form in ''-nia'' is pronounced as [ɲja], not as [ɲa]. For example (pay attention to the upper- and lower-case letters): ''dania'' ("dishes") is pronounced with [ɲa] and has no form in ''-nii'', while ''Dania'' ("Denmark") is pronounced with [ɲja] and has the genitive ''Danii''.

The ending ''-ji'' is always pronounced as [ji]. It appears only after ''c'', ''s'' and ''z''. Pronunciation of it as a simple [i] is considered a pronunciation error. For example, ''presji'' (genitive of ''presja'' "pressure") is [ˈprɛsji]; ''poezji'' (genitive of ''poezja'' "poetry") is [pɔˈɛzji]; ''racji'' (genitive of ''racja'' "reason") is [ˈratsji].


Nasal vowels

The letters ''ą'' and ''ę'', when followed by plosives and affricates, represent an oral vowel followed by a nasal consonant, rather than a nasal vowel. For example, ''ą'' in ''dąb'' ("oak") is pronounced [ɔm], and ''ę'' in ''tęcza'' ("rainbow") is pronounced [ɛn] (the nasal assimilates with the following consonant). When followed by ''l'' or ''ł'', and in the case of ''ę'' often at the end of words, these letters are pronounced as just [ɔ] or [ɛ].
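The behaviour of the nasal letters can be approximated by a small respelling function. This is a rough sketch with several simplifying assumptions: the function name `respell_nasals` is invented here, only the single letter after the nasal vowel is inspected, palatal contexts (such as ''ć'' or ''dzi'') are not handled, word-final ''ę'' is always treated as oral, and the output is an informal respelling rather than a phonetic transcription.

```python
def respell_nasals(word: str) -> str:
    """Respell ą/ę as oral vowel + nasal consonant where the rule applies."""
    out = []
    for i, ch in enumerate(word):
        if ch not in "ąę":
            out.append(ch)
            continue
        oral = "o" if ch == "ą" else "e"
        following = word[i + 1] if i + 1 < len(word) else ""
        if following in ("p", "b"):
            out.append(oral + "m")     # labial stop: nasal becomes m
        elif following in ("t", "d", "c"):
            out.append(oral + "n")     # dental stop or affricate: n
        elif following in ("k", "g"):
            out.append(oral + "ŋ")     # velar stop: velar nasal
        elif following in ("l", "ł"):
            out.append(oral)           # before l/ł: plain oral vowel
        elif following == "" and ch == "ę":
            out.append(oral)           # word-final ę treated as oral e
        else:
            out.append(ch)             # elsewhere: keep the nasal vowel
    return "".join(out)


print(respell_nasals("dąb"))    # domb
print(respell_nasals("tęcza"))  # tencza
print(respell_nasals("wąs"))    # wąs (before a fricative, unchanged)
```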


Homophonic spellings

Apart from the cases in the sections above, there are three sounds in Polish that can be spelt in two different ways, depending on the word. These result from historical sound changes. The correct spelling can often be deduced from the spelling of other morphological forms of the word or cognates in Polish or in other Slavic languages.

* /x/ can be spelt either ''h'' or ''ch''.
** ''h'' only occurs in loanwords; however, many of them have been nativized and are not perceived as loanwords. ''h'' is used:
*** when cognate words have the letter ''g'', ''ż'' or ''z'', e.g.:
***: ''wahadło – waga''
***: ''druh – drużyna''
***: ''błahy – błazen''
*** when the same letter is used in the language from which the word was borrowed, e.g. Greek prefixes ''hekto-, hetero-, homo-, hipo-, hiper-, hydro-'', also ''honor, historia, herbata'', etc.
** ''ch'' is used:
*** in all native words, e.g. ''chyba, chrust, chrapać, chować, chcieć''
*** when the same digraph is used in the language from which the word was borrowed, e.g. ''chór, echo, charakter, chronologia'', etc.
* /u/ can be spelt ''u'' or ''ó''; the spelling ''ó'' indicates that the sound developed from the historical long /oː/.
** ''u'' is used:
*** usually at the beginning of a word (except for ''ósemka, ósmy, ów, ówczesny, ówdzie'')
*** always at the end of a word
*** in the endings ''-uch, -ucha, -uchna, -uchny, -uga, -ula, -ulec, -ulek, -uleńka, -ulka, -ulo, -un, -unek, -uni, -unia, -unio, -ur, -us, -usi, -usieńki, -usia, -uszek, -uszka, -uszko, -uś, -utki''
** ''ó'' is used:
*** when cognate words or other morphological forms have the letter ''o'', ''e'' or ''a'', e.g.:
***: ''mróz – mrozu''
***: ''wiózł – wieźć''
***: ''skrócić – skracać''
*** in the endings ''-ów, -ówka, -ówna'' (except for ''zasuwka, skuwka, wsuwka'')
* /ʐ/ can be spelt either ''ż'' or ''rz''; the spelling ''rz'' indicates that the sound developed from /r̝/ (cf. Czech ''ř'').
** ''ż'' is used:
*** when cognate words or other morphological forms have the letter/digraph ''g'', ''dz'', ''h'', ''z'', ''ź'', ''s'', e.g.:
***: ''może – mogę''
***: ''mosiężny – mosiądz''
***: ''drużyna – druh''
***: ''każe – kazać''
***: ''wożę – woźnica''
***: ''bliżej – blisko''
*** in the particle ''że'', e.g. ''skądże, tenże, także''
*** after ''l'', ''ł'', ''r'', e.g.:
***: ''lżej''
***: ''łże''
***: ''rżysko''
*** in loanwords, especially from French, e.g.:
***: ''rewanż''
***: ''żakiet''
***: ''garaż''
*** when cognates in other Slavic languages contain the sound /ʐ/ or /ʒ/, e.g. ''żuraw'' – Russian журавль
** ''rz'' is used:
*** when cognate words or other morphological forms have the letter ''r'', e.g. ''morze – morski, karze – kara''
*** usually after ''p'', ''b'', ''t'', ''d'', ''k'', ''g'', ''ch'', ''j'', ''w'', e.g.:
***: ''przygoda''
***: ''brzeg''
***: ''trzy''
***: ''drzewo''
***: ''krzywy''
***: ''grzywa''
***: ''chrzest''
***: ''ujrzeć''
***: ''wrzeć''
*** when cognates in other Slavic languages contain the sound /r/ or /rʲ/, e.g. ''rzeka'' – Russian река


Other points

The letter ''u'' represents /w/ in the digraphs ''au'' and ''eu'' in loanwords, for example ''autor'', ''Europa''; but not in native words, like ''nauka'', pronounced [naˈuka]. There are certain clusters where a written consonant would not normally be pronounced. For example, the ''ł'' in the words ''mógł'' ("could") and ''jabłko'' ("apple") is omitted in ordinary speech.


Capitalization

Names are generally capitalized in Polish as in English. Polish does not capitalize the months and days of the week, nor adjectives and other forms derived from proper nouns (for example, ''angielski'' "English"). Titles such as ''pan'' ("Mr"), ''pani'' ("Mrs/Ms"), ''lekarz'' ("doctor"), etc. and their abbreviations are not capitalized, except in written polite address. Second-person pronouns are traditionally capitalized in formal writing (e.g. letters or official emails); so may be other words used to refer to someone directly in a formal setting, like ''Czytelnik'' ("reader", in newspapers or books). Third-person pronouns are capitalized to show reverence, most often in a sacred context.


Punctuation

Polish punctuation is similar to that of English. However, there are more rigid rules concerning the use of commas: subordinate clauses are almost always marked off with a comma, while it is normally considered incorrect to use a comma before a coordinating conjunction with the meaning "and" (''i'', ''a'' or ''oraz''). Abbreviations (but not acronyms or initialisms) are followed by a period when they end with a letter other than the one which ends the full word. For example, ''dr'' has no period when it stands for ''doktor'', but takes one when it stands for an inflected form such as ''doktora'', and ''prof.'' has a period because it comes from ''profesor'' ("professor").
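The abbreviation rule reduces to a comparison of final letters, which the following sketch makes explicit. The function name `needs_period` is made up for this illustration, and the sketch assumes a single-word abbreviation and a single-word full form.

```python
def needs_period(abbreviation: str, full_form: str) -> bool:
    """An abbreviation takes a period unless it ends with the same
    letter as the (possibly inflected) word it stands for."""
    return abbreviation[-1].lower() != full_form[-1].lower()


print(needs_period("dr", "doktor"))      # False: written "dr"
print(needs_period("dr", "doktora"))     # True:  written "dr."
print(needs_period("prof", "profesor"))  # True:  written "prof."
```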
Apostrophes are used to mark the elision of the final sound of foreign words not pronounced before Polish inflectional endings, as in ''Harry'ego'' (genitive of ''Harry''; the final vowel is elided in the genitive). However, the apostrophe is often erroneously used to separate a loanword stem from any inflectional ending, for example, ''*John'a'', which should be ''Johna'' (genitive of ''John''; no sound is elided).
Quotation marks are used in different ways: either „ordinary Polish quotes” or «French quotes» (without space) for first level, and ‚single Polish quotes’ or «French quotes» for second level, which gives three styles of nested quotes:
# „Quote ‚inside’ quote”
# „Quote «inside» quote”
# «Quote ‚inside’ quote»
Some older prints have used „such Polish quotes“.


History

Poles adopted the Latin alphabet in the 12th century. However, that alphabet was ill-equipped to represent certain Polish sounds, such as the palatal consonants and nasal vowels. Consequently, Polish spelling in the Middle Ages was highly inconsistent, as different writers used different systems to represent these sounds. For example, in early documents the letter ''c'' could signify the sounds now written ''c, cz, k'', while the letter ''z'' was used for the sounds now written ''z, ż, ś, ź''. Writers soon began to experiment with digraphs (combinations of letters), new letters (φ and ſ, no longer used), and eventually diacritics.

The Polish alphabet was one of two major forms of Latin-based orthography developed for Slavic languages, the other being Czech orthography, characterized by carons (háčeks), as in the letter ''č''. The other major Slavic languages which are now written in Latin-based alphabets (Slovak, Slovene, and Serbo-Croatian) use systems similar to the Czech. However, a Polish-based orthography is used for Kashubian and usually for Silesian, while the Sorbian languages use elements of both systems.


Computer encoding

There are several different systems for encoding the Polish alphabet for computers. All letters of the Polish alphabet are included in Unicode, and thus Unicode-based encodings such as UTF-8 and UTF-16 can be used. The Polish alphabet is completely included in the Basic Multilingual Plane of Unicode. ISO 8859-2 (Latin-2), ISO 8859-13 (Latin-7), ISO 8859-16 (Latin-10) and Windows-1250 are popular 8-bit encodings that support the Polish alphabet.

The Polish letters which are not present in the English alphabet have the following Unicode code points, which can also be written in HTML as numeric character references of the form ''&#x0105;'': Ą U+0104, ą U+0105, Ć U+0106, ć U+0107, Ę U+0118, ę U+0119, Ł U+0141, ł U+0142, Ń U+0143, ń U+0144, Ó U+00D3, ó U+00F3, Ś U+015A, ś U+015B, Ź U+0179, ź U+017A, Ż U+017B, ż U+017C. The byte values of these letters in the 8-bit encodings differ from one encoding to another and are conventionally given in hexadecimal.

A common test sentence containing all the Polish diacritic letters is the nonsensical "''Zażółć gęślą jaźń''".
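The code points and byte values of the Polish letters can be looked up programmatically. The sketch below uses Python's built-in codecs; the names `POLISH_LETTERS`, `CODECS` and `describe` are chosen for this example, and the codec identifiers are Python's spellings of the encodings named above.

```python
POLISH_LETTERS = "ĄąĆćĘęŁłŃńÓóŚśŹźŻż"
# Python codec names for UTF-8, ISO 8859-2/-13/-16 and Windows-1250.
CODECS = ["utf-8", "iso8859_2", "iso8859_13", "iso8859_16", "cp1250"]


def describe(letter: str) -> dict[str, str]:
    """Return the code point, an HTML numeric reference, and the
    hexadecimal byte value(s) of a letter in each encoding."""
    info = {
        "code point": f"U+{ord(letter):04X}",
        "html": f"&#x{ord(letter):04X};",
    }
    for codec in CODECS:
        info[codec] = letter.encode(codec).hex().upper()
    return info


for letter in POLISH_LETTERS:
    print(letter, describe(letter))

# The test sentence survives a round trip through every listed encoding.
pangram = "Zażółć gęślą jaźń"
print(all(pangram.encode(c).decode(c) == pangram for c in CODECS))  # True
```

For instance, ''ą'' is U+0105, which is the two bytes C4 85 in UTF-8 but the single byte B1 in ISO 8859-2 and B9 in Windows-1250, so text saved in one 8-bit encoding and read in another shows the wrong letters.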


See also

* Polish Braille
* Polish manual alphabet



External links


Polish Pronunciation Audio and Grammar Charts

Online editor for typing Polish characters