
A cedilla (from Spanish, "small ''ceda''", i.e. small "z"), or cedille (from French), is a hook or tail added under certain letters as a diacritical mark to indicate that their pronunciation is modified. In Catalan, French, and Portuguese it is used only under the letter c (to form ç); the Catalan name of the resulting letter means "broken C". It is also used to mark vowel nasalization in many languages of Sub-Saharan Africa, including Vute from Cameroon.

This diacritic is not to be confused with the ''ogonek'' (◌̨), which resembles the cedilla but is mirrored. It also looks very similar to the diacritical comma, which is used in the Romanian and Latvian alphabets, and which is misnamed "cedilla" in the Unicode standard for the Latvian letters.

There is substantial overlap between the cedilla and the diacritical comma. The cedilla is traditionally centered on the letter, and when there is no stroke for it to attach to in that position, as in ''Ņ ņ'', the connecting stroke is omitted, so that the mark takes the form of a comma. However, the cedilla may instead be shifted left or right to attach to a descending leg. In some orthographies the comma form has been generalized even in cases where the cedilla could attach, as in ''Ḑ ḑ'', but is still considered to be a cedilla. This produces a contrast between attached and non-attached (comma) glyphs, which is usually left to the font but in the cases of ''Ş ş Ţ ţ'' and ''Ș ș Ț ț'' is formalized by Unicode.


Origin

The tail originated in Spain as the bottom half of a miniature cursive z. The word ''cedilla'' is the diminutive of the Old Spanish name for this letter, ''ceda''. Modern Spanish and isolationist Galician no longer use this diacritic, although it is used in Reintegrationist Galician, Portuguese, Catalan, Occitan, and French, which gives English the alternative spelling ''cedille'', from French ''cédille'', and the Portuguese form ''cedilha''. An obsolete spelling of ''cedilla'' is ''cerilla''. The earliest use in English cited by the ''Oxford English Dictionary'' is a 1599 Spanish-English dictionary and grammar. Chambers' ''Cyclopædia'' (Chambers, Ephraim, 1738, ''Cyclopædia; or, an universal dictionary of arts and sciences'', 2nd ed.) is cited for the printer-trade variant ''ceceril'' in use in 1738.

The cedilla is not used universally in English; it appears in loan words from French and Portuguese such as ''façade'', ''limaçon'' and ''cachaça'' (often typed ''facade'', ''limacon'' and ''cachaca'' because English-language keyboards lack a ''ç'' key).

With the advent of typeface modernism, the calligraphic nature of the cedilla was thought somewhat jarring on sans-serif typefaces, and so some designers instead substituted a comma design, which could be made bolder and more compatible with the style of the text. This reduces the visual distinction between the cedilla and the diacritical comma.


C

The most frequent character with cedilla is "ç" ("c" with cedilla, as in ''façade''). It was first used for the voiceless alveolar affricate /ts/ in Old Spanish and stems from the Visigothic form of the letter z, whose upper loop was lengthened and reinterpreted as a "c", whereas its lower loop became the diminished appendage, the cedilla.

It represents the "soft" sound /s/, the voiceless alveolar sibilant, where a "c" would normally represent the "hard" sound /k/ (before "a", "o", "u", or at the end of a word), in English and in certain Romance languages such as Catalan, Galician, French (where ç appears in the name of the language itself, ''français''), Ligurian, Occitan, and Portuguese. In Occitan, Friulian, and Catalan, ''ç'' can also be found at the beginning or at the end of a word.

It represents the voiceless postalveolar affricate (as in English "church") in Albanian, Azerbaijani, Crimean Tatar, Friulian, Kurdish, Tatar, Turkish, and Turkmen. It is also sometimes used this way in Manx, to distinguish it from the velar fricative.

In the International Phonetic Alphabet, ⟨ç⟩ represents the voiceless palatal fricative.


S

The character "ş" represents the voiceless postalveolar fricative (as in "show") in several languages, including many belonging to the Turkic languages, and is included as a separate letter in their alphabets:
* Turkish
* Azerbaijani
* Crimean Tatar
* Gagauz
* Tatar
* Turkmen
* Romanian (as a substitute, because S-comma was missing from Unicode versions before 3.0 and from older standards; this use is still frequent, but it is an error)
* Kurdish

In HTML, the character entity references &Scedil; and &scedil; can be used for Ş and ş.
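As a small illustration of the HTML entity references for S with cedilla, the following Python sketch decodes them with the standard library and prints the resulting code points. The helper name `decode_entity` is made up for this example; it simply wraps `html.unescape`.

```python
import html
import unicodedata


def decode_entity(entity: str) -> str:
    """Decode one HTML character entity reference (illustrative helper)."""
    return html.unescape(entity)


for entity in ("&Scedil;", "&scedil;"):
    ch = decode_entity(entity)
    # Show the entity, the decoded character, its code point and Unicode name.
    print(entity, ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Both entities decode to single precomposed characters, so no combining mark is involved.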


T

Gagauz uses Ţ (T with cedilla), one of the few languages to do so, and Ş (S with cedilla). Besides being present in some Gagauz orthographies, T with cedilla also exists in the General Alphabet of Cameroon Languages, in the Kabyle language, in the Manjak and Mankanya languages, and possibly elsewhere.

In 1868, Ambroise Firmin-Didot suggested in his book on French spelling (title in translation: Observations on French Spelling) that French phonetics could be better regularized by adding a cedilla beneath the letter "t" in some words. For example, in certain suffixes the letter "t" is usually pronounced not as /t/ but as /s/, and which words behave this way has to be learned separately. A similar effect occurs with other affixes or within words. Firmin-Didot surmised that a new character could be added to French orthography.

A letter with the same description, T-cedilla (majuscule: Ţ, minuscule: ţ), is used in Gagauz. A similar letter, the T-comma (majuscule: Ț, minuscule: ț), exists in Romanian, but it has a comma accent, not a cedilla.


Languages with other characters with cedillas


Latvian

Some consider the diacritics on the palatalized Latvian consonants "ģ", "ķ", "ļ", "ņ", and formerly "ŗ" to be cedillas. Although their Adobe glyph names use the word "comma", their names in the Unicode Standard are "g", "k", "l", "n", and "r" with cedilla. The letters were introduced to the Unicode standard before 1992, and their names cannot be altered. The uppercase letter "Ģ" is sometimes drawn with a regular cedilla.


Marshallese

In Marshallese orthography, four letters have cedillas: "ļ", "m̧", "ņ" and "o̧". In standard printed text they are ''always'' cedillas, and their omission or the substitution of comma-below or dot-below diacritics is nonstandard. Many font rendering engines do not display ''any'' of these properly, for two reasons:
* "ļ" and "ņ" usually do not display properly at all, because of the use of the cedilla in Latvian. Unicode has precombined glyphs for these letters, but most quality fonts display them with comma-below diacritics to accommodate the expectations of Latvian orthography. This is considered nonstandard in Marshallese. Inserting a zero-width non-joiner between the letter and the diacritic can alleviate this problem, but the result may or may not display properly.
* "m̧" and "o̧" do not currently exist in Unicode as precombined glyphs, and must be encoded as the plain Latin letters "m" and "o" with the combining cedilla diacritic. Most Unicode fonts issued with Windows do not display combining diacritics properly, showing them too far to the right of the letter, as with Tahoma and Times New Roman. This mostly affects "m̧", and may or may not affect "o̧". Some common Unicode fonts, such as Arial Unicode MS, Cambria and Lucida Sans Unicode, do not have this problem. When "m̧" is properly displayed, the cedilla sits either under the center of the letter or under its right-most leg, but always directly underneath the letter.

Because of these font display issues, it is not uncommon to find nonstandard ''ad hoc'' substitutes for these letters. The online version of the Marshallese-English Dictionary (the only complete Marshallese dictionary in existence) displays the letters with dot-below diacritics, all of which do exist as precombined glyphs in Unicode: "ḷ", "ṃ", "ṇ" and "ọ". The first three exist in the International Alphabet of Sanskrit Transliteration, and "ọ" exists in the Vietnamese alphabet, and both of these systems are supported by the most recent versions of common fonts like Arial, Courier New, Tahoma and Times New Roman. This sidesteps most of the Marshallese text display issues associated with the cedilla, but is still inappropriate for polished standard text.
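The difference between Marshallese letters that have a precombined code point and those that need a combining cedilla can be checked with Unicode normalization. The sketch below is only an illustration: the helper name `compose` is made up, and it relies on Python's standard `unicodedata` module to apply NFC composition to a base letter followed by a combining mark.

```python
import unicodedata

CEDILLA = "\u0327"    # COMBINING CEDILLA
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW


def compose(base: str, mark: str) -> str:
    """Return the NFC form of base + combining mark (illustrative helper)."""
    return unicodedata.normalize("NFC", base + mark)


for base in "lmno":
    for label, mark in (("cedilla", CEDILLA), ("dot below", DOT_BELOW)):
        composed = compose(base, mark)
        # One code point after NFC means a precombined character exists.
        kind = "precombined" if len(composed) == 1 else "combining sequence"
        codepoints = " ".join(f"U+{ord(c):04X}" for c in composed)
        print(f"{base} + {label}: {composed} ({codepoints}, {kind})")
```

With the cedilla, only "l" and "n" compose to a single code point, while "m" and "o" remain two-code-point sequences; with the dot below, all four letters compose.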


Vute

Vute, a Mambiloid language from Cameroon, uses the cedilla for the nasalization of all vowel qualities (cf. the ogonek used in Polish and Navajo for the same purpose). This includes unconventional Roman letters that are formalized from the IPA into the official writing system. The nasalized vowels are ⟨i̧ ȩ ɨ̧ ə̧ a̧ u̧ o̧ ɔ̧⟩.


Hebrew

The ISO 259 romanization of Biblical Hebrew uses Ȩ (E with cedilla) and Ḝ (E with cedilla and breve).


Diacritical comma

Languages such as Romanian, Latvian and Livonian add a comma (virgula) to some letters, such as "ș", which looks somewhat like a cedilla, but is more precisely a diacritical comma. This is particularly confusing with letters which can take either diacritic: for example, the consonant /ʃ/ is written as "ş" in Turkish but as "ș" in Romanian, and Romanian writers will sometimes use the former instead of the latter because of insufficient computer support.

The Adobe names of the Latvian letters ("ģ", "ķ", "ļ", "ņ", and formerly "ŗ") use the word "comma", but in the Unicode Standard they are named "g", "k", "l", "n", and "r" with ''cedilla''. The letters were introduced to the Unicode standard before 1992, and their names cannot be altered. Influenced by Latvian, Livonian has the same problem for "d̦", "ļ", "ņ", "ŗ" and "ț".

The Polish letters "ą" and "ę" and the Lithuanian letters "ą", "ę", "į" and "ų" are not made with the cedilla either, but with the unrelated ogonek diacritic.
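Where Romanian text has been typed with the Turkish-style cedilla letters, the substitution can be reversed mechanically. The following Python sketch maps the four cedilla code points to their comma-below counterparts; the function name `to_romanian_comma` is hypothetical, and the approach assumes the input is known to be Romanian, since Turkish and Gagauz use the cedilla forms legitimately.

```python
# Translation table from cedilla letters to comma-below letters.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015e": "\u0218",  # Ş -> Ș
    "\u015f": "\u0219",  # ş -> ș
    "\u0162": "\u021a",  # Ţ -> Ț
    "\u0163": "\u021b",  # ţ -> ț
})


def to_romanian_comma(text: str) -> str:
    """Replace S/T with cedilla by S/T with comma below (illustrative helper)."""
    return text.translate(CEDILLA_TO_COMMA)


sample = "Bucure\u015fti, Constan\u0163a"
print(to_romanian_comma(sample))
```

Running the helper on mixed-language text would corrupt Turkish words, so in practice the language of each passage has to be known first.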


Unicode

Unicode encodes a number of cases of "letter with cedilla" (so called, as explained above) as precomposed characters. In addition, several more letters in language orthographies are composed using the combining character facility (U+0327 COMBINING CEDILLA and U+0326 COMBINING COMMA BELOW). In ambiguous cases, typeface designers must choose whether to use a cedilla diacritic or a comma-below diacritic for these code points, leaving it to others to provide the user with a method to achieve the other form (i.e., one that relies on the combining character method). Here are three popular faces that demonstrate the choices made:
* Arial: Ç ç Ḉ ḉ Ḑ ḑ Ȩ ȩ Ḝ ḝ Ģ ģ Ḩ ḩ Ķ ķ Ļ ļ M̧ m̧ Ņ ņ O̧ o̧ Ŗ ŗ Ş ş Ţ ţ Z̧ z̧
* Times New Roman: Ç ç Ḉ ḉ Ḑ ḑ Ȩ ȩ Ḝ ḝ Ģ ģ Ḩ ḩ Ķ ķ Ļ ļ M̧ m̧ Ņ ņ O̧ o̧ Ŗ ŗ Ş ş Ţ ţ Z̧ z̧
* Courier New: Ç ç Ḉ ḉ Ḑ ḑ Ȩ ȩ Ḝ ḝ Ģ ģ Ḩ ḩ Ķ ķ Ļ ļ M̧ m̧ Ņ ņ O̧ o̧ Ŗ ŗ Ş ş Ţ ţ Z̧ z̧

In each case, the diacritic displayed with D, G, K, L, N and R is a comma-below; in the other cases it is displayed as a cedilla. Fonts sold in the Romanian and Turkish markets may favour the national standard form of this diacritic.
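The naming described above can be inspected directly from the Unicode character database. The sketch below, using Python's standard `unicodedata` module, prints the formal name and canonical decomposition of a few of the letters discussed; the helper name `describe` and the choice of sample letters are made up for illustration.

```python
import unicodedata

# ç ş ș ţ ț ģ ņ, written as escapes so the code points are unambiguous.
SAMPLES = "\u00e7\u015f\u0219\u0163\u021b\u0123\u0146"


def describe(ch: str) -> tuple[str, str]:
    """Return (Unicode name, canonical decomposition) for one character."""
    return unicodedata.name(ch), unicodedata.decomposition(ch)


for ch in SAMPLES:
    name, decomposition = describe(ch)
    # A decomposition ending in 0327 uses COMBINING CEDILLA,
    # one ending in 0326 uses COMBINING COMMA BELOW.
    print(f"{ch}  U+{ord(ch):04X}  {name}  ->  {decomposition}")
```

The Latvian letters are named and decomposed with the cedilla (U+0327) even though fonts usually draw a comma, whereas the Romanian letters are named and decomposed with the comma below (U+0326).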


External links


ScriptSource—Positioning the traditional cedilla

Diacritics Project—All you need to design a font with correct accents

Learn how to make world language accent marks and other diacriticals on a computer