syllabication
   HOME

TheInfoList



OR:

Syllabification () or syllabication (), also known as hyphenation, is the separation of a
word A word is a basic element of language that carries an objective or practical meaning, can be used on its own, and is uninterruptible. Despite the fact that language speakers often have an intuitive grasp of what a word is, there is no conse ...
into syllables, whether spoken, written or signed.


Overview

The written separation into syllables is usually marked by a
hyphen The hyphen is a punctuation mark used to join words and to separate syllables of a single word. The use of hyphens is called hyphenation. ''Son-in-law'' is an example of a hyphenated word. The hyphen is sometimes confused with dashes ( figure ...
when using
English orthography English orthography is the writing system used to represent spoken English, allowing readers to connect the graphemes to sound and to meaning. It includes English's norms of spelling, hyphenation, capitalisation, word breaks, emphasis, and ...
(e.g., syl-la-ble) and with a period when transcribing the actually spoken syllables in the
International Phonetic Alphabet The International Phonetic Alphabet (IPA) is an alphabetic system of phonetic notation based primarily on the Latin script. It was devised by the International Phonetic Association in the late 19th century as a standardized representation ...
(e.g., ). For presentation purposes,
typographer Typography is the art and technique of arranging type to make written language legible, readable and appealing when displayed. The arrangement of type involves selecting typefaces, point sizes, line lengths, line-spacing ( leading), an ...
s may use an interpunct (
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, wh ...
character U+00B7, e.g., syl·la·ble), a special-purpose "hyphenation point" (U+2027, e.g., syl‧la‧ble), or a
space Space is the boundless three-dimensional extent in which objects and events have relative position and direction. In classical physics, physical space is often conceived in three linear dimensions, although modern physicists usually cons ...
(e.g., syl la ble). At the end of a line, a word is separated in writing into parts, conventionally called "syllables", if it does not fit the line and if moving it to the next line would make the first line much shorter than the others. This can be a particular problem with very long words, and with narrow columns in newspapers.
Word processing A word is a basic element of language that carries an objective or practical meaning, can be used on its own, and is uninterruptible. Despite the fact that language speakers often have an intuitive grasp of what a word is, there is no conse ...
has automated the process of justification, making syllabification of shorter words often unnecessary. In some languages, the spoken syllables are also the basis of syllabification in writing. However, possibly due to the weak correspondence between sounds and letters in the spelling of modern English, written syllabification in English is based mostly on
etymological Etymology () The New Oxford Dictionary of English (1998) – p. 633 "Etymology /ˌɛtɪˈmɒlədʒi/ the study of the class in words and the way their meanings have changed throughout time". is the study of the history of the form of words a ...
or morphological, instead of
phonetic Phonetics is a branch of linguistics that studies how humans produce and perceive sounds, or in the case of sign languages, the equivalent aspects of sign. Linguists who specialize in studying the physical properties of speech are phoneticians. ...
, principles. For example, it is not possible to syllabify "learning" as ''lear-ning'' according to the correct syllabification of the living language. Seeing only ''lear-'' at the end of a line might mislead the reader into pronouncing the word incorrectly, as the digraph ''ea'' can hold many different values. The history of English orthography accounts for such phenomena. English written syllabification therefore deals with a concept of "syllable" that does not correspond to the linguistic concept of a phonological (as opposed to morphological) unit. As a result, even most native English speakers are unable to syllabify words according to established rules without consulting a dictionary or using a word processor. Schools usually do not provide much more advice on the topic than to consult a dictionary. In addition, there are differences between British and US syllabification and even between dictionaries of the same English variety. In
Finnish Finnish may refer to: * Something or someone from, or related to Finland * Culture of Finland * Finnish people or Finns, the primary ethnic group in Finland * Finnish language, the national language of the Finnish people * Finnish cuisine See also ...
,
Italian Italian(s) may refer to: * Anything of, from, or related to the people of Italy over the centuries ** Italians, an ethnic group or simply a citizen of the Italian Republic or Italian Kingdom ** Italian language, a Romance language *** Regional Ita ...
,
Portuguese Portuguese may refer to: * anything of, from, or related to the country and nation of Portugal ** Portuguese cuisine, traditional foods ** Portuguese language, a Romance language *** Portuguese dialects, variants of the Portuguese language ** Portu ...
and other nearly phonemically spelled languages, writers can in principle correctly syllabify any existing or newly created word using only general rules. In Finland, children are first taught to hyphenate every word until they produce the correct syllabification reliably, after which the hyphens can be omitted.


Algorithm

A hyphenation algorithm is a set of rules, especially one codified for implementation in a computer program, that decides at which points a word can be broken over two lines with a hyphen. For example, a hyphenation algorithm might decide that ''impeachment'' can be broken as ''impeach-ment'' or ''im-peachment'' but not ''impe-achment''. One of the reasons for the complexity of the rules of word-breaking is that different dialects of English tend to differ on hyphenation:
American English American English, sometimes called United States English or U.S. English, is the set of varieties of the English language native to the United States. English is the most widely spoken language in the United States and in most circumstances i ...
tends to work on sound, but
British English British English (BrE, en-GB, or BE) is, according to Lexico, Oxford Dictionaries, "English language, English as used in Great Britain, as distinct from that used elsewhere". More narrowly, it can refer specifically to the English language in ...
tends to look to the origins of the word and then to sound. There are also a large number of exceptions, which further complicates matters. Some rules of thumb can be found in the Major Keary's "On Hyphenation – Anarchy of Pedantry." Among the
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
ic approaches to hyphenation, the one implemented in the TeX typesetting system is widely used. It is thoroughly documented in the first two volumes of ''
Computers and Typesetting ''Computers and Typesetting'' is a 5-volume set of books by Donald Knuth published in 1986 describing the TeX and Metafont systems for digital typography. Knuth's computers and typesetting project was the result of his frustration with the lac ...
'' by Donald Knuth and in Franklin Mark Liang's dissertation. The aim of Liang's work was to get the algorithm as accurate as he practically could and to keep any exception dictionary small. In TeX's original hyphenation patterns for American English, the exception list contains only 14 words.


In TeX

Ports of the TeX hyphenation algorithm are available as libraries for several programming languages, including
Haskell Haskell () is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial applications, Haskell has pioneered a number of programming lan ...
,
JavaScript JavaScript (), often abbreviated as JS, is a programming language that is one of the core technologies of the World Wide Web, alongside HTML and CSS. As of 2022, 98% of websites use JavaScript on the client side for webpage behavior, of ...
,
Perl Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offic ...
, PostScript,
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
,
Ruby A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called ...
, C#, and TeX can be made to show hyphens in the log by the command \showhyphens. In
LaTeX Latex is an emulsion (stable dispersion) of polymer microparticles in water. Latexes are found in nature, but synthetic latexes are common as well. In nature, latex is found as a milky fluid found in 10% of all flowering plants (angiosperms ...
, hyphenation correction can be added by users by using:
\hyphenation
The \hyphenation command declares allowed hyphenation points in which words is a list of words, separated by spaces, in which each hyphenation point is indicated by a - character. For example,
\hyphenation
declares that in the current job "fortran" should not be hyphenated and that if "ergonomic" must be hyphenated, it will be at one of the indicated points. However, there are several limits. For example, the stock \hyphenation command accepts only
ASCII ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because ...
letters by default and so it cannot be used to correct hyphenation for words with non-ASCII characters (like ''ä'', ''é'', ''ç''), which are very common in almost all languages except English. Simple workarounds exist, however.


Worked

*
Phonotactics Phonotactics (from Ancient Greek "voice, sound" and "having to do with arranging") is a branch of phonology that deals with restrictions in a language on the permissible combinations of phonemes. Phonotactics defines permissible syllable struc ...
* Tautosyllabic, heterosyllabic and ambisyllabic phones * Syllable structure in English phonology


Notes


External links


Online Lyric Hyphenator
Hyphenates English text into syllables
Online hyphenation tool
Hyphenation algorithms for several languages
Hyphenation tool for the French Language
Hyphenates French words with explanation {{Authority control Phonotactics