Language complexity is a topic in linguistics which can be divided into several sub-topics such as phonological, morphological, syntactic, and semantic complexity. The subject also carries importance for language evolution.
Language complexity has been studied less than many other traditional fields of linguistics. While the consensus is turning towards recognizing that complexity is a suitable research area, a central focus has been on methodological choices. Some languages, particularly pidgins and creoles, are considered simpler than most other languages, but there is no direct ranking and no universal method of measurement, although several possibilities are now proposed within different schools of analysis.
History
Throughout the 19th century, differential complexity was taken for granted. The classical languages Latin and Greek, as well as Sanskrit, were considered to possess qualities which could be achieved by the rising European national languages only through an elaboration that would give them the necessary structural and lexical complexity that would meet the requirements of an advanced civilization. At the same time, languages described as 'primitive' were naturally considered to reflect the simplicity of their speakers.
On the other hand, Friedrich Schlegel noted that the languages of some nations "which appear to be at the very lowest grade of intellectual culture", such as Basque, Sámi and some Native American languages, possess a striking degree of elaborateness.
Equal complexity hypothesis
During the 20th century, linguists and anthropologists adopted a standpoint that rejected any nationalist ideas about the superiority of the languages of establishment. The first known quote that puts forward the idea that all languages are equally complex comes from Rulon S. Wells III, 1954, who attributes it to Charles F. Hockett. While laymen never ceased to consider certain languages as simple and others as complex, such a view was erased from official contexts. For instance, the 1971 edition of the Guinness Book of World Records featured Saramaccan, a creole language, as "the world's least complex language". According to linguists, this claim was "not founded on any serious evidence", and it was removed from later editions. Apparent complexity differences in certain areas were explained by a balancing force through which simplicity in one area would be compensated by complexity in another, a view expressed by, for example, David Crystal in 1987.
In 2001 the compensation hypothesis was eventually refuted by the creolist John McWhorter, who pointed out the absurdity of the idea that, as languages change, each would have to include a mechanism that calibrates it according to the complexity of all the other 6,000 or so languages around the world. He underscored that linguistics has no knowledge of any such mechanism.
Revisiting the idea of differential complexity, McWhorter argued that it is indeed creole languages, such as Saramaccan, that are structurally "much simpler than all but very few older languages". In McWhorter's view this is not problematic in terms of the equality of creole languages, because simpler structures convey logical meanings in the most straightforward manner, while increased language complexity is largely a question of features which may not add much to the functionality, or improve the usefulness, of the language. Examples of such features are inalienable possessive marking, switch-reference marking, syntactic asymmetries between matrix and subordinate clauses, grammatical gender, and other secondary features which are most typically absent in creoles. McWhorter's notion that "unnatural" language contact in pidgins, creoles and other contact varieties inevitably destroys "natural" accretions in complexity perhaps represents a recapitulation of 19th-century ideas about the relationship between language contact and complexity.
During the years following McWhorter's article, several books and dozens of articles were published on the topic. To date, there have been research projects on language complexity, and several workshops for researchers have been organised by various universities.
Complexity metrics
At a general level, language complexity can be characterized as the number and variety of elements, and the elaborateness of their interrelational structure. This general characterisation can be broken down into sub-areas:
* ''Syntagmatic complexity'': number of parts, such as word length in terms of phonemes, syllables etc.
* ''Paradigmatic complexity'': variety of parts, such as phoneme inventory size, number of distinctions in a grammatical category, e.g. aspect.
* ''Organizational complexity'': e.g. ways of arranging components, phonotactic restrictions, variety of word orders.
* ''Hierarchic complexity'': e.g. recursion, lexical–semantic hierarchies.
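As a rough illustration of the first two sub-areas, the sketch below computes one syntagmatic measure (mean word length in phonemes) and one paradigmatic measure (phoneme inventory size). The lexicon `TOY_LEXICON` and both function names are hypothetical, invented for this example. They describe no real language and no established tool.

```python
from statistics import mean

# Hypothetical toy lexicon: each word is a list of phoneme symbols.
# The data are invented for illustration and describe no real language.
TOY_LEXICON = [
    ["k", "a", "t"],
    ["t", "a"],
    ["s", "i", "k", "a"],
    ["a", "t", "i", "s"],
]


def syntagmatic_complexity(lexicon):
    """Mean word length in phonemes (number of parts)."""
    return mean(len(word) for word in lexicon)


def paradigmatic_complexity(lexicon):
    """Phoneme inventory size (variety of parts)."""
    return len({phoneme for word in lexicon for phoneme in word})


print(syntagmatic_complexity(TOY_LEXICON))   # mean phonemes per word
print(paradigmatic_complexity(TOY_LEXICON))  # number of distinct phonemes
```

Counts of this kind cover only two of the four sub-areas. Organizational and hierarchic complexity would need information about ordering and structure that a flat word list does not carry.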
Measuring complexity is considered difficult, and the comparison of whole natural languages a daunting task. On a more detailed level, it is possible to demonstrate that some structures are more complex than others. Phonology and morphology are areas where such comparisons have traditionally been made. For instance, linguistics has tools for the assessment of the phonological system of any given language. As for the study of syntactic complexity, grammatical rules have been proposed as a basis, but generative frameworks, such as the minimalist program and the Simpler Syntax framework, have been less successful in defining complexity and its predictions than non-formal ways of description.
Many researchers suggest that several different concepts may be needed when approaching complexity: entropy, size, description length, effective complexity, information, connectivity, irreducibility, low probability, syntactic depth, etc. Research suggests that while methodological choices affect the results, even rather crude analytic tools may provide a feasible starting point for measuring grammatical complexity.
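Two of the concepts listed above, entropy and description length, can be approximated with very crude tools. The sketch below is an illustration under stated assumptions, not a method taken from the literature: it computes the Shannon entropy of a token sequence and uses the zlib-compressed size of a text as a stand-in for its description length. The function names `unigram_entropy` and `compression_ratio` are hypothetical.

```python
import math
import zlib
from collections import Counter


def unigram_entropy(tokens):
    """Shannon entropy, in bits per token, of the token frequency distribution."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compression_ratio(text):
    """Compressed size divided by raw size in bytes.

    Assumption: zlib output length serves as a rough proxy for
    description length; lower ratios indicate more redundancy.
    """
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must not be empty")
    return len(zlib.compress(raw, 9)) / len(raw)


print(unigram_entropy("the cat saw the dog".split()))
print(compression_ratio("ab" * 500))
```

Both figures depend heavily on choices such as tokenization, orthography and sample size, which is one way in which methodological choices affect the results.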
Computational tools
* Coh-Metrix
* L2 Syntactic Complexity Analyzer