Linguistic distance is the measure of how different one language (or dialect) is from another. Although they lack a uniform approach to quantifying linguistic distance between languages, linguists apply the concept in a variety of linguistic contexts, such as second-language acquisition, historical linguistics, language-based conflicts, and the effects of language differences on trade.
Measures
Lexicostatistics
The proposed measures of linguistic distance reflect varying understandings of the term itself. One approach is based on mutual intelligibility, i.e. the ability of speakers of one language to understand the other language. On this view, the greater the linguistic distance, the lower the level of mutual intelligibility.
Because cognate words play an important role in mutual intelligibility between languages, they figure prominently in such analyses. The higher the percentage of cognate (as opposed to non-cognate) words shared by two languages, the lower their linguistic distance. Likewise, the greater the degree of grammatical relatedness (i.e. the cognates mean roughly similar things) and lexical relatedness (i.e. the cognates are easily discernible as related words), the lower the linguistic distance.
As an example of this, the Hindustani word ''pānch'' is grammatically identical and lexically similar (but non-identical) to its cognate, the Punjabi and Persian word ''panj'', and is grammatically identical but lexically dissimilar to Greek ''pent-'' and English ''five''. As another example, English ''dish'' and German ''Tisch'' ('table') are lexically (phonologically) similar but grammatically (semantically) dissimilar. Cognates in related languages can even be identical in form but semantically distinct, such as ''caldo'' and ''largo'', which mean 'hot' and 'wide' respectively in Italian but 'broth, soup' and 'long' in Spanish. Using a statistical approach called lexicostatistics, which compares each language's mass of words, distances between languages can be calculated; in technical terms, what is calculated is the Levenshtein distance.
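The idea of an edit-distance-based measure can be illustrated with a minimal sketch. It assumes that words are compared as plain spelled strings and that the per-pair distance is normalized by the length of the longer word and then averaged over a word list; published studies typically work on phonetic transcriptions and may normalize differently. The function names `levenshtein`, `normalized_distance` and `mean_distance` are hypothetical and chosen for illustration only.

```python
from typing import Iterable, Tuple


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer word's length (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_distance(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average normalized distance over aligned word pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one word pair is required")
    return sum(normalized_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Word pairs taken from the examples in the text, compared as spelled.
    print(normalized_distance("dish", "tisch"))
    print(normalized_distance("caldo", "caldo"))
```

Because the measure looks only at form, the Italian and Spanish ''caldo'' come out at distance zero despite their different meanings, which is one reason form-based distances are usually combined with cognate and meaning judgments.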
Based on this, one study compared both Afrikaans and West Frisian with Dutch to see which was closer to Dutch. It determined that Dutch and Afrikaans (mutual distance of 20.9%) were considerably closer than Dutch and West Frisian (mutual distance of 34.2%).
However, lexicostatistical methods, which are based on retentions from a common proto-language rather than on innovations, are problematic for a number of reasons, so some linguists argue that they cannot be relied upon when tracing a phylogenetic tree (for example, the highest retention rates can sometimes be found at opposite, peripheral ends of a language family).
Unusual innovativeness or conservativeness of a language can distort linguistic distance and the assumed separation date, examples being Romani and the East Baltic languages, respectively.
On the one hand, continued adjacency of closely related languages after their separation can make some loanwords 'invisible' (indistinguishable from cognates), so that from a lexicostatistical point of view these languages appear less distant than they actually are (examples being the Finnic and Saami languages).
On the other hand, strong foreign influence on languages that spread far from their homeland can make them share fewer inherited words than they ought to (examples being Hungarian and the Samoyedic languages in the East Uralic branch).
Other internal aspects
Besides cognates, other aspects that are often measured are similarities of syntax and written forms.
To overcome the aforementioned problems of lexicostatistical methods, Donald Ringe, Tandy Warnow and Luay Nakhleh developed a complex phylogenetic method relying on phonological and morphological innovations in the 2000s.
Language learning
A 2005 paper by economists Barry Chiswick and Paul Miller proposed a metric for linguistic distance based on empirical observations of how rapidly speakers of a given language gained proficiency in another language when immersed in a society that overwhelmingly communicated in the latter. The study examined the speed of English-language acquisition among immigrants of various linguistic backgrounds in the United States and Canada.
See also
* Abstand and ausbau languages
* Language transfer
* Second-language acquisition
* Historical linguistics