Eurasiatic is a hypothetical and controversial
language macrofamily proposal that would include many
language families
historically spoken in northern, western, and southern
Eurasia
.
The idea of a Eurasiatic superfamily dates back more than 100 years.
Joseph Greenberg
's proposal, dating to the 1990s, is the most widely discussed version. In 2013,
Mark Pagel and three colleagues published what they believe to be statistical evidence for a Eurasiatic language family.
The branches of Eurasiatic vary between proposals, but typically include the highly controversial
Altaic macrofamily (composed in part of
Mongolic,
Tungusic and
Turkic),
Chukchi-Kamchatkan,
Eskimo–Aleut,
Indo-European
, and
Uralic—although Greenberg uses the controversial
Uralic-Yukaghir classification instead. Other branches sometimes included are the
Kartvelian and
Dravidian families, as proposed by Pagel et al., in addition to the
language isolate
s
Nivkh,
Etruscan and Greenberg's "Korean–Japanese–Ainu". Some proposals group Eurasiatic with even larger macrofamilies, such as
Nostratic; again, many other professional linguists regard the methods used as invalid.
The hypothesis has fallen out of favour and retains only limited acceptance, predominantly among a minority of Russian linguists. Most linguists worldwide reject Eurasiatic and many other macrofamily hypotheses such as
Nostratic, with the exception of
Dené–Yeniseian languages, which has been met with some degree of acceptance.
History of the concept
In 1994
Merritt Ruhlen claimed Eurasiatic is supported by the existence of a grammatical pattern "whereby plurals of nouns are formed by suffixing -''t'' to the noun root ... whereas ''duals'' of nouns are formed by suffixing -''k''."
Rasmus Rask noted this grammatical pattern in the groups now called Uralic and Eskimo–Aleut as early as 1818, but it can also be found in Tungusic, Nivkh (also called Gilyak) and Chukchi–Kamchatkan—all of which Greenberg placed in Eurasiatic. According to Ruhlen, this pattern is not found in language families or languages outside Eurasiatic.
In 1998,
Joseph Greenberg
extended his work in
mass comparison, a methodology he first proposed in the 1950s to categorize the languages of Africa, to suggest a Eurasiatic language.
[Pagel ''et al.'' (SI), p. 1] In 2000, he expanded his argument for Eurasiatic into a full-length book, ''Indo-European and Its Closest Relatives: The Eurasiatic Language Family'', in which he outlines both phonetic and grammatical evidence that he feels demonstrates the validity of the language family. The heart of his argument is 72
morphological features that he judges as common across the various language families he examines. Of the many variant proposals, Greenberg's has attracted the most academic attention.
Greenberg's Eurasiatic hypothesis has been dismissed by many linguists, often on the ground that his research on mass comparison is unreliable. The primary criticism of comparative methods is that
cognates
are assumed to have a common origin on the basis of similar sounds and word meanings. It is generally assumed that semantic and phonetic corruption destroys any trace of original sound and meaning within 5,000 to 9,000 years, which makes the application of comparative methods to ancient superfamilies highly questionable. Additionally, apparent cognates can arise by chance or from
loan words
. Without the existence of statistical estimates of chance collisions, conclusions based on comparison alone are thus viewed as doubtful.
[Pagel ''et al.'', p. 1]
Stefan Georg and
Alexander Vovin, who, unlike many of their colleagues, do not stipulate ''a priori'' that attempts to find ancient relationships are bound to fail, examined Greenberg's claims in detail. They state that Greenberg's morphological arguments are the correct approach to determining families, but doubt his conclusions. They write "
[Greenberg's] 72 morphemes look like massive evidence in favour of Eurasiatic at first glance. If valid, few linguists would have the right to doubt that a point has been made [...] However, closer inspection [...] shows too many misinterpretations, errors and wrong analyses [...] these allow no other judgement than that [Greenberg's] attempt to demonstrate the validity of his Eurasiatic has failed."
In the 1980s, Russian linguist N. D. Andreev's Boreal hypothesis linked the
Indo-European
,
Uralic, and
Altaic (including Korean in his later papers) language families. Andreev also proposed 203 lexical roots for his hypothesized Boreal macrofamily. After Andreev's death in 1997, the Boreal hypothesis was further expanded by
Sorin Paliga (2003, 2007).
[Paliga, Sorin (2003)]
N. D. Andreev's Proto-Boreal Theory and Its Implications in Understanding the Central-East and Southeast European Ethnogenesis: Slavic, Baltic and Thracian
''Romanoslavica'' 38: 93–104. Papers and articles for the 13th International Congress of Slavicists, Ljubljana, August 15–21, 2003.
Pagel et al.
In 2013,
Mark Pagel, Quentin D. Atkinson, Andreea S. Calude, and Andrew Meade published statistical evidence that attempts to overcome these objections. According to their earlier work, most words exhibit a "
half-life
" of between 2,000 and 4,000 years, consistent with existing theories of linguistic replacement. However, they also identified some words – numerals, pronouns, and certain adverbs – that exhibit a much slower rate of replacement with half-lives of 10,000 to 20,000 or more years. Drawing from research in a diverse group of modern languages, the authors were able to show the same slow replacement rates for key words regardless of current pronunciation. They conclude that a stable core of largely unchanging words is a common feature of all human discourse, and model replacement as inversely proportional to usage frequency.
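The half-life figures above can be turned into rough survival odds. The following is a minimal sketch, assuming that word replacement follows simple exponential decay (which is what a "half-life" implies) and using an illustrative time depth of 15,000 years; the function name and the choice of time depth are assumptions made for illustration and are not taken from Pagel ''et al.''.

```python
def retention_probability(years: float, half_life: float) -> float:
    """Chance that a word has NOT been replaced after `years`,
    assuming exponential decay with the given half-life (in years)."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (years / half_life)


# Illustrative time depth (an assumption for this sketch).
ELAPSED_YEARS = 15_000

# Half-lives quoted in the text: 2,000-4,000 years for most words,
# 10,000-20,000 years for the slowest-changing words.
HALF_LIVES = {
    "typical word (2,000 yr)": 2_000,
    "typical word (4,000 yr)": 4_000,
    "stable word (10,000 yr)": 10_000,
    "stable word (20,000 yr)": 20_000,
}

for label, half_life in HALF_LIVES.items():
    p = retention_probability(ELAPSED_YEARS, half_life)
    print(f"{label}: {p:.1%} chance of surviving {ELAPSED_YEARS:,} years")
```

Under these assumptions a typical word has well under a 10% chance of surviving 15,000 years, while a word with a 10,000 to 20,000 year half-life retains a chance of roughly one third to three fifths, which is why the argument concentrates on the slow-changing vocabulary.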
Pagel et al. used hypothesized reconstructions of proto-words from seven language families listed in the Languages of the World Etymological Database (LWED).
They limited their search to the 200 most common words as described by the
Swadesh fundamental vocabulary list. Twelve words were excluded because proto-words had been proposed for two or fewer language families. The remaining 188 words yielded 3804 different reconstructions (sometimes with multiple constructions for a given family). In contrast to traditional comparative linguistics, the researchers did not attempt to "prove" any given pairing as cognates (based on similar sounds), but rather treated each pairing as a
binary random variable subject to error. The set of possible cognate pairings was then analyzed as a whole for predictable regularities.
[Pagel ''et al.'', p. 2]
Words were separated into groupings based on how many language families appeared to be cognate for the word. Among the 188 words, cognate groups ranged from 1 (no cognates) to 7 (all languages cognate) with a mean of 2.3 ± 1.1. The distribution of cognate class size was
positively skewed − many more small groups than large ones − as predicted by their hypothesis of variant decay rates.
Words were then grouped by their generalized worldwide frequency of use, part of speech, and previously estimated rate of replacement. Cognate class size was positively correlated with estimated replacement rate (
r=0.43,
p<0.001). Generalized frequency combined with part of speech was also a strong predictor of class size (r=0.48, p<0.001). Pagel ''et al.'' conclude "This result suggests that, consistent with their short estimated half-lives, infrequently used words typically do not exist long enough to be deeply ancestral, but that above the threshold frequency words gain greater stability, which then translates into larger cognate class sizes."
[Pagel ''et al.'', p. 3]
Twenty-three word meanings had cognate class sizes of four or more.
Words used more than once per 1,000 spoken words (χ² = 24.29, P < 0.001), pronouns (χ² = 26.1, P < 0.0001), and adverbs (χ² = 14.5, P = 0.003) were over-represented among those 23 words. Frequently used words, controlled for part of speech, were 7.5 times more likely (P < 0.001) than infrequently used words to be judged as cognate. These findings matched their ''
a priori
'' predictions about word classes more likely to retain sound and meaning over long periods of time.
[Pagel ''et al.'', p. 4] The authors write "Our ability to predict these words independently of their sound
correspondences dilutes the usual criticisms leveled at such long-range linguistic reconstructions, that proto-words are unreliable or
inaccurate, or that apparent phonetic similarities among them reflect chance sound resemblances." On the first point, they argue that inaccurate reconstructions should weaken, not enhance, the signals. On the second, they argue that chance resemblances should be equally common across all word usage frequencies, in contrast to what the data shows.
[Pagel ''et al.'', p. 5]
The team then created a
Markov chain Monte Carlo
simulation to estimate and date the
phylogenetic tree
s of the seven language families under examination. Five separate runs produced the same (unrooted) tree, with three sets of language families: an eastern grouping of Altaic, Inuit–Yupik, and Chukchi–Kamchatkan; a central and southern Asia grouping of Kartvelian and Dravidian; and a northern and western European grouping of Indo-European and Uralic.
Two rootings were considered, using established age estimates for Proto-Indo-European and Proto-Chukchi–Kamchatkan as calibration. The first roots the tree to the midpoint of the branch leading to proto-Dravidian and yields an estimated origin for Eurasiatic of 14450 ± 1750 years ago. The second roots the tree to the proto-Kartvelian branch and yields 15610 ± 2290 years ago. Internal nodes have less certainty, but exceed chance expectations, and do not affect the top-level age estimate. The authors conclude "All inferred ages must be treated with caution but our estimates are consistent with proposals linking the near concomitant spread of the language families that comprise this group to the retreat of glaciers in Eurasia at the end of the last ice age ~15,000 years ago."
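The two rooting estimates can be compared with simple arithmetic. The sketch below assumes that each "±" value can be read as a symmetric range around the central estimate; the text does not say whether these are standard deviations or interval bounds, so that reading, along with all names in the code, is an assumption made for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class AgeEstimate(NamedTuple):
    label: str
    years: int        # central estimate, in years before present
    plus_minus: int   # quoted "±" value, read here as a symmetric range

    def interval(self) -> Tuple[int, int]:
        """Return (youngest, oldest) bounds implied by the ± value."""
        return (self.years - self.plus_minus, self.years + self.plus_minus)


# Figures quoted in the text for the two rootings.
ESTIMATES = [
    AgeEstimate("rooted on the proto-Dravidian branch midpoint", 14_450, 1_750),
    AgeEstimate("rooted on the proto-Kartvelian branch", 15_610, 2_290),
]


def overlap(a: Tuple[int, int], b: Tuple[int, int]) -> Optional[Tuple[int, int]]:
    """Return the shared range of two intervals, or None if they are disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


for estimate in ESTIMATES:
    low, high = estimate.interval()
    print(f"{estimate.label}: {low:,} to {high:,} years ago")

shared = overlap(ESTIMATES[0].interval(), ESTIMATES[1].interval())
print("shared range:", shared)
```

Read this way, both ranges contain the figure of about 15,000 years that the authors associate with the end of the last ice age.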
Many academics specializing in
historical linguistics
via the
comparative method
are, however, skeptical of the conclusions of the paper, and critical of its assumptions and methodology. Writing on the University of Pennsylvania blog ''
Language Log'',
Sarah Thomason questions the accuracy of the LWED data on which the paper was based. She notes that LWED lists multiple possible proto-word reconstructions for most words, increasing the possibility of chance matches.
[Thomason] Pagel ''et al.'' anticipated this criticism and state that since infrequently used words generally have more proposed reconstructions, such errors should "produce a bias in the opposite direction" of what the statistics actually show (i.e. that infrequently used words should have larger cognate groups if chance alone was the source).
[Pagel ''et al.'' (SI), pp. 3–4] Thomason also argues that since contributions to the LWED come primarily from believers in
Nostratic, a proposed superfamily even broader than Eurasiatic, the data is likely to be biased towards proto-words that can be judged cognate.
Pagel ''et al.'' admit they "cannot rule out this bias", but say they think it is unlikely that bias has systematically affected their results. They argue that certain word types generally believed to be long-lived (e.g., numbers) do not appear on their 23-word list, while other words of relatively low importance in modern society, but important to ancient people, do appear on the list (e.g., ''bark'' and ''ashes''), thus casting doubt on bias being the cause of the apparent cognates.
Thomason says she is "unqualified" to comment on the statistics themselves, but says any model that uses bad data as input cannot provide reliable results.
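The concern about multiple reconstructions can be made concrete with a small calculation. This is a hedged sketch rather than anything from the paper or from Thomason's critique: it assumes each candidate reconstruction has the same independent probability of resembling a form in another family by chance, and the 5% figure is purely hypothetical.

```python
def chance_match_probability(p_single: float, n_reconstructions: int) -> float:
    """Probability that at least one of `n_reconstructions` candidates
    matches by chance, assuming independent trials with probability
    `p_single` each (a simplifying assumption)."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_reconstructions < 0:
        raise ValueError("n_reconstructions must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_reconstructions


# Hypothetical per-candidate chance of a spurious resemblance.
P_SINGLE = 0.05

for n in range(1, 6):
    p = chance_match_probability(P_SINGLE, n)
    print(f"{n} reconstruction(s): {p:.1%} chance of at least one spurious match")
```

The probability rises with every additional candidate, which is the core of the objection. The reply by Pagel ''et al.'' rests on the same arithmetic: if infrequently used words carry more proposed reconstructions, chance alone should inflate their cognate classes, not those of frequently used words.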
Asya Pereltsvaig takes a different approach in her critique of the paper. Outlining the history (in English) of several of the words on the Pagel list, she concludes it is impossible that such words could have retained any sound and meaning pairings from 15,000 years ago, given how much they have changed in the roughly 1,500-year attested history of English. She also states that the authors are "looking in the wrong place" to begin with since "grammatical properties are more reliable than words as indicators of familial relationships".
Pagel ''et al.'' also examined two other possible objections to their conclusions. They rule out
linguistic borrowing as a significant factor in the results on the basis that for a word to appear cognate in many language families solely because of borrowing would require frequent swapping back and forth. This is deemed unlikely because of the large geographical area covered by the language groups and because frequently-used words are the least likely to be borrowed in modern times.
Finally, they state that leaving aside
closed-class words with simple phonologies (e.g., ''I'' and ''we'') does not affect their conclusions.
Classification
According to Greenberg, the language family that Eurasiatic is most closely connected to is
Amerind. He states that "the Eurasiatic-Amerind family represents a relatively recent expansion (circa 15,000 years ago) into territory opened up by the melting of the Arctic ice cap". In contrast, "Eurasiatic-Amerind stands apart from the other families of the Old World, among which the differences are much greater and represent deeper chronological groupings". Like Eurasiatic, Amerind is not a generally accepted proposal.
Eurasiatic and another proposed macrofamily,
Nostratic, often include many of the same language families.
Vladislav Illich-Svitych's Nostratic dictionary did not include the smaller Siberian language families listed in Eurasiatic, but this was only because protolanguages had not been reconstructed for them; Nostraticists have not attempted to exclude these languages from Nostratic. Many Nostratic theorists have accepted Eurasiatic as a subgroup within Nostratic alongside
Afroasiatic,
Kartvelian, and
Dravidian. LWED likewise views Eurasiatic as a subfamily of Nostratic.
The Nostratic family is not endorsed by the mainstream of
comparative linguistics
.
Harold C. Fleming includes Eurasiatic as a subgroup of the hypothetical
Borean family.
Subdivisions

The subdivisioning of Eurasiatic varies by proposal, but usually includes
Turkic,
Tungusic,
Mongolic,
Chukchi-Kamchatkan,
Eskimo–Aleut,
Indo-European
, and
Uralic.
Greenberg enumerates eight branches of Eurasiatic, as follows: Altaic
(Turkic, Mongolic, Tungusic), Chukchi-Kamchatkan, Eskimo–Aleut,
Etruscan, Indo-European, "Korean-Japanese-Ainu",
Nivkh, and
Uralic–Yukaghir. He then breaks these families into smaller sub-groups, some of which are themselves not widely accepted as phylogenetic groupings.
Pagel ''et al.'' use a slightly different branching, listing seven language families: Altaic
(Turkic, Mongolic, Tungusic), Chukchi-Kamchatkan,
Dravidian, "Inuit-Yupik" (the name given to the LWED grouping of Inuit (Eskimo) languages that does not include Aleut), Indo-European,
Kartvelian, and Uralic.
Murray Gell-Mann
,
Ilia Peiros, and
Georgiy Starostin group
Chukotko-Kamchatkan and
Nivkh with
Almosan instead of Eurasiatic.
Regardless of version, these lists cover the languages spoken in most of
Europe
,
Central and
Northern Asia and (in the case of Eskimo-Aleut) on either side of the
Bering Strait.
The branching of Eurasiatic is roughly (following Greenberg):
*Eurasiatic
**
Indo-European
(unity undisputed)
**
Uralic–Yukaghir (hypothetical)
***
Uralic (unity undisputed)
***
Yukaghir (unity undisputed)
**
Nivkh (unity undisputed)
**
Chukotko-Kamchatkan (unity undisputed)
**
Eskaleut (unity undisputed)
**
Altaic (controversial)
***
Turkic (unity undisputed)
***
Mongolic (unity undisputed)
***
Tungusic (unity undisputed)
**Korean–Japanese–Ainu (hypothetical)
***
Koreanic (unity undisputed)
***
Japonic (unity undisputed)
***
Ainu (unity undisputed)
**
Tyrsenian (grouping of three closely related extinct languages; their affiliation with Eurasiatic, based primarily on "mi" first person singular, is highly speculative given lack of attestation)
Jäger (2015)
A computational phylogenetic analysis by Jäger (2015) provided the following phylogeny of language families in Eurasia:
Geographical distribution
Merritt Ruhlen suggests that the geographical distribution of Eurasiatic shows that it and the
Dené–Caucasian family are the result of separate migrations. Dené–Caucasian is the older of the two groups, with the emergence of Eurasiatic being more recent. The Eurasiatic expansion overwhelmed Dené–Caucasian, leaving speakers of the latter restricted mainly to isolated pockets (the
Basques
in the
Pyrenees Mountains,
Caucasian peoples in the
Caucasus Mountains, and the
Burushaski in the
Hindu Kush
Mountains) surrounded by Eurasiatic speakers. Dené–Caucasian survived in these areas because they were difficult to access and therefore easy to defend; the reasons for its survival elsewhere are unclear. Ruhlen argues that Eurasiatic is supported by stronger and clearer evidence than Dené–Caucasian, and that this also indicates that the spread of Dené–Caucasian occurred before that of Eurasiatic.
The existence of a Dené–Caucasian family is disputed or rejected by most linguists, including
Lyle Campbell
,
Ives Goddard, and
Larry Trask.
[Trask, p. 85]
The last common ancestor of the family was estimated by phylogenetic analysis of ultraconserved words at roughly 15,000 years old, suggesting that these languages spread from a
"refuge" area at the Last Glacial Maximum.
See also
*
Indo-Semitic languages
*
Indo-Uralic languages
*
Nostratic languages
*
Proto-Human language
*
Ural–Altaic languages
*
Uralo-Siberian languages
Notes
The 23 words are (listed in order of cognate class size): ''Thou'' (7 cognates), ''I'' (6), ''Not, That, To give, We, Who'' (5), ''Ashes, Bark, Black, Fire, Hand, Male/man, Mother, Old, This, To flow, To hear, To pull, To spit, What, Worm, Ye'' (4)
References
* Greenberg, Joseph H. 1957. ''Essays in Linguistics''. Chicago: University of Chicago Press.
* Greenberg, Joseph H. 2000. ''Indo-European and Its Closest Relatives: The Eurasiatic Language Family. Volume 1, Grammar''. Stanford: Stanford University Press.
* Greenberg, Joseph H. 2002. ''Indo-European and Its Closest Relatives: The Eurasiatic Language Family. Volume 2, Lexicon''. Stanford: Stanford University Press.
* Greenberg, Joseph H. 2005. ''Genetic Linguistics: Essays on Theory and Method'', edited by William Croft. Oxford: Oxford University Press.
* Mithun, Marianne. 1999. ''The Languages of Native North America''. Cambridge: Cambridge University Press.
* Nichols, Johanna. 1992. ''Linguistic Diversity in Space and Time.'' Chicago: University of Chicago Press.
Further reading
* Bancel, Pierre J.; de l'Etang, Alain Matthey. "The millennial persistence of Indo-European and Eurasiatic pronouns and the origin of nominals". In: ''In Hot Pursuit of Language in Prehistory: Essays in the four fields of anthropology. In honor of Harold Crane Fleming''. Edited by John D. Bengtson. John Benjamins Publishing Company, 2008. pp. 439–464. https://doi.org/10.1075/z.145.32ban
External links
* NorthEuraLex
* "Indo-Uralic and Altaic" by Frederik Kortlandt (2006)
* Regions Based on Social Structure: A Reconsideration