Genetics and archaeogenetics of South Asia

Genetics and archaeogenetics of South Asia is the study of the genetics and archaeogenetics of the ethnic groups of South Asia. It aims at uncovering these groups' genetic history. The geographic position of South Asia makes its biodiversity important for the study of the early dispersal of anatomically modern humans across Asia.

Based on mitochondrial DNA (mtDNA) variation, studies of genetic unity across South Asian subpopulations have shown that most of the ancestral nodes of the phylogenetic tree of all the mtDNA types originated in South Asia. Conclusions of studies based on Y-chromosome variation and autosomal DNA variation have been varied.

South Asians are descendants, in varying degrees, of an indigenous South Asian component (termed ''Ancient Ancestral South Indians'', or "AASI" for short), which is closest to modern isolated tribal groups from South India and to the Andamanese peoples and more distantly related to Aboriginal Australians and East Asians, and of later-arriving West-Eurasian (European/Middle Eastern-related) and additional East/Southeast Asian components. The AASI-type ancestry is found at the highest levels among tribal groups of southern India, such as the Paniya, and is generally found throughout all South Asian ethnic groups in substantially varying degrees.

The West-Eurasian ancestry, specifically an Iranian-related component, combined with varying degrees of AASI ancestry to form the ''Indus Periphery Cline'' around 5400–3700 BCE, the main ancestry of most modern South Asian groups. Around the 2nd millennium BCE, the Indus Periphery ancestry mixed with another West-Eurasian wave, the incoming, mostly male-mediated Yamnaya-Steppe component, to form the ''Ancestral North Indians'' (ANI), while at the same time it contributed to the formation of the ''Ancestral South Indians'' (ASI) by admixture with hunter-gatherers having higher proportions of AASI-related ancestry. The ANI-ASI gradient that resulted from admixture between the ANI and the ASI at various proportions after 2000 BCE, reflected in the higher proportion of ANI among traditionally upper-caste groups and Indo-European speakers, is termed the ''Indian Cline''.

The East Asian ancestry component forms the major ancestry among Tibeto-Burmese and Khasi-Aslian speakers in the Himalayan foothills and Northeast India, and is generally distributed throughout South Asia at lower frequency, with substantial presence in Mundari-speaking groups, as well as in some populations of northern, central and eastern South Asia.


Overview

According to recent genome studies, South Asians are overall descendants of three ancestral groups in varying degrees: an indigenous South Asian component (often termed ''Ancient Ancestral South Indians'', or "AASI" for short; an ancient population most closely related to the Andamanese, and more distantly to East Asians and Aboriginal Australians), with its highest frequency among southern Indian tribal groups; a West-Eurasian (European/Middle Eastern-related) component, which makes up the majority of derived ancestry for South Asians; and an additional East/Southeast Asian component, which is found primarily among ethnic minority groups along the Himalayan mountain range and in Northeastern India. A specific Neolithic Iranian component, which may be associated with the spread of Dravidian languages, forms the base ancestry of South Asians. This component, paired with substantial AASI ancestry, resulted in the ''Indus Periphery Cline'', which is characteristic of South Asians. A Yamnaya Steppe pastoralist component is found at higher frequency among Indo-Aryan speakers, and is generally distributed throughout the Indian subcontinent. An East Asian ancestry component forms the major ancestry among Tibeto-Burmese and Khasi-Aslian speakers in the Himalayan foothills and Northeast India, and also has a substantial presence in Mundari-speaking groups.

The AASI population has been genetically isolated from other populations since approximately 45,000 BCE. The Andamanese people are hypothesized to be most closely related to the AASI population and are sometimes used as an (imperfect) proxy for it, but others propose Indian tribal groups such as the Paniya and Irula as better proxies for indigenous South Asian (AASI) ancestry than the Andamanese. According to Yelmen et al. 2019, the AASI "''separated from East Asian and Andamanese populations, shortly after having separated from West Eurasian populations''". According to Yang (2022): "''This distinct South Asian ancestry, denoted as the Ancient Ancestral South Indian (AASI) lineage, was only found in a small percentage of ancient and present-day South Asians. Present-day Onge from the Andamanese Islands are the best reference population to date, but Narasimhan et al. used qpGraph to show that the divergence between the AASI lineage and the ancestry found in present-day Onge was very deep. Ancestry associated with the AASI lineage was found at low levels in almost all present-day Indian populations''".

The earliest West-Eurasian ancestry is proposed to have arrived perhaps as early as the Paleolithic, about 40,000 BC, and may be linked to expanding Aurignacian groups of the Levant. Genetic data show that the main West-Eurasian wave occurred during the Neolithic period, or already during the Holocene, in tandem with the arrival of East Asian-related components during the Neolithic period with Austroasiatic and Tibeto-Burmese groups from Southeast Asia and East Asia respectively. According to an international research team led by palaeogeneticists of the Johannes Gutenberg University Mainz (JGU), one of the most important ancestry components of South Asians is derived from a population related to Neolithic farmers from the eastern Fertile Crescent and Iran. They concluded "that the Iranian genomes represent the main ancestors of modern-day South Asians".

In the 2nd millennium BCE, the ''Indus Periphery''-related ancestry mixed with the arriving Yamnaya-Steppe component, forming the ''Ancestral North Indians'' (ANI), while at the same time it contributed to the formation of the ''Ancestral South Indians'' (ASI) by admixture with hunter-gatherers further south having higher proportions of AASI-related ancestry. The proximity to West-Eurasian populations is based on the ''ANI-ASI'' gradient, also termed the ''Indian Cline'', with the groups harboring higher ANI ancestry being closer to West Eurasians than populations harboring higher ASI ancestry. Tribal groups from southern India harbor mostly ASI ancestry and sit farthest from West-Eurasian groups on principal component analysis (PCA) plots compared to other South Asians.

It has been found that the ancestral nodes of the phylogenetic tree of all the mtDNA types (mitochondrial DNA haplogroups) typically found in Central Asia, West Asia and Europe are also found in South Asia at relatively high frequencies. The inferred divergence of this common ancestral node is estimated to have occurred slightly less than 50,000 years ago. In India, the major maternal lineages are various M subclades, followed by R and U sublineages. The coalescence times of these mitochondrial haplogroups have been estimated at about 50,000 BP.

The major paternal lineages of South Asians, represented by Y chromosomes, are haplogroups R1a1, R2, H, L and J2, as well as O-M175. R1a1, J2 and L are mainly found among European and Middle Eastern populations; O-M175 is mainly restricted to Austroasiatic and Tibeto-Burmese speakers and is also common among East and Southeast Asians; H is mostly restricted to South Asians. Some researchers have argued that Y-DNA haplogroup R1a1 (M17) is of autochthonous South Asian origin. However, proposals for a Central Asian/Eurasian steppe origin for R1a1 are also quite common and are supported by several more recent studies.

Genetic studies comparing eight X-chromosome-based STR markers using a multidimensional scaling (MDS) plot revealed that South Asians such as Indians, Bangladeshis and Sinhalese people cluster close to each other, but also closer to Europeans. In contrast, Southeast Asians, East Asians and Africans were placed at distant positions, outside the main cluster.
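
The ANI-ASI gradient described above can be pictured, in a much simplified way, as a two-source mixture in which a population's allele frequency at a marker is a weighted average of the two source frequencies. The following is only a minimal sketch under that assumption; the function names and the allele frequencies are hypothetical illustrations, not values or methods taken from the studies cited in this article, which rely on genome-wide statistical tools.

```python
def expected_frequency(alpha, p_source_a, p_source_b):
    """Allele frequency expected in a population that draws a fraction
    `alpha` of its ancestry from source A and the rest from source B."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return alpha * p_source_a + (1.0 - alpha) * p_source_b


def estimate_alpha(p_observed, p_source_a, p_source_b):
    """Invert the mixture equation for a single marker and clamp the
    result to the valid range [0, 1]."""
    if p_source_a == p_source_b:
        raise ValueError("marker is uninformative: sources have equal frequency")
    alpha = (p_observed - p_source_b) / (p_source_a - p_source_b)
    return min(1.0, max(0.0, alpha))


# Hypothetical allele frequencies in the two source populations.
P_ANI, P_ASI = 0.80, 0.20

for alpha in (0.2, 0.5, 0.7):
    p_mix = expected_frequency(alpha, P_ANI, P_ASI)
    recovered = estimate_alpha(p_mix, P_ANI, P_ASI)
    print(f"ANI fraction {alpha:.1f}: expected frequency {p_mix:.2f}, "
          f"recovered fraction {recovered:.2f}")
```

Populations with different mixture fractions fall along a straight line between the two sources, which is the intuition behind calling the gradient a cline. A single marker is very noisy in practice, so real estimates average over many markers.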


mtDNA

The most frequent mtDNA haplogroups in South Asia are M, R and U (where U is a descendant of R). Arguing for the longer-term "rival Y-Chromosome model", Stephen Oppenheimer believes that it is highly suggestive that India is the origin of the Eurasian mtDNA haplogroups, which he calls the "Eurasian Eves". According to Oppenheimer, it is highly probable that nearly all human maternal lineages in Central Asia, the Middle East and Europe descended from only four mtDNA lines that originated in South Asia 50,000–100,000 years ago.


Macrohaplogroup M

The macrohaplogroup M, which is considered a cluster of the proto-Asian maternal lineages, represents more than 60% of South Asian mtDNA. The M macrohaplotype in India includes many subgroups that differ profoundly from other sublineages in East Asia, especially Mongoloid populations. The deep roots of the M phylogeny clearly establish the relic status of the South Asian lineages compared to other M sublineages (in East Asia and elsewhere), suggesting an 'in-situ' origin of these sub-haplogroups in South Asia, most likely in India. These deep-rooting lineages are not language-specific and are spread over all the language groups in India. Virtually all modern Central Asian mtDNA M lineages seem to belong to the Eastern Eurasian (Mongolian) rather than the South Asian subtypes of haplogroup M, which indicates that no large-scale migration from the present Turkic-speaking populations of Central Asia to India occurred. The absence of haplogroup M in Europeans, compared to its equally high frequency among South Asians, East Asians and some Central Asian populations, contrasts with the Western Eurasian leanings of South Asian paternal lineages. Most of the extant mtDNA boundaries in South and Southwest Asia were likely shaped during the initial settlement of Eurasia by anatomically modern humans.


Macrohaplogroup R

The macrohaplogroup R (a very large and old subdivision of macrohaplogroup N) is also widely represented and accounts for the other 40% of South Asian mtDNA. A very old and most important subdivision of it is haplogroup U, which, while also present in West Eurasia, has several subclades specific to South Asia. Most important South Asian haplogroups within R:


Haplogroup U

Haplogroup U is a sub-haplogroup of macrohaplogroup R. The distribution of haplogroup U is a mirror image of that of haplogroup M: the former has not been described so far among eastern Asians but is frequent in European populations as well as among South Asians. South Asian U lineages differ substantially from those in Europe, and their coalescence to a common ancestor also dates back about 50,000 years.


Y chromosome

The major South Asian Y-chromosome DNA haplogroups are H, J2, L, R1a1 and R2, which are commonly found among other West-Eurasian populations, such as Middle Easterners or Europeans. Their geographical origins are listed as follows, according to the latest scholarship:


Haplogroup H

Haplogroup H (Y-DNA) is found at a high frequency in South Asia and is considered to represent the major indigenous paternal lineage. H is today rarely found outside of South Asia, but is common among South Asian-descended populations, such as the Romanis, particularly the H-M82 subgroup. H was also found in some ancient samples from Europe and is still found today at a low frequency in certain southeastern Europeans and Arabs of the Levant. Haplogroup H is frequently found among populations of India, Sri Lanka, Nepal, Pakistan and the Maldives. All three branches of haplogroup H (Y-DNA) are found in South Asia. Haplogroup H is believed to have arisen in South Asia between 30,000 and 40,000 years ago. Its probable site of introduction is South Asia, since it is concentrated there. It seems to represent the main Y-chromosome haplogroup of the Paleolithic inhabitants of South Asia and Europe respectively. Some individuals in South Asia have also been shown to belong to the much rarer subclade H3 (Z5857). Haplogroup H is by no means restricted to specific populations. For example, H is carried by about 28.8% of Indo-Aryan castes and by about 25–35% of tribals.


Haplogroup J2

Haplogroup J2 has been present in South Asia, mostly as J2a-M410 and J2b-M102, since Neolithic times (9500 YBP). J2 clades attain peak frequencies in North-West and South India and are found at 19% within South Indian castes, 11% in North Indian castes and 12% in Pakistan. In South India, the presence of J2 is higher among middle castes at 21%, followed by upper castes at 18.6% and lower castes at 14%. Among caste groups, the highest frequency of J2-M172 is observed among Tamil Vellalars of South India, at 38.7%. J2 is present in tribals too and has a frequency of 11% in Austro-Asiatic tribals. Among the Austro-Asiatic tribals, J2 is most frequent in the Lodha (35%). J2 is also present in the South Indian hill tribe Toda at 38.46%, in the Andh tribe of Telangana at 35.19% and in the Kol tribe of Uttar Pradesh at a frequency of 33.34%. Haplogroup J-P209 was found to be more common in India's Shia Muslims, of whom 28.7% belong to haplogroup J, with 13.7% in J-M410, 10.6% in J-M267 and 4.4% in J2b.

In Pakistan, the highest frequencies of J2-M172 were observed among the Parsis at 38.89%, the Dravidian-speaking Brahuis at 28.18% and the Makrani Balochs at 24%. It also occurs at 18.18% in Makrani Siddis and at 3% in Karnataka Siddis. J2-M172 is found at an overall frequency of 10.3% among the Sinhalese people of Sri Lanka. In the Maldives, 20.6% of the Maldivian population was found to be haplogroup J2 positive.
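
The frequencies quoted above are point estimates, and the text here does not report the underlying sample sizes. As a hedged illustration of why that matters, the sketch below computes a Wilson score interval for a haplogroup frequency. The counts used (5 carriers among 13 sampled men, and 50 among 130) are hypothetical and are not taken from any of the studies summarized in this section; the function name is likewise an assumption made for illustration.

```python
import math


def wilson_interval(carriers, sample_size, z=1.96):
    """Wilson score interval for a proportion.

    `z` is the normal quantile; 1.96 corresponds to roughly 95% coverage.
    Returns (lower, upper) bounds as fractions between 0 and 1.
    """
    if sample_size <= 0 or not 0 <= carriers <= sample_size:
        raise ValueError("need 0 <= carriers <= sample_size and sample_size > 0")
    p = carriers / sample_size
    z2 = z * z
    denom = 1.0 + z2 / sample_size
    centre = (p + z2 / (2.0 * sample_size)) / denom
    half_width = z * math.sqrt(
        p * (1.0 - p) / sample_size + z2 / (4.0 * sample_size ** 2)
    ) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts: the same observed frequency from a small and a larger sample.
for carriers, n in ((5, 13), (50, 130)):
    low, high = wilson_interval(carriers, n)
    print(f"{carriers}/{n} = {carriers / n:.2%}, "
          f"approx. 95% interval {low:.1%} to {high:.1%}")
```

The same observed percentage is far less certain when it comes from a small sample, so comparisons between groups with similar reported frequencies should take sample size into account where the original studies provide it.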


Haplogroup L

According to Dr. Spencer Wells, L-M20 originated in the Pamir Knot region in Tajikistan and migrated into Pakistan and India ca. 30,000 years ago. However, most other studies have proposed a South Asian origin for L-M20 and associated its expansion with the Indus valley (~7,000 YBP). There are three subbranches of haplogroup L: L1-M76 (L1a1), L2-M317 (L1b) and L3-M357 (L1a2), found at varying levels in South Asia.


India

Haplogroup L shows an expansion time dating to the Neolithic. The clade is present in the Indian population at an overall frequency of ca. 7–15%. Haplogroup L has a higher frequency among south Indian castes (ca. 17–19%) and reaches 68% in some castes in Karnataka, but is somewhat rarer in northern Indian castes (ca. 5–6%). Haplogroup L is quite rare among tribal groups (ca. 5.6–7%); however, a frequency of 14.6% has been observed among the Chenchus. Among regional and social groups, moderate to high frequencies have been observed in Konkanastha Brahmins (18.6%), Punjabis (12.1%), Gujaratis (10.4%), Lambadis (17.1%) and Jats (36.8%).


Pakistan

In Pakistan, the L1-M76 and L3-M357 subclades of L-M20 reach overall frequencies of 5.1% and 6.8%, respectively. Haplogroup L3 (M357) is found frequently among Burusho (approx. 12%) and Pashtuns (approx. 7%). Its highest frequency can be found in the south-western Balochistan province, along the Makran coast (28%) to the Indus River delta. L3a (PK3) is found in approximately 23% of Nuristani in northwest Pakistan. The clade is present in moderate distribution among the general Pakistani population (approx. 14%).


Sri Lanka

In one study, 16% of the Sinhalese were found to be Haplogroup L-M20 positive. In another study 18% were found to belong to L1.


Haplogroup R1a1

In South Asia, R1a1 has often been observed at high frequency in a number of demographic groups, as well as with the highest STR diversity, which led some to see it as the locus of origin. While R1a originated ca. 22,000 to 25,000 years ago, its subclade M417 (R1a1a1) diversified ca. 5,800 years ago. The distribution of M417-subclades R1-Z282 (including R1-Z280) in Central and Eastern Europe and R1-Z93 in Asia suggests that R1a1a diversified within the Eurasian Steppes or the Middle East and Caucasus region. The place of origin of these subclades plays a role in the debate about the origins of Indo-Europeans.


India

In India, a high percentage of this haplogroup is observed in West Bengal Brahmins (72%) to the east, Gujarat Lohanas (60%) to the west, Khatris (67%) in the north, and Iyengar Brahmins (31%) in the south. It has also been found in several South Indian Dravidian-speaking tribal groups, including the Kotas (41%) of Tamil Nadu, the Chenchu (26%) and Valmikis of Andhra Pradesh, and the Yadav and Kallar of Tamil Nadu, suggesting that M17 is widespread in these southern Indian tribes. Besides these, studies show high percentages in regionally diverse groups such as Manipuris (50%) to the extreme northeast and Punjabis (47%) to the extreme northwest.


Pakistan

In Pakistan, it is found at 71% among the Mohanna of Sindh Province to the south and 46% among the Baltis of Gilgit-Baltistan to the north.


Sri Lanka

23% of the Sinhalese people out of a sample of 87 subjects were found to be R1a1a (R-SRY1532) positive according to a 2003 study.
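Frequencies reported from small samples carry considerable sampling uncertainty. As a minimal sketch of how wide that uncertainty can be, the snippet below computes a Wilson score interval for a binomial proportion. The function name `wilson_interval` is hypothetical, and the carrier count is an assumption: the study reports only the percentage and the sample size, so the count is inferred here by rounding 23% of 87.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Assumption: 23% of 87 subjects corresponds to about 20 carriers.
carriers = round(0.23 * 87)
low, high = wilson_interval(carriers, 87)
print(f"{carriers}/87 carriers: approx. 95% interval {low:.1%} to {high:.1%}")
```

The same caution applies to any of the group-level percentages in this section that rest on a few dozen sampled individuals.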


Maldives

In the Maldives, 23.8% of the Maldivian people were found to be R1a1a (M17) positive.


Nepal

People in the Terai region of Nepal show R1a1a at 69%.


Haplogroup R2

In South Asia, the frequency of the R2 and R2a lineages is around 10–15% in India and Sri Lanka and 7–8% in Pakistan. At least 90% of R-M124 individuals are located in South Asia. It is also reported in the Caucasus and Central Asia at a lower frequency. A genetic study by Mondal et al. in 2017 concluded that Haplogroup R2 originated in northern India and was already present before the Steppe migration.


India

Among regional groups, it is found among West Bengalis (23%), New Delhi Hindus (20%), Punjabis (5%) and Gujaratis (3%). Among tribal groups, Karmalis of West Bengal showed the highest frequency at 100%, followed by Lodhas (43%) to the east, while the Bhil of Gujarat in the west were at 18%, Tharus of the north showed it at 17%, and the Chenchu and Pallan of the south were at 20% and 14% respectively. Among caste groups, high percentages are shown by Jaunpur Kshatriyas (87%), Kamma (73%), Bihar Yadav (50%), Khandayat (46%) and Kallar (44%). It is also significantly high in many Brahmin groups including Punjabi Brahmins (25%), Bengali Brahmins (22%), Konkanastha Brahmins (20%), Chaturvedis (32%), Bhargavas (32%), Kashmiri Pandits (14%) and Lingayat Brahmins (30%). North Indian Muslims have a frequency of 19% (Sunni) and 13% (Shia), while Dawoodi Bohra Muslims in the western state of Gujarat have a frequency of 16% and Mappila Muslims of southern India have a frequency of 5%.


Pakistan

The R2 haplogroup is found in 14% of the Burusho people. Among the Hunza people it is found at 18%, while the Parsis show it at 20%.


Sri Lanka

38% of the Sinhalese of Sri Lanka were found to be R2 positive according to a 2003 study.


Maldives

12% of the Maldivians are found to have R2.


Nepal

In Nepal, R2 percentages range from 2% to 26% within different groups under various studies. Newars show a significantly high frequency of 26% while people of Kathmandu show it at 10%.


Haplogroup O

Haplogroup O1 (O-F265) and O2 (O-M122), the primary branches of Haplogroup O-M175, are very common among the Austroasiatic and Tibeto-Burmese speaking populations of South Asia respectively. Haplogroup O-M95, a subclade of O1-F265, is mainly restricted to Austroasiatic-speaking groups in South Asia. According to Kumar et al. 2007, M95 averages 55% in Munda and 41% in Khasi-Khmuic speakers from Northeast India, while Reddy et al. 2007 found an average frequency of 53% among Mundari and 31% among Khasi speakers. Zhang et al. 2015 found higher averages of 67.53% and 74.00% among Munda and Khasi-speaking groups respectively. Abundant in the Andaman and Nicobar Islands (averaging ~45%), it is fixed (100%) in some populations like Shompen, Onge and Nicobarese. A migration of O-M95 from Southeast Asia into India has been suggested with an expansion time of 5.2 ± 0.6 KYA in Northeast India.

Haplogroup O2-M122 is primarily found among males of Tibeto-Burmese ancestry in the Himalayas and Northeast India. Haplogroup O-M122, believed to have originated in Southern China, shows very high percentages. It is found at 86.6% among Tamangs of Nepal, with similarly high frequencies, 75% to 85%, among the northeastern Indian Tibeto-Burman groups, including Adi, Naga, Apatani, Nyishi, Kachari and Rabha. In Northeast India, Baric speakers display a high frequency and homogeneity of O-M134, indicating a population bottleneck effect that occurred during a westward and then southward migration of the founding population of Tibeto-Burmans during its branching from the parental population. It has a significant presence among the Khasis (29%), despite being generally absent in other Austroasiatics of India, and it shows up at 55% among neighbouring Garos, a Tibeto-Burman group.


Reconstructing South Asian population history

The , divides the population of South Asia into four ethnolinguistic (not genetic) groups:
Indo-European, Dravidian, Tibeto-Burman and Austro-Asiatic. The molecular anthropology studies use three different types of markers: Mitochondrial DNA (mtDNA) variation, which is maternally inherited and highly polymorphic; Y Chromosome variation, which involves uniparental transmission along the male lines; and Autosomal DNA variation.


mtDNA variation

Most of the studies based on mtDNA variation have reported genetic unity of South Asian populations across language, caste and tribal groups. It is likely that haplogroup M was brought to Asia from East Africa along the southern route by the earliest migration wave 78,000 years ago. According to Kivisild et al. (1999), "Minor overlaps with lineages described in other Eurasian populations clearly demonstrate that recent immigrations have had very little impact on the innate structure of the maternal gene pool of South Asians. Despite the variations found within India, these populations stem from a limited number of founder lineages. These lineages were most likely introduced to South Asia during the Middle Palaeolithic, before the peopling of Europe 48,000 years ago and perhaps the Old World in general." Basu et al. (2003) also emphasise the underlying unity of female lineages in India.


Y Chromosome variation

Conclusions based on Y Chromosome variation have been more varied than those based on mtDNA variation. While Kivisild et al. propose an ancient and shared genetic heritage of male lineages in South Asia, Bamshad et al. (2001) suggest an affinity between South Asian male lineages and modern west Eurasians proportionate to upper-caste rank and place upper-caste populations of southern Indian states closer to East Europeans. Basu et al. (2003) conclude that Austro-Asiatic tribal populations entered India first from the Northwest corridor and much later some of them through the Northeastern corridor, whereas Kumar et al. (2007) analysed 25 South Asian Austro-Asiatic tribes and found a strong paternal genetic link among the sub-linguistic groups of the South Asian Austro-Asiatic populations. Mukherjee et al. (2001) place Pakistanis and North Indians between west Asian and Central Asian populations, whereas Cordaux et al. (2004) argue that the Indian caste populations are closer to Central Asian populations. Sahoo et al. (2006) and Sengupta et al. (2006) suggest that Indian caste populations have not been subject to any recent admixtures. Sanghamitra Sahoo concludes his study with:

Closest-neighbor analysis done by Mondal et al. in 2017 concluded that Indian Y-lineages are close to southern European populations and the time of divergence between the two predated the Steppe migration:


Autosomal DNA variation


AASI-ANI-ASI

Results of studies based upon autosomal DNA variation have also been varied. In a major study (2009) using over 500,000 biallelic autosomal markers, Reich hypothesized that the modern South Asian population was the result of admixture between two genetically divergent ancestral populations dating from the post-Holocene era. He termed these two "reconstructed" ancient populations "Ancestral South Indians" (ASI) and "Ancestral North Indians" (ANI). According to Reich: "ANI ancestry is significantly higher in Indo-European than Dravidian speakers, suggesting that the ancestral ASI may have spoken a Dravidian language before mixing with the ANI." While the ANI is genetically close to Middle Easterners, Central Asians and Europeans, the ASI is not closely related to groups outside of the subcontinent. As no "ASI" ancient DNA is available, the indigenous Andamanese Onge are used as an (imperfect) proxy of ASI (according to Reich et al., the Andamanese, though distinct from them, are the closest living population to the ASI). According to Reich et al., both ANI and ASI ancestry are found all over the subcontinent (in both northern and southern India) in varying proportions, and "ANI ancestry ranges from 39-71% in India, and is higher in traditionally upper caste and Indo-European speakers."

Moorjani et al. 2013 state that the ASI, though not closely related to any living group, are "related (distantly) to indigenous Andaman Islanders." Moorjani et al., however, suggest possible gene flow into the Andamanese from a population related to the ASI, causing the modeled relationship. The study concluded that "almost all groups speaking Indo-European or Dravidian languages lie along a gradient of varying relatedness to West-Eurasians in PCA (referred to as "Indian cline")". A 2013 study by Chaubey using single-nucleotide polymorphisms (SNPs) shows that the genome of Andamanese people (Onge) is closer to those of other Oceanic Negrito groups than to that of South Asians.

According to Basu et al. 2016, further analysis revealed that the genomic structure of mainland Indian populations is best explained by contributions from four ancestral components. In addition to the ANI and ASI, Basu et al. (2016) identified two East Asian ancestral components in mainland India that are major for the Austro-Asiatic-speaking tribals and the Tibeto-Burman speakers, which they denoted as AAA (for "Ancestral Austro-Asiatic") and ATB (for "Ancestral Tibeto-Burman") respectively. The study also infers that the populations of the
Andaman Islands archipelago form a distinct ancestry, which "was found to be coancestral to Oceanic populations" but more distant from South Asians.

The cline of admixture between the ANI and ASI lineages is dated to the period of c. 4.2–1.9 kya by Moorjani et al. (2013), corresponding to the Indian Bronze Age, and associated by the authors with the process of deurbanisation of the Indus Valley civilization and the population shift to the Gangetic system in the incipient Indian Iron Age. Basu et al. (2003) suggest that "Dravidian speakers were possibly widespread throughout India before the arrival of the Indo-European-speaking nomads" and that "formation of populations by fission that resulted in founder and drift effects have left their imprints on the genetic structures of contemporary populations". The geneticist PP Majumder (2010) has argued that the findings of Reich et al. (2009) are in remarkable concordance with previous research using mtDNA and Y-DNA:

Chaubey et al. 2015 detected a distinctive East Asian ancestral component, mainly restricted to specific populations in the foothills of the Himalayas and the northeastern part of India. The highest frequency of the component is observed among the Tibeto-Burmese speaking groups of northeast India; it was also detected in Andamanese populations at 32%, with a substantial presence among Austroasiatic speakers. It is found to be largely absent in Indo-European and Dravidian speakers, except in some specific ethnic groups living in the Himalayan foothills and central-south India. The researchers, however, suggested that the East Asian ancestry (represented by the Han) measured in the studied Andamanese groups may actually reflect the capture of the affinity of the Andamanese with Melanesians and Malaysian Negritos (rather than true East Asian admixture), as a previous study by Chaubey et al. suggested "a deep common ancestry" between Andamanese, Melanesians and other Negrito groups, and an affinity between Southeast Asian Negritos and Melanesians (as well as the Andamanese) with East Asians.

Lazaridis et al. (2016) note: "The demographic impact of steppe related populations on South Asia was substantial, as the
Mala, a south Indian Dalit population with minimal ANI (Ancestral North Indian) along the 'Indian Cline' of such ancestry is inferred to have ~ 18% steppe-related ancestry, while the Kalash of Pakistan are inferred to have ~ 50%, similar to present-day northern Europeans." The study estimated 6.5–50.2% steppe-related admixture in South Asians. Lazaridis et al. further note that "A useful direction of future research is a more comprehensive sampling of ancient DNA from steppe populations, as well as populations of central Asia (east of Iran and south of the steppe), which may reveal more proximate sources of the ANI than the ones considered here, and of South Asia to determine the trajectory of population change in the area directly."

Pathak et al. 2018 concluded that the
Indo-European
speakers of the
Gangetic Plains
and the Dravidian speakers have significant Yamnaya Early-Middle Bronze Age (Steppe_EMBA) ancestry but no Middle-Late Bronze Age Steppe (Steppe_MLBA) ancestry. On the other hand, the "North-Western Indian and Pakistani" populations (PNWI) showed significant Steppe_MLBA ancestry along with Yamnaya (Steppe_EMBA) ancestry. The study also noted that ancient South Asian samples had significantly higher Steppe_MLBA than Steppe_EMBA (or Yamnaya). The study also suggested that the
Ror
s could be used as a proxy for the ANI. David Reich in his 2018 book ''
Who We Are and How We Got Here
'' states that the 2016 analyses found the ASI to have significant amounts of an ancestry component deriving from Iranian farmers (about 25% of their ancestry), with the remaining 75% of their ancestry deriving from native South Asian hunter-gatherers. He adds that the ASI were unlikely to have been the local hunter-gatherers of South Asia, as previously thought, but rather a population responsible for spreading agriculture throughout South Asia. In the case of the ANI, the Iranian farmer ancestry is 50%, with the rest being from steppe groups related to the Yamnaya. Narasimhan et al., similarly, conclude that ANI and ASI were formed in the 2nd millennium BCE. They were preceded by a mixture of AASI (Ancient Ancestral South Indian, i.e. hunter-gatherers sharing a distant root with the Andamanese, Australian Aboriginals, and East Asians) and Iranian agriculturalists who arrived in India ca. 4700–3000 BCE, and "must have reached the Indus Valley by the 4th millennium BCE". According to Narasimhan et al., this mixed population, which probably was native to the Indus Valley Civilisation, "contributed in large proportions to both the ANI and ASI", which took shape during the 2nd millennium BCE. ANI formed out of a mixture of "''Indus Periphery''-related groups" and migrants from the steppe, while ASI was formed out of "''Indus Periphery''-related groups" who moved south and mixed further with local hunter-gatherers. The ancestry of the ASI population is suggested to have averaged about 73% from the AASI and 27% from Iranian-related farmers. Narasimhan et al. observe that samples from the Indus periphery group are always mixes of the same two proximal sources of AASI and Iranian agriculturalist-related ancestry, with "one of the Indus Periphery individuals having ~42% AASI ancestry and the other two individuals having ~14-18% AASI ancestry" (with the remainder of their ancestry being from the Iranian agriculturalist-related population).
The authors propose that the AASI indigenous hunter-gatherers represent a divergent branch that split off around the same time that East Asian, Onge (Andamanese) and Australian Aboriginal ancestors separated from each other. They inferred that "essentially all the ancestry of present-day eastern and southern Asians (prior to West Eurasian-related admixture in southern Asians) derives from a single eastward spread, which gave rise in a short span of time to the lineages leading to AASI, East Asians, Onge, and Australians." A genetic study by Yelmen et al. (2019) argues that the native South Asian genetic component is rather distinct from the Andamanese, and that the Andamanese are thus an imperfect proxy. This component (when represented by the Andamanese Onge) was not detected in the northern Indian
Gujarati
, and thus it is suggested that the South Indian tribal
Paniya
people (a group of predominantly ASI ancestry) would serve as a better proxy than the Andamanese (Onge) for the "native South Asian" component in modern South Asians, as the Paniya are directly derived from the natives of South Asia (rather than distantly related to them as the Onge are). Two genetic studies analysing remains from the Indus Valley civilisation (in parts of Bronze Age Northwest India and East Pakistan) found them to have a mixture of ancestry, both from native South Asian hunter-gatherers sharing a distant root with the Andamanese, and from a group related to Iranian farmers. The samples analyzed by Shinde et al. derived about 50–98% of their genome from Iranian-related peoples and 2–50% from native South Asian hunter-gatherers. The samples analyzed by Narasimhan et al. had 45–82% Iranian farmer-related ancestry and 11–50% South Asian hunter-gatherer ancestry. The analysed samples of both studies have little to none of the "Steppe ancestry" component associated with later Indo-European migrations into India. The authors found that the respective amounts of those ancestries varied significantly between individuals, and concluded that more samples are needed to get the full picture of Indian population history. Yang 2022 summarized that the indigenous South Asian (AASI) lineage gave rise to two groups, the proper AASI within South Asia, and the
Andamanese peoples
. Both populations are closer to each other than to any other population, while ultimately having trifurcated from an "eastern population" which also gave rise to Australasians (AA lineage) and East/Southeast Asians (ESEA lineage). According to her, "''Comparison with ancient individuals from South Asia showed that all present-day Indians have a mixture of ancestry related to the AASI lineage, basal Iranian ancestry, and Steppe ancestry. Northern and southern Indians are both associated with Indus Periphery ancestry observed in populations near and in the Indus Valley older than 4,000 years. Southern Indian populations possess additional ancestry related to the AASI lineage beyond that found in the ancient Indus Valley individuals, which suggests that ancient individuals representing the AASI lineage, who have yet to be sampled, likely lived in southern India. Northern Indians show genetic patterns similar to those found in ancient populations near the Indus Valley younger than 4,000 years; all show admixture with populations associated with Steppe ancestry. These patterns illustrate that in South Asia, the formation of ancestries associated with northern and southern Indians likely post-dated 4,000 years ago, where northern Indian populations associated with the Indus Periphery cline mixed with populations of Steppe ancestry and southern Indian populations in the Indus Periphery cline primarily mixed with populations of the AASI lineage. Just as in other regions of Asia, admixture played a key role in the formation of present-day Indian populations''".


Genetic distance between caste groups and tribes

Studies by Watkins et al. (2005) and Kivisild et al. (2003) based on autosomal markers conclude that Indian caste and tribal populations have a common ancestry. Reddy et al. (2005) found fairly uniform allele frequency distributions across caste groups of southern
Andhra Pradesh
, but significantly larger genetic distance between caste groups and tribes indicating
genetic isolation
of the tribes and castes. Viswanathan et al. (2004), in a study on genetic structure and affinities among tribal populations of southern India, conclude, "''Genetic differentiation was high and genetic distances were not significantly correlated with geographic distances. Genetic drift therefore probably played a significant role in shaping the patterns of genetic variation observed in southern Indian tribal populations.'' Otherwise, analyses of population relationships showed that all Indian and South Asian populations are still similar to one another, regardless of phenotypic characteristics, and do not show any particular affinities to Africans. We conclude that the phenotypic similarities of some Indian groups to Africans ''do not'' reflect a close relationship between these groups, but are better explained by ''convergence''." A 2011 study published in the
American Journal of Human Genetics
indicates that Indian ancestral components are the result of a more complex demographic history than was previously thought. According to the researchers, South Asia harbours two major ancestral components, one of which is spread at comparable frequency and genetic diversity in populations of Central Asia, West Asia and Europe; the other component is more restricted to South Asia. However, if one were to rule out the possibility of a large-scale Indo-Aryan migration, these findings suggest that the genetic affinities of both Indian ancestral components are the result of multiple gene flows over the course of thousands of years. Narasimhan et al. 2019 found that Austroasiatic-speaking Munda tribals, unlike other South Asians, could not be modeled simply as a mixture of ASI, AASI, or ANI ancestry but required an additional ancestry component from Southeast Asia. They were modeled as a mixture of 64% AASI and 36% East Asian-related ancestry, represented by the Nicobarese; the ancestry profile of the Mundas thus provides an independent line of ancestry from Southeast Asia around the 3rd millennium BCE. Lipson et al. 2018 found similar admixture results for Munda tribals, stating ''"we obtained a good fit with three ancestry components: one western Eurasian, one deep eastern Eurasian (interpreted as an indigenous South Asian lineage), and one from the Austroasiatic clade"''. Lipson et al. 2018 further found that the Austroasiatic source clade (proportion 35%) in Munda tribals was inferred to be closest to the Mlabri. Singh et al. 2020 similarly found that Austroasiatic speakers in South Asia fall outside the South Asian cline due to their Southeast Asian genetic affinity.


See also

* Archaeogenetics
* Ethnic groups of South Asia
* List of ethnolinguistic regions of South Asia
* Peopling of India
* Y-DNA haplogroups in populations of South Asia
* Genetic studies on Gujarati people
* Genetic history of Europe
* Genetic history of the Middle East
* Genetic history of Southeast Asia

