The Austronesian languages are a language family widely spoken throughout Maritime Southeast Asia, parts of Mainland Southeast Asia, Madagascar, the islands of the Pacific Ocean and Taiwan (by Taiwanese indigenous peoples). They are spoken by about 328 million people (4.4% of the world population). This makes it the fifth-largest language family by number of speakers. Major Austronesian languages include Malay (around 250–270 million in Indonesia alone in its own literary standard named "Indonesian"), Javanese, Sundanese, Tagalog (standardized as Filipino), Malagasy and Cebuano. According to some estimates, the family contains 1,257 languages, which is the second most of any language family.
In 1706, the Dutch scholar Adriaan Reland first observed similarities between the languages spoken in the Malay Archipelago and by peoples on islands in the Pacific Ocean. In the 19th century, researchers (e.g. Wilhelm von Humboldt, Herman van der Tuuk) started to apply the comparative method to the Austronesian languages. The first extensive study on the history of the phonology was made by the German linguist Otto Dempwolff. It included a reconstruction of the
Proto-Austronesian lexicon. The term ''Austronesian'' was coined (as German ''austronesisch'') by Wilhelm Schmidt, deriving it from Latin ''auster'' "south" and Ancient Greek ''νῆσος'' (''nêsos'' "island"), meaning the "Southern Island languages".
Most Austronesian languages are spoken by the people of Insular Southeast Asia and Oceania. Only a few languages, such as Urak Lawoiʼ and the Chamic languages (except Acehnese), are indigenous to mainland Asia, while Malagasy is the only Austronesian language indigenous to Insular East Africa. Few Austronesian languages have more than a few thousand speakers, but a handful have speakers numbering in the millions; Indonesian, the most widely spoken, has around 252 million speakers, making it the tenth most-spoken language in the world. Approximately twenty Austronesian languages are official in their respective countries (see the list of major and official Austronesian languages).
By the number of languages they include, Austronesian and Niger–Congo are the two largest language families in the world. They each contain roughly one-fifth of the world's languages. The geographical span of Austronesian was the largest of any language family in the first half of the second millennium CE, before the spread of Indo-European languages in the colonial period. It ranged from Madagascar to Easter Island in the eastern Pacific.
According to Robert Blust (1999), Austronesian is divided into several primary branches, all but one of which are found exclusively in Taiwan. The Formosan languages of Taiwan are grouped into as many as nine first-order subgroups of Austronesian. All Austronesian languages spoken outside the Taiwan mainland (including its offshore Yami language) belong to the Malayo-Polynesian (sometimes called ''Extra-Formosan'') branch.
Most Austronesian languages lack a long history of written attestation. The oldest inscription in the
Cham language, the
Đông Yên Châu inscription dated to AD, is the first attestation of any Austronesian language.
Typological characteristics
Phonology
The Austronesian languages overall possess phoneme inventories which are smaller than the world average. Around 90% of the Austronesian languages have inventories of 19–25 sounds (15–20 consonants and 4–5 vowels), thus lying at the lower end of the global typical range of 20–37 sounds. However, extreme inventories are also found, such as Nemi (New Caledonia) with 43 consonants.
The canonical root type in Proto-Austronesian is disyllabic with the shape CV(C)CVC (C = consonant; V = vowel), and is still found in many Austronesian languages. In most languages, consonant clusters are only allowed in medial position, and often, there are restrictions for the first element of the cluster. There is a common drift to reduce the number of consonants which can appear in final position, e.g. in Buginese, which only allows the two consonants /ŋ/ and /ʔ/ as finals, out of a total number of 18 consonants. Complete absence of final consonants is observed e.g. in Nias, Malagasy and many Oceanic languages.
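The canonical shape CV(C)CVC can be checked mechanically. The following is a minimal sketch, not an established tool: it assumes that each letter of a reconstructed form stands for exactly one segment, and that the vowels are written ''a'', ''e'', ''i'', ''u'' (both assumptions of this example, not claims from the text above). The function names are hypothetical.

```python
import re

# Assumption of this sketch: vowels are written a, e, i, u and every
# other letter (including capitals such as C, N, S, R) is one consonant.
VOWELS = set("aeiu")

# CV(C)CVC: the medial consonant cluster is optional, the final consonant is not.
CANONICAL_SHAPE = re.compile(r"CVC?CVC")


def skeleton(form: str) -> str:
    """Reduce a form to its consonant/vowel skeleton, e.g. 'Sepat' -> 'CVCVC'."""
    return "".join(
        "V" if ch.lower() in VOWELS else "C"
        for ch in form
        if ch.isalpha()
    )


def is_canonical(form: str) -> bool:
    """True if the whole form has the shape CV(C)CVC."""
    return CANONICAL_SHAPE.fullmatch(skeleton(form)) is not None


if __name__ == "__main__":
    # Forms without the asterisk; 'tuktuk' is a made-up string used only
    # to exercise the optional medial cluster.
    for form in ["Sepat", "daNum", "kawaS", "tuktuk", "duSa", "telu"]:
        print(form, skeleton(form), is_canonical(form))
```

Forms such as ''duSa'' and ''telu'' come out as CVCV and therefore do not match: the check is strict about the final consonant, whereas the text only describes CV(C)CVC as the canonical (most typical) shape, not the only one.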
Tonal contrasts are rare in Austronesian languages, although Moken–Moklen and a few languages of the Chamic, South Halmahera–West New Guinea and New Caledonian subgroups do show lexical tone.
Morphology
Most Austronesian languages are agglutinative languages with a relatively high number of affixes, and clear morpheme boundaries. Most affixes are prefixes (Malay ''ber-jalan'' 'walk' < ''jalan'' 'road'), with a smaller number of suffixes (Tagalog ''titis-án'' 'ashtray' < ''títis'' 'ash') and infixes (Roviana ''t⟨in⟩avete'' 'work (noun)' < ''tavete'' 'work (verb)').
Reduplication is commonly employed in Austronesian languages. This includes full reduplication (Malay ''anak-anak'' 'children' < ''anak'' 'child'; Karo Batak ''nipe-nipe'' 'caterpillar' < ''nipe'' 'snake') or partial reduplication (Agta ''taktakki'' 'legs' < ''takki'' 'leg', ''at-atu'' 'puppy' < ''atu'' 'dog').
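The two reduplication patterns can be illustrated with a short sketch. The partial-reduplication rule used here (copy the initial (C)VC of the word) is only inferred from the two Agta examples above and is an assumption of this example, not a description of Agta grammar; the function names and the simplified vowel set ''a, e, i, o, u'' are likewise hypothetical.

```python
import re

# Assumption of this sketch: vowels are a, e, i, o, u; everything else is a consonant.
_INITIAL_CVC = re.compile(r"[^aeiou]?[aeiou][^aeiou]")


def full_reduplicate(word: str, sep: str = "-") -> str:
    """Repeat the whole word, e.g. 'anak' -> 'anak-anak'."""
    return f"{word}{sep}{word}"


def partial_reduplicate(word: str, sep: str = "") -> str:
    """Prefix a copy of the initial (C)VC sequence, e.g. 'takki' -> 'taktakki'."""
    match = _INITIAL_CVC.match(word)
    if match is None:
        raise ValueError(f"no initial (C)VC sequence in {word!r}")
    return f"{match.group(0)}{sep}{word}"


if __name__ == "__main__":
    print(full_reduplicate("anak"))              # Malay example from the text
    print(full_reduplicate("nipe"))              # Karo Batak example from the text
    print(partial_reduplicate("takki"))          # Agta example from the text
    print(partial_reduplicate("atu", sep="-"))   # written with a hyphen in the text
```

The separator is left as a parameter because the source writes ''taktakki'' without a hyphen but ''at-atu'' with one.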
Syntax
It is difficult to make generalizations about the languages that make up a family as diverse as Austronesian. Very broadly, one can divide the Austronesian languages into three groups: Philippine-type languages, Indonesian-type languages and post-Indonesian type languages:
* The first group includes, besides the languages of the Philippines, the Austronesian languages of Taiwan, Sabah, North Sulawesi and Madagascar. It is primarily characterized by the retention of the original system of Philippine-type voice alternations, where typically three or four verb voices determine which semantic role the "subject"/"topic" expresses (it may express the actor, the patient, the location, the beneficiary, or various other circumstantial roles such as instrument and concomitant). The phenomenon has frequently been referred to as ''focus'' (not to be confused with the usual sense of that term in linguistics). Furthermore, the choice of voice is influenced by the definiteness of the participants. The word order has a strong tendency to be verb-initial.
* In contrast, the more innovative Indonesian-type languages, which are particularly represented in Malaysia and western Indonesia, have reduced the voice system to a contrast between only two voices (actor voice and "undergoer" voice), but these are supplemented by applicative morphological devices (originally two: the more direct *''-i'' and more oblique *''-an/-ən''), which serve to modify the semantic role of the "undergoer". They are also characterized by the presence of preposed clitic pronouns. Unlike the Philippine type, these languages mostly tend towards verb-second word-orders. A number of languages, such as the Batak languages, Old Javanese, Balinese, Sasak and several Sulawesi languages seem to represent an intermediate stage between these two types.
* Finally, in some languages, which Ross calls "post-Indonesian", the original voice system has broken down completely and the voice-marking affixes no longer preserve their functions.
Lexicon
The Austronesian language family has been established by the linguistic comparative method on the basis of cognate sets: sets of words from multiple languages which are similar in sound and meaning and which can be shown to be descended from the same ancestral word in Proto-Austronesian according to regular rules. Some cognate sets are very stable. The word for ''eye'' in many Austronesian languages is ''mata'' (from the most northerly Austronesian languages, Formosan languages such as Bunun and Amis, all the way south to Māori).
Other words are harder to reconstruct. The word for ''two'' is also stable, in that it appears over the entire range of the Austronesian family, but the forms (e.g. Bunun ''dusa''; Amis ''tusa''; Māori ''rua'') require some linguistic expertise to recognise. The Austronesian Basic Vocabulary Database gives word lists (coded for cognateness) for approximately 1000 Austronesian languages.
Classification
The internal structure of the Austronesian languages is complex. The family consists of many similar and closely related languages with large numbers of dialect continua, making it difficult to recognize boundaries between branches. The first major step towards high-order subgrouping was Dempwolff's recognition of the Oceanic subgroup (called ''Melanesisch'' by Dempwolff). The special position of the languages of Taiwan was first recognized by André-Georges Haudricourt (1965), who divided the Austronesian languages into three subgroups: Northern Austronesian (= Formosan), Eastern Austronesian (= Oceanic), and Western Austronesian (all remaining languages).
In a study that represents the first lexicostatistical classification of the Austronesian languages, Isidore Dyen (1965) presented a radically different subgrouping scheme. He posited 40 first-order subgroups, with the highest degree of diversity found in the area of Melanesia
. The Oceanic languages are not recognized, but are distributed over more than 30 of his proposed first-order subgroups. Dyen's classification was widely criticized and for the most part rejected, but several of his lower-order subgroups are still accepted (e.g. the Cordilleran languages, the Bilic languages or the Murutic languages).
Subsequently, the position of the Formosan languages as the most archaic group of Austronesian languages was recognized by Otto Christian Dahl (1973), followed by proposals from other scholars that the Formosan languages actually make up more than one first-order subgroup of Austronesian. Robert Blust (1977) first presented the subgrouping model which is currently accepted by virtually all scholars in the field, with more than one first-order subgroup on Taiwan, and a single first-order branch encompassing all Austronesian languages spoken outside of Taiwan, viz. Malayo-Polynesian. The relationships of the Formosan languages to each other and the internal structure of Malayo-Polynesian continue to be debated.
Primary branches on Taiwan (Formosan languages)
In addition to Malayo-Polynesian, thirteen Formosan subgroups are broadly accepted. The seminal article in the classification of Formosan (and, by extension, the top-level structure of Austronesian) is Blust (1999). Prominent Formosanists (linguists who specialize in Formosan languages) take issue with some of its details, but it remains the point of reference for current linguistic analyses. Debate centers primarily around the relationships between these families. Of the classifications presented here, Blust (1999) links two families into a Western Plains group, two more in a Northwestern Formosan group, and three into an Eastern Formosan group, while Li (2008) also links five families into a Northern Formosan group. Harvey (1982), Chang (2006) and Ross (2012) split Tsouic, and Blust (2013) agrees the group is probably not valid.
Other studies have presented phonological evidence for a reduced Paiwanic family of Paiwanic, Puyuma, Bunun, Amis, and Malayo-Polynesian, but this is not reflected in vocabulary. The Eastern Formosan peoples Basay, Kavalan, and Amis share a homeland motif that has them coming originally from an island called ''Sinasay'' or ''Sanasay''. The Amis, in particular, maintain that they came from the east, and were treated by the Puyuma, amongst whom they settled, as a subservient group.
Blust (1999)
*Formosan
** Tsouic
*** Tsou language
*** Saaroa language
*** Kanakanavu language
** Western Plains
*** Thao language, also known as Sao: Brawbaw and Shtafari dialects
*** Central Western Plains
**** Babuza language; old Favorlang language: Taokas and Poavosa dialects
**** Papora-Hoanya language: Papora, Hoanya dialects
** Northwestern Formosan
*** Saisiyat language: Taai and Tungho dialects
*** Pazeh language and Kulun
** Atayalic
*** Atayal language
*** Seediq language (Truku/Taroko)
** Eastern Formosan
*** Northern (Kavalanic languages)
**** Basay language: Trobiawa and Linaw–Qauqaut dialects
**** Kavalan language
**** Ketagalan language, or Ketangalan
*** Central (Ami)
**** Amis proper
**** Sakizaya
*** Siraya language
** Bunun
** Rukai
*** Mantauran, Tona, and Maga dialects of Rukai are divergent
** Puyuma
** Paiwan
** Malayo-Polynesian (outside Formosa)
Li (2008)
This classification retains Blust's East Formosan, and unites the other northern languages. Li (2008) proposes a Proto-Formosan (F0) ancestor and equates it with Proto-Austronesian (PAN), following the model in Starosta (1995). Rukai and Tsouic are seen as highly divergent, although the position of Rukai is highly controversial.
*Formosan
** F0: Proto-Formosan = Proto-Austronesian
*** Rukai
**** Mantauran
**** Maga–Tona, Budai–Labuan–Taromak
** F1: ''(unnamed branch)''
*** Tsouic
**** Tsou
**** Southern Tsouic
***** Saaroa
***** Kanakanavu
** F2: ''(unnamed branch)''
*** Northern
**** Northwestern (Plains)
***** Saisiyat–Kulon, Pazeh
***** Western
****** Thao
****** West Coast (Papora–Hoanya–Babuza–Taokas)
**** Atayalic
***** Squliq Atayal
***** Ts'ole' Atayal (= C'uli')
***** Seediq
*** East Formosan
**** Kavalan–Basay
**** Siraya–Amis–Nataoran
**** Sakizaya
*** ? Southern (uncertain)
**** Bunun
***** Isbukun
***** Northern and Central (Takitudu and Takbanuaz)
****
Sagart (2004, 2021)
Sagart (2004) proposes that the numerals of the Formosan languages reflect a nested series of innovations, from languages in the northwest (near the putative landfall of the Austronesian migration from the mainland), which share only the numerals 1–4 with proto-Malayo-Polynesian, counter-clockwise to the eastern languages, which share all numerals 1–10. Sagart (2021) finds other shared innovations that follow the same pattern. He proposes that pMP *lima 'five' is a lexical replacement (from 'hand'), and that pMP *pitu 'seven', *walu 'eight' and *Siwa 'nine' are contractions of pAN *RaCep 'five', a ligature *a or *i 'and', and *duSa 'two', *telu 'three', *Sepat 'four', an analogical pattern historically attested from Pazeh. The fact that the Kra-Dai languages share the numeral system (and other lexical innovations) of pMP suggests that they are a coordinate branch with Malayo-Polynesian, rather than a sister family to Austronesian.
Sagart's resulting classification is:
*Austronesian (pAN ca. 5200 BP)
**
**
**
** Pituish
(pAN *RaCepituSa 'five-and-two' truncated to *pitu 'seven'; *sa-ŋ-aCu 'nine', lit. 'one taken away')
***
*** Limaish
(pAN *RaCep 'five' replaced by *lima 'hand'; *Ca~ reduplication to form the series of numerals for counting humans)
****
**** Enemish
(additive 'five-and-one' or 'twice-three' replaced by reduplicated *Nem-Nem > *emnem (*Nem 'three' is reflected in Basay, Siraya and Makatao); pAN *kawaS 'year, sky' replaced by *CawiN)
*****
***** Walu-Siwaish
(*walu 'eight' and *Siwa 'nine' from *RaCepat(e)lu 'five-and-three' and *RaCepiSepat 'five-and-four')
******
******
******* Bunun
******* Rukai–Tsouic
(CV~ reduplication in human-counting series replaced with competing pAN noun-marker *u- (unknown whether Bunun once had the same); eleven lexical innovations such as *cáni 'one', *kəku 'leg')
****** East WS (pEWS ca. 4500 BP)
(innovations *baCaq-an 'ten'; *nanum 'water' alongside pAN *daNum)
*******
*******
******** Northern: Ami–Puyuma
(*sasay 'one'; *mukeCep 'ten' for the human and non-human series; *ukak 'bone', *kuCem 'cloud')
******** Paiwan
******** Southern Austronesian (pSAN ca. 4000 BP)
(linker *atu 'and' > *at after *sa-puluq in numerals 11–19; lexical innovations such as *baqbaq 'mouth', *qa-sáuŋ 'canine tooth', *qi(d)zúR 'saliva', *píntu 'door', *-ŋel 'deaf')
********* Kra-Dai
********* Malayo-Polynesian
Malayo-Polynesian
The Malayo-Polynesian languages are—among other things—characterized by certain sound changes, such as the mergers of Proto-Austronesian (PAN) *t/*C to Proto-Malayo-Polynesian (PMP) *t, and PAN *n/*N to PMP *n, and the shift of PAN *S to PMP *h.
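The three changes named above can be applied to a reconstructed form as simple character substitutions. This is only a sketch under the assumption that each of the letters ''C'', ''N'' and ''S'' stands for one Proto-Austronesian segment; the function name is hypothetical, and the output is merely the result of these three substitutions, not a full Proto-Malayo-Polynesian reconstruction (other changes may also apply to a given word).

```python
# Substitutions described in the text:
#   PAN *C merges with *t  -> PMP *t
#   PAN *N merges with *n  -> PMP *n
#   PAN *S shifts          -> PMP *h
PAN_TO_PMP = str.maketrans({"C": "t", "N": "n", "S": "h"})


def apply_pmp_changes(pan_form: str) -> str:
    """Apply only the three PAN > PMP changes listed above to one form."""
    return pan_form.translate(PAN_TO_PMP)


if __name__ == "__main__":
    # PAN forms (without the asterisk) as cited elsewhere in this article.
    for pan in ["daNum", "duSa", "telu"]:
        print(f"*{pan} > *{apply_pmp_changes(pan)}")
```

Because *t and *n are left untouched while *C and *N are rewritten to them, the mapping is many-to-one, which is what "merger" means here: the earlier contrast cannot be recovered from the Malayo-Polynesian side alone.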
There appear to have been two great migrations of Austronesian languages that quickly covered large areas, resulting in multiple local groups with little large-scale structure. The first was Malayo-Polynesian, distributed across the Malay Archipelago and Melanesia. The second migration was that of the Oceanic languages into Polynesia and Micronesia.
Major languages
History
From the standpoint of historical linguistics, the place of origin (in linguistic terminology, ''Urheimat'') of the Austronesian languages (Proto-Austronesian language) is most likely the main island of Taiwan, also known as Formosa; on this island the deepest divisions in Austronesian are found along small geographic distances, among the families of the native Formosan languages.
According to Robert Blust, the Formosan languages form nine of the ten primary branches of the Austronesian language family. noted this when he wrote: ... the internal diversity among the... Formosan languages... is greater than that in all the rest of Austronesian put together, so there is a major genetic split within Austronesian between Formosan and the rest... Indeed, the genetic diversity within Formosan is so great that it may well consist of several primary branches of the overall Austronesian family.
At least since , writing in 1949, linguists have generally accepted that the chronology of the dispersal of languages within a given language family can be traced from the area of greatest linguistic variety to that of the least. For example, English in North America has large numbers of speakers, but relatively low dialectal diversity, while English in Great Britain has much higher diversity; such low linguistic variety by Sapir's thesis suggests a more recent spread of English in North America. While some scholars suspect that the number of principal branches among the Formosan languages may be somewhat less than Blust's estimate of nine (e.g. ), there is little contention among linguists with this analysis and the resulting view of the origin and direction of the migration. For a recent dissenting analysis, see .
The protohistory of the Austronesian people can be traced farther back through time. To get an idea of the original homeland of the populations ancestral to the Austronesian peoples (as opposed to strictly linguistic arguments), evidence from archaeology and population genetics
may be adduced. Studies from the science of genetics have produced conflicting outcomes. Some researchers find evidence for a proto-Austronesian homeland on the Asian mainland (e.g., ), while others mirror the linguistic research, rejecting an East Asian origin in favor of Taiwan (e.g., ). Archaeological evidence (e.g., ) is more consistent, suggesting that the ancestors of the Austronesians spread from the South Chinese mainland to Taiwan at some time around 8,000 years ago.
Evidence from historical linguistics suggests that it is from this island that seafaring peoples migrated, perhaps in distinct waves separated by millennia, to the entire region encompassed by the Austronesian languages. It is believed that this migration began around 6,000 years ago. However, evidence from historical linguistics cannot bridge the gap between those two periods. The view that linguistic evidence connects Austronesian languages to the Sino-Tibetan ones, as proposed for example by , is a minority one. As states: Implied in... discussions of subgrouping [of Austronesian languages] is a broad consensus that the homeland of the Austronesians was in Taiwan. This homeland area may have also included the P'eng-hu (Pescadores) islands between Taiwan and China and possibly even sites on the coast of mainland China, especially if one were to view the early Austronesians as a population of related dialect communities living in scattered coastal settlements.
Linguistic analysis of the Proto-Austronesian language stops at the western shores of Taiwan; any related mainland language(s) have not survived. The only exceptions, the Chamic languages, derive from more recent migration to the mainland. However, according to Ostapirat's interpretation of the seriously discussed Austro-Tai hypothesis, the Kra–Dai languages (also known as Tai–Kadai) are exactly those related mainland languages.
Hypothesized relations
Genealogical links have been proposed between Austronesian and various families of East and Southeast Asia.
Austro-Tai
An Austro-Tai hypothesis linking Austronesian and the Kra-Dai languages of the southeastern continental Asian mainland was first proposed by Paul K. Benedict, and is supported by Weera Ostapirat, Roger Blench, and Laurent Sagart, based on the traditional comparative method
. proposes a series of regular correspondences linking the two families and assumes a primary split, with Kra-Dai speakers being the people who stayed behind in their Chinese homeland. suggests that, if the connection is valid, the relationship is unlikely to be one of two sister families. Rather, he suggests that proto-Kra-Dai speakers were Austronesians who migrated to Hainan
Island and back to the mainland from the northern Philippines, and that their distinctiveness results from radical restructuring following contact with Hmong–Mien and Sinitic. An extended version of Austro-Tai was hypothesized by Benedict who added the Japonic languages
to the proposal as well.
Austric
A link with the Austroasiatic languages
in an 'Austric' phylum
is based mostly on typological evidence. However, there is also morphological evidence of a connection between the conservative Nicobarese languages and the Austronesian languages of the Philippines. Robert Blust supports the hypothesis that connects the lower Yangtze Neolithic Austro-Tai entity with the rice-cultivating Austroasiatic cultures, assuming the center of East Asian rice domestication, and the putative Austric homeland, to be located in the Yunnan/Burma border area. Under that view, a rice-based population expansion produced an east–west genetic alignment in the southern part of East Asia (Austroasiatic, Kra-Dai, Austronesian), with unrelated Sino-Tibetan occupying a more northerly tier.
Sino-Austronesian
French linguist and Sinologist Laurent Sagart considers the Austronesian languages to be related to the Sino-Tibetan languages
, and also groups the Kra–Dai languages
as more closely related to the Malayo-Polynesian languages
. Sagart argues for a north–south genetic relationship between Chinese and Austronesian, based on sound correspondences in the basic vocabulary and on morphological parallels. Sagart (2017) concludes that the possession of both kinds of millet in the Taiwanese Austronesian languages (not just Setaria, as previously thought) places the pre-Austronesians in northeastern China, adjacent to the probable Sino-Tibetan homeland. Genetic research by Ko et al. (2014) appears to support Sagart's linguistic proposal: it points out that the exclusively Austronesian mtDNA E-haplogroup and the largely Sino-Tibetan M9a haplogroup are twin sisters, indicating an intimate connection between the early Austronesian and Sino-Tibetan maternal gene pools, at least. The results of Wei et al. (2017) also agree with Sagart's proposal: their analyses show that the predominantly Austronesian Y-DNA haplogroup O3a2b*-P164(xM134) belongs to a newly defined haplogroup, O3a2b2-N6, that is widely distributed along the eastern coastal regions of Asia, from Korea to Vietnam. Sagart also groups the Austronesian languages in a recursive-like fashion, placing Kra-Dai as a sister branch of Malayo-Polynesian. His methodology has been found to be spurious by his peers.
Japanese
Several linguists have proposed that Japanese is genetically related to the Austronesian family, cf. Benedict (1990), Matsumoto (1975), Miller (1967).
Some other linguists think it is more plausible that Japanese is not genetically related to the Austronesian languages, but instead was influenced by an Austronesian substratum
or adstratum.
Those who propose this scenario suggest that the Austronesian family once covered the islands to the north as well as to the south. Martine Robbeets (2017) claims that Japanese genetically belongs to the "Transeurasian" (= Macro-Altaic) languages, but underwent lexical influence from "para-Austronesian", a presumed sister language of Proto-Austronesian.
The linguist Ann Kumar (2009) proposed that some Austronesians might have migrated to Japan, possibly an elite group from Java
, and created the Japanese hierarchical society. She also identifies 82 possible cognates between Austronesian and Japanese; however, her theory remains very controversial. The linguist Asha Pereltsvaig criticized Kumar's theory on several points. The archaeological problem is that, contrary to the claim that there was no rice farming in China and Korea in prehistoric times, excavations indicate that rice farming has been practiced in this area since at least 5000 BC. There are also genetic problems: the pre-Yayoi Japanese lineage was not shared with Southeast Asians, but was shared with Northwest Chinese, Tibetans and Central Asians. Linguistic problems were also pointed out. Kumar did not claim that Japanese was an Austronesian language derived from Proto-Javanese, but only that it provided a superstratum language for Old Japanese
, based on 82 plausible Javanese-Japanese cognates, mostly related to rice farming.
East Asian
In 2001, Stanley Starosta proposed a new language family named East Asian
, which includes all primary language families in the broader East Asia
region except Japonic and Koreanic. This proposed family consists of two branches, Austronesian and Sino-Tibetan-Yangzian, with the Kra-Dai family considered a branch of Austronesian, and "Yangzian" a new sister branch of Sino-Tibetan consisting of the Austroasiatic and Hmong-Mien languages. Michael D. Larish developed the proposal further in 2006 and also included the Japonic and Koreanic languages in the macrofamily. The proposal has since been adopted by linguists such as George van Driem, albeit without the inclusion of Japonic and Koreanic (van Driem, George. 2018. ''Journal of the Asiatic Society'', LX (4): 1–38).
Ongan
Blevins proposed that the Austronesian and Ongan protolanguages are descendants of a common Austronesian–Ongan protolanguage. This view is not supported by mainstream linguists and remains very controversial. Robert Blust rejects Blevins' proposal as far-fetched, based solely on chance resemblances and methodologically flawed comparisons.
Writing systems
Most Austronesian languages have Latin
-based writing systems today. Some non-Latin-based writing systems are listed below.
* Brahmi script
** Kawi script
*** Balinese alphabet – used to write Balinese, Kawi, Malay, Sasak, and Sanskrit
.
*** Batak alphabet – used to write several Batak languages.
*** Baybayin
– used to write Tagalog and several Philippine languages
.
*** Bima alphabet – once used to write the Bima language.
*** Buhid alphabet – used to write the Buhid language.
*** Hanunó'o alphabet – used to write the Hanunó'o language
.
*** Javanese script – used to write the Javanese language
and several neighbouring languages like Madurese.
*** Kerinci alphabet (''Kaganga'') – used to write the Kerinci language.
*** Kulitan alphabet – used to write the Kapampangan language.
*** Lampung alphabet – used to write Lampung and Komering.
*** Lontara alphabet – used to write Buginese, Makassarese and several other languages of Sulawesi
.
*** Sundanese script – standardized script based on Old Sundanese script, used to write the Sundanese language.
*** Rejang alphabet – used to write the Rejang language.
*** Rencong alphabet – once used to write the Malay language
in Sumatra.
*** Tagbanwa alphabet – once used to write various Palawan languages.
*** Lota alphabet – used to write the Ende-Li'o language.
** Cham alphabet – used to write the Cham language.
** Thai script
– used to write the Pattani Malay language
.
* Arabic script
** Pegon alphabet – used to write Javanese, Sundanese and Madurese as well as several smaller neighbouring languages.
** Jawi alphabet
– used to write Malay, Acehnese, Banjar, Minangkabau, Maguindanao, Tausug, Western Cham and others.
** Sorabe alphabet – once used to write several dialects of the Malagasy language
.
* Hangul
– used to write the Cia-Cia language, but the project is no longer active.
* Dunging – used to write the Iban language.
* Avoiuli – used to write the Raga language.
* Eskayan – used to write the Eskayan language, a secret language based on Boholano.
* Woleai script (Caroline Island script) – used to write the Carolinian language (Refaluwasch).
* Rongorongo – possibly used to write the Rapa Nui language.
* Gagarit Abada – used to write Dusunic languages, but it was not widely used.
* Gangga Melayu – used to write Perak Malay.
* Braille
– used in Filipino, Malay, Indonesian, Tolai, Motu, Māori, Samoan, Malagasy, and many other Austronesian languages.
Comparison charts
Below are two charts comparing the numbers 1–10 and thirteen words in Austronesian languages spoken in Taiwan
, the Philippines
, the Mariana Islands
, Indonesia
, Malaysia
, Chams or Champa
(in Thailand
, Cambodia
, and Vietnam
), East Timor
, Papua, New Zealand
, Hawaii
, Madagascar
, Borneo
, Kiribati
, Caroline Islands
, and Tuvalu
.
See also
* Languages of Indonesia
* Languages of Taiwan
* Austronesian Formal Linguistics Association
* List of Austronesian languages
* List of Austronesian regions
* Taiwanese Indigenous pop music
Further reading
* Bengtson, John D. "The 'Greater Austric' Hypothesis". Association for the Study of Language in Prehistory.
* Blust, R. A. (1983). ''Lexical reconstruction and semantic reconstruction: the case of the Austronesian "house" words''. Hawaii: R. Blust.
* Cohen, E. M. K. (1999). ''Fundaments of Austronesian roots and etymology''. Canberra: Pacific Linguistics.
* Marion, P., ''Liste Swadesh élargie de onze langues austronésiennes,'' éd. Carré de sucre, 2009
* Pawley, A., & Ross, M. (1994). ''Austronesian terminologies: continuity and change''. Canberra, Australia: Dept. of Linguistics, Research School of Pacific and Asian Studies, The Australian National University.
* Sagart, Laurent, Roger Blench, and Alicia Sanchez-Nazas (Eds.) (2004). ''The peopling of East Asia: Putting Together Archaeology, Linguistics and Genetics''. London: RoutledgeCurzon.
* Tryon, D. T., & Tsuchida, S. (1995). ''Comparative Austronesian dictionary: an introduction to Austronesian studies''. Trends in linguistics, 10. Berlin: Mouton de Gruyter.
* Wittmann, Henri (1972). "Le caractère génétiquement composite des changements phonétiques du malgache." ''Proceedings of the International Congress of Phonetic Sciences'' 7.807–810. La Haye: Mouton.
* Wolff, John U., "Comparative Austronesian Dictionary. An Introduction to Austronesian Studies", ''Language'', vol. 73, no. 1, pp. 145–156, Mar 1997.
External links
* Blust's Austronesian Comparative Dictionary
* Swadesh lists of Austronesian basic vocabulary words (from Wiktionary's Swadesh-list appendix)
* Summer Institute of Linguistics site showing languages (Austronesian and Papuan) of Papua New Guinea.
* Spreadsheet of 1600+ Austronesian and Papuan number names and systems – ongoing study to determine their relationships and distribution
* http://www.pro-classic.com/ethnicgv/maps/map_index.htm Distribution map of the Austronesian language family (南島語族分布圖)