Pangloss Collection

The Pangloss Collection is a digital library whose objective is to store and facilitate access to audio recordings in endangered languages of the world. Developed by the LACITO centre of CNRS in Paris, the collection provides free online access to documents of connected, spontaneous speech, in otherwise little-documented languages of all continents. (Michailovsky, Boyd, Martine Mazaudon, Alexis Michaud, Séverine Guillaume, Alexandre François & Evangelia Adamou. 2014. Documenting and researching endangered languages: the Pangloss Collection. ''Language Documentation & Conservation'' 8, pp. 119-135.)


Principles


A sound archive with synchronized transcriptions

For the science of linguistics, language is first and foremost spoken language, and the medium of spoken language is sound. The Pangloss Collection therefore gives access to the original recordings together with their transcriptions and translations, as a resource for further research. The texts are recorded in their cultural context and then transcribed in collaboration with native speakers.


A structured, open architecture

The archived data is structured in accordance with the latest data-processing standards: it follows an open architecture, is stored in an open format, and may be downloaded under a Creative Commons license. The software used to prepare and disseminate it is open-source. The Pangloss Collection is a member of the OLAC (Open Language Archives Community) network of archival repositories and of the Digital Endangered Languages and Music Archive Network (DELAMAN).


History

The collection was initially called the ''LACITO Archive''. (Screen capture of LACITO's archive homepage, 27 February 2001.)
The project originated in 1996 from the collaboration of Boyd Michailovsky, a linguist at LACITO, with John B. Lowe, an engineer; they were later joined by Michel Jacobson, also an engineer, who developed some tools for the project and brought it online. The purpose of the archive was “''to conserve, and to make available for research, recorded and transcribed oral traditions and other linguistic materials in (mainly) unwritten languages, giving simultaneous access to sound recordings and text annotation''.” The earliest archived corpora in the collection were of languages from Nepal, New Caledonia, eastern Africa and French Guiana. (Screen capture of LACITO's archive contents, 22 April 2002.)
The archive has grown steadily since the early 2000s, incorporating corpora from various linguists, whether members of LACITO or not. (“About us” section of the Pangloss Collection, retrieved 24 April 2021.) In 2009, the archive had 200 recordings in 45 languages (source dated 26 November 2009).
In 2014, the (newly renamed) ''Pangloss Collection'' had recordings in 70 languages. As of April 2021, the Pangloss archive contained recordings in 196 languages, totalling 780 hours of audio and video. The main languages represented in the Pangloss Collection are Mwotlap (Austronesian; Vanuatu), Japhug (Sino-Tibetan; Southwest China), Ersu (Sino-Tibetan; Southwest China), Naxi (or ''Yongning Na'': Sino-Tibetan; Southwest China), and Cèmuhî (Austronesian; New Caledonia). The main contributors to Pangloss (in number of resources archived) are linguists Alexandre François, Katia Chirkova, Guillaume Jacques, and Michel Ferlus. (Ferlus contributed 530 resources.)



External links


* Homepage of the Pangloss Collection
* Sample text from the collection: “The Ogre Kanayongba”, a story in the Limbu language of Nepal, presented in bilingual format.
* Access to the Pangloss Collection through its language map
* Access to the Pangloss Collection through the CoCoON search interface (archived 2021-04-24 at https://web.archive.org/web/20210424163930/http://dla.library.upenn.edu/dla/olac/browse.html?q=Pangloss&browse=subject_language_facet&fq=archive_facet:%22COllections%20de%20COrpus%20Oraux%20Numeriques%20(CoCoON%20ex-CRDO)%22&browse.sort=true)