Lacito Archive
   HOME

TheInfoList



OR:

The Pangloss Collection is a
digital library A digital library (also called an online library, an internet library, a digital repository, a library without walls, or a digital collection) is an online database of digital resources that can include text, still images, audio, video, digital ...
whose objective is to store and facilitate access to
audio recording Sound recording and reproduction is the electrical, mechanical, electronic, or digital inscription and re-creation of sound waves, such as spoken voice, singing, instrumental music, or sound effects. The two main classes of sound recording t ...
s in
endangered languages An endangered language or moribund language is a language that is at risk of disappearing as its speakers die out or shift to speaking other languages. Language loss occurs when the language has no more native speakers and becomes a " dead langua ...
of the world. Developed by the
LACITO LACITO (''Langues et Civilisations à Tradition Orale'') is a multidisciplinary research organisation, principally devoted to the study of cultures and languages of oral tradition. LACITO is a branch of the Centre National de la Recherche Scient ...
centre of
CNRS The French National Centre for Scientific Research (, , CNRS) is the French state research organisation and is the largest fundamental science agency in Europe. In 2016, it employed 31,637 staff, including 11,137 tenured researchers, 13,415 eng ...
in
Paris Paris () is the Capital city, capital and List of communes in France with over 20,000 inhabitants, largest city of France. With an estimated population of 2,048,472 residents in January 2025 in an area of more than , Paris is the List of ci ...
, the collection provides free online access to documents of
connected Connected may refer to: Film and television * ''Connected'' (2008 film), a Hong Kong remake of the American movie ''Cellular'' * '' Connected: An Autoblogography About Love, Death & Technology'', a 2011 documentary film * ''Connected'' (2015 TV ...
, spontaneous speech, in otherwise little-
documented A document is a written, drawn, presented, or memorialized representation of thought, often the manifestation of non-fictional, as well as fictional, content. The word originates from the Latin ', which denotes a "teaching" or "lesson": t ...
languages of all continents.Michailovsky, Boyd, Martine Mazaudon, Alexis Michaud, Séverine Guillaume, Alexandre François & Evangelia Adamou. 2014
Documenting and researching endangered languages: the Pangloss Collection
''
Language Documentation & Conservation ''Language Documentation & Conservation'' is a peer-reviewed open-access academic journal covering all topics related to language documentation and conservation, including the goals of data management, field-work methods, ethics, orthography desi ...
'' 8, pp. 119-135.


Principles


A sound archive with synchronized transcriptions

For the science of
linguistics Linguistics is the scientific study of language. The areas of linguistic analysis are syntax (rules governing the structure of sentences), semantics (meaning), Morphology (linguistics), morphology (structure of words), phonetics (speech sounds ...
, language is first and foremost spoken language. The medium of spoken language is sound. The Pangloss Collection gives access to original recordings simultaneously with transcriptions and translations, as a resource for further research. After being recorded in its cultural context, texts have been transcribed in collaboration with
native speakers A first language (L1), native language, native tongue, or mother tongue is the first language a person has been exposed to from birth or within the critical period. In some countries, the term ''native language'' or ''mother tongue'' refers ...
.


A structured, open architecture

The archived data is based on robust standards, as
open architecture Open architecture is a type of computer architecture or software architecture intended to make adding, upgrading, and swapping components with other computers easy. For example, the IBM PC, Amiga 2000 and Apple IIe have an open architecture supp ...
, in an
open format An open file format is a file format for storing digital data, defined by an openly published specification usually maintained by a standards organization, and which can be used and implemented by anyone. An open file format is licensed with a ...
, and may be downloaded under a
Creative Commons license A Creative Commons (CC) license is one of several public copyright licenses that enable the free distribution of an otherwise copyrighted "work". A CC license is used when an author wants to give other people the right to share, use, and bu ...
. The software used to prepare and disseminate it is
open-source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use and view the source code, design documents, or content of the product. The open source model is a decentrali ...
. The Pangloss Collection is a member of the
OLAC OLAC, the Open Language Archives Community, is an initiative to create a unified means of searching online databases of language resources for linguistic research. The information about resources is stored in XML format for easy searching. OLAC wa ...
network of archival repositories and of the Digital Endangered Languages and Music Archive Network (DELAMAN).


History

The collection was initially called the ''LACITO Archive''.Screen capture of LACITO's archive homepage
— 27 February 2001.
The project originated in 1996 from the collaboration of Boyd Michailovsky, linguist at LACITO, with John B. Lowe, engineer; they were later joined by Michel Jacobson, engineer, who developed some tools for the project, and brought it online. The purpose of the archive was “''to conserve, and to make available for research, recorded and transcribed oral traditions and other linguistic materials in (mainly) unwritten languages, giving simultaneous access to sound recordings and text annotation''.” The earliest archived corpora in the collection were languages from
Nepal Nepal, officially the Federal Democratic Republic of Nepal, is a landlocked country in South Asia. It is mainly situated in the Himalayas, but also includes parts of the Indo-Gangetic Plain. It borders the Tibet Autonomous Region of China Ch ...
, from
New Caledonia New Caledonia ( ; ) is a group of islands in the southwest Pacific Ocean, southwest of Vanuatu and east of Australia. Located from Metropolitan France, it forms a Overseas France#Sui generis collectivity, ''sui generis'' collectivity of t ...
, from
eastern Africa East Africa, also known as Eastern Africa or the East of Africa, is a region at the eastern edge of the Africa, African continent, distinguished by its unique geographical, historical, and cultural landscape. Defined in varying scopes, the regi ...
and
French Guiana French Guiana, or Guyane in French, is an Overseas departments and regions of France, overseas department and region of France located on the northern coast of South America in the Guianas and the West Indies. Bordered by Suriname to the west ...
.Screen capture of LACITO's archive contents
— 22 April 2002.
The archive has grown steadily since the early 2000s,“About us” section
of the Pangloss Collection (retrieved 24 April 2021)
incorporating corpora from various linguists, whether members of LACITO or not. In 2009, the archive had 200 recordings in 45 languages.
— 26 November 2009.
In 2014, the (newly renamed) ''Pangloss Collection'' had recordings in 70 languages. As of April 2021, the Pangloss archive contains recordings in 196 languages, totalling 780 hours of audio and video recordings. Languages in the Pangloss Collection include
Mwotlap Mwotlap (pronounced ; formerly known as ''Motlav'') is an Oceanic language spoken by about 2,100 people in Vanuatu. The majority of speakers are found on the island of Motalava in the Banks Islands, with smaller communities in the islands of ...
(
Austronesian Austronesian may refer to: *The Austronesian languages *The historical Austronesian peoples The Austronesian people, sometimes referred to as Austronesian-speaking peoples, are a large group of peoples who have settled in Taiwan, maritime Sout ...
;
Vanuatu Vanuatu ( or ; ), officially the Republic of Vanuatu (; ), is an island country in Melanesia located in the South Pacific Ocean. The archipelago, which is of volcanic origin, is east of northern Australia, northeast of New Caledonia, east o ...
), Japhug (
Sino-Tibetan Sino-Tibetan (also referred to as Trans-Himalayan) is a family of more than 400 languages, second only to Indo-European in number of native speakers. Around 1.4 billion people speak a Sino-Tibetan language. The vast majority of these are the 1.3 ...
;
Southwest China Southwestern China () is a region in the People's Republic of China. It consists of five provincial administrative regions, namely Chongqing, Sichuan, Guizhou, Yunnan, and Xizang. Geography Southwestern China is a rugged and mountainous region, ...
), Ersu (
Sino-Tibetan Sino-Tibetan (also referred to as Trans-Himalayan) is a family of more than 400 languages, second only to Indo-European in number of native speakers. Around 1.4 billion people speak a Sino-Tibetan language. The vast majority of these are the 1.3 ...
;
Southwest China Southwestern China () is a region in the People's Republic of China. It consists of five provincial administrative regions, namely Chongqing, Sichuan, Guizhou, Yunnan, and Xizang. Geography Southwestern China is a rugged and mountainous region, ...
), Naxi (or ''Yongning Na'':
Sino-Tibetan Sino-Tibetan (also referred to as Trans-Himalayan) is a family of more than 400 languages, second only to Indo-European in number of native speakers. Around 1.4 billion people speak a Sino-Tibetan language. The vast majority of these are the 1.3 ...
;
Southwest China Southwestern China () is a region in the People's Republic of China. It consists of five provincial administrative regions, namely Chongqing, Sichuan, Guizhou, Yunnan, and Xizang. Geography Southwestern China is a rugged and mountainous region, ...
), and Cèmuhî (
Austronesian Austronesian may refer to: *The Austronesian languages *The historical Austronesian peoples The Austronesian people, sometimes referred to as Austronesian-speaking peoples, are a large group of peoples who have settled in Taiwan, maritime Sout ...
;
New Caledonia New Caledonia ( ; ) is a group of islands in the southwest Pacific Ocean, southwest of Vanuatu and east of Australia. Located from Metropolitan France, it forms a Overseas France#Sui generis collectivity, ''sui generis'' collectivity of t ...
).Cèmuhî corpus230 resources


References


External links


Homepage of the Pangloss Collection
* Sample text from the collection
“The Ogre Kanayongba”
a story in the
Limbu language Limbu (Limbu: , ''yakthuṅ pan'') is a Sino-Tibetan language spoken by the Limbu people of Nepal and Northeastern India (particularly West Bengal, Sikkim, Assam and Nagaland) as well as expatriate communities in Bhutan. The Limbu refer to the ...
of
Nepal Nepal, officially the Federal Democratic Republic of Nepal, is a landlocked country in South Asia. It is mainly situated in the Himalayas, but also includes parts of the Indo-Gangetic Plain. It borders the Tibet Autonomous Region of China Ch ...
, presented in bilingual format.
Access to the Pangloss Collection through its language mapAccess to the Pangloss Collection through the CoCoON search interface


{{Webarchive, url=https://web.archive.org/web/20210424163930/http://dla.library.upenn.edu/dla/olac/browse.html?q=Pangloss&browse=subject_language_facet&fq=archive_facet:%22COllections%20de%20COrpus%20Oraux%20Numeriques%20(CoCoON%20ex-CRDO)%22&browse.sort=true , date=2021-04-24 Endangered languages projects Sound archives Creative Commons-licensed websites French National Centre for Scientific Research