The International Computer Archive of Modern and Medieval English (ICAME) is an international group of linguists and data scientists working in corpus linguistics to digitise English texts. The organisation was founded in Oslo, Norway in 1977 as the International Computer Archive of Modern English, before being renamed to its current title.
The portal to their materials is hosted at the University of Bergen, where they have set out the aim of the organisation to "collect and distribute information on English language material available for computer processing and on linguistic research to compile an archive of English text corpora in machine-readable form, and to make material available to research institutions." For modern scholars, including both "descriptive and more theoretically-minded linguists", creating computer corpora (collections of texts in machine-readable form) is the most accessible way to study both transcribed spoken language and various genres of written texts.
The ICAME group hosts academic conferences that focus on corpus linguistic studies of historical changes and contemporary grammatical descriptions of English, and makes corpora of different varieties of English available to scholars, starting with editions of the 1960s Brown Corpus. Their first academic conference was held in Bergen, Norway in 1979, and scholars who were interested in corpus linguistics continued to meet each spring in different European and English-speaking countries. These meetings, and the compilation and distribution of corpora they enabled, played a key role in the creation of the field of corpus linguistics in the 20th century, a precursor to current big data analytics. In summarizing the field, Kennedy's ''Introduction to Corpus Linguistics'' notes that "for corpus linguists with an interest in the description of English, the International Computer Archive of Modern and Medieval English has been the major resource". The influence of ICAME on the field has also been laid out in Facchinetti's history, ''Corpus Linguistics Twenty-five Years On''.
One influential resource that ICAME made available was a CD of 20 different corpora, including those covering different regional Englishes (such as the Australian Corpus of English, the Wellington Corpus of Spoken New Zealand English, the Kolhapur Corpus of Indian English, the Bergen Corpus of London Teenage Language (COLT), the Helsinki Corpus of Older Scots, and the East African component of the International Corpus of English), as well as versions of the Brown Corpus and the Lancaster-Oslo/Bergen (LOB) corpus tagged for part of speech.
ICAME also publishes an annual journal, the ''ICAME Journal'', formerly ''ICAME News'', that contains articles, conference reports, reviews and notices related to corpus linguistics. The current editors of the ''ICAME Journal'' are Merja Kytö and Anna-Brita Stenström.
Further reading
* Leech, Geoffrey and Stig Johansson. 2009. "The coming of ICAME," ''ICAME Journal'' 33: 5-20. http://eprints.lancs.ac.uk/35628/
* Leech, Geoffrey. 2013. "The Development of ICAME and the Brown Family of Corpora."