Buckeye Corpus

The Buckeye Corpus of conversational speech is a speech corpus created by a team of linguists and psychologists at Ohio State University led by Prof. Mark Pitt (Dilley & Pitt, 2007). It contains high-quality recordings from 40 speakers in Columbus, Ohio, conversing freely with an interviewer. The interviewer's voice is heard only faintly in the background of these recordings. The sessions were conducted as sociolinguistic interviews, and are essentially monologues. The speech has been orthographically transcribed and phonetically labeled. The audio and text files, together with time-aligned phonetic labels, are stored in a format for use with speech analysis software (Xwaves and Wavesurfer). Software for searching the transcription files is also available at the project web site. The corpus is available to researchers in academia and industry. The project was funded by the National Institute on Deafness and Other Communication Disorders and the Office of Research at Ohio State University.


References

Dilley, L., & Pitt, M. (2007). A study of regressive place assimilation in spontaneous speech and its implications for spoken word recognition. Journal of the Acoustical Society of America, 122(4), 2340-2353.


Further reading

Pitt, M.A., Dilley, L., Johnson, K., Kiesling, S., Raymond, W., Hume, E., and Fosler-Lussier, E. (2007). Buckeye Corpus of Conversational Speech (2nd release). Columbus, OH: Department of Psychology, Ohio State University (Distributor).


External links


Buckeye Speech Corpus Homepage