EXMARaLDA (Extensible Markup Language for Discourse Annotation) is a set of free software tools for creating, managing and analyzing spoken language corpora. It consists of a transcription tool (comparable to tools like Praat or Transcriber), a tool for administering corpus metadata and a tool for running queries (KWIC, or keyword in context, searches) on spoken language corpora. EXMARaLDA is used for conversation and discourse analysis, dialectology, phonology and research into first and second language acquisition in children and adults. EXMARaLDA is based on the open standards XML and Unicode and is programmed in Java.
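To illustrate what a KWIC search returns, the following is a minimal sketch in Python. It is not EXMARaLDA's own query implementation or file format: the function name `kwic`, the `width` parameter and the sample utterance are all hypothetical and chosen only for this example. Each hit is shown with a fixed number of tokens to its left and right.

```python
def kwic(tokens, keyword, width=3):
    """Return (left context, match, right context) for each hit of keyword.

    Illustrative helper only; not part of EXMARaLDA.
    """
    hits = []
    for i, token in enumerate(tokens):
        # Case-insensitive comparison against the search term.
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


# Hypothetical transcribed utterance, invented for this example.
utterance = "well I think the tool is good and the corpus is large".split()

for left, match, right in kwic(utterance, "the"):
    # Right-align the left context so the matches line up in one column.
    print(f"{left:>20} | {match} | {right}")
```

Aligning the matched word in a single column is what makes such a concordance easy to scan; a real corpus query tool would additionally link each hit back to its position in the transcription and recording.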
References
* Schmidt, Thomas and Wörner, Kai (2009). "EXMARaLDA – Creating, analysing and sharing spoken language corpora for pragmatic research." In: ''Pragmatics 19''.
* Schmidt, Thomas and Bennöhr, Jasmine (2008). "Rescuing Legacy Data." In: ''Language Documentation and Conservation 2'', 109–129.
External links
* exmaralda.org - Official project website
* std.metu.edu.tr - Website of the METU Spoken Turkish Corpus, a corpus constructed with EXMARaLDA