In linguistics, co-occurrence or cooccurrence (also known as coincidence or concurrence) is an above-chance frequency of occurrence of two terms from a text corpus alongside each other in a certain order. Co-occurrence in this linguistic sense can be interpreted as an indicator of semantic proximity or of an idiomatic expression. Corpus linguistics and its statistical analyses reveal patterns of co-occurrence within a language and make it possible to work out typical collocations for its lexical items. A ''co-occurrence restriction'' is identified when linguistic elements never occur together. Analysis of these restrictions can lead to discoveries about the structure and development of a language.
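The notion of an "above-chance" ordered occurrence can be illustrated with a minimal Python sketch. It assumes a whitespace-tokenized toy corpus and treats "alongside each other" as direct adjacency; the function name `ordered_pair_counts` and the example sentence are hypothetical and chosen only for illustration. The chance baseline used here is a simple independence assumption, not a significance test.

```python
from collections import Counter

def ordered_pair_counts(tokens, first, second):
    """Return (observed, expected) counts of `first` directly followed by `second`.

    `expected` is the count one would see if the two terms were placed
    independently of each other (an illustrative baseline, not a test).
    """
    n_pairs = len(tokens) - 1
    if n_pairs < 1:
        raise ValueError("need at least two tokens")
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))  # ordered adjacent pairs
    observed = bigrams[(first, second)]
    p_first = unigrams[first] / len(tokens)
    p_second = unigrams[second] / len(tokens)
    expected = p_first * p_second * n_pairs
    return observed, expected

# Toy corpus (hypothetical example data).
tokens = "strong tea and strong coffee but powerful computers and strong tea".split()
observed, expected = ordered_pair_counts(tokens, "strong", "tea")
print(observed, round(expected, 3))
```

An observed count well above the expected one suggests that the pair co-occurs in that order more often than chance would predict. The reversed pair ("tea", "strong") never occurs in this toy corpus, which shows why the order of the terms matters.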
Co-occurrence can be seen as an extension of word counting in higher dimensions. It can be described quantitatively using measures such as correlation or mutual information.
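As an illustration of such a measure, the following Python sketch estimates the mutual information, in bits, between two binary events: "term x occurs in a text unit" and "term y occurs in a text unit". It is a plug-in estimate from raw counts with no smoothing, and it assumes that each text unit (for example a sentence) is a whitespace-separated string. The function name `mutual_information` and the toy data are hypothetical.

```python
import math

def mutual_information(docs, x, y):
    """Plug-in estimate (in bits) of the MI between presence of x and presence of y."""
    n = len(docs)
    if n == 0:
        raise ValueError("need at least one text unit")
    # Joint counts over the four presence/absence combinations.
    joint = {(a, b): 0 for a in (True, False) for b in (True, False)}
    for doc in docs:
        words = set(doc.split())
        joint[(x in words, y in words)] += 1
    # Marginal probabilities for each term.
    px = {a: (joint[(a, True)] + joint[(a, False)]) / n for a in (True, False)}
    py = {b: (joint[(True, b)] + joint[(False, b)]) / n for b in (True, False)}
    mi = 0.0
    for (a, b), count in joint.items():
        if count:  # empty cells contribute nothing to the sum
            p = count / n
            mi += p * math.log2(p / (px[a] * py[b]))
    return mi

# Toy data (hypothetical): "strong" and "tea" always appear together.
docs = ["strong tea", "strong tea", "weak coffee", "weak coffee"]
print(mutual_information(docs, "strong", "tea"))
```

A value of zero means that the presence of one term says nothing about the presence of the other, while larger values indicate stronger dependence. Unlike the ordered notion of co-occurrence described above, this estimate ignores word order within a text unit.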
See also
* Distributional hypothesis
* Statistical semantics
* Co-occurrence matrix
* Co-occurrence networks
* Similarity measure
** Dice coefficient