HOME

TheInfoList



OR:

In linguistics, co-occurrence or cooccurrence is an above-chance frequency of occurrence of two terms (also known as coincidence or
concurrence In Western jurisprudence, concurrence (also contemporaneity or simultaneity) is the apparent need to prove the simultaneous occurrence of both ("guilty action") and ("guilty mind"), to constitute a crime; except in crimes of strict liabilit ...
) from a
text corpus In linguistics, a corpus (plural ''corpora'') or text corpus is a language resource consisting of a large and structured set of texts (nowadays usually electronically stored and processed). In corpus linguistics, they are used to do statistical ...
alongside each other in a certain order. Co-occurrence in this linguistic sense can be interpreted as an indicator of semantic proximity or an
idiom An idiom is a phrase or expression that typically presents a figurative, non-literal meaning attached to the phrase; but some phrases become figurative idioms while retaining the literal meaning of the phrase. Categorized as formulaic language ...
atic expression. Corpus linguistics and its statistic analyses reveal patterns of co-occurrences within a language and enable to work out typical
collocation In corpus linguistics, a collocation is a series of words or terms that co-occur more often than would be expected by chance. In phraseology, a collocation is a type of compositional phraseme, meaning that it can be understood from the words ...
s for its lexical items. A ''co-occurrence restriction'' is identified when linguistic elements never occur together. Analysis of these restrictions can lead to discoveries about the
structure A structure is an arrangement and organization of interrelated elements in a material object or system, or the object or system so organized. Material structures include man-made objects such as buildings and machines and natural objects such a ...
and development of a language. Co-occurrence can be seen an extension of word counting in higher dimensions. Co-occurrence can be quantitatively described using measures like
correlation In statistics, correlation or dependence is any statistical relationship, whether causal or not, between two random variables or bivariate data. Although in the broadest sense, "correlation" may indicate any type of association, in statistic ...
or
mutual information In probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables. More specifically, it quantifies the " amount of information" (in units such ...
.


See also

* Distributional hypothesis * Statistical semantics * Co-occurrence matrix *
Co-occurrence networks Co-occurrence network, sometimes referred to as a semantic network, is a method to analyze text that includes a graphic visualization of potential relationships between people, organizations, concepts, biological organisms like bacteria or other ...
* Similarity measure ** Dice coefficient


References

Corpus linguistics {{Ling-stub