The overlap coefficient,
or Szymkiewicz–Simpson coefficient, is a
similarity measure
that measures the overlap between two finite
sets. It is related to the
Jaccard index
and is defined as the size of the
intersection
divided by the size of the smaller of the two sets:
:<math>\operatorname{overlap}(X, Y) = \frac{|X \cap Y|}{\min(|X|, |Y|)}</math>
If set ''X'' is a
subset
of ''Y'', or vice versa, then the overlap coefficient is equal to 1.
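The definition can be illustrated with a minimal Python sketch. The function name overlap_coefficient is hypothetical, chosen here for illustration. The handling of empty sets is also an assumption: the definition above does not cover that case, so the sketch raises an error instead of choosing a value.

```python
def overlap_coefficient(x, y):
    """Return |X ∩ Y| / min(|X|, |Y|) for two finite collections."""
    x, y = set(x), set(y)
    smaller = min(len(x), len(y))
    if smaller == 0:
        # Assumption: the empty-set case is not defined in the text,
        # so this sketch refuses to pick a value for it.
        raise ValueError("overlap coefficient is undefined for an empty set")
    return len(x & y) / smaller


# Two sets sharing 2 elements; the smaller set has 4 elements: 2 / 4.
print(overlap_coefficient({1, 2, 3, 4}, {3, 4, 5, 6, 7}))  # 0.5

# Subset case: every element of the smaller set is in the larger one.
print(overlap_coefficient({1, 2}, {1, 2, 3, 4, 5}))  # 1.0
```

The second call shows the subset property: when one set is contained in the other, the intersection is the smaller set itself, so the ratio is 1.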