The index of dissimilarity is a demographic measure of the evenness with which two groups are distributed across the component geographic areas that make up a larger area. A group is evenly distributed when each geographic unit has the same percentage of group members as the total population. The index score can also be interpreted as the percentage of one of the two groups included in the calculation that would have to move to different geographic areas in order to produce a distribution that matches that of the larger area. The index of dissimilarity can be used as a measure of segregation. A score of zero (0%) reflects a fully integrated environment; a score of 1 (100%) reflects full segregation. In terms of black–white segregation, a score of 0.60 means that 60 percent of blacks would have to exchange places with whites in other units to achieve an even geographic distribution.
Basic formula
The basic formula for the index of dissimilarity is:
:<math>\frac{1}{2} \sum_{i} \left| \frac{a_i}{A} - \frac{b_i}{B} \right|</math>
where (comparing a black and white population, for example):
:''a<sub>i</sub>'' = the population of group A in the ''i''th area, e.g. a census tract
:''A'' = the total population in group A in the large geographic entity for which the index of dissimilarity is being calculated.
:''b<sub>i</sub>'' = the population of group B in the ''i''th area
:''B'' = the total population in group B in the large geographic entity for which the index of dissimilarity is being calculated.
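As a minimal sketch of the formula above, the following Python function computes the index from per-area counts. The function name `dissimilarity_index` and the tract counts are hypothetical and chosen only for illustration.

```python
def dissimilarity_index(a, b):
    """Return 1/2 * sum_i |a_i/A - b_i/B| for per-area counts a and b."""
    if len(a) != len(b):
        raise ValueError("both groups need one count per area")
    total_a = sum(a)  # A: total population of group A in the large area
    total_b = sum(b)  # B: total population of group B in the large area
    if total_a == 0 or total_b == 0:
        raise ValueError("each group needs a non-zero total")
    return 0.5 * sum(abs(ai / total_a - bi / total_b) for ai, bi in zip(a, b))


# Hypothetical counts for four census tracts (illustrative only).
group_a = [100, 200, 300, 400]
group_b = [400, 300, 200, 100]
d = dissimilarity_index(group_a, group_b)
```

With these invented counts the index is about 0.4. Two groups with identical proportional distributions give 0, and two groups that share no area give 1.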
The index of dissimilarity is applicable to any categorical variable (whether demographic or not) and, because of its simple properties, is useful as input to multidimensional scaling and clustering programs. It has been used extensively in the study of social mobility to compare distributions of origin (or destination) occupational categories.
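One way such input might be prepared is to compute the index for every pair of distributions and collect the results in a symmetric matrix. The sketch below assumes each distribution is a list of counts over the same categories; the helper names and the occupational counts are hypothetical.

```python
from itertools import combinations


def dissimilarity(p, q):
    """Index of dissimilarity between two count lists over the same categories."""
    total_p, total_q = sum(p), sum(q)
    return 0.5 * sum(abs(x / total_p - y / total_q) for x, y in zip(p, q))


def dissimilarity_matrix(distributions):
    """Return (names, matrix) where matrix[i][j] is the index for names[i], names[j]."""
    names = list(distributions)
    size = len(names)
    matrix = [[0.0] * size for _ in range(size)]
    for i, j in combinations(range(size), 2):
        d = dissimilarity(distributions[names[i]], distributions[names[j]])
        matrix[i][j] = matrix[j][i] = d  # the index is symmetric
    return names, matrix


# Invented counts of destination occupations (three categories) by origin class.
destinations = {
    "manual": [60, 30, 10],
    "clerical": [30, 50, 20],
    "professional": [10, 30, 60],
}
names, matrix = dissimilarity_matrix(destinations)
```

The resulting matrix has zeros on the diagonal and could be passed to any scaling or clustering routine that accepts precomputed dissimilarities.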
Linear algebra perspective
The formula for the index of dissimilarity can be made much more compact and meaningful by considering it from the perspective of linear algebra. Suppose we are studying the distribution of rich and poor people in a city (e.g. London). Suppose our city contains <math>N</math> blocks. Let's create a vector <math>\mathbf{r}</math> which shows the number of rich people in each block of our city:
:<math>\mathbf{r} = (r_1, r_2, \ldots, r_N)</math>