Simpson’s Paradox

Simpson's paradox is a phenomenon in probability and statistics in which a trend appears in several groups of data but disappears or reverses when the groups are combined. This result is often encountered in social-science and medical-science statistics, and is particularly problematic when frequency data are unduly given causal interpretations (Judea Pearl, ''Causality: Models, Reasoning, and Inference'', Cambridge University Press, 2000; 2nd edition 2009). The paradox can be resolved when confounding variables and causal relations are appropriately addressed in the statistical modeling (e.g., through cluster analysis). Simpson's paradox has been used to illustrate the kind of misleading results that the misuse of statistics can generate.

Edward H. Simpson first described this phenomenon in a technical paper in 1951; the statisticians Karl Pearson (in 1899) and Udny Yule (in 1903) had mentioned similar effects earlier. The name ''Simpson's paradox'' was introduced by Colin R. Blyth in 1972. It is also referred to as Simpson's reversal, the Yule–Simpson effect, the amalgamation paradox, or the reversal paradox. Mathematician Jordan Ellenberg argues that Simpson's paradox is misnamed as "there's no contradiction involved, just two different ways to think about the same data" and suggests that its lesson "isn't really to tell us which viewpoint to take but to insist that we keep both the parts and the whole in mind at once."


Examples


UC Berkeley gender bias

One of the best-known examples of Simpson's paradox comes from a study of gender bias among graduate school admissions to the University of California, Berkeley. The admission figures for the fall of 1973 showed that men applying were more likely than women to be admitted, and the difference was so large that it was unlikely to be due to chance (David Freedman, Robert Pisani, and Roger Purves, ''Statistics'', 4th edition, W. W. Norton, 2007).

However, once the departments applied to were taken into account, the differing rejection percentages revealed how much harder some departments were to get into than others. They also showed that women tended to apply to more competitive departments with lower rates of admission, even among qualified applicants (such as the English department), whereas men tended to apply to less competitive departments with higher rates of admission (such as the engineering department). The pooled and corrected data showed a "small but statistically significant bias in favor of women".

The data from the six largest departments are listed below:

Across the entire data set, 4 of the 85 departments were significantly biased against women and 6 were significantly biased against men (not all of them appear in the 'six largest departments' table). Notably, the conclusion did not rest on the number of biased departments but on the admissions of each gender pooled across all departments, weighted by each department's rejection rate across all of its applicants.


Kidney stone treatment

Another example comes from a real-life medical study comparing the success rates of two treatments for kidney stones. The table below shows the success rates (the term ''success rate'' here actually means the success proportion) and numbers of treatments for treatments involving both small and large kidney stones, where Treatment A includes open surgical procedures and Treatment B includes closed surgical procedures. The numbers in parentheses indicate the number of success cases over the total size of the group.

The paradoxical conclusion is that treatment A is more effective when used on small stones, and also when used on large stones, yet treatment B appears to be more effective when considering both sizes at the same time. In this example, the "lurking" variable (or confounding variable) causing the paradox is the size of the stones, whose importance was not known to researchers until its effects were included.

Which treatment is considered better is determined by which success ratio (successes/total) is larger. The reversal of the inequality between the two ratios when considering the combined data, which creates Simpson's paradox, happens because two effects occur together:
# The sizes of the groups, which are combined when the lurking variable is ignored, are very different. Doctors tend to give cases with large stones the better treatment A, and the cases with small stones the inferior treatment B. Therefore, the totals are dominated by groups 3 (large stones, treatment A) and 2 (small stones, treatment B), and not by the two much smaller groups 1 and 4.
# The lurking variable, stone size, has a large effect on the ratios; i.e., the success rate is more strongly influenced by the severity of the case than by the choice of treatment. Therefore, the group of patients with large stones using treatment A (group 3) does worse than the group with small stones, even if the latter used the inferior treatment B (group 2).

Based on these effects, the paradoxical result is seen to arise because the effect of the size of the stones overwhelms the benefits of the better treatment (A). In short, the less effective treatment B appeared to be more effective because it was applied more frequently to the small-stone cases, which were easier to treat. Jaynes argues that the correct conclusion is that though treatment A remains noticeably better than treatment B, the kidney stone size is more important.
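The reversal can be checked with a few lines of arithmetic. The sketch below uses the counts commonly quoted for this study; they are supplied here as an assumption, since the figures do not appear in the text above, and the names `counts`, `rate` and `pooled_rate` are made up for illustration.

```python
from fractions import Fraction

# (successes, patients) per treatment and stone size.
# Counts are the figures commonly quoted for this study (an assumption here).
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    """Success proportion as an exact fraction."""
    return Fraction(successes, total)

def pooled_rate(by_size):
    """Success proportion after merging the stone-size groups."""
    successes = sum(s for s, _ in by_size.values())
    total = sum(n for _, n in by_size.values())
    return Fraction(successes, total)

# Compare the treatments within each stone size ...
for size in ("small", "large"):
    a = rate(*counts["A"][size])
    b = rate(*counts["B"][size])
    print(f"{size:5s} stones: A={float(a):.3f}  B={float(b):.3f}")

# ... and again after ignoring stone size.
a_all = pooled_rate(counts["A"])
b_all = pooled_rate(counts["B"])
print(f"both sizes  : A={float(a_all):.3f}  B={float(b_all):.3f}")
```

With these counts, A has the higher proportion within each stone size (about 0.93 against 0.87, and 0.73 against 0.69), while B has the higher proportion on the pooled data (about 0.83 against 0.78).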


Batting averages

A common example of Simpson's paradox involves the batting averages of players in professional baseball. It is possible for one player to have a higher batting average than another player each year for a number of years, but to have a lower batting average across all of those years. This phenomenon can occur when there are large differences in the number of at bats between the years. Mathematician Ken Ross demonstrated this using the batting averages of two baseball players, Derek Jeter and David Justice, during the years 1995 and 1996 (Ken Ross, ''A Mathematician at the Ballpark: Odds and Probabilities for Baseball Fans'', Pi Press, 2004, pp. 12–13).

In both 1995 and 1996, Justice had a higher batting average than Jeter did. However, when the two baseball seasons are combined, Jeter shows a higher batting average than Justice. According to Ross, this phenomenon would be observed about once per year among the possible pairs of players.
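One way to see why the yearly and combined rankings can disagree is to write each combined average as a weighted mean of the yearly averages, with weights equal to each season's share of at bats. The sketch below uses the hits and at bats usually quoted for this example; they are an assumption here, since the figures do not appear in the text above, and the helper name `combined_average` is hypothetical.

```python
# (hits, at bats) per season; figures usually quoted for this example (assumed).
seasons = {
    "Jeter":   {1995: (12, 48),   1996: (183, 582)},
    "Justice": {1995: (104, 411), 1996: (45, 140)},
}

def combined_average(record):
    """Return (combined average, [(year, yearly average, weight), ...]).

    The weight of a season is its share of the player's total at bats, so the
    combined average equals the weighted mean of the yearly averages.
    """
    total_ab = sum(ab for _, ab in record.values())
    parts = [(year, hits / ab, ab / total_ab) for year, (hits, ab) in record.items()]
    combined = sum(avg * weight for _, avg, weight in parts)
    return combined, parts

for player, record in seasons.items():
    combined, parts = combined_average(record)
    detail = ", ".join(f"{year}: {avg:.3f} (weight {w:.2f})" for year, avg, w in parts)
    print(f"{player:8s} {detail} -> combined {combined:.3f}")
```

With these figures, Jeter's combined average is dominated by his stronger 1996 season (about 92% of his at bats), whereas Justice's is dominated by his weaker 1995 season (about 75% of his at bats).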


Vector interpretation

Simpson's paradox can also be illustrated using a 2-dimensional vector space. A success rate of $\frac{p}{q}$ (i.e., ''successes/attempts'') can be represented by a vector $\vec{A} = (q, p)$, with a slope of $\frac{p}{q}$. A steeper vector then represents a greater success rate. If two rates $\frac{p_1}{q_1}$ and $\frac{p_2}{q_2}$ are combined, as in the examples given above, the result can be represented by the sum of the vectors $(q_1, p_1)$ and $(q_2, p_2)$, which according to the parallelogram rule is the vector $(q_1 + q_2, p_1 + p_2)$, with slope $\frac{p_1 + p_2}{q_1 + q_2}$.

Simpson's paradox says that even if a vector $\vec{L}_1$ (in orange in figure) has a smaller slope than another vector $\vec{B}_1$ (in blue), and $\vec{L}_2$ has a smaller slope than $\vec{B}_2$, the sum of the two vectors $\vec{L}_1 + \vec{L}_2$ can potentially still have a larger slope than the sum of the two vectors $\vec{B}_1 + \vec{B}_2$, as shown in the example. For this to occur one of the orange vectors must have a greater slope than one of the blue vectors (here $\vec{L}_2$ and $\vec{B}_1$), and these will generally be longer than the alternatively subscripted vectors, thereby dominating the overall comparison.
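The geometric argument can be tried out numerically. The vectors below are hypothetical, chosen only so that the slopes behave as described; the names `L1`, `L2` (orange) and `B1`, `B2` (blue), and the helpers `slope` and `add`, are illustrative assumptions.

```python
from fractions import Fraction
from math import hypot

# Hypothetical vectors (q, p) = (attempts, successes); not taken from any study.
L1, L2 = (5, 1), (8, 6)   # "orange" vectors
B1, B2 = (8, 2), (2, 2)   # "blue" vectors

def slope(v):
    """Success rate p/q of a vector (q, p)."""
    q, p = v
    return Fraction(p, q)

def add(u, v):
    """Component-wise vector sum."""
    return (u[0] + v[0], u[1] + v[1])

# Each orange vector is flatter than its blue counterpart ...
print(slope(L1) < slope(B1), slope(L2) < slope(B2))
# ... yet the orange sum is steeper than the blue sum.
print(slope(add(L1, L2)) > slope(add(B1, B2)))
# The crossing pair (L2 steeper than B1) consists of the longer vectors.
print(slope(L2) > slope(B1), hypot(*L2) > hypot(*L1), hypot(*B1) > hypot(*B2))
```

Every comparison printed here evaluates to `True`: the long vectors `L2` and `B1` largely determine the direction of each sum.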


Correlation between variables

Simpson's reversal can also arise in correlations, in which two variables appear to have (say) a positive correlation with one another, when in fact they have a negative correlation, the reversal having been brought about by a "lurking" confounder. Berman et al. give an example from economics, where a dataset suggests overall demand is positively correlated with price (that is, higher prices lead to ''more'' demand), in contradiction of expectation. Analysis reveals time to be the confounding variable: plotting both price and demand against time reveals the expected negative correlation over various periods, which then reverses to become positive if the influence of time is ignored by simply plotting demand against price.
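A small synthetic data set shows the same pattern. The numbers below are invented for illustration and are not the Berman et al. data: within each period demand falls as price rises, while both drift upward from one period to the next.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic (price, demand) observations for three time periods (invented data).
periods = {
    1: [(10, 50), (11, 49), (12, 48)],
    2: [(20, 70), (21, 69), (22, 68)],
    3: [(30, 90), (31, 89), (32, 88)],
}

# Correlation within each period (time held roughly fixed).
for t, obs in periods.items():
    prices, demand = zip(*obs)
    print(f"period {t}: r = {pearson(prices, demand):+.2f}")

# Correlation after pooling all periods (time ignored).
pooled = [pair for obs in periods.values() for pair in obs]
prices, demand = zip(*pooled)
print(f"pooled  : r = {pearson(prices, demand):+.2f}")
```

In this toy data the within-period correlations are strongly negative while the pooled correlation is strongly positive, because the upward drift over time outweighs the within-period relationship.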


Psychology

Psychological interest in Simpson's paradox seeks to explain why people deem sign reversal to be impossible at first. The question is where people get this strong intuition from, and how it is encoded in the mind. Simpson's paradox demonstrates that this intuition cannot be derived from either classical logic or probability calculus alone, and has thus led philosophers to speculate that it is supported by an innate causal logic that guides people in reasoning about actions and their consequences. Savage's sure-thing principle is an example of what such logic may entail. A qualified version of Savage's sure-thing principle can indeed be derived from Pearl's ''do''-calculus and reads: "An action ''A'' that increases the probability of an event ''B'' in each subpopulation ''C_i'' of ''C'' must also increase the probability of ''B'' in the population as a whole, provided that the action does not change the distribution of the subpopulations." This suggests that knowledge about actions and consequences is stored in a form resembling causal Bayesian networks.
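The qualified principle can be written out in ''do''-notation. The following is a sketch of the argument rather than a quotation from Pearl; the symbol $A'$ for the alternative action is an assumption introduced here for illustration.

```latex
\begin{align*}
P(B \mid do(A))
  &= \sum_i P(B \mid do(A), C_i)\, P(C_i \mid do(A)) \\
  &= \sum_i P(B \mid do(A), C_i)\, P(C_i)
     && \text{(the action does not change the distribution of the $C_i$)} \\
  &> \sum_i P(B \mid do(A'), C_i)\, P(C_i)
     && \text{(the action raises the probability of $B$ in every $C_i$)} \\
  &= \sum_i P(B \mid do(A'), C_i)\, P(C_i \mid do(A'))
   = P(B \mid do(A')).
\end{align*}
```

Without the proviso, the weights $P(C_i \mid do(A))$ and $P(C_i \mid do(A'))$ may differ, and the comparison of the two weighted sums can then go either way, which is the mechanism behind the reversal.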


Probability

A paper by Pavlides and Perlman presents a proof, due to Hadjicostas, that in a random 2 × 2 × 2 table with uniform distribution, Simpson's paradox will occur with a probability of exactly 1/60. A study by Kock suggests that the probability that Simpson's paradox would occur at random in path models (i.e., models generated by path analysis) with two predictors and one criterion variable is approximately 12.8 percent, slightly higher than 1 occurrence per 8 path models.
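The 2 × 2 × 2 figure can be explored by simulation. The sketch below is a Monte Carlo estimate, not the analytical proof. It assumes that "uniform distribution" means the eight cell probabilities are drawn uniformly from the probability simplex and that a reversal in either direction counts as an occurrence; the function names `is_simpson` and `estimate` are made up for illustration.

```python
import random

def is_simpson(cells):
    """True if a 2 x 2 x 2 table shows a Simpson reversal in either direction.

    cells = (a1, b1, c1, d1, a2, b2, c2, d2): two 2 x 2 tables [[a, b], [c, d]],
    one per level of the third variable. The sign of a*d - b*c gives the
    direction of association in a 2 x 2 table.
    """
    a1, b1, c1, d1, a2, b2, c2, d2 = cells
    s1 = a1 * d1 - b1 * c1
    s2 = a2 * d2 - b2 * c2
    pooled = (a1 + a2) * (d1 + d2) - (b1 + b2) * (c1 + c2)
    return (s1 > 0 and s2 > 0 and pooled < 0) or (s1 < 0 and s2 < 0 and pooled > 0)

def estimate(n_tables, seed=0):
    """Fraction of randomly drawn tables that show the paradox."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tables):
        # Eight i.i.d. exponentials are, after normalisation, uniform on the
        # probability simplex; the sign tests above do not need the normalisation.
        cells = [rng.expovariate(1.0) for _ in range(8)]
        hits += is_simpson(cells)
    return hits / n_tables

print(estimate(100_000))
```

If this setup matches the one used in the paper, the printed estimate should come out close to 1/60 (about 0.0167), up to sampling error.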


Simpson's second paradox

A second, less well-known paradox was also discussed in Simpson's 1951 paper. It can occur when the "sensible interpretation" is not necessarily found in the separated data, as in the kidney stone example, but can instead reside in the combined data. Whether the partitioned or combined form of the data should be used hinges on the process giving rise to the data, meaning the correct interpretation of the data cannot always be determined by simply observing the tables.

Judea Pearl has shown that, in order for the partitioned data to represent the correct causal relationships between any two variables, ''X'' and ''Y'', the partitioning variables must satisfy a graphical condition called the "back-door criterion":
# They must block all spurious paths between ''X'' and ''Y''.
# No variable can be affected by ''X''.

This criterion provides an algorithmic solution to Simpson's second paradox, and explains why the correct interpretation cannot be determined by data alone; two different graphs, both compatible with the data, may dictate two different back-door criteria. When the back-door criterion is satisfied by a set ''Z'' of covariates, the adjustment formula (see confounding) gives the correct causal effect of ''X'' on ''Y''. If no such set exists, Pearl's ''do''-calculus can be invoked to discover other ways of estimating the causal effect. The completeness of ''do''-calculus can be viewed as offering a complete resolution of Simpson's paradox.
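For a set ''Z'' that satisfies the back-door criterion, the adjustment formula averages the stratum-specific rates over the overall distribution of ''Z'': P(Y | do(X = x)) = Σ_z P(Y | X = x, Z = z) P(Z = z). As a minimal sketch, the code below applies it to the kidney stone example. It assumes that stone size is the only confounder (so that it satisfies the criterion) and uses the counts commonly quoted for that study; both are assumptions made for illustration, and `adjusted_rate` is a hypothetical helper name.

```python
# (successes, patients) by treatment and stone size; commonly quoted counts (assumed).
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def adjusted_rate(treatment):
    """Back-door adjustment: sum over z of P(success | treatment, z) * P(z)."""
    n_total = sum(n for by_size in counts.values() for _, n in by_size.values())
    result = 0.0
    for size in ("small", "large"):
        # P(Z = z): share of all patients with this stone size, whatever the treatment.
        p_z = sum(counts[t][size][1] for t in counts) / n_total
        # P(Y = success | X = treatment, Z = z): stratum-specific success proportion.
        successes, n = counts[treatment][size]
        result += (successes / n) * p_z
    return result

print(f"A: {adjusted_rate('A'):.3f}")
print(f"B: {adjusted_rate('B'):.3f}")
```

Under these assumptions the adjusted success rates are about 0.83 for A and 0.78 for B, so the adjusted comparison agrees with the within-stratum comparisons rather than with the pooled one.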


Criticism

One criticism is that the paradox is not really a paradox at all, but rather a failure to properly account for confounding variables or to consider causal relationships between variables. Focus on the paradox may distract from these more important statistical issues.

Another criticism of the apparent Simpson's paradox is that it may be a result of the specific way that data are stratified or grouped. The phenomenon may disappear or even reverse if the data are stratified differently or if different confounding variables are considered. Simpson's example actually highlighted a phenomenon called noncollapsibility, which occurs when subgroups with high proportions do not make simple averages when combined. This suggests that the paradox may not be a universal phenomenon, but rather a specific instance of a more general statistical issue.

Despite these criticisms, the apparent Simpson's paradox remains a popular and intriguing topic in statistics and data analysis. It continues to be studied and debated by researchers and practitioners in a wide range of fields, and it serves as a valuable reminder of the importance of careful statistical analysis and the potential pitfalls of simplistic interpretations of data.


See also

* Spurious correlation
* Omitted-variable bias


Bibliography

* Leila Schneps and Coralie Colmez, ''Math on trial. How numbers get used and abused in the courtroom'', Basic Books, 2013. (Sixth chapter: "Math error number 6: Simpson's paradox. The Berkeley sex bias case: discrimination detection").


External links


* Simpson's Paradox at the Stanford Encyclopedia of Philosophy, by Jan Sprenger and Naftali Weinberger.
* How statistics can be misleading, by Mark Liddell, TED-Ed video and lesson.
* Pearl, Judea, "Understanding Simpson's Paradox" (PDF)
* Simpson's Paradox, a short article by Alexander Bogomolny on the vector interpretation of Simpson's paradox
* The Wall Street Journal column "The Numbers Guy" for December 2, 2009 dealt with recent instances of Simpson's paradox in the news, notably a Simpson's paradox in the comparison of unemployment rates of the 2009 recession with the 1983 recession.
* At the Plate, a Statistical Puzzler: Understanding Simpson's Paradox, by Arthur Smith, August 20, 2010
* Simpson's Paradox, a video by Henry Reich of MinutePhysics