
The phrase "correlation does not imply causation" refers to the inability to legitimately deduce a cause-and-effect relationship between two events or variables solely on the basis of an observed association or correlation between them. The idea that "correlation implies causation" is an example of a questionable-cause logical fallacy, in which two events occurring together are taken to have established a cause-and-effect relationship. This fallacy is also known by the Latin phrase ''cum hoc ergo propter hoc'' ('with this, therefore because of this'). It differs from the fallacy known as ''post hoc ergo propter hoc'' ('after this, therefore because of this'), in which an event following another is seen as a necessary consequence of the former event, and from conflation, the errant merging of two events, ideas, databases, etc., into one. As with any logical fallacy, identifying that the reasoning behind an argument is flawed does not necessarily imply that the resulting conclusion is false.

Statistical methods have been proposed that use correlation as the basis for hypothesis tests for causality, including the Granger causality test and convergent cross mapping. The Bradford Hill criteria, also known as Hill's criteria for causation, are a group of nine principles that can be useful in establishing epidemiologic evidence of a causal relationship.


Usage and meaning of terms


"Imply"

In casual use, the word "implies" loosely means ''suggests'', rather than ''requires''. However, in logic, the technical use of the word "implies" means "is a ''sufficient condition'' for." That is the meaning intended by statisticians when they say causation is not certain. Indeed, ''p implies q'' has the technical meaning of the material conditional: ''if p then q'', symbolized as ''p → q''. That is, "if circumstance ''p'' is true, then ''q'' follows." In that sense, it is always correct to say "Correlation does not ''imply'' causation."
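
The truth-functional reading of ''p → q'' can be checked mechanically. The following is a minimal Python sketch; the helper name `implies` and the list of observations are illustrative assumptions, not part of any library or data set. It tabulates the material conditional and shows that a single case in which correlation is present but causation is absent is enough to make "correlation implies causation" false as a general rule.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

# Full truth table for p -> q.
truth_table = {(p, q): implies(p, q) for p, q in product([True, False], repeat=2)}
for (p, q), value in truth_table.items():
    print(f"p={p!s:5} q={q!s:5} p->q={value}")

# Hypothetical observations as (correlated, causally_related) pairs.
# One correlated-but-not-causal case is a counterexample to the general rule.
observations = [(True, True), (True, False), (False, False)]
rule_holds = all(implies(corr, cause) for corr, cause in observations)
print("correlation implies causation for all observations:", rule_holds)
```

The rule fails only on the pair in which correlation is present and causation is absent, which is exactly the case the phrase warns about.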


"Cause"

The word "cause" (or "causation") has multiple meanings in English. In philosophical terminology, "cause" can refer to necessary, sufficient, or contributing causes. In examining correlation, "cause" is most often used to mean "one contributing cause" (but not necessarily the only contributing cause).


Causal analysis


Examples of illogically inferring causation from correlation


B causes A (reverse causation or reverse causality)

Reverse causation (also called reverse causality or wrong direction) is an informal fallacy of questionable cause where cause and effect are reversed. The cause is said to be the effect and vice versa.

;Example 1
:The faster that windmills are observed to rotate, the more wind is observed.
:Therefore, wind is caused by the rotation of windmills. (Or, simply put: windmills, as their name indicates, are machines used to produce wind.)
In this example, the correlation (simultaneity) between windmill activity and wind velocity does not imply that wind is caused by windmills. It is rather the other way around, as suggested by the fact that wind does not need windmills to exist, while windmills need wind to rotate. Wind can be observed in places where there are no windmills or where the windmills are not rotating, and there are good reasons to believe that wind existed before the invention of windmills.

;Example 2
:Low cholesterol is associated with an increase in mortality.
:Therefore, low cholesterol increases your risk of mortality.
Causality is actually the other way around, since some diseases, such as cancer, cause low cholesterol through a variety of factors, such as weight loss, and they also cause an increase in mortality. This can also be seen in alcoholics. As alcoholics become diagnosed with cirrhosis of the liver, many quit drinking. However, they also experience an increased risk of mortality. In these instances, it is the diseases that cause an increased risk of mortality, but the increased mortality is attributed to the beneficial changes that follow the diagnosis, making healthy changes look unhealthy.

;Example 3
In other cases it may simply be unclear which is the cause and which is the effect. For example:
:''Children who watch a lot of TV are the most violent. Clearly, TV makes children more violent''.
This could easily be the other way round; that is, violent children like watching more TV than less violent ones.

;Example 4
A correlation between recreational drug use and psychiatric disorders might be either way around: perhaps the drugs cause the disorders, or perhaps people use drugs to self-medicate for preexisting conditions. Gateway drug theory may argue that marijuana usage leads to usage of harder drugs, but hard drug usage may lead to marijuana usage (see also ''confusion of the inverse''). Indeed, in the social sciences, where controlled experiments often cannot be used to discern the direction of causation, this fallacy can fuel long-standing scientific arguments. One such example can be found in education economics, between the screening/signaling and human capital models: it could either be that having innate ability enables one to complete an education, or that completing an education builds one's ability.

;Example 5
A historical example of this is that Europeans in the Middle Ages believed that lice were beneficial to health, since there would rarely be any lice on sick people. The reasoning was that the people got sick because the lice left. The real reason, however, is that lice are extremely sensitive to body temperature. A small increase in body temperature, such as in a fever, makes the lice look for another host. The medical thermometer had not yet been invented, so that increase in temperature was rarely noticed. Noticeable symptoms came later, which gave the impression that the lice had left before the person became sick.

In other cases, two phenomena can each be a partial cause of the other; consider poverty and lack of education, or procrastination and poor self-esteem. One making an argument based on these two phenomena must, however, be careful to avoid the fallacy of circular cause and consequence. Poverty is ''a'' cause of lack of education, but it is not the ''sole'' cause, and vice versa.
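
One reason the direction of causation cannot be read off a correlation is that the Pearson coefficient is symmetric in its two arguments. The following is a minimal Python sketch using simulated data in which wind speed drives windmill rotation; the numbers, the variable names, and the `pearson` helper are assumptions made for illustration only.

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated causal direction: wind -> rotation (never the reverse).
wind = [rng.uniform(0.0, 20.0) for _ in range(1000)]
rotation = [3.0 * w + rng.gauss(0.0, 5.0) for w in wind]

r_wind_rotation = pearson(wind, rotation)
r_rotation_wind = pearson(rotation, wind)
print(r_wind_rotation, r_rotation_wind)
```

The two coefficients agree, so the statistic alone gives no hint that wind drives rotation rather than the reverse; that knowledge has to come from outside the data.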


Third factor C (the common-causal variable) causes both A and B

The third-cause fallacy (also known as ''ignoring a common cause'' [Labossiere, M.C., ''Dr. LaBossiere's Philosophy Pages''] or ''questionable cause'') is a logical fallacy in which a spurious relationship is confused for causation. It asserts that X causes Y when in reality, both X and Y are caused by Z. It is a variation on the ''post hoc ergo propter hoc'' fallacy and a member of the questionable cause group of fallacies.

The examples below all deal with a lurking variable, which is simply a hidden third variable that affects both of the variables observed to be correlated. That third variable is also known as a confounding variable, with the slight difference that confounding variables need not be hidden and may thus be corrected for in an analysis. A difficulty often also arises where the third factor, though fundamentally different from A and B, is so closely related to A and/or B as to be confused with them or very difficult to scientifically disentangle from them (see Example 4).

;Example 1
:Sleeping with one's shoes on is strongly correlated with waking up with a headache.
:Therefore, sleeping with one's shoes on causes headache.
The above example commits the correlation-implies-causation fallacy, as it prematurely concludes that sleeping with one's shoes on causes headache. A more plausible explanation is that both are caused by a third factor, in this case going to bed drunk, which thereby gives rise to a correlation. So the conclusion is false.

;Example 2
:Young children who sleep with the light on are much more likely to develop myopia in later life.
:Therefore, sleeping with the light on causes myopia.
This is a scientific example that resulted from a study at the University of Pennsylvania Medical Center. Published in the May 13, 1999, issue of ''Nature'', the study received much coverage at the time in the popular press. However, a later study at Ohio State University did not find that infants sleeping with the light on caused the development of myopia. It did find a strong link between parental myopia and the development of child myopia, also noting that myopic parents were more likely to leave a light on in their children's bedroom. In this case, the cause of both conditions is parental myopia, and the above-stated conclusion is false.

;Example 3
:As ice cream sales increase, the rate of drowning deaths increases sharply.
:Therefore, ice cream consumption causes drowning.
This example fails to recognize the importance of time of year and temperature to ice cream sales. Ice cream is sold during the hot summer months at a much greater rate than during colder times, and it is during these hot summer months that people are more likely to engage in activities involving water, such as swimming. The increased drowning deaths are simply caused by more exposure to water-based activities, not ice cream. The stated conclusion is false.

;Example 4
:A hypothetical study shows a relationship between test anxiety scores and shyness scores, with a statistical ''r'' value (strength of correlation) of +.59.
:Therefore, it may be simply concluded that shyness, in some part, causally influences test anxiety.
However, as encountered in many psychological studies, another variable, a "self-consciousness score", is discovered that has a sharper correlation (+.73) with shyness. This suggests a possible "third variable" problem. When three such closely related measures are found, however, it further suggests that each may have bidirectional tendencies (see the section on bidirectional causation), being a cluster of correlated values each influencing one another to some extent. Therefore, the simple conclusion above may be false.

;Example 5
:Since the 1950s, both the atmospheric CO2 level and obesity levels have increased sharply.
:Hence, atmospheric CO2 causes obesity.
A third factor may explain both: richer populations tend to eat more food and produce more CO2.

;Example 6
:HDL ("good") cholesterol is negatively correlated with incidence of heart attack.
:Therefore, taking medication to raise HDL decreases the chance of having a heart attack.
Further research has called this conclusion into question. Instead, it may be that other underlying factors, like genes, diet and exercise, affect both HDL levels and the likelihood of having a heart attack; it is possible that medicines may affect the directly measurable factor, HDL levels, without affecting the chance of heart attack.
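
The common-cause pattern can be reproduced with simulated data. The sketch below is written in Python and loosely modeled on Example 3: a hypothetical `temperature` variable drives both `ice_cream` and `drownings`, which have no direct link to each other. All numbers are in arbitrary standardized units and are assumptions for illustration, not real measurements. The `partial_corr` helper uses the standard first-order partial correlation formula, which controls only for a linear effect of the third variable.

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly controlling for zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Z (temperature) causes both X (ice cream sales) and Y (drownings).
# There is no arrow between X and Y in this simulation.
temperature = [rng.gauss(0.0, 1.0) for _ in range(n)]
ice_cream = [t + rng.gauss(0.0, 1.0) for t in temperature]
drownings = [t + rng.gauss(0.0, 1.0) for t in temperature]

raw = pearson(ice_cream, drownings)
adjusted = partial_corr(ice_cream, drownings, temperature)
print(f"raw correlation:      {raw:.3f}")
print(f"controlling for temp: {adjusted:.3f}")
```

Under these assumptions the raw correlation is clearly positive, while the correlation that remains after controlling for temperature is close to zero. Such an adjustment is only possible when the confounder has been identified and measured, which is why lurking variables are harder to deal with.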


Bidirectional causation: A causes B, and B causes A

Causality is not necessarily one-way; in a predator-prey relationship, predator numbers affect prey numbers, but prey numbers, i.e. food supply, also affect predator numbers. Another well-known example is that cyclists have a lower body mass index (BMI) than people who do not cycle. This is often explained by assuming that cycling increases physical activity levels and therefore decreases BMI. Because results from prospective studies on people who increase their bicycle use show a smaller effect on BMI than cross-sectional studies, there may be some reverse causality as well. For example, people with a lower BMI may be more likely to want to cycle in the first place.


The relationship between A and B is coincidental

The two variables are not related at all, but correlate by chance. The more things are examined, the more likely it is that two unrelated variables will appear to be related. For example:
* The result of the last home game by the Washington Commanders prior to the presidential election predicted the outcome of every presidential election from 1936 to 2000 inclusive, despite the fact that the outcomes of football games had nothing to do with the outcome of the popular election. This streak was finally broken in 2004 (or 2012 using an alternative formulation of the original rule).
* The Mierscheid law, which correlates the Social Democratic Party of Germany's share of the popular vote with the size of crude steel production in Western Germany.
* Alternating bald–hairy Russian leaders: a bald (or obviously balding) state leader of Russia has succeeded a non-bald ("hairy") one, and vice versa, for nearly 200 years.
* The Bible code, Hebrew words predicting historical events supposedly hidden within the Torah: the huge number of combinations of letters makes appearances of any word in sufficiently lengthy text statistically insignificant.
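
The claim that examining more things makes chance correlations more likely can be illustrated by brute force. The Python sketch below generates many short series of independent random numbers and reports the strongest correlation found among all pairs. The series count, series length, seed, and helper names are arbitrary choices made for illustration.

```python
import itertools
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)  # fixed seed so the run is repeatable
n_series, length = 200, 10

# Every series is independent noise: no pair is related by construction.
series = [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(n_series)]

n_pairs = n_series * (n_series - 1) // 2
strongest = max(abs(pearson(a, b)) for a, b in itertools.combinations(series, 2))
print(f"pairs examined: {n_pairs}")
print(f"strongest |r| among unrelated series: {strongest:.3f}")
```

With thousands of pairs and only ten observations per series, some pair will show a very strong correlation even though every series is pure noise. A striking correlation found by searching through many candidates is therefore weak evidence on its own.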


Use of correlation as scientific evidence

Much scientific evidence is based upon a correlation of variables that are observed to occur together. Scientists are careful to point out that correlation does not necessarily mean causation. The assumption that A causes B simply because A correlates with B is not accepted as a legitimate form of argument. However, sometimes people commit the opposite fallacy of dismissing correlation entirely. That would dismiss a large swath of important scientific evidence.

Since it may be difficult or ethically impossible to run controlled double-blind studies to address certain questions, correlational evidence from several different angles may be useful for ''prediction'' despite failing to provide evidence for ''causation''. For example, social workers might be interested in knowing how child abuse relates to academic performance. Although it would be unethical to perform an experiment in which children are randomly assigned to receive or not receive abuse, researchers can look at existing groups using a non-experimental correlational design. If in fact a negative correlation exists between abuse and academic performance, researchers could potentially use this knowledge of a statistical correlation to make predictions about children outside the study who experience abuse, even though the study failed to provide causal evidence that abuse decreases academic performance.

The combination of limited available methodologies with the dismissing-correlation fallacy has on occasion been used to counter a scientific finding. For example, the tobacco industry has historically relied on a dismissal of correlational evidence to reject a link between tobacco smoke and lung cancer, as did biologist and statistician Ronald Fisher (frequently on the industry's behalf).

Correlation is a valuable type of scientific evidence in fields such as medicine, psychology, and sociology. Correlations must first be confirmed as real, and every possible causative relationship must then be systematically explored. In the end, correlation alone cannot be used as evidence for a cause-and-effect relationship between a treatment and benefit, a risk factor and a disease, or a social or economic factor and various outcomes. It is one of the most abused types of evidence because it is easy and even tempting to come to premature conclusions based upon the preliminary appearance of a correlation.


See also

* Bradford Hill criteria
* Curse of the rainbow jersey: an example of such a correlation fallacy in sport

