Causal analysis is the field of experimental design and statistics pertaining to establishing cause and effect. Typically it involves establishing four elements: correlation, sequence in time (that is, causes must occur before their proposed effect), a plausible physical or information-theoretical mechanism for an observed effect to follow from a possible cause, and eliminating the possibility of common and alternative ("special") causes. Such analysis usually involves one or more artificial or natural experiments.
Motivation
Data analysis is primarily concerned with causal questions. For example, did the fertilizer cause the crops to grow? Or, can a given sickness be prevented? Or, why is my friend depressed? The potential outcomes and regression analysis techniques handle such queries when data is collected using designed experiments. Data collected in observational studies requires different techniques for causal inference (because of issues such as confounding, for example). Causal inference techniques used with experimental data require additional assumptions to produce reasonable inferences with observational data. The difficulty of causal inference under such circumstances is often summed up as "correlation does not imply causation".
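The effect of confounding can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions: the variables `z`, `x`, and `y`, the coefficients, and the sample size are all hypothetical. A confounder `z` drives both `x` and `y`, while `x` has no effect on `y` at all. The raw correlation between `x` and `y` is nevertheless large, and it largely disappears once the linear influence of `z` is removed from both.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residuals(ys, zs):
    """Residuals of a simple least-squares regression of ys on zs."""
    n = len(ys)
    my, mz = sum(ys) / n, sum(zs) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]       # hypothetical confounder
x = [zi + rng.gauss(0, 1) for zi in z]        # "treatment": depends on z only
y = [2 * zi + rng.gauss(0, 1) for zi in z]    # outcome: depends on z only, not on x

raw = pearson(x, y)                                    # association ignoring z
adjusted = pearson(residuals(x, z), residuals(y, z))   # association after removing z

print(f"raw correlation:      {raw:.3f}")
print(f"adjusted correlation: {adjusted:.3f}")
```

Adjusting for `z` works here only because the confounder is measured and its influence is linear by construction; with observational data neither can be taken for granted, which is why additional assumptions are needed.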
In philosophy and physics
The nature of causality is systematically investigated in several academic disciplines, including philosophy and physics.
In academia, there are a significant number of theories on causality; The Oxford Handbook of Causation encompasses 770 pages. Among the more influential theories within philosophy are Aristotle's four causes and Al-Ghazali's occasionalism.
David Hume argued that beliefs about causality are based on experience, and that experience is similarly based on the assumption that the future models the past, which in turn can only be based on experience, leading to circular logic. He concluded that causality is not based on actual reasoning: only correlation can actually be perceived.
Immanuel Kant, according to , held that "a causal principle according to which every event has a cause, or follows according to a causal law, cannot be established through induction as a purely empirical claim, since it would then lack strict universality, or necessity".
Outside the field of philosophy, theories of causation can be identified in classical mechanics, statistical mechanics, quantum mechanics, spacetime theories, biology, social sciences, and law.
To establish a correlation as causal within physics, it is normally understood that the cause and the effect must connect through a local mechanism (cf. for instance the concept of impact) or a nonlocal mechanism (cf. the concept of field), in accordance with known laws of nature.
From the point of view of thermodynamics, universal properties of causes as compared to effects have been identified through the second law of thermodynamics, confirming the ancient, medieval and Cartesian view that "the cause is greater than the effect" for the particular case of thermodynamic free energy. This, in turn, is challenged by popular interpretations of the concepts of nonlinear systems and the butterfly effect, in which small events cause large effects due to, respectively, unpredictability and an unlikely triggering of large amounts of potential energy.
Causality construed from counterfactual states
Intuitively, causation seems to require not just a correlation, but a counterfactual dependence. Suppose that a student performed poorly on a test and guesses that the cause was not studying. To prove this, one thinks of the counterfactual: the same student writing the same test under the same circumstances but having studied the night before. If one could rewind history and change only one small thing (making the student study for the exam), then causation could be observed (by comparing version 1 to version 2). Because one cannot rewind history and replay events after making small controlled changes, causation can only be inferred, never exactly known. This is referred to as the Fundamental Problem of Causal Inference: it is impossible to directly observe causal effects.
A major goal of scientific experiments and statistical methods is to approximate the counterfactual state of the world as closely as possible. For example, one could run an experiment on identical twins who were known to consistently get the same grades on their tests. One twin is sent to study for six hours while the other is sent to the amusement park. If their test scores suddenly diverged by a large degree, this would be strong evidence that studying (or going to the amusement park) had a causal effect on test scores. In this case, correlation between studying and test scores would almost certainly imply causation.
Well-designed experimental studies replace equality of individuals as in the previous example by equality of groups. The objective is to construct two groups that are similar except for the treatment that the groups receive. This is achieved by selecting subjects from a single population and randomly assigning them to two or more groups. The likelihood of the groups behaving similarly to one another (on average) rises with the number of subjects in each group. If the groups are essentially equivalent except for the treatment they receive, and a difference in the outcome for the groups is observed, then this constitutes evidence that the treatment is responsible for the outcome, or in other words the treatment causes the observed effect. However, an observed effect could also be caused "by chance", for example as a result of random perturbations in the population. Statistical tests exist to quantify the likelihood of erroneously concluding that an observed difference exists when in fact it does not (for example, see p-value).
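The group-based design described above can be sketched as a small simulation. Everything specific here is an assumption made for illustration: the population of 200 hypothetical test scores, the constant `TRUE_EFFECT`, and the choice of a permutation test as the statistical test. Subjects from one population are randomly assigned to a treated or a control group, the difference in group means serves as the effect estimate, and the permutation p-value estimates how often a difference at least that large would arise from random relabeling alone.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_perm, rng):
    """Two-sided permutation p-value for a difference in group means."""
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel subjects at random
        diff = abs(mean(pooled[:len(group_a)]) - mean(pooled[len(group_a):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0


rng = random.Random(1)
TRUE_EFFECT = 10.0  # assumed effect of the treatment on the score

# One population of 200 subjects with hypothetical baseline scores.
baseline = [rng.gauss(70, 10) for _ in range(200)]

# Random assignment: half of the subjects receive the treatment.
treated_ids = set(rng.sample(range(200), 100))
treated = [b + TRUE_EFFECT for i, b in enumerate(baseline) if i in treated_ids]
control = [b for i, b in enumerate(baseline) if i not in treated_ids]

effect_estimate = mean(treated) - mean(control)
p_value = permutation_p_value(treated, control, n_perm=2000, rng=rng)

print(f"estimated effect: {effect_estimate:.2f}")
print(f"permutation p-value: {p_value:.4f}")
```

Because assignment is random, the two groups differ on average only in the treatment, so the difference in means estimates the treatment effect without any need to measure or model confounders.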
Operational definitions of causality
Clive Granger created the first operational definition of causality in 1969. Granger made the definition of probabilistic causality proposed by Norbert Wiener operational as a comparison of variances.
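The idea of a comparison of variances can be illustrated roughly as follows. This is a simplified sketch, not Granger's full procedure: it assumes zero-mean series, a single lag, no intercept, and no formal significance test, and the simulated series and the helper names `simulate` and `variance_ratio` are hypothetical. The residual variance of predicting a series from its own past is compared with the residual variance when the past of the other series is added as a predictor. A ratio well above 1 suggests that the other series carries predictive information.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def simulate(n, coupling, rng):
    """Two zero-mean series; x feeds into y with a one-step delay."""
    x, y = [0.0], [0.0]
    for _ in range(n - 1):
        x_next = 0.5 * x[-1] + rng.gauss(0, 1)
        y_next = 0.5 * y[-1] + coupling * x[-1] + rng.gauss(0, 1)
        x.append(x_next)
        y.append(y_next)
    return x, y


def variance_ratio(cause, effect):
    """Residual sum of squares without the candidate cause, divided by
    the residual sum of squares with it (one lag, no intercept)."""
    target, own, other = effect[1:], effect[:-1], cause[:-1]

    # Restricted model: target ~ a * own
    a = dot(own, target) / dot(own, own)
    rss_restricted = sum((t - a * o) ** 2 for t, o in zip(target, own))

    # Unrestricted model: target ~ a_u * own + c_u * other,
    # solved from the 2x2 normal equations.
    s11, s12, s22 = dot(own, own), dot(own, other), dot(other, other)
    b1, b2 = dot(own, target), dot(other, target)
    det = s11 * s22 - s12 ** 2
    a_u = (b1 * s22 - b2 * s12) / det
    c_u = (b2 * s11 - b1 * s12) / det
    rss_unrestricted = sum((t - a_u * o - c_u * w) ** 2
                           for t, o, w in zip(target, own, other))
    return rss_restricted / rss_unrestricted


rng = random.Random(2)
x, y = simulate(2000, coupling=0.8, rng=rng)

ratio_xy = variance_ratio(x, y)  # does the past of x help predict y?
ratio_yx = variance_ratio(y, x)  # does the past of y help predict x?

print(f"x -> y variance ratio: {ratio_xy:.3f}")
print(f"y -> x variance ratio: {ratio_yx:.3f}")
```

In practice the lag length has to be chosen and the variance reduction is assessed with a statistical test rather than by inspecting the ratio.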
Verification by "truth"
Peter Spirtes, Clark Glymour, and Richard Scheines introduced the idea of explicitly not providing a definition of causality. Spirtes and Glymour introduced the PC algorithm for causal discovery in 1990. Many recent causal discovery algorithms follow the Spirtes-Glymour approach to verification.
Exploratory
Exploratory causal analysis (ECA), also known as "data causality" or "causal discovery", is the use of statistical algorithms to infer associations in observed data sets that are potentially causal under strict assumptions. ECA is a type of causal inference distinct from causal modeling and treatment effects in randomized controlled trials.
It is exploratory research usually preceding more formal causal research, in the same way that exploratory data analysis often precedes statistical hypothesis testing in data analysis.
Computer programs for measuring "Granger causality"
* R package
* Python package
See also
* Causal inference
* Causal model
* Causality
* Causal reasoning
External links
* Causality Workbench team tools and data
* University of Pittsburgh CCD team tools