Bayesian probability is an interpretation of the concept of probability in which, instead of the frequency or propensity of some phenomenon, probability is interpreted as reasonable expectation representing a state of knowledge or as quantification of a personal belief.

The Bayesian interpretation of probability can be seen as an extension of propositional logic that enables reasoning with hypotheses; that is, with propositions whose truth or falsity is unknown. In the Bayesian view, a probability is assigned to a hypothesis, whereas under frequentist inference, a hypothesis is typically tested without being assigned a probability.

Bayesian probability belongs to the category of evidential probabilities; to evaluate the probability of a hypothesis, the Bayesian probabilist specifies a prior probability. This, in turn, is then updated to a posterior probability in the light of new, relevant data (evidence). The Bayesian interpretation provides a standard set of procedures and formulae to perform this calculation.

The term ''Bayesian'' derives from the 18th-century mathematician and theologian Thomas Bayes, who provided the first mathematical treatment of a non-trivial problem of statistical data analysis using what is now known as Bayesian inference. Mathematician Pierre-Simon Laplace pioneered and popularized what is now called Bayesian probability.
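
The update from prior to posterior described above is usually written as Bayes' theorem. The statement below is a minimal sketch: the symbols H (hypothesis) and E (evidence) are chosen here for illustration, and the numbers in the worked line are made up.

```latex
% H = hypothesis, E = evidence (symbols chosen for this illustration)
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)},
\qquad
P(E) = P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)

% Worked example with made-up numbers:
% prior P(H) = 0.01, P(E | H) = 0.9, P(E | not H) = 0.05
P(H \mid E) = \frac{0.9 \times 0.01}{0.9 \times 0.01 + 0.05 \times 0.99}
            = \frac{0.009}{0.0585} \approx 0.154
```

Here P(H) plays the role of the prior probability, P(E | H) is the likelihood of the evidence under the hypothesis, and P(H | E) is the posterior probability.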


Bayesian methodology

Bayesian methods are characterized by concepts and procedures as follows:
* The use of random variables, or more generally unknown quantities, to model all sources of uncertainty in statistical models, including uncertainty resulting from lack of information (see also aleatoric and epistemic uncertainty).
* The need to determine the prior probability distribution taking into account the available (prior) information.
* The sequential use of Bayes' theorem: as more data become available, calculate the posterior distribution using Bayes' theorem; subsequently, the posterior distribution becomes the next prior.
* While for the frequentist a hypothesis is a proposition (which must be either true or false), so that the frequentist probability of a hypothesis is either 0 or 1, in Bayesian statistics the probability assigned to a hypothesis can also lie in the range from 0 to 1 if the truth value is uncertain.
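
The sequential use of Bayes' theorem listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the three candidate coin biases, the uniform starting prior, the toss data, and the helper name `update` are all hypothetical. Exact fractions are used so that the arithmetic is easy to check by hand.

```python
from fractions import Fraction


def update(prior, heads):
    """One application of Bayes' theorem over a discrete set of hypotheses.

    prior: dict mapping a hypothesised P(heads) to its current probability.
    heads: True if the observed toss came up heads, False otherwise.
    """
    # Prior times likelihood of the observation under each hypothesis.
    unnormalised = {
        bias: p * (bias if heads else 1 - bias)
        for bias, p in prior.items()
    }
    evidence = sum(unnormalised.values())  # P(observation)
    return {bias: w / evidence for bias, w in unnormalised.items()}


# Three hypothetical hypotheses about a coin, equally credible at the start.
biases = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
belief = {b: Fraction(1, 3) for b in biases}

tosses = [True, True, False, True]  # made-up data: H, H, T, H
for toss in tosses:
    belief = update(belief, toss)  # the posterior becomes the next prior

for bias, p in belief.items():
    print(f"P(bias = {bias} | data) = {p} (about {float(p):.3f})")
```

Each hypothesis ends up with a probability strictly between 0 and 1, which illustrates the last point in the list: the truth value of each hypothesis remains uncertain, and the data only shift how credible it is.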


Objective and subjective Bayesian probabilities

Broadly speaking, there are two interpretations of Bayesian probability. For objectivists, who interpret probability as an extension of logic, ''probability'' quantifies the reasonable expectation that everyone (even a "robot") who shares the same knowledge should share in accordance with the rules of Bayesian statistics, which can be justified by Cox's theorem. For subjectivists, ''probability'' corresponds to a personal belief. Rationality and coherence allow for substantial variation within the constraints they pose; the constraints are justified by the Dutch book argument or by decision theory and de Finetti's theorem. The objective and subjective variants of Bayesian probability differ mainly in their interpretation and construction of the prior probability.


History

The term ''Bayesian'' derives from Thomas Bayes (1702–1761), who proved a special case of what is now called Bayes' theorem in a paper titled "An Essay Towards Solving a Problem in the Doctrine of Chances". In that special case, the prior and posterior distributions were beta distributions and the data came from Bernoulli trials. It was Pierre-Simon Laplace (1749–1827) who introduced a general version of the theorem and used it to approach problems in celestial mechanics, medical statistics, reliability, and jurisprudence. Early Bayesian inference, which used uniform priors following Laplace's principle of insufficient reason, was called "inverse probability" (because it infers backwards from observations to parameters, or from effects to causes). After the 1920s, "inverse probability" was largely supplanted by a collection of methods that came to be called frequentist statistics.

In the 20th century, the ideas of Laplace developed in two directions, giving rise to ''objective'' and ''subjective'' currents in Bayesian practice. Harold Jeffreys' ''Theory of Probability'' (first published in 1939) played an important role in the revival of the Bayesian view of probability, followed by works by Abraham Wald (1950) and Leonard J. Savage (1954). The adjective ''Bayesian'' itself dates to the 1950s; the derived terms ''Bayesianism'' and ''neo-Bayesianism'' are of 1960s coinage. In the objectivist stream, the statistical analysis depends only on the model assumed and the data analysed. No subjective decisions need to be involved. In contrast, "subjectivist" statisticians deny the possibility of fully objective analysis for the general case.

In the 1980s, there was a dramatic growth in research and applications of Bayesian methods, mostly attributed to the discovery of Markov chain Monte Carlo methods and the consequent removal of many of the computational problems, and to an increasing interest in nonstandard, complex applications. While frequentist statistics remains strong (as demonstrated by the fact that much of undergraduate teaching is based on it), Bayesian methods are widely accepted and used, e.g., in the field of machine learning.
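
The special case mentioned at the start of this section, with beta-distributed prior and posterior and data from Bernoulli trials, can be written out as follows. The symbols (θ for the unknown success probability, a and b for the prior parameters, s successes in n trials) are notation chosen here for illustration and are not taken from Bayes' essay.

```latex
% Prior: \theta \sim \mathrm{Beta}(a, b); data: s successes in n Bernoulli trials
p(\theta \mid s, n)
  \;\propto\; \underbrace{\theta^{s}(1-\theta)^{n-s}}_{\text{likelihood}}
  \;\underbrace{\theta^{a-1}(1-\theta)^{b-1}}_{\text{prior}}
  \;=\; \theta^{a+s-1}(1-\theta)^{b+n-s-1}

% Hence the posterior is again a beta distribution:
\theta \mid s, n \;\sim\; \mathrm{Beta}(a + s,\; b + n - s)

% A uniform prior corresponds to a = b = 1, giving the posterior mean
\mathbb{E}[\theta \mid s, n] = \frac{s + 1}{n + 2}
```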


Justification

The use of Bayesian probabilities as the basis of Bayesian inference has been supported by several arguments, such as the Cox axioms, the Dutch book argument, arguments based on decision theory, and de Finetti's theorem.


Axiomatic approach

Richard T. Cox showed that Bayesian updating follows from several axioms, including two functional equations and a hypothesis of differentiability. The assumption of differentiability or even continuity is controversial; Halpern found a counterexample based on his observation that the Boolean algebra of statements may be finite. Other axiomatizations have been suggested by various authors with the purpose of making the theory more rigorous.


Dutch book approach

Bruno de Finetti proposed the Dutch book argument based on betting. A clever bookmaker makes a Dutch book by setting the odds and bets to ensure that the bookmaker profits, at the expense of the gamblers, regardless of the outcome of the event (a horse race, for example) on which the gamblers bet. Such a book is associated with the probabilities implied by the odds not being coherent.

However, Ian Hacking noted that traditional Dutch book arguments did not specify Bayesian updating: they left open the possibility that non-Bayesian updating rules could avoid Dutch books. For example, Hacking writes "And neither the Dutch book argument, nor any other in the personalist arsenal of proofs of the probability axioms, entails the dynamic assumption. Not one entails Bayesianism. So the personalist requires the dynamic assumption to be Bayesian. It is true that in consistency a personalist could abandon the Bayesian model of learning from experience. Salt could lose its savour."

In fact, there are non-Bayesian updating rules that also avoid Dutch books (as discussed in the literature on "probability kinematics" following the publication of Richard C. Jeffrey's rule, which is itself regarded as Bayesian). The additional hypotheses sufficient to (uniquely) specify Bayesian updating are substantial and not universally seen as satisfactory.
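
A simplified numerical sketch of a Dutch book may help. It assumes a hypothetical three-horse race in which exactly one horse wins, and a bookmaker who sells one ticket per horse, each paying a fixed amount if that horse wins. The function name `bookmaker_profit`, the ticket prices, and the payout are all invented for illustration; the sketch shows only the static incoherence case, not the updating question raised by Hacking.

```python
from fractions import Fraction


def bookmaker_profit(prices, payout=100):
    """Bookmaker's profit for each possible winner.

    prices: dict mapping each outcome to the price of its ticket, expressed
        as a fraction of the payout (the probability implied by the odds).
    payout: amount paid to the holder of the winning ticket.

    Assumes exactly one outcome occurs and one ticket is sold per outcome.
    """
    income = sum(q * payout for q in prices.values())
    # Whichever outcome wins, exactly one ticket pays out.
    return {winner: income - payout for winner in prices}


# Implied probabilities sum to 23/20 (more than 1): incoherent.
incoherent = {"A": Fraction(1, 2), "B": Fraction(2, 5), "C": Fraction(1, 4)}
# Implied probabilities sum to exactly 1: coherent.
coherent = {"A": Fraction(1, 2), "B": Fraction(3, 10), "C": Fraction(1, 5)}

print(bookmaker_profit(incoherent))  # same positive profit for every winner
print(bookmaker_profit(coherent))    # no guaranteed profit
```

With the incoherent prices the bookmaker's profit is the same positive amount whichever horse wins, which is the defining feature of a Dutch book; with prices that sum to one, the guaranteed profit disappears.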


Decision theory approach

A decision-theoretic justification of the use of Bayesian inference (and hence of Bayesian probabilities) was given by Abraham Wald, who proved that every admissible statistical procedure is either a Bayesian procedure or a limit of Bayesian procedures. Conversely, every Bayesian procedure is admissible.


Personal probabilities and objective methods for constructing priors

Following the work on expected utility theory of Ramsey and von Neumann, decision-theorists have accounted for rational behavior using a probability distribution for the agent. Johann Pfanzagl completed the ''Theory of Games and Economic Behavior'' by providing an axiomatization of subjective probability and utility, a task left uncompleted by von Neumann and Oskar Morgenstern: their original theory supposed that all the agents had the same probability distribution, as a convenience. Pfanzagl's axiomatization was endorsed by Oskar Morgenstern: "Von Neumann and I have anticipated ... [the question whether probabilities] might, perhaps more typically, be subjective and have stated specifically that in the latter case axioms could be found from which could derive the desired numerical utility together with a number for the probabilities (cf. p. 19 of The Theory of Games and Economic Behavior). We did not carry this out; it was demonstrated by Pfanzagl ... with all the necessary rigor".

Ramsey and Savage noted that the individual agent's probability distribution could be objectively studied in experiments. Procedures for testing hypotheses about probabilities (using finite samples) are due to Ramsey (1931) and de Finetti (1931, 1937, 1964, 1970). Both Bruno de Finetti and Frank P. Ramsey acknowledge their debts to pragmatic philosophy, particularly (for Ramsey) to Charles S. Peirce.

The "Ramsey test" for evaluating probability distributions is implementable in theory, and has kept experimental psychologists occupied for a half century. This work demonstrates that Bayesian-probability propositions can be falsified, and so meet an empirical criterion of Charles S. Peirce, whose work inspired Ramsey. (This falsifiability criterion was popularized by Karl Popper.)

Modern work on the experimental evaluation of personal probabilities uses the randomization, blinding, and Boolean-decision procedures of the Peirce-Jastrow experiment (Peirce & Jastrow, 1885). Since individuals act according to different probability judgments, these agents' probabilities are "personal" (but amenable to objective study).

Personal probabilities are problematic for science and for some applications where decision-makers lack the knowledge or time to specify an informed probability distribution (on which they are prepared to act). To meet the needs of science and of human limitations, Bayesian statisticians have developed "objective" methods for specifying prior probabilities.

Indeed, some Bayesians have argued the prior state of knowledge defines ''the'' (unique) prior probability distribution for "regular" statistical problems; cf. well-posed problems. Finding the right method for constructing such "objective" priors (for appropriate classes of regular problems) has been the quest of statistical theorists from Laplace to John Maynard Keynes, Harold Jeffreys, and Edwin Thompson Jaynes. These theorists and their successors have suggested several methods for constructing "objective" priors (unfortunately, it is not always clear how to assess the relative "objectivity" of the priors proposed under these methods):
* Maximum entropy
* Transformation group analysis
* Reference analysis
Each of these methods contributes useful priors for "regular" one-parameter problems, and each prior can handle some challenging statistical models (with "irregularity" or several parameters). Each of these methods has been useful in Bayesian practice. Indeed, methods for constructing "objective" (alternatively, "default" or "ignorance") priors have been developed by avowed subjective (or "personal") Bayesians like James Berger (Duke University) and José-Miguel Bernardo (Universitat de València), simply because such priors are needed for Bayesian practice, particularly in science. The quest for "the universal method for constructing priors" continues to attract statistical theorists.

Thus, the Bayesian statistician needs either to use informed priors (using relevant expertise or previous data) or to choose among the competing methods for constructing "objective" priors.
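
As an illustration of the first method listed above, maximum entropy, the following Python sketch constructs a prior over the faces of a die when the only available information is its mean. It relies on the standard result that, on a finite set with a mean constraint, the maximum-entropy distribution has probabilities proportional to exp(λ · face). The function name `maxent_die`, the bisection bounds, and the target mean of 4.5 are assumptions made for this example.

```python
import math


def maxent_die(target_mean, faces=(1, 2, 3, 4, 5, 6), tol=1e-12):
    """Maximum-entropy distribution on `faces` with a prescribed mean.

    The solution has the form p_i proportional to exp(lam * face_i); `lam`
    is found by bisection so that the mean matches `target_mean`, which must
    lie strictly between the smallest and largest face.
    """
    def distribution(lam):
        weights = [math.exp(lam * f) for f in faces]
        total = sum(weights)
        return [w / total for w in weights]

    def mean(lam):
        return sum(f * p for f, p in zip(faces, distribution(lam)))

    lo, hi = -50.0, 50.0  # the mean increases monotonically with lam
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return distribution((lo + hi) / 2)


prior = maxent_die(4.5)
for face, p in zip(range(1, 7), prior):
    print(f"P(face = {face}) = {p:.4f}")
```

When the target mean is 3.5 the constraint carries no information beyond symmetry, and the procedure returns (numerically) the uniform distribution, which matches the uniform priors of the principle of insufficient reason mentioned in the History section.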


See also

* ''An Essay Towards Solving a Problem in the Doctrine of Chances''
* Bayesian epistemology
* Bertrand paradox: a paradox in classical probability
* Credal network
* Credence (statistics)
* De Finetti's game: a procedure for evaluating someone's subjective probability
* Evidence under Bayes' theorem
* Monty Hall problem
* QBism: an interpretation of quantum mechanics based on subjective Bayesian probability
* Reference class problem


Bibliography

* Goertz, Gary and James Mahoney. 2012. ''A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences''. Princeton University Press.