Baysian Statistics
   HOME

TheInfoList



OR:

Bayesian statistics ( or ) is a theory in the field of
statistics Statistics (from German language, German: ', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a s ...
based on the Bayesian interpretation of probability, where
probability Probability is a branch of mathematics and statistics concerning events and numerical descriptions of how likely they are to occur. The probability of an event is a number between 0 and 1; the larger the probability, the more likely an e ...
expresses a ''degree of belief'' in an
event Event may refer to: Gatherings of people * Ceremony, an event of ritual significance, performed on a special occasion * Convention (meeting), a gathering of individuals engaged in some common interest * Event management, the organization of eve ...
. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other
interpretations of probability The word "probability" has been used in a variety of ways since it was first applied to the mathematical study of games of chance. Does probability measure the real, physical, tendency of something to occur, or is it a measure of how strongly on ...
, such as the
frequentist Frequentist inference is a type of statistical inference based in frequentist probability, which treats “probability” in equivalent terms to “frequency” and draws conclusions from sample-data by means of emphasizing the frequency or pro ...
interpretation, which views probability as the limit of the relative frequency of an event after many trials. More concretely, analysis in Bayesian methods codifies prior knowledge in the form of a
prior distribution A prior probability distribution of an uncertain quantity, simply called the prior, is its assumed probability distribution before some evidence is taken into account. For example, the prior could be the probability distribution representing the ...
. Bayesian statistical methods use
Bayes' theorem Bayes' theorem (alternatively Bayes' law or Bayes' rule, after Thomas Bayes) gives a mathematical rule for inverting Conditional probability, conditional probabilities, allowing one to find the probability of a cause given its effect. For exampl ...
to compute and update probabilities after obtaining new data. Bayes' theorem describes the
conditional probability In probability theory, conditional probability is a measure of the probability of an Event (probability theory), event occurring, given that another event (by assumption, presumption, assertion or evidence) is already known to have occurred. This ...
of an event based on data as well as prior information or beliefs about the event or conditions related to the event. For example, in
Bayesian inference Bayesian inference ( or ) is a method of statistical inference in which Bayes' theorem is used to calculate a probability of a hypothesis, given prior evidence, and update it as more information becomes available. Fundamentally, Bayesian infer ...
, Bayes' theorem can be used to estimate the parameters of a
probability distribution In probability theory and statistics, a probability distribution is a Function (mathematics), function that gives the probabilities of occurrence of possible events for an Experiment (probability theory), experiment. It is a mathematical descri ...
or
statistical model A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of Sample (statistics), sample data (and similar data from a larger Statistical population, population). A statistical model repre ...
. Since Bayesian statistics treats probability as a degree of belief, Bayes' theorem can directly assign a probability distribution that quantifies the belief to the parameter or set of parameters. Bayesian statistics is named after
Thomas Bayes Thomas Bayes ( , ; 7 April 1761) was an English statistician, philosopher and Presbyterian minister who is known for formulating a specific case of the theorem that bears his name: Bayes' theorem. Bayes never published what would become his m ...
, who formulated a specific case of Bayes' theorem in a paper published in 1763. In several papers spanning from the late 18th to the early 19th centuries,
Pierre-Simon Laplace Pierre-Simon, Marquis de Laplace (; ; 23 March 1749 – 5 March 1827) was a French polymath, a scholar whose work has been instrumental in the fields of physics, astronomy, mathematics, engineering, statistics, and philosophy. He summariz ...
developed the Bayesian interpretation of probability. Laplace used methods now considered Bayesian to solve a number of statistical problems. While many Bayesian methods were developed by later authors, the term "Bayesian" was not commonly used to describe these methods until the 1950s. Throughout much of the 20th century, Bayesian methods were viewed unfavorably by many statisticians due to philosophical and practical considerations. Many of these methods required much computation, and most widely used approaches during that time were based on the frequentist interpretation. However, with the advent of powerful computers and new
algorithm In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
s like
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) is a class of algorithms used to draw samples from a probability distribution. Given a probability distribution, one can construct a Markov chain whose elements' distribution approximates it – that ...
, Bayesian methods have gained increasing prominence in statistics in the 21st century.


Bayes's theorem

Bayes's theorem is used in Bayesian methods to update probabilities, which are degrees of belief, after obtaining new data. Given two events A and B, the conditional probability of A given that B is true is expressed as follows: P(A \mid B) = \frac where P(B) \neq 0. Although Bayes's theorem is a fundamental result of
probability theory Probability theory or probability calculus is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expre ...
, it has a specific interpretation in Bayesian statistics. In the above equation, A usually represents a
proposition A proposition is a statement that can be either true or false. It is a central concept in the philosophy of language, semantics, logic, and related fields. Propositions are the object s denoted by declarative sentences; for example, "The sky ...
(such as the statement that a coin lands on heads fifty percent of the time) and B represents the evidence, or new data that is to be taken into account (such as the result of a series of coin flips). P(A) is the
prior probability A prior probability distribution of an uncertain quantity, simply called the prior, is its assumed probability distribution before some evidence is taken into account. For example, the prior could be the probability distribution representing the ...
of A which expresses one's beliefs about A before evidence is taken into account. The prior probability may also quantify prior knowledge or information about A. P(B \mid A) is the
likelihood function A likelihood function (often simply called the likelihood) measures how well a statistical model explains observed data by calculating the probability of seeing that data under different parameter values of the model. It is constructed from the ...
, which can be interpreted as the probability of the evidence B given that A is true. The likelihood quantifies the extent to which the evidence B supports the proposition A. P(A \mid B) is the
posterior probability The posterior probability is a type of conditional probability that results from updating the prior probability with information summarized by the likelihood via an application of Bayes' rule. From an epistemological perspective, the posteri ...
, the probability of the proposition A after taking the evidence B into account. Essentially, Bayes's theorem updates one's prior beliefs P(A) after considering the new evidence B. The probability of the evidence P(B) can be calculated using the
law of total probability In probability theory, the law (or formula) of total probability is a fundamental rule relating marginal probabilities to conditional probabilities. It expresses the total probability of an outcome which can be realized via several distinct ev ...
. If \ is a partition of the
sample space In probability theory, the sample space (also called sample description space, possibility space, or outcome space) of an experiment or random trial is the set of all possible outcomes or results of that experiment. A sample space is usually den ...
, which is the set of all outcomes of an experiment, then, P(B) = P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2) + \dots + P(B \mid A_n)P(A_n) = \sum_i P(B \mid A_i)P(A_i) When there are an infinite number of outcomes, it is necessary to integrate over all outcomes to calculate P(B) using the law of total probability. Often, P(B) is difficult to calculate as the calculation would involve sums or integrals that would be time-consuming to evaluate, so often only the product of the prior and likelihood is considered, since the evidence does not change in the same analysis. The posterior is proportional to this product: P(A \mid B) \propto P(B \mid A)P(A) The
maximum a posteriori An estimation procedure that is often claimed to be part of Bayesian statistics is the maximum a posteriori (MAP) estimate of an unknown quantity, that equals the mode of the posterior density with respect to some reference measure, typically ...
, which is the
mode Mode ( meaning "manner, tune, measure, due measure, rhythm, melody") may refer to: Arts and entertainment * MO''D''E (magazine), a defunct U.S. women's fashion magazine * ''Mode'' magazine, a fictional fashion magazine which is the setting fo ...
of the posterior and is often computed in Bayesian statistics using
mathematical optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfiel ...
methods, remains the same. The posterior can be approximated even without computing the exact value of P(B) with methods such as
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) is a class of algorithms used to draw samples from a probability distribution. Given a probability distribution, one can construct a Markov chain whose elements' distribution approximates it – that ...
or
variational Bayesian methods Variational Bayesian methods are a family of techniques for approximating intractable integrals arising in Bayesian inference and machine learning. They are typically used in complex statistical models consisting of observed variables (usually ...
.


Bayesian methods

The general set of statistical techniques can be divided into a number of activities, many of which have special Bayesian versions.


Bayesian inference

Bayesian inference refers to
statistical inference Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution.Upton, G., Cook, I. (2008) ''Oxford Dictionary of Statistics'', OUP. . Inferential statistical analysis infers properties of ...
where uncertainty in inferences is quantified using probability. In classical
frequentist inference Frequentist inference is a type of statistical inference based in frequentist probability, which treats “probability” in equivalent terms to “frequency” and draws conclusions from sample-data by means of emphasizing the frequency or pr ...
, model
parameter A parameter (), generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element of a system that is useful, or critical, when ...
s and hypotheses are considered to be fixed. Probabilities are not assigned to parameters or hypotheses in frequentist inference. For example, it would not make sense in frequentist inference to directly assign a probability to an event that can only happen once, such as the result of the next flip of a fair coin. However, it would make sense to state that the proportion of heads approaches one-half as the number of coin flips increases.
Statistical models A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of sample data (and similar data from a larger population). A statistical model represents, often in considerably idealized form ...
specify a set of statistical assumptions and processes that represent how the sample data are generated. Statistical models have a number of parameters that can be modified. For example, a coin can be represented as samples from a
Bernoulli distribution In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli, is the discrete probability distribution of a random variable which takes the value 1 with probability p and the value 0 with pro ...
, which models two possible outcomes. The Bernoulli distribution has a single parameter equal to the probability of one outcome, which in most cases is the probability of landing on heads. Devising a good model for the data is central in Bayesian inference. In most cases, models only approximate the true process, and may not take into account certain factors influencing the data. In Bayesian inference, probabilities can be assigned to model parameters. Parameters can be represented as
random variable A random variable (also called random quantity, aleatory variable, or stochastic variable) is a Mathematics, mathematical formalization of a quantity or object which depends on randomness, random events. The term 'random variable' in its mathema ...
s. Bayesian inference uses Bayes' theorem to update probabilities after more evidence is obtained or known. Furthermore, Bayesian methods allow for placing priors on entire models and calculating their posterior probabilities using Bayes' theorem. These posterior probabilities are proportional to the product of the prior and the marginal likelihood, where the marginal likelihood is the integral of the sampling density over the prior distribution of the parameters. In complex models, marginal likelihoods are generally computed numerically.


Statistical modeling

The formulation of
statistical model A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of Sample (statistics), sample data (and similar data from a larger Statistical population, population). A statistical model repre ...
s using Bayesian statistics has the identifying feature of requiring the specification of
prior distribution A prior probability distribution of an uncertain quantity, simply called the prior, is its assumed probability distribution before some evidence is taken into account. For example, the prior could be the probability distribution representing the ...
s for any unknown parameters. Indeed, parameters of prior distributions may themselves have prior distributions, leading to
Bayesian hierarchical modeling Bayesian hierarchical modelling is a statistical model written in multiple levels (hierarchical form) that estimates the parametric model, parameters of the Posterior probability, posterior distribution using the Bayesian inference, Bayesian metho ...
,Hajiramezanali, E. & Dadaneh, S. Z. & Karbalayghareh, A. & Zhou, Z. & Qian, X. Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data. 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montréal, Canada. also known as multi-level modeling. A special case is
Bayesian networks A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their Conditional dependence, conditional dependencies via a directed a ...
. For conducting a Bayesian statistical analysis, best practices are discussed by van de Schoot et al. For reporting the results of a Bayesian statistical analysis, Bayesian analysis reporting guidelines (BARG) are provided in an open-access article by John K. Kruschke.


Design of experiments

The Bayesian design of experiments includes a concept called 'influence of prior beliefs'. This approach uses
sequential analysis In statistics, sequential analysis or sequential hypothesis testing is statistical analysis where the sample size is not fixed in advance. Instead data is evaluated as it is collected, and further sampling is stopped in accordance with a pre-defi ...
techniques to include the outcome of earlier experiments in the design of the next experiment. This is achieved by updating 'beliefs' through the use of prior and
posterior distribution The posterior probability is a type of conditional probability that results from updating the prior probability with information summarized by the likelihood via an application of Bayes' rule. From an epistemological perspective, the posterior ...
. This allows the design of experiments to make good use of resources of all types. An example of this is the multi-armed bandit problem.


Exploratory analysis of Bayesian models

Exploratory analysis of Bayesian models is an adaptation or extension of the
exploratory data analysis In statistics, exploratory data analysis (EDA) is an approach of data analysis, analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model can be used or ...
approach to the needs and peculiarities of Bayesian modeling. In the words of Persi Diaconis: The inference process generates a posterior distribution, which has a central role in Bayesian statistics, together with other distributions like the posterior predictive distribution and the prior predictive distribution. The correct visualization, analysis, and interpretation of these distributions is key to properly answer the questions that motivate the inference process. When working with Bayesian models there are a series of related tasks that need to be addressed besides inference itself: * Diagnoses of the quality of the inference, this is needed when using numerical methods such as
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) is a class of algorithms used to draw samples from a probability distribution. Given a probability distribution, one can construct a Markov chain whose elements' distribution approximates it – that ...
techniques * Model criticism, including evaluations of both model assumptions and model predictions * Comparison of models, including model selection or model averaging * Preparation of the results for a particular audience All these tasks are part of the Exploratory analysis of Bayesian models approach and successfully performing them is central to the iterative and interactive modeling process. These tasks require both numerical and visual summaries.


See also

*
Bayesian epistemology Bayesian epistemology is a formal approach to various topics in epistemology that has its roots in Thomas Bayes' work in the field of probability theory. One advantage of its formal method in contrast to traditional epistemology is that its concep ...
* For a list of mathematical logic notation used in this article **
Notation in probability and statistics Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols. Probability theory * Random variables are usually written in upper case Roman letters, such as X or Y ...
**
List of logic symbols In logic, a set of symbols is commonly used to express logical representation. The following table lists many common symbols, together with their name, how they should be read out loud, and the related field of mathematics. Additionally, the sub ...


References


Further reading

* * * * * * *''Johnson, Alicia A.; Ott, Miles Q.; Dogucu, Mine. (2022)'
Bayes Rules! An Introduction to Applied Bayesian Modeling
Chapman and Hall ISBN 9780367255398


External links

* *
Bayesian statistics
David Spiegelhalter Sir David John Spiegelhalter (born 16 August 1953) is a British statistician and a Fellow of Churchill College, Cambridge. From 2007 to 2018 he was Winton Professorship of the Public Understanding of Risk, Winton Professor of the Public Under ...
, Kenneth Rice
Scholarpedia ''Scholarpedia'' is an English-language wiki-based online encyclopedia with features commonly associated with Open access (publishing), open-access online academic journals, which aims to have quality content in science and medicine. ''Scholarpe ...
4(8):5230. doi:10.4249/scholarpedia.5230
Bayesian modeling book
and examples available for downloading. *
Bayesian A/B Testing Calculator
Dynamic Yield {{Authority control