The nested sampling algorithm is a computational approach to the Bayesian statistics problems of comparing models and generating samples from posterior distributions. It was developed in 2004 by physicist John Skilling.
Background
Bayes' theorem can be applied to a pair of competing models $M_1$ and $M_2$ for data $D$, one of which may be true (though which one is unknown) but which both cannot be true simultaneously. The posterior probability for $M_1$ may be calculated as

$$P(M_1 \mid D) = \frac{P(D \mid M_1)\, P(M_1)}{P(D \mid M_1)\, P(M_1) + P(D \mid M_2)\, P(M_2)}.$$
The prior probabilities $P(M_1)$ and $P(M_2)$ are already known, as they are chosen by the researcher ahead of time. However, the remaining Bayes factor $P(D \mid M_2) / P(D \mid M_1)$ is not so easy to evaluate, since in general it requires marginalizing over nuisance parameters. Generally, $M_1$ has a set of parameters that can be grouped together and called $\theta$, and $M_2$ has its own vector of parameters that may be of different dimensionality but is still termed $\theta$. The marginalization for $M_1$ is

$$P(D \mid M_1) = \int P(D \mid \theta, M_1)\, P(\theta \mid M_1)\, d\theta,$$

and likewise for $M_2$. This integral is often analytically intractable, and in these cases it is necessary to employ a numerical algorithm to find an approximation. The nested sampling algorithm was developed by John Skilling specifically to approximate these marginalization integrals, and it has the added benefit of generating samples from the posterior distribution $P(\theta \mid D, M_1)$.
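For intuition, the marginalization integral can be evaluated directly when the model is low-dimensional. The sketch below uses a hypothetical one-parameter model (a single datum with Gaussian likelihood and a standard normal prior; these choices are illustrative, not from the text) and approximates the evidence by brute-force quadrature, checking it against the known closed form:

```python
import numpy as np

# Hypothetical one-parameter model (not from the text): datum d = 1.2 with
# Gaussian likelihood P(D | theta, M1) = N(d; theta, 1) and standard normal
# prior P(theta | M1) = N(theta; 0, 1).
d = 1.2

def likelihood(theta):
    return np.exp(-0.5 * (d - theta) ** 2) / np.sqrt(2 * np.pi)

def prior(theta):
    return np.exp(-0.5 * theta ** 2) / np.sqrt(2 * np.pi)

# Marginalization: P(D | M1) = integral of likelihood * prior d(theta),
# done here on a dense grid (feasible only in low dimensions; in many
# dimensions this is exactly the intractable integral discussed above).
theta, dtheta = np.linspace(-10, 10, 200001, retstep=True)
evidence = np.sum(likelihood(theta) * prior(theta)) * dtheta

# For this Gaussian-Gaussian pair the integral has a closed form:
# P(D | M1) = N(d; 0, 2), since the prior and noise variances add.
exact = np.exp(-0.25 * d ** 2) / np.sqrt(4 * np.pi)
print(evidence, exact)  # the two agree closely
```

In higher dimensions the grid grows exponentially, which is why sampling-based approximations such as nested sampling are needed.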
It is an alternative to methods from the Bayesian literature
such as bridge sampling and defensive importance sampling.
Here is a simple version of the nested sampling algorithm, followed by a description of how it computes the marginal probability density $Z = P(D \mid M)$ where $M$ is $M_1$ or $M_2$:
    Start with N points θ_1, ..., θ_N sampled from the prior;
    Z := 0;  X_0 := 1;
    for i = 1 to j do        % The number of iterations j is chosen by guesswork.
        L_i := min(current likelihood values of the points);
        X_i := exp(-i / N);
        w_i := X_{i-1} - X_i;
        Z := Z + L_i * w_i;
        Save the point with least likelihood as a sample point with weight w_i;
        Update the point with least likelihood with some Markov chain Monte Carlo
          steps according to the prior, accepting only steps that keep the
          likelihood above L_i;
    end
    return Z;
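The pseudocode above can be sketched in a few dozen lines. The problem setup below (uniform prior on [-5, 5], Gaussian likelihood) and the random-walk MCMC update with a step size taken from the spread of the live points are illustrative assumptions, not part of the algorithm's specification:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem (illustrative choices, not from the text): uniform prior on
# [-5, 5] and Gaussian likelihood L(theta) = exp(-theta^2 / 2), so the
# evidence is Z = (1/10) * integral exp(-theta^2/2) dtheta ~= sqrt(2*pi)/10.
def loglike(theta):
    return -0.5 * theta ** 2

N = 400       # number of live points
j = 4000      # iterations, "chosen by guesswork" as in the pseudocode

points = rng.uniform(-5.0, 5.0, size=N)   # N points sampled from the prior
logL = loglike(points)
Z = 0.0
X_prev = 1.0                              # X_0 = 1: all prior mass enclosed

for i in range(1, j + 1):
    worst = int(np.argmin(logL))          # index of least-likely live point
    L_i = np.exp(logL[worst])
    X_i = np.exp(-i / N)                  # estimated enclosed prior mass
    w_i = X_prev - X_i                    # mass between successive contours
    Z += L_i * w_i
    X_prev = X_i

    # Replace the worst point: start a random-walk MCMC from another live
    # point, accepting only prior-supported moves that keep the likelihood
    # above L_i. Setting the step from the live-point spread is a common
    # heuristic, not mandated by the algorithm.
    threshold = logL[worst]
    start = int(rng.integers(N))
    if start == worst:
        start = (start + 1) % N
    theta = points[start]
    step = points.std() + 1e-12
    for _ in range(20):
        prop = theta + rng.normal(0.0, step)
        if -5.0 <= prop <= 5.0 and loglike(prop) > threshold:
            theta = prop
    points[worst] = theta
    logL[worst] = loglike(theta)

print(Z)  # should land near sqrt(2*pi)/10 ~= 0.251
```

With 400 live points the estimate typically comes out within a few percent of the exact value; the statistical uncertainty shrinks as N grows.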
At each iteration, $X_i$ is an estimate of the amount of prior mass covered by the hypervolume in parameter space of all points with likelihood greater than $L_i$. The weight factor $w_i = X_{i-1} - X_i$ is an estimate of the amount of prior mass that lies between the two nested hypersurfaces $\{\theta : P(D \mid \theta, M) = L_{i-1}\}$ and $\{\theta : P(D \mid \theta, M) = L_i\}$. The update step $Z := Z + L_i \cdot w_i$ computes the sum over $i$ of $L_i \cdot w_i$ to numerically approximate the integral

$$Z = P(D \mid M) = \int P(D \mid \theta, M)\, P(\theta \mid M)\, d\theta.$$
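The link between the multidimensional integral and the one-dimensional sum over $L_i w_i$ comes from a change of variables to enclosed prior mass; a short derivation of this standard identity, not spelled out above:

```latex
% X(\lambda): prior mass enclosed by the likelihood contour at level \lambda
X(\lambda) = \int_{P(D \mid \theta, M) > \lambda} P(\theta \mid M)\, d\theta,
\qquad X(0) = 1, \quad X(L_{\max}) = 0.

% X decreases monotonically in \lambda, so writing L(X) for its inverse,
Z = \int P(D \mid \theta, M)\, P(\theta \mid M)\, d\theta
  = \int_0^1 L(X)\, dX
  \approx \sum_i L_i \, (X_{i-1} - X_i)
  = \sum_i L_i \, w_i .
```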
In the limit $j \to \infty$, this estimator has a positive bias of order $1/N$, which can be removed by using $\tfrac{1}{2}(X_{i-1} - X_{i+1})$ instead of the $w_i$ in the above algorithm.
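The effect of the two weightings can be seen with a deterministic check: fix the shrinkage at its expected schedule $X_i = \exp(-i/N)$ and integrate an assumed decreasing curve $L(X)$ (the choice $e^{-10X}$ below is arbitrary) with both sets of weights:

```python
import numpy as np

def L(X):
    # assumed likelihood-versus-prior-mass curve; any smooth decreasing
    # function would do for this illustration
    return np.exp(-10 * X)

exact = (1 - np.exp(-10)) / 10.0   # integral of L(X) over [0, 1]

def estimates(N, j):
    i = np.arange(0, j + 1)
    X = np.exp(-i / N)             # deterministic shrinkage schedule X_i
    Li = L(X)
    # simple weights w_i = X_{i-1} - X_i: an "upper sum" for decreasing L,
    # hence biased high, with error of order 1/N
    simple = np.sum(Li[1:] * (X[:-1] - X[1:]))
    # centered weights (X_{i-1} - X_{i+1}) / 2: the leading bias cancels
    centered = np.sum(Li[1:-1] * (X[:-2] - X[2:]) / 2)
    return simple, centered

simple, centered = estimates(N=100, j=2000)
print(simple - exact, centered - exact)  # the first error dwarfs the second
```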
The idea is to subdivide the range of $f(\theta) = P(D \mid \theta, M)$ and estimate, for each interval