The nested sampling algorithm is a
computation
Computation is any type of arithmetic or non-arithmetic calculation that follows a well-defined model (e.g., an algorithm).
Mechanical or electronic devices (or, historically, people) that perform computations are known as ''computers''. An es ...
al approach to the
Bayesian statistics problems of comparing models and generating samples from posterior distributions. It was developed in 2004 by
physicist
A physicist is a scientist who specializes in the field of physics, which encompasses the interactions of matter and energy at all length and time scales in the physical universe.
Physicists generally are interested in the root or ultimate caus ...
John Skilling.
Background
Bayes' theorem
In probability theory and statistics, Bayes' theorem (alternatively Bayes' law or Bayes' rule), named after Thomas Bayes, describes the probability of an event, based on prior knowledge of conditions that might be related to the event. For examp ...
can be applied to a pair of competing models
and
for data
, one of which may be true (though which one is unknown) but which both cannot be true simultaneously. The posterior probability for
may be calculated as:
:
The prior probabilities
and
are already known, as they are chosen by the researcher ahead of time. However, the remaining
Bayes factor is not so easy to evaluate, since in general it requires marginalizing nuisance parameters. Generally,
has a set of parameters that can be grouped together and called
, and
has its own vector of parameters that may be of different dimensionality, but is still termed
. The marginalization for
is
:
and likewise for
. This integral is often analytically intractable, and in these cases it is necessary to employ a numerical algorithm to find an approximation. The nested sampling algorithm was developed by John Skilling specifically to approximate these marginalization integrals, and it has the added benefit of generating samples from the posterior distribution
.
It is an alternative to methods from the Bayesian literature
such as bridge sampling and defensive importance sampling.
Here is a simple version of the nested sampling algorithm, followed by a description of how it computes the marginal probability density
where
is
or
:
Start with
points
sampled from prior.
for
to
do % The number of iterations j is chosen by guesswork.
current likelihood values of the points
;
Save the point with least likelihood as a sample point with weight
.
Update the point with least likelihood with some
Markov chain Monte Carlo
In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain ...
steps according to the prior, accepting only steps that
keep the likelihood above
.
end
return
;
At each iteration,
is an estimate of the amount of prior mass covered by the hypervolume in parameter space of all points with likelihood greater than
. The weight factor
is an estimate of the amount of prior mass that lies between two nested hypersurfaces
and
. The update step
computes the sum over
of
to numerically approximate the integral
:
In the limit
, this estimator has a positive bias of order
which can be removed by using
instead of the
in the above algorithm.
The idea is to subdivide the range of
and estimate, for each interval
Lebesgue integration
In mathematics, the integral of a non-negative function of a single variable can be regarded, in the simplest case, as the area between the graph of that function and the -axis. The Lebesgue integral, named after French mathematician Henri Lebe ...
.
Implementations
Example implementations demonstrating the nested sampling algorithm are publicly available for download, written in several
programming language
A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language.
The description of a programming ...
s.
* Simple examples in
C,
R, or
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (pro ...
are on John Skilling's website.
* A
Haskell port of the above simple codes is on Hackage.
* An example in
R originally designed for
fitting spectra is described at and is on GitHub.
* An example in
C++, named Diamonds, is on GitHub.
* A highly modular
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (pro ...
parallel example for
statistical physics and
condensed matter physics
Condensed matter physics is the field of physics that deals with the macroscopic and microscopic physical properties of matter, especially the solid and liquid phases which arise from electromagnetic forces between atoms. More generally, the sub ...
uses is on GitHub.
* pymatnest is a
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (pro ...
package designed for exploring the
energy landscape
An energy landscape is a mapping of possible states of a system. The concept is frequently used in physics, chemistry, and biochemistry, e.g. to describe all possible conformations of a molecular entity, or the spatial positions of interacting mole ...
of different materials, calculating thermodynamic variables at arbitrary temperatures and locating
phase transitions is on GitHub.
* The MultiNest software package is capable of performing nested sampling on multi-modal posterior distributions.
It has interfaces for C++, Fortran and Python inputs, and is available on GitHub.
* PolyChord is another nested sampling software package available on GitHub. PolyChord's computational efficiency scales better with an increase in the number of parameters than MultiNest, meaning PolyChord can be more efficient for high dimensional problems.
* NestedSamplers.jl a
Julia package for implementing single- and multi-ellipsoidal nested sampling algorithms is on GitHub.
Koraliis a high-performance framework for uncertainty quantification, optimization, and deep reinforcement learning, which also implements nested sampling.
Applications
Since nested sampling was proposed in 2004, it has been used in many aspects of the field of
astronomy
Astronomy () is a natural science that studies astronomical object, celestial objects and phenomena. It uses mathematics, physics, and chemistry in order to explain their origin and chronology of the Universe, evolution. Objects of interest ...
. One paper suggested using nested sampling for
cosmological model selection and object detection, as it "uniquely combines accuracy, general applicability and computational feasibility."
A refinement of the algorithm to handle multimodal posteriors has been suggested as a means to detect astronomical objects in extant datasets.
Other applications of nested sampling are in the field of
finite element updating Finite element model updating is the process of ensuring that finite element analysis results in models that better reflect the measured data than the initial models. It is part of verification and validation of numerical models.
The process
The ...
where the algorithm is used to choose an optimal
finite element model, and this was applied to
structural dynamics. This sampling method has also been used in the field of materials modeling. It can be used to learn the
partition function from
statistical mechanics
In physics, statistical mechanics is a mathematical framework that applies statistical methods and probability theory to large assemblies of microscopic entities. It does not assume or postulate any natural laws, but explains the macroscopic be ...
and derive
thermodynamic
Thermodynamics is a branch of physics that deals with heat, work, and temperature, and their relation to energy, entropy, and the physical properties of matter and radiation. The behavior of these quantities is governed by the four laws of ther ...
properties.
Dynamic nested sampling
Dynamic nested sampling is a generalisation of the nested sampling algorithm in which the number of samples taken in different regions of the parameter space is dynamically adjusted to maximise calculation accuracy. This can lead to large improvements in accuracy and computational efficiency when compared to the original nested sampling algorithm, in which the allocation of samples cannot be changed and often many samples are taken in regions which have little effect on calculation accuracy.
Publicly available dynamic nested sampling software packages include:
* - a Python implementation of dynamic nested sampling which can be downloaded from GitHub.
* dyPolyChord: a software package which can be used with Python, C++ and Fortran likelihood and prior distributions. dyPolyChord is available on GitHub.
Dynamic nested sampling has been applied to a variety of scientific problems, including analysis of gravitational waves, mapping distances in space and exoplanet detection.
See also
*
Bayesian model comparison
The Bayes factor is a ratio of two competing statistical models represented by their marginal likelihood, and is used to quantify the support for one model over the other. The models in questions can have a common set of parameters, such as a nul ...
*
List of algorithms
References
{{Reflist
Bayesian statistics
Model selection
Randomized algorithms