A computer experiment or simulation experiment is an experiment used to study a computer simulation, also referred to as an in silico system. This area includes computational physics, computational chemistry, computational biology and other similar disciplines.
Background
Computer simulations are constructed to emulate a physical system. Because these are meant to replicate some aspect of a system in detail, they often do not yield an analytic solution. Therefore, methods such as discrete event simulation or finite element solvers are used. A computer model is used to make inferences about the system it replicates. For example, climate models are often used because experimentation on an Earth-sized object is impossible.
Objectives
Computer experiments have been employed with many purposes in mind. Some of those include:
* Uncertainty quantification: Characterize the uncertainty present in a computer simulation arising from unknowns during the computer simulation's construction.
* Inverse problems: Discover the underlying properties of the system from the physical data.
* Bias correction: Use physical data to correct for bias in the simulation.
* Data assimilation: Combine multiple simulations and physical data sources into a complete predictive model.
* Systems design: Find inputs that result in optimal system performance measures.
Computer simulation modeling
Modeling of computer experiments typically uses a Bayesian framework. Bayesian statistics is an interpretation of the field of statistics where all evidence about the true state of the world is explicitly expressed in the form of probabilities. In the realm of computer experiments, the Bayesian interpretation would imply we must form a prior distribution that represents our prior belief about the structure of the computer model. The use of this philosophy for computer experiments started in the 1980s and is nicely summarized by Sacks et al. (1989). While the Bayesian approach is widely used, frequentist approaches have also been discussed recently.
The basic idea of this framework is to model the computer simulation as an unknown function of a set of inputs. The computer simulation is implemented as a piece of computer code that can be evaluated to produce a collection of outputs. Examples of inputs to these simulations are coefficients in the underlying model, initial conditions and forcing functions. It is natural to see the simulation as a deterministic function that maps these ''inputs'' into a collection of ''outputs''. On the basis of seeing our simulator this way, it is common to refer to the collection of inputs as ''x'', the computer simulation itself as ''f'', and the resulting output as ''f''(''x''). Both ''x'' and ''f''(''x'') are vector quantities, and they can be very large collections of values, often indexed by space, or by time, or by both space and time.
Although ''f'' is known in principle, in practice this is not the case. Many simulators comprise tens of thousands of lines of high-level computer code, which is not accessible to intuition. For some simulations, such as climate models, evaluation of the output for a single set of inputs can require millions of computer hours.
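As an illustration of this view, a simulator can be written as a deterministic function that takes a vector of inputs (a coefficient, an initial condition and a forcing amplitude) and returns an output vector indexed by time. The sketch below is a toy stand-in, not a real computer code; the parameterization and dynamics are purely illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np

def simulator(x, n_steps=100, dt=0.1):
    """Toy deterministic simulator: the same inputs always yield the same outputs.

    x = (decay coefficient, initial condition, forcing amplitude) -- an
    illustrative parameterization, not a specific physical model.
    """
    decay, y0, forcing = x
    y = np.empty(n_steps)
    y[0] = y0
    for t in range(1, n_steps):
        # simple forced exponential-decay dynamics integrated with Euler steps
        y[t] = y[t - 1] + dt * (-decay * y[t - 1] + forcing * np.sin(dt * t))
    return y  # the output vector f(x), indexed by time

# evaluating f at a single input configuration x
output = simulator(np.array([0.5, 1.0, 0.3]))
</syntaxhighlight>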
Gaussian process prior
The typical model for a computer code output is a Gaussian process. For notational simplicity, assume ''f''(''x'') is a scalar. Owing to the Bayesian framework, we fix our belief that the function ''f'' follows a Gaussian process,
''f'' ~ GP(''m''(·), ''C''(·,·)),
where ''m'' is the mean function and ''C'' is the covariance function. Popular mean functions are low-order polynomials, and a popular covariance function is the Matérn covariance, which includes both the exponential covariance (''ν'' = 1/2) and the Gaussian covariance (as ''ν'' → ∞).
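A minimal sketch of such an emulator, assuming a cheap stand-in simulator and using scikit-learn's Gaussian process regressor with a Matérn covariance; the stand-in function, design size and kernel settings are illustrative choices, not prescribed by the text above.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def simulator(x):
    """Cheap stand-in for an expensive computer code (illustrative only)."""
    return np.sin(3 * x[0]) + 0.5 * x[1] ** 2

# a small design of 20 simulator runs over a 2-dimensional input space
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(20, 2))
y = np.array([simulator(x) for x in X])

# Matern covariance: nu = 0.5 gives the exponential covariance,
# and nu -> infinity recovers the Gaussian (squared-exponential) covariance
gp = GaussianProcessRegressor(kernel=Matern(length_scale=0.2, nu=2.5),
                              normalize_y=True)
gp.fit(X, y)

# posterior mean and standard deviation of f at untried inputs
X_new = rng.uniform(0, 1, size=(5, 2))
mean, std = gp.predict(X_new, return_std=True)
</syntaxhighlight>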
Design of computer experiments
The design of computer experiments has considerable differences from design of experiments for parametric models. Since a Gaussian process prior has an infinite-dimensional representation, the concepts of A and D criteria (see Optimal design), which focus on reducing the error in the parameters, cannot be used. Replications would also be wasteful in cases when the computer simulation has no error. Criteria that are used to determine a good experimental design include integrated mean squared prediction error and distance-based criteria.
Popular strategies for design include Latin hypercube sampling and low discrepancy sequences, as sketched below.
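A minimal sketch of these design strategies using SciPy's quasi-Monte Carlo module; the number of runs, the dimensionality and the input bounds are arbitrary illustrations.

<syntaxhighlight lang="python">
from scipy.stats import qmc

# Latin hypercube design: 50 runs over a 3-dimensional input space
sampler = qmc.LatinHypercube(d=3, seed=1)
unit_design = sampler.random(n=50)          # points in the unit cube [0, 1)^3

# rescale each input to its physical range (illustrative bounds)
lower, upper = [0.0, 250.0, 1e-3], [1.0, 400.0, 1e-1]
design = qmc.scale(unit_design, lower, upper)

# a Sobol sequence is one example of a low discrepancy alternative
sobol_design = qmc.Sobol(d=3, seed=1).random_base2(m=6)   # 2**6 = 64 points
</syntaxhighlight>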
Problems with massive sample sizes
Unlike physical experiments, it is common for computer experiments to have thousands of different input combinations. Because the standard inference requires matrix inversion of a square matrix of the size of the number of samples (''n''), the cost grows as ''O''(''n''³). Matrix inversion of large, dense matrices can also cause numerical inaccuracies. Currently, this problem is solved by greedy decision tree techniques, allowing effective computations for unlimited dimensionality and sample size (patent WO2013055257A1), or avoided by using approximation methods.
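The cubic cost can be seen directly in exact Gaussian process inference, where the dense ''n'' × ''n'' covariance matrix over all design points must be factorized. The sketch below uses a Cholesky factorization, whose operation count grows roughly as ''n''³/3; the squared-exponential covariance, length scale and jitter are illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
n = 1000                                   # number of simulator runs
X = rng.uniform(0, 1, size=(n, 3))
y = rng.normal(size=n)                     # placeholder simulator outputs

# dense n-by-n covariance matrix (squared-exponential, illustrative choice),
# with a small jitter on the diagonal for numerical stability
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * d2 / 0.1 ** 2) + 1e-6 * np.eye(n)

# the Cholesky factorization costs about n**3 / 3 floating point operations,
# so doubling n makes this step roughly eight times more expensive
c, low = cho_factor(K)
alpha = cho_solve((c, low), y)             # weights used for GP prediction
</syntaxhighlight>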
See also
* Simulation
* Uncertainty quantification
* Bayesian statistics
* Gaussian process emulator
* Design of experiments
* Molecular dynamics
* Monte Carlo method
* Surrogate model
* Grey box completion and validation
Further reading
* Fehr, Jörg; Heiland, Jan; Himpe, Christian; Saak, Jens (2016). "Best practices for replicability, reproducibility and reusability of computer-based experiments exemplified by model reduction software". AIMS Mathematics. 1 (3): 261–281. arXiv:1607.01191. doi:10.3934/Math.2016.3.261.