The control variates method is a variance reduction technique used in Monte Carlo methods. It exploits information about the errors in estimates of known quantities to reduce the error of an estimate of an unknown quantity.
[Glasserman, P. (2004). ''Monte Carlo Methods in Financial Engineering''. New York: Springer. (p. 185)]
Underlying principle
Let the unknown parameter of interest be $\mu$, and assume we have a statistic $m$ such that the expected value of $m$ is $\mu$: $\mathbb{E}[m] = \mu$, i.e. $m$ is an unbiased estimator for $\mu$. Suppose we calculate another statistic $t$ such that $\mathbb{E}[t] = \tau$ is a known value. Then
: $m^\star = m + c\,(t - \tau)$
is also an unbiased estimator for $\mu$ for any choice of the coefficient $c$.
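The unbiasedness claim follows from linearity of expectation. A short worked step, using only the symbols defined above ($m^\star$ denotes the adjusted estimator and $\tau$ the known mean of $t$):

```latex
\mathbb{E}[m^\star]
  = \mathbb{E}[m] + c\,\bigl(\mathbb{E}[t] - \tau\bigr)
  = \mu + c\,(\tau - \tau)
  = \mu .
```

The coefficient $c$ therefore affects only the variance of $m^\star$, not its mean, provided $c$ is fixed in advance rather than estimated from the same sample.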
The variance of the resulting estimator $m^\star$ is
: $\operatorname{Var}(m^\star) = \operatorname{Var}(m) + c^2\,\operatorname{Var}(t) + 2c\,\operatorname{Cov}(m, t).$
By differentiating the above expression with respect to $c$, it can be shown that choosing the optimal coefficient
: $c^\star = -\dfrac{\operatorname{Cov}(m, t)}{\operatorname{Var}(t)}$
minimizes the variance of $m^\star$, and that with this choice,
: $\operatorname{Var}(m^\star) = \operatorname{Var}(m) - \dfrac{[\operatorname{Cov}(m, t)]^2}{\operatorname{Var}(t)} = \left(1 - \rho_{m,t}^2\right)\operatorname{Var}(m)$
where
: $\rho_{m,t} = \operatorname{Corr}(m, t)$
is the correlation coefficient of $m$ and $t$. The greater the value of $|\rho_{m,t}|$, the greater the variance reduction achieved.
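The differentiation step can be written out explicitly. The following is a sketch of the derivation, assuming $\operatorname{Var}(t) > 0$ and $\operatorname{Var}(m) > 0$ and writing $m^\star = m + c\,(t - \tau)$:

```latex
\begin{aligned}
\operatorname{Var}(m^\star)
  &= \operatorname{Var}(m) + c^2\,\operatorname{Var}(t) + 2c\,\operatorname{Cov}(m,t), \\
\frac{d}{dc}\operatorname{Var}(m^\star)
  &= 2c\,\operatorname{Var}(t) + 2\,\operatorname{Cov}(m,t) = 0
  \;\Longrightarrow\;
  c^\star = -\frac{\operatorname{Cov}(m,t)}{\operatorname{Var}(t)}, \\
\frac{d^2}{dc^2}\operatorname{Var}(m^\star)
  &= 2\,\operatorname{Var}(t) > 0
  \quad\text{(so $c^\star$ is a minimum)}, \\
\operatorname{Var}(m^\star)\Big|_{c = c^\star}
  &= \operatorname{Var}(m) + \frac{\operatorname{Cov}(m,t)^2}{\operatorname{Var}(t)}
     - \frac{2\,\operatorname{Cov}(m,t)^2}{\operatorname{Var}(t)}
   = \operatorname{Var}(m) - \frac{\operatorname{Cov}(m,t)^2}{\operatorname{Var}(t)} \\
  &= \left(1 - \rho_{m,t}^2\right)\operatorname{Var}(m),
  \qquad
  \rho_{m,t} = \frac{\operatorname{Cov}(m,t)}{\sqrt{\operatorname{Var}(m)\,\operatorname{Var}(t)}} .
\end{aligned}
```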
In the case that $\operatorname{Cov}(m, t)$, $\operatorname{Var}(t)$, and/or $\rho_{m,t}$ are unknown, they can be estimated across the Monte Carlo replicates. This is equivalent to solving a certain least squares system; therefore this technique is also known as regression sampling.
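The link to least squares can be illustrated with a minimal sketch. The function names and the arguments `m_samples`, `t_samples`, and `tau` are hypothetical names chosen for this illustration; the samples are assumed to be paired per-replicate values whose averages give the two statistics. The sketch estimates the coefficient from the sample covariance and variance, and separately fits a straight line by ordinary least squares and evaluates it at the known mean of the control.

```python
from statistics import fmean


def estimate_coefficient(m_samples, t_samples):
    """Sample estimate of c* = -Cov(m, t) / Var(t) from paired replicates."""
    m_bar, t_bar = fmean(m_samples), fmean(t_samples)
    s_mt = sum((m - m_bar) * (t - t_bar) for m, t in zip(m_samples, t_samples))
    s_tt = sum((t - t_bar) ** 2 for t in t_samples)
    return -s_mt / s_tt


def control_variate_estimate(m_samples, t_samples, tau):
    """Adjusted estimate: mean(m) + c * (mean(t) - tau), with c estimated."""
    c = estimate_coefficient(m_samples, t_samples)
    return fmean(m_samples) + c * (fmean(t_samples) - tau)


def ols_prediction_at(m_samples, t_samples, tau):
    """Fit m ~ a + b * t by ordinary least squares and evaluate at t = tau."""
    m_bar, t_bar = fmean(m_samples), fmean(t_samples)
    s_mt = sum((m - m_bar) * (t - t_bar) for m, t in zip(m_samples, t_samples))
    s_tt = sum((t - t_bar) ** 2 for t in t_samples)
    b = s_mt / s_tt          # least-squares slope, equal to -c
    a = m_bar - b * t_bar    # least-squares intercept
    return a + b * tau
```

Up to floating-point rounding, the two functions return the same number, because $a + b\tau = \bar m - b(\bar t - \tau)$ and the estimated coefficient is $-b$. Note that estimating the coefficient from the same replicates introduces a small bias, which typically shrinks as the number of replicates grows.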
When the expectation of the control variable, $\mathbb{E}[t] = \tau$, is not known analytically, it is still possible to increase the precision in estimating $\mu$ (for a given fixed simulation budget), provided that two conditions are met: 1) evaluating $t$ is significantly cheaper than computing $m$; 2) the magnitude of the correlation coefficient $|\rho_{m,t}|$ is close to unity.
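One possible way to act on these two conditions is a two-sample scheme, sketched below. This is an illustrative assumption, not a procedure described in the text: the mean of the control is estimated from a large, independent sample on which only the cheap control is evaluated. The function name, its parameters (`f`, `g`, `n`, `n_cheap`, `rng`), and the shrinkage factor `n_cheap / (n + n_cheap)` are all choices made for this sketch; the factor comes from minimizing the variance when the estimate of the control mean carries its own independent sampling noise.

```python
import math
import random
from statistics import fmean


def cv_estimate_unknown_tau(f, g, n, n_cheap, rng):
    """Estimate E[f(U)], U ~ Uniform(0, 1), with control g(U) whose mean is unknown.

    f is treated as expensive (evaluated n times); g is treated as cheap
    (evaluated n + n_cheap times). Both roles are assumptions of this sketch.
    """
    us = [rng.random() for _ in range(n)]
    fs = [f(u) for u in us]
    gs = [g(u) for u in us]

    # Independent pilot sample: only the cheap control is evaluated here.
    tau_hat = fmean([g(rng.random()) for _ in range(n_cheap)])

    f_bar, g_bar = fmean(fs), fmean(gs)
    s_fg = sum((a - f_bar) * (b - g_bar) for a, b in zip(fs, gs))
    s_gg = sum((b - g_bar) ** 2 for b in gs)

    # Usual coefficient -Cov/Var, shrunk to account for the noise in tau_hat.
    c = -(s_fg / s_gg) * n_cheap / (n + n_cheap)
    return f_bar + c * (g_bar - tau_hat)
```

For instance, `cv_estimate_unknown_tau(math.exp, lambda u: u, 100, 5000, random.Random(0))` estimates $\mathbb{E}[e^U] = e - 1$ while pretending the mean of $U$ is unknown. As `n_cheap` grows, the shrinkage factor approaches 1 and the scheme approaches the known-mean case.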
Example
We would like to estimate
: $I = \int_0^1 \frac{1}{1+x}\,\mathrm{d}x$
using Monte Carlo integration. This integral is the expected value of $f(U)$, where
: $f(U) = \frac{1}{1+U}$
and $U$ follows a uniform distribution on $[0, 1]$. Using a sample of size $n$, denote the points in the sample as $u_1, \ldots, u_n$. Then the estimate is given by
: $I \approx \frac{1}{n} \sum_{i=1}^{n} f(u_i).$
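Now we introduce $g(U) = 1 + U$ as a control variate with a known expected value $\mathbb{E}[g(U)] = \int_0^1 (1+x)\,\mathrm{d}x = \tfrac{3}{2}$ and combine the two into a new estimate
: $I \approx \frac{1}{n} \sum_{i=1}^{n} f(u_i) + c\left(\frac{1}{n} \sum_{i=1}^{n} g(u_i) - \frac{3}{2}\right).$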
Using $n$ realizations and an estimated optimal coefficient $c^\star$, the variance of the estimate was significantly reduced by the control variates technique. (The exact result is $I = \ln 2 \approx 0.6931$.)
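The example can be reproduced with a minimal sketch, assuming the integrand $f(U) = 1/(1+U)$ and the control variate $g(U) = 1+U$ with known mean $3/2$. The function name `run_example`, the default sample size, and the seed are hypothetical choices for illustration, and the numbers produced depend on them. The variances returned are per-sample variances; the variance of each averaged estimate is the per-sample variance divided by $n$.

```python
import math
import random
from statistics import fmean, variance


def f(u):
    """Integrand: f(u) = 1 / (1 + u)."""
    return 1.0 / (1.0 + u)


def g(u):
    """Control variate: g(u) = 1 + u, with known mean 3/2 on [0, 1]."""
    return 1.0 + u


def run_example(n=1500, seed=0):
    """Compare the classical and control-variate estimates of the integral."""
    rng = random.Random(seed)
    us = [rng.random() for _ in range(n)]
    fs = [f(u) for u in us]
    gs = [g(u) for u in us]

    f_bar, g_bar = fmean(fs), fmean(gs)
    cov_fg = sum((a - f_bar) * (b - g_bar) for a, b in zip(fs, gs)) / (n - 1)
    c = -cov_fg / variance(gs)  # estimated optimal coefficient

    # Per-sample adjusted values; their mean is the control-variate estimate.
    adjusted = [a + c * (b - 1.5) for a, b in zip(fs, gs)]
    return {
        "c": c,
        "classical": f_bar,
        "classical_var": variance(fs),
        "control_variate": fmean(adjusted),
        "control_variate_var": variance(adjusted),
        "exact": math.log(2),
    }
```

For this pair of functions the exact quantities are $\operatorname{Cov}(f, g) = 1 - \tfrac{3}{2}\ln 2$ and $\operatorname{Var}(g) = \tfrac{1}{12}$, so the exact optimal coefficient is $12\left(\tfrac{3}{2}\ln 2 - 1\right) \approx 0.4766$, and the estimated coefficient should land close to that value.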
See also
* Antithetic variates
* Importance sampling
Notes
References
* Ross, Sheldon M. (2002) ''Simulation'' 3rd edition
* Averill M. Law & W. David Kelton (2000), ''Simulation Modeling and Analysis'', 3rd edition.
* S. P. Meyn (2007) ''Control Techniques for Complex Networks'', Cambridge University Press. ISBN 978-0-521-88441-9. Downloadable draft (Section 11.4: Control variates and shadow functions)