The control variates method is a variance reduction technique used in

Monte Carlo methods

. It exploits information about the errors in estimates of known quantities to reduce the error of an estimate of an unknown quantity (Glasserman, P. (2004). ''Monte Carlo Methods in Financial Engineering''. New York: Springer, p. 185).

Underlying principle

Let the unknown

parameter

of interest be

\mu

, and assume we have a

statistic

m

such that the

expected value

of ''m'' is μ:

\mathbb{E}\left[m\right]=\mu

, i.e. ''m'' is an

unbiased estimator

for μ. Suppose we calculate another statistic

t

such that

\mathbb{E}\left[t\right]=\tau

is a known value. Then

m^\star = m + c\left(t-\tau\right) \,

is also an unbiased estimator for

\mu

for any choice of the coefficient

c

. The

variance

of the resulting estimator

m^\star

is

\textrm{Var}\left(m^\star\right)=\textrm{Var}\left(m\right) + c^2\,\textrm{Var}\left(t\right) + 2c\,\textrm{Cov}\left(m,t\right).

By differentiating the above expression with respect to

c

, it can be shown that choosing the optimal coefficient

c^\star = - \frac{\textrm{Cov}\left(m,t\right)}{\textrm{Var}\left(t\right)}

minimizes the variance of

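m^\star

, and that with this choice,

\begin{align}
\textrm{Var}\left(m^\star\right) & =\textrm{Var}\left(m\right) - \frac{\left[\textrm{Cov}\left(m,t\right)\right]^2}{\textrm{Var}\left(t\right)} \\
& = \left(1-\rho_{m,t}^2\right)\textrm{Var}\left(m\right)
\end{align}

where

\rho_{m,t}=\textrm{Corr}\left(m,t\right)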

is the

correlation coefficient of

m

and

t

. The greater the value of

\vert\rho_{m,t}\vert

, the greater the variance reduction achieved. In the case that

\textrm{Cov}\left(m,t\right)

,

\textrm{Var}\left(t\right)

, and/or

\rho_{m,t}

are unknown, they can be estimated across the Monte Carlo replicates. This is equivalent to solving a certain

least squares

system; therefore this technique is also known as regression sampling. When the expectation of the control variable,

\tau

, is not known analytically, it is still possible to increase the precision in estimating

\mu

(for a given fixed simulation budget), provided that two conditions are met: 1) evaluating

t

is significantly cheaper than computing

m

; 2) the magnitude of the correlation coefficient

\vert\rho_{m,t}\vert

is close to unity.
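
The remark above that estimating the coefficient from the Monte Carlo replicates amounts to a least squares fit can be checked numerically. The following is a minimal sketch, not a reference implementation. It assumes the replicates are available as per-replicate values m_i and t_i (whose averages play the roles of ''m'' and ''t''), that the control mean \tau is known, and it uses hypothetical toy data and hypothetical function names (`control_variate_estimate`, `regression_estimate`) that do not come from the article.

```python
import numpy as np


def control_variate_estimate(m, t, tau):
    """Control variate estimate of E[m], with c estimated from the replicates."""
    m = np.asarray(m, dtype=float)
    t = np.asarray(t, dtype=float)
    cov = np.cov(m, t)               # 2x2 sample covariance matrix of (m, t)
    c_hat = -cov[0, 1] / cov[1, 1]   # estimate of c* = -Cov(m, t) / Var(t)
    return m.mean() + c_hat * (t.mean() - tau), c_hat


def regression_estimate(m, t, tau):
    """Fit m ~ intercept + slope * t by least squares; evaluate the line at t = tau."""
    slope, intercept = np.polyfit(t, m, deg=1)  # coefficients, highest degree first
    return intercept + slope * tau, slope


# Hypothetical toy data (an assumption, not from the article):
# m_i = exp(u_i) with control t_i = u_i, u_i ~ Uniform(0, 1), so tau = E[t] = 0.5.
rng = np.random.default_rng(0)
t_samples = rng.uniform(0.0, 1.0, size=1000)
m_samples = np.exp(t_samples)

cv_value, c_hat = control_variate_estimate(m_samples, t_samples, tau=0.5)
ls_value, slope = regression_estimate(m_samples, t_samples, tau=0.5)
print(cv_value, ls_value)   # the two estimates agree up to rounding
print(c_hat, -slope)        # the estimated coefficient is minus the fitted slope
```

The agreement is algebraic rather than statistical: the fitted line passes through the sample means, so its value at \tau equals the sample mean of the m_i plus the estimated coefficient times the gap between the sample mean of the t_i and \tau.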

Example

We would like to estimate

I = \int_0^1 \frac{1}{1+x} \, \mathrm{d}x

using Monte Carlo integration. This integral is the expected value of

f(U)

, where

f(U) = \frac{1}{1+U}

and ''U'' follows a

uniform distribution

on [0, 1]. Using a sample of size ''n'', denote the points in the sample as

u_1, \cdots, u_n

. Then the estimate is given by

I \approx \frac{1}{n} \sum_i f(u_i).

Now we introduce

g(U) = 1+U

as a control variate with a known expected value

\int_0^1 (1+x) \, \mathrm{d}x=\tfrac{3}{2}

and combine the two into a new estimate

I \approx \frac{1}{n} \sum_i f(u_i)+c\left(\frac{1}{n}\sum_i g(u_i) -3/2\right).

Using

n=1500

realizations and an estimated optimal coefficient

c^\star \approx 0.4773

we obtain the following results. The variance was significantly reduced after using the control variates technique. (The exact result is

I=\ln 2 \approx 0.69314718

.)
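
The example can be reproduced with a short simulation. The sketch below assumes NumPy, an arbitrary seed, and a hypothetical function name `estimate_ln2`. It estimates the coefficient from the same sample through the sample covariance, so the numbers it produces depend on the random draw and will not match any particular published figures.

```python
import numpy as np


def estimate_ln2(n=1500, seed=0):
    """Estimate I = int_0^1 dx / (1 + x) with and without the control variate g(U) = 1 + U."""
    rng = np.random.default_rng(seed)      # seed is an arbitrary choice
    u = rng.uniform(0.0, 1.0, size=n)      # u_1, ..., u_n
    f = 1.0 / (1.0 + u)                    # f(u_i)
    g = 1.0 + u                            # control variate g(u_i), with E[g(U)] = 3/2

    cov = np.cov(f, g)                     # sample covariance matrix of (f, g)
    c_hat = -cov[0, 1] / cov[1, 1]         # estimated optimal coefficient

    plain = f.mean()                       # classical Monte Carlo estimate
    adjusted = f + c_hat * (g - 1.5)       # per-sample control variate terms
    return {
        "c_hat": c_hat,
        "plain": plain,
        "controlled": adjusted.mean(),
        "var_plain": f.var(ddof=1),        # per-sample variance without the control
        "var_controlled": adjusted.var(ddof=1),  # per-sample variance with the control
    }


result = estimate_ln2()
for name, value in result.items():
    print(f"{name}: {value:.5f}")
print(f"exact: {np.log(2):.5f}")
```

For reference, the exact moments give \textrm{Var}(f(U)) = 1/2 - (\ln 2)^2 \approx 0.0195, \textrm{Cov}(f(U), g(U)) = 1 - \tfrac{3}{2}\ln 2 \approx -0.0397 and \textrm{Var}(g(U)) = 1/12, so the theoretical optimal coefficient is about 0.4766 and the per-sample variance with the control variate is about 0.0006.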

