In statistics, propagation of uncertainty (or propagation of error) is the effect of variables' uncertainties (or errors, more specifically random errors) on the uncertainty of a function based on them. When the variables are the values of experimental measurements they have uncertainties due to measurement limitations (e.g., instrument precision) which propagate due to the combination of variables in the function.
The uncertainty ''u'' can be expressed in a number of ways. It may be defined by the absolute error <math>\Delta x</math>. Uncertainties can also be defined by the relative error <math>\Delta x / x</math>, which is usually written as a percentage. Most commonly, the uncertainty on a quantity is quantified in terms of the standard deviation, <math>\sigma</math>, which is the positive square root of the variance. The value of a quantity and its error are then expressed as an interval <math>x \pm u</math>. If the statistical probability distribution of the variable is known or can be assumed, it is possible to derive confidence limits to describe the region within which the true value of the variable may be found. For example, the 68% confidence limits for a one-dimensional variable belonging to a normal distribution are approximately ± one standard deviation <math>\sigma</math> from the central value <math>x</math>, which means that the region <math>x \pm \sigma</math> will cover the true value in roughly 68% of cases.
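As a small illustration of the quantities above, the following sketch computes a relative error and the coverage of an interval of ± ''k'' standard deviations under a normal distribution. The function names and the sample measurement are illustrative assumptions; the coverage relies on the standard identity P(|''Z''| ≤ ''k'') = erf(''k''/√2) for a standard normal ''Z''.

```python
import math

def relative_error(value, absolute_error):
    """Relative error as a fraction of the measured value."""
    return abs(absolute_error / value)

def normal_coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

# Hypothetical measurement: 9.81 with an absolute error of 0.05.
rel = relative_error(9.81, 0.05)
one_sigma = normal_coverage(1)   # roughly 68 %
two_sigma = normal_coverage(2)   # roughly 95 %

print(f"relative error: {rel:.2%}")
print(f"coverage within 1 sigma: {one_sigma:.3f}, within 2 sigma: {two_sigma:.3f}")
```

The coverage figures hold only when the normal assumption is reasonable; for other distributions the same interval covers a different fraction of cases.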
If the uncertainties are correlated then covariance must be taken into account. Correlation can arise from two different sources. First, the ''measurement errors'' may be correlated. Second, when the underlying values are correlated across a population, the ''uncertainties in the group averages'' will be correlated. For very expensive data or complex functions, the error propagation may be achieved with a surrogate model, e.g. based on Bayesian probability theory.
Linear combinations
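Let <math>\{f_k(x_1, x_2, \dots, x_n)\}</math> be a set of ''m'' functions, which are linear combinations of <math>n</math> variables <math>x_1, x_2, \dots, x_n</math> with combination coefficients <math>A_{k1}, A_{k2}, \dots, A_{kn}</math>, <math>(k = 1, \dots, m)</math>:
:<math>f_k = \sum_{i=1}^n A_{ki} x_i,</math>
or in matrix notation,
:<math>\mathbf{f} = \mathbf{A} \mathbf{x}.</math>
Also let the variance–covariance matrix of ''x'' = (''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub>) be denoted by <math>\boldsymbol\Sigma^x</math> and let the mean value be denoted by <math>\boldsymbol\mu</math>:
:<math>\boldsymbol\Sigma^x = \operatorname{E}[(\mathbf{x} - \boldsymbol\mu) \otimes (\mathbf{x} - \boldsymbol\mu)] = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} & \cdots \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} & \cdots \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix},</math>
where <math>\otimes</math> is the outer product.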
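Then, the variance–covariance matrix <math>\boldsymbol\Sigma^f</math> of ''f'' is given by
:<math>\boldsymbol\Sigma^f = \operatorname{E}[(\mathbf{f} - \operatorname{E}[\mathbf{f}]) \otimes (\mathbf{f} - \operatorname{E}[\mathbf{f}])] = \mathbf{A} \operatorname{E}[(\mathbf{x} - \boldsymbol\mu) \otimes (\mathbf{x} - \boldsymbol\mu)] \mathbf{A}^\mathrm{T} = \mathbf{A} \boldsymbol\Sigma^x \mathbf{A}^\mathrm{T}.</math>
In component notation, the equation
:<math>\boldsymbol\Sigma^f = \mathbf{A} \boldsymbol\Sigma^x \mathbf{A}^\mathrm{T}</math>
reads
:<math>\Sigma^f_{ij} = \sum_{k=1}^n \sum_{l=1}^n A_{ik} \Sigma^x_{kl} A_{jl}.</math>
This is the most general expression for the propagation of error from one set of variables onto another. When the errors on ''x'' are uncorrelated, the general expression simplifies to
:<math>\Sigma^f_{ij} = \sum_{k=1}^n A_{ik} \Sigma^x_k A_{jk},</math>
where <math>\Sigma^x_k = \sigma^2_{x_k}</math> is the variance of the ''k''-th element of the ''x'' vector.
Note that even though the errors on ''x'' may be uncorrelated, the errors on ''f'' are in general correlated; in other words, even if <math>\boldsymbol\Sigma^x</math> is a diagonal matrix, <math>\boldsymbol\Sigma^f</math> is in general a full matrix.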
The general expressions for a scalar-valued function ''f'' are a little simpler (here '''a''' is a row vector):
:<math>f = \sum_{i=1}^n a_i x_i = \mathbf{a} \mathbf{x},</math>
:<math>\sigma^2_f = \sum_{i=1}^n \sum_{j=1}^n a_i \Sigma^x_{ij} a_j = \mathbf{a} \boldsymbol\Sigma^x \mathbf{a}^\mathrm{T}.</math>
Each covariance term <math>\sigma_{ij}</math> can be expressed in terms of the correlation coefficient <math>\rho_{ij}</math> by <math>\sigma_{ij} = \rho_{ij} \sigma_i \sigma_j</math>, so that an alternative expression for the variance of ''f'' is
:<math>\sigma^2_f = \sum_{i=1}^n a_i^2 \sigma_i^2 + \sum_{i=1}^n \sum_{j=1, j \ne i}^n a_i a_j \rho_{ij} \sigma_i \sigma_j.</math>
In the case that the variables in ''x'' are uncorrelated, this simplifies further to
:<math>\sigma^2_f = \sum_{i=1}^n a_i^2 \sigma_i^2.</math>
In the simple case of identical coefficients and variances, we find
:<math>\sigma_f = \sqrt{n}\, \lvert a \rvert \sigma.</math>
For the arithmetic mean, <math>a = 1/n</math>, the result is the standard error of the mean:
:<math>\sigma_f = \frac{\sigma}{\sqrt{n}}.</math>
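The linear rule can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper name `propagate_linear`, the coefficient matrix and the input variances are all made-up assumptions. It shows that uncorrelated inputs can still produce correlated outputs, and that the arithmetic mean recovers the standard error of the mean.

```python
import math

def propagate_linear(A, cov_x):
    """Return cov_f = A cov_x A^T for matrices stored as nested lists."""
    m, n = len(A), len(cov_x)
    return [
        [
            sum(A[i][k] * cov_x[k][l] * A[j][l] for k in range(n) for l in range(n))
            for j in range(m)
        ]
        for i in range(m)
    ]

# Two uncorrelated inputs with variances 4 and 9 (illustrative values).
cov_x = [[4.0, 0.0], [0.0, 9.0]]
# f1 = x1 + x2 and f2 = x1 - x2
A = [[1.0, 1.0], [1.0, -1.0]]
cov_f = propagate_linear(A, cov_x)
print(cov_f)  # the off-diagonal terms are non-zero although cov_x is diagonal

# Arithmetic mean of n uncorrelated inputs that share the same sigma.
n, sigma = 25, 2.0
cov_inputs = [[sigma**2 if i == j else 0.0 for j in range(n)] for i in range(n)]
var_mean = propagate_linear([[1.0 / n] * n], cov_inputs)[0][0]
sem = math.sqrt(var_mean)
print(sem, sigma / math.sqrt(n))
```

For larger problems a matrix library would normally replace the explicit double sum; nested lists are used here only to keep the sketch dependency-free.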
Non-linear combinations
When ''f'' is a set of non-linear combinations of the variables ''x'', an interval propagation could be performed in order to compute intervals which contain all consistent values for the variables. In a probabilistic approach, the function ''f'' must usually be linearised by approximation to a first-order Taylor series expansion, though in some cases exact formulae can be derived that do not depend on the expansion, as is the case for the exact variance of products.
The Taylor expansion would be:
:<math>f_k \approx f^0_k + \sum_{i=1}^n \frac{\partial f_k}{\partial x_i} x_i,</math>
where <math>\partial f_k / \partial x_i</math> denotes the partial derivative of ''f<sub>k</sub>'' with respect to the ''i''-th variable, evaluated at the mean value of all components of vector ''x''. Or in matrix notation,
:<math>\mathbf{f} \approx \mathbf{f}^0 + \mathbf{J} \mathbf{x},</math>
where '''J''' is the Jacobian matrix. Since '''f'''<sup>0</sup> is a constant it does not contribute to the error on '''f'''. Therefore, the propagation of error follows the linear case, above, but replacing the linear coefficients, ''A<sub>ki</sub>'' and ''A<sub>kj</sub>'', by the partial derivatives, <math>\partial f_k / \partial x_i</math> and <math>\partial f_k / \partial x_j</math>. In matrix notation,
:<math>\boldsymbol\Sigma^f = \mathbf{J} \boldsymbol\Sigma^x \mathbf{J}^\mathrm{T}.</math>
That is, the Jacobian of the function is used to transform the rows and columns of the variance–covariance matrix of the argument.
Note this is equivalent to the matrix expression for the linear case with <math>\mathbf{J} = \mathbf{A}</math>.
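A first-order propagation through a non-linear function can be sketched by estimating the Jacobian numerically. The code below is an illustration under stated assumptions: the helper names, the central-difference step `h`, and the polar-to-Cartesian example with its input covariance are all hypothetical choices, not taken from the article.

```python
import math

def numerical_jacobian(func, x, h=1e-6):
    """Central-difference estimate of J[k][i] = d f_k / d x_i at the point x."""
    n_out = len(func(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        f_plus, f_minus = func(x_plus), func(x_minus)
        for k in range(n_out):
            jac[k][i] = (f_plus[k] - f_minus[k]) / (2 * h)
    return jac

def propagate_nonlinear(func, mean_x, cov_x):
    """First-order estimate cov_f = J cov_x J^T, with J evaluated at mean_x."""
    jac = numerical_jacobian(func, mean_x)
    m, n = len(jac), len(mean_x)
    return [
        [
            sum(jac[i][k] * cov_x[k][l] * jac[j][l] for k in range(n) for l in range(n))
            for j in range(m)
        ]
        for i in range(m)
    ]

def polar_to_cartesian(p):
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]

# Hypothetical measurement: r = 2 with variance 0.01, theta = pi/2 with variance 0.0004.
mean_x = [2.0, math.pi / 2]
cov_x = [[0.01, 0.0], [0.0, 0.0004]]
cov_f = propagate_nonlinear(polar_to_cartesian, mean_x, cov_x)
print(cov_f)
```

At this point the analytic Jacobian is [[0, −2], [1, 0]], so the variance of the first output comes entirely from the angle and that of the second entirely from the radius. Analytic or automatically differentiated Jacobians are preferable when available; finite differences are used here only for brevity.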
Simplification
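Neglecting correlations or assuming independent variables yields a common formula among engineers and experimental scientists to calculate error propagation, the variance formula:
:<math>s_f = \sqrt{ \left(\frac{\partial f}{\partial x}\right)^2 s_x^2 + \left(\frac{\partial f}{\partial y}\right)^2 s_y^2 + \left(\frac{\partial f}{\partial z}\right)^2 s_z^2 + \cdots },</math>
where <math>s_f</math> represents the standard deviation of the function <math>f</math>, <math>s_x</math> represents the standard deviation of <math>x</math>, <math>s_y</math> represents the standard deviation of <math>y</math>, and so forth.
This formula is based on the linear characteristics of the gradient of <math>f</math> and therefore it is a good estimation for the standard deviation of <math>f</math> as long as <math>s_x, s_y, s_z, \ldots</math> are small enough. Specifically, the linear approximation of <math>f</math> has to be close to <math>f</math> inside a neighbourhood of radius <math>s_x, s_y, s_z, \ldots</math>.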
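The variance formula translates directly into code. The sketch below assumes independent inputs and estimates each partial derivative by a central difference; the function name `variance_formula`, the step size and the example function ''f'' = ''x''² ''y'' are illustrative assumptions.

```python
import math

def variance_formula(f, values, sigmas, h=1e-6):
    """Standard deviation of f(*values) for independent inputs, to first order."""
    total = 0.0
    for i, s in enumerate(sigmas):
        up, down = list(values), list(values)
        up[i] += h
        down[i] -= h
        partial = (f(*up) - f(*down)) / (2 * h)  # d f / d x_i at the given values
        total += (partial * s) ** 2
    return math.sqrt(total)

# Hypothetical inputs: x = 3.0 with s_x = 0.1 and y = 2.0 with s_y = 0.2.
s_f = variance_formula(lambda x, y: x**2 * y, [3.0, 2.0], [0.1, 0.2])
# Analytic check: df/dx = 2xy = 12 and df/dy = x^2 = 9.
expected = math.sqrt((12 * 0.1) ** 2 + (9 * 0.2) ** 2)
print(s_f, expected)
```

The result is only as good as the linear approximation over the range spanned by the input standard deviations.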
Example
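Any non-linear differentiable function, <math>f(a,b)</math>, of two variables, <math>a</math> and <math>b</math>, can be expanded as
:<math>f \approx f^0 + \frac{\partial f}{\partial a} a + \frac{\partial f}{\partial b} b.</math>
Taking the variance of both sides and using the formula for the variance of a linear combination of variables,
:<math>\operatorname{Var}(aX + bY) = a^2 \operatorname{Var}(X) + b^2 \operatorname{Var}(Y) + 2ab \operatorname{Cov}(X, Y),</math>
gives
:<math>\sigma^2_f \approx \left| \frac{\partial f}{\partial a} \right|^2 \sigma^2_a + \left| \frac{\partial f}{\partial b} \right|^2 \sigma^2_b + 2 \frac{\partial f}{\partial a} \frac{\partial f}{\partial b} \sigma_{ab},</math>
where <math>\sigma_f</math> is the standard deviation of the function <math>f</math>, <math>\sigma_a</math> is the standard deviation of <math>a</math>, <math>\sigma_b</math> is the standard deviation of <math>b</math> and <math>\sigma_{ab} = \sigma_a \sigma_b \rho_{ab}</math> is the covariance between <math>a</math> and <math>b</math>.
In the particular case that <math>f = ab</math>, <math>\frac{\partial f}{\partial a} = b</math> and <math>\frac{\partial f}{\partial b} = a</math>. Then
:<math>\sigma^2_f \approx b^2 \sigma^2_a + a^2 \sigma^2_b + 2ab\, \sigma_{ab}</math>
or
:<math>\left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_a}{a}\right)^2 + \left(\frac{\sigma_b}{b}\right)^2 + 2 \left(\frac{\sigma_a}{a}\right) \left(\frac{\sigma_b}{b}\right) \rho_{ab},</math>
where <math>\rho_{ab}</math> is the correlation between <math>a</math> and <math>b</math>.
When the variables <math>a</math> and <math>b</math> are uncorrelated, <math>\rho_{ab} = 0</math>. Then
:<math>\left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_a}{a}\right)^2 + \left(\frac{\sigma_b}{b}\right)^2.</math>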
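The product case can be checked numerically with a short sketch. The function name `product_sigma` and the sample values are assumptions chosen so that the relative uncertainties are 3% and 4%.

```python
import math

def product_sigma(a, b, sigma_a, sigma_b, rho_ab=0.0):
    """First-order standard deviation of f = a*b, with optional correlation rho_ab."""
    rel_a, rel_b = sigma_a / a, sigma_b / b
    rel_var = rel_a**2 + rel_b**2 + 2 * rel_a * rel_b * rho_ab
    # Guard against a tiny negative value caused by rounding when rho_ab is near -1.
    return abs(a * b) * math.sqrt(max(rel_var, 0.0))

# Hypothetical measurements: a = 10 with sigma 0.3, b = 5 with sigma 0.2.
uncorrelated = product_sigma(10.0, 5.0, 0.3, 0.2)            # relative 5 % of f = 50
fully_correlated = product_sigma(10.0, 5.0, 0.3, 0.2, 1.0)   # relative 7 % of f = 50
print(uncorrelated, fully_correlated)
```

With perfect positive correlation the relative uncertainties add linearly (3% + 4%), whereas for uncorrelated inputs they add in quadrature.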
Caveats and warnings
Error estimates for non-linear functions are biased on account of using a truncated series expansion. The extent of this bias depends on the nature of the function. For example, the bias on the error calculated for log(1+''x'') increases as ''x'' increases, since the first-order expansion log(1+''x'') ≈ ''x'' is a good approximation only when ''x'' is near zero.
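The log(1+''x'') case can be explored with a small simulation. The sketch below is illustrative only: it assumes a normally distributed ''x'' with a made-up standard deviation of 0.05 and a fixed random seed, and compares the estimate obtained from log(1+''x'') ≈ ''x'' with a simulated standard deviation.

```python
import math
import random
import statistics

def simulated_sigma(mean_x, sigma_x, n=20000, seed=0):
    """Standard deviation of log(1 + x) for normally distributed x, by simulation."""
    rng = random.Random(seed)
    samples = [math.log1p(rng.gauss(mean_x, sigma_x)) for _ in range(n)]
    return statistics.pstdev(samples)

sigma_x = 0.05
for mean_x in (0.0, 1.0, 5.0):
    naive = sigma_x                    # from log(1 + x) ≈ x, i.e. slope taken as 1
    local = sigma_x / (1.0 + mean_x)   # first-order estimate with the slope at the mean
    simulated = simulated_sigma(mean_x, sigma_x)
    print(f"x = {mean_x}: naive {naive:.4f}, local {local:.4f}, simulated {simulated:.4f}")
```

Near ''x'' = 0 the two estimates agree with the simulation; as ''x'' grows, the estimate based on the expansion about zero increasingly overstates the spread, while linearising at the mean stays close for this small input uncertainty.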
For highly non-linear functions, there exist five categories of probabilistic approaches for uncertainty propagation; see Uncertainty quantification for details.
Reciprocal and shifted reciprocal
In the special case of the inverse or reciprocal <math>1/B</math>, where <math>B = N(0,1)</math> follows a standard normal distribution, the resulting distribution is a reciprocal standard normal distribution, and there is no definable variance.
However, in the slightly more general case of a shifted reciprocal function <math>1/(p - B)</math> for <math>B = N(\mu, \sigma)</math> following a general normal distribution, mean and variance statistics do exist in a principal value sense, if the difference between the pole <math>p</math> and the mean <math>\mu</math> is real-valued.
Ratios
Ratios are also problematic; normal approximations exist under certain conditions.
Example formulae
This table shows the variances and standard deviations of simple functions of the real variables <math>A, B</math>, with standard deviations <math>\sigma_A, \sigma_B</math>, covariance <math>\sigma_{AB} = \rho_{AB} \sigma_A \sigma_B</math>, and correlation <math>\rho_{AB}</math>. The real-valued coefficients <math>a</math> and <math>b</math> are assumed exactly known (deterministic), i.e., <math>\sigma_a = \sigma_b = 0</math>.
In the columns "Variance" and "Standard Deviation", ''A'' and ''B'' should be understood as expectation values (i.e. values around which we're estimating the uncertainty), and <math>f</math> should be understood as the value of the function calculated at the expectation value of <math>A, B</math>.
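{| class="wikitable"
! Function !! Variance !! Standard Deviation
|-
| <math>f = aA</math> || <math>\sigma_f^2 = a^2 \sigma_A^2</math> || <math>\sigma_f = \lvert a \rvert \sigma_A</math>
|-
| <math>f = aA + bB</math> || <math>\sigma_f^2 = a^2 \sigma_A^2 + b^2 \sigma_B^2 + 2ab\, \sigma_{AB}</math> || <math>\sigma_f = \sqrt{a^2 \sigma_A^2 + b^2 \sigma_B^2 + 2ab\, \sigma_{AB}}</math>
|-
| <math>f = aA - bB</math> || <math>\sigma_f^2 = a^2 \sigma_A^2 + b^2 \sigma_B^2 - 2ab\, \sigma_{AB}</math> || <math>\sigma_f = \sqrt{a^2 \sigma_A^2 + b^2 \sigma_B^2 - 2ab\, \sigma_{AB}}</math>
|-
| <math>f = AB</math> || <math>\sigma_f^2 \approx f^2 \left[ \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2 \frac{\sigma_{AB}}{AB} \right]</math> || <math>\sigma_f \approx \lvert f \rvert \sqrt{ \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2 \frac{\sigma_{AB}}{AB} }</math>
|-
| <math>f = A / B</math> || <math>\sigma_f^2 \approx f^2 \left[ \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 - 2 \frac{\sigma_{AB}}{AB} \right]</math> || <math>\sigma_f \approx \lvert f \rvert \sqrt{ \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 - 2 \frac{\sigma_{AB}}{AB} }</math>
|}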
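For uncorrelated variables (<math>\rho_{AB} = 0</math>, <math>\sigma_{AB} = 0</math>) expressions for more complicated functions can be derived by combining simpler functions. For example, repeated multiplication, assuming no correlation, gives
:<math>f = ABC; \qquad \left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + \left(\frac{\sigma_C}{C}\right)^2.</math>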
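For the case <math>f = AB</math> we also have Goodman's expression for the exact variance: for the uncorrelated case it is
:<math>\operatorname{V}(XY) = \operatorname{E}(X)^2 \operatorname{V}(Y) + \operatorname{E}(Y)^2 \operatorname{V}(X) + \operatorname{E}\left( (X - \operatorname{E}(X))^2 (Y - \operatorname{E}(Y))^2 \right),</math>
and therefore we have:
:<math>\sigma_f^2 = A^2 \sigma_B^2 + B^2 \sigma_A^2 + \sigma_A^2 \sigma_B^2.</math>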
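The difference between the first-order estimate and the exact variance of a product can be seen in a short sketch. It assumes ''A'' and ''B'' are independent; the function names and the numbers are illustrative only.

```python
def product_variance_first_order(A, B, var_A, var_B):
    """Linearised variance of f = A*B for independent A and B."""
    return A**2 * var_B + B**2 * var_A

def product_variance_exact(A, B, var_A, var_B):
    """Exact variance of f = A*B for independent A and B (adds the var_A*var_B term)."""
    return A**2 * var_B + B**2 * var_A + var_A * var_B

# Hypothetical expectation values and variances.
A, B, var_A, var_B = 2.0, 3.0, 1.0, 4.0
approx = product_variance_first_order(A, B, var_A, var_B)   # 4*4 + 9*1
exact = product_variance_exact(A, B, var_A, var_B)          # adds 1*4
print(approx, exact)
```

The extra term matters only when the variances are not small compared with the squared expectation values, which is exactly the regime where the linear approximation is least reliable.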
Effect of correlation on differences
If ''A'' and ''B'' are uncorrelated, their difference ''A'' − ''B'' will have more variance than either of them. An increasing positive correlation (<math>\rho_{AB} \to 1</math>) will decrease the variance of the difference, converging to zero variance for perfectly correlated variables with the same variance. On the other hand, a negative correlation (<math>\rho_{AB} \to -1</math>) will further increase the variance of the difference, compared to the uncorrelated case.
For example, the self-subtraction ''f'' = ''A'' − ''A'' has zero variance <math>\sigma_f^2 = 0</math> only if the variate is perfectly autocorrelated (<math>\rho_A = 1</math>). If ''A'' is uncorrelated, <math>\rho_A = 0</math>, then the output variance is twice the input variance, <math>\sigma_f^2 = 2\sigma_A^2</math>. And if ''A'' is perfectly anticorrelated, <math>\rho_A = -1</math>, then the input variance is quadrupled in the output, <math>\sigma_f^2 = 4\sigma_A^2</math> (notice <math>1 - \rho_A = 2</math> for ''f'' = ''aA'' − ''aA'' in the table above).
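The three cases can be reproduced with the linear-combination variance for a difference, σ<sub>f</sub>² = σ<sub>A</sub>² + σ<sub>B</sub>² − 2ρσ<sub>A</sub>σ<sub>B</sub>. The sketch below uses a unit input variance as an illustrative assumption; the function name is hypothetical.

```python
import math

def difference_variance(var_A, var_B, rho):
    """Variance of f = A - B for inputs with correlation coefficient rho."""
    return var_A + var_B - 2 * rho * math.sqrt(var_A * var_B)

var_A = 1.0  # illustrative input variance, used for both operands
perfectly_correlated = difference_variance(var_A, var_A, 1.0)    # zero variance
uncorrelated = difference_variance(var_A, var_A, 0.0)            # twice the input
anticorrelated = difference_variance(var_A, var_A, -1.0)         # four times the input
print(perfectly_correlated, uncorrelated, anticorrelated)
```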
Example calculations
Inverse tangent function
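We can calculate the uncertainty propagation for the inverse tangent function as an example of using partial derivatives to propagate error.
Define
:<math>f(x) = \arctan(x),</math>
where <math>\Delta_x</math> is the absolute uncertainty on our measurement of <math>x</math>. The derivative of <math>f(x)</math> with respect to <math>x</math> is
:<math>\frac{df}{dx} = \frac{1}{1 + x^2}.</math>
Therefore, our propagated uncertainty is
:<math>\Delta_f \approx \frac{\Delta_x}{1 + x^2},</math>
where <math>\Delta_f</math> is the absolute propagated uncertainty.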
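A quick numerical check of the inverse tangent result, with made-up values for the measurement and its uncertainty, compares the propagated uncertainty with half the spread of the function over the interval ''x'' ± Δ<sub>''x''</sub>:

```python
import math

def arctan_uncertainty(x, delta_x):
    """First-order propagated uncertainty of arctan(x)."""
    return delta_x / (1 + x**2)

# Hypothetical measurement: x = 1.0 with absolute uncertainty 0.1.
x, delta_x = 1.0, 0.1
propagated = arctan_uncertainty(x, delta_x)
half_spread = (math.atan(x + delta_x) - math.atan(x - delta_x)) / 2
print(propagated, half_spread)
```

The two numbers agree closely because the uncertainty is small compared with the scale over which the derivative changes.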
Resistance measurement
A practical application is an experiment in which one measures current, <math>I</math>, and voltage, <math>V</math>, on a resistor in order to determine the resistance, <math>R</math>, using Ohm's law, <math>R = V/I</math>.
Given the measured variables with uncertainties, <math>I \pm \sigma_I</math> and <math>V \pm \sigma_V</math>, and neglecting their possible correlation, the uncertainty in the computed quantity, <math>\sigma_R</math>, is:
:<math>\sigma_R \approx \sqrt{ \sigma_V^2 \left(\frac{1}{I}\right)^2 + \sigma_I^2 \left(\frac{-V}{I^2}\right)^2 } = R \sqrt{ \left(\frac{\sigma_V}{V}\right)^2 + \left(\frac{\sigma_I}{I}\right)^2 }.</math>
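The resistance calculation can be written as a short function. The readings below are invented for illustration, and the correlation between voltage and current is neglected, as in the text.

```python
import math

def resistance_with_uncertainty(V, sigma_V, I, sigma_I):
    """Return R = V/I and its first-order uncertainty for uncorrelated V and I."""
    R = V / I
    # Partial derivatives: dR/dV = 1/I and dR/dI = -V/I^2.
    sigma_R = math.sqrt((sigma_V / I) ** 2 + (V * sigma_I / I**2) ** 2)
    return R, sigma_R

# Hypothetical readings: V = 12.0 with sigma 0.1 (volts), I = 2.0 with sigma 0.05 (amperes).
R, sigma_R = resistance_with_uncertainty(12.0, 0.1, 2.0, 0.05)
relative_form = R * math.sqrt((0.1 / 12.0) ** 2 + (0.05 / 2.0) ** 2)
print(f"R = {R:.2f} ± {sigma_R:.2f} ohm")
```

Both forms of the formula give the same value; here the current measurement dominates the uncertainty because its relative error (2.5%) is larger than that of the voltage (about 0.8%).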
See also
* Accuracy and precision
* Automatic differentiation
* Bienaymé's identity
* Delta method
* Dilution of precision (navigation)
* Errors and residuals in statistics
* Experimental uncertainty analysis
* Interval finite element
* Measurement uncertainty
* Numerical stability
* Probability bounds analysis
* Significance arithmetic
* Uncertainty quantification
* Random-fuzzy variable
* Variance#Propagation
External links
* A detailed discussion of measurements and the propagation of uncertainty, explaining the benefits of using error propagation formulas and Monte Carlo simulations instead of simple significance arithmetic
* GUM, Guide to the Expression of Uncertainty in Measurement
* EPFL, An Introduction to Error Propagation: Derivation, Meaning and Examples of Cy = Fx Cx Fx'
* uncertainties package, a program/library for transparently performing calculations with uncertainties (and error correlations)
* soerp package, a Python program/library for transparently performing *second-order* calculations with uncertainties (and error correlations)
* Uncertainty Calculator, propagate uncertainty for any expression