Cumulative Error

In statistics, propagation of uncertainty (or propagation of error) is the effect of variables' uncertainties (or errors, more specifically random errors) on the uncertainty of a function based on them. When the variables are the values of experimental measurements, they have uncertainties due to measurement limitations (e.g., instrument precision), which propagate due to the combination of variables in the function.

The uncertainty ''u'' can be expressed in a number of ways. It may be defined by the absolute error $\Delta x$. Uncertainties can also be defined by the relative error $(\Delta x)/x$, which is usually written as a percentage. Most commonly, the uncertainty on a quantity is quantified in terms of the standard deviation, $\sigma$, which is the positive square root of the variance. The value of a quantity and its error are then expressed as an interval $x \pm u$. However, the most general way of characterizing uncertainty is by specifying its probability distribution. If the probability distribution of the variable is known or can be assumed, it is in theory possible to obtain any of its statistics. In particular, it is possible to derive confidence limits to describe the region within which the true value of the variable may be found. For example, the 68% confidence limits for a one-dimensional variable belonging to a normal distribution are approximately ± one standard deviation $\sigma$ from the central value $x$, which means that the region $x \pm \sigma$ will cover the true value in roughly 68% of cases.

If the uncertainties are correlated, then covariance must be taken into account. Correlation can arise from two different sources. First, the ''measurement errors'' may be correlated. Second, when the underlying values are correlated across a population, the ''uncertainties in the group averages'' will be correlated.

In a general context where a nonlinear function modifies the uncertain parameters (correlated or not), the standard tools to propagate uncertainty, and to infer the probability distribution or statistics of the resulting quantity, are sampling techniques from the Monte Carlo method family. For very large datasets or complex functions, calculating the error propagation may be so expensive that a surrogate model or a parallel computing strategy becomes necessary. In some particular cases, the uncertainty propagation calculation can be done through simple algebraic procedures. Some of these scenarios are described below.
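As a rough illustration of the sampling approach mentioned above, the following sketch propagates uncertainty through an arbitrary function by drawing random inputs. The function name `monte_carlo_propagate`, the choice of independent normal inputs, and the example values are assumptions made for this illustration only.

```python
import random
import statistics


def monte_carlo_propagate(func, means, sigmas, n_samples=50_000, seed=0):
    """Estimate the mean and standard deviation of func(x1, ..., xn) by sampling.

    Assumption for this sketch: inputs are independent and normally distributed.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    outputs = []
    for _ in range(n_samples):
        sample = [rng.gauss(m, s) for m, s in zip(means, sigmas)]
        outputs.append(func(*sample))
    return statistics.fmean(outputs), statistics.stdev(outputs)


# Hypothetical measurement: f = a * b with a = 10.0 +/- 0.1 and b = 5.0 +/- 0.2
mean_f, sigma_f = monte_carlo_propagate(lambda a, b: a * b, [10.0, 5.0], [0.1, 0.2])
print(f"f = {mean_f:.2f} +/- {sigma_f:.2f}")
```

Correlated or non-normal inputs would need a different sampler; only the sampling step changes, not the summary statistics.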


Linear combinations

Let $\{f_k(x_1, x_2, \dots, x_n)\}$ be a set of ''m'' functions, which are linear combinations of $n$ variables $x_1, x_2, \dots, x_n$ with combination coefficients $A_{k1}, A_{k2}, \dots, A_{kn}$, $(k = 1, \dots, m)$:

$$f_k = \sum_{i=1}^n A_{ki} x_i,$$

or in matrix notation, $\mathbf{f} = \mathbf{A} \mathbf{x}$. Also let the variance–covariance matrix of $x = (x_1, \dots, x_n)$ be denoted by $\boldsymbol\Sigma^x$ and let the mean value be denoted by $\boldsymbol\mu$:

$$\boldsymbol\Sigma^x = \operatorname{E}[(\mathbf{x}-\boldsymbol\mu)\otimes (\mathbf{x}-\boldsymbol\mu)] = \begin{pmatrix} \sigma^2_1 & \sigma_{12} & \sigma_{13} & \cdots \\ \sigma_{21} & \sigma^2_2 & \sigma_{23} & \cdots \\ \sigma_{31} & \sigma_{32} & \sigma^2_3 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} = \begin{pmatrix} \Sigma^x_{11} & \Sigma^x_{12} & \Sigma^x_{13} & \cdots \\ \Sigma^x_{21} & \Sigma^x_{22} & \Sigma^x_{23} & \cdots \\ \Sigma^x_{31} & \Sigma^x_{32} & \Sigma^x_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$

Here $\otimes$ is the outer product. Then, the variance–covariance matrix $\boldsymbol\Sigma^f$ of ''f'' is given by

$$\boldsymbol\Sigma^f = \operatorname{E}\left[(\mathbf{f} - \operatorname{E}[\mathbf{f}]) \otimes (\mathbf{f} - \operatorname{E}[\mathbf{f}])\right] = \operatorname{E}\left[\mathbf{A}(\mathbf{x}-\boldsymbol\mu) \otimes \mathbf{A}(\mathbf{x}-\boldsymbol\mu)\right] = \mathbf{A} \operatorname{E}\left[(\mathbf{x}-\boldsymbol\mu) \otimes (\mathbf{x}-\boldsymbol\mu)\right] \mathbf{A}^\mathrm{T} = \mathbf{A} \boldsymbol\Sigma^x \mathbf{A}^\mathrm{T}.$$

In component notation, the equation $\boldsymbol\Sigma^f = \mathbf{A} \boldsymbol\Sigma^x \mathbf{A}^\mathrm{T}$ reads

$$\Sigma^f_{ij} = \sum_k^n \sum_l^n A_{ik} \Sigma^x_{kl} A_{jl}.$$

This is the most general expression for the propagation of error from one set of variables onto another. When the errors on ''x'' are uncorrelated, the general expression simplifies to

$$\Sigma^f_{ij} = \sum_k^n A_{ik} \Sigma^x_k A_{jk},$$

where $\Sigma^x_k = \sigma^2_{x_k}$ is the variance of the ''k''-th element of the ''x'' vector. Note that even though the errors on ''x'' may be uncorrelated, the errors on ''f'' are in general correlated; in other words, even if $\boldsymbol\Sigma^x$ is a diagonal matrix, $\boldsymbol\Sigma^f$ is in general a full matrix.

The general expressions for a scalar-valued function ''f'' are a little simpler (here $\mathbf{a}$ is a row vector):

$$f = \sum_i^n a_i x_i = \mathbf{a x},$$

$$\sigma^2_f = \sum_i^n \sum_j^n a_i \Sigma^x_{ij} a_j = \mathbf{a} \boldsymbol\Sigma^x \mathbf{a}^\mathrm{T}.$$

Each covariance term $\sigma_{ij}$ can be expressed in terms of the correlation coefficient $\rho_{ij}$ by $\sigma_{ij} = \rho_{ij} \sigma_i \sigma_j$, so that an alternative expression for the variance of ''f'' is

$$\sigma^2_f = \sum_i^n a_i^2 \sigma^2_i + \sum_i^n \sum_{j (j \ne i)}^n a_i a_j \rho_{ij} \sigma_i \sigma_j.$$

In the case that the variables in ''x'' are uncorrelated, this simplifies further to

$$\sigma^2_f = \sum_i^n a_i^2 \sigma^2_i.$$

In the simple case of identical coefficients and variances, we find

$$\sigma_f = \sqrt{n}\, |a| \sigma.$$

For the arithmetic mean, $a = 1/n$, the result is the standard error of the mean:

$$\sigma_f = \frac{\sigma}{\sqrt{n}}.$$
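The matrix relation for linear combinations can be checked numerically with a few lines of code. The sketch below uses plain nested lists; the helper name `propagate_linear` and the example numbers are illustrative assumptions.

```python
import math


def propagate_linear(A, cov_x):
    """Return A * cov_x * A^T for matrices given as nested lists."""
    m, n = len(A), len(cov_x)
    return [[sum(A[i][k] * cov_x[k][l] * A[j][l]
                 for k in range(n) for l in range(n))
             for j in range(m)]
            for i in range(m)]


# Two uncorrelated inputs with standard deviations 2 and 3 (diagonal covariance).
cov_x = [[4.0, 0.0],
         [0.0, 9.0]]
A = [[1.0, 1.0],    # f1 = x1 + x2
     [1.0, -1.0]]   # f2 = x1 - x2
cov_f = propagate_linear(A, cov_x)  # off-diagonal entries are non-zero

# Arithmetic mean of n uncorrelated values with a common sigma.
n, sigma = 4, 2.0
a = [[1.0 / n] * n]
cov = [[sigma**2 if i == j else 0.0 for j in range(n)] for i in range(n)]
sem = math.sqrt(propagate_linear(a, cov)[0][0])  # equals sigma / sqrt(n)
```

In the first example the inputs are uncorrelated, yet `cov_f` has non-zero off-diagonal entries, matching the remark that the errors on ''f'' are in general correlated.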


Non-linear combinations

When ''f'' is a set of non-linear combinations of the variables ''x'', an interval propagation could be performed in order to compute intervals which contain all consistent values for the variables. In a probabilistic approach, the function ''f'' must usually be linearised by approximation to a first-order Taylor series expansion, though in some cases exact formulae can be derived that do not depend on the expansion, as is the case for the exact variance of products. The Taylor expansion would be:

$$f_k \approx f^0_k + \sum_i^n \frac{\partial f_k}{\partial x_i} x_i$$

where $\partial f_k/\partial x_i$ denotes the partial derivative of $f_k$ with respect to the ''i''-th variable, evaluated at the mean value of all components of vector ''x''. Or in matrix notation,

$$\mathrm{f} \approx \mathrm{f}^0 + \mathrm{J} \mathrm{x},$$

where $\mathrm{J}$ is the Jacobian matrix. Since $\mathrm{f}^0$ is a constant, it does not contribute to the error on f. Therefore, the propagation of error follows the linear case above, but with the linear coefficients $A_{ki}$ and $A_{kj}$ replaced by the partial derivatives $\frac{\partial f_k}{\partial x_i}$ and $\frac{\partial f_k}{\partial x_j}$. In matrix notation,

$$\mathrm{\Sigma}^\mathrm{f} = \mathrm{J} \mathrm{\Sigma}^\mathrm{x} \mathrm{J}^\top.$$

That is, the Jacobian of the function is used to transform the rows and columns of the variance-covariance matrix of the argument. Note this is equivalent to the matrix expression for the linear case with $\mathrm{J = A}$.
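A minimal sketch of the Jacobian-based propagation follows, using a central-difference Jacobian in place of analytic derivatives. The helper names, the step size `h`, and the polar-to-Cartesian example are assumptions chosen for illustration.

```python
import math


def numerical_jacobian(func, x, h=1e-6):
    """Central-difference Jacobian of func at x; func maps a list to a list."""
    m = len(func(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        f_plus, f_minus = func(x_plus), func(x_minus)
        for k in range(m):
            J[k][i] = (f_plus[k] - f_minus[k]) / (2 * h)
    return J


def propagate(func, mean_x, cov_x):
    """First-order propagation: J * cov_x * J^T, with J evaluated at the mean."""
    J = numerical_jacobian(func, mean_x)
    n, m = len(mean_x), len(J)
    return [[sum(J[i][k] * cov_x[k][l] * J[j][l]
                 for k in range(n) for l in range(n))
             for j in range(m)]
            for i in range(m)]


def polar_to_cartesian(v):
    r, theta = v
    return [r * math.cos(theta), r * math.sin(theta)]


# Hypothetical inputs: r = 2.0 +/- 0.1 and theta = pi/4 +/- 0.02 rad, uncorrelated.
cov_xy = propagate(polar_to_cartesian, [2.0, math.pi / 4],
                   [[0.01, 0.0], [0.0, 0.0004]])
```

Where analytic derivatives are available, they can replace `numerical_jacobian` without changing the rest of the calculation.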


Simplification

Neglecting correlations or assuming independent variables yields a formula commonly used by engineers and experimental scientists to calculate error propagation, the variance formula:

$$s_f = \sqrt{ \left(\frac{\partial f}{\partial x}\right)^2 s_x^2 + \left(\frac{\partial f}{\partial y}\right)^2 s_y^2 + \left(\frac{\partial f}{\partial z}\right)^2 s_z^2 + \cdots}$$

where $s_f$ represents the standard deviation of the function $f$, $s_x$ represents the standard deviation of $x$, $s_y$ represents the standard deviation of $y$, and so forth. This formula is based on the linear characteristics of the gradient of $f$, and therefore it is a good estimate of the standard deviation of $f$ as long as $s_x, s_y, s_z, \ldots$ are small enough. Specifically, the linear approximation of $f$ has to be close to $f$ inside a neighbourhood of radius $s_x, s_y, s_z, \ldots$.
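The variance formula can be sketched for a scalar function as below. The function name `variance_formula`, the finite-difference step, and the cylinder-volume example are assumptions for illustration, not part of the formula itself.

```python
import math


def variance_formula(func, values, sigmas, h=1e-6):
    """Standard deviation of func(*values) assuming uncorrelated inputs.

    Partial derivatives are approximated by central differences.
    """
    total = 0.0
    for i, s in enumerate(sigmas):
        up, down = list(values), list(values)
        up[i] += h
        down[i] -= h
        partial = (func(*up) - func(*down)) / (2 * h)
        total += (partial * s) ** 2
    return math.sqrt(total)


def cylinder_volume(r, height):
    return math.pi * r**2 * height


# Hypothetical measurement: r = 1.00 +/- 0.01 and height = 2.00 +/- 0.02
s_volume = variance_formula(cylinder_volume, [1.0, 2.0], [0.01, 0.02])
```

Here the partial derivatives are $2\pi r h$ and $\pi r^2$, so the result should agree with $\pi\sqrt{0.002}$ to within the finite-difference error.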


Example

Any non-linear differentiable function, $f(a,b)$, of two variables, $a$ and $b$, can be expanded as

$$f \approx f^0 + \frac{\partial f}{\partial a} a + \frac{\partial f}{\partial b} b.$$

If we take the variance on both sides and use the formula for the variance of a linear combination of variables,

$$\operatorname{Var}(aX + bY) = a^2 \operatorname{Var}(X) + b^2 \operatorname{Var}(Y) + 2ab \operatorname{Cov}(X, Y),$$

then we obtain

$$\sigma^2_f \approx \left| \frac{\partial f}{\partial a} \right|^2 \sigma^2_a + \left| \frac{\partial f}{\partial b} \right|^2 \sigma^2_b + 2 \frac{\partial f}{\partial a} \frac{\partial f}{\partial b} \sigma_{ab},$$

where $\sigma_f$ is the standard deviation of the function $f$, $\sigma_a$ is the standard deviation of $a$, $\sigma_b$ is the standard deviation of $b$, and $\sigma_{ab} = \sigma_a \sigma_b \rho_{ab}$ is the covariance between $a$ and $b$.

In the particular case that $f = ab$, we have $\frac{\partial f}{\partial a} = b$ and $\frac{\partial f}{\partial b} = a$. Then

$$\sigma^2_f \approx b^2 \sigma^2_a + a^2 \sigma_b^2 + 2ab\, \sigma_{ab}$$

or

$$\left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_a}{a}\right)^2 + \left(\frac{\sigma_b}{b}\right)^2 + 2 \left(\frac{\sigma_a}{a}\right) \left(\frac{\sigma_b}{b}\right) \rho_{ab},$$

where $\rho_{ab}$ is the correlation between $a$ and $b$. When the variables $a$ and $b$ are uncorrelated, $\rho_{ab} = 0$. Then

$$\left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_a}{a}\right)^2 + \left(\frac{\sigma_b}{b}\right)^2.$$
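The step from the absolute to the relative form for the product $f = ab$ may be easier to follow when written out. The derivation below only divides the first-order result by $f^2 = a^2 b^2$ and substitutes $\sigma_{ab} = \rho_{ab} \sigma_a \sigma_b$; it assumes $a \neq 0$ and $b \neq 0$.

```latex
\begin{align}
\frac{\sigma_f^2}{f^2}
  &\approx \frac{b^2 \sigma_a^2 + a^2 \sigma_b^2 + 2ab\,\sigma_{ab}}{a^2 b^2} \\
  &= \frac{\sigma_a^2}{a^2} + \frac{\sigma_b^2}{b^2} + \frac{2\,\sigma_{ab}}{ab} \\
  &= \left(\frac{\sigma_a}{a}\right)^2 + \left(\frac{\sigma_b}{b}\right)^2
     + 2 \left(\frac{\sigma_a}{a}\right) \left(\frac{\sigma_b}{b}\right) \rho_{ab}.
\end{align}
```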


Caveats and warnings

Error estimates for non-linear functions are biased on account of using a truncated series expansion. The extent of this bias depends on the nature of the function. For example, the bias on the error calculated for log(1+''x'') increases as ''x'' increases, since the expansion to first order in ''x'' is a good approximation only when ''x'' is near zero. For highly non-linear functions, there exist five categories of probabilistic approaches for uncertainty propagation; see Uncertainty quantification for details.
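To give a feel for how quickly the first-order expansion of log(1+''x'') degrades, the short sketch below compares the function with its linearisation around zero, log(1+''x'') ≈ ''x''. It illustrates the approximation error of the expansion only, not a full bias calculation, and the sample points are arbitrary.

```python
import math


def linearisation_error(x):
    """Absolute difference between log(1 + x) and its first-order expansion x."""
    return abs(math.log1p(x) - x)


for x in (0.01, 0.1, 1.0):
    print(f"x = {x:<5} log(1+x) = {math.log1p(x):.5f}  "
          f"first-order = {x:.5f}  |error| = {linearisation_error(x):.5f}")
```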


Reciprocal and shifted reciprocal

In the special case of the inverse or reciprocal $1/B$, where $B = N(0,1)$ follows a standard normal distribution, the resulting distribution is a reciprocal standard normal distribution, and there is no definable variance. However, in the slightly more general case of a shifted reciprocal function $1/(p-B)$ for $B = N(\mu,\sigma)$ following a general normal distribution, the mean and variance statistics do exist in a principal value sense, if the difference between the pole $p$ and the mean $\mu$ is real-valued.


Ratios

Ratios are also problematic; normal approximations exist under certain conditions.


Example formulae

This table shows the variances and standard deviations of simple functions of the real variables $A, B$ with standard deviations $\sigma_A, \sigma_B$, covariance $\sigma_{AB} = \rho_{AB} \sigma_A \sigma_B$, and correlation $\rho_{AB}$. The real-valued coefficients $a$ and $b$ are assumed exactly known (deterministic), i.e., $\sigma_a = \sigma_b = 0$. In the right-hand columns of the table, $A$ and $B$ are expectation values, and $f$ is the value of the function calculated at those values.

For uncorrelated variables ($\rho_{AB} = 0$, $\sigma_{AB} = 0$), expressions for more complicated functions can be derived by combining simpler functions. For example, repeated multiplication, assuming no correlation, gives

$$f = ABC; \qquad \left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + \left(\frac{\sigma_C}{C}\right)^2.$$

For the case $f = AB$ we also have Goodman's expression for the exact variance: for the uncorrelated case it is

$$\operatorname{V}(XY) = \operatorname{E}(X)^2 \operatorname{V}(Y) + \operatorname{E}(Y)^2 \operatorname{V}(X) + \operatorname{E}\left[\left(X - \operatorname{E}(X)\right)^2 \left(Y - \operatorname{E}(Y)\right)^2\right]$$

and therefore we have

$$\sigma_f^2 = A^2\sigma_B^2 + B^2\sigma_A^2 + \sigma_A^2\sigma_B^2.$$
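The difference between the first-order estimate and the exact variance of a product can be seen in a small numerical comparison. The function names and example values below are assumptions; the exact form follows the final expression in the text, which takes the last expectation term to equal $\sigma_A^2 \sigma_B^2$ (as holds for independent variables).

```python
def product_variance_first_order(A, B, sigma_A, sigma_B):
    """First-order (linearised) variance of f = A * B without correlation."""
    return B**2 * sigma_A**2 + A**2 * sigma_B**2


def product_variance_exact(A, B, sigma_A, sigma_B):
    """Exact variance of f = A * B, assuming A and B are independent."""
    return product_variance_first_order(A, B, sigma_A, sigma_B) + sigma_A**2 * sigma_B**2


# With large relative uncertainties the extra term is noticeable.
approx = product_variance_first_order(1.0, 1.0, 0.5, 0.5)
exact = product_variance_exact(1.0, 1.0, 0.5, 0.5)
```

With these inputs the first-order value is 0.5 and the exact value is 0.5625; the gap shrinks as the relative uncertainties become small.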


Effect of correlation on differences

If ''A'' and ''B'' are uncorrelated, their difference ''A'' − ''B'' will have more variance than either of them. An increasing positive correlation ($\rho_{AB} \to 1$) will decrease the variance of the difference, converging to zero variance for perfectly correlated variables with the same variance. On the other hand, a negative correlation ($\rho_{AB} \to -1$) will further increase the variance of the difference, compared to the uncorrelated case.

For example, the self-subtraction ''f'' = ''A'' − ''A'' has zero variance $\sigma_f^2 = 0$ only if the variate is perfectly autocorrelated ($\rho_A = 1$). If ''A'' is uncorrelated, $\rho_A = 0$, then the output variance is twice the input variance, $\sigma_f^2 = 2\sigma^2_A$. And if ''A'' is perfectly anticorrelated, $\rho_A = -1$, then the input variance is quadrupled in the output, $\sigma_f^2 = 4 \sigma^2_A$ (notice $1 - \rho_A = 2$ for ''f'' = ''aA'' − ''aA'' in the table above).
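The three cases discussed above follow from the variance of a difference, $\sigma_f^2 = \sigma_A^2 + \sigma_B^2 - 2\rho_{AB}\sigma_A\sigma_B$. A minimal sketch, with a hypothetical helper name and an arbitrary common standard deviation of 3:

```python
def difference_variance(sigma_A, sigma_B, rho):
    """Variance of f = A - B for standard deviations sigma_A, sigma_B and correlation rho."""
    return sigma_A**2 + sigma_B**2 - 2 * rho * sigma_A * sigma_B


sigma = 3.0
perfectly_correlated = difference_variance(sigma, sigma, 1.0)   # zero variance
uncorrelated = difference_variance(sigma, sigma, 0.0)           # 2 * sigma**2
anticorrelated = difference_variance(sigma, sigma, -1.0)        # 4 * sigma**2
```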


Example calculations


Inverse tangent function

We can calculate the uncertainty propagation for the inverse tangent function as an example of using partial derivatives to propagate error. Define

$$f(x) = \arctan(x),$$

where $\Delta_x$ is the absolute uncertainty on our measurement of $x$. The derivative of $f(x)$ with respect to $x$ is

$$\frac{df}{dx} = \frac{1}{1+x^2}.$$

Therefore, our propagated uncertainty is

$$\Delta_f \approx \frac{\Delta_x}{1+x^2},$$

where $\Delta_f$ is the absolute propagated uncertainty.
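A quick numerical check of the inverse tangent result is sketched below. The function name and the sample values ($x = 1$, $\Delta_x = 0.1$) are assumptions; the comparison against half the spread of arctan over $x \pm \Delta_x$ is only a sanity check of the linear approximation.

```python
import math


def arctan_uncertainty(x, delta_x):
    """First-order propagated uncertainty of arctan(x) for input uncertainty delta_x."""
    return delta_x / (1 + x**2)


x, delta_x = 1.0, 0.1
delta_f = arctan_uncertainty(x, delta_x)

# Half the spread of arctan over [x - delta_x, x + delta_x], for comparison.
half_spread = (math.atan(x + delta_x) - math.atan(x - delta_x)) / 2
```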


Resistance measurement

A practical application is an experiment in which one measures current, $I$, and voltage, $V$, on a resistor in order to determine the resistance, $R$, using Ohm's law, $R = V/I$. Given the measured variables with uncertainties, $I \pm \sigma_I$ and $V \pm \sigma_V$, and neglecting their possible correlation, the uncertainty in the computed quantity, $\sigma_R$, is:

$$\sigma_R \approx \sqrt{\sigma_V^2 \left(\frac{1}{I}\right)^2 + \sigma_I^2 \left(\frac{-V}{I^2}\right)^2} = R \sqrt{\left(\frac{\sigma_V}{V}\right)^2 + \left(\frac{\sigma_I}{I}\right)^2}.$$
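The resistance formula can be evaluated directly. In the sketch below the function name and the measured values are hypothetical; both forms of the expression are computed to show that they agree.

```python
import math


def resistance_with_uncertainty(V, sigma_V, I, sigma_I):
    """Return (R, sigma_R) for R = V / I, neglecting correlation between V and I."""
    R = V / I
    # Partial derivatives: dR/dV = 1/I and dR/dI = -V/I**2
    sigma_R = math.sqrt((sigma_V / I) ** 2 + (sigma_I * V / I**2) ** 2)
    return R, sigma_R


# Hypothetical readings: V = 10.0 +/- 0.1 volts, I = 2.00 +/- 0.05 amperes
R, sigma_R = resistance_with_uncertainty(10.0, 0.1, 2.0, 0.05)

# Equivalent relative form: R * sqrt((sigma_V / V)^2 + (sigma_I / I)^2)
sigma_R_relative = R * math.sqrt((0.1 / 10.0) ** 2 + (0.05 / 2.0) ** 2)
print(f"R = {R:.2f} +/- {sigma_R:.2f} ohm")
```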


See also

* Accuracy and precision
* Automatic differentiation
* Bienaymé's identity
* Delta method
* Dilution of precision (navigation)
* Errors and residuals in statistics
* Experimental uncertainty analysis
* Interval finite element
* Measurement uncertainty
* Numerical stability
* Probability bounds analysis
* Uncertainty quantification
* Random-fuzzy variable




External links


* A detailed discussion of measurements and the propagation of uncertainty, explaining the benefits of using error propagation formulas and Monte Carlo simulations instead of simple significance arithmetic
* GUM, Guide to the Expression of Uncertainty in Measurement
* EPFL An Introduction to Error Propagation: Derivation, Meaning and Examples of Cy = Fx Cx Fx'
* uncertainties package, a program/library for transparently performing calculations with uncertainties (and error correlations)
* soerp package, a Python program/library for transparently performing *second-order* calculations with uncertainties (and error correlations)
* Uncertainty Calculator: Propagate uncertainty for any expression