In statistics, a confidence region is a multi-dimensional generalization of a confidence interval. For a bivariate normal distribution, it is an ellipse, also known as the error ellipse. More generally, it is a set of points in an ''n''-dimensional space, often represented as a hyperellipsoid around a point which is an estimated solution to a problem, although other shapes can occur.
Interpretation
The confidence region is calculated in such a way that if a set of measurements were repeated many times and a confidence region calculated in the same way on each set of measurements, then a certain percentage of the time (e.g. 95%) the confidence region would include the point representing the "true" values of the set of variables being estimated. However, unless certain assumptions about
prior probabilities are made, it does not mean, when one confidence region has been calculated, that there is a 95% probability that the "true" values lie inside the region, since we do not assume any particular probability distribution of the "true" values and we may or may not have other information about where they are likely to lie.
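This repeated-sampling interpretation can be illustrated by simulation. The following is a minimal sketch (all numbers are hypothetical), assuming a bivariate normal mean with known covariance, for which the 95% region around the sample mean is an exact ellipse whose squared radius is the chi-squared quantile −2 ln α; the region should capture the true mean in close to 95% of repeated experiments:

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# True parameters of a bivariate normal (covariance assumed known).
mu = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

n = 30          # observations per experiment
trials = 4000   # number of repeated experiments
alpha = 0.05

# For 2 degrees of freedom the chi-squared quantile is exact: -2 ln(alpha).
q = -2.0 * math.log(alpha)

prec = np.linalg.inv(sigma / n)  # inverse covariance of the sample mean

hits = 0
for _ in range(trials):
    sample = rng.multivariate_normal(mu, sigma, size=n)
    xbar = sample.mean(axis=0)
    d = xbar - mu
    # Is the true mean inside this experiment's 95% error ellipse?
    if d @ prec @ d <= q:
        hits += 1

coverage = hits / trials
print(f"empirical coverage: {coverage:.3f}")
```

The empirical coverage approaches the nominal 95% as the number of repeated experiments grows.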
The case of independent, identically normally-distributed errors
Suppose we have found a solution <math>\boldsymbol{\hat\beta}</math> to the following overdetermined problem:
:<math>\mathbf{Y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\varepsilon</math>
where Y is an ''n''-dimensional column vector containing observed values of the dependent variable, X is an ''n''-by-''p'' matrix of observed values of independent variables (which can represent a physical model) which is assumed to be known exactly, <math>\boldsymbol\beta</math> is a column vector containing the ''p'' parameters which are to be estimated, and <math>\boldsymbol\varepsilon</math> is an ''n''-dimensional column vector of errors which are assumed to be independently distributed with normal distributions with zero mean and each having the same unknown variance <math>\sigma^2</math>.
A joint 100(1 − ''α'')% confidence region for the elements of <math>\boldsymbol\beta</math> is represented by the set of values of the vector b which satisfy the following inequality:
:<math>(\mathbf{b} - \boldsymbol{\hat\beta})^\mathsf{T} \mathbf{X}^\mathsf{T}\mathbf{X} (\mathbf{b} - \boldsymbol{\hat\beta}) \le p s^2 F_{1-\alpha}(p, n-p),</math>
where the variable b represents any point in the confidence region, ''p'' is the number of parameters, i.e. the number of elements of the vector <math>\boldsymbol\beta</math>, <math>\boldsymbol{\hat\beta}</math> is the vector of estimated parameters, and ''s''<sup>2</sup> is the reduced chi-squared, an unbiased estimate of <math>\sigma^2</math> equal to
:<math>s^2 = \frac{(\mathbf{Y} - \mathbf{X}\boldsymbol{\hat\beta})^\mathsf{T}(\mathbf{Y} - \mathbf{X}\boldsymbol{\hat\beta})}{n - p}.</math>
Further, ''F'' is the quantile function of the ''F''-distribution, with ''p'' and ''n'' − ''p'' degrees of freedom, <math>\alpha</math> is the statistical significance level, and the symbol <math>\mathbf{X}^\mathsf{T}</math> means the transpose of <math>\mathbf{X}</math>.
The expression can be rewritten as:
:<math>(\mathbf{b} - \boldsymbol{\hat\beta})^\mathsf{T} \mathbf{C}^{-1} (\mathbf{b} - \boldsymbol{\hat\beta}) \le p F_{1-\alpha}(p, n-p),</math>
where <math>\mathbf{C} = s^2(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}</math> is the least-squares scaled covariance matrix of <math>\boldsymbol{\hat\beta}</math>.
The above inequality defines an ellipsoidal region in the ''p''-dimensional Cartesian parameter space '''R'''<sup>''p''</sup>. The centre of the ellipsoid is at the estimate <math>\boldsymbol{\hat\beta}</math>. According to Press et al., it is easier to plot the ellipsoid after doing singular value decomposition of X. The lengths of the axes of the ellipsoid are proportional to the reciprocals of the values on the diagonals of the diagonal matrix of singular values, and the directions of these axes are given by the rows of the 3rd matrix of the decomposition.
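This recipe can be sketched numerically. In the following example (all data and parameter values are hypothetical), a straight line is fitted by least squares, the SVD of the design matrix is computed, and the semi-axes of the 95% confidence ellipse are recovered from the singular values and the rows of the third SVD factor; for ''p'' = 2 numerator degrees of freedom the ''F'' quantile has a closed form, so no statistics library is needed:

```python
import math
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical straight-line data: y = b0 + b1*x + noise.
n, p = 20, 2
x = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 2.0])
Y = X @ beta_true + rng.normal(scale=0.3, size=n)

# Least-squares estimate and the reduced chi-squared s^2.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta_hat
s2 = resid @ resid / (n - p)

# For 2 numerator degrees of freedom the F quantile is closed-form:
# F_{1-alpha}(2, m) = (m/2) * (alpha**(-2/m) - 1).
alpha = 0.05
m = n - p
F = (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)

# SVD of the design matrix, as suggested by Press et al.
U, w, Vt = np.linalg.svd(X, full_matrices=False)

# Semi-axis k of the ellipsoid: direction Vt[k], length sqrt(p*s2*F)/w[k],
# i.e. proportional to the reciprocal of the k-th singular value.
radius2 = p * s2 * F
axes = [math.sqrt(radius2) / w[k] * Vt[k] for k in range(p)]

# Consistency check: each axis endpoint b = beta_hat + axis lies on the
# boundary (b - beta_hat)^T X^T X (b - beta_hat) = p * s2 * F.
for a in axes:
    d = (beta_hat + a) - beta_hat
    print(d @ (X.T @ X) @ d, radius2)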
Weighted and generalised least squares
Now consider the more general case where some distinct elements of <math>\boldsymbol\varepsilon</math> have known nonzero covariance (in other words, the errors in the observations are not independently distributed), and/or the standard deviations of the errors are not all equal. Suppose the covariance matrix of <math>\boldsymbol\varepsilon</math> is <math>\sigma^2\mathbf{V}</math>, where V is an ''n''-by-''n'' nonsingular matrix which was equal to the identity matrix in the more specific case handled in the previous section, but here is allowed to have nonzero off-diagonal elements representing the covariance of pairs of individual observations, as well as not necessarily having all the diagonal elements equal.
It is possible to find a nonsingular symmetric matrix P such that
:<math>\mathbf{P}^\mathsf{T}\mathbf{P} = \mathbf{P}\mathbf{P} = \mathbf{V}.</math>
In effect, P is a square root of the covariance matrix V.
The least-squares problem
:<math>\mathbf{Y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\varepsilon</math>
can then be transformed by left-multiplying each term by the inverse of P, forming the new problem formulation
:<math>\mathbf{Z} = \mathbf{Q}\boldsymbol\beta + \mathbf{f},</math>
where
:<math>\mathbf{Z} = \mathbf{P}^{-1}\mathbf{Y},</math>
:<math>\mathbf{Q} = \mathbf{P}^{-1}\mathbf{X},</math>
and
:<math>\mathbf{f} = \mathbf{P}^{-1}\boldsymbol\varepsilon.</math>
A joint confidence region for the parameters, i.e. for the elements of <math>\boldsymbol\beta</math>, is then bounded by the ellipsoid given by:
:<math>(\mathbf{b} - \boldsymbol{\hat\beta})^\mathsf{T} \mathbf{Q}^\mathsf{T}\mathbf{Q} (\mathbf{b} - \boldsymbol{\hat\beta}) = \frac{p}{n-p}\left(\mathbf{Z}^\mathsf{T}\mathbf{Z} - \boldsymbol{\hat\beta}^\mathsf{T}\mathbf{Q}^\mathsf{T}\mathbf{Z}\right) F_{1-\alpha}(p, n-p).</math>
Here ''F'' represents the percentage point of the ''F''-distribution, and the quantities ''p'' and ''n'' − ''p'' are the degrees of freedom which are the parameters of this distribution.
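The whitening transformation above can be sketched numerically. In this minimal example (the covariance matrix V is a hypothetical AR(1)-style correlation matrix, and the symmetric square root P is taken from the eigendecomposition of V), ordinary least squares on the transformed problem agrees with the direct generalised least-squares formula <math>\boldsymbol{\hat\beta} = (\mathbf{X}^\mathsf{T}\mathbf{V}^{-1}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{V}^{-1}\mathbf{Y}</math>:

```python
import numpy as np

rng = np.random.default_rng(2)

n, p = 25, 2
x = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, -0.5])

# Hypothetical error covariance V: AR(1)-style correlation between errors.
rho = 0.5
V = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

# Symmetric square root P with P P = V, via the eigendecomposition of V.
evals, evecs = np.linalg.eigh(V)
P = evecs @ np.diag(np.sqrt(evals)) @ evecs.T

# Draw correlated errors and observations.
eps = P @ rng.normal(size=n)
Y = X @ beta_true + eps

# Whiten: Z = P^{-1} Y, Q = P^{-1} X, then ordinary least squares on (Q, Z).
P_inv = np.linalg.inv(P)
Z = P_inv @ Y
Q = P_inv @ X
beta_whitened, *_ = np.linalg.lstsq(Q, Z, rcond=None)

# Direct generalised least squares for comparison.
V_inv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ Y)

print(beta_whitened, beta_gls)  # the two estimates agree
```

The agreement holds because minimising <math>\|\mathbf{P}^{-1}(\mathbf{Y} - \mathbf{X}\boldsymbol\beta)\|^2</math> is the same as minimising <math>(\mathbf{Y} - \mathbf{X}\boldsymbol\beta)^\mathsf{T}\mathbf{V}^{-1}(\mathbf{Y} - \mathbf{X}\boldsymbol\beta)</math>.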
Nonlinear problems
Confidence regions can be defined for any probability distribution. The experimenter can choose the significance level and the shape of the region, and then the size of the region is determined by the probability distribution. A natural choice is to use as a boundary a set of points with constant <math>\chi^2</math> (chi-squared) values.
One approach is to use a
linear approximation to the nonlinear model, which may be a close approximation in the vicinity of the solution, and then apply the analysis for a linear problem to find an approximate confidence region. This may be a reasonable approach if the confidence region is not very large and the second derivatives of the model are also not very large.
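A minimal sketch of the linearization approach, assuming a hypothetical model ''y'' = ''a''·e<sup>''bx''</sup>: after fitting by Gauss–Newton iteration, the Jacobian of the model at the solution plays the role of the design matrix X, so the linear-problem ellipsoid with <math>\mathbf{J}^\mathsf{T}\mathbf{J}</math> in place of <math>\mathbf{X}^\mathsf{T}\mathbf{X}</math> gives an approximate confidence region:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical nonlinear model y = a * exp(b * x) with noisy data.
n = 40
x = np.linspace(0.0, 1.0, n)
a_true, b_true = 2.0, -1.5
y = a_true * np.exp(b_true * x) + rng.normal(scale=0.02, size=n)

def jacobian(a, b):
    e = np.exp(b * x)
    return np.column_stack([e, a * x * e])  # partial derivatives d/da, d/db

# Gauss-Newton iterations from a rough starting point.
theta = np.array([1.5, -1.0])
for _ in range(50):
    a, b = theta
    r = y - a * np.exp(b * x)
    J = jacobian(a, b)
    theta = theta + np.linalg.solve(J.T @ J, J.T @ r)

# Linear approximation at the solution: J stands in for X, so the
# approximate region is (t - theta)^T J^T J (t - theta) <= p*s2*F.
a, b = theta
r = y - a * np.exp(b * x)
p = 2
s2 = r @ r / (n - p)
J = jacobian(a, b)
JtJ = J.T @ J
print(theta, s2)
```

As the text notes, this is only trustworthy when the region is small enough that the model's curvature is negligible over it.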
Bootstrapping approaches can also be used.
Hutton TJ, Buxton BF, Hammond P, Potts HWW (2003). Estimating average growth trajectories in shape-space using kernel smoothing. ''IEEE Transactions on Medical Imaging'', 22(6):747–53.
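A residual-bootstrap sketch for a linear model (all data below are simulated, and the elliptical region based on the bootstrap covariance is one common choice of shape, not the only one): refitting on resampled residuals yields a cloud of estimates, and a 95% region can be read off from the empirical distribution of their Mahalanobis distances from the original fit.

```python
import numpy as np

rng = np.random.default_rng(4)

n, p = 30, 2
x = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
Y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.2, size=n)

beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta_hat

# Residual bootstrap: refit after resampling residuals with replacement.
B = 2000
boot = np.empty((B, p))
for i in range(B):
    Y_star = X @ beta_hat + rng.choice(resid, size=n, replace=True)
    boot[i], *_ = np.linalg.lstsq(X, Y_star, rcond=None)

# 95% region: bootstrap estimates whose Mahalanobis distance from
# beta_hat (in the bootstrap covariance metric) is below the empirical
# 95th percentile of those distances.
cov = np.cov(boot, rowvar=False)
prec = np.linalg.inv(cov)
d2 = np.einsum('ij,jk,ik->i', boot - beta_hat, prec, boot - beta_hat)
threshold = np.quantile(d2, 0.95)
inside = np.mean(d2 <= threshold)
print(threshold, inside)
```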
See also
* Circular error probable
* Linear regression
* Confidence band
* Credible region
Notes
References
*{{cite book , title=Numerical Recipes in C: The Art of Scientific Computing , url=https://archive.org/details/numericalrecipes00pres_0 , url-access=registration , last=Press , first=W.H. , author2=S.A. Teukolsky , author3=W.T. Vetterling , author4=B.P. Flannery , year=1992 , orig-year=1988 , publisher=Cambridge University Press , location=Cambridge UK , edition=2nd , isbn=978-0-521-43720-2}}
Estimation theory