In applied statistics, total least squares is a type of errors-in-variables regression, a least squares data modeling technique in which observational errors on both dependent and independent variables are taken into account. It is a generalization of Deming regression and also of orthogonal regression, and can be applied to both linear and non-linear models.

The total least squares approximation of the data is generically equivalent to the best, in the Frobenius norm, low-rank approximation of the data matrix.
Linear model
Background
In the least squares method of data modeling, the objective function, ''S'',
:<math>S = \mathbf{r}^\mathsf{T} W \mathbf{r}</math>
is minimized, where ''r'' is the vector of residuals and ''W'' is a weighting matrix. In linear least squares the model contains equations which are linear in the parameters appearing in the parameter vector β, so the residuals are given by
:<math>\mathbf{r} = \mathbf{y} - X\boldsymbol\beta.</math>
There are ''m'' observations in y and ''n'' parameters in β with ''m'' > ''n''. ''X'' is an ''m''×''n'' matrix whose elements are either constants or functions of the independent variables, x. The weight matrix ''W'' is, ideally, the inverse of the variance-covariance matrix <math>\mathbf{M}_y</math> of the observations y. The independent variables are assumed to be error-free. The parameter estimates are found by setting the gradient equations to zero, which results in the normal equations
[An alternative form is <math>X^\mathsf{T} W X\, \Delta\boldsymbol\beta = X^\mathsf{T} W\, \Delta\mathbf{y}</math>, where <math>\Delta\boldsymbol\beta</math> is the parameter shift from some starting estimate of <math>\boldsymbol\beta</math> and <math>\Delta\mathbf{y}</math> is the difference between y and the value calculated using the starting value of <math>\boldsymbol\beta</math>.]
:<math>X^\mathsf{T} W X \boldsymbol\beta = X^\mathsf{T} W \mathbf{y}.</math>
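For concreteness, here is a minimal numerical sketch of this weighted least-squares step. It is not part of the original article; the data values, weights, and variable names are illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np

# Illustrative data: m = 5 observations, n = 2 parameters (intercept and slope).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])
X = np.column_stack([np.ones_like(x), x])        # m-by-n design matrix
sigma_y = np.array([0.1, 0.1, 0.2, 0.1, 0.1])    # assumed standard deviations of y
W = np.diag(1.0 / sigma_y**2)                    # inverse variance-covariance matrix of y

# Normal equations: (X^T W X) beta = X^T W y
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
r = y - X @ beta                                 # residual vector
S = r @ W @ r                                    # value of the objective function
print(beta, S)
</syntaxhighlight>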
Allowing observation errors in all variables
Now, suppose that both x and y are observed subject to error, with variance-covariance matrices <math>\mathbf{M}_x</math> and <math>\mathbf{M}_y</math> respectively. In this case the objective function can be written as
:<math>S = \mathbf{r}_x^\mathsf{T} \mathbf{M}_x^{-1} \mathbf{r}_x + \mathbf{r}_y^\mathsf{T} \mathbf{M}_y^{-1} \mathbf{r}_y,</math>
where <math>\mathbf{r}_x</math> and <math>\mathbf{r}_y</math> are the residuals in x and y respectively. Clearly these residuals cannot be independent of each other; they must be constrained by some kind of relationship. Writing the model function as <math>f(\mathbf{r}_x, \mathbf{r}_y, \boldsymbol\beta)</math>, the constraints are expressed by ''m'' condition equations:
:<math>\mathbf{F} = \Delta\mathbf{y} - \frac{\partial f}{\partial \mathbf{r}_x}\mathbf{r}_x - \frac{\partial f}{\partial \mathbf{r}_y}\mathbf{r}_y - X\,\Delta\boldsymbol\beta = \mathbf{0}.</math>
Thus, the problem is to minimize the objective function subject to the ''m'' constraints. It is solved by the use of Lagrange multipliers. After some algebraic manipulations, the result is obtained:
:<math>X^\mathsf{T} \mathbf{M}^{-1} X\, \Delta\boldsymbol\beta = X^\mathsf{T} \mathbf{M}^{-1}\, \Delta\mathbf{y},</math>
or alternatively <math>X^\mathsf{T} \mathbf{M}^{-1} X \boldsymbol\beta = X^\mathsf{T} \mathbf{M}^{-1} \mathbf{y}</math>, where '''M''' is the variance-covariance matrix relative to both independent and dependent variables:
:<math>\mathbf{M} = \mathbf{K}_x \mathbf{M}_x \mathbf{K}_x^\mathsf{T} + \mathbf{K}_y \mathbf{M}_y \mathbf{K}_y^\mathsf{T}, \qquad \mathbf{K}_x = -\frac{\partial \mathbf{F}}{\partial \mathbf{r}_x}, \quad \mathbf{K}_y = -\frac{\partial \mathbf{F}}{\partial \mathbf{r}_y}.</math>
Example
When the data errors are uncorrelated, all matrices '''M''' and ''W'' are diagonal. Then, take the example of straight line fitting,
:<math>f(x_i, \boldsymbol\beta) = \alpha + \beta x_i;</math>
in this case
:<math>M_{ii} = \sigma^2_{y,i} + \beta^2 \sigma^2_{x,i},</math>
showing how the variance at the ''i''th point is determined by the variances of both independent and dependent variables and by the model being used to fit the data. The expression may be generalized by noting that the parameter β is the slope of the line,
:<math>M_{ii} = \sigma^2_{y,i} + \left(\frac{dy}{dx}\right)_i^2 \sigma^2_{x,i}.</math>
An expression of this type is used in fitting pH titration data where a small error on ''x'' translates to a large error on ''y'' when the slope is large.
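The variance expression above suggests a simple iterative fit: compute the weights from the current slope estimate, solve the weighted normal equations, and repeat until the parameters stop changing. The sketch below illustrates that idea for straight-line fitting; it is not a procedure taken from the article, and the data, uncertainties, and convergence tolerance are assumptions.
<syntaxhighlight lang="python">
import numpy as np

# Illustrative data with assumed standard deviations on both x and y.
x = np.array([0.0, 0.9, 2.1, 3.0, 4.2])
y = np.array([1.0, 2.1, 2.9, 4.2, 4.9])
sx = np.full_like(x, 0.1)                    # standard deviations of x
sy = np.full_like(y, 0.2)                    # standard deviations of y

X = np.column_stack([np.ones_like(x), x])    # design matrix for alpha + beta*x
beta = np.array([0.0, 1.0])                  # starting estimate (intercept, slope)

for _ in range(50):
    # Variance at each point: sigma_y^2 + slope^2 * sigma_x^2
    M_diag = sy**2 + beta[1]**2 * sx**2
    W = np.diag(1.0 / M_diag)
    beta_new = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    if np.allclose(beta_new, beta, rtol=1e-10, atol=1e-12):
        beta = beta_new
        break
    beta = beta_new

print("intercept, slope:", beta)
</syntaxhighlight>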
Algebraic point of view
As was shown in 1980 by Golub and Van Loan, the TLS problem does not have a solution in general. The following considers the simple case where a unique solution exists without making any particular assumptions.
The computation of the TLS using singular value decomposition (SVD) is described in standard texts. We can solve the equation
:<math>XB \approx Y</math>
for ''B'', where ''X'' is ''m''-by-''n'' and ''Y'' is ''m''-by-''k''.
[The notation ''XB'' ≈ ''Y'' is used here to reflect the notation used in the earlier part of the article. In the computational literature the problem has been more commonly presented as ''AX'' ≈ ''B'', i.e. with the letter ''X'' used for the ''n''-by-''k'' matrix of unknown regression coefficients.]
That is, we seek to find the ''B'' that minimizes the error matrices ''E'' and ''F'' for ''X'' and ''Y'' respectively; that is,
:<math>\underset{B,\,E,\,F}{\operatorname{argmin}} \|[E\; F]\|_F, \qquad (X+E)B = Y+F,</math>
where <math>[E\; F]</math> is the augmented matrix with ''E'' and ''F'' side by side and <math>\|\cdot\|_F</math> is the Frobenius norm.
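As a hedged illustration of this SVD-based computation, covering only the simple case in which the relevant block of the right singular vectors is nonsingular and using made-up test data, one possible numpy sketch is:
<syntaxhighlight lang="python">
import numpy as np

def tls(X, Y):
    """Total least squares solution of X B ~ Y via the SVD of the
    augmented matrix [X Y] (simple case: unique solution, V22 nonsingular)."""
    m, n = X.shape
    Y = Y.reshape(m, -1)                 # allow a 1-D right-hand side
    k = Y.shape[1]
    Z = np.hstack([X, Y])                # m-by-(n+k) augmented data matrix
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt.T
    V12 = V[:n, n:]                      # n-by-k block of the right singular vectors
    V22 = V[n:, n:]                      # k-by-k block, assumed nonsingular
    return -V12 @ np.linalg.inv(V22)     # B = -V12 V22^{-1}

# Illustrative use: both the regressor and the response carry noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 50)
X = (t + 0.05 * rng.standard_normal(50)).reshape(-1, 1)    # noisy regressor
Y = 2.0 * t + 0.05 * rng.standard_normal(50)               # noisy response
print(tls(X, Y))                                           # close to [[2.0]]
</syntaxhighlight>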