Residual sum of squares

In statistics, the residual sum of squares (RSS), also known as the sum of squared residuals (SSR) or the sum of squared estimate of errors (SSE), is the sum of the squares of residuals (deviations of predicted values from actual empirical values of data). It is a measure of the discrepancy between the data and an estimation model, such as a linear regression. A small RSS indicates a tight fit of the model to the data. It is used as an optimality criterion in parameter selection and model selection.

In general, total sum of squares = explained sum of squares + residual sum of squares. For a proof of this in the multivariate ordinary least squares (OLS) case, see partitioning in the general OLS model.
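As a quick numerical check of this decomposition, the following minimal Python sketch (the data values are invented for illustration and do not come from the source) fits a line by ordinary least squares and confirms that the total sum of squares equals the explained plus the residual sum of squares:

```python
import numpy as np

# Hypothetical data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit y = alpha + beta*x by ordinary least squares.
beta, alpha = np.polyfit(x, y, 1)      # polyfit returns [slope, intercept]
y_hat = alpha + beta * x

tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
ess = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
rss = np.sum((y - y_hat) ** 2)         # residual sum of squares

print(np.isclose(tss, ess + rss))      # True: TSS = ESS + RSS
```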


One explanatory variable

In a model with a single explanatory variable, RSS is given by:

:\operatorname{RSS} = \sum_{i=1}^n (y_i - f(x_i))^2

where y_i is the ''i''th value of the variable to be predicted, x_i is the ''i''th value of the explanatory variable, and f(x_i) is the predicted value of y_i (also termed \hat{y}_i). In a standard linear simple regression model, y_i = \alpha + \beta x_i + \varepsilon_i, where \alpha and \beta are coefficients, ''y'' and ''x'' are the regressand and the regressor, respectively, and \varepsilon is the error term. The sum of squares of residuals is the sum of squares of \widehat{\varepsilon}_i; that is,

:\operatorname{RSS} = \sum_{i=1}^n (\widehat{\varepsilon}_i)^2 = \sum_{i=1}^n \left(y_i - (\widehat{\alpha} + \widehat{\beta} x_i)\right)^2

where \widehat{\alpha} is the estimated value of the constant term \alpha and \widehat{\beta} is the estimated value of the slope coefficient \beta.
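The following short sketch (again with made-up data) computes \widehat{\alpha} and \widehat{\beta} from the usual closed-form least-squares formulas and then forms the RSS exactly as in the expression above:

```python
import numpy as np

# Hypothetical data for a simple regression y = alpha + beta*x + eps.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 2.9, 5.1, 7.2, 8.8])

# Closed-form least-squares estimates of slope and intercept.
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

# Residuals eps_hat_i = y_i - (alpha_hat + beta_hat * x_i), then RSS.
residuals = y - (alpha_hat + beta_hat * x)
rss = np.sum(residuals ** 2)
print(rss)
```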


Matrix expression for the OLS residual sum of squares

The general regression model with ''n'' observations and ''k'' explanators, the first of which is a constant unit vector whose coefficient is the regression intercept, is

:y = X \beta + e

where ''y'' is an ''n'' × 1 vector of dependent variable observations, each column of the ''n'' × ''k'' matrix ''X'' is a vector of observations on one of the ''k'' explanators, \beta is a ''k'' × 1 vector of true coefficients, and ''e'' is an ''n'' × 1 vector of the true underlying errors. The ordinary least squares estimator for \beta solves the normal equations:

:X^\operatorname{T} X \hat\beta = X^\operatorname{T} y \iff \hat\beta = (X^\operatorname{T} X)^{-1} X^\operatorname{T} y.

The residual vector is \hat{e} = y - X \hat\beta = y - X (X^\operatorname{T} X)^{-1} X^\operatorname{T} y, so the residual sum of squares is

:\operatorname{RSS} = \hat{e}^\operatorname{T} \hat{e} = \|\hat{e}\|^2,

the square of the norm of the residuals. In full:

:\operatorname{RSS} = y^\operatorname{T} y - y^\operatorname{T} X (X^\operatorname{T} X)^{-1} X^\operatorname{T} y = y^\operatorname{T} \left[I - X (X^\operatorname{T} X)^{-1} X^\operatorname{T}\right] y = y^\operatorname{T} \left[I - H\right] y,

where H = X (X^\operatorname{T} X)^{-1} X^\operatorname{T} is the hat matrix, or the projection matrix in linear regression.
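As an illustration, the sketch below (synthetic data; the dimensions and coefficient values are assumptions made for the example) builds such a design matrix, solves the normal equations for \hat\beta, and verifies that \hat{e}^\operatorname{T} \hat{e} agrees with y^\operatorname{T}(I - H)y:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3

# Design matrix X: a leading column of ones (the constant unit vector)
# plus k - 1 columns of synthetic explanators.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, 2.0, -0.5])             # "true" coefficients
y = X @ beta + rng.normal(scale=0.3, size=n)  # add errors e

# Solve the normal equations X^T X beta_hat = X^T y (no explicit inverse).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# RSS as the squared norm of the residual vector ...
e_hat = y - X @ beta_hat
rss_direct = e_hat @ e_hat

# ... and via the hat matrix H = X (X^T X)^{-1} X^T.
H = X @ np.linalg.solve(X.T @ X, X.T)
rss_hat = y @ (np.eye(n) - H) @ y

print(np.isclose(rss_direct, rss_hat))        # True
```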


Relation with Pearson's product-moment correlation

The least-squares regression line is given by

:y = ax + b,

where b = \bar{y} - a\bar{x} and a = \frac{S_{xy}}{S_{xx}}, with

:S_{xy} = \sum_{i=1}^n (\bar{x} - x_i)(\bar{y} - y_i) \quad \text{and} \quad S_{xx} = \sum_{i=1}^n (\bar{x} - x_i)^2.

Therefore,

:\begin{align} \operatorname{RSS} &= \sum_{i=1}^n (y_i - f(x_i))^2 = \sum_{i=1}^n (y_i - (ax_i + b))^2 = \sum_{i=1}^n (y_i - ax_i - \bar{y} + a\bar{x})^2 \\ &= \sum_{i=1}^n \left(a(\bar{x} - x_i) - (\bar{y} - y_i)\right)^2 = a^2 S_{xx} - 2a S_{xy} + S_{yy} = S_{yy} - a S_{xy} = S_{yy}\left(1 - \frac{S_{xy}^2}{S_{xx} S_{yy}}\right), \end{align}

where S_{yy} = \sum_{i=1}^n (\bar{y} - y_i)^2. The Pearson product-moment correlation is given by r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}; therefore, \operatorname{RSS} = S_{yy}(1 - r^2).
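A quick numerical check of this identity (once more with made-up data) computes S_{xx}, S_{yy}, and S_{xy} directly and confirms that the RSS equals S_{yy}(1 - r^2):

```python
import numpy as np

# Hypothetical data, roughly linear with noise.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.9, 4.2, 5.8, 8.1, 9.9, 12.2])

Sxx = np.sum((x.mean() - x) ** 2)
Syy = np.sum((y.mean() - y) ** 2)
Sxy = np.sum((x.mean() - x) * (y.mean() - y))

a = Sxy / Sxx                  # slope of the least-squares line
b = y.mean() - a * x.mean()    # intercept
rss = np.sum((y - (a * x + b)) ** 2)

r = Sxy / np.sqrt(Sxx * Syy)   # Pearson product-moment correlation
print(np.isclose(rss, Syy * (1 - r ** 2)))   # True
```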


See also

* Akaike information criterion#Comparison with least squares
* Chi-squared distribution#Applications
* Degrees of freedom (statistics)#Sum of squares and degrees of freedom
* Errors and residuals in statistics
* Lack-of-fit sum of squares
* Mean squared error
* Reduced chi-squared statistic, RSS per degree of freedom
* Squared deviations
* Sum of squares (statistics)

