Sum Of Squares (statistics)

The partition of sums of squares is a concept that permeates much of inferential statistics and descriptive statistics. More properly, it is the partitioning of sums of squared deviations or errors. Mathematically, the sum of squared deviations is an unscaled, or unadjusted, measure of dispersion (also called variability). When scaled for the number of degrees of freedom, it estimates the variance, or spread of the observations about their mean value. Partitioning of the sum of squared deviations into various components allows the overall variability in a dataset to be ascribed to different types or sources of variability, with the relative importance of each being quantified by the size of each component of the overall sum of squares.


Background

The distance from any point in a collection of data to the mean of the data is the deviation. This can be written as y_i - \overline{y}, where y_i is the ''i''th data point and \overline{y} is the estimate of the mean. If all such deviations are squared, then summed, as in \sum_{i=1}^n\left(y_i-\overline{y}\,\right)^2, this gives the "sum of squares" for these data.

When more data are added to the collection, the sum of squares will increase, except in unlikely cases such as the new data being equal to the mean. So usually the sum of squares will grow with the size of the data collection. That is a manifestation of the fact that it is unscaled.

In many cases, the number of degrees of freedom is simply the number of data points in the collection, minus one. We write this as ''n'' − 1, where ''n'' is the number of data points.

Scaling (also known as normalizing) means adjusting the sum of squares so that it does not grow as the size of the data collection grows. This is important when we want to compare samples of different sizes, such as a sample of 100 people compared to a sample of 20 people. If the sum of squares were not normalized, its value would always be larger for the sample of 100 people than for the sample of 20 people. To scale the sum of squares, we divide it by the degrees of freedom, i.e., we calculate the sum of squares per degree of freedom, or variance. Standard deviation, in turn, is the square root of the variance.

The above describes how the sum of squares is used in descriptive statistics; see the article on total sum of squares for an application of this broad principle to inferential statistics.
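As a concrete illustration, the following minimal sketch (plain Python with only the standard library; the sample data are invented for the example) computes the unscaled sum of squared deviations, scales it by the ''n'' − 1 degrees of freedom to obtain the sample variance, and takes the square root to obtain the standard deviation.

import math

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(data)
mean = sum(data) / n

# Unscaled measure of dispersion: grows as more data are added.
sum_of_squares = sum((y - mean) ** 2 for y in data)

# Scaling by the degrees of freedom (n - 1) gives the sample variance,
# which is comparable across samples of different sizes.
variance = sum_of_squares / (n - 1)
standard_deviation = math.sqrt(variance)

print(f"sum of squares     = {sum_of_squares:.3f}")
print(f"variance           = {variance:.3f}")
print(f"standard deviation = {standard_deviation:.3f}")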


Partitioning the sum of squares in linear regression

Theorem. Given a linear regression model y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i ''including a constant'' \beta_0, based on a sample (y_i, x_{i1}, \ldots, x_{ip}), \, i = 1, \ldots, n containing ''n'' observations, the total sum of squares \mathrm{TSS} = \sum_{i=1}^n (y_i - \bar{y})^2 can be partitioned as follows into the explained sum of squares (ESS) and the residual sum of squares (RSS):

:\mathrm{TSS} = \mathrm{ESS} + \mathrm{RSS},

where this equation is equivalent to each of the following forms:

: \begin{align}
\left\| y - \bar{y} \mathbf{1} \right\|^2 &= \left\| \hat{y} - \bar{y} \mathbf{1} \right\|^2 + \left\| \hat{\varepsilon} \right\|^2, \quad \mathbf{1} = (1, 1, \ldots, 1)^T ,\\
\sum_{i=1}^n (y_i - \bar{y})^2 &= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n (y_i - \hat{y}_i)^2 ,\\
\sum_{i=1}^n (y_i - \bar{y})^2 &= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n \hat{\varepsilon}_i^2 ,
\end{align}

where \hat{y}_i is the value estimated by the regression line having \hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_p as the estimated coefficients.
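The theorem can be checked numerically. The sketch below (assuming NumPy is available; the data are synthetic and purely illustrative) fits an ordinary least-squares model with an intercept, i.e. with a column of ones in the design matrix, and confirms that TSS = ESS + RSS up to floating-point error.

import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # constant + p regressors
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(scale=0.3, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
residuals = y - y_hat

tss = np.sum((y - y.mean()) ** 2)       # total sum of squares
ess = np.sum((y_hat - y.mean()) ** 2)   # explained sum of squares
rss = np.sum(residuals ** 2)            # residual sum of squares

print(tss, ess + rss)                   # the two values agree
assert np.isclose(tss, ess + rss)

Note that the partition relies on the intercept being present; without the column of ones the cross term in the proof below does not vanish and the identity generally fails.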


Proof

: \begin{align}
\sum_{i=1}^n (y_i - \overline{y})^2 &= \sum_{i=1}^n (y_i - \overline{y} + \hat{y}_i - \hat{y}_i)^2 = \sum_{i=1}^n \bigl((\hat{y}_i - \bar{y}) + \underbrace{(y_i - \hat{y}_i)}_{\hat{\varepsilon}_i}\bigr)^2 \\
&= \sum_{i=1}^n \bigl((\hat{y}_i - \bar{y})^2 + 2 \hat{\varepsilon}_i (\hat{y}_i - \bar{y}) + \hat{\varepsilon}_i^2\bigr) \\
&= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n \hat{\varepsilon}_i^2 + 2 \sum_{i=1}^n \hat{\varepsilon}_i (\hat{y}_i - \bar{y}) \\
&= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n \hat{\varepsilon}_i^2 + 2 \sum_{i=1}^n \hat{\varepsilon}_i (\hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \cdots + \hat{\beta}_p x_{ip} - \overline{y}) \\
&= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n \hat{\varepsilon}_i^2 + 2 (\hat{\beta}_0 - \overline{y}) \underbrace{\sum_{i=1}^n \hat{\varepsilon}_i}_{0} + 2 \hat{\beta}_1 \underbrace{\sum_{i=1}^n \hat{\varepsilon}_i x_{i1}}_{0} + \cdots + 2 \hat{\beta}_p \underbrace{\sum_{i=1}^n \hat{\varepsilon}_i x_{ip}}_{0} \\
&= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n \hat{\varepsilon}_i^2 = \mathrm{ESS} + \mathrm{RSS}
\end{align}

The requirement that the model include a constant, or equivalently that the design matrix contain a column of ones, ensures that \sum_{i=1}^n \hat{\varepsilon}_i = 0, i.e. \hat{\varepsilon}^T\mathbf{1}=0.

The proof can also be expressed in vector form, as follows:

: \begin{align}
SS_\text{total} = \Vert \mathbf{y} - \bar{y}\mathbf{1} \Vert^2 & = \Vert \mathbf{y} - \bar{y}\mathbf{1} + \mathbf{\hat{y}} - \mathbf{\hat{y}} \Vert^2 , \\
& = \Vert \left( \mathbf{\hat{y}} - \bar{y}\mathbf{1} \right) + \left( \mathbf{y} - \mathbf{\hat{y}} \right) \Vert^2, \\
& = \Vert \mathbf{\hat{y}} - \bar{y}\mathbf{1} \Vert^2 + \Vert \hat{\varepsilon} \Vert^2 + 2 \hat{\varepsilon}^T \left( \mathbf{\hat{y}} - \bar{y}\mathbf{1} \right), \\
& = SS_\text{regression} + SS_\text{error} + 2 \hat{\varepsilon}^T \left( X\hat{\beta} - \bar{y}\mathbf{1} \right), \\
& = SS_\text{regression} + SS_\text{error} + 2 \left( \hat{\varepsilon}^T X \right)\hat{\beta} - 2\bar{y}\underbrace{\hat{\varepsilon}^T \mathbf{1}}_{0}, \\
& = SS_\text{regression} + SS_\text{error}.
\end{align}

The elimination of terms in the last line used the fact that

: \hat{\varepsilon}^T X = \left( \mathbf{y} - \mathbf{\hat{y}} \right)^T X = \mathbf{y}^T \left(I - X(X^T X)^{-1} X^T\right)^T X = \mathbf{y}^T \left(X^T - X^T\right)^T = \mathbf{0}.
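The two orthogonality facts used above can also be verified numerically. The following small sketch (assuming NumPy; the data are synthetic) checks that, with a constant term in the model, the residual vector is orthogonal to every column of the design matrix, \hat{\varepsilon}^T X = 0, and in particular that the residuals sum to zero.

import numpy as np

rng = np.random.default_rng(1)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.uniform(size=n)])
y = X @ np.array([0.5, 1.5, -2.0]) + rng.normal(scale=0.2, size=n)

# Least-squares coefficients from the normal equations: (X^T X) beta_hat = X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
residuals = y - X @ beta_hat

print(residuals.sum())   # ~ 0, because the first column of X is all ones
print(residuals @ X)     # ~ [0, 0, 0]: residuals are orthogonal to the columns of X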


Further partitioning

Note that the residual sum of squares can be further partitioned as the lack-of-fit sum of squares plus the sum of squares due to pure error.
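As a minimal illustration of this further partition (assuming NumPy; the data and the simple straight-line model are invented for the example), the pure-error sum of squares is the variation of replicate responses about their group means at repeated x values, and the lack-of-fit sum of squares is the replication-weighted deviation of those group means from the fitted line; together they reproduce the residual sum of squares.

import numpy as np

x = np.array([1., 1., 2., 2., 3., 3., 4., 4.])           # replicated x values
y = np.array([1.1, 0.9, 2.3, 2.1, 2.8, 3.2, 4.4, 3.8])

# Fit a straight line y = b0 + b1*x by ordinary least squares.
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
rss = np.sum((y - X @ b) ** 2)          # residual sum of squares

pure_error = 0.0    # replicate variation about each group mean
lack_of_fit = 0.0   # group-mean deviation from the fitted line
for v in np.unique(x):
    group = y[x == v]
    fitted = b[0] + b[1] * v
    pure_error += np.sum((group - group.mean()) ** 2)
    lack_of_fit += len(group) * (group.mean() - fitted) ** 2

print(rss, lack_of_fit + pure_error)    # the partition holds
assert np.isclose(rss, lack_of_fit + pure_error)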


See also

* Inner-product space
** Hilbert space
*** Euclidean space
* Expected mean squares
** Orthogonality
** Orthonormal basis
*** Orthogonal complement, the closed subspace orthogonal to a set (especially a subspace)
*** Orthomodular lattice of the subspaces of an inner-product space
*** Orthogonal projection
** Pythagorean theorem that the sum of the squared norms of orthogonal summands equals the squared norm of the sum.
* Least squares
* Mean squared error
* Squared deviations

