In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.
Estimates of statistical parameters can be based upon different amounts of information or data. The number of independent pieces of information that go into the estimate of a parameter is called the degrees of freedom. In general, the degrees of freedom of an estimate of a parameter are equal to the number of independent scores that go into the estimate minus the number of parameters used as intermediate steps in the estimation of the parameter itself. For example, if the variance is to be estimated from a random sample of ''N'' independent scores, then the degrees of freedom is equal to the number of independent scores (''N'') minus the number of parameters estimated as intermediate steps (one, namely, the sample mean) and is therefore equal to ''N'' − 1.
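As a concrete illustration, the sketch below (assuming NumPy is available) computes the sample variance two ways: dividing by ''N'' and dividing by ''N'' − 1, the latter matching the degrees-of-freedom correction described above.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=10)  # N = 10 independent scores

n = x.size
residuals = x - x.mean()                     # one parameter (the mean) estimated first
biased = (residuals ** 2).sum() / n          # divides by N
unbiased = (residuals ** 2).sum() / (n - 1)  # divides by the N - 1 degrees of freedom

# NumPy exposes the same choice through the ddof ("delta degrees of freedom") argument:
assert np.isclose(np.var(x), biased)
assert np.isclose(np.var(x, ddof=1), unbiased)
```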
Mathematically, degrees of freedom is the number of dimensions of the domain of a random vector, or essentially the number of "free" components (how many components need to be known before the vector is fully determined).
The term is most often used in the context of linear models (linear regression, analysis of variance), where certain random vectors are constrained to lie in linear subspaces, and the number of degrees of freedom is the dimension of the subspace. The degrees of freedom are also commonly associated with the squared lengths (or "sum of squares" of the coordinates) of such vectors, and the parameters of chi-squared and other distributions that arise in associated statistical testing problems.
While introductory textbooks may introduce degrees of freedom as distribution parameters or through hypothesis testing, it is the underlying geometry that defines degrees of freedom, and is critical to a proper understanding of the concept.
History
Although the basic concept of degrees of freedom was recognized as early as 1821 in the work of German astronomer and mathematician Carl Friedrich Gauss, its modern definition and usage were first elaborated by English statistician William Sealy Gosset in his 1908 ''Biometrika'' article "The Probable Error of a Mean", published under the pen name "Student". While Gosset did not actually use the term 'degrees of freedom', he explained the concept in the course of developing what became known as Student's ''t''-distribution. The term itself was popularized by English statistician and biologist Ronald Fisher, beginning with his 1922 work on chi squares.
Notation
In equations, the typical symbol for degrees of freedom is ''ν'' (lowercase Greek letter nu). In text and tables, the abbreviation "d.f." is commonly used. R. A. Fisher used ''n'' to symbolize degrees of freedom, but modern usage typically reserves ''n'' for sample size. When reporting the results of statistical tests, the degrees of freedom are typically noted beside the test statistic, either as a subscript or in parentheses.
Of random vectors
Geometrically, the degrees of freedom can be interpreted as the dimension of certain vector subspaces. As a starting point, suppose that we have a sample of ''n'' independent normally distributed observations,

:<math>X_1, \dots, X_n.</math>

This can be represented as an ''n''-dimensional random vector:

:<math>\begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix}.</math>

Since this random vector can lie anywhere in ''n''-dimensional space, it has ''n'' degrees of freedom.
Now, let

:<math>\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}</math>

be the sample mean. The random vector can be decomposed as the sum of the sample mean plus a vector of residuals:

:<math>\begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix} = \bar{X} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} + \begin{pmatrix} X_1 - \bar{X} \\ \vdots \\ X_n - \bar{X} \end{pmatrix}.</math>

The first vector on the right-hand side is constrained to be a multiple of the vector of 1's, and the only free quantity is <math>\bar{X}</math>. It therefore has 1 degree of freedom.

The second vector is constrained by the relation <math>\sum_{i=1}^n (X_i - \bar{X}) = 0</math>. The first ''n'' − 1 components of this vector can be anything. However, once you know the first ''n'' − 1 components, the constraint tells you the value of the ''n''th component. Therefore, this vector has ''n'' − 1 degrees of freedom.
Mathematically, the first vector is the orthogonal, or least-squares, projection of the data vector onto the subspace spanned by the vector of 1's. The 1 degree of freedom is the dimension of this subspace. The second residual vector is the least-squares projection onto the (''n'' − 1)-dimensional orthogonal complement of this subspace, and has ''n'' − 1 degrees of freedom.
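A minimal numerical sketch of this decomposition (assuming NumPy) is given below; it splits a data vector into its mean component and residual component and confirms that the residuals satisfy the single linear constraint.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=6)              # a point in 6-dimensional space: 6 degrees of freedom

ones = np.ones_like(x)
mean_part = x.mean() * ones         # orthogonal projection onto span{1}: 1 degree of freedom
residual = x - mean_part            # projection onto the orthogonal complement: n - 1 dof

assert np.isclose(residual.sum(), 0.0)        # the single constraint on the residual vector
assert np.isclose(mean_part @ residual, 0.0)  # the two components are orthogonal
assert np.allclose(mean_part + residual, x)   # the decomposition recovers the data
```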
In statistical testing applications, often one is not directly interested in the component vectors, but rather in their squared lengths. In the example above, the residual sum-of-squares is

:<math>\sum_{i=1}^n (X_i - \bar{X})^2.</math>

If the data points <math>X_i</math> are normally distributed with mean 0 and variance <math>\sigma^2</math>, then the residual sum of squares has a scaled chi-squared distribution (scaled by the factor <math>\sigma^2</math>), with ''n'' − 1 degrees of freedom. The degrees-of-freedom, here a parameter of the distribution, can still be interpreted as the dimension of an underlying vector subspace.
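The simulation sketch below (assuming NumPy and SciPy) checks this empirically: for standard normal data, the residual sum of squares should track a chi-squared distribution with ''n'' − 1 degrees of freedom.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps = 8, 100_000

x = rng.normal(size=(reps, n))                      # mean 0, variance 1
rss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

# The simulated mean should be close to the chi-squared mean, n - 1 = 7.
print(rss.mean())

# A Kolmogorov-Smirnov test against chi2(n - 1) should not reject.
print(stats.kstest(rss, stats.chi2(df=n - 1).cdf))
```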
Likewise, the one-sample ''t''-test statistic,

:<math>\frac{\sqrt{n}(\bar{X} - \mu_0)}{s}, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2,</math>

follows a Student's ''t''-distribution with ''n'' − 1 degrees of freedom when the hypothesized mean <math>\mu_0</math> is correct. Again, the degrees-of-freedom arises from the residual vector in the denominator.
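As a sketch (assuming SciPy), the statistic above can be computed by hand and compared against scipy.stats.ttest_1samp, which is based on the same ''n'' − 1 degrees of freedom.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(loc=0.3, size=12)
mu0 = 0.0

n = x.size
t = np.sqrt(n) * (x.mean() - mu0) / x.std(ddof=1)   # hand-computed statistic

result = stats.ttest_1samp(x, popmean=mu0)
assert np.isclose(t, result.statistic)
print(result.df)   # recent SciPy versions report the degrees of freedom: n - 1 = 11
```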
In structural equation models
When the results of structural equation models (SEM) are presented, they generally include one or more indices of overall model fit, the most common of which is a ''χ''² statistic. This forms the basis for other indices that are commonly reported. Although it is these other statistics that are most commonly interpreted, the ''degrees of freedom'' of the ''χ''² are essential to understanding model fit as well as the nature of the model itself.
Degrees of freedom in SEM are computed as a difference between the number of unique pieces of information that are used as input into the analysis, sometimes called knowns, and the number of parameters that are uniquely estimated, sometimes called unknowns. For example, in a one-factor confirmatory factor analysis with 4 items, there are 10 knowns (the six unique covariances among the four items and the four item variances) and 8 unknowns (4 factor loadings and 4 error variances), for 2 degrees of freedom. Degrees of freedom are important to the understanding of model fit if for no other reason than that, all else being equal, the fewer degrees of freedom, the better indices such as ''χ''² will be.
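A small sketch of this bookkeeping in plain Python follows, using the counts from the example above; the helper name sem_df is hypothetical.

```python
def sem_df(n_observed: int, n_free_params: int) -> int:
    """Degrees of freedom for a covariance-structure model:
    knowns (unique variances and covariances) minus unknowns (free parameters)."""
    knowns = n_observed * (n_observed + 1) // 2   # unique entries of the covariance matrix
    return knowns - n_free_params

# One-factor CFA with 4 items: 4 loadings + 4 error variances = 8 free parameters.
print(sem_df(n_observed=4, n_free_params=8))      # 10 knowns - 8 unknowns = 2
```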
It has been shown that degrees of freedom can be used by readers of papers that contain SEMs to determine if the authors of those papers are in fact reporting the correct model fit statistics. In the organizational sciences, for example, nearly half of papers published in top journals report degrees of freedom that are inconsistent with the models described in those papers, leaving the reader to wonder which models were actually tested.
Of residuals
A common way to think of degrees of freedom is as the number of independent pieces of information available to estimate another piece of information. More concretely, the number of degrees of freedom is the number of independent observations in a sample of data that are available to estimate a parameter of the population from which that sample is drawn. For example, if we have two observations, when calculating the mean we have two independent observations; however, when calculating the variance, we have only one independent observation, since the two observations are equally distant from the sample mean.
In fitting statistical models to data, the vectors of residuals are constrained to lie in a space of smaller dimension than the number of components in the vector. That smaller dimension is the number of ''degrees of freedom for error'', also called ''residual degrees of freedom''.
Example
Perhaps the simplest example is this. Suppose

:<math>X_1, \dots, X_n</math>

are random variables each with expected value ''μ'', and let

:<math>\bar{X}_n = \frac{X_1 + \cdots + X_n}{n}</math>

be the "sample mean." Then the quantities

:<math>X_i - \bar{X}_n</math>

are residuals that may be considered estimates of the errors ''X''<sub>''i''</sub> − ''μ''. The sum of the residuals (unlike the sum of the errors) is necessarily 0. If one knows the values of any ''n'' − 1 of the residuals, one can thus find the last one. That means they are constrained to lie in a space of dimension ''n'' − 1. One says that there are ''n'' − 1 degrees of freedom for errors.
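The sketch below (assuming NumPy) illustrates the constraint: the residuals sum to zero, so the last residual is determined by the other ''n'' − 1.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(loc=10.0, size=5)
residuals = x - x.mean()

assert np.isclose(residuals.sum(), 0.0)    # the constraint on the residuals
# Knowing the first n - 1 residuals fixes the last one:
last = -residuals[:-1].sum()
assert np.isclose(last, residuals[-1])
```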
An example which is only slightly less simple is that of least squares estimation of ''a'' and ''b'' in the model

:<math>Y_i = a + b x_i + e_i \quad \text{for } i = 1, \dots, n,</math>

where ''x''<sub>''i''</sub> is given, but ''e''<sub>''i''</sub> and hence ''Y''<sub>''i''</sub> are random. Let <math>\hat{a}</math> and <math>\hat{b}</math> be the least-squares estimates of ''a'' and ''b''. Then the residuals

:<math>\hat{e}_i = y_i - (\hat{a} + \hat{b} x_i)</math>

are constrained to lie within the space defined by the two equations

:<math>\hat{e}_1 + \cdots + \hat{e}_n = 0,</math>
:<math>x_1 \hat{e}_1 + \cdots + x_n \hat{e}_n = 0.</math>

One says that there are ''n'' − 2 degrees of freedom for error.
Notationally, the capital letter ''Y'' is used in specifying the model, while lower-case ''y'' appears in the definition of the residuals; that is because the former are hypothesized random variables and the latter are actual data.
We can generalise this to multiple regression involving ''p'' parameters and covariates (e.g. ''p'' − 1 predictors and one mean (= intercept in the regression)), in which case the cost in ''degrees of freedom of the fit'' is ''p'', leaving ''n'' − ''p'' degrees of freedom for errors.
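A numerical check of the two constraints above might look like the following sketch (assuming NumPy; np.polyfit supplies the least-squares fit).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20
x = np.linspace(0.0, 1.0, n)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=n)   # Y_i = a + b x_i + e_i

b_hat, a_hat = np.polyfit(x, y, deg=1)              # slope first, then intercept
resid = y - (a_hat + b_hat * x)

assert np.isclose(resid.sum(), 0.0)                 # first constraint
assert np.isclose((x * resid).sum(), 0.0)           # second constraint
# Two constraints: n - 2 = 18 degrees of freedom for error.
```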
In linear models
The demonstration of the ''t'' and chi-squared distributions for one-sample problems above is the simplest example where degrees-of-freedom arise. However, similar geometry and vector decompositions underlie much of the theory of linear models, including linear regression and analysis of variance. An explicit example based on comparison of three means is presented here; the geometry of linear models is discussed in more complete detail by Christensen (2002).
Suppose independent observations are made for three populations, <math>X_1, \ldots, X_n</math>, <math>Y_1, \ldots, Y_n</math> and <math>Z_1, \ldots, Z_n</math>. The restriction to three groups and equal sample sizes simplifies notation, but the ideas are easily generalized.
The observations can be decomposed as

:<math>\begin{align}
X_i &= \bar{M} + (\bar{X} - \bar{M}) + (X_i - \bar{X}) \\
Y_i &= \bar{M} + (\bar{Y} - \bar{M}) + (Y_i - \bar{Y}) \\
Z_i &= \bar{M} + (\bar{Z} - \bar{M}) + (Z_i - \bar{Z})
\end{align}</math>

where <math>\bar{X}, \bar{Y}, \bar{Z}</math> are the means of the individual samples, and <math>\bar{M} = (\bar{X} + \bar{Y} + \bar{Z})/3</math> is the mean of all 3''n'' observations. In vector notation this decomposition can be written as

:<math>\begin{pmatrix} X_1 \\ \vdots \\ X_n \\ Y_1 \\ \vdots \\ Y_n \\ Z_1 \\ \vdots \\ Z_n \end{pmatrix}
= \bar{M} \begin{pmatrix} 1 \\ \vdots \\ 1 \\ 1 \\ \vdots \\ 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
+ \begin{pmatrix} \bar{X} - \bar{M} \\ \vdots \\ \bar{X} - \bar{M} \\ \bar{Y} - \bar{M} \\ \vdots \\ \bar{Y} - \bar{M} \\ \bar{Z} - \bar{M} \\ \vdots \\ \bar{Z} - \bar{M} \end{pmatrix}
+ \begin{pmatrix} X_1 - \bar{X} \\ \vdots \\ X_n - \bar{X} \\ Y_1 - \bar{Y} \\ \vdots \\ Y_n - \bar{Y} \\ Z_1 - \bar{Z} \\ \vdots \\ Z_n - \bar{Z} \end{pmatrix}.</math>

The observation vector, on the left-hand side, has 3''n'' degrees of freedom. On the right-hand side, the first vector has one degree of freedom (or dimension) for the overall mean. The second vector depends on three random variables, <math>\bar{X} - \bar{M}</math>, <math>\bar{Y} - \bar{M}</math> and <math>\bar{Z} - \bar{M}</math>. However, these must sum to 0 and so are constrained; the vector therefore must lie in a 2-dimensional subspace, and has 2 degrees of freedom. The remaining 3''n'' − 3 degrees of freedom are in the residual vector (made up of ''n'' − 1 degrees of freedom within each of the populations).
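A sketch of this three-part decomposition (assuming NumPy) is given below; it verifies the constraints that pin each component to a subspace of dimension 1, 2, and 3''n'' − 3 respectively.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
groups = rng.normal(size=(3, n))                    # rows: the X, Y, Z samples

grand = groups.mean()                               # M-bar, the mean of all 3n values
group_means = groups.mean(axis=1, keepdims=True)    # X-bar, Y-bar, Z-bar

mean_part = np.full_like(groups, grand)             # 1 degree of freedom
treat_part = np.broadcast_to(group_means - grand, groups.shape)  # 2 dof
resid_part = groups - group_means                    # 3(n - 1) dof

assert np.allclose(mean_part + treat_part + resid_part, groups)
assert np.isclose((group_means - grand).sum(), 0.0)  # treatment effects sum to 0
assert np.allclose(resid_part.sum(axis=1), 0.0)      # one constraint per group
```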
In analysis of variance (ANOVA)
In statistical testing problems, one usually is not interested in the component vectors themselves, but rather in their squared lengths, or sums of squares. The degrees of freedom associated with a sum-of-squares is the degrees-of-freedom of the corresponding component vectors.
The three-population example above is an example of one-way analysis of variance. The model, or treatment, sum-of-squares is the squared length of the second vector,

:<math>\text{SSTr} = n(\bar{X} - \bar{M})^2 + n(\bar{Y} - \bar{M})^2 + n(\bar{Z} - \bar{M})^2</math>

with 2 degrees of freedom. The residual, or error, sum-of-squares is

:<math>\text{SSE} = \sum_{i=1}^n (X_i - \bar{X})^2 + \sum_{i=1}^n (Y_i - \bar{Y})^2 + \sum_{i=1}^n (Z_i - \bar{Z})^2</math>

with 3(''n'' − 1) degrees of freedom.
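As a final sketch (assuming NumPy and SciPy), the two sums of squares can be computed directly and the resulting ''F''-statistic, with (2, 3(''n'' − 1)) degrees of freedom, compared against scipy.stats.f_oneway.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 10
x, y, z = rng.normal(size=(3, n))

grand = np.mean([x, y, z])
ss_treat = sum(n * (g.mean() - grand) ** 2 for g in (x, y, z))   # 2 dof
ss_error = sum(((g - g.mean()) ** 2).sum() for g in (x, y, z))   # 3(n - 1) dof

f = (ss_treat / 2) / (ss_error / (3 * (n - 1)))
assert np.isclose(f, stats.f_oneway(x, y, z).statistic)
```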