In probability and statistics, Student's ''t''-distribution (or simply the ''t''-distribution) is any member of a family of continuous probability distributions that arise when estimating the mean of a normally distributed population in situations where the sample size is small and the population's standard deviation is unknown. It was developed by English statistician William Sealy Gosset under the pseudonym "Student".
The ''t''-distribution plays a role in a number of widely used statistical analyses, including Student's ''t''-test for assessing the statistical significance of the difference between two sample means, the construction of confidence intervals for the difference between two population means, and in linear regression analysis. Student's ''t''-distribution also arises in the Bayesian analysis of data from a normal family.
If we take a sample of ''n'' observations from a normal distribution, then the ''t''-distribution with ''ν'' = ''n'' − 1 degrees of freedom can be defined as the distribution of the location of the sample mean relative to the true mean, divided by the sample standard deviation, after multiplying by the standardizing term √''n''. In this way, the ''t''-distribution can be used to construct a confidence interval for the true mean.
The ''t''-distribution is symmetric and bell-shaped, like the normal distribution. However, the ''t''-distribution has heavier tails, meaning that it is more prone to producing values that fall far from its mean. This makes it useful for understanding the statistical behavior of certain types of ratios of random quantities, in which variation in the denominator is amplified and may produce outlying values when the denominator of the ratio falls close to zero. The Student's ''t''-distribution is a special case of the
generalized hyperbolic distribution.
History and etymology

In statistics, the ''t''-distribution was first derived as a posterior distribution in 1876 by Helmert and Lüroth. The ''t''-distribution also appeared in a more general form as the Pearson Type IV distribution in Karl Pearson's 1895 paper.
In the English-language literature, the distribution takes its name from William Sealy Gosset's 1908 paper in ''Biometrika'' under the pseudonym "Student". One version of the origin of the pseudonym is that Gosset's employer preferred staff to use pen names when publishing scientific papers instead of their real names, so he used the name "Student" to hide his identity. Another version is that Guinness did not want their competitors to know that they were using the ''t''-test to determine the quality of raw material.
Gosset worked at the Guinness Brewery in Dublin, Ireland, and was interested in the problems of small samples – for example, the chemical properties of barley where sample sizes might be as few as 3. Gosset's paper refers to the distribution as the "frequency distribution of standard deviations of samples drawn from a normal population". It became well known through the work of
Ronald Fisher, who called the distribution "Student's distribution" and represented the test value with the letter ''t''.
How Student's distribution arises from sampling
Let X_1, \ldots, X_n be independently and identically drawn from the distribution N(\mu, \sigma^2), i.e. this is a sample of size ''n'' from a normally distributed population with expected mean value ''μ'' and variance ''σ''2.
Let
: \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i
be the sample mean and let
: S^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2
be the (Bessel-corrected) sample variance. Then the random variable
: Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
has a standard normal distribution (i.e. normal with expected mean 0 and variance 1), and the random variable
: T = \frac{\bar{X} - \mu}{S / \sqrt{n}},
''i.e.'' where ''S'' has been substituted for ''σ'', has a Student's ''t''-distribution with ''n'' − 1 degrees of freedom. Since ''S'' has replaced ''σ'', the only unobservable quantity in this expression is ''μ'', so this can be used to derive confidence intervals for ''μ''.
The numerator and the denominator in the preceding expression are statistically independent random variables despite being based on the same sample X_1, \ldots, X_n. This can be seen by observing that \operatorname{Cov}(\bar{X}, X_i - \bar{X}) = 0, and recalling that \bar{X} and X_i - \bar{X} are both linear combinations of the same set of i.i.d. normally distributed random variables.
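The sampling behaviour described above can be illustrated with a short simulation. The following sketch uses only the Python standard library; the population parameters, sample size, seed, and replication count are arbitrary illustrative choices. It draws many small normal samples, computes the ''t''-value for each, and checks that the resulting values have noticeably heavier tails than a standard normal variable would produce:

```python
import math
import random
import statistics

def t_statistic(sample, mu):
    """T = (sample mean - mu) / (sample std dev / sqrt(n))."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)          # Bessel-corrected sample std dev
    return (xbar - mu) / (s / math.sqrt(n))

random.seed(0)
mu, sigma, n = 5.0, 2.0, 4                # small sample: 3 degrees of freedom
ts = [t_statistic([random.gauss(mu, sigma) for _ in range(n)], mu)
      for _ in range(20000)]

# For a standard normal, P(|Z| > 3) is about 0.0027; the simulated
# t-values with 3 degrees of freedom exceed 3 in absolute value far
# more often, reflecting the heavier tails.
tail = sum(abs(t) > 3 for t in ts) / len(ts)
print(tail)
```

The heavier tails arise exactly because the random denominator ''S'' occasionally falls close to zero, inflating the ratio.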
Definition
Probability density function
Student's ''t''-distribution has the probability density function (PDF) given by
: f(t) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}
where ''ν'' is the number of ''degrees of freedom'' and Γ is the gamma function. This may also be written as
: f(t) = \frac{1}{\sqrt{\nu}\, B\!\left(\frac{1}{2}, \frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}
where B is the Beta function. In particular, for integer-valued degrees of freedom ''ν'' we have:
For ''ν'' even,
: \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)} = \frac{(\nu-1)(\nu-3)\cdots 5 \cdot 3}{2\sqrt{\nu}\,(\nu-2)(\nu-4)\cdots 4 \cdot 2}.
For ''ν'' odd,
: \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)} = \frac{(\nu-1)(\nu-3)\cdots 4 \cdot 2}{\pi\sqrt{\nu}\,(\nu-2)(\nu-4)\cdots 5 \cdot 3}.
The probability density function is
symmetric, and its overall shape resembles the bell shape of a normally distributed variable with mean 0 and variance 1, except that it is a bit lower and wider. As the number of degrees of freedom grows, the ''t''-distribution approaches the normal distribution with mean 0 and variance 1. For this reason ''ν'' is also known as the normality parameter.
The following images show the density of the ''t''-distribution for increasing values of ''ν''. The normal distribution is shown as a blue line for comparison. Note that the ''t''-distribution (red line) becomes closer to the normal distribution as ''ν'' increases.
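The density formula can be evaluated directly with the standard library's log-gamma function (used instead of the gamma function itself to avoid overflow for large ''ν''). The sketch below, with illustrative parameter values, shows that the ''t'' density has a lower peak than the standard normal density and approaches it as ''ν'' grows:

```python
import math

def t_pdf(t, nu):
    """Student's t density with nu degrees of freedom."""
    # lgamma avoids overflow of gamma() for large nu
    coeff = math.exp(math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2))
    coeff /= math.sqrt(nu * math.pi)
    return coeff * (1 + t * t / nu) ** (-(nu + 1) / 2)

def normal_pdf(t):
    """Standard normal density, the nu -> infinity limit."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

print(t_pdf(0, 1))        # peak of the nu = 1 density: 1/pi, lower than normal
print(normal_pdf(0))      # peak of the standard normal density
print(t_pdf(0, 1000))     # large nu: close to the normal peak
```

For ''ν'' = 1 the peak value is exactly 1/π ≈ 0.318, compared with 1/√(2π) ≈ 0.399 for the normal density, while out in the tails the ordering reverses.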
Cumulative distribution function
The cumulative distribution function (CDF) can be written in terms of ''I'', the regularized incomplete beta function. For ''t'' > 0,
: F(t) = 1 - \tfrac{1}{2} I_{x(t)}\!\left(\tfrac{\nu}{2}, \tfrac{1}{2}\right),
where
: x(t) = \frac{\nu}{t^2 + \nu}.
Other values would be obtained by symmetry. An alternative formula, valid for t^2 < \nu, is
: F(t) = \tfrac{1}{2} + t\, \frac{\Gamma\!\left(\tfrac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\!\left(\tfrac{\nu}{2}\right)} \, {}_2F_1\!\left(\tfrac{1}{2}, \tfrac{\nu+1}{2}; \tfrac{3}{2}; -\tfrac{t^2}{\nu}\right),
where {}_2F_1 is a particular case of the hypergeometric function.
For information on its inverse cumulative distribution function, see the quantile function.
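Since the CDF has no elementary closed form for general ''ν'', it can be approximated by integrating the density numerically. The following standard-library sketch (step count chosen for illustration) checks the result for ''ν'' = 1, where the ''t''-distribution reduces to the Cauchy distribution, whose CDF has the closed form 1/2 + arctan(''t'')/π:

```python
import math

def t_cdf(t, nu, steps=100000):
    """Approximate the t CDF by trapezoidal integration of the density.
    By symmetry F(0) = 1/2, so we integrate the density from 0 to |t|."""
    coeff = math.exp(math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2))
    coeff /= math.sqrt(nu * math.pi)
    pdf = lambda x: coeff * (1 + x * x / nu) ** (-(nu + 1) / 2)
    h = abs(t) / steps
    area = h * (pdf(0) / 2
                + sum(pdf(i * h) for i in range(1, steps))
                + pdf(abs(t)) / 2)
    return 0.5 + area if t >= 0 else 0.5 - area

# nu = 1 is the Cauchy distribution: F(t) = 1/2 + arctan(t)/pi
print(t_cdf(1.0, 1))
print(0.5 + math.atan(1.0) / math.pi)     # exactly 0.75
```

A production implementation would instead use the regularized incomplete beta function directly (e.g. via a scientific library), but the integration makes the definition concrete.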
Special cases
Certain values of ''ν'' give a simple form for Student's ''t''-distribution; for example, ''ν'' = 1 yields the Cauchy distribution, while in the limit ''ν'' → ∞ the distribution becomes the standard normal.
How the ''t''-distribution arises
Sampling distribution
Let x_1, \ldots, x_n be the numbers observed in a sample from a continuously distributed population with expected value ''μ''. The sample mean and sample variance are given by:
: \bar{x} = \frac{1}{n} \sum_{i=1}^n x_i, \qquad s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2.
The resulting ''t-value'' is
: t = \frac{\bar{x} - \mu}{s / \sqrt{n}}.
The ''t''-distribution with ''n'' − 1 degrees of freedom is the sampling distribution of the ''t''-value when the samples consist of independent identically distributed observations from a normally distributed population. Thus for inference purposes ''t'' is a useful "pivotal quantity" in the case when the mean and variance (''μ'', ''σ''2) are unknown population parameters, in the sense that the ''t''-value has then a probability distribution that depends on neither ''μ'' nor ''σ''2.
Bayesian inference
In Bayesian statistics, a (scaled, shifted) ''t''-distribution arises as the marginal distribution of the unknown mean ''μ'' of a normal distribution, when the dependence on an unknown variance ''σ''2 has been marginalized out:
: p(\mu \mid D, I) = \int_0^\infty p(\mu, \sigma^2 \mid D, I)\, d\sigma^2 = \int_0^\infty p(\mu \mid D, \sigma^2, I)\, p(\sigma^2 \mid D, I)\, d\sigma^2,
where ''D'' stands for the data \{x_1, \ldots, x_n\}, and ''I'' represents any other information that may have been used to create the model. The distribution is thus the compounding of the conditional distribution of ''μ'' given the data and ''σ''2 with the marginal distribution of ''σ''2 given the data.
With ''n'' data points, if an uninformative, or flat, location prior p(\mu \mid \sigma^2, I) = \text{const} can be taken for ''μ'', and the scale prior p(\sigma^2 \mid I) \propto 1/\sigma^2 can be taken for ''σ''2, then Bayes' theorem gives
: p(\mu \mid D, \sigma^2, I) \sim N(\bar{x}, \sigma^2/n), \qquad p(\sigma^2 \mid D, I) \sim \text{Scale-inv-}\chi^2(\nu, s^2),
a normal distribution and a scaled inverse chi-squared distribution respectively, where \nu = n - 1 and
: \nu s^2 = \sum_{i=1}^n (x_i - \bar{x})^2.
The marginalization integral thus becomes
: p(\mu \mid D, I) \propto \int_0^\infty \frac{1}{\sqrt{\sigma^2}} \exp\!\left(-\frac{n(\mu - \bar{x})^2}{2\sigma^2}\right) (\sigma^2)^{-(1+\nu/2)} \exp\!\left(-\frac{\nu s^2}{2\sigma^2}\right) d\sigma^2.
This can be evaluated by substituting z = \frac{A}{2\sigma^2}, where A = n(\mu - \bar{x})^2 + \nu s^2, giving
: d\sigma^2 = -\frac{A}{2z^2}\, dz,
so
: p(\mu \mid D, I) \propto A^{-(\nu+1)/2} \int_0^\infty z^{(\nu-1)/2} e^{-z}\, dz.
But the ''z'' integral is now a standard Gamma integral, which evaluates to a constant, leaving
: p(\mu \mid D, I) \propto A^{-(\nu+1)/2} \propto \left(1 + \frac{n(\mu - \bar{x})^2}{\nu s^2}\right)^{-(\nu+1)/2}.
This is a form of the ''t''-distribution with an explicit scaling and shifting that will be explored in more detail in a further section below. It can be related to the standardized ''t''-distribution by the substitution
: t = \frac{\mu - \bar{x}}{s / \sqrt{n}}.
The derivation above has been presented for the case of uninformative priors for ''μ'' and ''σ''2; but it will be apparent that any priors that lead to a normal distribution being compounded with a scaled inverse chi-squared distribution will lead to a ''t''-distribution with scaling and shifting for ''μ'', although the scaling parameter will then be influenced both by the prior information and the data, rather than just by the data as above.
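The compounding step can be checked by simulation: drawing ''σ''2 from the scaled inverse chi-squared posterior and then ''μ'' from the conditional normal should give standardized draws that follow a ''t''-distribution with ''ν'' degrees of freedom. The following standard-library sketch does this; the sufficient statistics (x̄, ''s'', ''n'') below are made-up illustrative values:

```python
import math
import random

random.seed(1)
xbar, s, n = 10.0, 2.0, 6        # hypothetical sufficient statistics
nu = n - 1

def draw_mu():
    """One posterior draw: sigma^2 ~ Scale-inv-chi2(nu, s^2),
    then mu | sigma^2 ~ N(xbar, sigma^2 / n)."""
    chi2 = sum(random.gauss(0, 1) ** 2 for _ in range(nu))
    sigma2 = nu * s * s / chi2
    return random.gauss(xbar, math.sqrt(sigma2 / n))

draws = [draw_mu() for _ in range(40000)]
# Standardize: t = (mu - xbar) / (s / sqrt(n)) should follow t with nu df,
# so about 5% of draws should exceed the tabulated value t_{0.975,5} = 2.571.
ts = [(m - xbar) / (s / math.sqrt(n)) for m in draws]
tail = sum(abs(t) > 2.571 for t in ts) / len(ts)
print(tail)
```

The empirical two-sided tail fraction beyond the 97.5% point lands near 0.05, as the marginalization result predicts.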
Characterization
As the distribution of a test statistic
Student's ''t''-distribution with ''ν'' degrees of freedom can be defined as the distribution of the random variable ''T'' with
: T = \frac{Z}{\sqrt{V/\nu}},
where
* ''Z'' is a standard normal with expected value 0 and variance 1;
* ''V'' has a chi-squared distribution (\chi^2_\nu) with ''ν'' degrees of freedom;
* ''Z'' and ''V'' are independent.
A different distribution is defined as that of the random variable defined, for a given constant ''μ'', by
: (Z + \mu)\sqrt{\frac{\nu}{V}}.
This random variable has a noncentral ''t''-distribution with noncentrality parameter ''μ''. This distribution is important in studies of the power of Student's ''t''-test.
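The central characterization T = Z / √(V/ν) lends itself to a direct simulation check, again using only the standard library (seed, ''ν'', and replication count are arbitrary choices):

```python
import math
import random

random.seed(2)
nu = 3

def draw_t(nu):
    """T = Z / sqrt(V / nu): Z standard normal, V chi-squared(nu)."""
    z = random.gauss(0, 1)
    v = sum(random.gauss(0, 1) ** 2 for _ in range(nu))   # chi-squared(nu)
    return z / math.sqrt(v / nu)

ts = [draw_t(nu) for _ in range(40000)]
# About 5% of draws should exceed the tabulated value t_{0.975,3} = 3.182.
tail = sum(abs(t) > 3.182 for t in ts) / len(ts)
print(tail)
```

Setting ''μ'' ≠ 0 inside `draw_t` (replacing `z` with `z + mu`) would instead sample the noncentral distribution described above.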
Derivation
Suppose ''X''1, ..., ''X''''n'' are independent realizations of the normally distributed random variable ''X'', which has expected value ''μ'' and variance ''σ''2. Let
: \bar{X}_n = \frac{1}{n}(X_1 + \cdots + X_n)
be the sample mean, and
: S_n^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X}_n)^2
be an unbiased estimate of the variance from the sample. It can be shown that the random variable
: V = (n-1)\frac{S_n^2}{\sigma^2}
has a chi-squared distribution with \nu = n - 1 degrees of freedom (by Cochran's theorem). It is readily shown that the quantity
: Z = \sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma}
is normally distributed with mean 0 and variance 1, since the sample mean \bar{X}_n is normally distributed with mean ''μ'' and variance ''σ''2/''n''. Moreover, it is possible to show that these two random variables (the normally distributed one ''Z'' and the chi-squared-distributed one ''V'') are independent. Consequently the pivotal quantity
: T \equiv \frac{Z}{\sqrt{V/\nu}} = \sqrt{n}\,\frac{\bar{X}_n - \mu}{S_n},
which differs from ''Z'' in that the exact standard deviation ''σ'' is replaced by the random variable ''S''''n'', has a Student's ''t''-distribution as defined above. Notice that the unknown population variance ''σ''2 does not appear in ''T'', since it was in both the numerator and the denominator, so it canceled. Gosset intuitively obtained the probability density function stated above, with ''ν'' equal to ''n'' − 1, and Fisher proved it in 1925.
The distribution of the test statistic ''T'' depends on ''ν'', but not on ''μ'' or ''σ''; the lack of dependence on ''μ'' and ''σ'' is what makes the ''t''-distribution important in both theory and practice.
As a maximum entropy distribution
Student's ''t''-distribution is the maximum entropy probability distribution for a random variate ''X'' for which E[\ln(\nu + X^2)] is fixed.
Properties
Moments
For \nu > 1, the raw moments of the ''t''-distribution are
: E[T^k] = \begin{cases} 0 & k \text{ odd}, \quad 0 < k < \nu, \\ \dfrac{\nu^{k/2}\, \Gamma\!\left(\frac{k+1}{2}\right) \Gamma\!\left(\frac{\nu-k}{2}\right)}{\sqrt{\pi}\, \Gamma\!\left(\frac{\nu}{2}\right)} & k \text{ even}, \quad 0 < k < \nu. \end{cases}
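The raw-moment formula can be sanity-checked numerically: for ''k'' = 2 it should reduce to the known variance \nu/(\nu - 2) for \nu > 2. A short sketch using the standard library's gamma function (the particular values of ''ν'' are illustrative):

```python
import math

def raw_moment(k, nu):
    """Raw moment E[T^k] of the t-distribution, defined for 0 < k < nu."""
    if k % 2 == 1:
        return 0.0          # odd moments vanish by symmetry
    return (nu ** (k / 2) * math.gamma((k + 1) / 2) * math.gamma((nu - k) / 2)
            / (math.sqrt(math.pi) * math.gamma(nu / 2)))

# Second raw moment (= variance, since the mean is 0) vs nu / (nu - 2)
for nu in (3, 5, 10):
    print(nu, raw_moment(2, nu), nu / (nu - 2))
```

For ''k'' = 2 the gamma factors cancel down algebraically to ν/(ν − 2), which the printed pairs confirm.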