
In
statistics, the standard score is the number of
standard deviations by which the value of a
raw score (i.e., an observed value or data point) is above or below the
mean
There are several kinds of mean in mathematics, especially in statistics. Each mean serves to summarize a given group of data, often to better understand the overall value ( magnitude and sign) of a given data set.
For a data set, the '' ari ...
value of what is being observed or measured. Raw scores above the mean have positive standard scores, while those below the mean have negative standard scores.
It is calculated by subtracting the
population mean
In statistics, a population is a set of similar items or events which is of interest for some question or experiment. A statistical population can be a group of existing objects (e.g. the set of all stars within the Milky Way galaxy) or a hypot ...
from an individual raw score and then dividing the difference by the
population
Population typically refers to the number of people in a single area, whether it be a city or town, region, country, continent, or the world. Governments typically quantify the size of the resident population within their jurisdiction using ...
standard deviation. This process of converting a raw score into a standard score is called standardizing or normalizing (however, "normalizing" can refer to many types of ratios; see
normalization for more).
Standard scores are most commonly called ''z''-scores; the two terms may be used interchangeably, as they are in this article. Other equivalent terms in use include z-values, normal scores, standardized variables and pull in
high energy physics.
Computing a z-score requires knowledge of the mean and standard deviation of the complete population to which a data point belongs; if one only has a
sample of observations from the population, then the analogous computation using the sample mean and sample standard deviation yields the
''t''-statistic.
Calculation
If the population mean and population standard deviation are known, a raw score
''x'' is converted into a standard score by
:
where:
: ''μ'' is the
mean
There are several kinds of mean in mathematics, especially in statistics. Each mean serves to summarize a given group of data, often to better understand the overall value ( magnitude and sign) of a given data set.
For a data set, the '' ari ...
of the population,
: ''σ'' is the
standard deviation of the population.
The absolute value of z represents the distance between that raw score ''x'' and the population mean in units of the standard deviation. z is negative when the raw score is below the mean, positive when above.
Calculating z using this formula requires use of the population mean and the population standard deviation, not the sample mean or sample deviation. However, knowing the true mean and standard deviation of a population is often an unrealistic expectation, except in cases such as
standardized testing, where the entire population is measured.
When the population mean and the population standard deviation are unknown, the standard score may be estimated by using the sample mean and sample standard deviation as estimates of the population values.
[ ][ ][ ][ ]
In these cases, the z-score is given by
:
where:
:
is the
mean
There are several kinds of mean in mathematics, especially in statistics. Each mean serves to summarize a given group of data, often to better understand the overall value ( magnitude and sign) of a given data set.
For a data set, the '' ari ...
of the sample,
: ''S'' is the
standard deviation of the sample.
Though it should always be stated, the distinction between use of the population and sample statistics often is not made. In either case, the numerator and denominator of the equations have the same units of measure so that the units cancel out through division and z is left as a
dimensionless quantity
A dimensionless quantity (also known as a bare quantity, pure quantity, or scalar quantity as well as quantity of dimension one) is a quantity to which no physical dimension is assigned, with a corresponding SI unit of measurement of one (or 1) ...
.
Applications
Z-test
The z-score is often used in the z-test in standardized testing – the analog of the
Student's t-test
A ''t''-test is any statistical hypothesis test in which the test statistic follows a Student's ''t''-distribution under the null hypothesis. It is most commonly applied when the test statistic would follow a normal distribution if the value of ...
for a population whose parameters are known, rather than estimated. As it is very unusual to know the entire population, the t-test is much more widely used.
Prediction intervals
The standard score can be used in the calculation of
prediction interval
In statistical inference, specifically predictive inference, a prediction interval is an estimate of an interval in which a future observation will fall, with a certain probability, given what has already been observed. Prediction intervals are ...
s. A prediction interval
'L'',''U'' consisting of a lower endpoint designated ''L'' and an upper endpoint designated ''U'', is an interval such that a future observation ''X'' will lie in the interval with high probability
, i.e.
: