In statistics, rankits of a set of data are the expected values of the order statistics of a sample, of the same size as the data, drawn from the standard normal distribution. They are primarily used in the normal probability plot, a graphical technique for normality testing.
Example
This is perhaps most readily understood by means of an example. If an i.i.d. sample of six items is taken from a normally distributed population with expected value 0 and variance 1 (the standard normal distribution) and then sorted into increasing order, the expected values of the resulting order statistics are:
:−1.2672, −0.6418, −0.2016, 0.2016, 0.6418, 1.2672.
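Values like these can be approximated numerically. The following is a minimal sketch, not a reference implementation: it assumes the standard density formula for the ''k''th order statistic of ''n'' i.i.d. draws and integrates it with Simpson's rule. The function name `rankits` and the integration limits and step count are illustrative choices, not taken from the text above.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rankits(n, lo=-10.0, hi=10.0, steps=4000):
    """Approximate the expected values of the n standard-normal order statistics.

    Uses E[X_(k)] = n! / ((k-1)! (n-k)!) * integral of
    x * pdf(x) * cdf(x)^(k-1) * (1 - cdf(x))^(n-k) dx,
    evaluated with Simpson's rule on [lo, hi] (steps must be even).
    """
    h = (hi - lo) / steps
    result = []
    for k in range(1, n + 1):
        coeff = k * math.comb(n, k)  # equals n! / ((k-1)! (n-k)!)
        total = 0.0
        for i in range(steps + 1):
            x = lo + i * h
            # Simpson weights: 1 at the ends, then alternating 4 and 2.
            weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
            cdf = normal_cdf(x)
            total += weight * x * normal_pdf(x) * cdf ** (k - 1) * (1.0 - cdf) ** (n - k)
        result.append(coeff * total * h / 3.0)
    return result


print([round(v, 4) for v in rankits(6)])
```

For `n = 6` the output should agree with the values listed above to about three decimal places; the last digit may differ because of rounding.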
Suppose the numbers in a data set are
: 65, 75, 16, 22, 43, 40.
Then one may sort these and line them up with the corresponding rankits; in order they are
: 16, 22, 40, 43, 65, 75,
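which yields the points (rankit, data value):
: (−1.2672, 16), (−0.6418, 22), (−0.2016, 40), (0.2016, 43), (0.6418, 65), (1.2672, 75).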
These points are then plotted on a scatter plot, with the rankits as the horizontal coordinates and the data values as the vertical coordinates.
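The sort-and-pair step can be sketched in a few lines. This is an illustrative sketch only: the constant `RANKITS_6` simply copies the six values from the example, and `rankit_points` is a hypothetical helper name.

```python
# Rankits for a sample of size 6, copied from the example above.
RANKITS_6 = [-1.2672, -0.6418, -0.2016, 0.2016, 0.6418, 1.2672]


def rankit_points(data, rankits):
    """Pair the i-th smallest data value with the i-th rankit.

    Returns (rankit, data value) tuples, i.e. (horizontal, vertical)
    coordinates for the scatter plot.
    """
    if len(data) != len(rankits):
        raise ValueError("data and rankits must have the same length")
    return list(zip(rankits, sorted(data)))


points = rankit_points([65, 75, 16, 22, 43, 40], RANKITS_6)
for x, y in points:
    print(f"rankit {x:+.4f}  data {y}")
```

The resulting list can be handed to any plotting tool that accepts (x, y) pairs.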
Alternative method
Alternatively, rather than ''sort'' the data points, one may ''rank'' them, and ''rearrange'' the rankits accordingly. This yields the same pairs of numbers, but in a different order.
For:
: 65, 75, 16, 22, 43, 40,
the corresponding ranks are:
: 5, 6, 1, 2, 4, 3,
i.e., the number appearing first is the 5th-smallest, the number appearing second is 6th-smallest, the number appearing third is smallest, the number appearing fourth is 2nd-smallest, etc. One rearranges the expected normal order statistics accordingly, getting the rankits of this data set:
: 0.6418, 1.2672, −1.2672, −0.6418, 0.2016, −0.2016.
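The rank-and-rearrange method can be sketched as follows. The helper names `ranks` and `rankits_in_data_order` are hypothetical, the rankit values are copied from the example, and the sketch assumes the data contain no ties.

```python
# Rankits for a sample of size 6, copied from the example above.
RANKITS_6 = [-1.2672, -0.6418, -0.2016, 0.2016, 0.6418, 1.2672]


def ranks(data):
    """Return the 1-based rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(data)), key=lambda i: data[i])
    result = [0] * len(data)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rankits_in_data_order(data, rankits):
    """Rearrange the rankits so that each one lines up with its data value."""
    return [rankits[r - 1] for r in ranks(data)]


data = [65, 75, 16, 22, 43, 40]
data_ranks = ranks(data)
rearranged = rankits_in_data_order(data, RANKITS_6)
print(data_ranks)
print(rearranged)
```

Pairing `rearranged` with `data` element by element gives the same set of points as sorting the data first, only listed in the original data order.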
Rankit plot
Main article: Normal probability plot
A graph plotting the rankits on the horizontal axis and the data points on the vertical axis is called a rankit plot or a normal probability plot. Such a plot is necessarily nondecreasing. In large samples from a normally distributed population, such a plot will approximate a straight line. Substantial deviations from straightness are considered evidence against normality of the distribution.
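One rough way to put a number on "straightness" is the Pearson correlation between the rankits and the sorted data. This is an assumption added for illustration, not a method described in this article, and it is not a formal normality test. The sketch below reuses the six-point example; `pearson_r` and `is_nondecreasing` are hypothetical helper names.

```python
import math

# Rankits for a sample of size 6, copied from the example above.
RANKITS_6 = [-1.2672, -0.6418, -0.2016, 0.2016, 0.6418, 1.2672]


def is_nondecreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


sorted_data = sorted([65, 75, 16, 22, 43, 40])
print(is_nondecreasing(sorted_data))   # the plotted points never step downward
r = pearson_r(RANKITS_6, sorted_data)
print(round(r, 3))                     # closer to 1 means closer to a straight line
```

With only six points such a figure says little on its own; the plot itself remains the primary diagnostic.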
Rankit plots are usually used to assess visually whether data are plausibly drawn from a specified probability distribution.
A rankit plot is a kind of Q–Q plot: it plots the order statistics (quantiles) of the sample against certain quantiles (the rankits) of the assumed normal distribution. Q–Q plots may use other quantiles for the normal distribution, however.
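As an illustration of "other quantiles", one common choice (an assumption here, not specified in this article) is to evaluate the standard normal quantile function at the plotting positions (''i'' − 0.5)/''n''. A minimal sketch using Python's standard-library `statistics.NormalDist`; the function name `midpoint_normal_quantiles` is hypothetical.

```python
from statistics import NormalDist


def midpoint_normal_quantiles(n):
    """Standard normal quantiles at the plotting positions (i - 0.5) / n.

    These are not rankits; they are one alternative set of theoretical
    quantiles that a normal Q-Q plot might use on the horizontal axis.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


print([round(q, 4) for q in midpoint_normal_quantiles(6)])
```

For small samples these values differ somewhat from the rankits; which set is used is a matter of convention.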
History
The rankit plot and the word ''rankit'' were introduced by the biologist and statistician Chester Ittner Bliss (1899–1979).
See also
* Probit analysis, developed by C. I. Bliss in 1934.
External links
Engineering Statistics Handbook