In statistics, the mid-range or mid-extreme is a measure of central tendency of a sample, defined as the arithmetic mean of the maximum and minimum values of the data set:
: M = \frac{\max x + \min x}{2}.
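The definition translates directly into code. The following is a minimal sketch; the function name `mid_range` and the sample values are illustrative assumptions, not part of the source.

```python
def mid_range(values):
    """Return the mid-range: the mean of the smallest and largest value."""
    data = list(values)
    if not data:
        raise ValueError("mid_range requires at least one value")
    return (min(data) + max(data)) / 2


sample = [2, 3, 5, 8, 13]   # illustrative data
print(mid_range(sample))    # (2 + 13) / 2 = 7.5
```

Only the two extreme values enter the result; the three interior points could be changed freely without affecting it.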
The mid-range is closely related to the range, a measure of statistical dispersion defined as the difference between maximum and minimum values. The two measures are complementary in the sense that if one knows the mid-range and the range, one can find the sample maximum and minimum values.
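The complementarity can be made explicit with a short derivation. Here M denotes the mid-range and R the range; the symbol R is introduced only for this illustration.

```latex
M = \frac{\max x + \min x}{2}, \qquad R = \max x - \min x
\quad\Longrightarrow\quad
\max x = M + \frac{R}{2}, \qquad \min x = M - \frac{R}{2}.
```

For instance, a mid-range of 7.5 and a range of 11 correspond to a maximum of 13 and a minimum of 2.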
The mid-range is rarely used in practical statistical analysis, as it lacks efficiency as an estimator for most distributions of interest, because it ignores all intermediate points, and lacks robustness, as outliers change it significantly. Indeed, for many distributions it is one of the least efficient and least robust statistics. However, it finds some use in special cases: it is the maximally efficient estimator for the center of a uniform distribution, trimmed mid-ranges address robustness, and as an L-estimator, it is simple to understand and compute.
Robustness
The midrange is highly sensitive to outliers and ignores all but two data points. It is therefore a very non-robust statistic, having a breakdown point of 0, meaning that a single observation can change it arbitrarily. Further, it is highly influenced by outliers: increasing the sample maximum or decreasing the sample minimum by ''x'' changes the mid-range by ''x''/2, while it changes the sample mean, which also has a breakdown point of 0, by only ''x''/''n''. It is thus of little use in practical statistics, unless outliers are already handled.
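The difference in sensitivity follows from the definitions: moving one extreme by ''x'' moves the mid-range by ''x''/2 but the mean of ''n'' points by only ''x''/''n''. The sketch below checks this on made-up data; the values and variable names are assumptions chosen for illustration.

```python
from statistics import mean


def mid_range(values):
    return (min(values) + max(values)) / 2


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]  # n = 10
x = 100.0
shifted = data[:-1] + [data[-1] + x]  # raise only the maximum by x

mid_range_shift = mid_range(shifted) - mid_range(data)  # x / 2
mean_shift = mean(shifted) - mean(data)                  # x / n
print(mid_range_shift, mean_shift)  # 50.0 10.0
```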
A trimmed midrange is known as a midsummary: the ''n''% trimmed midrange is the average of the ''n''% and (100−''n'')% percentiles, and is more robust, having a breakdown point of ''n''%. In the middle of these is the midhinge, which is the 25% midsummary. The median can be interpreted as the fully trimmed (50%) mid-range; this accords with the convention that the median of an even number of points is the mean of the two middle points.
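A midsummary can be sketched in a few lines. Percentiles have several competing definitions; the version below assumes linear interpolation between order statistics, and the helper names `percentile` and `midsummary` are hypothetical.

```python
def percentile(sorted_data, p):
    """Percentile p (0-100), linearly interpolated between order statistics.

    This interpolation rule is one common convention, assumed here.
    """
    pos = (len(sorted_data) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    frac = pos - lo
    return sorted_data[lo] * (1 - frac) + sorted_data[hi] * frac


def midsummary(values, trim):
    """Average of the trim% and (100 - trim)% percentiles, 0 <= trim <= 50."""
    if not 0 <= trim <= 50:
        raise ValueError("trim must be between 0 and 50")
    data = sorted(values)
    if not data:
        raise ValueError("midsummary requires at least one value")
    return (percentile(data, trim) + percentile(data, 100 - trim)) / 2


data = [1, 2, 3, 4, 5, 6, 7, 100]   # illustrative data with one outlier
print(midsummary(data, 0))    # plain mid-range: 50.5
print(midsummary(data, 25))   # midhinge: 4.5
print(midsummary(data, 50))   # median: 4.5
```

With an even number of points, the 50% case averages the two middle values, matching the usual median convention. The untrimmed mid-range is dragged to 50.5 by the outlier, while the trimmed versions stay near the bulk of the data.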
These trimmed midranges are also of interest as descriptive statistics or as L-estimators of central location or skewness: differences of midsummaries, such as midhinge minus the median, give measures of skewness at different points in the tail.
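As a rough illustration of such a skewness measure, the sketch below computes the midhinge minus the median using the standard library. The function name `midhinge_skew`, the data, and the choice of the "inclusive" quartile method are assumptions made for this example.

```python
from statistics import median, quantiles


def midhinge_skew(values):
    """Midhinge minus median: positive values suggest a longer right tail."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q1 + q3) / 2 - median(values)


right_skewed = [1, 2, 2, 3, 4, 6, 9, 15, 30]
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(midhinge_skew(right_skewed))  # 1.5
print(midhinge_skew(symmetric))     # 0.0
```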
Efficiency
Despite its drawbacks, in some cases it is useful: the midrange is a highly efficient estimator of μ, given a small sample of a sufficiently platykurtic distribution, but it is inefficient for mesokurtic distributions, such as the normal.
For example, for a continuous uniform distribution with unknown maximum and minimum, the mid-range is the uniformly minimum-variance unbiased (UMVU) estimator for the mean. The sample maximum and sample minimum, together with the sample size, are a sufficient statistic for the population maximum and minimum: the distribution of the other sample points, conditional on a given maximum and minimum, is just the uniform distribution between the maximum and minimum, and thus adds no information. See German tank problem for further discussion. Thus the mid-range, which is an unbiased estimator of the population mean and a function of this sufficient statistic, is in fact the UMVU estimator: using the sample mean just adds noise based on the uninformative distribution of points within this range.
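The efficiency claim for the uniform case can be checked with a small simulation. This is a sketch under assumed settings (uniform on [0, 1], samples of 20, 2000 trials, fixed seed), not a proof; the function `compare_estimators` is hypothetical.

```python
import random
from statistics import mean, pvariance


def mid_range(values):
    return (min(values) + max(values)) / 2


def compare_estimators(n=20, trials=2000, seed=0):
    """Return (variance of mid-range, variance of mean) over repeated samples."""
    rng = random.Random(seed)
    mid_ranges, means = [], []
    for _ in range(trials):
        sample = [rng.uniform(0.0, 1.0) for _ in range(n)]
        mid_ranges.append(mid_range(sample))
        means.append(mean(sample))
    return pvariance(mid_ranges), pvariance(means)


var_mid_range, var_mean = compare_estimators()
print(var_mid_range < var_mean)  # the mid-range varies less for uniform data
```

Both estimators are centred on the true mean 0.5 here; the comparison is only about how tightly they scatter around it.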
Conversely, for the normal distribution, the sample mean is the UMVU estimator of the mean. Thus for platykurtic distributions, which can often be thought of as lying between a uniform distribution and a normal distribution, the informativeness of the middle sample points versus the extreme values varies from "equal" for the normal to "uninformative" for the uniform, and for different distributions, one or the other (or some combination thereof) may be most efficient. A robust analog is the trimean, which averages the midhinge (25% trimmed mid-range) and median.
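Written out, with Q_1, Q_2, Q_3 denoting the first quartile, the median and the third quartile (notation assumed for this illustration), the usual definition of the trimean is the same as averaging the midhinge and the median:

```latex
TM = \frac{Q_1 + 2Q_2 + Q_3}{4}
   = \frac{1}{2}\left( Q_2 + \frac{Q_1 + Q_3}{2} \right).
```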
Small samples
For small sample sizes (''n'' from 4 to 20) drawn from a sufficiently platykurtic distribution (negative excess kurtosis, defined as γ₂ = (μ₄/(μ₂)²) − 3), the mid-range is an efficient estimator of the mean ''μ''. The following table summarizes empirical data comparing three estimators of the mean for distributions of varied kurtosis; the modified mean is the truncated mean, where the maximum and minimum are eliminated.
For ''n'' = 1 or 2, the midrange and the mean are equal (and coincide with the median), and are most efficient for all distributions. For ''n'' = 3, the modified mean is the median, and instead the mean is the most efficient measure of central tendency for values of ''γ''₂ from 2.0 to 6.0 as well as from −0.8 to 2.0.
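The coincidences for very small samples are easy to confirm numerically. The sketch below assumes the modified mean drops exactly one minimum and one maximum; the function names and data are illustrative.

```python
from statistics import mean, median


def mid_range(values):
    return (min(values) + max(values)) / 2


def modified_mean(values):
    """Mean after discarding one minimum and one maximum (needs n >= 3)."""
    data = sorted(values)
    if len(data) < 3:
        raise ValueError("modified_mean requires at least three values")
    return mean(data[1:-1])


pair = [4, 10]
triple = [1, 5, 12]

# n = 2: mid-range, mean and median all coincide.
print(mid_range(pair), mean(pair), median(pair))
# n = 3: removing both extremes leaves only the middle value, the median.
print(modified_mean(triple), median(triple))
```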
Sampling properties
For a sample of size ''n'' from the standard normal distribution, the mid-range ''M'' is unbiased, and has a variance given asymptotically (for large ''n'') by:
: \operatorname{var}(M) = \frac{\pi^2}{24 \ln(n)}.
For a sample of size ''n'' from the standard Laplace distribution, the mid-range ''M'' is unbiased, and has a variance given asymptotically (for large ''n'') by:
: \operatorname{var}(M) = \frac{\pi^2}{12},
and, in particular, the variance does not decrease to zero as the sample size grows.
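That the variance does not shrink for Laplace data can be explored empirically. The sketch below draws standard Laplace variates as the difference of two independent unit-rate exponentials; the sample sizes, trial count and seed are arbitrary assumptions, and the estimates are noisy.

```python
import random
from statistics import pvariance


def mid_range_variance(n, trials, rng):
    """Estimate the variance of the mid-range of n standard Laplace draws."""
    mids = []
    for _ in range(trials):
        # Difference of two Exp(1) variates is standard Laplace.
        sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]
        mids.append((min(sample) + max(sample)) / 2)
    return pvariance(mids)


rng = random.Random(0)
estimates = {n: mid_range_variance(n, 500, rng) for n in (10, 100, 1000)}
for n, v in estimates.items():
    print(n, round(v, 3))
```

In runs of this kind the estimated variance should stay well away from zero even as ''n'' grows a hundredfold, in contrast to the sample mean, whose variance falls like 1/''n''.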
For a sample of size ''n'' from a zero-centred uniform distribution, the mid-range ''M'' is unbiased, and ''nM'' has an asymptotic distribution which is a Laplace distribution.
Deviation
While the mean of a set of values minimizes the sum of squares of deviations and the median minimizes the average absolute deviation, the midrange minimizes the maximum deviation (defined as \max_i |x_i - m|): it is a solution to a variational problem.
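The minimax property can be illustrated by comparing the largest deviation at the midrange with that at nearby candidate centres. The data and the candidate grid below are assumptions chosen for the example.

```python
def mid_range(values):
    return (min(values) + max(values)) / 2


def max_deviation(values, center):
    """Largest absolute deviation of any value from the given center."""
    return max(abs(x - center) for x in values)


data = [2, 3, 5, 8, 13]
m = mid_range(data)              # 7.5
best = max_deviation(data, m)    # half the range: 5.5

# Candidate centres from m - 5 to m + 5 in steps of 0.1.
candidates = [m + step / 10 for step in range(-50, 51)]
print(all(max_deviation(data, c) >= best for c in candidates))  # True
```

Moving the centre away from the midrange in either direction brings it closer to one extreme but farther from the other, so the largest deviation can only grow.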
See also
* Range (statistics)
* Midhinge