
In statistics, Samuelson's inequality, named after the economist Paul Samuelson, also called the Laguerre–Samuelson inequality after the mathematician Edmond Laguerre, states that every one of any collection ''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub> lies within <math>\sqrt{n-1}</math> uncorrected sample standard deviations of their sample mean.
Statement of the inequality
If we let
:<math>\bar{x} = \frac{x_1 + \cdots + x_n}{n}</math>
be the sample mean and
:<math>s = \sqrt{\frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})^2}</math>
be the standard deviation of the sample, then
:<math>\bar{x} - s\sqrt{n-1} \le x_j \le \bar{x} + s\sqrt{n-1} \qquad \text{for } j = 1, \dots, n.</math>
Equality holds on the left (or right) for <math>x_j</math> if and only if all the ''n'' − 1 values <math>x_i</math> other than <math>x_j</math> are equal to each other and greater (smaller) than <math>x_j</math>.
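[If one instead defines <math>s = \sqrt{\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2}</math>, then the inequality <math>\bar{x} - s\sqrt{n-1} \le x_j \le \bar{x} + s\sqrt{n-1}</math> still applies and can be slightly tightened to <math>\bar{x} - s\,\tfrac{n-1}{\sqrt{n}} \le x_j \le \bar{x} + s\,\tfrac{n-1}{\sqrt{n}}</math>.]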
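The statement can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `samuelson_bounds` and both sample data sets are illustrative assumptions. It computes the bounds from the uncorrected standard deviation (denominator ''n'') and also shows the equality case, in which all values but one coincide.

```python
import math

def samuelson_bounds(xs):
    """Return (lower, upper) Samuelson bounds for a list of real numbers.

    Uses the uncorrected sample standard deviation (denominator n).
    The function name is hypothetical, chosen for this illustration.
    """
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    half_width = s * math.sqrt(n - 1)
    return mean - half_width, mean + half_width

# Illustrative data: mean 5, uncorrected standard deviation 2.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
lo, hi = samuelson_bounds(data)
assert all(lo <= x <= hi for x in data)

# Equality case: the other n - 1 values are equal and larger than the
# smallest one, so the lower bound coincides with that smallest value.
extreme = [0.0, 1.0, 1.0, 1.0]
lo_e, hi_e = samuelson_bounds(extreme)
assert math.isclose(lo_e, min(extreme), abs_tol=1e-12)
```

In the second data set the upper bound is not attained, since equality on the right would require the other values to be equal and smaller than the largest one.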
Comparison to Chebyshev's inequality
Chebyshev's inequality locates a certain fraction of the data within certain bounds, while Samuelson's inequality locates ''all'' the data points within certain bounds.
The bounds given by Chebyshev's inequality are unaffected by the number of data points, while for Samuelson's inequality the bounds loosen as the sample size increases. Thus for large enough data sets, Chebyshev's inequality is more useful.
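The trade-off can be made concrete with a small sketch. It assumes the standard form of Chebyshev's inequality (at least 1 − 1/''k''<sup>2</sup> of the data lie within ''k'' standard deviations of the mean); the helper names `samuelson_k` and `chebyshev_min_fraction` are hypothetical.

```python
import math

def samuelson_k(n):
    """Number of standard deviations within which Samuelson's
    inequality places every one of n data points."""
    return math.sqrt(n - 1)

def chebyshev_min_fraction(k):
    """Minimum fraction of the data that Chebyshev's inequality
    places within k standard deviations of the mean (k > 1)."""
    return 1.0 - 1.0 / k ** 2

# For each sample size, print the Samuelson multiple and the fraction
# that Chebyshev already guarantees at that same distance.
for n in (5, 101, 10001):
    k = samuelson_k(n)
    print(n, k, chebyshev_min_fraction(k))
```

For ''n'' = 5 the Samuelson multiple is 2, a distance at which Chebyshev's inequality guarantees only three quarters of the data. As ''n'' grows, the Samuelson multiple grows without bound, while Chebyshev's inequality keeps giving the same guarantee at any fixed multiple ''k''.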
Applications
Samuelson's inequality has several applications in statistics and mathematics. It is useful in the studentization of residuals, where it provides a rationale for carrying out the studentization externally in order to better understand the spread of residuals in regression analysis.
In matrix theory, Samuelson's inequality is used to locate the eigenvalues of certain matrices and tensors.
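As an illustration of the matrix application, the sketch below assumes a real symmetric matrix, so that all eigenvalues are real. The eigenvalues then form a collection of ''n'' real numbers whose sum is the trace of the matrix and whose sum of squares is the trace of its square, and the inequality can be applied to them without computing them. The function name `eigenvalue_bounds` and the example matrix are assumptions made for this illustration.

```python
import math

def eigenvalue_bounds(a):
    """Samuelson-type bounds on the eigenvalues of a real symmetric
    matrix, given as a list of rows (hypothetical helper)."""
    n = len(a)
    trace = sum(a[i][i] for i in range(n))
    # trace(A^2) = sum over i, j of a[i][j] * a[j][i]
    trace_sq = sum(a[i][j] * a[j][i] for i in range(n) for j in range(n))
    mean = trace / n
    # Variance of the eigenvalues; clamp tiny negative rounding errors.
    variance = max(0.0, trace_sq / n - mean ** 2)
    half_width = math.sqrt(variance) * math.sqrt(n - 1)
    return mean - half_width, mean + half_width

# Tridiagonal example whose eigenvalues are 2 - sqrt(2), 2, 2 + sqrt(2).
matrix = [[2.0, 1.0, 0.0],
          [1.0, 2.0, 1.0],
          [0.0, 1.0, 2.0]]
lo, hi = eigenvalue_bounds(matrix)
assert lo <= 2 - math.sqrt(2) and 2 + math.sqrt(2) <= hi
```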
Furthermore, generalizations of this inequality apply to complex data and random variables in a probability space.
Relationship to polynomials
Samuelson was not the first to describe this relationship: the first was probably Laguerre in 1880, while investigating the roots (zeros) of polynomials.[Laguerre E. (1880) Mémoire pour obtenir par approximation les racines d'une équation algébrique qui a toutes les racines réelles. Nouv Ann Math 2e série, 19, 161–172, 193–202]
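Consider a polynomial with all roots real:
:<math>a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0</math>
Without loss of generality let <math>a_0 = 1</math> and let
:<math>t_1 = \sum x_i</math> and <math>t_2 = \sum x_i^2</math>
Then
:<math>a_1 = -\sum x_i = -t_1</math>
and
:<math>a_2 = \sum_{i<j} x_i x_j = \frac{t_1^2 - t_2}{2}</math>
In terms of the coefficients
:<math>t_2 = a_1^2 - 2a_2</math>
Laguerre showed that the roots of this polynomial were bounded by
:<math>-\frac{a_1}{n} \pm b\sqrt{n-1}</math>
where
:<math>b = \frac{\sqrt{n t_2 - t_1^2}}{n} = \frac{\sqrt{n a_1^2 - a_1^2 - 2 n a_2}}{n}</math>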
Inspection shows that <math>-\tfrac{a_1}{n}</math> is the mean of the roots and that ''b'' is the standard deviation of the roots.
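The inspection can be spelled out in two lines. In the sketch below, <math>t_1</math> denotes the sum of the roots, <math>t_2</math> the sum of their squares, and the polynomial is taken to be monic with second coefficient <math>a_1 = -t_1</math>; the symbol <math>\bar{x}</math> for the mean of the roots is introduced here only for the derivation.

```latex
\begin{aligned}
-\frac{a_1}{n} &= \frac{t_1}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}, \\
b^2 &= \frac{n t_2 - t_1^2}{n^2}
     = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2
     = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2 .
\end{aligned}
```

The last expression is the uncorrected variance of the roots, so ''b'' is their uncorrected standard deviation and the bounds on the roots take the same form as in Samuelson's inequality.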
Laguerre failed to notice this relationship with the means and standard deviations of the roots, being more interested in the bounds themselves. This relationship permits a rapid estimate of the bounds of the roots and may be of use in their location.
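A rapid estimate of this kind might be sketched as follows. The function name `laguerre_root_bounds` and the example cubic are illustrative assumptions; the code presumes that all roots are real and that the degree is at least 2, and it normalizes by the leading coefficient so that the polynomial is treated as monic.

```python
import math

def laguerre_root_bounds(coeffs):
    """Bounds on the roots of a polynomial with all roots real.

    coeffs lists the coefficients from the leading term down to the
    constant term. Only the first three are used. Hypothetical helper
    written for this illustration.
    """
    n = len(coeffs) - 1          # degree, assumed >= 2
    a0 = coeffs[0]
    a1 = coeffs[1] / a0          # normalize to a monic polynomial
    a2 = coeffs[2] / a0
    mean = -a1 / n               # mean of the roots
    # Standard deviation of the roots; the radicand is non-negative
    # whenever all roots are real.
    b = math.sqrt((n - 1) * a1 ** 2 - 2 * n * a2) / n
    half_width = b * math.sqrt(n - 1)
    return mean - half_width, mean + half_width

# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6, with roots 1, 2, 3.
lo, hi = laguerre_root_bounds([1, -6, 11, -6])
assert all(lo <= r <= hi for r in (1, 2, 3))
```

Only the degree and the three leading coefficients enter the computation, which is why the estimate is quick to obtain.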
When the coefficients <math>a_1</math> and <math>a_2</math> are both zero, no information can be obtained about the location of the roots, because not all roots are real (as can be seen from Descartes' rule of signs) unless the constant term is also zero.