In statistics, Samuelson's inequality, named after the economist Paul Samuelson, also called the Laguerre–Samuelson inequality, after the mathematician Edmond Laguerre, states that every value in any collection ''x''₁, ..., ''x''_''n'' lies within √(''n'' − 1) uncorrected sample standard deviations of the collection's sample mean.

Statement of the inequality

If we let

\overline{x} = \frac{x_1 + \cdots + x_n}{n}

be the sample mean and

s = \sqrt{\frac{1}{n} \sum_{i=1}^n (x_i - \overline{x})^2}

be the uncorrected standard deviation of the sample, then

\overline{x} - s\sqrt{n-1} \le x_j \le \overline{x} + s\sqrt{n-1} \qquad \text{for } j = 1,\dots,n.

Equality holds on the left (or right) for ''x''_''j'' if and only if all the ''n'' − 1 values ''x''_''i'' other than ''x''_''j'' are equal to each other and greater (smaller) than ''x''_''j''.
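The statement can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `samuelson_bounds` and both sample lists are hypothetical choices made for illustration. The second sample has ''n'' − 1 equal values and one smaller value, so its smallest value should coincide with the lower bound.

```python
import math

def samuelson_bounds(xs):
    """Return (lower, upper) bounds from Samuelson's inequality.

    Uses the uncorrected (divide-by-n) sample standard deviation.
    """
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    half_width = s * math.sqrt(n - 1)
    return mean - half_width, mean + half_width

# Hypothetical sample: every value should fall inside the bounds.
data = [2.0, 3.0, 5.0, 7.0, 11.0]
lo, hi = samuelson_bounds(data)
inside = all(lo <= x <= hi for x in data)
print(lo, hi, inside)

# Equality case: n - 1 equal values and one smaller value.
# The lower bound should match the smallest value up to rounding.
extreme = [0.0, 4.0, 4.0, 4.0, 4.0]
lo_e, hi_e = samuelson_bounds(extreme)
print(lo_e, hi_e)
```

Because the computation is done in floating point, the equality case is best compared with a tolerance (for example `math.isclose`) rather than with `==`.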

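If one instead defines

s = \sqrt{\frac{1}{n-1} \sum_{i=1}^n (x_i - \overline{x})^2}

then the inequality

\overline{x} - s\sqrt{n-1} \le x_j \le \overline{x} + s\sqrt{n-1}

still applies and can be slightly tightened to

\overline{x} - s\tfrac{n-1}{\sqrt{n}} \le x_j \le \overline{x} + s\tfrac{n-1}{\sqrt{n}}.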
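As a rough illustration of why the tightened form is the natural one, the sketch below compares three half-widths for a hypothetical sample using Python's standard `statistics` module. The function name `half_widths` and the data are assumptions made for this example. The tightened bound with the corrected standard deviation should agree with the original bound using the uncorrected one, and both should be no wider than the untightened corrected bound.

```python
import math
import statistics

def half_widths(xs):
    """Return (tight, loose, original) half-widths around the mean."""
    n = len(xs)
    s_corrected = statistics.stdev(xs)     # divides by n - 1
    s_uncorrected = statistics.pstdev(xs)  # divides by n
    tight = s_corrected * (n - 1) / math.sqrt(n)
    loose = s_corrected * math.sqrt(n - 1)
    original = s_uncorrected * math.sqrt(n - 1)
    return tight, loose, original

data = [2.0, 3.0, 5.0, 7.0, 11.0]  # hypothetical sample
tight, loose, original = half_widths(data)
print(tight, loose, original)
```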

Comparison to Chebyshev's inequality

Chebyshev's inequality locates a certain fraction of the data within certain bounds, while Samuelson's inequality locates ''all'' the data points within certain bounds. The bounds given by Chebyshev's inequality are unaffected by the number of data points, while for Samuelson's inequality the bounds loosen as the sample size increases. Thus for large enough data sets, Chebyshev's inequality is more useful.
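The contrast can be made concrete with a small sketch. The helper names below are hypothetical, and the choice of three standard deviations for Chebyshev's inequality is only an example. Samuelson's bound, measured in units of ''s'', grows like √(''n'' − 1), whereas Chebyshev's guarantee for a fixed number of standard deviations does not depend on ''n''.

```python
import math

def samuelson_multiplier(n):
    """Half-width of Samuelson's bound in units of s."""
    return math.sqrt(n - 1)

def chebyshev_min_fraction(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    return max(0.0, 1.0 - 1.0 / k ** 2)

# Example sample sizes (assumed for illustration only).
rows = [
    (n, samuelson_multiplier(n), chebyshev_min_fraction(3))
    for n in (5, 50, 500, 5000)
]
for n, mult, frac in rows:
    print(f"n={n:5d}  Samuelson: all data within {mult:7.2f} s  "
          f"Chebyshev: at least {frac:.3f} of data within 3 s")
```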

Applications

Samuelson’s inequality has several applications in statistics and mathematics. It is useful in the studentization of residuals, where it provides a rationale for performing this process externally in order to better understand the spread of residuals in regression analysis. In matrix theory, Samuelson’s inequality is used to locate the eigenvalues of certain matrices and tensors. Furthermore, generalizations of this inequality apply to complex data and random variables in a probability space.

Relationship to polynomials

Samuelson was not the first to describe this relationship: the first was probably Laguerre in 1880 while investigating the roots (zeros) of polynomials (Laguerre E. (1880) Mémoire pour obtenir par approximation les racines d'une équation algébrique qui a toutes les racines réelles. Nouv Ann Math 2^e série, 19, 161–172, 193–202).

Consider a polynomial with all roots real:

a_0x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n = 0

Without loss of generality let

a_0 = 1

and, writing x_1, \dots, x_n for the roots, let

t_1 = \sum x_i

and

t_2 = \sum x_i^2

Then

a_1 = - \sum x_i = -t_1

and

a_2 = \sum x_ix_j = \frac{t_1^2 - t_2}{2} \qquad \text{where } i < j

In terms of the coefficients

t_2 = a_1^2 - 2a_2

Laguerre showed that the roots of this polynomial were bounded by

-a_1 / n \pm b \sqrt{n - 1}

where

b = \frac{\sqrt{nt_2 - t_1^2}}{n} = \frac{\sqrt{na_1^2 - a_1^2 - 2na_2}}{n}

Inspection shows that -\tfrac{a_1}{n} is the mean of the roots and that ''b'' is the standard deviation of the roots. Laguerre failed to notice this relationship with the means and standard deviations of the roots, being more interested in the bounds themselves. This relationship permits a rapid estimate of the bounds of the roots and may be of use in their location. When the coefficients a_1 and a_2 are both zero, no information can be obtained about the location of the roots, because not all roots are real (as can be seen from Descartes' rule of signs) unless the constant term is also zero.
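The bounds on the roots can be illustrated with a short sketch that needs only the coefficients a_1 and a_2. The function names and the choice of roots are hypothetical. The polynomial is built from known real roots so that the result can be checked without a root finder, and the code assumes a monic polynomial of degree at least 2 whose roots are all real.

```python
import math

def monic_coefficients(roots):
    """Expand prod(x - r) and return [a_0, a_1, ..., a_n] with a_0 = 1."""
    coeffs = [1.0]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        nxt = coeffs + [0.0]
        for i in range(len(coeffs)):
            nxt[i + 1] -= r * coeffs[i]
        coeffs = nxt
    return coeffs

def laguerre_bounds(coeffs):
    """Bounds -a_1/n +/- b*sqrt(n - 1) for a monic polynomial with real roots."""
    n = len(coeffs) - 1
    a1, a2 = coeffs[1], coeffs[2]
    center = -a1 / n  # mean of the roots
    b = math.sqrt(n * a1 ** 2 - a1 ** 2 - 2 * n * a2) / n  # std. dev. of the roots
    half = b * math.sqrt(n - 1)
    return center - half, center + half

roots = [-2.0, 1.0, 3.0, 4.0]  # hypothetical real roots
coeffs = monic_coefficients(roots)  # x^4 - 6x^3 + 3x^2 + 26x - 24
lo, hi = laguerre_bounds(coeffs)
print(coeffs)
print(lo, hi, all(lo <= r <= hi for r in roots))
```

If the polynomial has non-real roots, the quantity under the square root can be negative and `math.sqrt` raises `ValueError`, so this sketch is only meaningful under the all-real-roots assumption stated above.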
