In statistics, the method of moments is a method of estimation of population parameters. The same principle is used to derive higher moments like skewness and kurtosis.
It starts by expressing the population moments (i.e., the expected values of powers of the random variable under consideration) as functions of the parameters of interest. Those expressions are then set equal to the sample moments. The number of such equations is the same as the number of parameters to be estimated. Those equations are then solved for the parameters of interest. The solutions are estimates of those parameters.
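The recipe above can be sketched for a concrete case. The following is a minimal illustration, assuming a gamma distribution with shape k and scale θ as the model (a choice made here for illustration, not taken from the text); the function name `gamma_method_of_moments` is hypothetical. The first two population moments are E[X] = kθ and E[X²] = k(k+1)θ², so matching them to the sample moments gives two equations in two unknowns.

```python
import random

def gamma_method_of_moments(sample):
    """Return (shape, scale) estimates for an assumed gamma model."""
    n = len(sample)
    m1 = sum(sample) / n                    # first sample moment
    m2 = sum(x * x for x in sample) / n     # second sample moment
    # Population moments: E[X] = k*theta and E[X^2] = k*(k+1)*theta^2.
    # Setting these equal to m1 and m2 and solving gives:
    spread = m2 - m1 ** 2                   # equals k*theta^2 in the model
    scale = spread / m1
    shape = m1 ** 2 / spread
    return shape, scale

# Illustrative data: simulated draws with shape 3 and scale 2.
rng = random.Random(0)
sample = [rng.gammavariate(3.0, 2.0) for _ in range(20000)]
shape_hat, scale_hat = gamma_method_of_moments(sample)
print(shape_hat, scale_hat)
```

With a sample this large the two estimates should land close to the values used in the simulation, which is the consistency property in action.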
The method of moments was introduced by Pafnuty Chebyshev in 1887 in the proof of the central limit theorem. The idea of matching empirical moments of a distribution to the population moments dates back at least to Karl Pearson.
Method
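Suppose that the parameter $\theta = (\theta_1, \theta_2, \dots, \theta_k)$ characterizes the distribution $f_W(w; \theta)$ of the random variable $W$. Suppose the first $k$ moments of the true distribution (the "population moments") can be expressed as functions of the $\theta$s:

$$\mu_j \equiv \operatorname{E}[W^j] = g_j(\theta_1, \theta_2, \dots, \theta_k), \qquad j = 1, \dots, k.$$

Suppose a sample of size $n$ is drawn, resulting in the values $w_1, \dots, w_n$. For $j = 1, \dots, k$, let

$$\widehat{\mu}_j = \frac{1}{n} \sum_{i=1}^{n} w_i^j$$

be the ''j''-th sample moment, an estimate of $\mu_j$. The method of moments estimator for $\theta_1, \dots, \theta_k$, denoted by $\widehat{\theta}_1, \dots, \widehat{\theta}_k$, is defined to be the solution (if one exists) to the equations:

$$\widehat{\mu}_j = g_j(\widehat{\theta}_1, \widehat{\theta}_2, \dots, \widehat{\theta}_k), \qquad j = 1, \dots, k.$$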
The method described here for single random variables generalizes in an obvious manner to multiple random variables leading to multiple choices for moments to be used. Different choices generally lead to different solutions.
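As a worked illustration of the procedure, consider a normal distribution with two parameters, taken here (as an assumption for the example) to be the mean $\theta_1$ and the variance $\theta_2$. Writing $\mu_j$ for the ''j''-th population moment of the random variable $W$ and $\widehat{\mu}_j$ for the ''j''-th sample moment of the observations $w_1, \dots, w_n$, the first two population moments are

$$\mu_1 = \operatorname{E}[W] = \theta_1, \qquad \mu_2 = \operatorname{E}[W^2] = \theta_2 + \theta_1^2.$$

Replacing the population moments with the sample moments and solving gives

$$\widehat{\theta}_1 = \widehat{\mu}_1 = \frac{1}{n} \sum_{i=1}^{n} w_i, \qquad \widehat{\theta}_2 = \widehat{\mu}_2 - \widehat{\mu}_1^2 = \frac{1}{n} \sum_{i=1}^{n} \left( w_i - \widehat{\mu}_1 \right)^2.$$

The variance estimate divides by $n$ rather than $n - 1$, so it is a biased (though consistent) estimator of $\theta_2$.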
Advantages and disadvantages
The method of moments is fairly simple and yields consistent estimators (under very weak assumptions), though these estimators are often biased.
It is an alternative to the method of maximum likelihood. In some cases the likelihood equations may be intractable without computers, whereas the method-of-moments estimators can be computed much more quickly and easily. Because they are easy to compute, method-of-moments estimates may be used as a first approximation to the solutions of the likelihood equations, and successively improved approximations may then be found by the Newton–Raphson method. In this way the method of moments can assist in finding maximum likelihood estimates.
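The following sketch illustrates this use of a moment estimate as a Newton–Raphson starting point. It assumes a gamma model with shape k (an illustrative choice, not taken from the text), for which the likelihood equation ln k − ψ(k) = ln(mean) − mean(ln x) has no closed-form solution. The helper names `digamma`, `trigamma`, and `gamma_shape_mle` are hypothetical, and the two special functions are approximated with a standard recurrence plus asymptotic series so that the example needs only the standard library.

```python
import math
import random

def digamma(x):
    """Approximate digamma for x > 0: recurrence up to x >= 6, then a series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return result + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))

def trigamma(x):
    """Approximate trigamma (derivative of digamma) for x > 0."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return result + inv + 0.5 * inv2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))

def gamma_shape_mle(sample, max_iter=50, tol=1e-12):
    """Return (moment estimate, likelihood estimate) of the gamma shape."""
    n = len(sample)
    mean = sum(sample) / n
    spread = sum(x * x for x in sample) / n - mean ** 2
    s = math.log(mean) - sum(math.log(x) for x in sample) / n
    k_mom = mean ** 2 / spread        # method-of-moments starting value
    k = k_mom
    for _ in range(max_iter):
        f = math.log(k) - digamma(k) - s       # likelihood equation residual
        f_prime = 1.0 / k - trigamma(k)
        step = f / f_prime
        k -= step                              # Newton-Raphson update
        if abs(step) < tol * k:
            break
    return k_mom, k

# Illustrative data: simulated draws with shape 3 and scale 2.
rng = random.Random(1)
sample = [rng.gammavariate(3.0, 2.0) for _ in range(2000)]
k_mom, k_mle = gamma_shape_mle(sample)
print(k_mom, k_mle)
```

Because the moment estimate is already near the root, only a few Newton–Raphson steps are typically needed; a poor starting value could push the iteration outside the admissible range k > 0, which this sketch does not guard against.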
In some cases, infrequent with large samples but less infrequent with small samples, the estimates given by the method of moments lie outside the parameter space (as shown in the example below); it does not make sense to rely on them then. That problem never arises in the method of maximum likelihood. Also, estimates by the method of moments are not necessarily sufficient statistics, i.e., they sometimes fail to take into account all relevant information in the sample.
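A small numerical sketch shows how a moment estimate can contradict the data. It assumes a uniform distribution on an interval [a, b] as the model (chosen here for illustration), whose mean is (a + b)/2 and whose variance is (b − a)²/12; the function name `uniform_method_of_moments` is hypothetical.

```python
import math

def uniform_method_of_moments(sample):
    """Return (a_hat, b_hat) for an assumed uniform model on [a, b]."""
    n = len(sample)
    m1 = sum(sample) / n                    # first sample moment
    m2 = sum(x * x for x in sample) / n     # second sample moment
    # Matching mean (a+b)/2 and variance (b-a)^2/12 to the sample moments
    # gives a half-width of sqrt(3 * (m2 - m1^2)) around the sample mean.
    half_width = math.sqrt(3.0 * (m2 - m1 ** 2))
    return m1 - half_width, m1 + half_width

sample = [0, 0, 0, 0, 1]
a_hat, b_hat = uniform_method_of_moments(sample)
print(a_hat, b_hat)
print(b_hat < max(sample))   # the estimated upper endpoint excludes an observed value
```

For this sample the estimated upper endpoint is below the largest observation, so the fitted interval cannot have produced the data.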
When estimating other structural parameters (e.g., parameters of a utility function, instead of parameters of a known probability distribution), appropriate probability distributions may not be known, and moment-based estimates may be preferred to maximum likelihood estimation.
Alternative method of moments
The equations to be solved in the method of moments (MoM) are in general nonlinear, and there are no generally applicable guarantees that tractable solutions exist. An alternative approach also uses sample moments to estimate data model parameters through the known dependence of model moments on those parameters, but it requires the solution of only linear equations or, more generally, tensor equations. This alternative is referred to as the Bayesian-Like MoM (BL-MoM), and it differs from the classical MoM in that it uses optimally weighted sample moments. Considering that the MoM is typically motivated by a lack of sufficient knowledge about the data model to determine likelihood functions and associated ''a posteriori'' probabilities of unknown or random parameters, it may seem odd that there exists a type of MoM that is ''Bayesian-Like''. But the particular meaning of ''Bayesian-Like'' leads to a problem formulation in which required knowledge of ''a posteriori'' probabilities is replaced with required knowledge of only the dependence of model moments on unknown model parameters, which is exactly the knowledge required by the traditional MoM.[Lindsay, B.G. & Basak P. (1993). “Multivariate normal mixtures: a fast consistent method of moments”, ''Journal of the American Statistical Association'' 88, 468–476.][Quandt, R.E. & Ramsey, J.B. (1978). “Estimating mixtures of normal distributions and switching regressions”, ''Journal of the American Statistical Association'' 73, 730–752.][https://real-statistics.com/distribution-fitting/method-of-moments/][Hansen, L. (1982). “Large sample properties of generalized method of moments estimators”, ''Econometrica'' 50, 1029–1054.][Lindsay, B.G. (1982). “Conditional score functions: some optimality results”, ''Biometrika'' 69, 503–512.] The BL-MoM also uses knowledge of ''a priori'' probabilities of the parameters to be estimated, when available, but otherwise uses uniform priors.
The BL-MoM has been reported only in the applied statistics literature, in connection with parameter estimation and hypothesis testing using observations of stochastic processes for problems in information and communications theory and, in particular, communications receiver design in the absence of knowledge of likelihood functions or associated ''a posteriori'' probabilities[Gardner, W.A., “Design of nearest prototype signal classifiers”, ''IEEE Transactions on Information Theory'' 27 (3), 368–372, 1981] and references therein. In addition, a restatement of this receiver design approach for stochastic process models as an alternative to the classical MoM for any type of multivariate data is available in tutorial form at the university website Cyclostationarity, page 11.4. The applications in these sources and their references demonstrate some important characteristics of this alternative to the classical MoM, and a detailed list of relative advantages and disadvantages has also been given, but the literature is missing direct comparisons in specific applications of the classical MoM and the BL-MoM.
Examples
An example application of the method of moments is to estimate polynomial probability density distributions. In this case, an approximating polynomial of order
is defined on an interval