Scoring algorithm, also known as Fisher's scoring, is a form of Newton's method used in statistics to solve maximum likelihood equations numerically, named after Ronald Fisher.
Sketch of derivation
Let $Y_1, \ldots, Y_n$ be random variables, independent and identically distributed with twice differentiable p.d.f. $f(y; \theta)$, and we wish to calculate the maximum likelihood estimator (M.L.E.) $\theta^*$ of $\theta$. First, suppose we have a starting point for our algorithm $\theta_0$, and consider a Taylor expansion of the score function, $V(\theta)$, about $\theta_0$:

$V(\theta) \approx V(\theta_0) - J(\theta_0)(\theta - \theta_0),$
where

$J(\theta_0) = -\sum_{i=1}^{n} \left. \nabla \nabla^{\top} \log f(Y_i ; \theta) \right|_{\theta = \theta_0}$

is the observed information matrix at $\theta_0$. Now, setting $\theta = \theta^*$, using that $V(\theta^*) = 0$ and rearranging gives us:

$\theta^* \approx \theta_0 + J^{-1}(\theta_0) V(\theta_0).$
We therefore use the algorithm

$\theta_{m+1} = \theta_m + J^{-1}(\theta_m) V(\theta_m),$

and under certain regularity conditions, it can be shown that $\theta_m \to \theta^*$.
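As an illustrative sketch (not part of the original article), the iteration above can be applied to a model where the score and observed information have closed forms, such as the rate parameter of i.i.d. Poisson data, where $V(\lambda) = \sum_i Y_i/\lambda - n$ and $J(\lambda) = \sum_i Y_i/\lambda^2$. The function names here are hypothetical:

```python
import numpy as np

def poisson_score(lam, y):
    # Score V(lam): derivative of the Poisson log-likelihood in lam.
    return y.sum() / lam - y.size

def poisson_observed_information(lam, y):
    # J(lam): negative second derivative of the log-likelihood.
    return y.sum() / lam**2

def newton_scoring(y, lam0, tol=1e-10, max_iter=100):
    # Iterate lam_{m+1} = lam_m + J^{-1}(lam_m) V(lam_m) until the step is tiny.
    lam = lam0
    for _ in range(max_iter):
        step = poisson_score(lam, y) / poisson_observed_information(lam, y)
        lam += step
        if abs(step) < tol:
            break
    return lam
```

For this model the fixed point of the iteration is the sample mean, which is indeed the Poisson M.L.E.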
Fisher scoring
In practice, $J(\theta)$ is usually replaced by $\mathcal{I}(\theta) = \mathrm{E}[J(\theta)]$, the Fisher information.
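Continuing the hypothetical Poisson sketch from above, replacing the observed information $J(\lambda) = \sum_i Y_i/\lambda^2$ by its expectation $\mathcal{I}(\lambda) = n/\lambda$ gives the Fisher scoring update, which for this particular model lands exactly on the sample mean in a single step:

```python
import numpy as np

def fisher_scoring_poisson(y, lam0, tol=1e-12, max_iter=50):
    # Fisher scoring for the Poisson rate: the observed information is
    # replaced by the expected (Fisher) information I(lam) = n / lam.
    n = y.size
    lam = lam0
    for _ in range(max_iter):
        score = y.sum() / lam - n       # V(lam)
        step = (lam / n) * score        # I(lam)^{-1} V(lam)
        lam += step
        if abs(step) < tol:
            break
    return lam
```

Here the update simplifies to $\lambda_{m+1} = \lambda_m + (\bar{Y} - \lambda_m) = \bar{Y}$, so the second iteration produces a zero step and the loop exits; in general models the two iterations differ and Fisher scoring merely converges under similar regularity conditions.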