The scoring algorithm, also known as Fisher's scoring,[1] is a form of Newton's method used in statistics to solve maximum likelihood equations numerically. It is named after Ronald Fisher.
Let $Y_1,\ldots,Y_n$ be random variables, independent and identically distributed with twice-differentiable p.d.f. $f(y;\theta)$, and suppose we wish to calculate the maximum likelihood estimator (M.L.E.) $\theta^*$ of $\theta$. First, suppose we have a starting point for our algorithm $\theta_0$, and consider a Taylor expansion of the score function, $V(\theta)$, about $\theta_0$:

$$V(\theta) \approx V(\theta_0) - \mathcal{J}(\theta_0)\,(\theta - \theta_0),$$

where

$$\mathcal{J}(\theta_0) = -\sum_{i=1}^{n} \left. \nabla\nabla^{\top} \log f(Y_i;\theta) \right|_{\theta=\theta_0}$$

is the observed information matrix at $\theta_0$. Now, setting $\theta = \theta^*$, using that $V(\theta^*) = 0$, and rearranging gives us

$$\theta^* \approx \theta_0 + \mathcal{J}^{-1}(\theta_0)\,V(\theta_0).$$
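For example, if $Y_1,\ldots,Y_n$ are i.i.d. exponential with p.d.f. $f(y;\theta) = \theta e^{-\theta y}$, the log-likelihood is $n\log\theta - \theta\sum_i Y_i$, so the score is $V(\theta) = n/\theta - \sum_i Y_i$ and the observed information is $\mathcal{J}(\theta) = n/\theta^2$, which in this case happens to coincide with the Fisher information because it does not depend on the data.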
We therefore use the algorithm

$$\theta_{m+1} = \theta_m + \mathcal{J}^{-1}(\theta_m)\,V(\theta_m),$$

and under certain regularity conditions, it can be shown that $\theta_m \rightarrow \theta^*$.
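The iteration can be written compactly in code. The following is a minimal sketch in Python (not from the original article): it approximates the score and observed information of a scalar parameter by central finite differences, applies the update above, and is demonstrated on the exponential example; the function names and the finite-difference step `h` are illustrative choices.

```python
import numpy as np

def score_and_obs_info(loglik, theta, h=1e-4):
    # Central finite-difference estimates of the score V(theta) and the
    # observed information J(theta) = -(d^2/dtheta^2) loglik(theta).
    v = (loglik(theta + h) - loglik(theta - h)) / (2 * h)
    j = -(loglik(theta + h) - 2 * loglik(theta) + loglik(theta - h)) / h**2
    return v, j

def scoring_iteration(loglik, theta0, tol=1e-8, max_iter=100):
    # Iterate theta_{m+1} = theta_m + J(theta_m)^{-1} V(theta_m).
    theta = theta0
    for _ in range(max_iter):
        v, j = score_and_obs_info(loglik, theta)
        step = v / j
        theta += step
        if abs(step) < tol:
            break
    return theta

# Exponential example: the M.L.E. of the rate is 1 / sample mean.
rng = np.random.default_rng(0)
y = rng.exponential(scale=2.0, size=500)      # true rate theta = 0.5
loglik = lambda th: y.size * np.log(th) - th * y.sum()
print(scoring_iteration(loglik, theta0=0.4))  # ~ 1 / y.mean()
```

As the derivation assumes, a reasonable starting point matters: with $\theta_0$ too far from $\theta^*$, the Newton step can overshoot outside the parameter space.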
In practice, $\mathcal{J}(\theta)$ is usually replaced by $\mathcal{I}(\theta) = \mathrm{E}[\mathcal{J}(\theta)]$, the Fisher information, thus giving us the Fisher scoring algorithm:

$$\theta_{m+1} = \theta_m + \mathcal{I}^{-1}(\theta_m)\,V(\theta_m).$$
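As a concrete illustration, the following is a minimal sketch (not from the cited sources) of Fisher scoring for the location $\theta$ of a unit-scale Cauchy sample, a standard case where the substitution matters: the observed information can be negative far from the mode, while the Fisher information is the constant $n/2$, so the scoring step is always well-defined.

```python
import numpy as np

def fisher_scoring_cauchy(y, theta0, tol=1e-10, max_iter=200):
    # Fisher scoring for the location of a unit-scale Cauchy sample.
    # Score: V(theta) = sum_i 2 (y_i - theta) / (1 + (y_i - theta)^2);
    # Fisher information: I(theta) = n / 2, a constant for unit scale.
    n = y.size
    theta = theta0
    for _ in range(max_iter):
        u = y - theta
        step = np.sum(2 * u / (1 + u**2)) / (n / 2)   # I^{-1} V
        theta += step
        if abs(step) < tol:
            break
    return theta

rng = np.random.default_rng(1)
y = rng.standard_cauchy(2000) + 3.0                   # true location 3.0
print(fisher_scoring_cauchy(y, theta0=np.median(y)))  # ~ 3.0
```

Because $\mathcal{I}(\theta)$ is a covariance matrix of the score, it is always positive semi-definite, so the scoring step scales the score by a well-behaved quantity at every iterate; this is one practical reason for the substitution.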
Under some regularity conditions, if $\theta_m$ is a consistent estimator, then $\theta_{m+1}$ (the correction after a single step) is 'optimal' in the sense that its error distribution is asymptotically identical to that of the true maximum likelihood estimate.[2]
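This one-step property can be sketched numerically. Continuing the Cauchy example above (an illustration, not from the cited reference [2]): the sample median is a consistent estimator of the location, and a single scoring step from it already yields an asymptotically efficient estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.standard_cauchy(5000) + 3.0   # true location 3.0

theta0 = np.median(y)                 # consistent pilot estimate
u = y - theta0
# One Fisher-scoring correction: theta_1 = theta_0 + I^{-1}(theta_0) V(theta_0).
theta1 = theta0 + (2 / y.size) * np.sum(2 * u / (1 + u**2))
print(theta0, theta1)                 # theta1 matches the M.L.E. asymptotically
```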