Related concepts: maximum likelihood estimation, score function, Fisher information
Let f(X; θ) be the probability density function (or probability mass function) for X conditional on the value of θ. This is also the likelihood function for θ. It describes the probability that we observe a given sample X, given a known value of θ.
1. If f is sharply peaked with respect to changes in θ, it is easy to identify the “correct” value of θ from the data; equivalently, the data X provide a lot of information about the parameter θ.
2. If the likelihood f is flat and spread out, then it would take many, many samples like X to estimate the actual “true” value of θ that would be obtained using the entire population being sampled.
Both cases suggest studying some kind of variance with respect to θ.
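Score function: the first derivative of the log-likelihood with respect to θ:
$$s(X; \theta) = \frac{\partial}{\partial\theta} \log f(X; \theta)$$
Under regularity conditions that allow differentiation under the integral sign:
$$E\left[\frac{\partial}{\partial\theta} \log f(X; \theta) \,\middle|\, \theta\right] = \int \frac{\partial f(x; \theta) / \partial\theta}{f(x; \theta)} \, f(x; \theta) \, dx = \frac{\partial}{\partial\theta} \int f(x; \theta) \, dx = \frac{\partial}{\partial\theta} 1 = 0$$
The above proves that the expectation of the first derivative (the score) is 0.
Fisher information: the second moment of the score function:
$$I(\theta) = E\left[\left(\frac{\partial}{\partial\theta} \log f(X; \theta)\right)^2 \,\middle|\, \theta\right]$$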
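As a small sanity check of these two statements (the score has zero mean, and the Fisher information is the second moment of the score), the expectations can be computed exactly for a single Bernoulli(p) observation. The Bernoulli model and the function names below are assumptions chosen for illustration; they do not come from the original notes. For this model the information has the known closed form 1 / (p(1 - p)).

```python
def bernoulli_pmf(x, p):
    """Probability mass f(x; p) of a Bernoulli(p) variable, x in {0, 1}."""
    return p if x == 1 else 1.0 - p


def bernoulli_score(x, p):
    """Score: d/dp log f(x; p) = x/p - (1 - x)/(1 - p)."""
    return x / p - (1 - x) / (1.0 - p)


def expected_score(p):
    """Exact E[score] obtained by summing over both outcomes."""
    return sum(bernoulli_pmf(x, p) * bernoulli_score(x, p) for x in (0, 1))


def fisher_information(p):
    """Exact E[score^2], the second moment of the score."""
    return sum(bernoulli_pmf(x, p) * bernoulli_score(x, p) ** 2 for x in (0, 1))


if __name__ == "__main__":
    p = 0.3
    print("E[score]   =", expected_score(p))        # zero up to rounding
    print("E[score^2] =", fisher_information(p))
    print("1/(p(1-p)) =", 1.0 / (p * (1.0 - p)))    # closed form for comparison
```

Because the outcome space has only two points, no simulation is needed and the check is exact up to floating-point rounding.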
Note that the Fisher information is non-negative, since it is a second moment. A random variable carrying high Fisher information implies that the absolute value of the score is often high. The Fisher information is not a function of a particular observation, as the random variable X has been averaged out.
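If the log-likelihood is twice differentiable with respect to θ, then under regularity conditions the Fisher information can also be written as:
$$I(\theta) = -E\left[\frac{\partial^2}{\partial\theta^2} \log f(X; \theta) \,\middle|\, \theta\right]$$
This is because the first derivative of the log-likelihood (the score) has expectation 0, so its variance equals its second moment, and the variance of the first derivative is exactly the information:
$$\operatorname{Var}\left(\frac{\partial}{\partial\theta} \log f(X; \theta)\right) = E\left[\left(\frac{\partial}{\partial\theta} \log f(X; \theta)\right)^2 \,\middle|\, \theta\right] = I(\theta)$$
At the same time:
$$\frac{\partial^2}{\partial\theta^2} \log f(X; \theta) = \frac{\partial^2 f(X; \theta) / \partial\theta^2}{f(X; \theta)} - \left(\frac{\partial f(X; \theta) / \partial\theta}{f(X; \theta)}\right)^2$$
and
$$E\left[\frac{\partial^2 f(X; \theta) / \partial\theta^2}{f(X; \theta)} \,\middle|\, \theta\right] = \frac{\partial^2}{\partial\theta^2} \int f(x; \theta) \, dx = 0$$
Taking the expectation of both sides of the identity for the second derivative, the first term vanishes and the second term is $-I(\theta)$.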
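The equivalence between the second-moment form and the negative expected second derivative can be checked numerically. The sketch below assumes a Poisson(λ) model, for which the information is known to be 1/λ. It approximates the derivatives of the log-pmf with central finite differences and truncates the sum over outcomes at a hypothetical cutoff `x_max`. The model, the step size `h`, and all names are illustrative assumptions.

```python
import math


def log_pmf(x, lam):
    """Log of the Poisson pmf: x*log(lam) - lam - log(x!)."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def first_derivative(x, lam, h=1e-3):
    """Central finite-difference approximation of d/d(lam) log f."""
    return (log_pmf(x, lam + h) - log_pmf(x, lam - h)) / (2.0 * h)


def second_derivative(x, lam, h=1e-3):
    """Central finite-difference approximation of d^2/d(lam)^2 log f."""
    return (log_pmf(x, lam + h) - 2.0 * log_pmf(x, lam) + log_pmf(x, lam - h)) / (h * h)


def information_two_ways(lam, x_max=100):
    """Return (E[score^2], -E[second derivative]) using a truncated sum."""
    score_second_moment = 0.0
    negative_expected_curvature = 0.0
    for x in range(x_max + 1):
        weight = math.exp(log_pmf(x, lam))  # f(x; lam)
        score_second_moment += weight * first_derivative(x, lam) ** 2
        negative_expected_curvature -= weight * second_derivative(x, lam)
    return score_second_moment, negative_expected_curvature


if __name__ == "__main__":
    lam = 3.0
    by_score, by_curvature = information_two_ways(lam)
    print("E[score^2]          ~", by_score)
    print("-E[d^2 log f/dlam^2] ~", by_curvature)
    print("closed form 1/lam   =", 1.0 / lam)
```

Both numbers should agree with 1/λ up to finite-difference and truncation error, which is small for moderate λ.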
Thus, the Fisher information may be seen as the curvature of the support curve (the graph of the log-likelihood). Near the maximum likelihood estimate, low Fisher information therefore indicates that the maximum appears "blunt", that is, the maximum is shallow and there are many nearby values with a similar log-likelihood. Conversely, high Fisher information indicates that the maximum is sharp.
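To make the "sharp" versus "blunt" picture concrete, here is a minimal sketch assuming i.i.d. normal data with unknown mean θ and a known standard deviation σ. In that model the log-likelihood is exactly quadratic in θ, the information in n observations is n/σ², and moving a distance δ away from the maximum lowers the log-likelihood by ½ · (n/σ²) · δ². The data values and function names are made up for illustration.

```python
import math


def normal_log_likelihood(theta, data, sigma):
    """Log-likelihood of mean theta for i.i.d. N(theta, sigma^2) data, sigma known."""
    n = len(data)
    squared_error = sum((x - theta) ** 2 for x in data)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - squared_error / (2.0 * sigma ** 2)


def drop_from_peak(data, sigma, delta):
    """How much the log-likelihood falls when theta moves delta away from the MLE."""
    theta_hat = sum(data) / len(data)  # the sample mean is the MLE of theta
    peak = normal_log_likelihood(theta_hat, data, sigma)
    shifted = normal_log_likelihood(theta_hat + delta, data, sigma)
    return peak - shifted


if __name__ == "__main__":
    data = [1.8, 2.1, 2.4, 1.9, 2.3]  # illustrative values only
    delta = 0.5
    for sigma in (0.5, 5.0):
        information = len(data) / sigma ** 2  # Fisher information of the whole sample
        print("sigma =", sigma,
              "| information =", information,
              "| drop =", drop_from_peak(data, sigma, delta),
              "| 0.5 * information * delta^2 =", 0.5 * information * delta ** 2)
```

With the small σ the information is high and the same shift in θ costs a large amount of log-likelihood (a sharp maximum); with the large σ the drop is tiny (a blunt maximum).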
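Information is additive: for independent random variables X and Y, the Fisher information of the pair is the sum of the individual Fisher informations:
$$I_{X,Y}(\theta) = I_X(\theta) + I_Y(\theta)$$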
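Additivity can be verified exactly for a small case. The sketch below assumes n independent Bernoulli(p) observations: it enumerates all 2^n possible samples, uses the fact that the joint score is the sum of the individual scores, and computes the second moment of that joint score. The result should be n times the single-observation information 1 / (p(1 - p)). The model and names are illustrative assumptions.

```python
import itertools


def pmf(x, p):
    """Bernoulli(p) probability mass for x in {0, 1}."""
    return p if x == 1 else 1.0 - p


def score(x, p):
    """Score of one Bernoulli observation: d/dp log f(x; p)."""
    return x / p - (1 - x) / (1.0 - p)


def joint_information(p, n):
    """Exact Fisher information of n independent Bernoulli(p) observations."""
    total = 0.0
    for sample in itertools.product((0, 1), repeat=n):
        probability = 1.0
        joint_score = 0.0
        for x in sample:
            probability *= pmf(x, p)   # independence: joint pmf is a product
            joint_score += score(x, p)  # so the joint log-likelihood is a sum
        total += probability * joint_score ** 2
    return total


if __name__ == "__main__":
    p, n = 0.3, 4
    single = joint_information(p, 1)
    print("information of one observation :", single)
    print("information of", n, "observations:", joint_information(p, n))
    print("n * single                     :", n * single)
```

The cross terms between different observations vanish in expectation because each score has mean 0, which is why the second moments simply add.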