Sufficient Statistic


References:

  1. Sufficient statistic - Wikipedia
  2. Sufficient statistic - arizona

Definition

A statistic is a function of the random sample \(X_1, X_2, \cdots, X_n\):

\[T = r(X_1, X_2, \cdots, X_n). \]

The distribution \(f_{\theta}(X)=f(X;\theta)\) of a sample \(X\) is determined by an unknown parameter \(\theta\), which we usually estimate by maximum likelihood:

\[\max_{\theta} \quad P(X_1,X_2,\cdots, X_n ;\theta) = \prod_{i=1}^n P(X_i;\theta) = \prod_{i=1}^n f_{\theta}(X_i). \]

A sufficient statistic is a statistic satisfying

\[P(\{X_i\}|T=t;\theta) = P(\{X_i\}|T=t), \]

that is, given \(T(X)=t\), the conditional joint distribution of \(\{X_i\}\) does not depend on the unknown parameter \(\theta\).

Example: consider the Bernoulli distribution with success probability \(p\) and failure probability \(1-p\). For \(n\) i.i.d. samples \(X_1, X_2,\cdots, X_n\),

\[P(\{X_i\};p) = p^{\sum_i X_i}(1-p)^{n-\sum_i X_i}, \]

As we will see below, \(T=\sum_{i=1}^n X_i\) is a sufficient statistic. Indeed,

\[P(\{X_i\}|T=t;p) = \frac{P(\{X_i\}, T=t; p)}{P(T=t;p)} = \frac{\mathbb{I}[\sum_{i=1}^n X_i=t]\cdot p^t (1-p)^{n-t}}{C_n^t\, p^t (1-p)^{n-t}}=\frac{\mathbb{I}[\sum_{i=1}^n X_i = t]}{C_n^t}, \]

which clearly does not depend on the parameter \(p\).
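The conditional-distribution calculation above can be verified by brute-force enumeration. Below is a minimal sketch (the sample size \(n=5\) and \(t=2\) are chosen arbitrarily): for every \(p\), the conditional distribution of the sequence given \(T=t\) is uniform over the \(C_n^t\) admissible sequences.

```python
from itertools import product
from math import comb

def conditional_given_T(n, t, p):
    """Exact P({X_i} = x | T = t; p) over binary sequences, by enumeration."""
    restricted = {x: p ** t * (1 - p) ** (n - t)
                  for x in product([0, 1], repeat=n) if sum(x) == t}
    total = sum(restricted.values())
    return {x: v / total for x, v in restricted.items()}

n, t = 5, 2
# every admissible sequence gets probability 1 / C(n, t), whatever p is
uniform_for_all_p = all(
    abs(v - 1 / comb(n, t)) < 1e-12
    for p in (0.1, 0.5, 0.9)
    for v in conditional_given_T(n, t, p).values()
)
```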

Sufficient statistics are useful in practice, for instance in the maximum likelihood estimation above: since

\[P(\{X_i\};\theta) = P(\{X_i\}, T;\theta) = P(\{X_i\}|T;\theta) \:P(T;\theta) = P(\{X_i\}|T) \:P(T;\theta), \]

and since \(P(\{X_i\}|T)\) does not depend on \(\theta\), maximizing the expression above is equivalent to

\[\max_{\theta} \quad P(T;\theta) = P(r(X_1, X_2,\cdots, X_n); \theta). \]
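A quick numeric illustration of this equivalence, using a hypothetical Bernoulli sample and a simple grid search: the likelihood of the full data and the likelihood \(P(T=t;p)\) of the statistic alone peak at the same \(\hat{p} = t/n\).

```python
from math import comb

x = [1, 0, 1, 1, 0, 1, 0, 1]      # a hypothetical Bernoulli sample
t, n = sum(x), len(x)

grid = [i / 1000 for i in range(1, 1000)]
lik_full = [p ** t * (1 - p) ** (n - t) for p in grid]               # prod_i f_p(X_i)
lik_stat = [comb(n, t) * p ** t * (1 - p) ** (n - t) for p in grid]  # P(T = t; p)

p_hat_full = grid[lik_full.index(max(lik_full))]
p_hat_stat = grid[lik_stat.index(max(lik_stat))]
```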

In particular, a scalar \(T\) is sometimes not sufficient, and a vector \(T=(T_1, T_2,\cdots, T_k)\) must serve as the sufficient statistic as a whole; for instance, when both \(\mu\) and \(\sigma\) of a normal distribution are unknown, \(T=(\frac{1}{n}\sum_i X_i, \frac{1}{n-1}\sum_i (X_i - \bar{X})^2)\). The properties are the same as above, so we will not treat this case separately below.

In a Bayesian framework, one finds that

\[P(\theta|\{X_i\}) = \frac{P(\{X_i\}, \theta)}{P(\{X_i\})} = \frac{P(\{X_i\}, T, \theta)}{P(\{X_i\}, T)} = \frac{P(\{X_i\}| T, \theta)\, P(T|\theta)\, P(\theta)}{P(\{X_i\}| T)\, P(T)} = \frac{P(T|\theta)\, P(\theta)}{P(T)} = P(\theta|T), \]

that is, the conditional (posterior) distribution of \(\theta\) is the same whether we condition on \(\{X_i\}\) or on \(T\).

Sufficiency can also be characterized via mutual information: \(T\) is a sufficient statistic if and only if

\[I(\theta;X) = I(\theta;T(X)). \]

Note: in general \(I(\theta;X) \ge I(\theta;T(X))\), by the data-processing inequality.
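For a small discrete model this equality can be checked exactly. The sketch below assumes a toy prior (\(\theta\) uniform on \(\{0.2, 0.8\}\), an arbitrary choice) and \(n=2\) Bernoulli draws, and computes both mutual informations by enumeration.

```python
from itertools import product
from math import log2, comb

thetas = (0.2, 0.8)   # toy prior: theta uniform on two values (an assumption)
prior = 0.5
n = 2

def mutual_info(outcomes, likelihood):
    """I(theta; Y) for a discrete Y with pmf likelihood(y, theta)."""
    joint = {(th, y): prior * likelihood(y, th) for th in thetas for y in outcomes}
    p_y = {y: sum(joint[(th, y)] for th in thetas) for y in outcomes}
    return sum(p * log2(p / (prior * p_y[y]))
               for (th, y), p in joint.items() if p > 0)

xs = list(product([0, 1], repeat=n))          # full sample X = (X1, X2)
I_X = mutual_info(xs, lambda x, th: th ** sum(x) * (1 - th) ** (n - sum(x)))

ts = list(range(n + 1))                       # T = X1 + X2 ~ Binomial(n, theta)
I_T = mutual_info(ts, lambda t, th: comb(n, t) * th ** t * (1 - th) ** (n - t))
```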

Determining Sufficiency

Checking sufficiency directly from the definition above is very difficult; fortunately, there is the Fisher–Neyman factorization theorem:

Factorization Theorem: let \(f_{\theta}(X)\) be the joint density of \(\{X_i\}\). Then \(T\) is a sufficient statistic for \(\theta\) if and only if there exist nonnegative functions \(g, h\) such that

\[f(X_1, X_2,\cdots, X_n; \theta) = h(X_1, X_2,\cdots, X_n) g(T; \theta). \]

Note: \(T\) may be a vector \(T=(T_1, T_2,\cdots, T_k)\).

Proof:

\(\Rightarrow\)

\[p(X_1,X_2,\cdots, X_n;\theta) = p(\{X_i\}, T;\theta) = p(\{X_i\}|T;\theta)\,p(T;\theta) = p(\{X_i\}|T)\,p(T;\theta), \]

so we may take

\[g(T;\theta) = p(T;\theta), \\ h(X_1, X_2,\cdots, X_n) = p(\{X_i\}|T). \]

\(\Leftarrow\)

For notational convenience, write \(X = (X_1, X_2,\cdots, X_n)\).

\[\begin{array}{ll} p(T=t;\theta) &= \int_{T(X)=t} p(X,T=t;\theta) \mathrm{d}X \\ &= \int_{T(X)=t} f(X;\theta) \mathrm{d}X \\ &= \int_{T(X)=t} h(X) g(T=t;\theta) \mathrm{d}X \\ &= \int_{T(X)=t} h(X) \mathrm{d}X \cdot g(T=t;\theta) \\ \end{array}. \]

\[\begin{array}{ll} p(X | T=t;\theta) &= \frac{p(X,T=t;\theta)}{p(T=t;\theta)} \\ &= \frac{p(X;\theta)}{p(T=t;\theta)} \\ &= \frac{h(X)g(T=t;\theta)}{\int_{T(X)=t}h(X)\mathrm{d} X \cdot g(T=t;\theta)} \\ &= \frac{h(X)}{\int_{T(X)=t}h(X)\mathrm{d}X}, \\ \end{array} \]

which does not depend on \(\theta\).

Note: the proof above is only a sketch; in the continuous case the integrals over the level set \(\{T(X)=t\}\) require a more careful measure-theoretic treatment.

Minimal Sufficient Statistic

A minimal sufficient statistic \(S\) is one such that:

  1. \(S\) is a sufficient statistic;
  2. for every sufficient statistic \(T\), there exists a function \(f\) such that \(S=f(T)\).

Note: if \(T\) is a sufficient statistic, then \(f(T)\) is also sufficient for any invertible function \(f\).

Examples

\(U[0, \theta]\)

For the uniform distribution,

\[p(X_1, X_2,\cdots, X_n;\theta) = \frac{1}{\theta^n} \mathbb{I}[0\le \min \{X_i\}] \cdot \mathbb{I}[\max \{X_i\} \le \theta], \]

\[T = \max \{X_i\}, \: g(T;\theta) = \mathbb{I}[T \le \theta] \cdot \frac{1}{\theta^n}, \: h(X) = \mathbb{I}[0\le \min \{X_i\}]. \]
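A consequence of the factorization that is easy to check numerically: two samples with the same value of \(T=\max\{X_i\}\) have a likelihood ratio free of \(\theta\). The samples below are hypothetical.

```python
def uniform_joint(xs, theta):
    """Joint density of an i.i.d. U[0, theta] sample."""
    if min(xs) < 0 or max(xs) > theta:
        return 0.0
    return theta ** (-len(xs))

# two different samples sharing the same maximum T = 0.9
a = [0.2, 0.9, 0.5]
b = [0.7, 0.1, 0.9]

# the likelihood ratio is 1 for every feasible theta
ratios = {uniform_joint(a, th) / uniform_joint(b, th) for th in (1.0, 1.5, 3.0)}
```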

\(U[\alpha, \beta]\)

\[p(X_1, X_2,\cdots, X_n;\alpha,\beta) = \frac{1}{(\beta - \alpha)^n} \mathbb{I}[\alpha\le \min \{X_i\}] \cdot \mathbb{I}[\max \{X_i\} \le \beta], \]

\[T = (\min \{X_i\}, \max \{X_i\}), \\ g(T;\alpha, \beta) = \frac{1}{(\beta - \alpha)^n} \mathbb{I}[\alpha\le \min \{X_i\}] \cdot \mathbb{I}[\max \{X_i\} \le \beta], \\ h(X) = 1. \]

Poisson

\[P(X;\lambda) = \frac{\lambda^X e^{-\lambda}}{X!}. \]

\[p(X_1, X_2,\cdots, X_n;\lambda) = e^{-n\lambda} \lambda^{\sum_{i}X_i} \cdot \frac{1}{\prod_i X_i!}. \]

\[T = \sum_iX_i, \\ g(T;\lambda) = e^{-n\lambda} \cdot \lambda^T, \\ h(X) = \frac{1}{\prod_{i} X_i!}. \]
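A classical way to see the sufficiency of \(T=\sum_i X_i\): conditionally on \(T=t\), the sample is multinomial with equal cell probabilities \(1/n\), independent of \(\lambda\). The sketch below verifies this by exact enumeration for small \(n\) and \(t\) (values chosen arbitrarily).

```python
from itertools import product
from math import exp, factorial

def poisson_pmf(k, lam):
    return lam ** k * exp(-lam) / factorial(k)

def conditional_given_sum(lam, n=3, t=4):
    """P({X_i} = x | sum_i X_i = t; lam), computed by enumeration."""
    p_t = poisson_pmf(t, n * lam)        # T = sum_i X_i ~ Poisson(n * lam)
    cond = {}
    for x in product(range(t + 1), repeat=n):
        if sum(x) == t:
            p_joint = 1.0
            for xi in x:
                p_joint *= poisson_pmf(xi, lam)
            cond[x] = p_joint / p_t
    return cond

# the conditional distribution is the same for very different lambdas
c_small, c_large = conditional_given_sum(0.5), conditional_given_sum(4.0)
```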

Normal

\[P(X;\mu,\sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp(-\frac{(X-\mu)^2}{2\sigma^2}). \]

\[p(X_1, X_2,\cdots, X_n;\mu, \sigma) = (2\pi\sigma^2)^{-\frac{n}{2}} \exp\Big(-\frac{1}{2\sigma^2}\sum_{i=1}^n (X_i - \bar{X})^2\Big) \exp\Big(-\frac{n}{2\sigma^2}(\mu-\bar{X})^2\Big). \]

If \(\sigma\) is known:

\[T=\frac{1}{n}\sum X_i = \bar{X}, \\ g(T;\mu) = (2\pi\sigma^2)^{-\frac{n}{2}} \exp\Big(-\frac{n}{2\sigma^2}(\mu-T)^2\Big), \\ h(X) = \exp \Big(-\frac{1}{2\sigma^2}\sum_{i=1}^n (X_i - \bar{X})^2\Big). \]

If \(\sigma\) is also unknown:

\[T = (\bar{X}, s^2), \quad s^2 = \frac{\sum_{i=1}^n(X_i-\bar{X})^2}{n-1}, \\ g(T;\mu,\sigma) = (2\pi\sigma^2)^{-\frac{n}{2}}\exp\Big(-\frac{n-1}{2\sigma^2}s^2\Big) \exp\Big(-\frac{n}{2\sigma^2}(\mu-\bar{X})^2\Big), \\ h(X) = 1. \]
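As a check that the likelihood depends on the data only through \((\bar{X}, s^2)\): two different samples matched on mean and sum of squares have identical log-likelihoods for every \((\mu, \sigma)\). The samples below are constructed by hand for illustration.

```python
from math import log, pi, sqrt

def normal_loglik(xs, mu, sigma):
    """Log-likelihood of an i.i.d. N(mu, sigma^2) sample."""
    return sum(-0.5 * log(2 * pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
               for x in xs)

a = [-1.0, 0.0, 1.0]                       # mean 0, sum of squares 2
r = sqrt(3.25)
b = [0.5, (-0.5 + r) / 2, (-0.5 - r) / 2]  # different sample, same mean and s^2

# identical log-likelihoods at several arbitrary (mu, sigma) pairs
checks = [abs(normal_loglik(a, mu, s) - normal_loglik(b, mu, s)) < 1e-9
          for mu, s in [(0.0, 1.0), (1.5, 0.7), (-2.0, 3.0)]]
```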

Exponential

\[p(X;\lambda) = \frac{1}{\lambda} e^{-\frac{X}{\lambda}}, \quad X \ge 0. \]

\[p(X_1, X_2,\cdots, X_n;\lambda) = \frac{1}{\lambda^n} e^{-\frac{\sum_{i=1}^n X_i}{\lambda}}. \]

\[T = \sum_{i=1}^n X_i, \\ g(T;\lambda) = \frac{1}{\lambda^n} e^{-\frac{T}{\lambda}}, \\ h(X) = 1. \]
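Following the earlier observation that the MLE can be computed from \(g(T;\theta)\) alone: maximizing \(g(T;\lambda) = \lambda^{-n} e^{-T/\lambda}\) over a grid recovers \(\hat{\lambda} = T/n\), the sample mean. The sample below is hypothetical.

```python
from math import exp

x = [0.3, 1.7, 0.9, 2.1]   # a hypothetical exponential sample
T, n = sum(x), len(x)

def g(lam):
    """g(T; lambda) = lambda^{-n} exp(-T / lambda) from the factorization."""
    return lam ** (-n) * exp(-T / lam)

# maximizing g alone (no access to the individual X_i) gives lambda_hat = T / n
grid = [i / 1000 for i in range(1, 5001)]
lam_hat = max(grid, key=g)
```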

Gamma

\[p(X;\alpha, \beta) = \frac{1}{\Gamma(\alpha) \beta^{\alpha}}X^{\alpha-1} e^{-\frac{X}{\beta}}. \]

\[p(X_1, X_2,\cdots, X_n;\alpha, \beta) = \frac{1}{(\Gamma(\alpha) \beta^{\alpha})^n}(\prod_{i} X_i)^{\alpha-1} e^{-\frac{\sum_iX_i}{\beta}}. \]

\[T = (\prod_i X_i, \sum_i X_i), \\ g(T;\alpha,\beta) = \frac{1}{(\Gamma(\alpha) \beta^{\alpha})^n}(\prod_{i} X_i)^{\alpha-1} e^{-\frac{\sum_iX_i}{\beta}}, \\ h(X) = 1. \]
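A direct numeric check of this factorization (with hypothetical sample and parameter values): the joint density computed as a product of Gamma densities agrees with \(g(T;\alpha,\beta)\) evaluated from \(T=(\prod_i X_i, \sum_i X_i)\) alone, since \(h(X)=1\).

```python
from math import gamma, exp

def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta."""
    return x ** (alpha - 1) * exp(-x / beta) / (gamma(alpha) * beta ** alpha)

def g_of_T(prod_x, sum_x, alpha, beta, n):
    """g(T; alpha, beta) evaluated from T = (prod x_i, sum x_i) alone."""
    return (prod_x ** (alpha - 1) * exp(-sum_x / beta)
            / (gamma(alpha) * beta ** alpha) ** n)

xs = [0.4, 1.3, 2.2]          # a hypothetical sample
prod_x = 1.0
for v in xs:
    prod_x *= v
sum_x = sum(xs)

alpha, beta = 2.5, 1.4        # hypothetical parameter values
direct = 1.0
for v in xs:
    direct *= gamma_pdf(v, alpha, beta)
via_T = g_of_T(prod_x, sum_x, alpha, beta, len(xs))
```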

