Hierarchical Bayesian Models: Structure


Hierarchical Bayesian models


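A sequence of random variables $Y_{1},...,Y_{n} $ is said to be exchangeable if its joint density satisfies $p(y_{1},...,y_{n})=p(y_{\pi_{1}},...,y_{\pi_{n}}) $ for every permutation $\pi $ of the indices. Exchangeability is a reasonable property of $p(y_{1},...,y_{n}) $ when we have no information that distinguishes the random variables from one another. In that case the variables can be regarded as conditionally independent draws from a population whose characteristics are described by a fixed but unknown parameter $\phi $, that is: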

$$
\phi\sim p(\phi)
$$

$$
\{Y_{1},...,Y_{n}|\phi\}\sim^{i.i.d.}p(y|\phi)
$$
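
One way to see why this two-stage description yields an exchangeable sequence is to integrate out $\phi $. The marginal joint density is

$$
p(y_{1},...,y_{n})=\int \Big\{\prod_{i=1}^{n}p(y_{i}|\phi)\Big\}p(\phi)\,d\phi
$$

and because the product inside the integral does not depend on the order of its factors, relabeling the $y_{i} $ by any permutation $\pi $ leaves the value unchanged.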

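Now consider hierarchical data $\{Y_{1},...,Y_{m}\} $ made up of $m $ groups, where $Y_{j}=\{Y_{1,j},...,Y_{n_{j},j}\} $ holds the $n_{j} $ observations from group $j $. Within each group we then have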

$$
\{Y_{1,j},...,Y_{n_{j},j}|\phi_{j}\}\sim^{i.i.d.}p(y|\phi_{j})
$$

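But how should we represent the group parameters $\phi_{1},...,\phi_{m} $? If the groups are themselves a sample from a larger population of groups, the group-specific parameters are exchangeable as well, so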

$$
\{\phi_{1},...,\phi_{m}|\psi\}\sim^{i.i.d.}p(\phi|\psi)
$$

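Putting this together, we obtain three probability distributions:

Within-group sampling: $\{y_{1,j},...,y_{n_{j},j}|\phi_{j}\}\sim^{i.i.d.}p(y|\phi_{j}) $

Between-group sampling: $\{\phi_{1},...,\phi_{m}|\psi\}\sim^{i.i.d.}p(\phi|\psi) $

Prior distribution: $\psi \sim p(\psi) $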

 

Hierarchical normal model


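We now use the hierarchical normal model to describe heterogeneity of the means across several populations. Both the within-group and the between-group sampling models are taken to be normal:

Within-group model: $\phi_{j}=\{\theta_{j},\sigma^2\},\; p(y|\phi_{j})=normal(\theta_{j},\sigma^2) $

Between-group model: $\psi=\{\mu,\tau^2\},\; p(\theta_{j}|\psi)=normal(\mu,\tau^2) $

The fixed but unknown parameters in this model are $\mu $, $\tau^2 $ and $\sigma^2 $. For convenience, we give them the standard semiconjugate priors, a normal distribution for the mean and inverse-gamma distributions for the variances: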

$1/\sigma^2 \sim gamma(\nu_{0}/2,\nu_{0}\sigma_{0}^2/2) $

$1/\tau^2 \sim gamma(\eta_{0}/2,\eta_{0} \tau_{0}^2/2) $

$\mu \sim normal(\mu_{0}, \gamma_{0}^2) $
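
To make the two sampling levels concrete, the following is a minimal sketch that simulates data from the model for given values of $\mu $, $\tau^2 $ and $\sigma^2 $. The function name, the group sizes and the parameter values are illustrative assumptions, not part of the original text; only the Python standard library is used.

```python
import random

def simulate_hierarchical_normal(group_sizes, mu, tau2, sigma2, seed=0):
    """Draw theta_j ~ normal(mu, tau2), then y_ij ~ normal(theta_j, sigma2)."""
    rng = random.Random(seed)
    thetas, groups = [], []
    for n_j in group_sizes:
        # Between-group level: one mean per group.
        theta_j = rng.gauss(mu, tau2 ** 0.5)
        # Within-group level: n_j observations around that mean.
        y_j = [rng.gauss(theta_j, sigma2 ** 0.5) for _ in range(n_j)]
        thetas.append(theta_j)
        groups.append(y_j)
    return thetas, groups

# Hypothetical values chosen only for illustration.
thetas, groups = simulate_hierarchical_normal([5, 12, 30], mu=50.0, tau2=25.0, sigma2=100.0)
for theta_j, y_j in zip(thetas, groups):
    print(f"theta={theta_j:.2f}  n={len(y_j)}  ybar={sum(y_j) / len(y_j):.2f}")
```

Small groups tend to have sample means further from their $\theta_{j} $, which is the situation in which sharing information across groups helps most.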

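The model structure is as follows: the hyperparameters $(\mu,\tau^2) $ generate the group means $\theta_{1},...,\theta_{m} $, and each $\theta_{j} $ together with the common variance $\sigma^2 $ generates the observations $Y_{j} $ of group $j $.

Posterior inference:

Key results for the one-sample normal model: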

Result 1. Suppose the sampling model is $\{Y_{1},...,Y_{n}|\theta,\sigma^2\}\sim^{i.i.d.}normal(\theta,\sigma^2) $. If $\theta \sim normal(\mu_{0},\tau_{0}^2) $ and $1/\sigma^2 \sim gamma(\nu_{0}/2,\nu_{0}\sigma_{0}^2/2)$, then $\{\theta|\sigma^2,y_{1},...,y_{n}\}\sim normal(\mu_{n},\tau_{n}^2) $, where $\mu_{n}=\frac{\mu_{0}/\tau_{0}^2+n\bar{y}/\sigma^2}{1/\tau_{0}^2+n/\sigma^2} $ and $\tau_{n}^2=\Big (\frac{1}{\tau_{0}^2}+\frac{n}{\sigma^2}\Big)^{-1} $.

Result 2. Suppose the sampling model is $\{Y_{1},...,Y_{n}|\theta,\sigma^2\}\sim^{i.i.d.}normal(\theta,\sigma^2) $. If $\theta \sim normal(\mu_{0},\tau_{0}^2) $ and $1/\sigma^2 \sim gamma(\nu_{0}/2,\nu_{0}\sigma_{0}^2/2)$, then $\{\sigma^2|\theta,y_{1},...,y_{n}\}\sim inverse\text{-}gamma(\nu_{n}/2,\nu_{n}\sigma_{n}^2(\theta)/2) $, where $\nu_{n}=\nu_{0}+n $, $\sigma_{n}^2(\theta)=\frac{1}{\nu_{n}}[\nu_{0}\sigma_{0}^2+ns_{n}^2(\theta)] $ and $s_{n}^2(\theta)=\sum (y_{i}-\theta)^2/n $.
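
The two results translate directly into small helper functions. The sketch below computes the parameters of each conditional distribution; the function names and the toy data are assumptions made for illustration.

```python
def theta_conditional(y, sigma2, mu0, tau02):
    """Result 1: return (mu_n, tau_n2) of the normal conditional of theta."""
    n = len(y)
    ybar = sum(y) / n
    precision = 1.0 / tau02 + n / sigma2        # 1 / tau_n^2
    mu_n = (mu0 / tau02 + n * ybar / sigma2) / precision
    return mu_n, 1.0 / precision

def sigma2_conditional(y, theta, nu0, sigma02):
    """Result 2: return (nu_n, sigma_n2) of inverse-gamma(nu_n/2, nu_n*sigma_n2/2)."""
    n = len(y)
    ss = sum((yi - theta) ** 2 for yi in y)     # equals n * s_n^2(theta)
    nu_n = nu0 + n
    sigma_n2 = (nu0 * sigma02 + ss) / nu_n
    return nu_n, sigma_n2

# Toy data, made up for this example.
y = [1.0, 2.0, 3.0]
mu_n, tau_n2 = theta_conditional(y, sigma2=1.0, mu0=0.0, tau02=1.0)
print(mu_n, tau_n2)  # 1.5 0.25
```

With a prior precision of 1 and a data precision of 3, the conditional mean is the precision-weighted average of the prior mean 0 and the sample mean 2.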

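The unknown quantities in the model are the group means $(\theta_{1},...,\theta_{m}) $, the within-group variance $\sigma^2 $, and the mean $\mu $ and variance $\tau^2 $ of the population of group means. Joint posterior inference for these parameters can be carried out with a Gibbs sampler, which approximates $p(\theta_{1},...,\theta_{m},\mu,\tau^2,\sigma^2|y_{1},...,y_{m}) $ by iteratively sampling each parameter from its full conditional distribution. The joint posterior factorizes as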

$$
\begin{aligned}
&p(\theta_{1},...,\theta_{m},\mu,\tau^2,\sigma^2|y_{1},...,y_{m})\\
&\propto p(\mu,\tau^2,\sigma^2)\times p(\theta_{1},...,\theta_{m}|\mu,\tau^2,\sigma^2)\times p(y_{1},...,y_{m}|\theta_{1},...,\theta_{m},\mu,\tau^2,\sigma^2)\\
&=p(\mu)p(\tau^2)p(\sigma^2)\Big \{\prod_{j=1}^m p(\theta_j|\mu,\tau^2)\Big\} \Big\{\prod_{j=1}^m \prod_{i=1}^{n_{j}} p(y_{i,j}|\theta_j,\sigma^2) \Big\}
\end{aligned}
$$

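From the dependence structure among the random variables, the full conditional distribution of each parameter is proportional to the following: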

$$
p(\mu|\theta_{1},...,\theta_{m},\tau^2,\sigma^2,y_{1},...,y_{m})\propto p(\mu)\prod p(\theta_{j}|\mu,\tau^2)
$$

$$
p(\tau^2|\theta_{1},...,\theta_{m},\mu,\sigma^2,y_{1},...,y_{m})\propto p(\tau^2)\prod p(\theta_{j}|\mu,\tau^2)
$$

$$
p(\theta_{j}|\mu,\tau^2,\sigma^2,y_{1},...,y_{m})\propto p(\theta_{j}|\mu,\tau^2)\prod_{i=1}^{n_{j}} p(y_{i,j}|\theta_{j},\sigma^2)
$$

$$
\begin{aligned}
p(\sigma^2|\theta_{1},...,\theta_{m},y_{1},...,y_{m})&\propto p(\sigma^2)\prod_{j=1}^m \prod_{i=1}^{n_{j}}p(y_{i,j}|\theta_{j},\sigma^2)\\
&\propto (\sigma^2)^{-(\nu_{0}/2+1)}e^{-\frac{\nu_{0}\sigma_{0}^2}{2\sigma^2}}(\sigma^2)^{-\sum n_{j}/2}e^{-\frac{\sum \sum (y_{i,j}-\theta_{j})^2}{2\sigma^2}}
\end{aligned}
$$

Combining these with the two results above, we obtain:

$$
\{\mu|\theta_{1},...,\theta_{m},\tau^2\}\sim normal \Big(\frac{m\bar{\theta}/\tau^2+\mu_{0}/\gamma_{0}^2}{m/\tau^2+1/\gamma_{0}^2},[m/\tau^2+1/\gamma_{0}^2]^{-1} \Big)
$$

$$
\{1/\tau^2|\theta_{1},...,\theta_{m},\mu\}\sim gamma \Big(\frac{\eta_{0}+m}{2},\frac{\eta_{0}\tau_{0}^2+\sum(\theta_{j}-\mu)^2}{2}\Big)
$$

$$
\{\theta_{j}|y_{1,j},...,y_{n_{j},j},\mu,\tau^2,\sigma^2\}\sim normal\Big(\frac{n_{j}\bar{y}_{j}/\sigma^2+\mu/\tau^2}{n_{j}/\sigma^2+1/\tau^2},[n_{j}/\sigma^2+1/\tau^2]^{-1}\Big)
$$

$$
\{1/\sigma^2|\theta_{1},...,\theta_{m},y_{1},...,y_{m}\}\sim gamma \Big(\frac{1}{2}[\nu_{0}+\sum_{j=1}^m n_{j}],\frac{1}{2}[\nu_{0}\sigma_{0}^2+\sum_{j=1}^m \sum_{i=1}^{n_{j}}(y_{i,j}-\theta_{j})^2]\Big)
$$
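
As an illustration of how these conditionals follow from Result 1, here is a sketch of the derivation for $\theta_{j} $, with $\mu $ playing the role of the prior mean and $\tau^2 $ that of the prior variance:

$$
\begin{aligned}
p(\theta_{j}|\mu,\tau^2,\sigma^2,y_{j})&\propto \exp\Big\{-\frac{(\theta_{j}-\mu)^2}{2\tau^2}\Big\}\exp\Big\{-\frac{\sum_{i=1}^{n_{j}}(y_{i,j}-\theta_{j})^2}{2\sigma^2}\Big\}\\
&\propto \exp\Big\{-\frac{1}{2}\Big[\theta_{j}^2\Big(\frac{1}{\tau^2}+\frac{n_{j}}{\sigma^2}\Big)-2\theta_{j}\Big(\frac{\mu}{\tau^2}+\frac{n_{j}\bar{y}_{j}}{\sigma^2}\Big)\Big]\Big\}
\end{aligned}
$$

This is the kernel of a normal density with precision $1/\tau^2+n_{j}/\sigma^2 $ and mean $(\mu/\tau^2+n_{j}\bar{y}_{j}/\sigma^2)/(1/\tau^2+n_{j}/\sigma^2) $, so the conditional mean is a precision-weighted average of the population mean $\mu $ and the group sample mean $\bar{y}_{j} $.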

Computation procedure:

  1. Set the hyperparameters of the prior distributions:

    $(\nu_{0},\sigma_{0}^2)\rightarrow p(\sigma^2) $

    $(\eta_{0},\tau_{0}^2) \rightarrow p(\tau^2) $

    $(\mu_{0},\gamma_{0}^2)\rightarrow p(\mu) $

  2. Approximate the posterior by iteratively sampling each unknown parameter from its full conditional distribution. Given the current state $\{\theta_{1}^{(s)},...,\theta_{m}^{(s)},\mu^{(s)},\tau^{2(s)},\sigma^{2(s)}\} $, the new state is generated as follows:

    $sample:\;\mu^{(s+1)}\sim p(\mu|\theta_{1}^{(s)},...,\theta_{m}^{(s)},\tau^{2(s)}) $

    $sample:\;\tau^{2(s+1)}\sim p(\tau^2|\theta_{1}^{(s)},...,\theta_{m}^{(s)},\mu^{(s+1)})  $

    $sample:\;\sigma^{2(s+1)}\sim p(\sigma^2|\theta_{1}^{(s)},...,\theta_{m}^{(s)},y_{1},...,y_{m})  $

    $for\;each\;j\in\{1,...,m\},\;sample\;\theta_{j}^{(s+1)}\sim p(\theta_{j}|\mu^{(s+1)},\tau^{2(s+1)},\sigma^{2(s+1)},y_{j}) $

  Repeat until the chain appears to have converged. The sampled states then approximate draws from the joint posterior and can be used to estimate the model parameters.
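
The procedure above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions rather than a reference implementation: the function name, the starting values, the toy data and the hyperparameter values are all made up for the example, and only the standard library is used. Note that `random.gammavariate` takes a shape and a scale, so the rate parameters of the gamma full conditionals are inverted.

```python
import random

def gibbs_hierarchical_normal(groups, mu0, g02, eta0, t02, nu0, s02,
                              n_iter=2000, seed=0):
    """Gibbs sampler for the hierarchical normal model with a common sigma^2.

    groups[j] is the list of observations in group j.
    Returns one (thetas, mu, tau2, sigma2) tuple per iteration.
    """
    rng = random.Random(seed)
    m = len(groups)
    n = [len(y) for y in groups]
    ybar = [sum(y) / len(y) for y in groups]

    # Starting values (an arbitrary but common choice): sample means and prior guesses.
    theta = ybar[:]
    tau2, sigma2 = t02, s02
    draws = []
    for _ in range(n_iter):
        # mu | theta, tau2  ~  normal
        prec = m / tau2 + 1.0 / g02
        mean = (sum(theta) / tau2 + mu0 / g02) / prec
        mu = rng.gauss(mean, prec ** -0.5)

        # 1/tau2 | theta, mu  ~  gamma(shape, rate)
        shape = (eta0 + m) / 2.0
        rate = (eta0 * t02 + sum((t - mu) ** 2 for t in theta)) / 2.0
        tau2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)

        # 1/sigma2 | theta, y  ~  gamma(shape, rate)
        shape = (nu0 + sum(n)) / 2.0
        ss = sum((yi - t) ** 2 for y, t in zip(groups, theta) for yi in y)
        rate = (nu0 * s02 + ss) / 2.0
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)

        # theta_j | mu, tau2, sigma2, y_j  ~  normal, one group at a time
        for j in range(m):
            prec = n[j] / sigma2 + 1.0 / tau2
            mean = (n[j] * ybar[j] / sigma2 + mu / tau2) / prec
            theta[j] = rng.gauss(mean, prec ** -0.5)

        draws.append((theta[:], mu, tau2, sigma2))
    return draws

# Toy data and hyperparameters, invented for this example.
groups = [[52.1, 48.3, 55.0, 50.7], [61.2, 58.9, 63.4], [45.5, 47.1, 44.0, 46.8, 43.9]]
draws = gibbs_hierarchical_normal(groups, mu0=50.0, g02=25.0, eta0=1.0, t02=100.0,
                                  nu0=1.0, s02=100.0)
kept = draws[500:]  # discard an initial burn-in period
post_mean_mu = sum(d[1] for d in kept) / len(kept)
print(f"posterior mean of mu: {post_mean_mu:.2f}")
```

In practice one would also inspect trace plots or effective sample sizes before trusting the draws; the burn-in length of 500 here is an arbitrary choice.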

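As a further generalization, if the means differ across groups, the variances may well differ too. Let $\sigma_{j}^2 $ be the variance of group $j $. The sampling model becomes $\{Y_{1,j},...,Y_{n_{j},j}|\theta_{j},\sigma_{j}^2\}\sim^{i.i.d.} normal(\theta_{j},\sigma_{j}^2) $, and the full conditional distribution of $\theta_{j} $ is $\{\theta_{j}|y_{1,j},...,y_{n_{j},j},\mu,\tau^2,\sigma_{j}^2\}\sim normal \Big(\frac{n_{j}\bar{y}_{j}/\sigma_{j}^2+\mu/\tau^2}{n_{j}/\sigma_{j}^2+1/\tau^2},[n_{j}/\sigma_{j}^2+1/\tau^2]^{-1}\Big) $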

How do we estimate $\sigma_{j}^2 $? We first assume:

$$
1/\sigma_{1}^2,...,1/\sigma_{m}^2\sim^{i.i.d.}gamma(\nu_{0}/2,\nu_{0}\sigma_{0}^2/2)
$$

The full conditional distribution is:

$$
\{1/\sigma_{j}^2|y_{1,j},...,y_{n_{j},j},\theta_{j}\}\sim gamma \Big([\nu_{0}+n_{j}]/2,[\nu_{0}\sigma_{0}^2+\sum(y_{i,j}-\theta_{j})^2]/2\Big )
$$

The values of $\sigma_{1}^2,...,\sigma_{m}^2 $ can likewise be obtained iteratively with $Gibbs $ sampling.
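
A minimal sketch of this extra Gibbs step, assuming the current group means are held in a list `theta` (the function name and the toy values are illustrative, not from the original text):

```python
import random

def sample_group_variances(groups, theta, nu0, s02, rng):
    """Draw each sigma_j^2 from its inverse-gamma full conditional."""
    sigma2 = []
    for y_j, theta_j in zip(groups, theta):
        shape = (nu0 + len(y_j)) / 2.0
        rate = (nu0 * s02 + sum((yi - theta_j) ** 2 for yi in y_j)) / 2.0
        # gammavariate uses a scale parameter, so pass 1/rate and invert the draw.
        sigma2.append(1.0 / rng.gammavariate(shape, 1.0 / rate))
    return sigma2

# Toy data and current state, invented for this example.
rng = random.Random(0)
groups = [[52.1, 48.3, 55.0, 50.7], [61.2, 58.9, 63.4]]
theta = [51.5, 61.0]
sigma2 = sample_group_variances(groups, theta, nu0=1.0, s02=100.0, rng=rng)
print(sigma2)
```

Inside a full sampler this step would replace the single $\sigma^2 $ update, and each $\theta_{j} $ update would use its own $\sigma_{j}^2 $.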

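If $\nu_{0} $ and $\sigma_{0}^2 $ are fixed, the $\sigma_{j}^2 $ are independent of one another a priori, which means that $\sigma_{1}^2,...,\sigma_{m-1}^2 $ carry no information about $\sigma_{m}^2 $. But if the group that $\sigma_{m}^2 $ belongs to has a small sample size, we would like to use the information in $\sigma_{1}^2,...,\sigma_{m-1}^2 $ to improve its estimate. How can this be done? The answer is to treat $\nu_{0} $ and $\sigma_{0}^2 $ as unknown parameters to be estimated, so that $(\nu_{0},\sigma_{0}^2) $ generate the group variances $\sigma_{j}^2 $ in the same way that $(\mu,\tau^2) $ generate the group means $\theta_{j} $.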

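The unknown parameters are therefore the within-group sampling parameters $\{(\theta_{1},\sigma_{1}^2),...,(\theta_{m},\sigma_{m}^2)\} $, the parameters $\{\mu,\tau^2\} $ describing heterogeneity across group means, and the parameters $\{\nu_{0},\sigma_{0}^2\} $ describing heterogeneity across group variances. The updates for $\{\mu,\tau^2\} $ and $\{(\theta_{1},\sigma_{1}^2),...,(\theta_{m},\sigma_{m}^2)\} $ have already been given, so it remains to discuss the estimation of $\{\nu_{0},\sigma_{0}^2\} $. Assume a conjugate prior for $\sigma_{0}^2 $, namely $\sigma_{0}^2\sim gamma(a,b) $. Then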

$$
p(\sigma_{0}^2|\sigma_{1}^2,...,\sigma_{m}^2,\nu_{0})=dgamma(a+\frac{1}{2}m\nu_{0},b+\frac{1}{2}\nu_{0}\sum_{j=1}^m (1/\sigma_{j}^2))
$$

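A simple conjugate prior for $\nu_{0} $ does not exist, but the problem becomes easy if we restrict $\nu_{0} $ to be an integer. Suppose the prior on $\nu_{0} $ is a geometric distribution on $\{1,2,...\} $, so that $p(\nu_{0})\propto e^{-\alpha \nu_{0}} $. Then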

$$
\begin{aligned}
& p(\nu_{0}|\sigma_{0}^2,\sigma_{1}^2,...,\sigma_{m}^2)\\
& \propto p(\nu_{0})\times p(\sigma_{1}^2,...,\sigma_{m}^2|\nu_{0},\sigma_{0}^2)\\
& \propto \Big(\frac{(\nu_{0}\sigma_{0}^2/2)^{\nu_{0}/2}}{\Gamma(\nu_{0}/2)}\Big)^m \Big(\prod_{j=1}^m \frac{1}{\sigma_{j}^2}\Big)^{\nu_{0}/2-1}\times exp\{-\nu_{0}(\alpha+\frac{1}{2}\sigma_{0}^2\sum (1/\sigma_{j}^2))\}
\end{aligned}
$$

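This distribution has no standard form, but because $\nu_{0} $ is integer-valued it can be evaluated over a large range of values, normalized, and sampled from directly, which completes the Gibbs sampler.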
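
The two updates for $\{\nu_{0},\sigma_{0}^2\} $ can be sketched as follows. The unnormalized full conditional of $\nu_{0} $ is evaluated on the log scale for numerical stability and sampled on the integer grid $\{1,...,\text{nu\_max}\} $. The truncation point `nu_max`, the function names and the toy values are assumptions made for this illustration.

```python
import math
import random

def sample_sigma02(sigma2, nu0, a, b, rng):
    """Draw sigma_0^2 from gamma(a + m*nu0/2, b + (nu0/2) * sum(1/sigma_j^2))."""
    m = len(sigma2)
    shape = a + 0.5 * m * nu0
    rate = b + 0.5 * nu0 * sum(1.0 / s for s in sigma2)
    return rng.gammavariate(shape, 1.0 / rate)  # gammavariate takes a scale

def sample_nu0(sigma2, s02, alpha, rng, nu_max=200):
    """Draw integer nu0 from its full conditional, truncated to 1..nu_max."""
    m = len(sigma2)
    sum_inv = sum(1.0 / s for s in sigma2)
    sum_log_inv = sum(math.log(1.0 / s) for s in sigma2)
    grid = list(range(1, nu_max + 1))
    logp = []
    for nu in grid:
        # log of [(nu*s02/2)^(nu/2) / Gamma(nu/2)]^m
        lp = m * (0.5 * nu * math.log(0.5 * nu * s02) - math.lgamma(0.5 * nu))
        # log of (prod 1/sigma_j^2)^(nu/2 - 1)
        lp += (0.5 * nu - 1.0) * sum_log_inv
        # log of exp{-nu * (alpha + s02 * sum(1/sigma_j^2) / 2)}
        lp -= nu * (alpha + 0.5 * s02 * sum_inv)
        logp.append(lp)
    top = max(logp)  # subtract the maximum before exponentiating
    weights = [math.exp(lp - top) for lp in logp]
    return rng.choices(grid, weights=weights, k=1)[0]

# Toy current values of the group variances, invented for this example.
rng = random.Random(1)
sigma2 = [80.0, 120.0, 95.0, 110.0]
nu0 = sample_nu0(sigma2, s02=100.0, alpha=1.0, rng=rng)
s02 = sample_sigma02(sigma2, nu0=nu0, a=1.0, b=0.01, rng=rng)
print(nu0, s02)
```

Truncating the grid is harmless as long as `nu_max` is large enough that the conditional probability beyond it is negligible; that is worth checking for the data at hand.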

 

Reference: Hoff, Peter D. A First Course in Bayesian Statistical Methods. Springer Science & Business Media, 2009.

