The Baum-Welch algorithm is an instance of the EM algorithm, so we start from the Q function of EM
\[Q(\theta,\theta') = \sum_ZP(Z|Y,\theta')\log P(Y,Z|\theta) \]
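Rewriting it in HMM notation, with the state sequence \(I\) as the hidden variable and the observation sequence \(O\) as the observed data, makes it easier to follow
\[Q(\lambda,\lambda') = \sum_IP(I|O,\lambda')\log P(O,I|\lambda) \]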
The joint distribution of the state sequence and the observation sequence is
\[\begin{align*} P(O,I|\lambda) &= P(O|I,\lambda)P(I|\lambda)\\ &= \pi_{i_1}b_{i_1}(o_1)a_{i_1i_2}b_{i_2}(o_2)\dots a_{i_{T-1}i_T}b_{i_T}(o_T) \end{align*}\]
Substituting this into the Q function gives
\[\begin{align*} Q(\lambda, \lambda') &= \sum_IP(I|O,\lambda')\log\pi_{i_1}\\ &+ \sum_IP(I|O,\lambda')\sum_{t=1}^T\log b_{i_t}(o_t) \\ &+ \sum_IP(I|O,\lambda')\sum_{t=2}^T\log a_{i_{t-1}i_t} \end{align*}\]
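The split into three terms comes from taking the logarithm of the joint distribution, which turns the product into a sum. Written out as a short sketch in the notation above:
\[\log P(O,I|\lambda) = \log\pi_{i_1} + \sum_{t=1}^T\log b_{i_t}(o_t) + \sum_{t=2}^T\log a_{i_{t-1}i_t} \]
Each term involves only one group of parameters (\(\pi\), \(b\), or \(a\)), so the three groups can be maximized separately.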
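This is the E-step. Now for the M-step.
Consider the first term of the Q function. It is subject to the constraint
\[\sum_{i=1}^N\pi_i = 1 \]
so we bring in a Lagrange multiplier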
\[\begin{align*} L &= \sum_IP(I|O,\lambda')\log\pi_{i_1} + \gamma\left(\sum_{i=1}^N\pi_i -1\right)\\ &= \sum_{i=1}^NP(i_1=i|O,\lambda')\log\pi_i + \gamma\left(\sum_{i=1}^N\pi_i -1\right) \end{align*}\]
Setting \(\dfrac{\partial L}{\partial\pi_i} = 0\) and multiplying through by \(\pi_i\) gives
\[\begin{align*} P(i_1 = i|O,\lambda') + \gamma \pi_i &= 0\\ P(i_1 = i|O,\lambda') &= -\gamma \pi_i\\ \sum_{i=1}^NP(i_1 = i|O,\lambda') &= -\gamma \sum_{i=1}^N\pi_i\\ \gamma &= -1 \end{align*}\]
Substituting back and using \(P(i_1=i|O,\lambda') = P(O,i_1=i|\lambda')/P(O|\lambda')\), we get
\[\pi_i = \dfrac{P(O, i_1=i|\lambda')}{P(O|\lambda')} \]
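In practice this ratio is usually evaluated with the forward and backward probabilities. Assuming the standard definitions \(\alpha_t(i) = P(o_1,\dots,o_t,i_t=i|\lambda')\) and \(\beta_t(i) = P(o_{t+1},\dots,o_T|i_t=i,\lambda')\), which are not introduced in this section, a sketch of the update is
\[\pi_i = \dfrac{\alpha_1(i)\beta_1(i)}{\sum_{j=1}^N\alpha_1(j)\beta_1(j)} \]
since \(P(O,i_1=i|\lambda') = \alpha_1(i)\beta_1(i)\) and \(P(O|\lambda') = \sum_{j=1}^N\alpha_1(j)\beta_1(j)\).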
The other parameters can be derived in the same way.
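As a sketch of where the same Lagrange-multiplier argument leads, apply it to the remaining two terms under the constraints \(\sum_{j=1}^Na_{ij} = 1\) and \(\sum_{k=1}^Mb_j(k) = 1\). The notation here is an assumption not fixed earlier in this section: \(v_k\) is the \(k\)-th of \(M\) observation symbols, and \(I(\cdot)\) is the indicator function (not the state sequence). The resulting updates should take the form
\[\begin{align*} a_{ij} &= \dfrac{\sum_{t=1}^{T-1}P(O,i_t=i,i_{t+1}=j|\lambda')}{\sum_{t=1}^{T-1}P(O,i_t=i|\lambda')}\\ b_j(k) &= \dfrac{\sum_{t=1}^{T}P(O,i_t=j|\lambda')I(o_t=v_k)}{\sum_{t=1}^{T}P(O,i_t=j|\lambda')} \end{align*}\]
Both have the same shape as the update for \(\pi_i\): an expected count under \(\lambda'\) divided by the total that makes the row sum to one.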