Naive Bayes (a Generative Model)


Basic Assumptions of Naive Bayes

  1. The training data are drawn independently and identically distributed from the joint distribution $P\left( {X,Y} \right)$.
  2. Conditional independence assumption (once the class is fixed, the features are mutually independent): \[P\left( {X = x|Y = {c_k}} \right) = P\left( {{X^{\left( 1 \right)}} = {x^{\left( 1 \right)}},{X^{\left( 2 \right)}} = {x^{\left( 2 \right)}}, \ldots ,{X^{\left( n \right)}} = {x^{\left( n \right)}}|Y = {c_k}} \right) = \prod\limits_{j = 1}^n {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_k}} \right)} \]
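
To see why the conditional independence assumption matters in practice, the sketch below counts how many probability table entries a model needs with and without it. The function names and the example sizes (2 classes, 10 features with 3 values each) are hypothetical and chosen only for illustration.

```python
from math import prod


def joint_table_entries(num_classes, values_per_feature):
    # Without the assumption: one entry of P(X = x | Y = c_k) for every
    # (class, full feature vector) combination.
    return num_classes * prod(values_per_feature)


def naive_table_entries(num_classes, values_per_feature):
    # With the assumption: one entry of P(X^(j) = x^(j) | Y = c_k) for every
    # (class, feature, feature value) combination.
    return num_classes * sum(values_per_feature)


# Hypothetical problem size, not taken from any data set.
values_per_feature = [3] * 10
print(joint_table_entries(2, values_per_feature))  # 2 * 3**10 = 118098
print(naive_table_entries(2, values_per_feature))  # 2 * 30 = 60
```

The count grows exponentially in the number of features without the assumption and only linearly with it, which is what makes the factorized model feasible to estimate.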

Core Idea of the Algorithm

For a given input $x$, the learned model is used to compute the posterior distribution $P\left( {Y = {c_k}|X = x} \right)$, and the class with the largest posterior probability is taken as the class of $x$. The posterior follows from Bayes' theorem combined with the conditional independence assumption: \[P\left( {Y = {c_k}|X = x} \right) = \frac{{P\left( {Y = {c_k}} \right)\prod\limits_j {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_k}} \right)} }}{{\sum\limits_i {P\left( {Y = {c_i}} \right)\prod\limits_j {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_i}} \right)} } }}\]
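
The posterior formula can be sketched in a few lines of Python. This is a minimal illustration, assuming the prior and the per-feature conditional probabilities are already known; the tables `prior` and `cond`, the class labels `"c1"` and `"c2"`, and the function name `posterior` are all hypothetical, and the numbers are hand-picked rather than estimated from data.

```python
def posterior(x, prior, cond):
    """Return P(Y = c | X = x) for every class c under the naive Bayes model."""
    joint = {}
    for c, p_c in prior.items():
        # Numerator of Bayes' theorem: P(Y = c) * prod_j P(X^(j) = x^(j) | Y = c)
        score = p_c
        for j, value in enumerate(x):
            score *= cond[c][j][value]
        joint[c] = score
    # Denominator: the same numerator summed over all classes.
    evidence = sum(joint.values())
    return {c: s / evidence for c, s in joint.items()}


# Hypothetical tables: cond[c][j][v] stands for P(X^(j) = v | Y = c).
prior = {"c1": 0.4, "c2": 0.6}
cond = {
    "c1": [{1: 0.8, 0: 0.2}, {1: 0.3, 0: 0.7}],
    "c2": [{1: 0.1, 0: 0.9}, {1: 0.5, 0: 0.5}],
}

post = posterior((1, 0), prior, cond)
print(post)
```

For the input `(1, 0)` the numerators are 0.4 · 0.8 · 0.7 = 0.224 for `c1` and 0.6 · 0.1 · 0.5 = 0.03 for `c2`, so `c1` receives the larger posterior.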

The naive Bayes classifier can therefore be written as: \[y = \arg {\max _{{c_k}}}P\left( {Y = {c_k}|X = x} \right) = \arg {\max _{{c_k}}}\frac{{P\left( {Y = {c_k}} \right)\prod\limits_j {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_k}} \right)} }}{{\sum\limits_i {P\left( {Y = {c_i}} \right)\prod\limits_j {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_i}} \right)} } }}\]

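Because the denominator is the same for every class ${c_k}$, it does not affect the maximization, so the classifier is equivalent to: \[y = \arg {\max _{{c_k}}}P\left( {Y = {c_k}} \right)\prod\limits_j {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_k}} \right)} \]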
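
A minimal sketch of this decision rule follows. It compares classes by the unnormalized score only, and it sums logarithms instead of multiplying probabilities, which is a common numerical choice rather than something the formula requires: the logarithm is monotonic, so the maximizing class is unchanged. The tables and the name `predict` are hypothetical, and the sketch assumes every probability involved is strictly positive.

```python
import math


def predict(x, prior, cond):
    """Return the class maximizing P(Y = c) * prod_j P(X^(j) = x^(j) | Y = c)."""

    def log_score(c):
        # log of the unnormalized posterior; assumes all probabilities are > 0,
        # otherwise math.log raises ValueError.
        return math.log(prior[c]) + sum(
            math.log(cond[c][j][value]) for j, value in enumerate(x)
        )

    return max(prior, key=log_score)


# Hypothetical tables: cond[c][j][v] stands for P(X^(j) = v | Y = c).
prior = {"c1": 0.4, "c2": 0.6}
cond = {
    "c1": [{1: 0.8, 0: 0.2}, {1: 0.3, 0: 0.7}],
    "c2": [{1: 0.1, 0: 0.9}, {1: 0.5, 0: 0.5}],
}

print(predict((1, 0), prior, cond))  # c1: 0.224 vs 0.03
print(predict((0, 1), prior, cond))  # c2: 0.024 vs 0.27
```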

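Naive Bayes assigns each instance to the class with the largest posterior probability. This is equivalent to minimizing the expected risk when the loss function is the 0-1 loss.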
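
The equivalence can be checked with a short derivation. The notation $L$ for the loss, $f$ for the decision function, and ${R_{\exp }}$ for the expected risk is introduced here for illustration and is not used elsewhere in this article. With the 0-1 loss, $L\left( {Y,f\left( X \right)} \right) = 1$ if $Y \ne f\left( X \right)$ and $0$ otherwise, the expected risk is \[{R_{\exp }}\left( f \right) = {E_X}\sum\limits_k {L\left( {{c_k},f\left( X \right)} \right)P\left( {Y = {c_k}|X} \right)} \]

It is minimized by minimizing the inner sum separately for each $X = x$: \[f\left( x \right) = \arg {\min _y}\sum\limits_k {L\left( {{c_k},y} \right)P\left( {Y = {c_k}|X = x} \right)} = \arg {\min _y}\sum\limits_{k:{c_k} \ne y} {P\left( {Y = {c_k}|X = x} \right)} = \arg {\min _y}\left( {1 - P\left( {Y = y|X = x} \right)} \right) = \arg {\max _y}P\left( {Y = y|X = x} \right)\]

The last expression is exactly the rule of choosing the class with the largest posterior probability.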

Parameter Estimation

 

 

 

