Connections Among Unsupervised LDA, PCA, and k-means, with Derivations

Reposted from the original article · 2020-05-07 23:24 · Machine Learning

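\(LDA\) (linear discriminant analysis) is a common supervised method used for both dimensionality reduction and classification; \(PCA\) (principal component analysis) is an unsupervised dimensionality-reduction technique; and \(k\)-means is a widely used unsupervised clustering method that is also often applied as a data preprocessing step.
This article examines how, in the unsupervised setting, the within-class and between-class scatter matrices of \(LDA\) relate to \(PCA\) and \(k\)-means.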

1. Basic principles of standard supervised \(LDA\)

(1) The \(LDA\) objective function

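For the origin and theoretical derivation of \(LDA\), see "線性判別分析LDA原理總結" (a summary of the principles of linear discriminant analysis), which covers them in detail, so they are not repeated here. Every \(LDA\) in this article is multi-class \(LDA\), written in matrix form.
The basic idea of \(LDA\) is as follows: given the original data \(X\) (assumed to be centered), find an orthonormal projection \(W\) such that, after the samples are projected onto the subspace, the within-class scatter \(S_w\) is as small as possible and the between-class scatter \(S_b\) is as large as possible. That is, optimize the following objectives: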

\[\begin{equation} \left\{\begin{array}{l} \min_{W^{T} W=I} \operatorname{Tr}\left(W^{T} S_{w} W\right) \\ \max_{W^{T} W=I} \operatorname{Tr}\left(W^{T} S_{b} W\right). \end{array} \right. \end{equation} \]

The within-class scatter matrix \(S_w\) and the between-class scatter matrix \(S_b\) above also satisfy the identity:

\[\begin{equation} {S}_w + {S}_b = {S}_t, \end{equation} \]

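Here \({S}_t\) denotes the total scatter matrix. The aim of this article is to show how the total scatter matrix \({S}_t\) relates to \(PCA\), and how the within-class scatter matrix \({S}_w\) relates to \(k\)-means.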

(2) Why \(LDA\) is supervised

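\(LDA\) is supervised because computing the within-class scatter matrix \({S}_w\) and the between-class scatter matrix \({S}_b\) in Eq. (1) requires the label matrix \(Y\).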
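To make the role of the labels concrete, here is a minimal NumPy sketch. It assumes that \(X\) stores one sample per column (shape \(d \times n\)) and is already centered; the function name `scatter_matrices` and the toy data are illustrative only, not part of the original article. It builds \({S}_w\) and \({S}_b\) from class means and checks the identity \({S}_w + {S}_b = {S}_t\) of Eq. (2).

```python
import numpy as np

def scatter_matrices(X, labels):
    """Return (S_w, S_b, S_t) for centered data X of shape (d, n)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    d = X.shape[0]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for k in np.unique(labels):
        X_k = X[:, labels == k]                 # samples of class k as columns
        mu_k = X_k.mean(axis=1, keepdims=True)  # class mean, shape (d, 1)
        D_k = X_k - mu_k                        # deviations from the class mean
        S_w += D_k @ D_k.T
        # The global mean is zero because X is centered.
        S_b += X_k.shape[1] * (mu_k @ mu_k.T)
    S_t = X @ X.T
    return S_w, S_b, S_t

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 12))
X -= X.mean(axis=1, keepdims=True)              # center the data
labels = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2])

S_w, S_b, S_t = scatter_matrices(X, labels)
print(np.allclose(S_w + S_b, S_t))
```

Without `labels` neither the class means nor the per-class sums can be formed, which is exactly why the standard formulation needs supervision.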

2. The relationship between the total scatter matrix of \(LDA\) and \(PCA\)

For a detailed derivation of \(PCA\), see "PCA的數學原理" (the mathematical principles of PCA).
The total scatter matrix \({S}_t\) in \(LDA\) can be written as:

\[\begin{equation} {S}_{t}={X X}^{T}=\sum_{i=1}^{n} x_{i} x_{i}^{T}. \end{equation} \]

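It is clear from this that the total scatter matrix \({S}_t\) of \(LDA\) is, up to a constant factor such as \(1/n\), the covariance matrix of the centered data, which is exactly the matrix that \(PCA\) eigendecomposes. Maximizing \(\operatorname{Tr}\left(W^{T} {S}_t W\right)\) subject to \(W^{T} W = I\) is therefore the \(PCA\) objective, and in this sense the two are equivalent.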
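The following NumPy sketch illustrates this numerically. It assumes centered data with one sample per column; the helper `pca_from_total_scatter` and the synthetic data are hypothetical names chosen for illustration. It takes the top eigenvectors of \({S}_t = XX^{T}\) and compares the subspace they span with the one obtained from the SVD of \(X\), a common way of computing PCA.

```python
import numpy as np

def pca_from_total_scatter(X, m):
    """Top-m eigenvectors and eigenvalues of S_t = X X^T for centered X (d, n)."""
    S_t = X @ X.T
    eigvals, eigvecs = np.linalg.eigh(S_t)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:m]       # indices of the m largest
    return eigvecs[:, order], eigvals[order]

rng = np.random.default_rng(0)
# Five features with clearly different scales, 200 samples as columns.
scales = np.array([[5.0], [3.0], [1.0], [0.5], [0.1]])
X = rng.normal(size=(5, 200)) * scales
X -= X.mean(axis=1, keepdims=True)

W, top_vals = pca_from_total_scatter(X, m=2)

# PCA via SVD: the leading left singular vectors span the same subspace.
U, s, _ = np.linalg.svd(X, full_matrices=False)
U2 = U[:, :2]

print(np.allclose(W @ W.T, U2 @ U2.T))          # same projector, so same subspace
print(np.isclose(np.trace(W.T @ X @ X.T @ W), top_vals.sum()))
```

Projectors are compared rather than the vectors themselves because eigenvectors are only determined up to sign.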

3. The connection between \(LDA\) and \(k\)-means

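First, assume that in the unsupervised setting the label matrix \(Y\) changes from a known quantity into an unknown to be solved for. The within-class scatter matrix \({S}_w\) and the between-class scatter matrix \({S}_b\) can then be written as: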

\[\begin{equation} \left\{\begin{array}{l} {S}_{t}={X} {X}^{T} \\ {S}_{b}={X} {Y}\left({Y}^{T} {Y}\right)^{-1} {Y}^{T} {X}^{T} \\ {S}_{w}={S}_{t}-{S}_{b}={X} \left({I}-{Y}\left({Y}^{T} {Y}\right)^{-1} {Y}^{T}\right) {X}^{T} \end{array} \right. \end{equation} \]

Here \({I}\) is the identity matrix of matching dimension. Next, we expand the within-class scatter matrix \(\mathbf{S}_w\):

\[\begin{equation} \begin{aligned} \mathbf{S}_{w} &=\mathbf{X} \left(\mathbf{I}-\mathbf{Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T}\right) \mathbf{X}^{T}\\ &=\mathbf{X X}^{T}-\mathbf{X Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T} \mathbf{X}^{T} \end{aligned} \end{equation} \]

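Because \(\mathbf{Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T}\) is idempotent, the subtracted term can be split so that the expression becomes a complete square: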

\[\begin{equation} \begin{aligned} &\mathbf{X X}^{T}-\mathbf{X Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T} \mathbf{X}^{T}\\ =&\mathbf{X X}^{T}-2 \mathbf{X Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T} \mathbf{X}^{T}+\mathbf{X Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T} \mathbf{Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T} \mathbf{X}^{T} \\ =&\left(\mathbf{X}-\mathbf{X Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T}\right)\left(\mathbf{X}-\mathbf{X Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T}\right)^{T}, \\ \Rightarrow\ &\operatorname{Tr}\left(\mathbf{S}_{w}\right)=\left\|\mathbf{X}-\mathbf{X Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T}\right\|_{F}^{2} \end{aligned} \end{equation} \]

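A small trick used above: when \(\mathbf{Y}\) is a one-hot indicator matrix, \(\left(\mathbf{Y}^{T}\mathbf{Y}\right)^{-1}\) is a diagonal matrix whose \(k\)-th diagonal entry is the reciprocal of the number of samples in class \(k\), \(\frac{1}{n_k}\). Consequently, the columns of \(\mathbf{X Y}\left(\mathbf{Y}^{T}\mathbf{Y}\right)^{-1}\) are the class means.
Also note that: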

\[\begin{equation} \left\{ \begin{aligned} &\mathbf{Y}^{T} \mathbf{Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1}=I\\ &\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1 / 2} \mathbf{Y}^{T} \mathbf{Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1 / 2}=I\\ &\mathbf{Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T} \neq I \end{aligned} \right.. \end{equation} \]
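These properties are easy to check numerically. The sketch below assumes a one-hot indicator matrix \(\mathbf{Y}\) of shape \(n \times c\) in which every class has at least one sample (otherwise \(\mathbf{Y}^{T}\mathbf{Y}\) is not invertible); the variable names `G`, `P`, and `means` are illustrative choices, not notation from the article.

```python
import numpy as np

labels = np.array([0, 0, 1, 1, 1, 2])
n, c = len(labels), 3
Y = np.zeros((n, c))
Y[np.arange(n), labels] = 1.0      # one-hot indicator matrix, shape (n, c)

G = Y.T @ Y                        # diagonal matrix holding the class sizes
P = Y @ np.linalg.inv(G) @ Y.T     # n x n projector onto the column space of Y

rng = np.random.default_rng(0)
X = rng.normal(size=(2, n))        # toy data, one sample per column
means = X @ Y @ np.linalg.inv(G)   # column k is the mean of class k

print(np.diag(G))                  # class sizes on the diagonal
print(np.allclose(P @ P, P))       # P is idempotent
print(np.allclose(P, np.eye(n)))   # P is not the identity when n > c
print(np.allclose(means[:, 1], X[:, labels == 1].mean(axis=1)))
```

Idempotence of `P` is what allows the completed-square step in the derivation, and the last check shows that multiplying by \(\mathbf{Y}\left(\mathbf{Y}^{T}\mathbf{Y}\right)^{-1}\) averages the samples within each class.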

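Therefore, in the unsupervised setting, minimizing the trace of the within-class scatter matrix of \(LDA\) over the label matrix \(\mathbf{Y}\) is equivalent to minimizing the \(k\)-means objective, and that objective can be written as a trace or, equivalently, as a squared Frobenius norm.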
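As a sanity check on this equivalence, the sketch below compares the usual \(k\)-means cost (sum of squared distances from each sample to its cluster centroid) with \(\left\|\mathbf{X}-\mathbf{X Y}\left(\mathbf{Y}^{T} \mathbf{Y}\right)^{-1} \mathbf{Y}^{T}\right\|_{F}^{2}\) for two different labelings. The function names and the two-blob toy data are assumptions made for illustration, and every cluster is assumed to be non-empty.

```python
import numpy as np

def kmeans_objective(X, labels, c):
    """Sum of squared distances to the assigned centroids; X has shape (d, n)."""
    total = 0.0
    for k in range(c):
        X_k = X[:, labels == k]
        centroid = X_k.mean(axis=1, keepdims=True)
        total += np.sum((X_k - centroid) ** 2)
    return total

def trace_within_scatter(X, labels, c):
    """Tr(S_w) computed as ||X - X Y (Y^T Y)^{-1} Y^T||_F^2 with one-hot Y."""
    n = X.shape[1]
    Y = np.zeros((n, c))
    Y[np.arange(n), labels] = 1.0
    P = Y @ np.linalg.inv(Y.T @ Y) @ Y.T
    return np.linalg.norm(X - X @ P, "fro") ** 2

rng = np.random.default_rng(0)
# Two well-separated blobs of 20 points each, stored as columns.
X = np.hstack([
    rng.normal(-3.0, 0.5, size=(2, 20)),
    rng.normal(3.0, 0.5, size=(2, 20)),
])
X -= X.mean(axis=1, keepdims=True)

good = np.repeat([0, 1], 20)       # labels that match the blobs
bad = np.tile([0, 1], 20)          # labels that mix the blobs

for labels in (good, bad):
    print(kmeans_objective(X, labels, 2), trace_within_scatter(X, labels, 2))
```

The two numbers on each printed line agree, and the labeling that matches the blobs gives the smaller value, which is the quantity \(k\)-means tries to minimize when it searches over \(\mathbf{Y}\).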
