Paper Walkthrough (VGAE): Variational Graph Auto-Encoders


Paper information

Title: Variational Graph Auto-Encoders
Authors: Thomas Kipf, Max Welling
Source: 2016, arXiv
Paper link: download
Code: download

1 Introduction

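This paper applies the variational auto-encoder (VAE) to graph-structured data. For the underlying framework, refer to the VAE itself.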

2 Method

The overall framework of the variational graph auto-encoder (VGAE) has two components:

- a GCN encoder
- a simple inner-product decoder

2.1 Encoder

Inference model: a two-layer GCN.

Step 1: obtain the mean $\boldsymbol{\mu}$ and the log standard deviation $\log \boldsymbol{\sigma}$

　　　　$\boldsymbol{\mu}=\operatorname{GCN}_{\boldsymbol{\mu}}(\mathbf{X}, \mathbf{A})$

　　　　$\log \boldsymbol{\sigma}=\mathrm{GCN}_{\boldsymbol{\sigma}}(\mathbf{X}, \mathbf{A})$

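$\log \boldsymbol{\sigma}$ can be positive or negative, whereas $\boldsymbol{\sigma}$ itself must be positive, which is why the network predicts $\log \boldsymbol{\sigma}$.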

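    def encode(self, x, adj):
        # First GCN layer, shared by both output heads
        hidden1 = self.gc1(x, adj)
        # Second layer: gc2 outputs mu, gc3 outputs log(sigma)
        return self.gc2(hidden1, adj), self.gc3(hidden1, adj)

    # Called from the model's forward pass:
    mu, logvar = self.encode(x, adj)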

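Note: the first GCN layer is shared, and two separate second-layer GCNs output the $\boldsymbol{\mu}$ and $\log \boldsymbol{\sigma}$ matrices.

Step 2: obtain the latent representation $\mathbf{z}$ from the mean $\boldsymbol{\mu}$ and the log standard deviation $\log \boldsymbol{\sigma}$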

　　　　$q(\mathbf{Z} \mid \mathbf{X}, \mathbf{A})=\prod_{i=1}^{N} q\left(\mathbf{z}_{i} \mid \mathbf{X}, \mathbf{A}\right) \text { with } \quad q\left(\mathbf{z}_{i} \mid \mathbf{X}, \mathbf{A}\right)=\mathcal{N}\left(\mathbf{z}_{i} \mid \boldsymbol{\mu}_{i}, \operatorname{diag}\left(\boldsymbol{\sigma}_{i}^{2}\right)\right)$

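    def reparameterize(self, mu, logvar):
        if self.training:
            # Despite its name, logvar is used as log(sigma) here,
            # so exp() yields the standard deviation directly
            std = torch.exp(logvar)
            eps = torch.randn_like(std)
            return eps.mul(std).add_(mu)
        else:
            # At evaluation time, use the mean as the embedding
            return mu

    # Called from the model's forward pass:
    z = self.reparameterize(mu, logvar)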

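Note: $z$ is obtained by taking the Hadamard (element-wise) product of the standard deviation $\text{std}$ with a matrix $\text{eps}$ drawn from a standard normal distribution, and then adding the mean $\mu$. Although the variable is named logvar, the code treats it as $\log \boldsymbol{\sigma}$: torch.exp(logvar) is used directly as the standard deviation.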
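The reparameterization step can be checked numerically. The following is a minimal sketch in plain Python (no PyTorch), using hypothetical scalar values for `mu` and `log_sigma` that do not come from the paper. It only illustrates that samples built as $\mu + \exp(\log \sigma) \cdot \epsilon$ have a mean close to $\mu$ and a standard deviation close to $\sigma$.

```python
import math
import random
import statistics


def reparameterize(mu, log_sigma, rng):
    """Draw one sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    sigma = math.exp(log_sigma)  # exp() keeps sigma positive for any real input
    eps = rng.gauss(0.0, 1.0)
    return mu + sigma * eps


# Hypothetical values chosen for illustration only.
mu, log_sigma = 1.5, -0.7
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [reparameterize(mu, log_sigma, rng) for _ in range(20000)]

sample_mean = statistics.fmean(samples)
sample_std = statistics.pstdev(samples)
print(sample_mean, sample_std, math.exp(log_sigma))
```

Because the randomness lives entirely in `eps`, the sample is a deterministic function of `mu` and `log_sigma`, which is what lets gradients flow back to the encoder in the PyTorch version.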

2.2 Decoder

Generative model: the adjacency matrix $\mathbf{A}$ is generated from the inner products between latent variables $\mathbf{z}$:

　　　　$p(\mathbf{A} \mid \mathbf{Z})=\prod_{i=1}^{N} \prod_{j=1}^{N} p\left(A_{i j} \mid \mathbf{z}_{i}, \mathbf{z}_{j}\right) \text { with } p\left(A_{i j}=1 \mid \mathbf{z}_{i}, \mathbf{z}_{j}\right)=\sigma\left(\mathbf{z}_{i}^{\top} \mathbf{z}_{j}\right)$

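where:

- $\mathbf{A}$ is the adjacency matrix
- $\sigma(\cdot)$ is the logistic sigmoid function (not to be confused with the standard deviation $\boldsymbol{\sigma}$ used in the encoder)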
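To make the decoder formula concrete, here is a small sketch in plain Python. The embeddings in `Z` are hypothetical values picked for illustration, and `inner_product_decoder` is an illustrative helper name, not part of the original code. It computes $\sigma(\mathbf{z}_i^{\top} \mathbf{z}_j)$ for every node pair.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def inner_product_decoder(Z):
    """Return the matrix of edge probabilities sigmoid(z_i . z_j)."""
    n = len(Z)
    return [
        [sigmoid(sum(a * b for a, b in zip(Z[i], Z[j]))) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 2-dimensional embeddings for three nodes.
Z = [
    [1.0, 0.5],    # node 0
    [0.8, 0.7],    # node 1: points in a similar direction to node 0
    [-1.0, -0.4],  # node 2: points in the opposite direction
]
A_hat = inner_product_decoder(Z)
for row in A_hat:
    print([round(p, 3) for p in row])
```

Nodes whose embeddings point the same way get a probability above 0.5, and nodes pointing in opposite directions get one below 0.5. The result is symmetric by construction, so this decoder can only model undirected graphs.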

    class InnerProductDecoder(nn.Module):
        """Decoder for using inner product for prediction."""

        def __init__(self, dropout, act=torch.sigmoid):
            super(InnerProductDecoder, self).__init__()
            self.dropout = dropout
            self.act = act

        def forward(self, z):
            z = F.dropout(z, self.dropout, training=self.training)
            # Z Z^T gives one score per node pair
            adj = self.act(torch.mm(z, z.t()))
            return adj

    # In the model, the decoder is created with an identity activation,
    # so it returns logits; the sigmoid is applied inside the loss.
    self.dc = InnerProductDecoder(dropout, act=lambda x: x)
    adj = self.dc(z)

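Note: in the non-probabilistic variant (GAE), the embedding matrix $\mathbf{Z}$ and the reconstructed adjacency matrix $\hat{\mathbf{A}}$ are computed directly, without sampling: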

　　　　$\hat{\mathbf{A}}=\sigma\left(\mathbf{Z Z}^{\top}\right), \text { with } \quad \mathbf{Z}=\operatorname{GCN}(\mathbf{X}, \mathbf{A})$

2.3 Loss function

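Learning: optimize the variational lower bound $\mathcal{L}$ with respect to the parameters $\mathbf{W}_i$:

We want the reconstructed graph (adjacency matrix) to be as close as possible to the original graph. Because we have also assumed a particular form for the latent representation, we want its distribution to be as close as possible to the assumed standard Gaussian. The loss function therefore has two parts: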

　　　　$\mathcal{L}=\mathbb{E}_{q(\mathbf{Z} \mid \mathbf{X}, \mathbf{A})}[\log p(\mathbf{A} \mid \mathbf{Z})]-\mathrm{KL}[q(\mathbf{Z} \mid \mathbf{X}, \mathbf{A}) \| p(\mathbf{Z})]$

where:

- $\operatorname{KL}[q(\cdot) \| p(\cdot)]$ is the KL divergence between $q(\cdot)$ and $p(\cdot)$
- the prior is Gaussian: $p(\mathbf{Z})=\prod_{i} p\left(\mathbf{z}_{i}\right)=\prod_{i} \mathcal{N}\left(\mathbf{z}_{i} \mid 0, \mathbf{I}\right)$
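For reference, with a diagonal Gaussian posterior and a standard normal prior, the KL term has a well-known closed form. Writing $d$ for the latent dimension and $k$ for a latent coordinate (symbols not used in the original text), a sketch of it for a single node $i$ is:

　　　　$\mathrm{KL}\left[q\left(\mathbf{z}_{i} \mid \mathbf{X}, \mathbf{A}\right) \| p\left(\mathbf{z}_{i}\right)\right]=-\frac{1}{2} \sum_{k=1}^{d}\left(1+2 \log \sigma_{i k}-\mu_{i k}^{2}-\sigma_{i k}^{2}\right)$

Since both $q(\mathbf{Z} \mid \mathbf{X}, \mathbf{A})$ and $p(\mathbf{Z})$ factorize over nodes, the full KL term is the sum of this expression over $i$. This is the shape of the expression computed in the KLD line of loss_function, where the variable logvar plays the role of $\log \boldsymbol{\sigma}$.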

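Note: the reconstruction term is implemented as a weighted binary cross-entropy with a built-in sigmoid (preds = the logits $\mathbf{Z}\mathbf{Z}^{\top}$ returned by the decoder, labels = $\mathbf{A}$).

Weight: pos_weight = float(adj.shape[0] * adj.shape[0] - adj.sum()) / adj.sum(), i.e. the number of zeros in the adjacency matrix divided by the number of ones.

Normalization factor: norm = adj.shape[0] * adj.shape[0] / float((adj.shape[0] * adj.shape[0] - adj.sum()) * 2), i.e. the total number of entries divided by twice the number of zeros.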
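The two quantities can be worked through on a toy graph. The sketch below uses a hypothetical 4-node adjacency matrix stored as nested lists rather than the matrix type used in the original code; the arithmetic is the same.

```python
# Hypothetical 4-node undirected graph with edges 0-1 and 1-2.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

n = len(adj)
num_entries = n * n
num_ones = sum(sum(row) for row in adj)
num_zeros = num_entries - num_ones

# Same formulas as in the text, written with explicit counts.
pos_weight = num_zeros / num_ones
norm = num_entries / (num_zeros * 2)

print(num_ones, num_zeros, pos_weight, norm)
```

Here 12 zeros and 4 ones give pos_weight = 3 and norm = 16 / 24. Real graphs are usually far sparser, so pos_weight tends to be much larger, which keeps the few existing edges from being swamped by the many non-edges.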

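    def loss_function(preds, labels, mu, logvar, n_nodes, norm, pos_weight):
        # Reconstruction term: weighted binary cross-entropy on the logits.
        # F.binary_cross_entropy_with_logits expects pos_weight as a tensor.
        cost = norm * F.binary_cross_entropy_with_logits(preds, labels, pos_weight=pos_weight)
        # KL term against N(0, I); logvar holds log(sigma), hence
        # 2 * logvar for log(sigma^2) and exp().pow(2) for sigma^2.
        KLD = -0.5 / n_nodes * torch.mean(torch.sum(
            1 + 2 * logvar - mu.pow(2) - logvar.exp().pow(2), 1))
        return cost + KLD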
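To see what the two terms of loss_function compute, here is a dependency-free sketch that mirrors them in plain Python. The helper names (`log_sigmoid`, `weighted_bce_with_logits`, `kl_term`) and the toy inputs are assumptions made for illustration. The weighted cross-entropy follows the standard definition $-[w \cdot y \log s(x) + (1-y) \log (1-s(x))]$ averaged over all entries, where $s$ is the sigmoid.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def weighted_bce_with_logits(logits, labels, pos_weight):
    """Mean over all entries of -[w * y * log s(x) + (1 - y) * log(1 - s(x))]."""
    total, count = 0.0, 0
    for row_x, row_y in zip(logits, labels):
        for x, y in zip(row_x, row_y):
            # log(1 - sigmoid(x)) equals log(sigmoid(-x))
            total -= pos_weight * y * log_sigmoid(x) + (1 - y) * log_sigmoid(-x)
            count += 1
    return total / count


def kl_term(mu, log_sigma):
    """Mirror of the KLD line: -0.5 / N * mean over nodes of the per-node sum."""
    n_nodes = len(mu)
    per_node = [
        sum(1 + 2 * ls - m * m - math.exp(ls) ** 2 for m, ls in zip(mu_i, ls_i))
        for mu_i, ls_i in zip(mu, log_sigma)
    ]
    return -0.5 / n_nodes * (sum(per_node) / n_nodes)


# Hypothetical 3-node graph with a single undirected edge 0-1.
labels = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
n = len(labels)
num_ones = sum(map(sum, labels))
num_zeros = n * n - num_ones
pos_weight = num_zeros / num_ones
norm = n * n / (num_zeros * 2)

# With all logits at 0 the decoder predicts probability 0.5 everywhere.
logits = [[0.0] * n for _ in range(n)]
cost = norm * weighted_bce_with_logits(logits, labels, pos_weight)

# With mu = 0 and log(sigma) = 0 the posterior equals the prior N(0, I).
mu = [[0.0, 0.0] for _ in range(n)]
log_sigma = [[0.0, 0.0] for _ in range(n)]
kld = kl_term(mu, log_sigma)

print(cost, kld, cost + kld)
```

In this toy case the normalized reconstruction term works out to $\ln 2$, the loss of a coin-flip prediction, and the KL term is zero because the posterior matches the prior. One way to read norm, then, is as the factor that turns the weighted mean into a weighted average.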

3 Experiment

Results for the link prediction task on citation networks are shown in Table 1.

GAE* and VGAE* denote experiments without input features; GAE and VGAE use input features.

4 Other

Here the GCN is defined as:
　　　　$\operatorname{GCN}(\mathbf{X}, \mathbf{A})=\tilde{\mathbf{A}} \operatorname{ReLU}\left(\tilde{\mathbf{A}} \mathbf{X} \mathbf{W}_{0}\right) \mathbf{W}_{1}$

where:

- $\mathbf{W}_{i}$ are the weight matrices
- $\operatorname{GCN}_{\boldsymbol{\mu}}(\mathbf{X}, \mathbf{A})$ and $\mathrm{GCN}_{\boldsymbol{\sigma}}(\mathbf{X}, \mathbf{A})$ share the first-layer weight matrix $\mathbf{W}_{0}$
- $\operatorname{ReLU}(\cdot)=\max (0, \cdot)$
- $\tilde{\mathbf{A}}=\mathbf{D}^{-\frac{1}{2}} \mathbf{A} \mathbf{D}^{-\frac{1}{2}}$ is the symmetrically normalized adjacency matrix

    import torch
    import torch.nn.functional as F
    from torch.nn.modules.module import Module
    from torch.nn.parameter import Parameter


    class GraphConvolution(Module):
        def __init__(self, in_features, out_features, dropout=0., act=F.relu):
            super(GraphConvolution, self).__init__()
            self.in_features = in_features
            self.out_features = out_features
            self.dropout = dropout
            self.act = act
            self.weight = Parameter(torch.FloatTensor(in_features, out_features))
            self.reset_parameters()

        def reset_parameters(self):
            torch.nn.init.xavier_uniform_(self.weight)

        def forward(self, input, adj):
            input = F.dropout(input, self.dropout, self.training)
            # X W
            support = torch.mm(input, self.weight)
            # A_norm (X W); adj is the (sparse) normalized adjacency matrix
            output = torch.spmm(adj, support)
            output = self.act(output)
            return output
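The normalization and the two-layer propagation rule can also be traced by hand on a tiny graph. The sketch below is plain Python with hypothetical fixed weights (`w0`, `w_mu`, `w_sigma`) and illustrative helper names. It follows the formula as written in this section and, as an assumption of this sketch, gives isolated nodes zero rows. Implementations commonly add self-loops before normalizing; this sketch does not.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adj(adj):
    """Return D^{-1/2} A D^{-1/2}; isolated nodes (degree 0) get zero rows."""
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    n = len(adj)
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def encode(x, adj_norm, w0, w_mu, w_sigma):
    """Two-layer GCN encoder with a shared first layer and two output heads."""
    hidden = relu(matmul(adj_norm, matmul(x, w0)))         # shared first layer
    mu = matmul(adj_norm, matmul(hidden, w_mu))            # mean head
    log_sigma = matmul(adj_norm, matmul(hidden, w_sigma))  # log-std head
    return mu, log_sigma


# Hypothetical path graph 0-1-2 with identity (featureless) inputs.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical fixed weights; a real model would learn these.
w0 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.6]]
w_mu = [[1.0], [0.5]]
w_sigma = [[-0.5], [0.2]]

adj_norm = normalize_adj(adj)
mu, log_sigma = encode(x, adj_norm, w0, w_mu, w_sigma)
print(adj_norm)
print(mu, log_sigma)
```

Both heads reuse `hidden`, which is the plain-Python counterpart of sharing $\mathbf{W}_{0}$ between $\operatorname{GCN}_{\boldsymbol{\mu}}$ and $\mathrm{GCN}_{\boldsymbol{\sigma}}$.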

Revision history

2021-03-23 Article created

2022-06-10 Article revised

2022-06-28 Revised the loss function image in the article
