Paper notes: Semi-supervised Classification with Graph Convolutional Networks

Reposted article, 2018-01-16 23:11. Tags: Graph CNN / Deep Learning

Semi-supervised Classification with Graph Convolutional Networks

2018-01-16 22:33:36

1. Main idea of the paper:

2. Code implementation (PyTorch): https://github.com/tkipf/pygcn

【Introduction】:

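This paper uses a GCN for semi-supervised classification of nodes in a graph. The conventional graph-based approach to this problem smooths label information over the graph by adding a graph Laplacian regularization term to the loss function:

$$\mathcal{L} = \mathcal{L}_0 + \lambda \mathcal{L}_{reg}, \quad \mathcal{L}_{reg} = \sum_{i,j} A_{ij} \| f(X_i) - f(X_j) \|^2 = f(X)^T \Delta f(X) \tag{1}$$

Here, $\mathcal{L}_0$ is the supervised loss on the labeled part of the graph, f(*) can be a neural-network-like differentiable function, $\lambda$ is a weighting factor, and X is the matrix of node feature vectors $X_i$. $\Delta = D - A$ is the unnormalized graph Laplacian of an undirected graph g with adjacency matrix A and degree matrix $D_{ii} = \sum_{j} A_{ij}$. Equation (1) relies on the assumption that connected nodes in the graph are likely to share the same label. This assumption can restrict the modeling capacity, because graph edges need not encode node similarity and may instead carry additional information.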
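As a small numerical illustration of the regularizer, the sketch below (NumPy, with a made-up three-node path graph and made-up outputs `F`) compares the quadratic form with an explicit sum over edges. The function names are hypothetical. Note the assumption spelled out in the code: the quadratic form equals the sum over unordered pairs i < j; summing over all ordered pairs (i, j) gives twice that value, a constant factor that can be absorbed into $\lambda$.

```python
import numpy as np

def laplacian_regularizer(A, F):
    """Return trace(F^T (D - A) F) for a symmetric adjacency A and outputs F (N x C)."""
    D = np.diag(A.sum(axis=1))      # degree matrix, D_ii = sum_j A_ij
    delta = D - A                   # unnormalized graph Laplacian
    return float(np.trace(F.T @ delta @ F))

def pairwise_penalty(A, F):
    """Sum A_ij * ||F_i - F_j||^2 over unordered pairs i < j."""
    n = A.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += A[i, j] * np.sum((F[i] - F[j]) ** 2)
    return float(total)

# Toy path graph 0 - 1 - 2 with a one-dimensional output per node (illustrative values).
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
F = np.array([[1.0], [2.0], [4.0]])

reg_quadratic = laplacian_regularizer(A, F)
reg_pairwise = pairwise_penalty(A, F)
```

Both values are large exactly when neighboring nodes receive very different outputs, which is the smoothness assumption the paragraph above describes.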

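In this work, we encode the graph structure directly with a neural network model f(X, A) and train it on the labeled nodes, which avoids an explicit graph-based regularization term in the loss function. Conditioning f(*) on the adjacency matrix of the graph allows the model to distribute gradient information from the supervised loss $\mathcal{L}_0$ and enables it to learn representations of nodes both with and without labels.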

The paper makes two main contributions:

1. we introduce a localized and well-behaved propagation rule for graph convolutional neural networks, and show it can be motivated from a first-order approximation of spectral convolutions on graphs.

2. we show how this form of a graph convolutional neural network can be used for fast and scalable semi-supervised classification of nodes in a graph.

【Fast Approximate Convolutions on Graphs】:

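We build a multi-layer Graph Convolutional Network (GCN) with the following layer-wise propagation rule:

$$H^{l+1} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{l} W^{l}\right) \tag{2}$$

Here, $\tilde{A} = A + I_N$ is the adjacency matrix of the undirected graph g with added self-connections, $I_N$ is the identity matrix, $\tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}$, and $W^l$ is a layer-specific trainable weight matrix. $\sigma(*)$ denotes an activation function such as ReLU(*). $H^l$ is the matrix of activations in the l-th layer, with $H^0 = X$.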
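The propagation rule can be sketched in a few lines of NumPy. This is a minimal dense-matrix illustration, not the code from the linked repository; the function names `normalize_adjacency` and `gcn_layer`, the toy graph, and the random features and weights are all assumptions made for the example.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a symmetric adjacency matrix A."""
    A_tilde = A + np.eye(A.shape[0])        # add self-connections
    d = A_tilde.sum(axis=1)                 # degrees of A~, always >= 1
    d_inv_sqrt = 1.0 / np.sqrt(d)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, H, W):
    """One propagation step with a ReLU activation: ReLU(A_hat @ H @ W)."""
    return np.maximum(A_hat @ H @ W, 0.0)

# Toy path graph 0 - 1 - 2 with 4 input features and 2 output features per node.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
A_hat = normalize_adjacency(A)

rng = np.random.default_rng(0)
H0 = rng.normal(size=(3, 4))    # illustrative input features
W0 = rng.normal(size=(4, 2))    # illustrative layer weights
H1 = gcn_layer(A_hat, H0, W0)
```

Each row of `H1` mixes the features of a node with those of its direct neighbors, weighted by the symmetric normalization.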

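Next, we show that this form of propagation rule can be motivated by a first-order approximation of localized spectral filters on graphs. We define a spectral convolution on a graph as the multiplication of a signal x (a scalar for every node) with a filter $g_{\theta} = diag(\theta)$, parameterized by $\theta$, in the Fourier domain, i.e.:

$$g_{\theta} \star x = U g_{\theta} U^T x \tag{3}$$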

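Here, U is the matrix of eigenvectors of the normalized graph Laplacian $L = I_N - D^{-\frac{1}{2}} A D^{-\frac{1}{2}} = U \Lambda U^T$, with a diagonal matrix of its eigenvalues $\Lambda$ and $U^T x$ being the graph Fourier transform of x. We can understand $g_{\theta}$ as a function of the eigenvalues of L, i.e. $g_{\theta}(\Lambda)$. Evaluating the formula above is computationally expensive, because multiplication with the eigenvector matrix U costs $O(N^2)$. Computing the eigendecomposition of L in the first place can also be prohibitively expensive for large graphs. To circumvent this problem, Hammond et al. (2011) suggested that $g_{\theta}(\Lambda)$ can be well approximated by a truncated expansion in terms of Chebyshev polynomials up to K-th order:

$$g_{\theta'}(\Lambda) \approx \sum_{k=0}^{K} \theta'_k T_k(\tilde{\Lambda}) \tag{4}$$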

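Here, $\tilde{\Lambda} = \frac{2}{\lambda_{max}} \Lambda - I_N$ is the rescaled eigenvalue matrix, $\lambda_{max}$ denotes the largest eigenvalue of L, and $\theta'$ is now a vector of Chebyshev coefficients. The Chebyshev polynomials are defined recursively as $T_k(x) = 2xT_{k-1}(x) - T_{k-2}(x)$, with $T_0(x) = 1$ and $T_1(x) = x$. For a deeper treatment of this approximation, see 【1】 and 【2】.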

【1】Hammond, David K, Vandergheynst, Pierre, and Gribonval, Remi. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2):129–150, 2011

【2】Defferrard, Michael, Bresson, Xavier, and Vandergheynst, Pierre. Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in Neural Information Processing Systems, 2016
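The Chebyshev recurrence $T_k(x) = 2xT_{k-1}(x) - T_{k-2}(x)$ used in the approximation above is easy to evaluate iteratively. The helper below is a minimal sketch; the name `chebyshev` is hypothetical and the evaluation point 0.5 is only an illustration. It also relies on the standard identity $T_k(\cos t) = \cos(kt)$ as a sanity check.

```python
import math

def chebyshev(k, x):
    """Evaluate T_k(x) with the recurrence T_k = 2x T_{k-1} - T_{k-2}."""
    t_prev, t_curr = 1.0, x            # T_0(x) and T_1(x)
    if k == 0:
        return t_prev
    for _ in range(2, k + 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

# T_0 .. T_4 evaluated at x = 0.5 = cos(pi / 3).
values = [chebyshev(k, 0.5) for k in range(5)]
reference = [math.cos(k * math.pi / 3.0) for k in range(5)]
```

Because each polynomial is obtained from the previous two, evaluating all orders up to K costs only K multiplications by x, which is what makes the expansion cheap when x is replaced by a sparse matrix.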

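Returning to our definition of the convolution of a signal x with a filter $g_{\theta'}$, we now have:

$$g_{\theta'} \star x \approx \sum_{k=0}^{K} \theta'_k T_k(\tilde{L}) x \tag{5}$$

Here, $\tilde{L} = \frac{2}{\lambda_{max}} L - I_N$, which is easily verified by noting that $(U \Lambda U^T)^k = U \Lambda^k U^T$. This expression is now K-localized since it is a K-th order polynomial in the Laplacian, i.e. it depends only on nodes that are at most K steps away from the central node (the K-th order neighborhood). Evaluating the formula above has complexity $O(|E|)$, i.e. linear in the number of edges. Defferrard et al. 【2】 use this K-localized convolution to define a convolutional neural network on graphs.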
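To make the K-localization concrete, the sketch below applies a Chebyshev filter with K = 2 to an impulse on a five-node path graph and compares the result with the same filter evaluated through the eigendecomposition. The graph, the coefficients `theta`, and the function names are illustrative assumptions. Dense matrices are used for brevity; with a sparse Laplacian each matrix-vector product would touch only the stored edges.

```python
import numpy as np

def normalized_laplacian(A):
    """Return L = I - D^{-1/2} A D^{-1/2}; assumes the graph has no isolated nodes."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return np.eye(A.shape[0]) - A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def chebyshev_filter(L, x, theta, lambda_max):
    """Compute sum_k theta[k] * T_k(L~) x using only matrix-vector products."""
    L_tilde = (2.0 / lambda_max) * L - np.eye(L.shape[0])
    t_prev, t_curr = x, L_tilde @ x            # T_0(L~) x and T_1(L~) x
    out = theta[0] * t_prev
    if len(theta) > 1:
        out = out + theta[1] * t_curr
    for k in range(2, len(theta)):
        t_prev, t_curr = t_curr, 2.0 * (L_tilde @ t_curr) - t_prev
        out = out + theta[k] * t_curr
    return out

# Path graph 0 - 1 - 2 - 3 - 4 with an impulse signal at node 0.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = normalized_laplacian(A)
eigvals, U = np.linalg.eigh(L)
lambda_max = eigvals.max()

x = np.zeros(5)
x[0] = 1.0
theta = np.array([0.5, -0.3, 0.2])             # arbitrary coefficients, K = 2
y = chebyshev_filter(L, x, theta, lambda_max)

# Reference: the same filter evaluated in the spectral domain.
lam_tilde = 2.0 * eigvals / lambda_max - 1.0
g = theta[0] + theta[1] * lam_tilde + theta[2] * (2.0 * lam_tilde ** 2 - 1.0)
y_spectral = U @ (g * (U.T @ x))
```

Nodes 3 and 4 are more than two hops away from node 0, so their entries in `y` stay zero, while the recurrence never needs the eigenvectors of L.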

In this work, we propose keeping only terms up to order K = 1 in the approximation above. The reasoning is as follows: as we intend to stack multiple layers of parameterized graph convolutions followed by non-linearities, we expect that a per-layer convolution operation that is linear with respect to the adjacency matrix increases modeling capacity while keeping the computational complexity comparable to a single graph convolution with K > 1. We further approximate $\lambda_{max} \approx 2$, as we can expect that neural network parameters will adapt to this change in scale during training.

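Under these approximations, we have:

$$g_{\theta'} \star x \approx \theta'_0 x + \theta'_1 (L - I_N) x = \theta'_0 x - \theta'_1 D^{-\frac{1}{2}} A D^{-\frac{1}{2}} x \tag{6}$$

with two free parameters $\theta'_0$ and $\theta'_1$. Equation (6) can be read as applying a parameterized filter that performs a local convolution over the direct neighbors of a node only. The filter parameters can be shared over the whole graph. Successive application of filters of this form then effectively convolves the k-th order neighborhood of a node, where k is the number of successive filtering operations or convolutional layers in the neural network model.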

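In practice, constraining the number of parameters further can reduce the number of operations (such as matrix multiplications) per layer. We can write:

$$g_{\theta} \star x \approx \theta \left(I_N + D^{-\frac{1}{2}} A D^{-\frac{1}{2}}\right) x \tag{7}$$

This leaves a single parameter $\theta = \theta'_0 = -\theta'_1$. Note that $I_N + D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$ now has eigenvalues in the range [0, 2]. Repeated application of this operator can therefore lead to numerical instabilities and exploding or vanishing gradients when used in a deep neural network model. To alleviate this problem, we introduce the following renormalization trick:

$$I_N + D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \rightarrow \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}, \quad \tilde{A} = A + I_N, \quad \tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}$$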
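The effect of the renormalization trick on the spectrum can be checked numerically. The sketch below uses a made-up four-node graph and a hypothetical helper `symmetric_normalize`; it compares the eigenvalues of the first-order operator with those of the renormalized one, and shows how ten repeated applications scale a signal in the worst case.

```python
import numpy as np

def symmetric_normalize(M):
    """Return D^{-1/2} M D^{-1/2}, where D holds the row sums of M (assumed positive)."""
    d_inv_sqrt = 1.0 / np.sqrt(M.sum(axis=1))
    return M * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Toy graph: triangle 0 - 1 - 2 plus a pendant node 3 attached to node 2.
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 0), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
I = np.eye(4)

first_order = I + symmetric_normalize(A)      # operator before the trick
renormalized = symmetric_normalize(A + I)     # operator after the trick

eig_first = np.linalg.eigvalsh(first_order)
eig_renorm = np.linalg.eigvalsh(renormalized)

# Worst-case amplification after 10 applications (spectral norm of the 10th power).
gain_first = np.linalg.norm(np.linalg.matrix_power(first_order, 10), 2)
gain_renorm = np.linalg.norm(np.linalg.matrix_power(renormalized, 10), 2)
```

On this graph the first-order operator has largest eigenvalue 2, so ten applications can amplify a signal by a factor of 1024, whereas the renormalized operator has largest eigenvalue 1 and keeps the scale bounded.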

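We generalize this definition to a signal X with C input channels (i.e. a C-dimensional feature vector for every node) and F filters or feature maps as follows:

$$Y = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} X \Theta \tag{8}$$

Here, $\Theta$ is now a C × F matrix of filter parameters and Y is the N × F convolved signal matrix. This filtering operation has complexity $O(|E|FC)$, because $\tilde{A} X$ can be implemented efficiently as a product of a sparse matrix with a dense matrix.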
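The sparse-times-dense product mentioned above can be sketched with a plain coordinate-list representation. This is an illustration in NumPy only: the helper `sparse_dense_product`, the toy graph, and the random `X` and `Theta` are assumptions, and a real implementation would normally rely on a sparse-matrix library instead.

```python
import numpy as np

def sparse_dense_product(n_rows, rows, cols, vals, X):
    """Multiply a sparse matrix in coordinate form (rows, cols, vals) with dense X."""
    out = np.zeros((n_rows, X.shape[1]))
    # Each stored entry adds one scaled row of X to the output, so the work
    # grows with the number of nonzeros rather than with N^2.
    np.add.at(out, rows, vals[:, None] * X[cols])
    return out

# Toy graph: triangle 0 - 1 - 2 plus a pendant node 3 attached to node 2.
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 0), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
A_tilde = A + np.eye(4)
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

rows, cols = np.nonzero(A_hat)      # coordinates of the stored entries
vals = A_hat[rows, cols]

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))         # C = 3 input channels
Theta = rng.normal(size=(3, 2))     # F = 2 filters

Y_sparse = sparse_dense_product(4, rows, cols, vals, X) @ Theta
Y_dense = A_hat @ X @ Theta
```

The number of stored entries is twice the number of edges plus N self-connections, which is where the dependence on |E| in the stated complexity comes from.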

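【Semi-supervised Node Classification】:

Having introduced a flexible model f(X, A) for efficient information propagation on graphs, we can return to the problem of semi-supervised node classification. As outlined in the introduction, we can relax the assumptions typically made in graph-based semi-supervised learning by conditioning our model f(X, A) both on the data X and on the adjacency matrix A of the underlying graph structure. We expect this setting to be especially powerful in scenarios where the adjacency matrix contains information not present in the data X. The overall model, a multi-layer GCN for semi-supervised learning, is depicted in Figure 1.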

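3.1 Example:

We consider a two-layer GCN for semi-supervised node classification on a graph with a symmetric adjacency matrix A (binary or weighted). We first compute $\hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ in a pre-processing step. The forward model then takes the simple form:

$$Z = f(X, A) = softmax\left(\hat{A} \, ReLU\left(\hat{A} X W^0\right) W^1\right) \tag{9}$$

Here, $W^0$ is an input-to-hidden weight matrix for a hidden layer with H feature maps, and $W^1$ is a hidden-to-output weight matrix. For semi-supervised multi-class classification, we evaluate the cross-entropy error over all labeled examples:

$$\mathcal{L} = -\sum_{l \in \mathcal{Y}_L} \sum_{f=1}^{F} Y_{lf} \ln Z_{lf} \tag{10}$$

where $\mathcal{Y}_L$ is the set of node indices that have labels.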
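The two-layer forward model and the cross-entropy over labeled nodes can be sketched as follows. This is a dense NumPy illustration under stated assumptions: the function names, the toy graph, the random features and weights, and the choice of labeled nodes 0 and 3 are all made up for the example and are not taken from the linked repository.

```python
import numpy as np

def normalize_adjacency(A):
    """Return A_hat = D~^{-1/2} (A + I) D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def gcn_forward(A_hat, X, W0, W1):
    """Z = softmax(A_hat ReLU(A_hat X W0) W1)."""
    hidden = np.maximum(A_hat @ X @ W0, 0.0)
    return softmax(A_hat @ hidden @ W1)

def masked_cross_entropy(Z, Y, labeled_idx):
    """Cross-entropy summed over the labeled nodes only."""
    return float(-np.sum(Y[labeled_idx] * np.log(Z[labeled_idx])))

# Toy graph: triangle 0 - 1 - 2 plus a pendant node 3 attached to node 2.
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 0), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
A_hat = normalize_adjacency(A)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))          # 3 input features per node
W0 = rng.normal(size=(3, 5))         # hidden layer with 5 feature maps
W1 = rng.normal(size=(5, 2))         # 2 output classes

Y = np.zeros((4, 2))
Y[0, 0] = 1.0                        # node 0 labeled as class 0
Y[3, 1] = 1.0                        # node 3 labeled as class 1
labeled_idx = np.array([0, 3])       # nodes 1 and 2 stay unlabeled

Z = gcn_forward(A_hat, X, W0, W1)
loss = masked_cross_entropy(Z, Y, labeled_idx)
```

The model still produces a class distribution for every node, including the unlabeled ones; only the loss is restricted to the labeled indices.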

The neural network weights $W^0$ and $W^1$ are trained using gradient descent. In this work, we perform batch gradient descent using the full dataset for every training iteration.
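Full-batch training can be illustrated with a small self-contained loop. The sketch below is an assumption-laden toy, not the training script of the linked repository: it uses hand-derived gradients, a plain gradient-descent update with a made-up learning rate of 0.2, one-hot node features, and a six-node graph made of two triangles with one labeled node each.

```python
import numpy as np

def normalize_adjacency(A):
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward(A_hat, X, W0, W1):
    pre = A_hat @ X @ W0                    # hidden pre-activations
    H = np.maximum(pre, 0.0)                # ReLU
    Z = softmax(A_hat @ H @ W1)
    return pre, H, Z

def loss_and_grads(A_hat, X, Y, labeled, W0, W1):
    pre, H, Z = forward(A_hat, X, W0, W1)
    loss = float(-np.sum(Y[labeled] * np.log(Z[labeled])))
    d_logits = np.zeros_like(Z)
    d_logits[labeled] = Z[labeled] - Y[labeled]   # softmax cross-entropy gradient
    grad_W1 = (A_hat @ H).T @ d_logits
    d_H = A_hat.T @ (d_logits @ W1.T)
    d_pre = d_H * (pre > 0)                       # back through the ReLU
    grad_W0 = (A_hat @ X).T @ d_pre
    return loss, grad_W0, grad_W1

# Two triangles (0, 1, 2) and (3, 4, 5) joined by the edge 2 - 3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
A_hat = normalize_adjacency(A)
X = np.eye(6)                                # one-hot features (an assumption)

Y = np.zeros((6, 2))
Y[0, 0] = 1.0                                # node 0 labeled as class 0
Y[5, 1] = 1.0                                # node 5 labeled as class 1
labeled = np.array([0, 5])

rng = np.random.default_rng(0)
W0 = 0.1 * rng.normal(size=(6, 4))
W1 = 0.1 * rng.normal(size=(4, 2))

losses = []
for step in range(300):
    # Every iteration uses the whole graph: one full-batch gradient step.
    loss, grad_W0, grad_W1 = loss_and_grads(A_hat, X, Y, labeled, W0, W1)
    losses.append(loss)
    W0 -= 0.2 * grad_W0
    W1 -= 0.2 * grad_W1

_, _, Z = forward(A_hat, X, W0, W1)
predictions = Z.argmax(axis=1)               # predicted class for all six nodes
```

Although only two nodes carry labels, the gradient reaches the weights through $\hat{A}$, so the features of unlabeled neighbors also shape the learned representation.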

PyTorch implementation:

1. train.py:

Data loading

2. Layer definition:
