Code Analysis of the Graph Convolutional Network (GCN), TensorFlow Version

This article is a repost (reposted 2019-09-20 10:32). The original was published on 2019-09-08 18:27:55 by yyl424525.

Category: Deep Learning

Original link: https://blog.csdn.net/yyl424525/article/details/100634211

Contents

Code analysis

Summary of issues & discussion

References

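This article walks through the source code of the TensorFlow implementation of GCN.
For the Keras version, see: https://blog.csdn.net/tszupup/article/details/89004637

Source code on GitHub: https://github.com/tkipf/gcn

Code analysis

Code structure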

├── __init__ 
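├── data      // graph data
├── inits    // shared initialization helpers
├── layers     // definition of the GCN layers
├── metrics    // computation of the evaluation metrics
├── models     // model architecture definitions
├── train    // training
└── utils    //  utility functions

The meaning of each piece of code is explained in the comments. The walkthrough below uses the Cora dataset as the example.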

`__init__.py`

# Enables true division. Without this import, "/" on integers in Python 2 performs floor division;
# with it, "/" performs true division and "//" performs floor division.
from __future__ import division
# Makes print a function: even on Python 2.x it must be called with parentheses, as in Python 3.x.
from __future__ import print_function
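
A quick check of the two operators, run under Python 3 where these semantics are the default. It is only a sketch with made-up variable names, and it also shows that "//" floors (rounds toward negative infinity) rather than truncating toward zero:

```python
# Python 3 semantics (what the __future__ import enables on Python 2).
true_quotient = 7 / 2      # true division always yields a float
floor_quotient = 7 // 2    # floor division rounds toward negative infinity
negative_floor = -7 // 2   # floors to -4; truncation toward zero would give -3

print(true_quotient, floor_quotient, negative_floor)
```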

`train.py`

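Parameters are defined through flags = tf.app.flags, so they can be set on the command line, for example: python train.py --model gcn
Three models are available: 'gcn', 'gcn_cheby' and 'dense'. dense is a two-layer MLP.
FLAGS.weight_decay (weight decay): penalizes large weights and pushes them toward smaller values, which reduces overfitting to some extent.
FLAGS.hidden1: the output_dim of the first convolution layer, which is also the input_dim of the second layer.
FLAGS.max_degree: the order K of the Chebyshev polynomial approximation.
FLAGS.dropout: guards against overfitting by randomly dropping units with the given probability.
The input dimension is input_dim=features[2][1] (1433 for Cora), i.e. the dimension of each node's feature vector.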
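
One flag not covered in the list above is early_stopping. In the training loop of train.py, training stops once the newest validation loss exceeds the mean of the early_stopping losses recorded just before it. A minimal sketch of that rule in plain Python follows; should_stop and the toy loss values are hypothetical, introduced only for illustration:

```python
def should_stop(cost_val, epoch, patience=10):
    """Return True when the newest validation loss exceeds the mean of the
    `patience` losses recorded just before it (the same test train.py uses)."""
    if epoch <= patience:
        return False
    recent = cost_val[-(patience + 1):-1]
    return cost_val[-1] > sum(recent) / len(recent)


# Hypothetical validation losses: steadily falling, then one upward jump.
losses = [1.0 - 0.05 * i for i in range(12)]   # epochs 0..11
print(should_stop(losses, epoch=11))           # loss is still improving
losses.append(0.9)                             # epoch 12 is worse than the recent mean
print(should_stop(losses, epoch=12))
```

Comparing against a running mean rather than the single previous value makes the rule tolerant of small fluctuations in the validation loss.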

# Enables true division. Without this import, "/" on integers in Python 2 performs floor division;
# with it, "/" performs true division and "//" performs floor division.
from __future__ import division
# Makes print a function: even on Python 2.x it must be called with parentheses, as in Python 3.x.
from __future__ import print_function

import time
import tensorflow as tf

from gcn.utils import *
from gcn.models import GCN, MLP

# Set random seed
seed = 123
np.random.seed(seed)
tf.set_random_seed(seed)

# Settings
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('dataset', 'cora', 'Dataset string.')  # 'cora', 'citeseer', 'pubmed'
flags.DEFINE_string('model', 'gcn', 'Model string.')  # 'gcn', 'gcn_cheby', 'dense'
flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags.DEFINE_integer('epochs', 200, 'Number of epochs to train.')
# Output dimension of the first layer
flags.DEFINE_integer('hidden1', 16, 'Number of units in hidden layer 1.')
flags.DEFINE_float('dropout', 0.5, 'Dropout rate (1 - keep probability).')

# Weight decay: guards against overfitting
# How it enters the loss (L2 regularization): self.loss += FLAGS.weight_decay * tf.nn.l2_loss(var)
flags.DEFINE_float('weight_decay', 5e-4, 'Weight for L2 loss on embedding matrix.')

flags.DEFINE_integer('early_stopping', 10, 'Tolerance for early stopping (# of epochs).')
flags.DEFINE_integer('max_degree', 3, 'Maximum Chebyshev polynomial degree.')  # order K of the Chebyshev polynomial approximation



# Load data
adj, features, y_train, y_val, y_test, train_mask, val_mask, test_mask = load_data(FLAGS.dataset)
# print(features)
# (0, 19) 1.0
# (0, 81) 1.0
# ...
# (2707, 1412) 1.0
# (2707, 1414) 1.0

# print(type(features))
# <class 'scipy.sparse.lil.lil_matrix'>

# Preprocess the feature matrix: row-normalize it and return a tuple (coords, values, shape)
features = preprocess_features(features)
# print(features)
# (array([[   0, 1274],
#        [   0, 1247],
#        [   0, 1194],
#        ...,
#        [2707,  329],
#        [2707,  186],
#        [2707,   19]], dtype=int32), array([0.11111111, 0.11111111, 0.11111111, ..., 0.07692308, 0.07692308,
#        0.07692308], dtype=float32), (2708, 1433))

# print(type(features))
# <class 'tuple'>

# print("features[1]",features[1])
# features[1] [0.11111111 0.11111111 0.11111111 ... 0.07692308 0.07692308 0.07692308]

# print("features[1].shape",features[1].shape)
# features[1].shape (49216,)
if FLAGS.model == 'gcn':
    support = [preprocess_adj(adj)]  # support is the normalized adjacency matrix (with self-loops), in tuple form
    # print("support：",support)
    # support： [(array([[0, 0],
    #                   [633, 0],
    #                   [1862, 0],
    #                   ...,
    #                   [1473, 2707],
    #                   [2706, 2707],
    #                   [2707, 2707]], dtype=int32), array([0.25, 0.25, 0.2236068, ..., 0.2, 0.2,
    #                                                       0.2]), (2708, 2708))]
    num_supports = 1
    model_func = GCN
elif FLAGS.model == 'gcn_cheby':
    support = chebyshev_polynomials(adj, FLAGS.max_degree)
    num_supports = 1 + FLAGS.max_degree
    model_func = GCN
elif FLAGS.model == 'dense':
    support = [preprocess_adj(adj)]  # Not used
    num_supports = 1
    model_func = MLP
else:
    raise ValueError('Invalid argument for model: ' + str(FLAGS.model))





# print("num_supports:",num_supports)
#num_supports: 1

# Define placeholders
placeholders = {
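    # The adjacency matrix is sparse and is fed as a (coords, values, shape) tuple, so each support
    # is declared as tf.sparse_placeholder(tf.float32), which saves memory
    'support': [tf.sparse_placeholder(tf.float32) for _ in range(num_supports)],
    # features is also sparse and fed as a (coords, values, shape) tuple, hence tf.sparse_placeholder(tf.float32); its shape is (2708, 1433)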
    # print(features[2])
    # (2708, 1433)
    'features': tf.sparse_placeholder(tf.float32, shape=tf.constant(features[2], dtype=tf.int64)),
    # print(y_train.shape[1])
    # 7
    'labels': tf.placeholder(tf.float32, shape=(None, y_train.shape[1])),
    'labels_mask': tf.placeholder(tf.int32),
    'dropout': tf.placeholder_with_default(0., shape=()),
    'num_features_nonzero': tf.placeholder(tf.int32)  # helper variable for sparse dropout
}




# Create model
# print(features[2][1])
# 1433
model = model_func(placeholders, input_dim=features[2][1], logging=True)

# print("GCN output_dim:",model.output_dim)
#GCN output_dim: 7



# Initialize session
sess = tf.Session()


# Define model evaluation function
def evaluate(features, support, labels, mask, placeholders):
    t_test = time.time()
    feed_dict_val = construct_feed_dict(features, support, labels, mask, placeholders)
    outs_val = sess.run([model.loss, model.accuracy], feed_dict=feed_dict_val)
    return outs_val[0], outs_val[1], (time.time() - t_test)


# Init variables
sess.run(tf.global_variables_initializer())

cost_val = []

# Train model
for epoch in range(FLAGS.epochs):

    t = time.time()
    # Construct feed dictionary
    feed_dict = construct_feed_dict(features, support, y_train, train_mask, placeholders)
    feed_dict.update({placeholders['dropout']: FLAGS.dropout})

    # Training step
    outs = sess.run([model.opt_op, model.loss, model.accuracy], feed_dict=feed_dict)
    # print("outs:",outs) #outs: [None, 0.57948196, 0.9642857]


    # Validation
    cost, acc, duration = evaluate(features, support, y_val, val_mask, placeholders)
    cost_val.append(cost)

    # Print results
    print("Epoch:", '%04d' % (epoch + 1), "train_loss=", "{:.5f}".format(outs[1]),
          "train_acc=", "{:.5f}".format(outs[2]), "val_loss=", "{:.5f}".format(cost),
          "val_acc=", "{:.5f}".format(acc), "time=", "{:.5f}".format(time.time() - t))

    if epoch > FLAGS.early_stopping and cost_val[-1] > np.mean(cost_val[-(FLAGS.early_stopping+1):-1]):
        print("Early stopping...")
        break

print("Optimization Finished!")

# Testing
test_cost, test_acc, test_duration = evaluate(features, support, y_test, test_mask, placeholders)
print("Test set results:", "cost=", "{:.5f}".format(test_cost),
      "accuracy=", "{:.5f}".format(test_acc), "time=", "{:.5f}".format(test_duration))

`models.py`

This file defines a base class Model and two subclasses that inherit from it, MLP and GCN.

Pay attention to how self.outputs, self.activations and self.layers are computed (see the comments).
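
The bookkeeping done in Model.build() can be sketched without TensorFlow. In this illustration run_sequential and the two stand-in "layers" are hypothetical; the point is only that activations ends up holding the input, every intermediate result, and the final output:

```python
def run_sequential(inputs, layers):
    """Feed each layer the previous output and record every intermediate value."""
    activations = [inputs]
    for layer in layers:
        hidden = layer(activations[-1])
        activations.append(hidden)
    return activations


# Two stand-in "layers" (plain functions, purely illustrative).
layers = [lambda x: x + 1, lambda x: x * 10]
activations = run_sequential(2, layers)
outputs = activations[-1]
print(activations, outputs)
```

With two layers the list has three entries, and the model output is simply its last element.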

from gcn.layers import *
from gcn.metrics import *

flags = tf.app.flags
FLAGS = flags.FLAGS


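# Model is built from Layer objects: it mainly fills self.layers and self.activations to form a sequential model,
# and __init__ also sets up the remaining attributes such as loss, accuracy, optimizer and opt_op.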
class Model(object):
    def __init__(self, **kwargs):
        allowed_kwargs = {'name', 'logging'}
        for kwarg in kwargs.keys():
            assert kwarg in allowed_kwargs, 'Invalid keyword argument: ' + kwarg
        name = kwargs.get('name')
        if not name:
            name = self.__class__.__name__.lower()
        self.name = name

        logging = kwargs.get('logging', False)
        self.logging = logging

        self.vars = {}
        self.placeholders = {}

        # As the subclasses show, _build() appends each layer to this list
        # Stores every layer
        self.layers = []
        # Stores the input of every layer, plus the output of the last layer
        self.activations = []

        self.inputs = None
        self.outputs = None

        self.loss = 0
        self.accuracy = 0
        self.optimizer = None
        self.opt_op = None

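    # The leading underscore marks this method as internal by convention (Python does not enforce it);
    # it is meant to be called from build(), and subclasses must override it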
    def _build(self):
        raise NotImplementedError

    def build(self):
        """ Wrapper for _build() """
        with tf.variable_scope(self.name):
            self._build()

        # Build sequential layer model
        self.activations.append(self.inputs)

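        # Take a two-layer GCN as an example; the inputs are the features.
        # self.activations.append(self.inputs) seeds the list with inputs, i.e. the features.
        # First layer: hidden = layer(self.activations[-1]) is the layer's output for inputs, and it is appended to activations.
        # The second layer works the same way, with hidden holding the intermediate result. In the end activations holds three elements:
        # the input of the first layer, the input of the second layer (the output of the first), and the output of the second layer.
        # Finally, self.outputs is the output of the last layer.
        for layer in self.layers:
            # Layer overrides __call__, so an instance can be called like a function; __call__ takes inputs and returns outputs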
            hidden = layer(self.activations[-1])
            self.activations.append(hidden)


        self.outputs = self.activations[-1]

        # Store model variables for easy access
        variables = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope=self.name)
        self.vars = {var.name: var for var in variables}

        # Build metrics
        self._loss()
        self._accuracy()

        self.opt_op = self.optimizer.minimize(self.loss)

    def predict(self):
        pass

    def _loss(self):
        raise NotImplementedError

    def _accuracy(self):
        raise NotImplementedError

    def save(self, sess=None):
        if not sess:
            raise AttributeError("TensorFlow session not provided.")
        saver = tf.train.Saver(self.vars)
        save_path = saver.save(sess, "tmp/%s.ckpt" % self.name)
        print("Model saved in file: %s" % save_path)

    def load(self, sess=None):
        if not sess:
            raise AttributeError("TensorFlow session not provided.")
        saver = tf.train.Saver(self.vars)
        save_path = "tmp/%s.ckpt" % self.name
        saver.restore(sess, save_path)
        print("Model restored from file: %s" % save_path)


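# Multilayer perceptron that inherits from Model. It mainly overrides the methods the base class leaves unimplemented.
# The loss is the L2 weight-decay term of the first layer plus, because this is semi-supervised learning,
# the masked cross-entropy masked_softmax_cross_entropy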
class MLP(Model):
    def __init__(self, placeholders, input_dim, **kwargs):
        super(MLP, self).__init__(**kwargs)

        self.inputs = placeholders['features']
        self.input_dim = input_dim
        # self.input_dim = self.inputs.get_shape().as_list()[1]  # To be supported in future Tensorflow versions
        self.output_dim = placeholders['labels'].get_shape().as_list()[1]
        self.placeholders = placeholders  # dict of key/value pairs

        self.optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)

        self.build()

    def _loss(self):
        # Weight decay loss (the regularization term)
        for var in self.layers[0].vars.values():
            self.loss += FLAGS.weight_decay * tf.nn.l2_loss(var)

        # Cross entropy error (the cross-entropy loss)
        self.loss += masked_softmax_cross_entropy(self.outputs, self.placeholders['labels'],
                                                  self.placeholders['labels_mask'])

    def _accuracy(self):
        self.accuracy = masked_accuracy(self.outputs, self.placeholders['labels'],
                                        self.placeholders['labels_mask'])

    def _build(self):
        self.layers.append(Dense(input_dim=self.input_dim,
                                 output_dim=FLAGS.hidden1,
                                 placeholders=self.placeholders,
                                 act=tf.nn.relu,
                                 dropout=True,
                                 sparse_inputs=True,
                                 logging=self.logging))

        self.layers.append(Dense(input_dim=FLAGS.hidden1,
                                 output_dim=self.output_dim,
                                 placeholders=self.placeholders,
                                 act=lambda x: x,
                                 dropout=True,
                                 logging=self.logging))

    def predict(self):
        return tf.nn.softmax(self.outputs)

# Graph convolutional model GCN, inheriting from Model
class GCN(Model):
    def __init__(self, placeholders, input_dim, **kwargs):
        super(GCN, self).__init__(**kwargs)

        self.inputs = placeholders['features']
        self.input_dim = input_dim
        # self.input_dim = self.inputs.get_shape().as_list()[1]  # To be supported in future Tensorflow versions
        self.output_dim = placeholders['labels'].get_shape().as_list()[1]
        self.placeholders = placeholders

        self.optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)

        self.build()

    # Loss computation
    def _loss(self):
        # Weight decay loss
        for var in self.layers[0].vars.values():
            self.loss += FLAGS.weight_decay * tf.nn.l2_loss(var)

        # Cross entropy error
        self.loss += masked_softmax_cross_entropy(self.outputs, self.placeholders['labels'],
                                                  self.placeholders['labels_mask'])
    # Accuracy computation
    def _accuracy(self):
        self.accuracy = masked_accuracy(self.outputs, self.placeholders['labels'],
                                        self.placeholders['labels_mask'])
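    # Build the model: a two-layer GCN
    def _build(self):
        # Input dimension of the first layer: input_dim=1433
        # Output dimension of the first layer: output_dim=FLAGS.hidden1=16
        # Activation of the first layer: relu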
        self.layers.append(GraphConvolution(input_dim=self.input_dim,
                                            output_dim=FLAGS.hidden1,
                                            placeholders=self.placeholders,
                                            act=tf.nn.relu,
                                            dropout=True,
                                            sparse_inputs=True,
                                            logging=self.logging))

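        # Input dimension of the second layer equals the output dimension of the first: input_dim=FLAGS.hidden1=16
        # Output dimension of the second layer: output_dim=placeholders['labels'].get_shape().as_list()[1]=7
        # Activation of the second layer: lambda x: x, i.e. no activation function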
        self.layers.append(GraphConvolution(input_dim=FLAGS.hidden1,
                                            output_dim=self.output_dim,
                                            placeholders=self.placeholders,
                                            act=lambda x: x,
                                            dropout=True,
                                            logging=self.logging))
    # Model prediction
    def predict(self):
        # Each row of the returned tensor sums to 1
        return tf.nn.softmax(self.outputs)
    #test.py
    #tf.enable_eager_execution()
    # ones = tf.ones(shape=[2,3])
    # print(ones)
    # temp3 = tf.nn.softmax(ones)
    # print(temp3)
    # tf.Tensor(
    # [[0.33333334 0.33333334 0.33333334]
    #  [0.33333334 0.33333334 0.33333334]], shape=(2, 3), dtype=float32)
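
To see how _loss() assembles its two terms, here is a dependency-free sketch. masked_softmax_cross_entropy lives in metrics.py, which is not reproduced in this section, so the sketch assumes it averages the cross-entropy over the nodes whose mask entry is True; it also assumes the L2 term is half the sum of squared weights, as tf.nn.l2_loss computes. All helper names and toy values are illustrative:

```python
import math


def l2_loss(weights):
    """Half the sum of squared entries, as tf.nn.l2_loss computes."""
    return sum(w * w for row in weights for w in row) / 2.0


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def masked_cross_entropy(logits, labels, mask):
    """Mean cross-entropy over the rows whose mask entry is True."""
    losses = []
    for row_logits, row_labels, keep in zip(logits, labels, mask):
        if keep:
            probs = softmax(row_logits)
            losses.append(-sum(y * math.log(p) for y, p in zip(row_labels, probs)))
    return sum(losses) / len(losses)


def total_loss(first_layer_weights, logits, labels, mask, weight_decay=5e-4):
    """Weight decay on the first layer plus the masked cross-entropy."""
    return (weight_decay * l2_loss(first_layer_weights)
            + masked_cross_entropy(logits, labels, mask))


# Hypothetical toy values: 3 nodes, 2 classes, node 2 is unlabeled.
weights = [[1.0, -2.0], [0.5, 0.0]]
logits = [[2.0, 0.0], [0.0, 0.0], [5.0, -5.0]]
labels = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
mask = [True, True, False]
print(total_loss(weights, logits, labels, mask))
```

Because node 2 is masked out, its logits have no effect on the loss, which is what lets the model train on a graph where most nodes are unlabeled.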

`layers.py`

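Defines the base class Layer.
Properties: name (String) => defines the variable scope; logging (Boolean) => turns TensorFlow histogram logging on or off.
Methods: __init__() (initialization), _call() (defines the computation), __call__() (wraps _call()), _log_vars().
Defines the Dense class, which inherits from Layer.
Defines the GraphConvolution class, which inherits from Layer.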
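
The split between __call__ and _call can be illustrated with a stripped-down stand-in. This is a sketch, not the repository's class: the log list replaces the TensorFlow histogram summaries, and Scale is a hypothetical subclass:

```python
class Layer:
    """Minimal stand-in for the base class: __call__ wraps _call."""

    def __init__(self, name):
        self.name = name
        self.log = []          # stands in for the TensorFlow histogram summaries

    def _call(self, inputs):
        return inputs          # identity by default; subclasses override this

    def __call__(self, inputs):
        outputs = self._call(inputs)
        self.log.append((self.name, inputs, outputs))
        return outputs


class Scale(Layer):
    """Hypothetical subclass that only overrides the computation."""

    def _call(self, inputs):
        return [2 * v for v in inputs]


layer = Scale("scale_1")
print(layer([1, 2, 3]), layer.log)
```

Subclasses only supply the computation in _call, while the shared wrapper handles naming and logging in one place.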

from gcn.inits import *
import tensorflow as tf

flags = tf.app.flags
FLAGS = flags.FLAGS

# global unique layer ID dictionary for layer name assignment
_LAYER_UIDS = {}


def get_layer_uid(layer_name=''):
    """Helper function, assigns unique layer IDs."""
    if layer_name not in _LAYER_UIDS:
        _LAYER_UIDS[layer_name] = 1
        return 1
    else:
        _LAYER_UIDS[layer_name] += 1
        return _LAYER_UIDS[layer_name]


# Dropout for sparse matrices
def sparse_dropout(x, keep_prob, noise_shape):
    """Dropout for sparse tensors."""
    random_tensor = keep_prob
    random_tensor += tf.random_uniform(noise_shape)
    dropout_mask = tf.cast(tf.floor(random_tensor), dtype=tf.bool)
    pre_out = tf.sparse_retain(x, dropout_mask)
    return pre_out * (1./keep_prob)


def dot(x, y, sparse=False):
    """Wrapper for tf.matmul (sparse vs dense)."""
    if sparse:
        res = tf.sparse_tensor_dense_matmul(x, y)
    else:
        res = tf.matmul(x, y)
    return res


# Base Layer class: it mainly assigns a name to each layer and uses a flag to decide whether to log
class Layer(object):
    """Base layer class. Defines basic API for all layer objects.
    Implementation inspired by keras (http://keras.io).

    # Properties
        name: String, defines the variable scope of the layer.
        logging: Boolean, switches Tensorflow histogram logging on/off

    # Methods
        _call(inputs): Defines computation graph of layer
            (i.e. takes input, returns output)
        __call__(inputs): Wrapper for _call()
        _log_vars(): Log all variables
    """

    def __init__(self, **kwargs):
        allowed_kwargs = {'name', 'logging'}
        for kwarg in kwargs.keys():
            assert kwarg in allowed_kwargs, 'Invalid keyword argument: ' + kwarg
        name = kwargs.get('name')
        if not name:
            layer = self.__class__.__name__.lower()
            name = layer + '_' + str(get_layer_uid(layer))
        self.name = name
        self.vars = {}
        logging = kwargs.get('logging', False)
        self.logging = logging
        self.sparse_inputs = False

    def _call(self, inputs):
        return inputs

    # __call__ makes instances of Layer callable
    def __call__(self, inputs):
        with tf.name_scope(self.name):
            if self.logging and not self.sparse_inputs:
                tf.summary.histogram(self.name + '/inputs', inputs)
            outputs = self._call(inputs)
            if self.logging:
                tf.summary.histogram(self.name + '/outputs', outputs)
            return outputs

    def _log_vars(self):
        for var in self.vars:
            tf.summary.histogram(self.name + '/vars/' + var, self.vars[var])

# Dense (fully connected) layer, derived from Layer
class Dense(Layer):
    """Dense layer."""
    def __init__(self, input_dim, output_dim, placeholders, dropout=0., sparse_inputs=False,
                 act=tf.nn.relu, bias=False, featureless=False, **kwargs):
        super(Dense, self).__init__(**kwargs)

        if dropout:
            self.dropout = placeholders['dropout']
        else:
            self.dropout = 0.

        self.act = act   # activation function
        self.sparse_inputs = sparse_inputs  # whether the input is sparse
        self.featureless = featureless  # whether the input comes without a feature matrix
        self.bias = bias  # whether to use a bias term

        # helper variable for sparse dropout
        self.num_features_nonzero = placeholders['num_features_nonzero']

        with tf.variable_scope(self.name + '_vars'):
            self.vars['weights'] = glorot([input_dim, output_dim],
                                          name='weights')
            if self.bias:
                self.vars['bias'] = zeros([output_dim], name='bias')

        if self.logging:
            self._log_vars()

    # Overrides _call; sparse inputs go through sparse_dropout() instead of tf.nn.dropout
    def _call(self, inputs):
        x = inputs

        # dropout
        if self.sparse_inputs:
            x = sparse_dropout(x, 1-self.dropout, self.num_features_nonzero)
        else:
            x = tf.nn.dropout(x, 1-self.dropout)

        # transform
        output = dot(x, self.vars['weights'], sparse=self.sparse_inputs)

        # bias
        if self.bias:
            output += self.vars['bias']

        return self.act(output)


# Graph convolution layer, derived from Layer. It differs from Dense only in _call and __init__
# (which stores self.support = placeholders['support'] and creates one weight matrix per support)
class GraphConvolution(Layer):
    """Graph convolution layer."""
    def __init__(self, input_dim, output_dim, placeholders, dropout=0.,
                 sparse_inputs=False, act=tf.nn.relu, bias=False,
                 featureless=False, **kwargs):
        super(GraphConvolution, self).__init__(**kwargs)

        if dropout:
            self.dropout = placeholders['dropout']
        else:
            self.dropout = 0.

        self.act = act
        self.support = placeholders['support']
        self.sparse_inputs = sparse_inputs
        self.featureless = featureless
        self.bias = bias

        # helper variable for sparse dropout
        self.num_features_nonzero = placeholders['num_features_nonzero']

        # Define the variables, mainly by calling the glorot function from inits.py
        with tf.variable_scope(self.name + '_vars'):
            for i in range(len(self.support)):
                self.vars['weights_' + str(i)] = glorot([input_dim, output_dim],
                                                        name='weights_' + str(i))
            if self.bias:
                self.vars['bias'] = zeros([output_dim], name='bias')

        if self.logging:
            self._log_vars()

    def _call(self, inputs):
        x = inputs

        # dropout
        if self.sparse_inputs:
            x = sparse_dropout(x, 1-self.dropout, self.num_features_nonzero)
        else:
            x = tf.nn.dropout(x, 1-self.dropout)

        # convolve
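        # Implementation of the convolution, following the formula from the paper:
        # Z = \tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}X\Theta
        supports = list()  # each support is a transformed version of the adjacency matrix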
        for i in range(len(self.support)):
            if not self.featureless:
                pre_sup = dot(x, self.vars['weights_' + str(i)],
                              sparse=self.sparse_inputs)
            else:
                pre_sup = self.vars['weights_' + str(i)]
            support = dot(self.support[i], pre_sup, sparse=True)
            supports.append(support)
        output = tf.add_n(supports)

        # bias
        if self.bias:
            output += self.vars['bias']

        return self.act(output)
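
Stripped of dropout, bias and sparse tensors, the forward pass of one graph convolution layer with a single support reduces to two matrix products. The following is a minimal dense sketch in plain Python; matmul, graph_convolution and the toy matrices are assumptions made for illustration, not code from the repository:

```python
def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def graph_convolution(support, features, weights, act=lambda v: max(v, 0.0)):
    """One layer: act(support @ (features @ weights)), applied elementwise."""
    pre_sup = matmul(features, weights)      # X W, as in dot(x, weights)
    output = matmul(support, pre_sup)        # A_hat (X W), as in dot(support, pre_sup)
    return [[act(v) for v in row] for row in output]


# Hypothetical 2-node graph whose normalized adjacency averages the two nodes.
support = [[0.5, 0.5], [0.5, 0.5]]
features = [[1.0, 0.0], [0.0, 1.0]]
weights = [[2.0], [-4.0]]
print(graph_convolution(support, features, weights))
print(graph_convolution(support, features, weights, act=lambda v: v))
```

Multiplying by the weights first keeps the intermediate matrix small (nodes by output_dim), which matters when the feature dimension is much larger than the hidden size. With several supports, as in the gcn_cheby model, the per-support results are summed.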

`utils.py`

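LIL (row-based list-of-lists format)

A sparse matrix is stored as two arrays of lists, data and rows:

.data: data[k] is the list of nonzero values in row k. If every element of the row is 0, it is an empty list.
.rows: rows[k] is the list of column indices of the nonzero elements in row k.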

import numpy as np
import scipy.sparse as sp

A=np.array([[1,0,2,0],[0,0,0,0],[3,0,0,0],[1,0,0,4]])
AS=sp.lil_matrix(A)

print(AS.data)
# [list([1, 2]) list([]) list([3]) list([1, 4])]
print(AS.rows)
# [list([0, 2]) list([]) list([0]) list([0, 3])]
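
The code in utils.py repeatedly converts sparse matrices into a (coords, values, shape) tuple. Starting from the rows and data lists printed above, that conversion can be sketched in plain Python; lil_to_tuple is a hypothetical helper, whereas the real sparse_to_tuple goes through SciPy's COO format:

```python
def lil_to_tuple(rows, data, shape):
    """Flatten LIL-style rows/data lists into (coords, values, shape)."""
    coords, values = [], []
    for r, (cols, vals) in enumerate(zip(rows, data)):
        for c, v in zip(cols, vals):
            coords.append((r, c))
            values.append(v)
    return coords, values, shape


# The rows/data lists printed above for the 4x4 matrix A.
rows = [[0, 2], [], [0], [0, 3]]
data = [[1, 2], [], [3], [1, 4]]
coords, values, shape = lil_to_tuple(rows, data, (4, 4))
print(coords, values, shape)
```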

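Dimensions of the loaded data (using the Cora dataset as the example)

adj (adjacency matrix): sparse, so it is kept as a SciPy sparse matrix (nx.adjacency_matrix returns it in CSR format in the networkx versions this code targets, not LIL); its shape is (2708, 2708)
features (feature matrix): each node's feature vector is also sparse and is stored in LIL format; features.shape: (2708, 1433)
labels: ally and ty stacked together; labels.shape: (2708, 7)
train_mask, val_mask, test_mask: boolean vectors of shape (2708,). train_mask is True on the indices [0, 140) and False elsewhere; val_mask is True on [140, 640) and False elsewhere; test_mask is True on [1708, 2707] (inclusive) and False elsewhere
y_train, y_val, y_test: each has shape (2708, 7). y_train holds the rows of labels where train_mask is True and zeros everywhere else; y_val does the same for val_mask and y_test for test_mask
preprocess_features: row-normalizes the feature matrix and returns a tuple of the form (coords, values, shape)
preprocess_adj: adds self-loops to the adjacency matrix, normalizes it symmetrically, stores it in COO format, and returns a tuple of the form (coords, values, shape)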
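
The two normalization steps described above can be checked by hand on a tiny example. This sketch uses dense lists instead of SciPy sparse matrices; row_normalize, preprocess_adj_dense and the 3-node path graph are illustrative assumptions:

```python
import math


def row_normalize(features):
    """Divide every row by its sum; all-zero rows are left as zeros."""
    result = []
    for row in features:
        total = sum(row)
        result.append([v / total if total else 0.0 for v in row])
    return result


def preprocess_adj_dense(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix (list of rows)."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_tilde]
    d_inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    return [[d_inv_sqrt[i] * a_tilde[i][j] * d_inv_sqrt[j] for j in range(n)] for i in range(n)]


# Hypothetical path graph 0 - 1 - 2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
for row in preprocess_adj_dense(adj):
    print([round(v, 4) for v in row])
print(row_normalize([[1.0, 3.0], [0.0, 0.0]]))
```

After adding self-loops the degrees are 2, 3 and 2, so for example the entry linking nodes 0 and 1 becomes 1/sqrt(2*3).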

import numpy as np
import pickle as pkl
import networkx as nx
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh  # public import path; scipy.sparse.linalg.eigen.arpack is not importable in newer SciPy releases
import sys


def parse_index_file(filename):
    """Parse index file."""
    index = []
    for line in open(filename):
        index.append(int(line.strip()))
        print(int(line.strip()))
    print("min", min(index))
    return index


def sample_mask(idx, l):
    """Create mask."""
    mask = np.zeros(l)
    mask[idx] = 1
    return np.array(mask, dtype=bool)  # built-in bool; the np.bool alias was deprecated in NumPy 1.20 and removed in 1.24


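# Data loading: this preprocessing step returns the adjacency matrix, the features, the label matrices
# for the training, validation and test sets, and the corresponding masks.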
def load_data(dataset_str):
    """
    Loads input data from gcn/data directory

    ind.dataset_str.x => the feature vectors of the training instances as scipy.sparse.csr.csr_matrix object;
    ind.dataset_str.tx => the feature vectors of the test instances as scipy.sparse.csr.csr_matrix object;
    ind.dataset_str.allx => the feature vectors of both labeled and unlabeled training instances
        (a superset of ind.dataset_str.x) as scipy.sparse.csr.csr_matrix object;
    ind.dataset_str.y => the one-hot labels of the labeled training instances as numpy.ndarray object;
    ind.dataset_str.ty => the one-hot labels of the test instances as numpy.ndarray object;
    ind.dataset_str.ally => the labels for instances in ind.dataset_str.allx as numpy.ndarray object;
    ind.dataset_str.graph => a dict in the format {index: [index_of_neighbor_nodes]} as collections.defaultdict
        object;
    ind.dataset_str.test.index => the indices of test instances in graph, for the inductive setting as list object.

    All objects above must be saved using python pickle module.

    :param dataset_str: Dataset name
    :return: All data input files loaded (as well the training/test data).
    """
    names = ['x', 'y', 'tx', 'ty', 'allx', 'ally', 'graph']
    objects = []
    for i in range(len(names)):
        with open("data/ind.{}.{}".format(dataset_str, names[i]), 'rb') as f:
            if sys.version_info > (3, 0):  # get python version
                objects.append(pkl.load(f, encoding='latin1'))
            else:
                objects.append(pkl.load(f))

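    # x.shape:(140, 1433); y.shape:(140, 7); tx.shape:(1000, 1433); ty.shape:(1000, 7);
    # allx.shape:(1708, 1433); ally.shape:(1708, 7)
    x, y, tx, ty, allx, ally, graph = tuple(objects)  # unpack as a tuple

    # Training set (labeled instances)
    # print(x[0][0],x.shape,type(x))  ## x is a sparse matrix that records the positions of the 1s; 140 instances, each with a 1433-dimensional feature vector  (140,1433)
    # print(y[0],y.shape)   ## y holds the label vectors: 7 classes, 140 instances (140,7)

    ## Test set
    # print(tx[0][0],tx.shape,type(tx))  ## tx is a sparse matrix: 1000 instances, each with a 1433-dimensional feature vector  (1000,1433)
    # print(ty[0],ty.shape)   ## ty holds the label vectors: 7 classes, 1000 instances (1000,7)

    ## allx and ally have the same form as above
    # print(allx[0][0],allx.shape,type(allx))  ## allx is a sparse matrix: 1708 instances, each with a 1433-dimensional feature vector  (1708,1433)
    # print(ally[0],ally.shape)   ## ally holds the label vectors: 7 classes, 1708 instances (1708,7)

    ## graph is a dict; the full graph has 2708 nodes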
    # for i in graph:
    #     print(i,graph[i])

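    # Indices of the test instances, in shuffled order
    test_idx_reorder = parse_index_file("data/ind.{}.test.index".format(dataset_str))
    # print(test_idx_reorder)
    # [2488, 2644, 3261, 2804, 3176, 2432, 3310, 2410, 2812,...]  (this sample is from citeseer, whose test indices span 2312..3326)

    # Sorted in ascending order, e.g. [1708, 1709, 1710, ...] for Cora
    test_idx_range = np.sort(test_idx_reorder)

    # Handle the isolated nodes in citeseer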
    if dataset_str == 'citeseer':
        # Fix citeseer dataset (there are some isolated nodes in the graph)
        # Find isolated nodes, add them as zero-vecs into the right position

        test_idx_range_full = range(min(test_idx_reorder), max(test_idx_reorder) + 1)
        # print("test_idx_range_full.length",len(test_idx_range_full))
        # test_idx_range_full.length 1015

        # Create an empty LIL sparse matrix; for citeseer tx_extended.shape=(1015, 3703)
        tx_extended = sp.lil_matrix((len(test_idx_range_full), x.shape[1]))
        # test_idx_range_full covers the indices
        # [2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325
        # ....
        # 3321 3322 3323 3324 3325 3326]

        # test_idx_range - min(test_idx_range): subtract min(test_idx_range) from every element, so that the indices in test_idx_range start from 0
        tx_extended[test_idx_range - min(test_idx_range), :] = tx
        # print(tx_extended.shape) #(1015, 3703)

        # print(tx_extended)
        # (0, 19) 1.0
        # (0, 21) 1.0
        # (0, 169) 1.0
        # (0, 170) 1.0
        # (0, 425) 1.0
        #  ...
        # (1014, 3243) 1.0
        # (1014, 3351) 1.0
        # (1014, 3472) 1.0

        tx = tx_extended
        # print(tx.shape)
        # (1015, 3703)
        # 15 rows (997, 994, 993, 980, 938, ...) are all zeros

        ty_extended = np.zeros((len(test_idx_range_full), y.shape[1]))
        ty_extended[test_idx_range - min(test_idx_range), :] = ty
        ty = ty_extended
        # for i in range(ty.shape[0]):
        #     print(i," ",ty[i])
        #     # 980 [0. 0. 0. 0. 0. 0.]
        #     # 994 [0. 0. 0. 0. 0. 0.]
        #     # 993 [0. 0. 0. 0. 0. 0.]

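    # Stack allx and tx and convert to LIL to form the feature matrix of the whole graph
    features = sp.vstack((allx, tx)).tolil()

    # Put the test rows back in their original order so that the features line up with the adjacency matrix.
    # The test instances are stored shuffled; without this step features would be assigned to the wrong nodes.
    features[test_idx_reorder, :] = features[test_idx_range, :]
    # print("features.shape:",features.shape)
    # features.shape: (2708, 1433)

    # The adjacency matrix is a SciPy sparse matrix of shape (2708, 2708)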
    adj = nx.adjacency_matrix(nx.from_dict_of_lists(graph))

    # labels.shape:(2708, 7)
    labels = np.vstack((ally, ty))
    labels[test_idx_reorder, :] = labels[test_idx_range, :]

    # len(list(idx_val)) + len(list(idx_train)) + len(idx_test) =  1640
    idx_test = test_idx_range.tolist()
    # print(idx_test)
    # [1708, 1709, 1710, 1711, 1712, 1713,...,2705, 2706, 2707]
    # print(len(idx_test))
    # 1000

    idx_train = range(len(y))
    # print(idx_train)
    # range(0, 140)

    idx_val = range(len(y), len(y) + 500)
    # print(idx_val,len(idx_val))
    # range(140, 640) 500

    # Training mask: True on idx_train = [0, 140), False elsewhere
    train_mask = sample_mask(idx_train, labels.shape[0])  # labels.shape[0]: 2708
    # print(train_mask,train_mask.shape)
    # [True  True  True... False False False]  (2708,)

    # Validation mask: True on idx_val = [140, 640), False elsewhere
    val_mask = sample_mask(idx_val, labels.shape[0])  # labels.shape[0]: 2708

    # Test mask: True on idx_test = [1708, 2707] (inclusive), False elsewhere
    test_mask = sample_mask(idx_test, labels.shape[0])

    y_train = np.zeros(labels.shape)
    y_val = np.zeros(labels.shape)
    y_test = np.zeros(labels.shape)
    # print(y_train.shape," ",y_test.shape," ",y_val.shape)
    # (2708, 7)(2708, 7)(2708, 7)

    # Copy the label rows at the positions where the mask is True
    y_train[train_mask, :] = labels[train_mask, :]

    y_val[val_mask, :] = labels[val_mask, :]
    y_test[test_mask, :] = labels[test_mask, :]

    return adj, features, y_train, y_val, y_test, train_mask, val_mask, test_mask


# Convert the sparse matrix sparse_mx to tuple format and return it
def sparse_to_tuple(sparse_mx):
    """Convert sparse matrix to tuple representation."""

    def to_tuple(mx):
        if not sp.isspmatrix_coo(mx):
            mx = mx.tocoo()
        coords = np.vstack((mx.row, mx.col)).transpose()
        values = mx.data
        shape = mx.shape
        return coords, values, shape

    if isinstance(sparse_mx, list):
        for i in range(len(sparse_mx)):
            sparse_mx[i] = to_tuple(sparse_mx[i])
    else:
        sparse_mx = to_tuple(sparse_mx)

    return sparse_mx


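# Feature preprocessing: row-normalize the feature matrix and return a tuple of the form (coords, values, shape)
# Every element of a row is divided by the row sum, so each row sums to 1 afterwards
# This puts every node's feature vector on a common scale, so nodes with many nonzero features do not dominate;
# the imbalance in node degrees is handled separately, by normalizing the adjacency matrix in preprocess_adj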
def preprocess_features(features):
    """Row-normalize feature matrix and convert to tuple representation"""
    print("preprocess_features")
    # >>> b = [[1.0, 3], [2, 4], [3, 5]]
    # >>> b = np.array(b)
    # >>> b
    # array([[1., 3.],
    #        [2., 4.],
    #        [3., 5.]])
    # >>> np.array(b.sum(1))
    # array([4., 6., 8.])
    # >>> c = np.array(b.sum(1))
    # >>> np.power(c, -1)
    # array([0.25, 0.16666667, 0.125])
    # >>> np.power(c, -1).flatten()
    # array([0.25, 0.16666667, 0.125])
    # >>> r_inv = np.power(c, -1).flatten()
    # >>> import scipy.sparse as sp
    # >>> r_mat_inv = sp.diags(r_inv)
    # >>> r_mat_inv
    # <3x3 sparse matrix of type '<class 'numpy.float64'>'
    #     with 3 stored elements (1 diagonals) in DIAgonal format>
    # >>> r_mat_inv.toarray()
    # array([[0.25, 0., 0.],
    #        [0., 0.16666667, 0.],
    #        [0., 0., 0.125]])
    # >>> f = r_mat_inv.dot(b)
    # >>> f
    # array([[0.25, 0.75],
    #        [0.33333333, 0.66666667],
    #        [0.375, 0.625]])

    # a.sum() sums all elements of the matrix; a.sum(axis=0) sums each column; a.sum(axis=1) sums each row
    rowsum = np.array(features.sum(1))
    r_inv = np.power(rowsum, -1).flatten()
    # print("r_inv:", r_inv)
    # r_inv: [0.11111111 0.04347826 0.05263158... 0.05555556 0.07142857 0.07692308]
    # np.isinf(ndarray) returns a boolean array marking the infinite entries; rows whose sum is 0 give inf here and are reset to 0
    r_inv[np.isinf(r_inv)] = 0.
    # sp.diags builds a sparse diagonal matrix
    r_mat_inv = sp.diags(r_inv)
    # dot performs matrix multiplication
    features = r_mat_inv.dot(features)
    return sparse_to_tuple(features)


# Symmetrically normalize the adjacency matrix adj and return it in COO format
def normalize_adj(adj):
    """Symmetrically normalize adjacency matrix."""
    adj = sp.coo_matrix(adj)
    rowsum = np.array(adj.sum(1))
    d_inv_sqrt = np.power(rowsum, -0.5).flatten()
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.
    d_mat_inv_sqrt = sp.diags(d_inv_sqrt)
    return adj.dot(d_mat_inv_sqrt).transpose().dot(d_mat_inv_sqrt).tocoo()


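# Add self-loops to the adjacency matrix, normalize it symmetrically, store it in COO format, and return it as a tuple
def preprocess_adj(adj):
    """Preprocessing of adjacency matrix for simple GCN model and conversion to tuple representation."""
    # Add self-loops, then normalize symmetrically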
    adj_normalized = normalize_adj(adj + sp.eye(adj.shape[0]))
    return sparse_to_tuple(adj_normalized)


# Build and return the feed dictionary.
# labels and labels_mask are passed concrete values, for example:
# labels=y_train, labels_mask=train_mask;
# labels=y_val, labels_mask=val_mask;
# labels=y_test, labels_mask=test_mask.
def construct_feed_dict(features, support, labels, labels_mask, placeholders):
    """Construct feed dictionary."""
    feed_dict = dict()
    feed_dict.update({placeholders['labels']: labels})
    feed_dict.update({placeholders['labels_mask']: labels_mask})
    feed_dict.update({placeholders['features']: features})
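    # The support matrices (normalized adjacency) are sparse, so each one is fed as a (coords, values, shape) tuple
    # into a tf.sparse_placeholder(tf.float32), which saves memory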
    feed_dict.update({placeholders['support'][i]: support[i] for i in range(len(support))})
    # print(features)
    # (array([[   0, 1274],
    #        [   0, 1247],
    #        [   0, 1194],
    #        ...,
    #        [2707,  329],
    #        [2707,  186],
    #        [2707,   19]], dtype=int32), array([0.11111111, 0.11111111, 0.11111111, ..., 0.07692308, 0.07692308,
    #        0.07692308], dtype=float32), (2708, 1433))

    # print(type(features))
    # <class 'tuple'>

    # print("features[1]",features[1])
    # features[1] [0.11111111 0.11111111 0.11111111 ... 0.07692308 0.07692308 0.07692308]

    # print("features[1].shape",features[1].shape)
    # features[1].shape (49216,)
    # 49216 is the number of nonzero entries in the feature matrix once stored in COO form
    # (only 49216 of the 2708*1433 entries are nonzero, about 1.3%)
    feed_dict.update({placeholders['num_features_nonzero']: features[1].shape})
    return feed_dict


# Chebyshev polynomial approximation: compute the Chebyshev matrices up to order k
def chebyshev_polynomials(adj, k):
    """Calculate Chebyshev polynomials up to order k. Return a list of sparse matrices (tuple representation)."""
    print("Calculating Chebyshev polynomials up to order {}...".format(k))

    adj_normalized = normalize_adj(adj)  # D^{-1/2} A D^{-1/2}
    laplacian = sp.eye(adj.shape[0]) - adj_normalized  # L = I_N - D^{-1/2} A D^{-1/2}
    largest_eigval, _ = eigsh(laplacian, 1, which='LM')  # \lambda_{max}
    scaled_laplacian = (2. / largest_eigval[0]) * laplacian - sp.eye(adj.shape[0])  # 2/\lambda_{max}L-I_N

    # Seed t_k with the first two Chebyshev terms: T_0 = I_N (from T_0(x) = 1) and T_1 = scaled Laplacian (from T_1(x) = x)
    t_k = list()
    t_k.append(sp.eye(adj.shape[0]))
    t_k.append(scaled_laplacian)

    # Apply the recurrence T_k(x) = 2x T_{k-1}(x) - T_{k-2}(x) to compute T_2 through T_k
    def chebyshev_recurrence(t_k_minus_one, t_k_minus_two, scaled_lap):
        s_lap = sp.csr_matrix(scaled_lap, copy=True)
        return 2 * s_lap.dot(t_k_minus_one) - t_k_minus_two

    for i in range(2, k + 1):
        t_k.append(chebyshev_recurrence(t_k[-1], t_k[-2], scaled_laplacian))

    return sparse_to_tuple(t_k)
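
The recurrence used in chebyshev_polynomials can be checked on a tiny graph. The following is a minimal sketch, not the repository's code: it uses dense NumPy arrays instead of scipy.sparse, swaps eigsh for np.linalg.eigvalsh, assumes k >= 1, and the name chebyshev_terms is made up for illustration.

```python
import numpy as np


def chebyshev_terms(adj, k):
    """Return [T_0, ..., T_k] evaluated at the rescaled Laplacian (dense, k >= 1)."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    # D^{-1/2}; isolated nodes (degree 0) get 0 instead of inf
    d_inv_sqrt = np.zeros(n)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    adj_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

    # L = I - D^{-1/2} A D^{-1/2}, then rescale its spectrum into [-1, 1]
    laplacian = np.eye(n) - adj_norm
    lam_max = np.linalg.eigvalsh(laplacian).max()
    scaled = (2.0 / lam_max) * laplacian - np.eye(n)

    # T_0 = I, T_1 = scaled Laplacian, T_k = 2 * scaled * T_{k-1} - T_{k-2}
    terms = [np.eye(n), scaled]
    for _ in range(2, k + 1):
        terms.append(2.0 * scaled @ terms[-1] - terms[-2])
    return terms


# Path graph 0 - 1 - 2 (hypothetical toy input)
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
t = chebyshev_terms(adj, 3)
s = t[1]
print(len(t))
# Compare against the closed forms T_2(x) = 2x^2 - 1 and T_3(x) = 4x^3 - 3x
print(np.allclose(t[2], 2 * s @ s - np.eye(3)))
print(np.allclose(t[3], 4 * s @ s @ s - 3 * s))
```

Both closed-form comparisons should hold, which is a quick way to confirm that the recurrence references T_{k-1} and T_{k-2} rather than T_k on both sides.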

# load_data('cora')
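
To see what preprocess_features and preprocess_adj compute, here is a small sketch on a three-node graph. It is an illustration under assumptions: it works on dense NumPy arrays rather than scipy.sparse matrices, skips the tuple conversion, and the function names row_normalize and sym_normalize_with_self_loops are hypothetical.

```python
import numpy as np


def row_normalize(x):
    """Divide each row by its sum; all-zero rows stay zero."""
    rowsum = x.sum(axis=1)
    r_inv = np.zeros_like(rowsum, dtype=float)
    nonzero = rowsum != 0
    r_inv[nonzero] = 1.0 / rowsum[nonzero]
    return r_inv[:, None] * x


def sym_normalize_with_self_loops(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    a_tilde = adj + np.eye(adj.shape[0])
    deg = a_tilde.sum(axis=1)
    d_inv_sqrt = deg ** -0.5  # deg >= 1 because every node has a self-loop
    return d_inv_sqrt[:, None] * a_tilde * d_inv_sqrt[None, :]


# Path graph 0 - 1 - 2 with a made-up feature matrix; the last row is all zeros
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
features = np.array([[1., 3.],
                     [2., 4.],
                     [0., 0.]])

x_norm = row_normalize(features)
a_hat = sym_normalize_with_self_loops(adj)
print(x_norm)
print(a_hat)
```

Each nonzero row of x_norm sums to 1, and a_hat is symmetric with entry (i, j) equal to 1 / sqrt(d_i * d_j) for connected nodes, where d_i counts the self-loop. The all-zero feature row shows why the original code resets infinite reciprocals to 0.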

`metrics.py`

import tensorflow as tf


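# mask is an indicator vector: 1 marks a node whose label is available in the current split. For example, if only 50 of 100
# nodes are labeled, multiplying the per-node loss by mask zeroes the loss at unlabeled nodes, while the loss at labeled
# nodes is scaled by the (renormalized) mask value; in other words, only labeled nodes contribute to the loss.
# Note: loss and mask have the same shape, (None,), one entry per node, so loss *= mask is an element-wise product.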
def masked_softmax_cross_entropy(preds, labels, mask):
    """Softmax cross-entropy loss with masking."""
    loss = tf.nn.softmax_cross_entropy_with_logits(logits=preds, labels=labels)
    mask = tf.cast(mask, dtype=tf.float32)
    mask /= tf.reduce_mean(mask)  # rescale so that the final reduce_mean over all nodes equals the mean over masked nodes only
    loss *= mask
    return tf.reduce_mean(loss)


def masked_accuracy(preds, labels, mask):
    """Accuracy with masking."""
    correct_prediction = tf.equal(tf.argmax(preds, 1), tf.argmax(labels, 1))
    accuracy_all = tf.cast(correct_prediction, tf.float32)
    mask = tf.cast(mask, dtype=tf.float32)
    mask /= tf.reduce_mean(mask)
    accuracy_all *= mask
    return tf.reduce_mean(accuracy_all)
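
The effect of dividing the mask by its mean is easier to see with numbers. The sketch below mirrors the masking arithmetic in NumPy rather than TensorFlow; the function name masked_mean and the loss values are invented for illustration.

```python
import numpy as np


def masked_mean(values, mask):
    """Mean of `values` over the entries where mask is True, computed the way metrics.py does it."""
    mask = mask.astype(np.float32)
    mask = mask / mask.mean()       # each kept entry becomes N / (number of kept entries)
    return (values * mask).mean()   # mean over all N entries, masked-out entries contribute 0


# Five nodes, of which only the first three are labeled (made-up values)
per_node_loss = np.array([0.2, 1.0, 0.6, 3.0, 5.0], dtype=np.float32)
train_mask = np.array([True, True, True, False, False])

via_rescaled_mask = masked_mean(per_node_loss, train_mask)
direct = per_node_loss[train_mask].mean()
print(via_rescaled_mask, direct)
```

The two printed values agree (about 0.6): the large losses at the unlabeled nodes are ignored, and the rescaling cancels the division by N in the final mean.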

`inits.py`

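Glorot initialization aims to keep the variance consistent from layer to layer in both passes: in the forward pass the variance of each layer's activations should stay roughly unchanged, and in the backward pass the variance of each layer's gradients should stay roughly unchanged. The range of the random initialization is determined by the layer's number of inputs and outputs: the weights are drawn from a uniform distribution whose bounds are computed from those two counts.
(For the derivation, see: https://blog.csdn.net/yyl424525/article/details/100823398#4_Xavier_21)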

import tensorflow as tf
import numpy as np

# Create a Variable of the given shape with values drawn uniformly from [-scale, scale), i.e. [-0.05, 0.05) by default
def uniform(shape, scale=0.05, name=None):
    """Uniform init."""
    initial = tf.random_uniform(shape, minval=-scale, maxval=scale, dtype=tf.float32)
    return tf.Variable(initial, name=name)


def glorot(shape, name=None):
    """Glorot & Bengio (AISTATS 2010) init."""
    # Bound of the uniform distribution: sqrt(6 / (fan_in + fan_out))
    init_range = np.sqrt(6.0/(shape[0]+shape[1]))
    initial = tf.random_uniform(shape, minval=-init_range, maxval=init_range, dtype=tf.float32)
    return tf.Variable(initial, name=name)

# Create a Variable of the given shape filled with zeros
def zeros(shape, name=None):
    """All zeros."""
    initial = tf.zeros(shape, dtype=tf.float32)
    return tf.Variable(initial, name=name)

# Create a Variable of the given shape filled with ones
def ones(shape, name=None):
    """All ones."""
    initial = tf.ones(shape, dtype=tf.float32)
    return tf.Variable(initial, name=name)
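
As a rough numerical check of the glorot bound, the sketch below draws weights with NumPy instead of TensorFlow and compares their variance with 2 / (fan_in + fan_out), which is the variance of a uniform distribution on (-r, r) with r = sqrt(6 / (fan_in + fan_out)). The input size 1433 is the Cora feature dimension; the output size 16 and the helper name glorot_limit are assumptions made for this example.

```python
import numpy as np


def glorot_limit(fan_in, fan_out):
    """Bound r of the uniform distribution U(-r, r) used by Glorot initialization."""
    return np.sqrt(6.0 / (fan_in + fan_out))


rng = np.random.default_rng(123)
fan_in, fan_out = 1433, 16  # 16 is an assumed hidden size, not taken from the text above

limit = glorot_limit(fan_in, fan_out)
weights = rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Var[U(-r, r)] = r^2 / 3 = 2 / (fan_in + fan_out)
target_var = 2.0 / (fan_in + fan_out)
print(limit, weights.var(), target_var)
```

The sample variance should land close to the target, and every weight lies within the bound, which is all the glorot function above guarantees.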

Open Questions & Discussion

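Q1: There are 2708 nodes in total, but only 140 are used for training (indices [0, 140)), 500 for validation (indices [140, 640)), and 1000 for testing (indices [1708, 2707]). What happens to the remaining nodes in [640, 1707]? And is this a reasonable way to split the data?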
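
The bookkeeping behind Q1 can be sketched in a few lines. This takes the split sizes in Q1 at face value (140 training, 500 validation and 1000 test nodes out of 2708) and assumes zero-based, contiguous index ranges; the variable names are illustrative.

```python
num_nodes = 2708

# Assumed zero-based, half-open index ranges matching the split sizes in Q1
train_idx = range(0, 140)
val_idx = range(140, 640)
test_idx = range(1708, 2708)

used = set(train_idx) | set(val_idx) | set(test_idx)
unused = sorted(set(range(num_nodes)) - used)

# Number of nodes outside every split, and the first and last such index
print(len(unused), unused[0], unused[-1])
```

Under these assumptions 1068 nodes carry no label in any split. In the listing above, construct_feed_dict still feeds the full feature matrix and support, and only labels_mask restricts the loss, so those nodes appear to take part in propagation even though their labels are never used.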

Q2: Why does accuracy drop when more GCN layers are added?


References

A quick way to build the labels for the training, validation, and test sets

