graph attention network（ICLR2018）官方代碼詳解（tensorflow）-稀疏矩陣版

本文轉載自查看原文 2020-09-06 22:20 838 圖卷積網絡

論文地址：https://arxiv.org/abs/1710.10903

代碼地址： https://github.com/Diego999/pyGAT

之前非稀疏矩陣版的解讀：https://www.cnblogs.com/xiximayou/p/13622283.html

我們知道圖的鄰接矩陣可能是稀疏的，將整個圖加載到內存中是十分耗費資源的，因此對鄰接矩陣進行存儲和計算是很有必要的。

我們已經講解了圖注意力網絡的非稀疏矩陣版本，再來弄清其稀疏矩陣版本就輕松了，接下來我們將來看不同之處。

主運行代碼在：execute_cora_sparse.py中

同樣的，先加載數據：

adj, features, y_train, y_val, y_test, train_mask, val_mask, test_mask = process.load_data(dataset)

其中adj是coo_matrix類型，features是lil_matrix類型。

對於features，我們最終還是：

def preprocess_features(features):
    """Row-normalize feature matrix and convert to tuple representation"""
    rowsum = np.array(features.sum(1))
    r_inv = np.power(rowsum, -1).flatten()
    r_inv[np.isinf(r_inv)] = 0.
    r_mat_inv = sp.diags(r_inv)
    features = r_mat_inv.dot(features)
    return features.todense(), sparse_to_tuple(features)

將其：

features, spars = process.preprocess_features(features)

轉換為原始矩陣。

對於biases：

if sparse:
    biases = process.preprocess_adj_bias(adj)
else:
    adj = adj.todense()
    adj = adj[np.newaxis]
    biases = process.adj_to_bias(adj, [nb_nodes], nhood=1)

如果是稀疏格式的，就調用biases = process.preprocess_adj_bias(adj)：

def preprocess_adj_bias(adj):
    num_nodes = adj.shape[0] #2708
    adj = adj + sp.eye(num_nodes)  # self-loop 給對角上+1
    adj[adj > 0.0] = 1.0 #大於0的值置為1
    if not sp.isspmatrix_coo(adj):
        adj = adj.tocoo()
    adj = adj.astype(np.float32) #類型轉換
    indices = np.vstack((adj.col, adj.row)).transpose()  # This is where I made a mistake, I used (adj.row, adj.col) instead
    # return tf.SparseTensor(indices=indices, values=adj.data, dense_shape=adj.shape)
    return indices, adj.data, adj.shape

這里看兩個例子：

我們可以通過indices，data，shape來構造一個coo_matrix。

在定義計算圖中的占位符時：

       if sparse:
            #bias_idx = tf.placeholder(tf.int64)
            #bias_val = tf.placeholder(tf.float32)
            #bias_shape = tf.placeholder(tf.int64)
            bias_in = tf.sparse_placeholder(dtype=tf.float32)
        else:
            bias_in = tf.placeholder(dtype=tf.float32, shape=(batch_size, nb_nodes, nb_nodes))

使用bias_in = tf.sparse_placeholder(dtype=tf.float32)。

再接着就是模型中了，在utils文件夾下的layers.py中：

# Experimental sparse attention head (for running on datasets such as Pubmed)
# N.B. Because of limitations of current TF implementation, will work _only_ if batch_size = 1!
def sp_attn_head(seq, out_sz, adj_mat, activation, nb_nodes, in_drop=0.0, coef_drop=0.0, residual=False):
    with tf.name_scope('sp_attn'):
        if in_drop != 0.0:
            seq = tf.nn.dropout(seq, 1.0 - in_drop)

        seq_fts = tf.layers.conv1d(seq, out_sz, 1, use_bias=False)

        # simplest self-attention possible
        f_1 = tf.layers.conv1d(seq_fts, 1, 1)
        f_2 = tf.layers.conv1d(seq_fts, 1, 1)
        
        f_1 = tf.reshape(f_1, (nb_nodes, 1))
        f_2 = tf.reshape(f_2, (nb_nodes, 1))

        f_1 = adj_mat*f_1
        f_2 = adj_mat * tf.transpose(f_2, [1,0])

        logits = tf.sparse_add(f_1, f_2)
      lrelu = tf.SparseTensor(indices=logits.indices, values=tf.nn.leaky_relu(logits.values), dense_shape=logits.dense_shape)
        coefs = tf.sparse_softmax(lrelu) if coef_drop != 0.0:
            coefs = tf.SparseTensor(indices=coefs.indices, values=tf.nn.dropout(coefs.values, 1.0 - coef_drop), dense_shape=coefs.dense_shape) if in_drop != 0.0:
            seq_fts = tf.nn.dropout(seq_fts, 1.0 - in_drop)

        # As tf.sparse_tensor_dense_matmul expects its arguments to have rank-2,
        # here we make an assumption that our input is of batch size 1, and reshape appropriately.
        # The method will fail in all other cases!
        coefs = tf.sparse_reshape(coefs, [nb_nodes, nb_nodes])  seq_fts = tf.squeeze(seq_fts) vals = tf.sparse_tensor_dense_matmul(coefs, seq_fts) vals = tf.expand_dims(vals, axis=0) vals.set_shape([1, nb_nodes, out_sz]) ret = tf.contrib.layers.bias_add(vals) # residual connection
        if residual:
            if seq.shape[-1] != ret.shape[-1]:
                ret = ret + conv1d(seq, ret.shape[-1], 1) # activation
            else:
                ret = ret + seq

        return activation(ret)  # activation

相應的位置都要使用稀疏的方式。

免責聲明！

本站轉載的文章為個人學習借鑒使用，本站對版權不負任何法律責任。如果侵犯了您的隱私權益，請聯系本站郵箱yoyou2525@163.com刪除。

猜您在找 Graph Attention Network (GAT) 圖注意力網絡論文詳解 ICLR2018 《Graph Attention Network》閱讀筆記論文解讀（AGCN）《 Attention-driven Graph Clustering Network》稀疏矩陣十字鏈表實現稀疏矩陣相加【代碼】 python的稀疏矩陣計算稀疏矩陣理論與實踐隨機生成某些稀疏矩陣在Pytorch上使用稀疏矩陣稀疏矩陣相加減