Bilinear Models (Part 2): NTN, SLM, SME


Continuing with bilinear models. The more of these I read, the more it seems the bilinear models can all be classified as neural network (NN) models. Many models labeled "bilinear" in the model surveys of earlier papers turn out, on reading the originals, to be better classified as NN models; they are called bilinear only because the scoring function uses a bilinear function. When I write my thesis I should re-categorize these models.

NTN (Neural Tensor Network)

【paper】 Reasoning With Neural Tensor Networks for Knowledge Base Completion

【Overview】 This is 2013 work from the Stanford group that includes Danqi Chen, published at NIPS 2013. The paper proposes NTN (Neural Tensor Network), a neural network framework for knowledge base completion whose network structure / scoring function contains both a bilinear function and a linear function, and which represents an entity as the average of its word vectors.

The NTN model architecture is shown below:

First, word representations are obtained in word-vector space; each entity is then represented as a composition of its words and fed into the neural tensor network, which scores the triple's plausibility.

The paper's related work section is also worth studying: each paragraph introduces one method and explains how the proposed method relates to it.

The paper also notes that NTN can be seen as a way of learning a tensor factorization, similar to RESCAL.

Model

Scoring function definition

NTN defines the scoring function as:

\[ g(e_1, R, e_2) = u_R^{\top} f\!\left( e_1^{\top} W_R^{[1:k]} e_2 + V_R \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} + b_R \right) \]

where \(f\) is the tanh nonlinear activation and \(W_R^{[1:k]}\) is a tensor: the bilinear product \(e_1^{\top} W_R^{[1:k]} e_2\) yields a \(k\)-dimensional vector, one bilinear form per slice, while \(V_R\) and \(b_R\) parameterize the linear part and \(u_R\) is the output weight vector. The model diagram is shown below:

The warm-colored matrix inside the dashed box is one slice of the tensor \(W_R\); it corresponds to one relation and mediates the interaction between the head and tail entities.
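
To make the scoring function concrete, below is a minimal PyTorch sketch for a single triple (illustrative only, not the paper's code; all names and dimensions are made up):

import torch

d, k = 100, 4                            # entity dimension and number of tensor slices (illustrative)
e1, e2 = torch.randn(d), torch.randn(d)  # head and tail entity vectors

W = torch.randn(k, d, d)   # tensor W_R: one d x d slice per output dimension
V = torch.randn(k, 2 * d)  # weights V_R of the linear part
b = torch.randn(k)         # bias b_R
u = torch.randn(k)         # output weights u_R

bilinear = torch.einsum('i,kij,j->k', e1, W, e2)  # e1^T W_R^[1:k] e2 -> [k]
linear = V @ torch.cat([e1, e2])                  # V_R [e1; e2] -> [k]
score = u @ torch.tanh(bilinear + linear + b)     # u_R^T f(...) -> scalar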

Related models and special cases of NTN

This part covers models related to NTN and the forms NTN reduces to in special cases; their scoring functions are summarized after the list below.

  1. Distance Model (the translation model SE covered earlier)

The drawback of this model is that the two entities do not interact.

  2. Single Layer Model (SLM)

SLM is a plain single-layer model: the form NTN takes when the bilinear part is removed (setting the tensor \(W_R\) to zero). LFM, covered yesterday, is a pure bilinear function with no linear part, so earlier papers' remark that NTN is a combination of SLM and LFM now makes sense.

  3. Hadamard Model

This is the UM that Antoine Bordes proposed in 2012; it arguably belongs to a transitional stage between linear and bilinear models.

  4. Bilinear Model

This is the LFM bilinear model, with no linear transformation.
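
For reference, the scoring functions of these special cases as given in the NTN paper (the Hadamard model scores via elementwise products of projected entity and relation vectors, so it is omitted from this summary):

\[
\begin{aligned}
\text{Distance Model (SE):}\quad & g(e_1, R, e_2) = \lVert W_{R,1} e_1 - W_{R,2} e_2 \rVert_1 \\
\text{Single Layer Model (SLM):}\quad & g(e_1, R, e_2) = u_R^{\top} \tanh\!\left(W_{R,1} e_1 + W_{R,2} e_2\right) \\
\text{Bilinear Model (LFM):}\quad & g(e_1, R, e_2) = e_1^{\top} W_R e_2
\end{aligned}
\]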

Training objective
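
From the paper, NTN is trained with a contrastive max-margin objective: each training triple \(T^{(i)}\) is paired with corrupted triples \(T_c^{(i)}\) obtained by replacing one entity with a random entity, and the score of the correct triple should exceed the corrupted one by a margin, with L2 regularization over all parameters \(\Omega\):

\[ J(\Omega) = \sum_{i=1}^{N} \sum_{c=1}^{C} \max\!\left(0,\; 1 - g\!\left(T^{(i)}\right) + g\!\left(T_c^{(i)}\right)\right) + \lambda \lVert \Omega \rVert_2^2 \]

This matches the pairwise hinge loss used in the pykg2vec implementation below.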

Entity representation initialization

The entity vector is the average of its word vectors; the paper also experimented with an RNN, and plain averaging performed about as well.
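
A minimal sketch of this initialization (illustrative code and vocabulary, not the paper's implementation); for example, the vector of the entity "homo sapiens" is the average of the vectors of "homo" and "sapiens":

import torch

# pretrained word vectors (illustrative)
word_vec = {"homo": torch.randn(100), "sapiens": torch.randn(100)}

def entity_init(words):
    """Initialize an entity embedding as the average of its word vectors."""
    return torch.stack([word_vec[w] for w in words]).mean(dim=0)

v_homo_sapiens = entity_init(["homo", "sapiens"])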

Experiments

Relational triple classification

Results on the full datasets:

Results per relation:

For the varying classification performance across relations, the paper's explanation is that some relations have ambiguous semantics that make inference hard, so classification on those relations is poor.

The paper also compares three vector initialization schemes: no word-vector initialization (EV); randomly initialized word vectors (WV); and word vectors pretrained on an unsupervised corpus (WV-init).

Case Study


Code

The code link given in the paper is dead. Pykg2vec provides an NTN implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

# NamedEmbedding, Criterion, and PairwiseModel are pykg2vec internals;
# this excerpt assumes they are imported from the pykg2vec package.

class NTN(PairwiseModel):
    """
        `Reasoning With Neural Tensor Networks for Knowledge Base Completion`_ (NTN) is
        a neural tensor network which represents entities as an average of their constituting
        word vectors. It then projects entities to their vector embeddings
        in the input layer. The two entities are then combined and mapped to a non-linear hidden layer.
        Portion of the code based on `siddharth-agrawal`_.

        Args:
            config (object): Model configuration parameters.

        .. _siddharth-agrawal:
            https://github.com/siddharth-agrawal/Neural-Tensor-Network/blob/master/neuralTensorNetwork.py

        .. _Reasoning With Neural Tensor Networks for Knowledge Base Completion:
            https://nlp.stanford.edu/pubs/SocherChenManningNg_NIPS2013.pdf

    """

    def __init__(self, **kwargs):
        super(NTN, self).__init__(self.__class__.__name__.lower())
        param_list = ["tot_entity", "tot_relation", "ent_hidden_size", "rel_hidden_size", "lmbda"]
        param_dict = self.load_params(param_list, kwargs)
        self.__dict__.update(param_dict)

        self.ent_embeddings = NamedEmbedding("ent_embedding", self.tot_entity, self.ent_hidden_size)
        self.rel_embeddings = NamedEmbedding("rel_embedding", self.tot_relation, self.rel_hidden_size)
        self.mr1 = NamedEmbedding("mr1", self.ent_hidden_size, self.rel_hidden_size)
        self.mr2 = NamedEmbedding("mr2", self.ent_hidden_size, self.rel_hidden_size)
        self.br = NamedEmbedding("br", 1, self.rel_hidden_size)
        self.mr = NamedEmbedding("mr", self.rel_hidden_size, self.ent_hidden_size*self.ent_hidden_size)
        nn.init.xavier_uniform_(self.ent_embeddings.weight)
        nn.init.xavier_uniform_(self.rel_embeddings.weight)
        nn.init.xavier_uniform_(self.mr1.weight)
        nn.init.xavier_uniform_(self.mr2.weight)
        nn.init.xavier_uniform_(self.br.weight)
        nn.init.xavier_uniform_(self.mr.weight)

        self.parameter_list = [
            self.ent_embeddings,
            self.rel_embeddings,
            self.mr1,
            self.mr2,
            self.br,
            self.mr,
        ]

        self.loss = Criterion.pairwise_hinge

    def train_layer(self, h, t):
        """ Defines the forward pass training layers of the algorithm.

            Args:
               h (Tensor): Head entity embeddings (already looked up and normalized).
               t (Tensor): Tail entity embeddings of the triple.
        """

        mr1h = torch.matmul(h, self.mr1.weight) # h => [m, self.ent_hidden_size], self.mr1 => [self.ent_hidden_size, self.rel_hidden_size]
        mr2t = torch.matmul(t, self.mr2.weight) # t => [m, self.ent_hidden_size], self.mr2 => [self.ent_hidden_size, self.rel_hidden_size]

        expanded_h = h.unsqueeze(dim=0).repeat(self.rel_hidden_size, 1, 1) # [self.rel_hidden_size, m, self.ent_hidden_size]
        expanded_t = t.unsqueeze(dim=-1) # [m, self.ent_hidden_size, 1]

        temp = (torch.matmul(expanded_h, self.mr.weight.view(self.rel_hidden_size, self.ent_hidden_size, self.ent_hidden_size))).permute(1, 0, 2) # [m, self.rel_hidden_size, self.ent_hidden_size]
        htmrt = torch.squeeze(torch.matmul(temp, expanded_t), dim=-1) # [m, self.rel_hidden_size]

        return torch.tanh(htmrt + mr1h + mr2t + self.br.weight)  # [m, self.rel_hidden_size]


    def embed(self, h, r, t):
        """Function to get the embedding value.

        Args:
           h (Tensor): Head entities ids.
           r (Tensor): Relation ids of the triple.
           t (Tensor): Tail entity ids of the triple.

        Returns:
            Tensors: Returns head, relation and tail embedding Tensors.
        """
        emb_h = self.ent_embeddings(h)
        emb_r = self.rel_embeddings(r)
        emb_t = self.ent_embeddings(t)

        return emb_h, emb_r, emb_t


    def forward(self, h, r, t):
        h_e, r_e, t_e = self.embed(h, r, t)
        norm_h = F.normalize(h_e, p=2, dim=-1)
        norm_r = F.normalize(r_e, p=2, dim=-1)
        norm_t = F.normalize(t_e, p=2, dim=-1)
        return -torch.sum(norm_r*self.train_layer(norm_h, norm_t), -1)


    def get_reg(self, h, r, t):
        return self.lmbda*torch.sqrt(sum([torch.sum(torch.pow(var.weight, 2)) for var in self.parameter_list]))
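
Note how forward ties the pieces together: the normalized relation embedding takes the role of \(u_R\) in the scoring function, and its inner product with the train_layer output is negated, presumably so that the score behaves as an energy (lower is better) under the pairwise hinge loss.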

【Summary】 The paper proposes the neural tensor network NTN, whose scoring function uses both bilinear and linear operations; the bilinear operation defines a tensor that captures interactions between head and tail entities. NTN initializes entity vectors as the average of pretrained word vectors.

SLM (Single Layer Model)

SLM was introduced in the NTN paper: it is the simplest single-layer model, the special case of NTN with the bilinear part removed.
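
From the paper, the SLM score is \( g(e_1, R, e_2) = u_R^{\top} \tanh(W_{R,1} e_1 + W_{R,2} e_2) \). In the pykg2vec code below, mr1 and mr2 play the roles of \(W_{R,1}\) and \(W_{R,2}\), and the relation embedding plays the role of \(u_R\).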

Code

Again from Pykg2vec:

# (imports and pykg2vec internals as in the NTN excerpt above)

class SLM(PairwiseModel):
    """
        In `Reasoning With Neural Tensor Networks for Knowledge Base Completion`_,
        SLM model is designed as a baseline of Neural Tensor Network.
        The model constructs a nonlinear neural network to represent the score function.

        Args:
            config (object): Model configuration parameters.

        .. _Reasoning With Neural Tensor Networks for Knowledge Base Completion:
            https://nlp.stanford.edu/pubs/SocherChenManningNg_NIPS2013.pdf
    """
    def __init__(self, **kwargs):
        super(SLM, self).__init__(self.__class__.__name__.lower())
        param_list = ["tot_entity", "tot_relation", "rel_hidden_size", "ent_hidden_size"]
        param_dict = self.load_params(param_list, kwargs)
        self.__dict__.update(param_dict)

        self.ent_embeddings = NamedEmbedding("ent_embedding", self.tot_entity, self.ent_hidden_size)
        self.rel_embeddings = NamedEmbedding("rel_embedding", self.tot_relation, self.rel_hidden_size)
        self.mr1 = NamedEmbedding("mr1", self.ent_hidden_size, self.rel_hidden_size)
        self.mr2 = NamedEmbedding("mr2", self.ent_hidden_size, self.rel_hidden_size)
        nn.init.xavier_uniform_(self.ent_embeddings.weight)
        nn.init.xavier_uniform_(self.rel_embeddings.weight)
        nn.init.xavier_uniform_(self.mr1.weight)
        nn.init.xavier_uniform_(self.mr2.weight)

        self.parameter_list = [
            self.ent_embeddings,
            self.rel_embeddings,
            self.mr1,
            self.mr2,
        ]

        self.loss = Criterion.pairwise_hinge

    def embed(self, h, r, t):
        """Function to get the embedding value.

            Args:
               h (Tensor): Head entities ids.
               r (Tensor): Relation ids of the triple.
               t (Tensor): Tail entity ids of the triple.

            Returns:
                Tensors: Returns head, relation and tail embedding Tensors.
        """
        emb_h = self.ent_embeddings(h)
        emb_r = self.rel_embeddings(r)
        emb_t = self.ent_embeddings(t)
        return emb_h, emb_r, emb_t


    def forward(self, h, r, t):
        h_e, r_e, t_e = self.embed(h, r, t)
        norm_h = F.normalize(h_e, p=2, dim=-1)
        norm_r = F.normalize(r_e, p=2, dim=-1)
        norm_t = F.normalize(t_e, p=2, dim=-1)
        return -torch.sum(norm_r * self.layer(norm_h, norm_t), -1)


    def layer(self, h, t):
        """Defines the forward pass layer of the algorithm.

          Args:
              h (Tensor): Head entity embeddings (already looked up and normalized).
              t (Tensor): Tail entity embeddings of the triple.
        """
        mr1h = torch.matmul(h, self.mr1.weight) # h => [m, d], self.mr1 => [d, k]
        mr2t = torch.matmul(t, self.mr2.weight) # t => [m, d], self.mr2 => [d, k]
        return torch.tanh(mr1h + mr2t)

SME (Semantic Matching Energy)

【paper】 A Semantic Matching Energy Function for Learning with Multi-relational Data

【Overview】 This is Antoine Bordes's work published in Machine Learning in 2014. The proposed model is virtually identical to UM (Unstructured Model) from the translation-model family, down to the variable notation; UM has a diagram while SME does not. The only real difference is that SME provides two forms, linear and bilinear, of the g function used for entity-relation interaction, whereas UM has only the bilinear form.
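
From the paper, SME scores a triple by matching two half-scores, \( \epsilon(h, r, t) = g_u(h, r)^{\top} g_v(t, r) \), where each \(g\) comes in a linear form, \( g_u(h, r) = W_{u1} h + W_{u2} r + b_u \), or a bilinear form, \( g_u(h, r) = (W_{u1} h) \otimes (W_{u2} r) + b_u \) (with \(\otimes\) the elementwise product), and \(g_v\) is defined analogously over \((t, r)\). Both forms appear in the pykg2vec code below.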

Code

The linear version:

# (imports and pykg2vec internals as in the NTN excerpt above)

class SME(PairwiseModel):
    """ `A Semantic Matching Energy Function for Learning with Multi-relational Data`_

        Semantic Matching Energy (SME) is an algorithm for embedding multi-relational data into vector spaces.
        SME conducts semantic matching using neural network architectures. Given a fact (h, r, t), it first projects
        entities and relations to their embeddings in the input layer. Later the relation r is combined with both h and t
        to get gu(h, r) and gv(r, t) in its hidden layer. The score is determined by calculating the matching score of gu and gv.

        There are two versions of SME: a linear version(SMELinear) as well as bilinear(SMEBilinear) version which differ in how the hidden layer is defined.

        Args:
            config (object): Model configuration parameters.

        Portion of the code based on glorotxa_.

        .. _glorotxa: https://github.com/glorotxa/SME/blob/master/model.py

        .. _A Semantic Matching Energy Function for Learning with Multi-relational Data: http://www.thespermwhale.com/jaseweston/papers/ebrm_mlj.pdf

    """

    def __init__(self, **kwargs):
        super(SME, self).__init__(self.__class__.__name__.lower())
        param_list = ["tot_entity", "tot_relation", "hidden_size"]
        param_dict = self.load_params(param_list, kwargs)
        self.__dict__.update(param_dict)

        self.ent_embeddings = NamedEmbedding("ent_embedding", self.tot_entity, self.hidden_size)
        self.rel_embeddings = NamedEmbedding("rel_embedding", self.tot_relation, self.hidden_size)
        self.mu1 = NamedEmbedding("mu1", self.hidden_size, self.hidden_size)
        self.mu2 = NamedEmbedding("mu2", self.hidden_size, self.hidden_size)
        self.bu = NamedEmbedding("bu", self.hidden_size, 1)
        self.mv1 = NamedEmbedding("mv1", self.hidden_size, self.hidden_size)
        self.mv2 = NamedEmbedding("mv2", self.hidden_size, self.hidden_size)
        self.bv = NamedEmbedding("bv", self.hidden_size, 1)
        nn.init.xavier_uniform_(self.ent_embeddings.weight)
        nn.init.xavier_uniform_(self.rel_embeddings.weight)
        nn.init.xavier_uniform_(self.mu1.weight)
        nn.init.xavier_uniform_(self.mu2.weight)
        nn.init.xavier_uniform_(self.bu.weight)
        nn.init.xavier_uniform_(self.mv1.weight)
        nn.init.xavier_uniform_(self.mv2.weight)
        nn.init.xavier_uniform_(self.bv.weight)

        self.parameter_list = [
            self.ent_embeddings,
            self.rel_embeddings,
            self.mu1,
            self.mu2,
            self.bu,
            self.mv1,
            self.mv2,
            self.bv,
        ]

        self.loss = Criterion.pairwise_hinge

    def embed(self, h, r, t):
        """Function to get the embedding value.

            Args:
                h (Tensor): Head entities ids.
                r (Tensor): Relation ids of the triple.
                t (Tensor): Tail entity ids of the triple.

            Returns:
                Tensors: Returns head, relation and tail embedding Tensors.
        """
        emb_h = self.ent_embeddings(h)
        emb_r = self.rel_embeddings(r)
        emb_t = self.ent_embeddings(t)
        return emb_h, emb_r, emb_t


    def _gu_linear(self, h, r):
        """Computes the linear g_u transformation.

            Args:
                h (Tensor): Head entity embeddings.
                r (Tensor): Relation embeddings of the triple.

            Returns:
                Tensors: Returns the linear transformation g_u(h, r).
        """
        mu1h = torch.matmul(self.mu1.weight, h.T)  # [k, b]
        mu2r = torch.matmul(self.mu2.weight, r.T)  # [k, b]
        return (mu1h + mu2r + self.bu.weight).T  # [b, k]

    def _gv_linear(self, r, t):
        """Computes the linear g_v transformation.

            Args:
                r (Tensor): Relation embeddings of the triple.
                t (Tensor): Tail entity embeddings.

            Returns:
                Tensors: Returns the linear transformation g_v(t, r).
        """
        mv1t = torch.matmul(self.mv1.weight, t.T)  # [k, b]
        mv2r = torch.matmul(self.mv2.weight, r.T)  # [k, b]
        return (mv1t + mv2r + self.bv.weight).T  # [b, k]

    def forward(self, h, r, t):
        """Performs semantic matching.

            Args:
                h (Tensor): Head entities ids.
                r (Tensor): Relation ids of the triple.
                t (Tensor): Tail ids of the triple.

            Returns:
                Tensors: Returns the semantic matching score.
        """
        h_e, r_e, t_e = self.embed(h, r, t)
        norm_h = F.normalize(h_e, p=2, dim=-1)
        norm_r = F.normalize(r_e, p=2, dim=-1)
        norm_t = F.normalize(t_e, p=2, dim=-1)

        return -torch.sum(self._gu_linear(norm_h, norm_r) * self._gv_linear(norm_r, norm_t), 1)

The bilinear version:

class SME_BL(SME):
    """ `A Semantic Matching Energy Function for Learning with Multi-relational Data`_

        SME_BL is an extension of SME_ that uses a bilinear function to calculate the matching scores.

        Args:
            config (object): Model configuration parameters.

        .. _`SME`: api.html#pykg2vec.models.pairwise.SME

    """
    def __init__(self, **kwargs):
        super(SME_BL, self).__init__(**kwargs)
        self.model_name = self.__class__.__name__.lower()
        self.loss = Criterion.pairwise_hinge

    def _gu_bilinear(self, h, r):
        """Computes the bilinear g_u transformation.

            Args:
                h (Tensor): Head entity embeddings.
                r (Tensor): Relation embeddings of the triple.

            Returns:
                Tensors: Returns the bilinear transformation g_u(h, r).
        """
        mu1h = torch.matmul(self.mu1.weight, h.T)  # [k, b]
        mu2r = torch.matmul(self.mu2.weight, r.T)  # [k, b]
        return (mu1h * mu2r + self.bu.weight).T  # [b, k]

    def _gv_bilinear(self, r, t):
        """Computes the bilinear g_v transformation.

            Args:
                r (Tensor): Relation embeddings of the triple.
                t (Tensor): Tail entity embeddings.

            Returns:
                Tensors: Returns the bilinear transformation g_v(t, r).
        """
        mv1t = torch.matmul(self.mv1.weight, t.T)  # [k, b]
        mv2r = torch.matmul(self.mv2.weight, r.T)  # [k, b]
        return (mv1t * mv2r + self.bv.weight).T  # [b, k]

    def forward(self, h, r, t):
        """Performs semantic matching.

            Args:
                h (Tensor): Head entities ids.
                r (Tensor): Relation ids of the triple.
                t (Tensor): Tail ids of the triple.

            Returns:
                Tensors: Returns the semantic matching score.
        """
        h_e, r_e, t_e = self.embed(h, r, t)
        norm_h = F.normalize(h_e, p=2, dim=-1)
        norm_r = F.normalize(r_e, p=2, dim=-1)
        norm_t = F.normalize(t_e, p=2, dim=-1)

        return torch.sum(self._gu_bilinear(norm_h, norm_r) * self._gv_bilinear(norm_r, norm_t), -1)
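
A small quirk of the library source quoted above: SME.forward returns the negated inner product of the \(g_u\) and \(g_v\) terms, while SME_BL.forward returns it without negation, so the two variants produce scores of opposite sign.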

