《Enhanced LSTM for Natural Language Inference》 (natural language inference)

Reposted article, 2018-05-28 16:27. Category: paper reading.

Problem addressed

$a = (a_{1}, \ldots, a_{l_{a}})$

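Method

Our natural language inference network consists of three parts: input encoding, local inference modeling, and inference composition. The architecture is shown in the figure below:

Read vertically, the figure above shows the three main components of the system. Read horizontally, the left side is the sequential NLI model called ESIM, and the right side is the tree-LSTM network that incorporates syntactic parse information.

Input encoding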

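    # Based on arXiv:1609.06038
    from keras.layers import Input, BatchNormalization, Bidirectional, LSTM

    # maxlen, lstm_dim, pretrained_embedding and create_pretrained_embedding
    # are not shown in this excerpt and must be defined elsewhere.

    # Two padded sequences of token ids, one per sentence
    q1 = Input(name='q1', shape=(maxlen,))
    q2 = Input(name='q2', shape=(maxlen,))

    # Embedding: both inputs share the embedding and batch-normalization layers
    embedding = create_pretrained_embedding(
        pretrained_embedding, mask_zero=False)
    bn = BatchNormalization(axis=2)
    q1_embed = bn(embedding(q1))
    q2_embed = bn(embedding(q2))

    # Encode: one shared BiLSTM that returns a hidden state for every time step
    encode = Bidirectional(LSTM(lstm_dim, return_sequences=True))
    q1_encoded = encode(q1_embed)
    q2_encoded = encode(q2_embed)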

There are two kinds of LSTM encoder:

A: the sequential model

Each word in the sentence gets a word representation that includes information from its surrounding context.
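
For reference, in the notation of the ESIM paper (arXiv:1609.06038), as I read it, the sequential encoding step can be written as below. The symbols $b$, $l_b$, $\bar{a}_i$ and $\bar{b}_j$ are taken from the paper rather than from these notes: $a$ and $b$ are the two input sentences, and $\bar{a}_i$, $\bar{b}_j$ are the BiLSTM hidden states (forward and backward states concatenated) at each position.

$$
\bar{a}_i = \mathrm{BiLSTM}(a, i), \quad \forall i \in [1, \ldots, l_a]
$$

$$
\bar{b}_j = \mathrm{BiLSTM}(b, j), \quad \forall j \in [1, \ldots, l_b]
$$

In the Keras snippet above, `q1_encoded` and `q2_encoded` appear to play the roles of $\bar{a}$ and $\bar{b}$.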

B: the tree-LSTM model

Each node in the tree (a phrase or clause) gets a vector representation $h_{t}$.

For an introduction to tree-LSTM, see:
[1] Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
[2] Natural Language Inference by Tree-Based Convolution and Heuristic Matching
[3] Long Short-Term Memory Over Recursive Structures

Local inference modeling

My personal impression is that this is simply an attention step that has been given the name "local inference".

A: sequential model

from keras.activations import softmax
from keras.layers import Dot, Lambda, Permute

def soft_attention_alignment(input_1, input_2):
    """Align text representation with neural soft attention"""
    # unchanged_shape is not shown in this excerpt; it is assumed to be a
    # helper that returns the input shape as-is.
    # Dot(axes=-1) contracts the last (feature) axis of the two inputs. For
    # inputs of shape (batch_size, len_1, dim) and (batch_size, len_2, dim)
    # the result has shape (batch_size, len_1, len_2): one dot-product score
    # per pair of positions.
    attention = Dot(axes=-1)([input_1, input_2])

    w_att_1 = Lambda(lambda x: softmax(x, axis=1),
                     output_shape=unchanged_shape)(attention)
    # Permute reorders the input dimensions according to a given pattern.
    # The pattern is a tuple of integers that excludes the batch dimension,
    # and its indices start at 1. Here (2, 1) swaps the first and second
    # non-batch dimensions.
    w_att_2 = Permute((2, 1))(Lambda(lambda x: softmax(x, axis=2),
                                     output_shape=unchanged_shape)(attention))

    in1_aligned = Dot(axes=1)([w_att_1, input_1])
    in2_aligned = Dot(axes=1)([w_att_2, input_2])
    return in1_aligned, in2_aligned
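
To make the tensor operations above easier to follow, here is a minimal sketch of the same soft alignment in plain Python lists, without Keras and without a batch dimension. The function names `soft_align`, `weighted_sum` and `dot` are my own, chosen for illustration, and the toy vectors are made up. It is only a sketch of the idea: score every pair of positions with a dot product, normalise the scores with a softmax, and use them to mix the vectors of the other sentence.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def weighted_sum(weights, vectors):
    """Sum of the vectors, each scaled by its matching weight."""
    dim = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors))
            for k in range(dim)]


def soft_align(a, b):
    """Return (a_tilde, b_tilde) for two lists of same-dimension vectors.

    a_tilde[i] is a mixture of the vectors in b, weighted by how well
    each b[j] matches a[i]; b_tilde[j] is the mirror image.
    """
    # e[i][j] = a_i . b_j, one score per pair of positions
    e = [[dot(ai, bj) for bj in b] for ai in a]
    # Normalise each row over j, then mix the vectors of b
    a_tilde = [weighted_sum(softmax(row), b) for row in e]
    # Normalise each column over i, then mix the vectors of a
    columns = [[e[i][j] for i in range(len(a))] for j in range(len(b))]
    b_tilde = [weighted_sum(softmax(col), a) for col in columns]
    return a_tilde, b_tilde


# Toy inputs (hypothetical): l_a = 3, l_b = 2, dimension 2
a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [[2.0, 0.0], [0.0, 2.0]]
a_tilde, b_tilde = soft_align(a, b)
print(len(a_tilde), len(b_tilde))  # one aligned vector per position
```

Note the naming: in the Keras function, `in2_aligned` has one row per position of `input_1`, so it corresponds to `a_tilde` in this sketch, and `in1_aligned` corresponds to `b_tilde`.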

This captures the correspondences between the two sentences, whether they are similar or opposed.
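
One way to expose such similar or opposing correspondences, and as far as I understand the one used in the ESIM paper, is to concatenate each encoded vector with its aligned counterpart, their difference, and their element-wise product. The sketch below shows this in plain Python. The function name `enhance` and the toy values are hypothetical, and the `submult` helper used in the Keras code further down is not shown in these notes, so I can only assume it computes something similar.

```python
def enhance(x, x_aligned):
    """Per position, build [x; x_aligned; x - x_aligned; x * x_aligned]."""
    out = []
    for u, v in zip(x, x_aligned):
        diff = [p - q for p, q in zip(u, v)]   # highlights disagreement
        prod = [p * q for p, q in zip(u, v)]   # highlights agreement
        out.append(u + v + diff + prod)        # list concatenation
    return out


# Toy values (hypothetical): two positions, dimension 2
a_bar = [[1.0, 2.0], [0.5, -1.0]]
a_tilde = [[1.0, 0.0], [0.5, 1.0]]
m_a = enhance(a_bar, a_tilde)
print(m_a[0])  # [1.0, 2.0, 1.0, 0.0, 0.0, 2.0, 1.0, 0.0]
```

Each output vector is four times as wide as the input, which is why the next stage needs its own layer to bring the dimension back down.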

B: Tree-LSTM model
To be continued.

Inference composition

Here a comes from the local inference layer above.

$m_a$ is fed into an LSTM.

The LSTM outputs at every time step are then pooled.

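    from keras.layers import (Concatenate, Dense, Dropout,
                              GlobalAvgPool1D, GlobalMaxPool1D)

    # submult, time_distributed and apply_multiple are helpers that are not
    # shown in this excerpt. q1_aligned and q2_aligned are assumed to come
    # from soft_attention_alignment(q1_encoded, q2_encoded).
    # Note: this excerpt uses time-distributed Dense layers for the comparison
    # step, whereas the text above describes feeding the features to an LSTM.

    # Compare
    q1_combined = Concatenate()(
        [q1_encoded, q2_aligned, submult(q1_encoded, q2_aligned)])
    q2_combined = Concatenate()(
        [q2_encoded, q1_aligned, submult(q2_encoded, q1_aligned)])
    compare_layers = [
        Dense(compare_dim, activation=activation),
        Dropout(compare_dropout),
        Dense(compare_dim, activation=activation),
        Dropout(compare_dropout),
    ]
    q1_compare = time_distributed(q1_combined, compare_layers)
    q2_compare = time_distributed(q2_combined, compare_layers)

    # Aggregate: average- and max-pool each sequence over its time steps
    q1_rep = apply_multiple(q1_compare, [GlobalAvgPool1D(), GlobalMaxPool1D()])
    q2_rep = apply_multiple(q2_compare, [GlobalAvgPool1D(), GlobalMaxPool1D()])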
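
The pooling step can be sketched in plain Python as follows. This is only an illustration of average and max pooling over time steps followed by concatenation. The function name `pool` and the toy sequences are hypothetical, and I am assuming that `apply_multiple` in the excerpt above concatenates the outputs of the pooling layers in a similar way.

```python
def pool(sequence):
    """Average- and max-pool a list of vectors over the time axis.

    Returns one fixed-length vector [avg; max], whatever the sequence length.
    """
    steps = len(sequence)
    dim = len(sequence[0])
    avg = [sum(vec[k] for vec in sequence) / steps for k in range(dim)]
    mx = [max(vec[k] for vec in sequence) for k in range(dim)]
    return avg + mx


# Toy per-step outputs (hypothetical) for two sentences of different length
v_a = [[1.0, 4.0], [3.0, 0.0]]
v_b = [[2.0, 2.0], [0.0, 6.0], [4.0, 1.0]]

# Final fixed-length representation: [avg_a; max_a; avg_b; max_b]
v = pool(v_a) + pool(v_b)
print(v)  # [2.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 6.0]
```

Because both pooling operations collapse the time axis, the two sentences can have different lengths and still produce a vector of the same size for the final classifier.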
