This section mainly covers the dependency-tree component of the GCN-Tree model. The tool used in the paper is the Stanford Parser.
https://www.xfyun.cn/services/semanticDependence — iFLYTEK's Chinese word-segmentation platform
http://nlp.stanford.edu:8080/parser/ — an online demo where you can try the parser.
Toolkit: https://nlp.stanford.edu/software/stanford-dependencies.shtml — documentation on how to use the Stanford dependency parser.
Figure 2: Relation extraction with a graph convolutional network. The left side shows the overall architecture, while for clarity the right side shows the detailed graph-convolution computation only for the word "relative". The paper also provides the full unlabeled dependency parse of the sentence for reference.
Let's reproduce the parse tree for the example sentence from the paper:
Your query
Tagging
Parse
(ROOT (S (NP (PRP He)) (VP (VBD was) (RB not) (NP (NP (DT a) (NN relative)) (PP (IN of) (NP (NNP Mike) (NNP Cane)))))))
Universal dependencies
nsubj(relative-5, He-1) cop(relative-5, was-2) advmod(relative-5, not-3) det(relative-5, a-4) root(ROOT-0, relative-5) case(Cane-8, of-6) compound(Cane-8, Mike-7) nmod(relative-5, Cane-8)
Universal dependencies, enhanced
nsubj(relative-5, He-1) cop(relative-5, was-2) advmod(relative-5, not-3) det(relative-5, a-4) root(ROOT-0, relative-5) case(Cane-8, of-6) compound(Cane-8, Mike-7) nmod:of(relative-5, Cane-8)
As the output shows, the 5th word, "relative", is the root node. nsubj, cop, advmod, det, root, case, compound, and nmod:of are the abbreviated dependency relations on the edges; in the paper's dataset they are stored under the field $stanford-deprel$.
Inside each pair of parentheses, the first item is the head (the governor the relation edge starts from) and the second item is the dependent, written as the X-th word of the sentence. For every word, the index of its head is stored in the dataset field $stanford-head$.
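To make the correspondence concrete, here is a small sketch that rebuilds the relation triples of the example sentence from the two parallel arrays. The helper `to_triples` is ours, purely for illustration; the array contents follow the $stanford-head$ / $stanford-deprel$ convention described above.

```python
# Rebuild "deprel(head_word-h, word-i)" triples from the parallel
# stanford-head / stanford-deprel annotations of the example sentence
# "He was not a relative of Mike Cane".
tokens = ["He", "was", "not", "a", "relative", "of", "Mike", "Cane"]
# stanford-head: for word i (1-based), the 1-based index of its head; 0 = ROOT
head   = [5, 5, 5, 5, 0, 8, 8, 5]
deprel = ["nsubj", "cop", "advmod", "det", "root", "case", "compound", "nmod:of"]

def to_triples(tokens, head, deprel):
    triples = []
    for i, (h, rel) in enumerate(zip(head, deprel), start=1):
        gov = "ROOT-0" if h == 0 else f"{tokens[h-1]}-{h}"
        triples.append(f"{rel}({gov}, {tokens[i-1]}-{i})")
    return triples

print(to_triples(tokens, head, deprel))
# -> ['nsubj(relative-5, He-1)', 'cop(relative-5, was-2)', ...]
```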
Here is a record taken from the dataset:
From it, the universal dependencies can be written out as:
nsubj(named-2,He-1)
root(ROOT-0,named-2)
dobj(named-2,one-3)
case(Aziz-7,as-4)
compound(Aziz-7,Shah-5)
compound(Aziz-7,Abdul-6)
nmod(named-2,Aziz-7)
Let's verify with the tool:
Your query
Tagging
Parse
(ROOT (S (NP (PRP He)) (VP (VBD named) (NP (CD one)) (PP (IN as) (NP (NNP Shah) (NNP Abdul) (NNP Aziz))))))
Universal dependencies
nsubj(named-2, He-1) root(ROOT-0, named-2) obj(named-2, one-3) case(Aziz-7, as-4) compound(Aziz-7, Shah-5) compound(Aziz-7, Abdul-6) obl(named-2, Aziz-7)
Universal dependencies, enhanced
nsubj(named-2, He-1) root(ROOT-0, named-2) obj(named-2, one-3) case(Aziz-7, as-4) compound(Aziz-7, Shah-5) compound(Aziz-7, Abdul-6) obl:as(named-2, Aziz-7)
The tool confirms the same tree structure; note that its newer Universal Dependencies labels obj and obl:as correspond to the dataset's dobj and nmod.
Two basic problems
Both are simple data-structure problems (node operations on a multi-way tree):
a. Given a node, find its parent (or child) nodes.
This one is trivial: follow the node's parent pointer, or iterate over its children list.
b. Find the shortest path between two nodes.
Starting from one node, put the node itself and all of its ancestors into an array. Then start from the other node and walk up through its parents until you reach a node that also appears in the first array. The shortest path is the segment of the first array from index 0 up to that common node, followed by the chain from the common node back down to the second node.
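The procedure above can be sketched directly on the 1-based head array used by the dataset. `shortest_path` here is an illustrative helper of ours, not code from the paper's repository:

```python
# Shortest path between two nodes in a dependency tree, following the
# procedure described above: collect one node's ancestor chain, then walk
# up from the other node until hitting that chain. Nodes are 1-based word
# indices and head[i-1] gives the head of word i (0 = ROOT).
def shortest_path(head, a, b):
    # chain from a up to ROOT
    a_chain = [a]
    while head[a_chain[-1] - 1] != 0:
        a_chain.append(head[a_chain[-1] - 1])
    a_depth = {n: d for d, n in enumerate(a_chain)}
    # walk up from b until we meet a's chain
    b_chain = [b]
    while b_chain[-1] not in a_depth:
        b_chain.append(head[b_chain[-1] - 1])
    meet = b_chain[-1]
    # path: a -> ... -> common ancestor -> ... -> b
    return a_chain[:a_depth[meet]] + list(reversed(b_chain))

# head array of "He was not a relative of Mike Cane"
head = [5, 5, 5, 5, 0, 8, 8, 5]
print(shortest_path(head, 1, 8))  # -> [1, 5, 8]: He -> relative -> Cane
```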
A concrete example (opinion extraction):
The dependency tree is:
Shortest path between the aspect terms:
Note the edges traversed along the way (i.e., the relations between each pair of nodes); together they make up the path.
Shortest path between an aspect term and its opinion word:
These two shortest paths make it obvious which words are more closely related to each other, which is one practical application of shortest paths.
The following shows how the paper's model turns its input into a Tree representation; this is the part of GCNRelationModel's forward pass that uses the dependency tree:
```python
 1  def forward(self, inputs):
 2      words, masks, pos, ner, deprel, head, subj_pos, obj_pos, subj_type, obj_type = inputs  # unpack
 3      l = (masks.data.cpu().numpy() == 0).astype(np.int64).sum(1)  # turn True/False in the mask into 1/0 and count each sentence's words
 4      maxlen = max(l)
 5
 6      def inputs_to_tree_reps(head, words, l, prune, subj_pos, obj_pos):
 7          head, words, subj_pos, obj_pos = head.cpu().numpy(), words.cpu().numpy(), subj_pos.cpu().numpy(), obj_pos.cpu().numpy()
 8          trees = [head_to_tree(head[i], words[i], l[i], prune, subj_pos[i], obj_pos[i]) for i in range(len(l))]
 9          adj = [tree_to_adj(maxlen, tree, directed=False, self_loop=False).reshape(1, maxlen, maxlen) for tree in trees]
10          adj = np.concatenate(adj, axis=0)  # stack the batch's numpy adjacency matrices along axis 0, shape = [b, maxlen, maxlen]
11          adj = torch.from_numpy(adj)
12          return Variable(adj.cuda()) if self.opt['cuda'] else Variable(adj)
13
14      # .data accesses the tensor's values without autograd tracking (it does not affect backpropagation)
15      # subj_pos / obj_pos are the subject's and object's positions in the sentence, encoded as distance lists like [-3,-2,-1,0,0,0,1,2,3]
16      adj = inputs_to_tree_reps(head.data, words.data, l, self.opt['prune_k'], subj_pos.data, obj_pos.data)
17      h, pool_mask = self.gcn(adj, inputs)  # feed this batch's adjacency matrices, together with the inputs, into the GCN
```
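Line 3's mask-to-length trick is easy to check in isolation. A toy illustration (made-up values), assuming as in this code that masks are 0 on real tokens and 1 on padding:

```python
import numpy as np

# masks: 0 marks a real token, 1 marks padding, so counting zeros per row
# yields each sentence's true length.
masks = np.array([[0, 0, 0, 1, 1],
                  [0, 0, 0, 0, 0]])
l = (masks == 0).astype(np.int64).sum(1)
print(l)  # -> [3 5]
```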
On line 16, the data's head, words, sentence lengths l, the pruning distance K, and the subject/object position information are passed to the function defined on lines 6-12, which builds the dependency-tree adjacency matrices for this batch.
Inside that function, line 8 calls head_to_tree() to build a dependency tree for every sentence in the batch.
```python
class Tree(object):
    """
    Reused tree object from stanfordnlp/treelstm.
    """
    def __init__(self):
        self.dist = 0
        self.idx = 0
        self.parent = None
        self.num_children = 0
        self.children = list()

    def add_child(self, child):
        child.parent = self
        self.num_children += 1
        self.children.append(child)

    def size(self):
        # default added: a plain getattr(self, '_size') would raise
        # AttributeError before the first computation
        if getattr(self, '_size', None) is not None:
            return self._size
        count = 1
        for i in range(self.num_children):
            count += self.children[i].size()
        self._size = count
        return self._size

    def depth(self):
        if getattr(self, '_depth', None) is not None:  # same fix as in size()
            return self._depth
        count = 0
        if self.num_children > 0:
            for i in range(self.num_children):
                child_depth = self.children[i].depth()
                if child_depth > count:
                    count = child_depth
            count += 1
        self._depth = count
        return self._depth

    def __iter__(self):
        yield self
        for c in self.children:
            for x in c:
                yield x

def head_to_tree(head, tokens, len_, prune, subj_pos, obj_pos):
    """
    Convert a sequence of head indexes into a tree object.
    """
    tokens = tokens[:len_].tolist()
    head = head[:len_].tolist()
    root = None

    if prune < 0:  # no pruning
        nodes = [Tree() for _ in head]  # one node per word

        for i in range(len(nodes)):
            h = head[i]
            nodes[i].idx = i
            nodes[i].dist = -1  # just a filler
            if h == 0:
                root = nodes[i]
            else:
                nodes[h-1].add_child(nodes[i])  # edge nodes[h-1] -> current node, matching the Stanford annotation
    else:  # prune the tree
        # find dependency path
        subj_pos = [i for i in range(len_) if subj_pos[i] == 0]  # positions where subj_pos == 0 are the subject entity, e.g. indices [3,4,5]
        obj_pos = [i for i in range(len_) if obj_pos[i] == 0]

        cas = None

        subj_ancestors = set(subj_pos)
        for s in subj_pos:  # iterate over every index of the subject entity
            h = head[s]
            tmp = [s]
            while h > 0:  # while the head is not root
                tmp += [h-1]  # tmp stores node s together with all of its ancestors
                subj_ancestors.add(h-1)
                # subj_ancestors collects, for every subject-entity index, all ancestors up to (but not including) root
                h = head[h-1]

            if cas is None:
                cas = set(tmp)  # first pass: take the first node's ancestor set
            else:
                cas.intersection_update(tmp)  # later passes intersect, keeping only the common ancestors of the subject-entity nodes

        obj_ancestors = set(obj_pos)
        for o in obj_pos:
            h = head[o]
            tmp = [o]
            while h > 0:
                tmp += [h-1]
                obj_ancestors.add(h-1)
                h = head[h-1]
            cas.intersection_update(tmp)  # intersect cas with the common ancestors of the object-entity nodes

        # find lowest common ancestor
        if len(cas) == 1:  # with only one common node, the LCA is that node
            lca = list(cas)[0]
        else:
            child_count = {k: 0 for k in cas}
            for ca in cas:
                if head[ca] > 0 and head[ca] - 1 in cas:  # ca's head is not root AND ca's head is also in cas
                    child_count[head[ca] - 1] += 1  # ca's head gains a child, namely ca

            # LCA (lowest common ancestor)
            # the LCA has no child in the CA set
            for ca in cas:  # intuitively, the childless 'leaf' of the common-ancestor chain is the LCA of all entity nodes
                if child_count[ca] == 0:
                    lca = ca
                    break

        path_nodes = subj_ancestors.union(obj_ancestors).difference(cas)
        # union of the subject chain and the object chain (with ancestors), minus the common ancestors
        path_nodes.add(lca)  # then add the LCA back: the LCA path is complete

        # compute distance to path_nodes
        dist = [-1 if i not in path_nodes else 0 for i in range(len_)]  # nodes on the LCA path get 0, all others get -1

        for i in range(len_):
            if dist[i] < 0:  # node i is not on the LCA path
                stack = [i]
                while stack[-1] >= 0 and stack[-1] not in path_nodes:
                    stack.append(head[stack[-1]] - 1)  # stack stores node i and its ancestors, until one ancestor lies in path_nodes

                if stack[-1] in path_nodes:  # node i's highest collected ancestor is on the LCA path
                    for d, j in enumerate(reversed(stack)):  # reverse i<-B<-A into A->B->i
                        dist[j] = d  # dist[A]=0, dist[B]=1, dist[i]=2: dist is each node's distance to the LCA path
                else:
                    for j in stack:  # these nodes have no edge connecting them to the LCA path, so their distance is infinite
                        if j >= 0 and dist[j] < 0:
                            dist[j] = int(1e4)  # aka infinity

        highest_node = lca
        nodes = [Tree() if dist[i] <= prune else None for i in range(len_)]  # pruning: keep nodes with dist <= K, drop the rest as None

        # one pass over nodes to wire up the pruned tree
        for i in range(len(nodes)):
            if nodes[i] is None:
                continue
            h = head[i]
            nodes[i].idx = i
            nodes[i].dist = dist[i]
            if h > 0 and i != highest_node:
                assert nodes[h-1] is not None
                nodes[h-1].add_child(nodes[i])

        root = nodes[highest_node]

    assert root is not None
    return root
```
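To see what the pruning actually keeps, here is a compact, self-contained re-derivation (not the repo code; the helper names are ours) for the earlier sentence "He was not a relative of Mike Cane", with "He" as subject and "Cane" as object:

```python
# head[i] is the 1-based head of word i+1, 0 = ROOT (dataset convention)
head = [5, 5, 5, 5, 0, 8, 8, 5]  # "He was not a relative of Mike Cane"

def ancestors(i):  # 0-based index -> chain from the node up to the root
    chain = [i]
    while head[chain[-1]] != 0:
        chain.append(head[chain[-1]] - 1)
    return chain

subj, obj = 0, 7                        # "He" and "Cane" (0-based)
sa, oa = ancestors(subj), ancestors(obj)
cas = set(sa) & set(oa)                 # common ancestors of the two entities
lca = max(cas, key=lambda c: len(ancestors(c)))  # deepest one = LCA
path = ((set(sa) | set(oa)) - cas) | {lca}       # dependency path + LCA

def dist_to_path(i):
    # number of parent hops until we land on the path (assumes the node
    # can reach it, which holds here since the path contains the LCA)
    d = 0
    while i not in path:
        i, d = head[i] - 1, d + 1
    return d

K = 0  # prune_k = 0 keeps only the dependency path itself
kept = [i for i in range(len(head)) if dist_to_path(i) <= K]
print(kept)  # -> [0, 4, 7]: "He", "relative", "Cane"
```

With K = 1, every word of this sentence survives, since each off-path token ("was", "not", "a", "of", "Mike") hangs one hop off the path.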
Then tree_to_adj() is called (line 9 of the forward snippet) to convert each dependency tree into an adjacency matrix.
```python
def tree_to_adj(sent_len, tree, directed=True, self_loop=False):
    """
    Convert a tree object to a (numpy) adjacency matrix.
    """
    ret = np.zeros((sent_len, sent_len), dtype=np.float32)

    queue = [tree]  # enqueue the LCA root of the tree
    idx = []
    while len(queue) > 0:
        t, queue = queue[0], queue[1:]

        idx += [t.idx]  # record the node indices of the tree

        for c in t.children:  # node t has child c, so there is an edge t -> c
            ret[t.idx, c.idx] = 1
        queue += t.children  # enqueue the children for traversal

    if not directed:  # this flag decides between a directed and an undirected graph
        ret = ret + ret.T

    if self_loop:  # self-loop edges from each node to itself
        for i in idx:
            ret[i, i] = 1

    return ret
```
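For the unpruned case (prune < 0) with directed=False, the composition head_to_tree + tree_to_adj simply connects every word to its head, so the result can be sanity-checked with a direct construction. head_to_adj below is an illustrative helper, not part of the repo:

```python
import numpy as np

def head_to_adj(head, sent_len, directed=False, self_loop=False):
    # head[i] is the 1-based head of word i+1; 0 marks the root (no edge)
    ret = np.zeros((sent_len, sent_len), dtype=np.float32)
    for i, h in enumerate(head):
        if h > 0:
            ret[h - 1, i] = 1.0  # edge head -> dependent, 0-based indices
    if not directed:
        ret = ret + ret.T
    if self_loop:
        ret[np.arange(sent_len), np.arange(sent_len)] = 1.0
    return ret

head = [5, 5, 5, 5, 0, 8, 8, 5]  # "He was not a relative of Mike Cane"
adj = head_to_adj(head, len(head))
print(int(adj.sum()))  # -> 14: seven edges, each stored symmetrically
```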
Finally, after suitable processing, the dependency-tree adjacency matrices can be fed into the GCN.
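As a rough sketch of what the GCN then does with adj: the paper's layer update is approximately h' = ReLU((A h + h) W / (d + 1)), i.e. aggregate neighbour features, fold in a self-loop, and normalize by node degree. A toy numpy version (random values and made-up dimensions, not the repo's PyTorch implementation, and the bias term is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
b, n, d = 2, 8, 16                 # batch size, sentence length, hidden dim
adj = np.zeros((b, n, n))          # would come from tree_to_adj in practice
adj[:, 0, 1] = adj[:, 1, 0] = 1    # one toy undirected edge
h = rng.standard_normal((b, n, d))
W = rng.standard_normal((d, d)) * 0.1

denom = adj.sum(axis=2, keepdims=True) + 1   # node degree + self-loop
Ax = adj @ h                                 # aggregate neighbour features
out = np.maximum(((Ax + h) @ W) / denom, 0)  # add self node, normalize, ReLU
print(out.shape)  # -> (2, 8, 16)
```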
References:
A detailed walkthrough of dependency trees: https://blog.csdn.net/qq_27590277/article/details/88345017
A detailed explanation of Stanford dependencies: http://wenku.baidu.com/link?url=IfW-hkMfPuK29t49Wa_nO2UAMpP2oGYCUAZuY5PrHHIQHsIm5moH82DMbTA521PMhCC4svgGRSgUTaSkHktw5Ru6RQCCRjwuHfkNVB3mcum
Array concatenation with np.concatenate in numpy: https://www.cnblogs.com/shueixue/p/10953699.html
Variable in PyTorch: https://blog.csdn.net/qq_19329785/article/details/85029116