AFM model: PyTorch example code


1. PyTorch implementation of the AFM model.

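$\hat{y}_{AFM}=w_{0} + \sum_{i=1}^{n}w_{i}x_{i}+p^{T}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}a_{ij}(v_{i}v_{j})x_{i}x_{j}$

$a_{ij}^{'}=h^{T}\mathrm{ReLU}(W(v_{i}v_{j})x_{i}x_{j}+b)$

$a_{ij}=\frac{\exp(a_{ij}^{'})}{\sum_{i,j}\exp(a_{ij}^{'})}$

Here $(v_{i}v_{j})$ denotes the element-wise product of the embedding vectors $v_{i}$ and $v_{j}$, $a_{ij}$ is the attention weight of the feature pair $(i, j)$, and the softmax is taken over all pairs with $i < j$.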

(With real data the batches come from a DataLoader, which needs batch_size and other parameters to be set.)

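Suppose the raw data has num_fields = 3 features, which together occupy 30 dimensions after one-hot encoding, and the embedding dimension is ebd_size = 4. The embedding layer is then defined as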

ebd_size = 4
ebd = nn.Embedding(30, ebd_size)  # 30 one-hot dimensions, each mapped to a 4-dimensional vector
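To make the link between the one-hot encoding and the embedding layer concrete, the following minimal sketch checks that an embedding lookup gives the same result as multiplying explicit one-hot vectors by the embedding weight matrix. The two sample rows and the names `idx`, `looked_up` and `via_matmul` are illustrative assumptions, not part of the original code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the randomly initialised weights are reproducible

ebd_size = 4
ebd = nn.Embedding(30, ebd_size)

# Two hypothetical samples, each holding one active index per field (3 fields).
idx = torch.tensor([[1, 13, 22], [0, 18, 29]])

looked_up = ebd(idx)  # shape: 2 * 3 * 4

# The same result via explicit one-hot vectors multiplied by the weight matrix.
one_hot = F.one_hot(idx, num_classes=30).float()  # shape: 2 * 3 * 30
via_matmul = one_hot @ ebd.weight                  # shape: 2 * 3 * 4

print(looked_up.shape)
print(torch.allclose(looked_up, via_matmul))
```

The lookup form is what the rest of the walk-through uses; the matrix form is shown only to explain why 30 is the first argument of nn.Embedding.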

Define one batch of data by hand (batch_size = 5):

# shape: batch_size * num_fields; torch.autograd.Variable is no longer needed, a plain tensor is enough
x_ = torch.tensor([[1, 13, 22], [0, 18, 29], [2, 13, 27], [0, 11, 22], [1, 14, 26]])
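As noted earlier, real data would normally arrive through a DataLoader rather than a hand-written batch. The sketch below shows one possible setup, assuming the feature indices fit in a single tensor; the 0/1 labels are made up purely for illustration, and batch_size=2 is an arbitrary choice.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# The five samples from the text, plus made-up 0/1 labels (illustration only).
features = torch.tensor([[1, 13, 22], [0, 18, 29], [2, 13, 27], [0, 11, 22], [1, 14, 26]])
labels = torch.tensor([1.0, 0.0, 1.0, 0.0, 1.0])

loader = DataLoader(TensorDataset(features, labels), batch_size=2, shuffle=False)

# Collect the shape of every batch the loader yields.
batch_shapes = []
for x_batch, y_batch in loader:
    batch_shapes.append((tuple(x_batch.shape), tuple(y_batch.shape)))

print(batch_shapes)
```

With five samples and batch_size=2 the last batch holds a single sample, so any code consuming the batches should not hard-code the batch dimension.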

Look up the corresponding embedding vectors:

x = ebd(x_)  # batch_size * num_fields * embedding dimension

Compute the cross features:

$(v_{i}v_{j})x_{i}x_{j}$

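Number of cross features = num_fields * (num_fields - 1) / 2

inner_product has shape batch_size * number of cross features * embedding dimension. Despite its name, it holds the element-wise product of each embedding pair, not a scalar inner product.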

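num_fields = x.shape[1]
row, col = [], []
# enumerate every field pair (i, j) with i < j
for i in range(num_fields - 1):
    for j in range(i + 1, num_fields):
        row.append(i)
        col.append(j)
p, q = x[:, row], x[:, col]  # each: batch_size * number of cross features * embedding dimension
inner_product = p * q  # element-wise product of each embedding pair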
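The pair enumeration can also be written with itertools.combinations, which makes it easy to confirm the num_fields * (num_fields - 1) / 2 count. This is a small sketch; the helper name `pair_indices` is hypothetical and does not appear in the original code.

```python
from itertools import combinations


def pair_indices(num_fields):
    """Return two index lists (row, col) covering every pair i < j."""
    pairs = list(combinations(range(num_fields), 2))
    row = [i for i, _ in pairs]
    col = [j for _, j in pairs]
    return row, col


row, col = pair_indices(3)
print(row, col)  # [0, 0, 1] [1, 2, 2]

# The number of pairs matches n * (n - 1) / 2 for several field counts.
for n in (2, 3, 5, 10):
    r, _ = pair_indices(n)
    assert len(r) == n * (n - 1) // 2
```

The resulting lists are in the same order as those built by the nested loops, so they can be used to index the embedding tensor in exactly the same way.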

Next, compute

$\mathrm{ReLU}(W(v_{i}v_{j})x_{i}x_{j}+b)$

This takes one nn.Linear layer followed by a ReLU activation.

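The output of attention(inner_product) has shape batch_size * number of cross features * embedding dimension (the attention layer's output size is set equal to ebd_size here).

attention = torch.nn.Linear(ebd_size, ebd_size)
print(attention(inner_product))  # batch_size * number of cross features * embedding dimension
attn_scores = F.relu(attention(inner_product))
print("attn_scores", attn_scores)  # batch_size * number of cross features * embedding dimension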

Next, one more linear layer gives $a_{ij}^{'}$:

$a_{ij}^{'}=h^{T}\mathrm{ReLU}(W(v_{i}v_{j})x_{i}x_{j}+b)$

projection = torch.nn.Linear(ebd_size, 1)
print("projection(attn_scores)", projection(attn_scores))  # batch_size * number of cross features * 1

A softmax over the cross-feature dimension then gives

$a_{ij}=\frac{\exp(a_{ij}^{'})}{\sum_{i,j}\exp(a_{ij}^{'})}$

attn_scores = F.softmax(projection(attn_scores), dim=1)
print("attn_scores", attn_scores)  # batch_size * number of cross features * 1
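The choice of dim=1 matters: the weights have to be normalised across the cross features of each sample. The sketch below, which uses random scores as a stand-in for the projection output (an assumption for illustration), checks that the weights of every sample sum to 1 and shows what goes wrong with the last axis.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for the raw scores a'_ij: batch_size=5, 3 cross features, 1 score each.
raw_scores = torch.randn(5, 3, 1)

weights = F.softmax(raw_scores, dim=1)  # normalise across the 3 pairs
per_sample_total = weights.sum(dim=1)   # shape: 5 * 1, every entry close to 1

print(per_sample_total.squeeze(-1))

# With dim=-1 the softmax runs over the trailing size-1 axis,
# so every weight becomes exactly 1 and the attention is lost.
wrong = F.softmax(raw_scores, dim=-1)
print(torch.all(wrong == 1.0))
```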

Next, multiply each cross feature $(v_{i}v_{j})x_{i}x_{j}$ by its attention weight $a_{ij}$ and sum over all pairs:

print("attn_scores * inner_product", attn_scores * inner_product)  # batch_size * number of cross features * embedding dimension
attn_output = torch.sum(attn_scores * inner_product, dim=1)
print("attn_output", attn_output)  # batch_size * embedding dimension

Finally, pass the result through a fully connected layer with output size 1:

fc = torch.nn.Linear(ebd_size, 1)
fc_out = fc(attn_output)
print("fc_out", fc_out)  # batch_size * 1

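This gives $p^{T}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}a_{ij}(v_{i}v_{j})x_{i}x_{j}$. The first-order part $w_{0} + \sum_{i=1}^{n}w_{i}x_{i}$ can be obtained with a single Linear layer applied to the one-hot input.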
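One possible way to compute the first-order part is sketched here. Instead of a Linear layer on the full 30-dimensional one-hot vector, it assumes an nn.Embedding with output size 1, which is equivalent when each field has exactly one active entry with value 1. The names `first_order_weights` and `w0` are illustrative, and the second-order term is replaced by random values so the block runs on its own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

x_ = torch.tensor([[1, 13, 22], [0, 18, 29], [2, 13, 27], [0, 11, 22], [1, 14, 26]])

# One scalar weight w_i per one-hot dimension, plus the global bias w_0.
first_order_weights = nn.Embedding(30, 1)
w0 = nn.Parameter(torch.zeros(1))

# Active one-hot entries have x_i = 1, so sum_i w_i * x_i is the sum of the looked-up weights.
first_order = w0 + first_order_weights(x_).sum(dim=1)  # shape: batch_size * 1

# Placeholder for the second-order term (fc_out in the text); random values for illustration only.
second_order = torch.randn(5, 1)

y_hat = first_order + second_order  # shape: batch_size * 1
print(y_hat.shape)
```

For a classification target, a sigmoid would typically be applied to y_hat before computing the loss; that step is outside the scope of this walk-through.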

Reference code:

import torch
import torch.nn as nn
import torch.nn.functional as F

ebd_size = 4
ebd = nn.Embedding(30, ebd_size)
x_ = torch.tensor([[1, 13, 22], [0, 18, 29], [2, 13, 27], [0, 11, 22], [1, 14, 26]])  # batch_size * num_fields
x = ebd(x_)  # batch_size * num_fields * embedding dimension
num_fields = x.shape[1]
row, col = [], []
# enumerate every field pair (i, j) with i < j
for i in range(num_fields - 1):
    for j in range(i + 1, num_fields):
        row.append(i)
        col.append(j)
p, q = x[:, row], x[:, col]
inner_product = p * q  # element-wise product of each embedding pair
print("inner_product", inner_product)  # batch_size * number of cross features * embedding dimension
attention = torch.nn.Linear(ebd_size, ebd_size)
print(attention(inner_product))  # batch_size * number of cross features * embedding dimension
attn_scores = F.relu(attention(inner_product))
print("attn_scores", attn_scores)  # batch_size * number of cross features * embedding dimension
projection = torch.nn.Linear(ebd_size, 1)
print("projection(attn_scores)", projection(attn_scores))  # batch_size * number of cross features * 1
attn_scores = F.softmax(projection(attn_scores), dim=1)  # normalise over the cross features
print("attn_scores", attn_scores)  # batch_size * number of cross features * 1
print("attn_scores * inner_product", attn_scores * inner_product)  # batch_size * number of cross features * embedding dimension
attn_output = torch.sum(attn_scores * inner_product, dim=1)
print("attn_output", attn_output)  # batch_size * embedding dimension
fc = torch.nn.Linear(ebd_size, 1)
fc_out = fc(attn_output)
print("fc_out", fc_out)  # batch_size * 1
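For reuse in a training loop, the steps of the script can be collected into an nn.Module. The following is a sketch under stated assumptions rather than a definitive implementation: the class name `AFM`, its constructor arguments and the separate `attn_size` are illustrative, the first-order part uses an embedding of size 1, and the projection and output layers drop their biases to mirror $h^{T}$ and $p^{T}$ in the formula (the script above keeps the default biases).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AFM(nn.Module):
    """Sketch of the AFM forward pass; names and defaults are illustrative."""

    def __init__(self, num_features, num_fields, ebd_size, attn_size):
        super().__init__()
        self.ebd = nn.Embedding(num_features, ebd_size)          # v_i
        self.first_order = nn.Embedding(num_features, 1)         # w_i
        self.bias = nn.Parameter(torch.zeros(1))                 # w_0
        self.attention = nn.Linear(ebd_size, attn_size)          # W and b
        self.projection = nn.Linear(attn_size, 1, bias=False)    # h
        self.fc = nn.Linear(ebd_size, 1, bias=False)             # p
        # Indices of all field pairs (i, j) with i < j, stored as buffers
        # so they move together with the module across devices.
        row, col = torch.triu_indices(num_fields, num_fields, offset=1)
        self.register_buffer("row", row)
        self.register_buffer("col", col)

    def forward(self, x_):
        x = self.ebd(x_)                                         # batch * fields * ebd
        inner_product = x[:, self.row] * x[:, self.col]          # batch * pairs * ebd
        attn_scores = F.relu(self.attention(inner_product))      # batch * pairs * attn
        attn_scores = F.softmax(self.projection(attn_scores), dim=1)  # batch * pairs * 1
        attn_output = torch.sum(attn_scores * inner_product, dim=1)   # batch * ebd
        linear_part = self.bias + self.first_order(x_).sum(dim=1)     # batch * 1
        return linear_part + self.fc(attn_output)                # batch * 1


torch.manual_seed(0)
model = AFM(num_features=30, num_fields=3, ebd_size=4, attn_size=4)
x_ = torch.tensor([[1, 13, 22], [0, 18, 29], [2, 13, 27], [0, 11, 22], [1, 14, 26]])
y_hat = model(x_)
print(y_hat.shape)
```

Keeping `attn_size` separate from `ebd_size` allows the attention network to be sized independently; setting both to 4 reproduces the shapes printed by the script.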

 

