1. RuntimeError: "exp" not implemented for 'torch.LongTensor'
class PositionalEncoding(nn.Module)
div_term = torch.exp(torch.arange(0., d_model, 2) * -(math.log(10000.0) / d_model))
Change “0” to “0.” in torch.arange.
Otherwise it raises: RuntimeError: "exp" not implemented for 'torch.LongTensor'
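The root cause is that torch.arange with integer arguments returns a LongTensor (int64), and torch.exp has no integer kernel. A minimal sketch of the difference:

```python
import math
import torch

# torch.arange infers its dtype from the arguments:
# an integer start gives a LongTensor, a float start gives a FloatTensor.
int_steps = torch.arange(0, 8, 2)     # dtype: torch.int64
float_steps = torch.arange(0., 8, 2)  # dtype: torch.float32

d_model = 8
# This only works on the float tensor; applied to int_steps it raises
# the "exp" not implemented for 'torch.LongTensor' error.
div_term = torch.exp(float_steps * -(math.log(10000.0) / d_model))
```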
2. RuntimeError: expected type torch.FloatTensor but got torch.LongTensor
class PositionalEncoding(nn.Module)
position = torch.arange(0., max_len).unsqueeze(1)
Change “0” to “0.” in torch.arange.
Otherwise the following line raises an error:
pe[:, 0::2] = torch.sin(position * div_term)
RuntimeError: expected type torch.FloatTensor but got torch.LongTensor
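Putting fixes 1 and 2 together, the buffer construction inside PositionalEncoding looks like this (a sketch with a small d_model and max_len for illustration):

```python
import math
import torch

d_model, max_len = 8, 10

# Float start values ("0.") so both tensors are FloatTensor, not LongTensor.
position = torch.arange(0., max_len).unsqueeze(1)            # shape (max_len, 1)
div_term = torch.exp(torch.arange(0., d_model, 2) *
                     -(math.log(10000.0) / d_model))         # shape (d_model/2,)

pe = torch.zeros(max_len, d_model)
pe[:, 0::2] = torch.sin(position * div_term)  # fails if position is a LongTensor
pe[:, 1::2] = torch.cos(position * div_term)
```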
3. UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
def make_model
nn.init.xavier_uniform_(p)
Change “nn.init.xavier_uniform(p)” to “nn.init.xavier_uniform_(p)”.
Otherwise it warns: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
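For reference, the in-place initializer (note the trailing underscore) is applied to every parameter with more than one dimension, as make_model does. A sketch on a small stand-in model:

```python
import torch.nn as nn

# A toy model standing in for the Transformer built by make_model.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

for p in model.parameters():
    if p.dim() > 1:
        # In-place variant; nn.init.xavier_uniform (no underscore) is deprecated.
        nn.init.xavier_uniform_(p)
```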
4. UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
class LabelSmoothing
self.criterion = nn.KLDivLoss(reduction='sum')
Change “self.criterion = nn.KLDivLoss(size_average=False)” to “self.criterion = nn.KLDivLoss(reduction='sum')”.
Otherwise it warns: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
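A minimal sketch of the replacement: nn.KLDivLoss expects log-probabilities as input and probabilities as target, and reduction='sum' reproduces the old size_average=False behaviour (summing over all elements instead of averaging):

```python
import torch
import torch.nn as nn

criterion = nn.KLDivLoss(reduction='sum')  # replaces size_average=False

# Hypothetical shapes: batch of 3 positions over a 5-token vocabulary.
log_probs = torch.log_softmax(torch.randn(3, 5), dim=-1)  # model output
target = torch.full((3, 5), 1.0 / 5)                      # smoothed target dist.

loss = criterion(log_probs, target)  # 0-dim tensor: the summed KL divergence
```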
5. IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
class SimpleLossCompute
return loss.item() * norm
Change “loss.data[0]” to “loss.item()”.
Otherwise it raises: IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
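Since PyTorch 0.4, a reduced loss is a 0-dim tensor and indexing it with [0] is an error; .item() extracts the Python scalar instead. A quick illustration:

```python
import torch

loss = torch.tensor(2.5)  # a 0-dim tensor, like a fully reduced loss
value = loss.item()       # Python float 2.5
# loss.data[0] would raise:
# IndexError: invalid index of a 0-dim tensor. Use tensor.item() ...
```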
6. floating point exception (core dumped)
Running “A First Example” directly crashes with: floating point exception (core dumped)
A suggested fix is at https://github.com/harvardnlp/annotated-transformer/issues/26: modify the run_epoch function so the counter values are converted to numpy, via .detach().numpy() or .numpy() directly.
I still hit problems after that change, though. The tensors on the GPU must first be moved to the CPU before the numpy conversion.
Below is my adjusted code, which runs correctly:
```python
def run_epoch(data_iter, model, loss_compute, epoch=0):
    "Standard Training and Logging Function"
    start = time.time()
    total_tokens = 0
    total_loss = 0
    tokens = 0
    for i, batch in enumerate(data_iter):
        out = model.forward(batch.src, batch.trg, batch.src_mask, batch.trg_mask)
        loss = loss_compute(out, batch.trg_y, batch.ntokens)

        total_loss += loss.detach().cpu().numpy()
        total_tokens += batch.ntokens.cpu().numpy()
        tokens += batch.ntokens.cpu().numpy()
        if i % 50 == 1:
            elapsed = time.time() - start
            print("Epoch Step: %d Loss: %f Tokens per Sec: %f" %
                  (i, loss.detach().cpu().numpy() / batch.ntokens.cpu().numpy(),
                   tokens / elapsed))
            start = time.time()
            tokens = 0
    return total_loss / total_tokens
```
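The key chain is .detach().cpu().numpy(): .detach() drops the tensor out of the autograd graph, .cpu() moves it off the GPU, and only then can .numpy() succeed. A small sketch of the requirement (on CPU, since a GPU is not assumed here):

```python
import torch

# A tensor inside an autograd graph, like the loss in run_epoch.
t = torch.tensor([1.0, 2.0], requires_grad=True) * 2

# t.numpy() would raise RuntimeError because t requires grad; for a CUDA
# tensor, .numpy() additionally fails until the tensor is moved to the CPU.
arr = t.detach().cpu().numpy()
```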
7. Loss values are all integers
class SimpleLossCompute
When running “A First Example”, all of the displayed loss values are integers, which looked odd. Testing showed the problem is the return value of class SimpleLossCompute: the tensor norm has an integer dtype, so although loss.item() is a Python float, the value of return loss.item() * norm is still an integer tensor.
Fix: convert norm to float before the multiplication:
return loss.item() * norm.float()
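A sketch of the dtype issue: in the PyTorch versions this note targets, multiplying a Python float by an integer tensor could keep the integer dtype and silently truncate, while .float() forces a floating-point result:

```python
import torch

norm = torch.tensor(7)  # batch.ntokens is an integer (Long) tensor
loss_value = 2.5        # loss.item() is a Python float

# Converting norm first guarantees a float result with the fraction preserved.
result = loss_value * norm.float()
```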