Transformer


We finally arrive at the Transformer. The pieces from the previous sections are mostly done, so what's left is stacking those building blocks into a model. First, let's look at the Transformer model; OK, it seems to be the same familiar set of components.

[Figure: the Transformer architecture]

The Transformer is an architecture based purely on attention, but it still follows the encoder-decoder design from before.

Layer normalization

[Figure: layer normalization]

Layer normalization is used here, and it differs from the batch normalization we saw earlier.

I referred to the torch documentation here:

[Screenshots: torch.nn.LayerNorm documentation]

N is the batch-size dimension; LayerNorm normalizes each vector of the sequence within a sample (i.e., over the feature dimension), independently of the other samples in the batch.

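To make the difference concrete, here is a minimal sketch (the tensor shapes are made up for illustration, not taken from the code below): nn.LayerNorm normalizes each position's feature vector on its own, while nn.BatchNorm1d normalizes each feature channel across the whole batch.

import torch
import torch.nn as nn

X = torch.randn(2, 3, 4)  # (batch N, sequence length L, features d)

ln = nn.LayerNorm(4)      # normalizes each d-dimensional vector independently
bn = nn.BatchNorm1d(4)    # normalizes each feature across the batch (and sequence)

Y_ln = ln(X)                                    # shape (2, 3, 4)
Y_bn = bn(X.permute(0, 2, 1)).permute(0, 2, 1)  # BatchNorm1d expects (N, C, L)

print(Y_ln.mean(dim=-1))      # ~0 for every (sample, position): per-vector statistics
print(Y_bn.mean(dim=(0, 1)))  # ~0 for every feature: per-channel statistics over the batch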

Encoder

import torch
import torch.nn as nn
import torch.nn.functional as F
from d2l import torch as d2l

class add_norm(nn.Module):
    def __init__(self, norm_shape, dropout=0):
        super(add_norm, self).__init__()
        self.norm = nn.LayerNorm(norm_shape)
        self.dropout = nn.Dropout(dropout)
    
    def forward(self, X, Y):
        return self.norm(X + self.dropout(Y))  # assumes X and Y have the same shape

class EncoderBlock(nn.Module):
    def __init__(self, embed_dim, norm_shape):
        super(EncoderBlock, self).__init__()
        self.add_norm1 = add_norm(norm_shape=norm_shape)
        self.attention = nn.MultiheadAttention(embed_dim, 8, batch_first=True)  # batch_first=True so inputs are (batch, seq, embed_dim)
        self.ffn = nn.Sequential(nn.Linear(embed_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, embed_dim))
        self.add_norm2 = add_norm(norm_shape=norm_shape)
        
        
    def forward(self, X): 
        Y,_ = self.attention(X, X, X)
        X = self.add_norm1(X, Y)
        Y = self.ffn(X)
        X = self.add_norm2(X, Y)
        return X
    
class Encoder(nn.Module):
    def __init__(self, embed_dim, norm_shape, num_block) -> None:
        super(Encoder, self).__init__()
        self.pos_encoding = d2l.PositionalEncoding(embed_dim, dropout=0)
        # Wrap in nn.ModuleList so the blocks' parameters are registered with the Encoder.
        self.EncoderBlocks = nn.ModuleList([EncoderBlock(embed_dim, norm_shape) for _ in range(num_block)])
    
    def forward(self, X):
        X = self.pos_encoding(X)  
        for i in range(len(self.EncoderBlocks)):
            X = self.EncoderBlocks[i](X)
        return X

model = Encoder(128, [35, 128], 2)  # embed_dim=128, norm_shape over the last two dims (seq_len=35, embed_dim=128), 2 blocks
s = torch.zeros((64, 35, 128))      # (batch_size, seq_len, embed_dim)
s = model(s)                        # output shape: (64, 35, 128)

I implemented an encoder with torch. I don't feel like writing the decoder, so I'm leaving it at that; whatever, from now on I'll just call the framework.

[Screenshots: implementation using the framework's built-in modules]

Implemented it directly with the framework; take it or leave it.
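As a rough sketch of what "just calling the framework" can look like (my own minimal example with made-up hyperparameters; not necessarily what the screenshots showed), PyTorch provides nn.TransformerEncoderLayer, nn.TransformerDecoderLayer and the corresponding stack modules:

import torch
import torch.nn as nn

# Encoder/decoder stacks built from torch's built-in layers (d_model=128, nhead=8,
# 2 layers, batch=64, src_len=35, tgt_len=30 are illustrative values only).
enc_layer = nn.TransformerEncoderLayer(d_model=128, nhead=8, dim_feedforward=512, batch_first=True)
encoder = nn.TransformerEncoder(enc_layer, num_layers=2)

dec_layer = nn.TransformerDecoderLayer(d_model=128, nhead=8, dim_feedforward=512, batch_first=True)
decoder = nn.TransformerDecoder(dec_layer, num_layers=2)

src = torch.zeros((64, 35, 128))  # (batch, src_len, d_model)
tgt = torch.zeros((64, 30, 128))  # (batch, tgt_len, d_model)

memory = encoder(src)             # (64, 35, 128)

# Causal mask: entries above the diagonal are -inf, so each target position
# can only attend to itself and earlier positions.
tgt_mask = torch.triu(torch.full((30, 30), float('-inf')), diagonal=1)
out = decoder(tgt, memory, tgt_mask=tgt_mask)  # (64, 30, 128)

Note that these built-in layers do not add token embeddings or positional encodings; those still have to be applied to src and tgt beforehand.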

