Usage of RNN, RNNCell, LSTM, LSTMCell, GRU, and GRUCell in PyTorch


First, of course, everything is covered in the official docs:

RNN: https://pytorch.org/docs/stable/generated/torch.nn.RNN.html

RNNCell: https://pytorch.org/docs/stable/generated/torch.nn.RNNCell.html

LSTM: https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html

LSTMCell: https://pytorch.org/docs/stable/generated/torch.nn.LSTMCell.html

GRU: https://pytorch.org/docs/stable/generated/torch.nn.GRU.html

GRUCell: https://pytorch.org/docs/stable/generated/torch.nn.GRUCell.html

These are just my own notes.

Taking LSTM and LSTMCell as examples.

LSTM structure

[Figure: the LSTM architecture, with the dimension definitions of its inputs, outputs, and weights]

LSTM parameters:

  • input_size: the number of features in the input x
  • hidden_size: the number of features in the hidden state h
  • num_layers: number of stacked layers, default 1
  • batch_first: if True, tensors are (batch, seq, feature); otherwise (seq, batch, feature); default False (see the sketch after this list)
  • bidirectional: default False
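
A small sketch of my own showing what batch_first changes; note that h_n keeps the (num_layers, batch, hidden_size) layout either way:

>>> lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, batch_first=True)
>>> x = torch.randn(3, 5, 10)   # (batch=3, seq=5, feature=10) since batch_first=True
>>> output, (hn, cn) = lstm(x)  # h_0/c_0 omitted, so they default to zeros
>>> output.shape                # (batch, seq, hidden_size)
torch.Size([3, 5, 20])
>>> hn.shape                    # still (num_layers, batch, hidden_size)
torch.Size([2, 3, 20])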

Inputs:

  • input: tensor of shape (L, N, H_in) when batch_first=False, otherwise (N, L, H_in)
  • h_0: tensor of shape (D*num_layers, N, H_out); defaults to zeros if (h_0, c_0) is not provided
  • c_0: tensor of shape (D*num_layers, N, H_cell); defaults to zeros if (h_0, c_0) is not provided (a quick check follows this list)
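
A small check of my own that omitting (h_0, c_0) is indeed the same as passing zeros:

>>> lstm = nn.LSTM(10, 20, 2)
>>> x = torch.randn(5, 3, 10)
>>> out_default, _ = lstm(x)    # no (h_0, c_0) given
>>> out_zeros, _ = lstm(x, (torch.zeros(2, 3, 20), torch.zeros(2, 3, 20)))
>>> torch.allclose(out_default, out_zeros)
True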

where:

N = batch size

L = sequence length

D = 2 if bidirectional=True otherwise 1

H_in = input_size

H_cell = hidden_size

H_out = proj_size if proj_size > 0, otherwise hidden_size; usually this is just hidden_size (a proj_size sketch follows below)
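
A hedged sketch of the proj_size case (proj_size was added in newer PyTorch versions and must be smaller than hidden_size); it shrinks H_out while H_cell stays hidden_size:

>>> lstm = nn.LSTM(input_size=10, hidden_size=20, proj_size=15)
>>> output, (hn, cn) = lstm(torch.randn(5, 3, 10))
>>> output.shape, hn.shape, cn.shape
(torch.Size([5, 3, 15]), torch.Size([1, 3, 15]), torch.Size([1, 3, 20]))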

Outputs:

  • output: tensor of shape (L, N, D*H_out) when batch_first=False; a length-L sequence [h_1[-1], h_2[-1], ..., h_L[-1]], i.e. the last layer's hidden state at every time step (checked in the sketch after this list)
  • h_n: tensor of shape (D*num_layers, N, H_out)
  • c_n: tensor of shape (D*num_layers, N, H_cell)
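
A quick check of the relationship noted above: for a unidirectional LSTM, output[-1] is the final layer's hidden state at the last time step, i.e. it equals h_n[-1]:

>>> lstm = nn.LSTM(10, 20, 2)
>>> output, (hn, cn) = lstm(torch.randn(5, 3, 10))
>>> torch.allclose(output[-1], hn[-1])
True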

Variables:

This seems to have changed in newer versions.

  • all_weights
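
In current versions the docs list the per-layer tensors weight_ih_l[k], weight_hh_l[k], bias_ih_l[k], bias_hh_l[k] (all_weights still exists as an attribute collecting them). An easy way to inspect them:

>>> lstm = nn.LSTM(10, 20, 2)
>>> for name, p in lstm.named_parameters():
...     print(name, tuple(p.shape))
weight_ih_l0 (80, 10)
weight_hh_l0 (80, 20)
bias_ih_l0 (80,)
bias_hh_l0 (80,)
weight_ih_l1 (80, 20)
weight_hh_l1 (80, 20)
bias_ih_l1 (80,)
bias_hh_l1 (80,)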

Examples:

>>> rnn = nn.LSTM(10, 20, 2)                 # (input_size, hidden_size, num_layers)
>>> input = torch.randn(5, 3, 10)            # (time_steps, batch, input_size)
>>> h0 = torch.randn(2, 3, 20)               # (num_layers, batch, hidden_size)
>>> c0 = torch.randn(2, 3, 20)               # same shape as h0
>>> output, (hn, cn) = rnn(input, (h0, c0))  # output: (time_steps, batch, hidden_size)
>>> # note: output[-1] == hn[-1], the last layer's final hidden state

 

LSTM Cell

An LSTM cell is a single unit of an LSTM; many LSTM cells, chained over time steps (and stacked over layers), make up an LSTM.

Structure

[Figure: the structure of a single LSTM cell]

Compared with LSTM, there is no time dimension t: a cell computes a single time step.

Parameters:

  • only input_size and hidden_size; there is no num_layers

Inputs:

  • input: (batch, input_size)
  • h_0: (batch, hidden_size)
  • c_0: (batch, hidden_size)

Outputs:

  • h_1: (batch, hidden_size)
  • c_1: (batch, hidden_size)

Variables:

  • weight_ih: input-hidden weights, of shape (4*hidden_size, input_size); the weights left-multiply the input (W * input), and the four gates' matrices (i, f, g, o) are stacked, hence the 4*hidden_size
  • weight_hh: hidden-hidden weights, of shape (4*hidden_size, hidden_size)
  • bias_ih: input-hidden bias, of shape (4*hidden_size)
  • bias_hh: hidden-hidden bias, of shape (4*hidden_size)
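
A quick way to confirm these shapes:

>>> cell = nn.LSTMCell(10, 20)
>>> cell.weight_ih.shape, cell.weight_hh.shape
(torch.Size([80, 10]), torch.Size([80, 20]))
>>> cell.bias_ih.shape, cell.bias_hh.shape
(torch.Size([80]), torch.Size([80]))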

Example:

>>> rnn = nn.LSTMCell(10, 20)      # (input_size, hidden_size)
>>> input = torch.randn(2, 3, 10)  # (time_steps, batch, input_size)
>>> hx = torch.randn(3, 20)        # (batch, hidden_size)
>>> cx = torch.randn(3, 20)
>>> output = []
>>> for i in range(input.size(0)):        # step through time manually
...     hx, cx = rnn(input[i], (hx, cx))  # one time step per call
...     output.append(hx)
>>> output = torch.stack(output, dim=0)   # (time_steps, batch, hidden_size)
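
A minimal equivalence sketch of my own (not from the docs): a single-layer nn.LSTM loaded with the same weights reproduces the LSTMCell loop above, which is the sense in which many cells make up an LSTM:

>>> cell = nn.LSTMCell(10, 20)
>>> lstm = nn.LSTM(10, 20)                # num_layers=1
>>> lstm.load_state_dict({'weight_ih_l0': cell.weight_ih,
...                       'weight_hh_l0': cell.weight_hh,
...                       'bias_ih_l0': cell.bias_ih,
...                       'bias_hh_l0': cell.bias_hh})
<All keys matched successfully>
>>> x = torch.randn(5, 3, 10)             # (time_steps, batch, input_size)
>>> hx, cx = torch.zeros(3, 20), torch.zeros(3, 20)
>>> outs = []
>>> for t in range(5):                    # unroll the cell over time
...     hx, cx = cell(x[t], (hx, cx))
...     outs.append(hx)
>>> out_lstm, _ = lstm(x)                 # zero initial states by default
>>> torch.allclose(torch.stack(outs), out_lstm, atol=1e-6)
True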

 

