class torch.nn.RNN(*args, **kwargs)
input_size – The number of expected features in the input x
hidden_size – The number of features in the hidden state h
num_layers – Number of recurrent layers. E.g., setting num_layers=2 would mean stacking two RNNs together to form a stacked RNN, with the second RNN taking in outputs of the first RNN and computing the final results. Default: 1
nonlinearity – The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'
bias – If False, then the layer does not use bias weights b_ih and b_hh. Default: True
batch_first – If True, then the input and output tensors are provided as (batch, seq, feature) instead of (seq, batch, feature). Default: False
dropout – If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to dropout. Default: 0
bidirectional – If True, becomes a bidirectional RNN. Default: False
I kept misunderstanding one of these parameters, which made the whole module hard to grasp.
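First, the sequence length of the RNN is dynamic. It is not a constructor parameter; it is determined by the input tensor passed in at call time.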
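A minimal sketch of this point, with sizes chosen arbitrarily for illustration: the same module is called on two inputs of different lengths, and only the first dimension of output changes.

```python
import torch
import torch.nn as nn

# Illustrative sizes; none of them fix the sequence length.
rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2)

short_seq = torch.randn(3, 4, 10)  # (seq_len=3, batch_size=4, input_size=10)
long_seq = torch.randn(7, 4, 10)   # (seq_len=7, batch_size=4, input_size=10)

out_short, hn_short = rnn(short_seq)
out_long, hn_long = rnn(long_seq)

print(out_short.shape)  # torch.Size([3, 4, 20])
print(out_long.shape)   # torch.Size([7, 4, 20])
print(hn_short.shape, hn_long.shape)  # both torch.Size([2, 4, 20])
```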
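num_layers is not the sequence length but the number of stacked layers: the output of the layer below at each time step becomes the input of the layer above at the same time step.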
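To check the stacking interpretation, the sketch below rebuilds a two-layer RNN by hand from two single-layer RNNs. The names layer0 and layer1 and all sizes are illustrative assumptions; the weights are copied over so the results can be compared.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
stacked = nn.RNN(input_size=5, hidden_size=8, num_layers=2)

# Two single-layer RNNs standing in for the two stacked layers.
layer0 = nn.RNN(input_size=5, hidden_size=8)
layer1 = nn.RNN(input_size=8, hidden_size=8)  # consumes layer0's hidden states

# Copy the stacked module's per-layer weights into the single-layer modules.
with torch.no_grad():
    for name in ("weight_ih", "weight_hh", "bias_ih", "bias_hh"):
        getattr(layer0, f"{name}_l0").copy_(getattr(stacked, f"{name}_l0"))
        getattr(layer1, f"{name}_l0").copy_(getattr(stacked, f"{name}_l1"))

x = torch.randn(6, 3, 5)  # (seq_len, batch_size, input_size)
with torch.no_grad():
    out_stacked, hn_stacked = stacked(x)
    mid, h_first = layer0(x)            # per-time-step outputs of the first layer
    out_manual, h_second = layer1(mid)  # fed step by step into the second layer

same_output = torch.allclose(out_stacked, out_manual, atol=1e-5)
same_hidden = torch.allclose(hn_stacked, torch.cat([h_first, h_second]), atol=1e-5)
print(same_output, same_hidden)
```

If both comparisons hold, the stacked module behaves like the two layers chained along the depth axis, independent of seq_len.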
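When called, the RNN object takes input of shape (seq_len, batch_size, input_dim) and h0 of shape (num_layers * directions, batch_size, hidden_dim), assuming the default batch_first=False. Here input_dim is input_size, hidden_dim is hidden_size, and directions is 2 if bidirectional is True, otherwise 1.
The seq_len of input determines the sequence length. h0 supplies the initial hidden state for every layer, so its first dimension must agree with the RNN's num_layers (multiplied by directions).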
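A short sketch of building h0 with a matching shape, and of what happens when the first dimension does not match. The sizes and the bad_h0 name are illustrative; the exact error message may differ between PyTorch versions.

```python
import torch
import torch.nn as nn

num_layers, hidden_size, batch_size = 2, 20, 4
rnn = nn.RNN(input_size=10, hidden_size=hidden_size, num_layers=num_layers)

x = torch.randn(5, batch_size, 10)  # (seq_len, batch_size, input_size)
directions = 2 if rnn.bidirectional else 1
h0 = torch.zeros(num_layers * directions, batch_size, hidden_size)
output, hn = rnn(x, h0)

# An h0 built for a single layer does not fit a two-layer RNN.
bad_h0 = torch.zeros(1, batch_size, hidden_size)
try:
    rnn(x, bad_h0)
    mismatch_rejected = False
except RuntimeError as err:
    mismatch_rejected = True
    print("rejected:", err)
```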
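The call returns two values: output and hn.
hn has shape (num_layers * directions, batch_size, hidden_dim). It is the output on the right side of the unrolled RNN, i.e. the final hidden state of each layer; if the RNN is bidirectional, there is also an output on the left side, the final hidden state of the backward direction.
output has shape (seq_len, batch_size, hidden_dim * directions). It is the output on the top of the RNN, i.e. the hidden states of the last layer at every time step.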
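The relationship between the two return values can be checked directly. This sketch assumes the default batch_first=False and illustrative sizes: for a unidirectional RNN, the last time step of output equals the last layer's entry in hn, while for a bidirectional RNN the forward half ends at the last time step and the backward half ends at the first.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
H = 6                        # hidden_size
x = torch.randn(5, 3, 4)     # (seq_len, batch_size, input_size)

# Unidirectional: top output at the last step == final hidden state of the top layer.
uni = nn.RNN(input_size=4, hidden_size=H, num_layers=2)
out_u, hn_u = uni(x)
uni_match = torch.allclose(out_u[-1], hn_u[-1])

# Bidirectional: hn is ordered (layer0 fwd, layer0 bwd, layer1 fwd, layer1 bwd).
bi = nn.RNN(input_size=4, hidden_size=H, num_layers=2, bidirectional=True)
out_b, hn_b = bi(x)
fwd_match = torch.allclose(out_b[-1, :, :H], hn_b[-2])  # forward ends on the right
bwd_match = torch.allclose(out_b[0, :, H:], hn_b[-1])   # backward ends on the left

print(out_b.shape, hn_b.shape)  # torch.Size([5, 3, 12]) torch.Size([4, 3, 6])
print(uni_match, fwd_match, bwd_match)
```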
