1. Principles of LSTM & GRU
https://blog.csdn.net/jerr__y/article/details/58598296
https://github.com/starflyyy/Gated-Recurrent-Unit-GRU
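As a compact reference for what the two links above walk through, the standard GRU update equations are given below (Cho et al. formulation; note that PyTorch's nn.GRU swaps the roles of z_t and 1 - z_t in the last line):

```latex
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) && \text{update gate} \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) && \text{reset gate} \\
\tilde{h}_t &= \tanh\!\big(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\big) && \text{candidate state} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{new hidden state}
\end{aligned}
```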
2. Multi-layer LSTM
PyTorch's RNN modules take a num_layers argument: while parameters are shared across time steps, each layer is a distinct cell, so num_layers is effectively the number of hidden layers. The cells are stacked in series, much like the layers of an MLP; this is the StackedRNN, as shown below.
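A minimal sketch of this stacking behavior (the sizes are made up for illustration): with num_layers=2, layer 1 has its own parameters and consumes layer 0's hidden states at every time step.

```python
import torch
import torch.nn as nn

# Two stacked LSTM layers: layer 1 reads layer 0's h_t at each step,
# the way MLP layers feed each other; parameters are NOT shared across layers.
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)

x = torch.randn(5, 3, 10)          # (seq_len, batch, input_size)
output, (h_n, c_n) = lstm(x)

print(output.shape)                # torch.Size([5, 3, 20]): top layer, every t
print(h_n.shape)                   # torch.Size([2, 3, 20]): last t, every layer
```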
3. Bidirectional RNN
To study further: https://blog.csdn.net/jojozhangju/article/details/51982254
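The gist, in a runnable sketch (sizes are again illustrative): a bidirectional RNN runs one pass left-to-right and one right-to-left with separate parameters, and concatenates the two hidden states at each step.

```python
import torch
import torch.nn as nn

# bidirectional=True doubles num_directions: the forward and backward
# passes each keep their own parameters and hidden states.
birnn = nn.LSTM(input_size=10, hidden_size=20, bidirectional=True)

x = torch.randn(5, 3, 10)          # (seq_len, batch, input_size)
output, (h_n, c_n) = birnn(x)

print(output.shape)                # torch.Size([5, 3, 40]): hidden_size * 2
print(h_n.shape)                   # torch.Size([2, 3, 20]): num_layers * 2 directions
```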
4. Implementing and modifying LSTM & RNN
Source code analysis: https://zhuanlan.zhihu.com/p/63638656
Worked example: https://zhuanlan.zhihu.com/p/32103001; on top of it you can rewrite the RNN cell yourself (see the sketch below): https://github.com/huyingxi/new-LSTM-Cell
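What "rewriting the RNN cell yourself" boils down to, as a hedged sketch (NaiveLSTMCell is a hypothetical name, not the linked repo's code): implement the gate equations in a Module and unroll it over time.

```python
import torch
import torch.nn as nn

class NaiveLSTMCell(nn.Module):
    """Hand-rolled LSTM cell following the standard gate equations."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # One linear map emits all four gates at once: i, f, g, o.
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x, state):
        h, c = state
        i, f, g, o = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)     # cell state update
        h = o * torch.tanh(c)             # new hidden state
        return h, c

# Unrolling the cell by hand reproduces nn.LSTM with num_layers=1.
cell = NaiveLSTMCell(10, 20)
x = torch.randn(5, 3, 10)                 # (seq_len, batch, input_size)
h, c = torch.zeros(3, 20), torch.zeros(3, 20)
outs = []
for t in range(x.size(0)):
    h, c = cell(x[t], (h, c))
    outs.append(h)
output = torch.stack(outs)                # (5, 3, 20)
```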
An RNN in PyTorch returns both output and hidden; the difference is illustrated in the figure below (LSTM case). Specifically:
Outputs: output, (h_n, c_n)
- output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
- h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t=seq_len
- c_n (num_layers * num_directions, batch, hidden_size): tensor containing the cell state for t=seq_len
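A quick check of these shapes, assuming a unidirectional 2-layer LSTM: output holds the top layer's h_t for every t, while h_n holds every layer's h at the final t, so the two views coincide at output[-1] == h_n[-1].

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)
x = torch.randn(5, 3, 10)                 # (seq_len, batch, input_size)
output, (h_n, c_n) = lstm(x)

print(output.shape)                       # torch.Size([5, 3, 20])
print(h_n.shape, c_n.shape)               # torch.Size([2, 3, 20]) each

# output's last time step equals h_n's last layer (unidirectional case).
print(torch.allclose(output[-1], h_n[-1]))  # True
```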
One more implementation: https://github.com/emadRad/lstm-gru-pytorch/, though its multi-layer LSTM is not quite correct.