Notes on several parameters of the PyTorch RNN layer API

Reposted article. 2018-05-04 16:19

class torch.nn.RNN(*args, **kwargs)

input_size – The number of expected features in the input x

hidden_size – The number of features in the hidden state h

num_layers – Number of recurrent layers. E.g., setting num_layers=2 would mean stacking two RNNs together to form a stacked RNN, with the second RNN taking in outputs of the first RNN and computing the final results. Default: 1

nonlinearity – The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'

bias – If False, then the layer does not use bias weights b_ih and b_hh. Default: True

batch_first – If True, then the input and output tensors are provided as (batch, seq, feature) instead of (seq, batch, feature). Default: False

dropout – If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to dropout. Default: 0

bidirectional – If True, becomes a bidirectional RNN. Default: False
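As a minimal sketch of how these constructor arguments fit together, the following builds an RNN with batch_first=True and inspects the resulting shapes. The sizes (10, 20, 3, 5) are arbitrary values chosen for illustration and do not come from the original article. Note that batch_first changes only the layout of the input and output tensors; the hidden state keeps its (num_layers * num_directions, batch, hidden_size) layout.

```python
import torch
import torch.nn as nn

# Arbitrary illustrative sizes (assumptions, not from the article).
rnn = nn.RNN(
    input_size=10,
    hidden_size=20,
    num_layers=2,
    nonlinearity="tanh",
    bias=True,
    batch_first=True,
    dropout=0.0,
    bidirectional=False,
)

x = torch.randn(3, 5, 10)  # (batch, seq, feature) because batch_first=True
output, h_n = rnn(x)       # the initial hidden state defaults to zeros

print(output.shape)  # torch.Size([3, 5, 20]) -> (batch, seq, hidden_size)
print(h_n.shape)     # torch.Size([2, 3, 20]) -> (num_layers, batch, hidden_size)

# With bias=True each layer owns b_ih and b_hh in addition to its weights.
param_names = [name for name, _ in rnn.named_parameters()]
print(param_names)
```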

For a long time I misunderstood one of these parameters, which made the whole API hard to reason about.

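First, the sequence length of the RNN is dynamic. It is not a constructor parameter; it is determined by the input tensor passed in at call time.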
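A short sketch of this point: the same RNN object, constructed once, accepts inputs of different sequence lengths. The sizes used here are arbitrary illustrative values.

```python
import torch
import torch.nn as nn

# No sequence length appears anywhere in the constructor.
rnn = nn.RNN(input_size=4, hidden_size=8)

short_seq = torch.randn(3, 2, 4)   # seq_len = 3,  batch = 2
long_seq = torch.randn(50, 2, 4)   # seq_len = 50, batch = 2

out_short, _ = rnn(short_seq)
out_long, _ = rnn(long_seq)

print(out_short.shape)  # torch.Size([3, 2, 8])
print(out_long.shape)   # torch.Size([50, 2, 8])
```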

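num_layers, in turn, is not the sequence length of the RNN. It is the number of stacked layers: the output of one layer at each time step becomes the input of the next layer at the same time step.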
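One way to check this reading of num_layers is to rebuild a two-layer RNN by hand from two single-layer RNNs that share its weights. This is only a sketch under the assumption of the default settings (dropout=0, unidirectional); the helper names layer0 and layer1 are made up for the illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
stacked = nn.RNN(input_size=4, hidden_size=8, num_layers=2)

# Two separate single-layer RNNs (hypothetical names) that will mimic
# layer 0 and layer 1 of the stacked module.
layer0 = nn.RNN(input_size=4, hidden_size=8, num_layers=1)
layer1 = nn.RNN(input_size=8, hidden_size=8, num_layers=1)

# Copy the stacked module's per-layer parameters into the single layers.
with torch.no_grad():
    for kind in ("weight_ih", "weight_hh", "bias_ih", "bias_hh"):
        getattr(layer0, f"{kind}_l0").copy_(getattr(stacked, f"{kind}_l0"))
        getattr(layer1, f"{kind}_l0").copy_(getattr(stacked, f"{kind}_l1"))

x = torch.randn(6, 2, 4)  # (seq_len, batch, input_size)
with torch.no_grad():
    expected, _ = stacked(x)
    mid, _ = layer0(x)       # per-time-step outputs of the first layer
    manual, _ = layer1(mid)  # ...fed as per-time-step inputs to the second

same = torch.allclose(expected, manual, atol=1e-5)
print(same)
```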

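When called, the RNN object takes two arguments. With the default batch_first=False, input has shape (seq_len, batch_size, input_dim) and h0 has shape (num_layers * directions, batch_size, hidden_dim), where directions is 2 for a bidirectional RNN and 1 otherwise.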

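Here the seq_len of input determines the sequence length. h0 is the initial hidden state supplied to each RNN layer, so its num_layers dimension must match the num_layers the RNN was constructed with.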
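The sketch below passes an explicit h0 of the expected shape, then tries an h0 whose first dimension does not match num_layers. The sizes are arbitrary, and the exact wording of the error message may differ between PyTorch versions.

```python
import torch
import torch.nn as nn

num_layers, directions = 2, 1
batch_size, hidden_dim = 3, 8
rnn = nn.RNN(input_size=4, hidden_size=hidden_dim, num_layers=num_layers)

x = torch.randn(5, batch_size, 4)  # (seq_len, batch_size, input_dim)
h0 = torch.zeros(num_layers * directions, batch_size, hidden_dim)
output, hn = rnn(x, h0)
print(output.shape, hn.shape)

# An h0 built for a single layer does not fit a two-layer RNN.
bad_h0 = torch.zeros(1, batch_size, hidden_dim)
try:
    rnn(x, bad_h0)
    mismatch_rejected = False
except RuntimeError as err:
    mismatch_rejected = True
    print("rejected:", err)
```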

The call returns two values: output and hn.

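hn has shape (num_layers * directions, batch_size, hidden_dim). It is the "right-side" output of the RNN, that is, the final hidden state of every layer after the last time step. In the bidirectional case there is also a "left-side" output: the final state of the backward direction, which is produced at the first time step.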

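output has shape (seq_len, batch_size, hidden_dim * directions). It is the "top-side" output of the RNN, that is, the hidden state of the last layer at every time step.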
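To make the relation between output and hn concrete, here is a small sketch with a two-layer bidirectional RNN (sizes are arbitrary). It assumes the documented layout in which hn can be viewed as (num_layers, directions, batch_size, hidden_dim), so hn[-2] and hn[-1] are the forward and backward final states of the last layer.

```python
import torch
import torch.nn as nn

seq_len, batch_size, input_dim, hidden_dim = 7, 3, 4, 8
num_layers = 2
rnn = nn.RNN(input_dim, hidden_dim, num_layers=num_layers, bidirectional=True)

x = torch.randn(seq_len, batch_size, input_dim)
with torch.no_grad():
    output, hn = rnn(x)

print(output.shape)  # torch.Size([7, 3, 16]) -> hidden_dim * 2 directions
print(hn.shape)      # torch.Size([4, 3, 8])  -> num_layers * 2 directions

# Forward direction finishes at the last time step ("right-side" output).
fwd_final = output[-1, :, :hidden_dim]
# Backward direction finishes at the first time step ("left-side" output).
bwd_final = output[0, :, hidden_dim:]

fwd_matches = torch.allclose(fwd_final, hn[-2])
bwd_matches = torch.allclose(bwd_final, hn[-1])
print(fwd_matches, bwd_matches)
```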
