SimpleRNNCell Explained



I. Summary

One-sentence summary:

units: positive integer, the dimensionality of the output space, i.e. the number of hidden units.
recurrent_dropout: dropout applied to the recurrent (hidden-to-hidden) connections.
class SimpleRNNCell(Layer):
    """Cell class for SimpleRNN.

    # Arguments
        units: Positive integer, dimensionality of the output space,
            i.e. the number of hidden units.
        activation: Activation function to use; defaults to tanh.
        use_bias: Boolean, whether the layer uses a bias vector.
        kernel_initializer: Initializer for the input-to-hidden weights.
            Defaults to 'glorot_uniform'.
        recurrent_initializer: Initializer for the hidden-to-hidden weights.
            Defaults to 'orthogonal'.
        bias_initializer: Initializer for the bias vector.
        kernel_regularizer: Regularizer applied to the input-to-hidden weights.
        recurrent_regularizer: Regularizer applied to the hidden-to-hidden weights.
        bias_regularizer: Regularizer applied to the bias vector.
        kernel_constraint: Constraint applied to the kernel weights.
        recurrent_constraint: Constraint applied to the recurrent weights.
        bias_constraint: Constraint applied to the bias vector.
        dropout: Dropout applied to the inputs (input-to-hidden connections).
        recurrent_dropout: Dropout applied to the recurrent (hidden-to-hidden)
            connections.
    """

II. SimpleRNNCell Explained

Reposted from / reference: Keras Source Code (1): SimpleRNNCell Explained
http://blog.csdn.net/u013230189/article/details/108208123

1. Source Code Walkthrough

The SimpleRNNCell class can be understood as the computation of a single time step of an RNN; the RNN layer chains many such cells together and runs them as one overall computation.

(Figure omitted: the small red box marks the computation of one cell, and the large red box marks the computation of the whole RNN.)
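To make this concrete, here is a minimal NumPy sketch of what one cell computes per time step, following the same math as the call() method shown below (dropout omitted; the function name and toy shapes are illustrative additions, not taken from the Keras source):

import numpy as np

def simple_rnn_cell_step(x_t, h_prev, kernel, recurrent_kernel, bias):
    # h_t = tanh(x_t . kernel + h_prev . recurrent_kernel + bias)
    h_t = np.tanh(x_t @ kernel + h_prev @ recurrent_kernel + bias)
    return h_t, [h_t]  # the output and the new state are the same tensor

# Toy shapes: batch of 2 samples, 5 input features, 3 hidden units
x_t = np.random.randn(2, 5)
h_prev = np.zeros((2, 3))
kernel = np.random.randn(5, 3)
recurrent_kernel = np.random.randn(3, 3)
bias = np.zeros(3)

out, state = simple_rnn_cell_step(x_t, h_prev, kernel, recurrent_kernel, bias)
print(out.shape)  # (2, 3)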

SimpleRNNCell inherits from the Layer base class and contains four main methods:

  • __init__(): the constructor, which mainly initializes the cell's parameters
  • build(): creates the weight variables used by the layer
  • call(): performs the layer's computation, transforming the input into the output
  • get_config(): returns the layer's parameter configuration

The parameters and methods are explained in the annotated source below:

class SimpleRNNCell(Layer):
    """Cell class for SimpleRNN. # Arguments units: 正整數,輸出空間的維度,即隱藏層神經元數量. activation: 激活函數,默認是tanh use_bias: Boolean, 是否使用偏置向量. kernel_initializer: 輸入和隱藏層之間的權重參數初始化器.默認使用'glorot_uniform' recurrent_initializer: 隱藏層之間的權重參數初始化器.默認使用'orthogonal' bias_initializer: 偏置向量的初始化器. kernel_regularizer: 輸入和隱藏層之間權重參數的正則化方法. recurrent_regularizer: 隱藏層之間權重參數的正則化方法. bias_regularizer: 偏置向量的正則化方法. kernel_constraint: kernel的約束方法. recurrent_constraint: 隱藏層權重的約束函數. bias_constraint: 偏置向量的約束函數. dropout: 輸入和隱藏層之間的dropout. recurrent_dropout: 隱藏層之間的dropout. """

    def __init__(self, units,
                 activation='tanh',
                 use_bias=True,
                 kernel_initializer='glorot_uniform',
                 recurrent_initializer='orthogonal',
                 bias_initializer='zeros',
                 kernel_regularizer=None,
                 recurrent_regularizer=None,
                 bias_regularizer=None,
                 kernel_constraint=None,
                 recurrent_constraint=None,
                 bias_constraint=None,
                 dropout=0.,
                 recurrent_dropout=0.,
                 **kwargs):
        super(SimpleRNNCell, self).__init__(**kwargs)
        self.units = units
        self.activation = activations.get(activation)
        self.use_bias = use_bias

        self.kernel_initializer = initializers.get(kernel_initializer)
        self.recurrent_initializer = initializers.get(recurrent_initializer)
        self.bias_initializer = initializers.get(bias_initializer)

        self.kernel_regularizer = regularizers.get(kernel_regularizer)
        self.recurrent_regularizer = regularizers.get(recurrent_regularizer)
        self.bias_regularizer = regularizers.get(bias_regularizer)

        self.kernel_constraint = constraints.get(kernel_constraint)
        self.recurrent_constraint = constraints.get(recurrent_constraint)
        self.bias_constraint = constraints.get(bias_constraint)

        self.dropout = min(1., max(0., dropout))  # clamp dropout to the range [0, 1]
        self.recurrent_dropout = min(1., max(0., recurrent_dropout))
        self.state_size = self.units
        self.output_size = self.units
        self._dropout_mask = None
        self._recurrent_dropout_mask = None

    def build(self, input_shape):
        # build() creates the layer's weights. It runs once, before the first
        # call(), when the shape of the input is known, and initializes the
        # weight parameters.

        # Input-to-hidden weights. add_weight() creates the variable; its
        # `trainable` argument defaults to True, so the weights are updated
        # during training.
        self.kernel = self.add_weight(shape=(input_shape[-1], self.units),
                                      name='kernel',
                                      initializer=self.kernel_initializer,
                                      regularizer=self.kernel_regularizer,
                                      constraint=self.kernel_constraint)
        # Hidden-to-hidden weights shared across time steps
        self.recurrent_kernel = self.add_weight(
            shape=(self.units, self.units),
            name='recurrent_kernel',
            initializer=self.recurrent_initializer,
            regularizer=self.recurrent_regularizer,
            constraint=self.recurrent_constraint)
        # Bias vector (optional)
        if self.use_bias:
            self.bias = self.add_weight(shape=(self.units,),
                                        name='bias',
                                        initializer=self.bias_initializer,
                                        regularizer=self.bias_regularizer,
                                        constraint=self.bias_constraint)
        else:
            self.bias = None
        # build() is called once before __call__(); once it has run, it is not
        # called again. The flag for this is self.built: when it is True, later
        # __call__() invocations skip build(), which is why the built-in layers
        # never require an explicit build() from the user.
        self.built = True

    # Forward pass of the cell: the per-time-step computation happens here.
    # `inputs` is the input tensor for a single time step; `states` is a list
    # holding the hidden state from the previous time step.
    def call(self, inputs, states, training=None):
        prev_output = states[0]
        if 0 < self.dropout < 1 and self._dropout_mask is None:
            # Build a dropout mask tensor used to apply dropout to the inputs
            self._dropout_mask = _generate_dropout_mask(
                K.ones_like(inputs),
                self.dropout,
                training=training)
        if (0 < self.recurrent_dropout < 1 and
                self._recurrent_dropout_mask is None):
            self._recurrent_dropout_mask = _generate_dropout_mask(
                K.ones_like(prev_output),
                self.recurrent_dropout,
                training=training)

        dp_mask = self._dropout_mask
        rec_dp_mask = self._recurrent_dropout_mask

        # Apply dropout to the inputs (if enabled), then multiply by the kernel
        if dp_mask is not None:
            h = K.dot(inputs * dp_mask, self.kernel)
        else:
            h = K.dot(inputs, self.kernel)
        # Add the bias vector if one is used
        if self.bias is not None:
            h = K.bias_add(h, self.bias)

        # Hidden-to-hidden computation, with optional recurrent dropout
        if rec_dp_mask is not None:
            prev_output *= rec_dp_mask
        output = h + K.dot(prev_output, self.recurrent_kernel)
        # Apply the activation function if one is set
        if self.activation is not None:
            output = self.activation(output)

        # Properly set learning phase on output tensor.
        if 0 < self.dropout + self.recurrent_dropout:
            if training is None:
                output._uses_learning_phase = True
        return output, [output]

    # Return the layer's configuration
    def get_config(self):
        config = {'units': self.units,
                  'activation': activations.serialize(self.activation),
                  'use_bias': self.use_bias,
                  'kernel_initializer':
                      initializers.serialize(self.kernel_initializer),
                  'recurrent_initializer':
                      initializers.serialize(self.recurrent_initializer),
                  'bias_initializer': initializers.serialize(self.bias_initializer),
                  'kernel_regularizer':
                      regularizers.serialize(self.kernel_regularizer),
                  'recurrent_regularizer':
                      regularizers.serialize(self.recurrent_regularizer),
                  'bias_regularizer': regularizers.serialize(self.bias_regularizer),
                  'kernel_constraint': constraints.serialize(self.kernel_constraint),
                  'recurrent_constraint':
                      constraints.serialize(self.recurrent_constraint),
                  'bias_constraint': constraints.serialize(self.bias_constraint),
                  'dropout': self.dropout,
                  'recurrent_dropout': self.recurrent_dropout}
        base_config = super(SimpleRNNCell, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
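
Because get_config() returns a plain dictionary of the constructor arguments, it can be used to serialize the cell and rebuild an equivalent one. A quick sketch of that round trip, using the public tf.keras API that mirrors the class above:

from tensorflow import keras

cell = keras.layers.SimpleRNNCell(64, dropout=0.2)
config = cell.get_config()                                  # dict of constructor arguments
restored = keras.layers.SimpleRNNCell.from_config(config)   # rebuild an equivalent cell
print(config['units'], config['dropout'])  # 64 0.2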

The add_weight() method used in build() is inherited from the parent Layer class; any parameter of a layer that should take part in training must be created through it.

    def add_weight(self,
                   name,
                   shape,
                   dtype=None,
                   initializer=None,
                   regularizer=None,
                   trainable=True,
                   constraint=None):
        """Adds a weight variable to the layer. # Arguments name: String, 權重變量的名稱. shape: 權重的shape. dtype: 權重的數據類型. initializer: 權重的初始化方法. regularizer: 權重的正則化方法. trainable: 權重是否可更新. constraint: 可選的約束方法. # Returns 返回權重張量. """
        initializer = initializers.get(initializer)
        if dtype is None:
            dtype = self.dtype
        weight = K.variable(initializer(shape, dtype=dtype),
                            dtype=dtype,
                            name=name,
                            constraint=constraint)
        if regularizer is not None:
            with K.name_scope('weight_regularizer'):
                self.add_loss(regularizer(weight))
        if trainable:
            self._trainable_weights.append(weight)
        else:
            self._non_trainable_weights.append(weight)
        return weight
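
For comparison, a user-defined layer registers its trainable weights through the same add_weight() call in its own build(); a minimal sketch assuming the standard Keras custom-layer pattern (the MyDense name is purely illustrative):

import tensorflow as tf
from tensorflow import keras

class MyDense(keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super(MyDense, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # Weights created via add_weight() are tracked and trained automatically
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[-1], self.units),
                                      initializer='glorot_uniform')
        self.bias = self.add_weight(name='bias',
                                    shape=(self.units,),
                                    initializer='zeros')
        self.built = True

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel) + self.bias

layer = MyDense(4)
print(layer(tf.ones((2, 3))).shape)  # (2, 4)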

2. Usage Example

import tensorflow as tf
from tensorflow import keras

# Input shape: (batch_size, time_step, embedding_dim)
batch_size = 10
time_step = 20
embedding_dim = 100
train_x = tf.random.normal(shape=[batch_size, time_step, embedding_dim])
hidden_dim = 64  # dimensionality of the hidden state
h0 = tf.random.normal(shape=[batch_size, hidden_dim])  # initial hidden state
x0 = train_x[:, 0, :]  # input of the first time step

simpleRnnCell = keras.layers.SimpleRNNCell(hidden_dim)
# Feed the current time step's input x and the previous hidden state into the cell
out, h1 = simpleRnnCell(x0, [h0])

print(out.shape, h1[0].shape)  # (10, 64) (10, 64)
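
To see how an RNN chains these per-step computations, the same cell can be unrolled over every time step by hand; the loop below and the comparison with the built-in keras.layers.RNN wrapper are illustrative additions, not part of the original example:

# Manually unroll the cell over all time steps
h = h0
outputs = []
for t in range(time_step):
    out_t, states = simpleRnnCell(train_x[:, t, :], [h])
    h = states[0]
    outputs.append(out_t)
print(len(outputs), outputs[-1].shape)  # 20 (10, 64)

# The built-in RNN layer performs the same loop internally
rnn = keras.layers.RNN(simpleRnnCell, return_sequences=True)
print(rnn(train_x).shape)  # (10, 20, 64)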
 

