1. tf.constant_initializer(value)
Purpose: initializes the variable to the given constant, so that every element takes the supplied value.
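2. tf.zeros_initializer()
Purpose: sets every element of the variable to 0. It is implemented by the class Zeros, which is also exported as tf.initializers.zeros and tf.keras.initializers.Zeros (not as tf.Zeros).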
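3. tf.ones_initializer()
Purpose: sets every element of the variable to 1. It is implemented by the class Ones, which is also exported as tf.initializers.ones and tf.keras.initializers.Ones (not as tf.Ones).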
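The three constant-style initializers above differ only in the fill value. A minimal pure-Python sketch of that behaviour follows; `filled` is a hypothetical helper written for illustration, not a TensorFlow function, and it returns nested lists rather than tensors.

```python
def filled(shape, value):
    """Return a nested list with the given shape, every element set to `value`."""
    if not shape:
        return value
    return [filled(shape[1:], value) for _ in range(shape[0])]


# Stand-ins for the three initializers applied to a [2, 3] variable.
constant = filled([2, 3], 0.5)  # like tf.constant_initializer(0.5)
zeros = filled([2, 3], 0.0)     # like tf.zeros_initializer()
ones = filled([2, 3], 1.0)      # like tf.ones_initializer()
```

In TensorFlow itself the initializer object is normally passed to the `initializer` argument of `tf.get_variable`, as in the BERT example later in these notes.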
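4. tf.random_normal_initializer(mean,stddev)
Purpose: initializes the variable with random values drawn from a normal distribution. The main parameters are the mean (mean) and the standard deviation (stddev) of that normal distribution.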
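To see what mean and stddev control, here is a small sketch that draws values with Python's standard `random` module. The function `random_normal` and its 2-D-only shape handling are assumptions made for illustration; TensorFlow's own sampler works on tensors and is not reproduced here.

```python
import random


def random_normal(shape, mean=0.0, stddev=1.0, seed=None):
    """Draw a rows x cols nested list of values from N(mean, stddev^2)."""
    rng = random.Random(seed)  # local generator, so a fixed seed is reproducible
    rows, cols = shape         # 2-D shapes only, for brevity
    return [[rng.gauss(mean, stddev) for _ in range(cols)] for _ in range(rows)]


weights = random_normal([100, 100], mean=0.0, stddev=0.05, seed=42)
flat = [w for row in weights for w in row]
sample_mean = sum(flat) / len(flat)  # should be close to mean
```

With 10,000 samples the sample mean lands very close to the requested mean, and individual values may fall arbitrarily far out in the tails; that is the difference from the truncated variant in the next section.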
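5. tf.truncated_normal_initializer(mean,stddev,seed,dtype)
Purpose: initializes the variable with random values drawn from a normal distribution, except that any value more than 2 standard deviations away from the mean is discarded and re-drawn.
- mean: the mean of the distribution;
- stddev: the standard deviation of the distribution;
- seed: the random seed;
- dtype: the data type of the random values.
In most cases it is enough to set stddev alone.
Its definition in the TensorFlow source: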
@tf_export("initializers.truncated_normal", "truncated_normal_initializer")
class TruncatedNormal(Initializer):
  """Initializer that generates a truncated normal distribution.

  These values are similar to values from a `random_normal_initializer`
  except that values more than two standard deviations from the mean
  are discarded and re-drawn. This is the recommended initializer for
  neural network weights and filters.

  Args:
    mean: a python scalar or a scalar tensor. Mean of the random values
      to generate.
    stddev: a python scalar or a scalar tensor. Standard deviation of the
      random values to generate.
    seed: A Python integer. Used to create random seeds. See
      `tf.set_random_seed`
      for behavior.
    dtype: The data type. Only floating point types are supported.
  """

  def __init__(self, mean=0.0, stddev=1.0, seed=None, dtype=dtypes.float32):
    self.mean = mean
    self.stddev = stddev
    self.seed = seed
    self.dtype = _assert_float_dtype(dtypes.as_dtype(dtype))

  def __call__(self, shape, dtype=None, partition_info=None):
    if dtype is None:
      dtype = self.dtype
    return random_ops.truncated_normal(
        shape, self.mean, self.stddev, dtype, seed=self.seed)

  def get_config(self):
    return {
        "mean": self.mean,
        "stddev": self.stddev,
        "seed": self.seed,
        "dtype": self.dtype.name
    }
Example: when BERT initializes token_type_embeddings and embedding_table, it assumes the embeddings follow a (truncated) normal distribution:
def embedding_postprocessor(input_tensor,
                            use_token_type=False,
                            token_type_ids=None,
                            token_type_vocab_size=16,
                            token_type_embedding_name="token_type_embeddings",
                            use_position_embeddings=True,
                            position_embedding_name="position_embeddings",
                            initializer_range=0.02,
                            max_position_embeddings=512,
                            dropout_prob=0.1):
  ...
  if use_token_type:
    if token_type_ids is None:
      raise ValueError("`token_type_ids` must be specified if"
                       "`use_token_type` is True.")
    token_type_table = tf.get_variable(
        name=token_type_embedding_name,
        shape=[token_type_vocab_size, width],
        initializer=create_initializer(initializer_range))
    ...

def create_initializer(initializer_range=0.02):
  """Creates a `truncated_normal_initializer` with the given range."""
  return tf.truncated_normal_initializer(stddev=initializer_range)
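The "discard and re-draw" rule of tf.truncated_normal_initializer can be sketched with simple rejection sampling in plain Python. This is an illustrative stand-in, not TensorFlow's implementation: the function `truncated_normal` is hypothetical, and the embedding width of 8 is an assumed value chosen only to keep the table small.

```python
import random


def truncated_normal(shape, mean=0.0, stddev=1.0, seed=None):
    """Draw a rows x cols nested list; values beyond 2 stddev are re-drawn."""
    rng = random.Random(seed)
    rows, cols = shape  # 2-D shapes only, for brevity

    def draw():
        # Keep sampling until the value falls within mean +/- 2 * stddev.
        while True:
            value = rng.gauss(mean, stddev)
            if abs(value - mean) <= 2.0 * stddev:
                return value

    return [[draw() for _ in range(cols)] for _ in range(rows)]


# Shaped like BERT's token_type_embeddings: 16 token types, assumed width 8,
# stddev equal to BERT's default initializer_range of 0.02.
token_type_table = truncated_normal([16, 8], stddev=0.02, seed=0)
```

With stddev=0.02 every entry of the table lies in [-0.04, 0.04], so no embedding starts out with an unusually large weight.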
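6. tf.random_uniform_initializer(minval,maxval,seed,dtype)
Purpose: initializes the variable with random values drawn uniformly from the range minval to maxval. The main parameters are the lower bound (minval) and the upper bound (maxval).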
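A minimal sketch of uniform initialization between a lower and an upper bound, again with Python's standard `random` module. The function `random_uniform` and its default bounds of 0.0 and 1.0 are assumptions for illustration, not TensorFlow's own code.

```python
import random


def random_uniform(shape, minval=0.0, maxval=1.0, seed=None):
    """Draw a rows x cols nested list of values uniformly between the bounds."""
    rng = random.Random(seed)
    rows, cols = shape  # 2-D shapes only, for brevity
    return [[rng.uniform(minval, maxval) for _ in range(cols)]
            for _ in range(rows)]


weights = random_uniform([4, 5], minval=-0.1, maxval=0.1, seed=1)
```

Unlike the normal initializers, every value in the range is equally likely and nothing falls outside the two bounds.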
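7. tf.uniform_unit_scaling_initializer(factor,seed,dtype)
Purpose: initializes the variable with uniformly distributed random values whose range is chosen so that the magnitude of the layer's output is not changed.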
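How the range is chosen can be sketched as follows. As far as I understand the TF 1.x documentation, the bound is factor * sqrt(3 / input_size), where input_size is the product of all dimensions except the last; treat that formula, and the hypothetical helper `uniform_unit_scaling` below, as an assumption rather than the library's exact code.

```python
import math
import random


def uniform_unit_scaling(shape, factor=1.0, seed=None):
    """Return (values, bound): uniform values in [-bound, bound]."""
    rng = random.Random(seed)

    # Estimate the input size as the product of all dimensions but the last.
    input_size = 1.0
    for dim in shape[:-1]:
        input_size *= float(dim)
    input_size = max(input_size, 1.0)  # guard against zero-size shapes

    # Uniform on [-b, b] has variance b^2 / 3, so this bound gives each weight
    # variance factor^2 / input_size, and a sum over input_size unit-variance
    # inputs keeps variance factor^2.
    bound = math.sqrt(3.0 / input_size) * factor

    rows, cols = shape  # sampling supports 2-D shapes only, for brevity
    values = [[rng.uniform(-bound, bound) for _ in range(cols)]
              for _ in range(rows)]
    return values, bound


weights, bound = uniform_unit_scaling([300, 10], seed=0)
```

For a [300, 10] weight matrix the bound works out to sqrt(3 / 300) = 0.1, so a wider layer automatically gets a narrower range.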
