Comparing activation functions (ReLU, PReLU, ELU, +BN) on CIFAR-10

Reposted from the original article. 2018-09-14 15:03, Deep Learning

See also the previous post:

Definitions and differences of the activation functions ReLU, LReLU, PReLU, CReLU, ELU, and SELU

1. Theoretical background

1.1 Activation functions

1.2 The ELU paper (Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs))

1.2.1 Abstract

The paper states that the ELU function speeds up training and improves classification accuracy. It has the following properties:

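1) Because ELU is the identity for positive inputs, it alleviates the vanishing-gradient problem in the same way as ReLU, LReLU, and PReLU.

2) Unlike ReLU, ELU takes negative values, which pushes the mean output of the activation units toward zero. This gives an effect similar to batch normalization at a lower computational cost. (A mean output close to zero reduces the bias shift effect, which brings the gradient closer to the natural gradient.)

3) LReLU and PReLU also take negative values, but they do not guarantee a noise-robust deactivation state.

4) For negative inputs ELU is an exponential that saturates, so it describes the absence of an input feature only qualitatively, not quantitatively.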
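The properties listed above can be checked with a small numeric sketch. It is written in plain Python rather than TensorFlow so that it runs on its own; `alpha=1.0`, `leak=0.2`, and the helper names are illustrative assumptions, not values taken from the paper's experiments.

```python
import math

def relu(x):
    return max(0.0, x)

def lrelu(x, leak=0.2):
    return x if x >= 0 else leak * x

def elu(x, alpha=1.0):
    # Identity for positive inputs, alpha * (exp(x) - 1) otherwise.
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

# Compare the three functions on a few sample inputs.
for x in (-20.0, -5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x={x:6.1f}  relu={relu(x):7.3f}  lrelu={lrelu(x):7.3f}  elu={elu(x):7.3f}")
```

For large negative inputs LReLU keeps growing in magnitude, while ELU flattens out near -alpha. That flattening is the saturation referred to in points 3 and 4.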

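1.2.2 Bias shift correction speeds up learning

To reduce the undesired bias shift effect, either (i) the activations of the incoming units can be centered at zero, or (ii) activation functions with negative values can be used. The authors introduce a new activation function that has negative values while keeping the identity for positive arguments: the ELU.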
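Option (ii) can be illustrated numerically. The sketch below feeds the same zero-mean inputs through ReLU and ELU and compares the mean outputs. The standard normal input distribution, the sample size, and `alpha=1.0` are assumptions chosen for illustration, not settings from the paper.

```python
import math
import random

def relu(x):
    return max(0.0, x)

def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def mean_activation(fn, inputs):
    # Average output of an activation function over a list of inputs.
    return sum(fn(x) for x in inputs) / len(inputs)

random.seed(0)  # fixed seed so the run is repeatable
inputs = [random.gauss(0.0, 1.0) for _ in range(100_000)]

mean_relu = mean_activation(relu, inputs)
mean_elu = mean_activation(elu, inputs)
print("mean ReLU output:", mean_relu)
print("mean ELU output: ", mean_elu)
```

ReLU discards every negative input, so its mean output is clearly positive. ELU's negative branch offsets part of that, leaving a mean closer to zero and therefore a smaller shift passed to the next layer.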

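1.2.4 Experiments

The authors apply ELU to autoencoders (unsupervised learning) and to convolutional neural networks (supervised learning). ELU is compared against ReLU, LReLU, and SReLU on MNIST, CIFAR-10, and CIFAR-100.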

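2. ALL-CNN for CIFAR-10

2.1 Architecture design

The ALL-CNN architecture comes from the paper Striving for Simplicity: The All Convolutional Net. Its main contribution is replacing pooling layers with stride-2 convolutions. The paper proposes several all-convolutional architectures and reports that a kernel size of 3 works best. It is easy to follow and performs well: better than the original CNN baseline, without resorting to larger architectures such as ResNet.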
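A stride-2 convolution can stand in for a pooling layer because both halve the spatial resolution. The sketch below applies the standard output-size formula to a 32x32 CIFAR input. The 2x2 pooling window and the padding of 1 are illustrative assumptions, not settings quoted from the paper.

```python
def output_size(n, kernel, stride, padding=0):
    # Standard output-size formula for a convolution or pooling window
    # with symmetric zero padding.
    return (n + 2 * padding - kernel) // stride + 1

size = 32  # CIFAR images are 32x32
pooled = output_size(size, kernel=2, stride=2)              # 2x2 max-pool, stride 2
strided = output_size(size, kernel=3, stride=2, padding=1)  # 3x3 conv, stride 2
print(pooled, strided)
```

Both layers produce feature maps of the same size, but the convolution has learnable weights, whereas max-pooling is a fixed operation.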

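Implementations used:

LReLU implementation:
import tensorflow as tf  # TensorFlow 1.x API

def lrelu(x, leak=0.2, name="lrelu"):
    # For 0 < leak < 1, max(x, leak * x) is x when x >= 0 and leak * x otherwise.
    return tf.maximum(x, leak * x, name=name)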

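PReLU implementation:
def parametric_relu(_x):
    # One learnable slope per channel (last dimension), initialized to 0.25.
    alphas = tf.get_variable('alpha', _x.get_shape()[-1],
                             initializer=tf.constant_initializer(0.25),
                             dtype=tf.float32)
    pos = tf.nn.relu(_x)
    # (_x - |_x|) * 0.5 equals _x for negative inputs and 0 otherwise.
    neg = alphas * (_x - tf.abs(_x)) * 0.5
    return pos + neg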
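Both snippets avoid an explicit branch: LReLU uses a maximum, and PReLU splits the input into a positive part and a scaled negative part. The plain-Python sketch below checks that these formulations match the piecewise definition on a few sample inputs. The function names are hypothetical and exist only for this check.

```python
def prelu_reference(x, alpha):
    # Piecewise definition: x for x >= 0, alpha * x otherwise.
    return x if x >= 0 else alpha * x

def prelu_branch_free(x, alpha):
    # Same decomposition as parametric_relu: relu(x) + alpha * (x - |x|) / 2.
    pos = max(0.0, x)
    neg = alpha * (x - abs(x)) * 0.5
    return pos + neg

def lrelu_max(x, leak=0.2):
    # Same trick as lrelu; valid because 0 < leak < 1.
    return max(x, leak * x)

for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    assert prelu_branch_free(x, 0.25) == prelu_reference(x, 0.25)
    assert lrelu_max(x) == prelu_reference(x, 0.2)
print("both formulations agree")
```

The only difference between the two activations is that LReLU's slope is a fixed constant, while PReLU's slope is a trained variable.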

BN implementation:
def batch_norm(x, n_out,scope='bn'):
  """
  Batch normalization on convolutional maps.
  Args:
    x: Tensor, 4D BHWD input maps
    n_out: integer, depth of input maps
    scope: string, variable scope

  Return:
    normed: batch-normalized maps
  """
  with tf.variable_scope(scope):
    beta = tf.Variable(tf.constant(0.0, shape=[n_out]),
      name='beta', trainable=True)
    gamma = tf.Variable(tf.constant(1.0, shape=[n_out]),
      name='gamma', trainable=True)
    tf.add_to_collection('biases', beta)
    tf.add_to_collection('weights', gamma)

    batch_mean, batch_var = tf.nn.moments(x, [0,1,2], name='moments')
    ema = tf.train.ExponentialMovingAverage(decay=0.99)

    def mean_var_with_update():
      ema_apply_op = ema.apply([batch_mean, batch_var])
      with tf.control_dependencies([ema_apply_op]):
        return tf.identity(batch_mean), tf.identity(batch_var)
    # The train/test switch below is disabled, so batch statistics are
    # always used, including at test time:
    # mean, var = control_flow_ops.cond(phase_train,
    #   mean_var_with_update,
    #   lambda: (ema.average(batch_mean), ema.average(batch_var)))
    mean, var = mean_var_with_update()
    normed = tf.nn.batch_normalization(x, mean, var,
      beta, gamma, 1e-3)
  return normed
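The arithmetic behind the normalization step is easier to see on a single channel. The following is a minimal plain-Python sketch, not the TensorFlow code above: it computes the batch mean and the biased batch variance, then applies the scale `gamma` and shift `beta`. The sample values are made up for illustration.

```python
import math

def batch_norm_1d(values, gamma=1.0, beta=0.0, eps=1e-3):
    # Normalize one channel: subtract the batch mean, divide by the batch
    # standard deviation, then apply the learned scale and shift.
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased (population) variance
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

channel = [2.0, 4.0, 6.0, 8.0]
normed = batch_norm_1d(channel)
print(normed)
```

The output has zero mean and a variance just under one; the small gap comes from `eps`, which guards against division by zero.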

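Test results on CIFAR-10:

Ranked by loss, the results are: ReLU+BN > ELU > PReLU > ELU+BN > ReLU

Test accuracy for all configurations:

The ReLU+BN combination achieves the highest accuracy: ReLU+BN > ELU > PReLU > ELU+BN > ReLU

Among the activation functions on their own, ELU performs best. It does not need BN (ELU+BN ranks below plain ELU), which saves the cost of the BN computation.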

3. ALL-CNN for CIFAR-100

The CIFAR-100 dataset

CIFAR-100 python version: after downloading and extracting the archive, the cifar-100-python directory contains three files, meta, test, and train. Each is a Python object pickled with cPickle.

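Extract: tar -zxvf xxx.tar.gz
cifar-100-python/
cifar-100-python/file.txt~
cifar-100-python/train
cifar-100-python/test
cifar-100-python/meta

def unpickle(file):
    # Python 3 version (the original used cPickle under Python 2).
    # With encoding='bytes' the dict keys are bytes, e.g. b'data'.
    import pickle
    with open(file, 'rb') as fo:
        return pickle.load(fo, encoding='bytes')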

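The code above loads each file into a dict. The test and train dicts contain the following entries:

data: an n x 3072 numpy array, where n is the number of images. Each row is one 32x32 RGB image, stored as 1024 red values, then 1024 green values, then 1024 blue values.

coarse_labels: a list of n values in the range 0-19, giving the superclass of each image

fine_labels: a list of n values in the range 0-99, giving the fine class of each image

The meta dict holds the label names: fine_label_names (and coarse_label_names), where the i-th element is the name of class i.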
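In the python version each row of data stores the whole red plane first, then green, then blue, so a row has to be rearranged before it can be treated as a (32, 32, 3) image. The sketch below does this in plain Python on a synthetic row; `row_to_hwc` is a hypothetical helper name. With NumPy the equivalent is `row.reshape(3, 32, 32).transpose(1, 2, 0)`.

```python
def row_to_hwc(row, height=32, width=32, channels=3):
    # Rebuild image[r][c] = [R, G, B] from one flat, channel-major row.
    plane = height * width
    return [[[row[k * plane + r * width + c] for k in range(channels)]
             for c in range(width)]
            for r in range(height)]

# Synthetic row for illustration: every red value is 1, green 2, blue 3.
row = [1] * 1024 + [2] * 1024 + [3] * 1024
image = row_to_hwc(row)
print(len(image), len(image[0]), image[0][0])
```

Reshaping a row directly to (32, 32, 3) without the transpose would interleave values from the same colour plane and scramble the image.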

Binary version (the one I used):

<1 x coarse label><1 x fine label><3072 x pixel>

…

<1 x coarse label><1 x fine label><3072 x pixel>
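Each record in this layout is 2 + 3072 = 3074 bytes long. As a minimal sketch of how it can be read without TensorFlow, the plain-Python generator below splits a byte string into records; `parse_records` is a hypothetical helper, and the two synthetic records stand in for the contents of a real data file.

```python
RECORD_BYTES = 2 + 32 * 32 * 3  # coarse label + fine label + pixels = 3074

def parse_records(raw):
    # Yield (coarse_label, fine_label, pixels) for every complete record.
    for start in range(0, len(raw) - RECORD_BYTES + 1, RECORD_BYTES):
        record = raw[start:start + RECORD_BYTES]
        yield record[0], record[1], record[2:]

# Two synthetic records, for illustration only.
fake = bytes([4, 30]) + bytes(3072) + bytes([1, 95]) + bytes([255]) * 3072
records = list(parse_records(fake))
print([(coarse, fine, len(pixels)) for coarse, fine, pixels in records])
```

The fine label is the second byte of each record (index 1), which is the byte a 100-class classifier needs.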

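The network is the CIFAR-10 architecture with the output changed to 100 classes. Only the 100 fine labels of CIFAR-100 are classified, so the code takes the second byte of each input record as the label. (This is based on the TensorFlow CIFAR-10 code.)

label_bytes = 2  # 2 for CIFAR-100
# Take the second label byte: the fine label (100 classes)
result.label = tf.cast(
    tf.strided_slice(record_bytes, [1], [label_bytes]), tf.int32)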

On the 9-layer All-CNN, after about 50k steps, the ReLU+BN combination reaches a CIFAR-100 test error of 0.36.

PS:

Activation Function Cheatsheet

Sources:

https://blog.csdn.net/m0_37561765/article/details/78398098

https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6
