The form of the activation function used in Caffe has a direct impact on training speed and on how SGD solves the network. Different activation functions lead to different gradient computations during SGD, so the activation layer's type is specified in the network configuration file; at the moment the most commonly used activation function in Caffe is ReLU.
The activation functions currently implemented in Caffe are AbsVal, BNLL, Power, ReLU, Sigmoid, and TanH, each provided as its own layer. Rather than restating the formulas one by one here, the relevant sections of the Caffe tutorial are reproduced below.
ReLU / Rectified-Linear and Leaky-ReLU
- LayerType: RELU
- CPU implementation: ./src/caffe/layers/relu_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/relu_layer.cu
- Parameters (ReLUParameter relu_param)
  - Optional
    - negative_slope [default 0]: specifies whether to leak the negative part by multiplying it with the slope value rather than setting it to 0.
- Sample (as seen in ./examples/imagenet/imagenet_train_val.prototxt)

      layers {
        name: "relu1"
        type: RELU
        bottom: "conv1"
        top: "conv1"
      }
Given an input value x, the RELU layer computes the output as x if x > 0 and negative_slope * x if x <= 0. When the negative slope parameter is not set, this is equivalent to the standard ReLU function max(x, 0). It also supports in-place computation, meaning that the bottom and the top blob may be the same, which conserves memory.
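To make the formula concrete, here is a minimal, self-contained sketch of the element-wise computation on a flat array. It is not Caffe's actual relu_layer.cpp; the function name and interface are just for illustration, and passing the same pointer for in and out mimics the in-place mode mentioned above.

    #include <algorithm>
    #include <cstddef>

    // Leaky-ReLU forward pass over a flat array:
    //   out[i] = in[i]                   if in[i] > 0
    //   out[i] = negative_slope * in[i]  otherwise
    // With negative_slope == 0 this reduces to the standard ReLU max(x, 0).
    void relu_forward(const float* in, float* out, std::size_t n,
                      float negative_slope) {
      for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::max(in[i], 0.0f) +
                 negative_slope * std::min(in[i], 0.0f);
      }
    }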
Sigmoid
- LayerType: SIGMOID
- CPU implementation: ./src/caffe/layers/sigmoid_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/sigmoid_layer.cu
- Sample (as seen in ./examples/mnist/mnist_autoencoder.prototxt)

      layers {
        name: "encode1neuron"
        bottom: "encode1"
        top: "encode1neuron"
        type: SIGMOID
      }
The SIGMOID layer computes the output as sigmoid(x) = 1 / (1 + exp(-x)) for each input element x.
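For illustration only (this is not Caffe's sigmoid_layer.cpp), the same element-wise computation over a flat array could look like this:

    #include <cmath>
    #include <cstddef>

    // Element-wise sigmoid forward pass: out[i] = 1 / (1 + exp(-in[i])).
    void sigmoid_forward(const float* in, float* out, std::size_t n) {
      for (std::size_t i = 0; i < n; ++i) {
        out[i] = 1.0f / (1.0f + std::exp(-in[i]));
      }
    }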
TanH / Hyperbolic Tangent
- LayerType: TANH
- CPU implementation: ./src/caffe/layers/tanh_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/tanh_layer.cu
- Sample

      layers {
        name: "layer"
        bottom: "in"
        top: "out"
        type: TANH
      }
The TANH layer computes the output as tanh(x) for each input element x.
Absolute Value
- LayerType: ABSVAL
- CPU implementation: ./src/caffe/layers/absval_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/absval_layer.cu
- Sample

      layers {
        name: "layer"
        bottom: "in"
        top: "out"
        type: ABSVAL
      }
The ABSVAL layer computes the output as abs(x) for each input element x.
Power
- LayerType: POWER
- CPU implementation: ./src/caffe/layers/power_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/power_layer.cu
- Parameters (PowerParameter power_param)
  - Optional
    - power [default 1]
    - scale [default 1]
    - shift [default 0]
- Sample

      layers {
        name: "layer"
        bottom: "in"
        top: "out"
        type: POWER
        power_param {
          power: 1
          scale: 1
          shift: 0
        }
      }
The POWER layer computes the output as (shift + scale * x) ^ power for each input element x.
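A minimal sketch of that element-wise computation, written as a standalone function for illustration (not Caffe's power_layer.cpp):

    #include <cmath>
    #include <cstddef>

    // Element-wise power forward pass: out[i] = (shift + scale * in[i]) ^ power.
    // With the defaults power = 1, scale = 1, shift = 0 this is the identity.
    void power_forward(const float* in, float* out, std::size_t n,
                       float power, float scale, float shift) {
      for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::pow(shift + scale * in[i], power);
      }
    }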
BNLL
- LayerType: BNLL
- CPU implementation: ./src/caffe/layers/bnll_layer.cpp
- CUDA GPU implementation: ./src/caffe/layers/bnll_layer.cu
- Sample

      layers {
        name: "layer"
        bottom: "in"
        top: "out"
        type: BNLL
      }
The BNLL (binomial normal log likelihood) layer computes the output as log(1 + exp(x)) for each input element x.
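Note that evaluating log(1 + exp(x)) directly overflows for large positive x, so implementations typically split on the sign of x. The sketch below illustrates that idea; it is a standalone example, not Caffe's bnll_layer.cpp.

    #include <cmath>
    #include <cstddef>

    // Element-wise BNLL forward pass, out[i] = log(1 + exp(in[i])), written in
    // a numerically stable form: for x > 0 use the identity
    // log(1 + exp(x)) = x + log(1 + exp(-x)), so exp() is only ever called
    // with a non-positive argument and cannot overflow.
    void bnll_forward(const float* in, float* out, std::size_t n) {
      for (std::size_t i = 0; i < n; ++i) {
        const float x = in[i];
        out[i] = (x > 0.0f) ? x + std::log(1.0f + std::exp(-x))
                            : std::log(1.0f + std::exp(x));
      }
    }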
