Caffe Study Notes (3): Layer Catalogue

This article is a repost. 2016-05-18 12:05, tag: caffe

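The layer is the basic unit of modeling and computation.

Caffe's catalogue includes layers for a variety of state-of-the-art models.

To create a Caffe model, you define the model architecture in a protocol buffer definition file (prototxt). Caffe layers and their parameters are defined in caffe.proto.

Vision Layers:

Header: ./include/caffe/vision_layers.hpp

Vision layers usually take images as input and produce other images as output. A typical image in practice may have one color channel (c = 1), as in a grayscale image, or three channels (c = 3), as in an RGB image. In this context, however, the distinguishing characteristic of an image is its spatial structure: an image usually has a height h > 1 and a width w > 1. This 2D geometry naturally shapes how the input is processed. In particular, most vision layers work by applying a particular operation to some region of the input to produce the corresponding region of the output. In contrast, other layers (with few exceptions) ignore the spatial structure of the input and treat it as one big vector of dimension c * h * w.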

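Convolution layer:

Layer type: Convolution
CPU implementation: ./src/caffe/layers/conv_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/conv_layer.cu
Parameters (ConvolutionParameter convolution_param)
- Required
  - num_output (c_o): the number of filters
  - kernel_size (or kernel_h and kernel_w): the height and width of each filter
- Strongly recommended
  - weight_filler [default type: 'constant' value: 0]
- Optional
  - bias_term [default true]: whether to learn and apply a set of additive biases to the filter outputs
  - pad (or pad_h and pad_w) [default 0]: the number of pixels to (implicitly) add to each side of the input
  - stride (or stride_h and stride_w) [default 1]: the interval at which the filters are applied to the input
  - group (g) [default 1]: if g > 1, each filter is connected only to a subset of the input. Specifically, the input and output channels are split into g groups, and the i-th group of output channels is connected only to the i-th group of input channels.
Input: n * c_i * h_i * w_i
Output: n * c_o * h_o * w_o, where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1 (integer division, rounded down), and likewise for w_o.
Example: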

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # learning rate and decay multipliers for the filters
  param { lr_mult: 1 decay_mult: 1 }
  # learning rate and decay multipliers for the biases
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 96     # learn 96 filters
    kernel_size: 11    # each filter is 11x11
    stride: 4          # step 4 pixels between each filter application
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01        # distribution with stdev 0.01 (default mean: 0)
    }
    bias_filler {
      type: "constant" # initialize the biases to zero (0)
      value: 0
    }
  }
}

The Convolution layer convolves the input image with a set of learnable filters, each producing one feature map in the output image.
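The output-size formula can be checked with a few lines of Python. This is only a sketch: `conv_output_size` is a hypothetical helper, not part of Caffe, and the 227-pixel input height is an assumed value chosen for illustration, since the example above does not state the input size.

```python
def conv_output_size(in_size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution along one axis (floor division)."""
    return (in_size + 2 * pad - kernel) // stride + 1


# conv1 settings from the example: kernel_size 11, stride 4, pad 0.
# The input height of 227 is an assumption made for this illustration.
h_o = conv_output_size(227, kernel=11, stride=4)
print(h_o)  # 55
```

With these numbers, conv1 would produce 96 feature maps of size 55 x 55.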

Pooling layers:

Layer type: Pooling
CPU implementation: ./src/caffe/layers/pooling_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/pooling_layer.cu
Parameters (PoolingParameter pooling_param)
- Required
  - kernel_size (or kernel_h and kernel_w): the height and width of each pooling window
- Optional
  - pool [default MAX]: the pooling method, one of MAX, AVE, or STOCHASTIC
  - pad (or pad_h and pad_w) [default 0]: the number of pixels to (implicitly) add to each side of the input
  - stride (or stride_h and stride_w) [default 1]: the interval at which the pooling window is applied to the input
Input: n * c * h_i * w_i
Output: n * c * h_o * w_o, where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1, and likewise for w_o. (Caffe's pooling code rounds this division up, whereas convolution rounds it down.)

Example:

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3 # pool over a 3x3 region
    stride: 2      # step two pixels (in the bottom blob) between pooling regions
  }
}
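To make the MAX pooling operation concrete, here is a minimal pure-Python sketch for a single 2D channel. It is an illustration rather than Caffe's implementation: `max_pool_2d` is a hypothetical name, padding is not handled, the kernel is assumed to be at least as large as the stride, and the output size is rounded up, which is my understanding of how Caffe's pooling behaves.

```python
import math


def max_pool_2d(image, kernel, stride):
    """Max-pool a 2D list of numbers (no padding), clipping windows at the border."""
    h, w = len(image), len(image[0])
    out_h = math.ceil((h - kernel) / stride) + 1
    out_w = math.ceil((w - kernel) / stride) + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values covered by this window, staying inside the image.
            window = [image[y][x]
                      for y in range(i * stride, min(i * stride + kernel, h))
                      for x in range(j * stride, min(j * stride + kernel, w))]
            row.append(max(window))
        out.append(row)
    return out


# A 5x5 ramp image pooled with the settings of pool1 (kernel_size 3, stride 2).
image = [[r * 5 + c for c in range(5)] for r in range(5)]
print(max_pool_2d(image, kernel=3, stride=2))  # [[12, 14], [22, 24]]
```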

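Local Response Normalization (LRN):

Layer type: LRN
CPU implementation: ./src/caffe/layers/lrn_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/lrn_layer.cu
Parameters (LRNParameter lrn_param)

The local response normalization layer performs a kind of lateral inhibition by normalizing over local input regions. In ACROSS_CHANNELS mode, the local regions extend across nearby channels but have no spatial extent (that is, they have shape local_size * 1 * 1). In WITHIN_CHANNEL mode, the local regions extend spatially but stay within their own channel (that is, they have shape 1 * local_size * local_size). Each input value is divided by $(1 + (\alpha/n) \sum_i x_i^2)^\beta$, where n is the size of each local region, $\alpha$ and $\beta$ are the layer's alpha and beta parameters, and the sum is taken over the region centered at that value.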
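A rough sketch of the ACROSS_CHANNELS case may help. The function below normalizes the channel values at one spatial position by dividing each value by (1 + (alpha / n) * sum of squares over the neighboring channels) ^ beta. The function name and the default values for `local_size`, `alpha`, and `beta` are assumptions picked for illustration, not a statement of Caffe's defaults, and windows are simply clipped at the first and last channel.

```python
def lrn_across_channels(x, local_size=5, alpha=1.0, beta=0.75):
    """Normalize a list of channel values taken at one pixel across nearby channels."""
    n = local_size
    half = n // 2
    out = []
    for c in range(len(x)):
        # Window of up to local_size channels centered on channel c.
        lo, hi = max(0, c - half), min(len(x), c + half + 1)
        sq_sum = sum(v * v for v in x[lo:hi])
        out.append(x[c] / (1.0 + (alpha / n) * sq_sum) ** beta)
    return out


print(lrn_across_channels([1.0, 2.0, 3.0], local_size=3))
```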

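im2col: a helper that performs the image-to-column transformation; you most likely do not need to know about it. Caffe's original convolution uses it to turn convolution into a matrix multiplication by laying out all patches into a matrix.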
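For readers who do want to see the idea, the following sketch shows it for a single-channel 2D input without padding. It is not Caffe's code: both function names are hypothetical, and the patches are stored one per row here for readability, so the layout is the transpose of a column-per-patch matrix.

```python
def im2col(image, kernel, stride=1):
    """Return one flattened row per kernel-sized patch of a 2D list (no padding)."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(0, h - kernel + 1, stride):
        for j in range(0, w - kernel + 1, stride):
            patch = [image[i + di][j + dj]
                     for di in range(kernel) for dj in range(kernel)]
            rows.append(patch)
    return rows


def conv2d_via_im2col(image, filt, stride=1):
    """Apply one square filter as a matrix-vector product over the patch matrix."""
    k = len(filt)
    flat = [v for row in filt for v in row]
    return [sum(p * f for p, f in zip(patch, flat))
            for patch in im2col(image, k, stride)]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(conv2d_via_im2col(image, [[1, 0], [0, 1]]))  # [6, 8, 12, 14]
```

Once the patches form a matrix, applying many filters at once becomes a single matrix multiplication, which is the reason for the transformation.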

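Loss Layers:

Loss drives learning by comparing an output to a target and assigning a cost to minimize. The loss itself is computed by the forward pass, and the gradient with respect to the loss is computed by the backward pass.

Softmax: type: SoftmaxWithLoss

The softmax loss layer computes the multinomial logistic loss of the softmax of its inputs. It is conceptually identical to a softmax layer followed by a multinomial logistic loss layer, but it provides a more numerically stable gradient.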
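The stability point can be illustrated for a single sample. The sketch below is an assumption about how one might fuse the two steps, not a copy of Caffe's code: it subtracts the maximum score before exponentiating and computes the loss in log-sum-exp form, so very large scores neither overflow nor lead to log(0). `softmax_with_loss` is a hypothetical name.

```python
import math


def softmax_with_loss(scores, label):
    """Return (probabilities, loss) for one sample, where loss = -log(prob[label])."""
    m = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Log-sum-exp form of -log(softmax(scores)[label]); avoids taking log of 0.
    loss = -(scores[label] - m - math.log(total))
    return probs, loss


probs, loss = softmax_with_loss([1000.0, 0.0], label=1)
print(probs, loss)
```

A naive version that first computes the probability and then takes its logarithm would fail on this input, because the probability of class 1 underflows to zero.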

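Sum-of-Squares / Euclidean: type: EuclideanLoss

The Euclidean loss layer computes the sum of squares of the differences of its two inputs:

$$\frac{1}{2N} \sum_{i=1}^{N} \| x^1_i - x^2_i \|_2^2$$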
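As a small worked example, the sketch below evaluates this loss for two batches given as nested lists. It assumes the 1 / (2N) scaling that the Caffe documentation uses for this layer, with N the number of samples; `euclidean_loss` is a hypothetical helper name.

```python
def euclidean_loss(pred, target):
    """1/(2N) times the summed squared difference between two batches of vectors."""
    n = len(pred)
    total = sum((p - t) ** 2
                for p_row, t_row in zip(pred, target)
                for p, t in zip(p_row, t_row))
    return total / (2.0 * n)


# One sample: ((1 - 0)^2 + (2 - 0)^2) / (2 * 1) = 2.5
print(euclidean_loss([[1.0, 2.0]], [[0.0, 0.0]]))  # 2.5
```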

Hinge / Margin:

Layer type: HingeLoss
CPU implementation: ./src/caffe/layers/hinge_loss_layer.cpp
CUDA GPU implementation: none yet
Parameters (HingeLossParameter hinge_loss_param)
- Optional
  - norm [default L1]: the norm to use, currently either L1 or L2
Input:
- n * c * h * w Predictions
- n * 1 * 1 * 1 Labels
Output: 1 * 1 * 1 * 1 Computed loss

Example:

# L1 Norm
layer {
  name: "loss"
  type: "HingeLoss"
  bottom: "pred"
  bottom: "label"
}

# L2 Norm
layer {
  name: "loss"
  type: "HingeLoss"
  bottom: "pred"
  bottom: "label"
  top: "loss"
  hinge_loss_param {
    norm: L2
  }
}
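The difference between the two norms can be seen in a small sketch of a one-vs-all hinge loss. This reflects my understanding of the layer rather than a verified port: each class score is pushed above +1 for the true class and below -1 for the others, the per-class margins are summed (L1) or squared and summed (L2), and the result is averaged over the batch. `hinge_loss` is a hypothetical name.

```python
def hinge_loss(preds, labels, norm="L1"):
    """One-vs-all hinge loss averaged over samples; norm is "L1" or "L2"."""
    total = 0.0
    for scores, label in zip(preds, labels):
        for j, s in enumerate(scores):
            sign = 1.0 if j == label else -1.0
            margin = max(0.0, 1.0 - sign * s)
            total += margin if norm == "L1" else margin ** 2
    return total / len(preds)


preds = [[-1.0, 0.0]]  # one sample, two classes
print(hinge_loss(preds, [0], norm="L1"))  # 3.0
print(hinge_loss(preds, [0], norm="L2"))  # 5.0
```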

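Sigmoid Cross-Entropy: type: SigmoidCrossEntropyLoss. A cross-entropy loss used for multi-label classification.

Infogain: type: InfogainLoss. The information gain loss.

Accuracy and Top-K: Accuracy scores the output against the target by measuring how often they agree. It is not actually a loss and has no backward step.

Activation / Neuron Layers:

In general, these layers are element-wise operators that take one bottom blob and produce one top blob of the same size. The layer descriptions below omit the input and output sizes because they are identical: n * c * h * w.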

ReLU / Rectified-Linear and Leaky-ReLU:

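Layer type: ReLU
CPU implementation: ./src/caffe/layers/relu_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/relu_layer.cu
Parameters (ReLUParameter relu_param)
- Optional
  - negative_slope [default 0]: specifies whether to leak the negative part by multiplying it by this slope value rather than setting it to 0.
Example: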
```
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
```
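Given an input value x, the ReLU layer outputs x if x > 0 and negative_slope * x otherwise. When negative_slope is not set, this is equivalent to the standard ReLU function max(x, 0). The layer also supports in-place computation, meaning that the bottom and top blobs can be the same, which reduces memory consumption.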
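A minimal sketch of the same rule in Python, including an in-place variant that overwrites its input the way a layer with identical bottom and top would. The function names are hypothetical and a plain list stands in for a blob.

```python
def relu(x, negative_slope=0.0):
    """Return x for positive inputs and negative_slope * x otherwise."""
    return x if x > 0 else negative_slope * x


def relu_inplace(blob, negative_slope=0.0):
    """Apply ReLU to every element of a list, overwriting it (no second buffer)."""
    for i, x in enumerate(blob):
        blob[i] = relu(x, negative_slope)
    return blob


values = [-1.0, 2.0]
relu_inplace(values, negative_slope=0.5)
print(values)  # [-0.5, 2.0]
```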

Sigmoid:

Layer type: Sigmoid
CPU implementation: ./src/caffe/layers/sigmoid_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/sigmoid_layer.cu

Example:

layer {
  name: "encode1neuron"
  bottom: "encode1"
  top: "encode1neuron"
  type: "Sigmoid"
}

TanH / Hyperbolic Tangent

Layer type: TanH
CPU implementation: ./src/caffe/layers/tanh_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/tanh_layer.cu

Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "TanH"
}

Absolute Value

Layer type: AbsVal
CPU implementation: ./src/caffe/layers/absval_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/absval_layer.cu

Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "AbsVal"
}

Power

Layer type: Power
CPU implementation: ./src/caffe/layers/power_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/power_layer.cu
Parameters (PowerParameter power_param)
- Optional
  - power [default 1]
  - scale [default 1]
  - shift [default 0]

Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "Power"
  power_param {
    power: 1
    scale: 1
    shift: 0
  }
}

For each input element x, the Power layer computes the output (shift + scale * x) ^ power.
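The formula is easy to try out directly. In this sketch `power_layer` is a hypothetical function whose keyword defaults mirror the parameter defaults listed above, so with no arguments it acts as the identity.

```python
def power_layer(x, power=1.0, scale=1.0, shift=0.0):
    """Compute (shift + scale * x) ** power for a single input value."""
    return (shift + scale * x) ** power


print(power_layer(3.0))                                   # 3.0
print(power_layer(2.0, power=2.0, scale=3.0, shift=1.0))  # 49.0
```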

BNLL (Binomial Normal Log Likelihood)

Layer type: BNLL
CPU implementation: ./src/caffe/layers/bnll_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/bnll_layer.cu

Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "BNLL"
}

For each input x, the BNLL layer computes the output log(1 + exp(x)).
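Evaluating exp(x) directly overflows for large x, so a practical version rearranges the expression. The sketch below uses the identity log(1 + exp(x)) = x + log(1 + exp(-x)) for positive x; it illustrates the idea and is not taken from Caffe's source. `bnll` is a hypothetical name.

```python
import math


def bnll(x):
    """Numerically stable log(1 + exp(x))."""
    if x > 0:
        # For large positive x, exp(-x) is tiny instead of overflowing.
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


print(bnll(0.0))     # log(2)
print(bnll(1000.0))  # 1000.0, where a direct math.exp(1000.0) would overflow
```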

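Data Layers:

Data enters Caffe through data layers, which lie at the bottom of nets. Data can come from efficient databases (LevelDB or LMDB), directly from memory, or from files on disk in HDF5 or common image formats.

Common input preprocessing (mean subtraction, scaling, random cropping, and mirroring) is available by specifying TransformationParameters.

Database: data from LevelDB or LMDB

Layer type: Data
Parameters:
- Required
  - source: the name of the directory containing the database
  - batch_size: the number of inputs to process at one time
- Optional
  - rand_skip: skip up to this number of inputs at the beginning; useful for asynchronous SGD
  - backend [default LEVELDB]: choose whether to use LEVELDB or LMDB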

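In-Memory: data from memory

Layer type: MemoryData
Parameters:
- Required
  - batch_size, channels, height, width: specify the size of the input chunks to read from memory

The memory data layer reads data directly from memory, without copying it. To use it, call MemoryDataLayer::Reset (from C++) or Net.set_input_arrays (from Python) to specify a source of contiguous data (for example, a 4D row-major array), which is then read one batch-sized chunk at a time.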

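HDF5 Input:

Layer type: HDF5Data
Parameters:
- Required
  - source: the name of the file to read from
  - batch_size

HDF5 Output:

Layer type: HDF5Output
Parameters:
- Required
  - filename: the name of the file to write to

Images: image input

Layer type: ImageData
Parameters:
- Required
  - source: the name of a text file in which each line gives an image filename and a label
  - batch_size: the number of images to batch together
- Optional
  - rand_skip
  - shuffle [default false]: whether to shuffle the order
  - new_height, new_width: if provided, resize all images to this size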
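To make the expected format of the source file concrete, here is a sketch that parses such a listing, assuming one "filename label" pair per line separated by whitespace and an integer label. The file names are made up for illustration and `parse_image_list` is a hypothetical helper, not a Caffe function.

```python
def parse_image_list(text):
    """Parse lines of the form '<image path> <integer label>' into tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace so the path stays in one piece.
        path, label = line.rsplit(None, 1)
        entries.append((path, int(label)))
    return entries


listing = "cat/001.jpg 0\ndog/002.jpg 1\n"  # hypothetical file contents
print(parse_image_list(listing))  # [('cat/001.jpg', 0), ('dog/002.jpg', 1)]
```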

Windows type: `WindowData`

Dummy

DummyData is for development and debugging. See DummyDataParameter.

Common Layers:

Inner Product

Layer type: InnerProduct
CPU implementation: ./src/caffe/layers/inner_product_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/inner_product_layer.cu
Parameters (InnerProductParameter inner_product_param)
- Required
  - num_output (c_o): the number of filters
- Strongly recommended
  - weight_filler [default type: 'constant' value: 0]
- Optional
  - bias_filler [default type: 'constant' value: 0]
  - bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs
Input: n * c_i * h_i * w_i
Output: n * c_o * 1 * 1

Example:

layer {
  name: "fc8"
  type: "InnerProduct"
  # learning rate and decay multipliers for the weights
  param { lr_mult: 1 decay_mult: 1 }
  # learning rate and decay multipliers for the biases
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  bottom: "fc7"
  top: "fc8"
}

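The InnerProduct layer (usually also called the fully connected layer) treats the input as a simple vector and produces an output in the form of a single vector (with the blob's height and width set to 1).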
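For one sample, the computation amounts to a matrix-vector product plus an optional bias. The sketch below assumes the input has already been flattened to length c_i * h_i * w_i and that the weights are stored as num_output rows of that length; the names are hypothetical.

```python
def inner_product(x, weights, bias=None):
    """Fully connected layer for one sample: out[k] = dot(weights[k], x) + bias[k]."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weights]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out


x = [1.0, 2.0]                    # flattened input
weights = [[1.0, 0.0], [0.5, 0.5]]  # num_output = 2
print(inner_product(x, weights, bias=[0.0, 1.0]))  # [1.0, 2.5]
```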

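Splitting:

The Split layer is a utility layer that splits an input blob into multiple output blobs. It is used when a blob is fed into multiple output layers.

Flattening:

The Flatten layer is a utility layer that flattens an input of shape n * c * h * w to a simple vector output of shape n * (c * h * w). Each of the n samples is flattened separately into a vector of dimension c * h * w.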

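Reshape:

Layer type: Reshape
CPU implementation: ./src/caffe/layers/reshape_layer.cpp
Parameters (ReshapeParameter reshape_param)
- Optional
  - shape
Input: a single blob with arbitrary dimensions
Output: the same blob, with its dimensions modified as specified by reshape_param
Example: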
```
  layer {
    name: "reshape"
    type: "Reshape"
    bottom: "input"
    top: "output"
    reshape_param {
      shape {
        dim: 0  # copy the dimension from below
        dim: 2
        dim: 3
        dim: -1 # infer it from the other dimensions
      }
    }
  }
```
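The Reshape layer changes the dimensions of its input without changing its data. Just like the Flatten layer, only the dimensions change; no data is copied in the process.
Output dimensions are specified by reshape_param. Positive numbers are used directly, setting the corresponding dimension of the output blob. In addition, two special values are accepted for any of the target dimensions:
- 0: copy the respective dimension from the bottom layer. For example, if dim: 0 is given as the first target dimension and the bottom has 2 as its first dimension, the top will also have 2 as its first dimension, so the original dimension is left unchanged.
- -1: infer this dimension from the other dimensions. This behaves like -1 in numpy's reshape or [ ] in MATLAB's reshape: the dimension is computed so that the total element count stays the same as in the bottom layer. At most one -1 can be used in a reshape operation.

As another example, specifying reshape_param { shape { dim: 0 dim: -1 } } makes the layer behave in the same way as the Flatten layer: both flatten the input blob into one vector per sample.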
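The handling of the two special values can be sketched as plain shape arithmetic. The helper below is hypothetical and only computes the resulting shape; the bottom shape (64, 6, 10, 10) is an assumed value chosen so that the dims from the example above divide evenly.

```python
def infer_reshape(bottom_shape, dims):
    """Resolve 0 (copy from bottom) and -1 (infer) entries into a concrete shape."""
    top = []
    infer_axis = None
    for i, d in enumerate(dims):
        if d == 0:
            top.append(bottom_shape[i])  # copy the dimension from below
        elif d == -1:
            if infer_axis is not None:
                raise ValueError("at most one -1 is allowed")
            infer_axis = i
            top.append(1)  # placeholder until the count is known
        else:
            top.append(d)
    count = 1
    for d in bottom_shape:
        count *= d
    known = 1
    for d in top:
        known *= d
    if infer_axis is not None:
        if count % known != 0:
            raise ValueError("element count is not divisible by the given dims")
        top[infer_axis] = count // known
    elif known != count:
        raise ValueError("element count must stay the same")
    return tuple(top)


print(infer_reshape((64, 6, 10, 10), [0, 2, 3, -1]))  # (64, 2, 3, 100)
print(infer_reshape((64, 3, 4, 5), [0, -1]))          # (64, 60), same as Flatten
```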

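Concatenation:

The Concat layer is a utility layer that concatenates multiple input blobs into one single output blob.

Layer type: Concat
CPU implementation: ./src/caffe/layers/concat_layer.cpp
CUDA GPU implementation: ./src/caffe/layers/concat_layer.cu
Parameters (ConcatParameter concat_param)
- Optional
  - axis [default 1]: 0 for concatenation along num and 1 for concatenation along channels
Input: n_i * c_i * h * w for each input blob i from 1 to K
Output:
- if axis = 0: (n_1 + n_2 + ... + n_K) * c_1 * h * w, and all input c_i should be the same
- if axis = 1: n_1 * (c_1 + c_2 + ... + c_K) * h * w, and all input n_i should be the same

Example: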

layer {
  name: "concat"
  bottom: "in1"
  bottom: "in2"
  top: "out"
  type: "Concat"
  concat_param {
    axis: 1
  }
}
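The shape rule for concatenation can be written as a short check. `concat_shape` is a hypothetical helper that computes the output shape and rejects inputs whose other dimensions differ; the shapes used below are made-up examples.

```python
def concat_shape(shapes, axis=1):
    """Output shape when concatenating blobs of the given shapes along axis."""
    first = shapes[0]
    for s in shapes[1:]:
        same_rank = len(s) == len(first)
        if not same_rank or any(a != b for i, (a, b) in enumerate(zip(s, first))
                                if i != axis):
            raise ValueError("all dimensions except the concat axis must match")
    out = list(first)
    out[axis] = sum(s[axis] for s in shapes)
    return tuple(out)


print(concat_shape([(8, 3, 5, 5), (8, 4, 5, 5)], axis=1))  # (8, 7, 5, 5)
print(concat_shape([(2, 3, 5, 5), (6, 3, 5, 5)], axis=0))  # (8, 3, 5, 5)
```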

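Slicing:

The Slice layer is a utility layer that slices an input layer into multiple output layers along a given dimension (currently only num or channel) at the given slice indices.

Example: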

layer {
  name: "slicer_label"
  type: "Slice"
  bottom: "label"
  ## Example of label with a shape N x 3 x 1 x 1
  top: "label1"
  top: "label2"
  top: "label3"
  slice_param {
    axis: 1
    slice_point: 1
    slice_point: 2
  }
}

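axis indicates the target axis along which to slice. slice_point gives the indices along that axis at which to cut; the number of indices must equal the number of top blobs minus one.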
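The relation between slice points and top blobs can be sketched as follows: K - 1 slice points split an axis into K half-open ranges, one per top. `slice_ranges` is a hypothetical helper, and the axis size of 3 matches the N x 3 x 1 x 1 label in the example above.

```python
def slice_ranges(axis_size, slice_points):
    """Turn slice points into (start, end) index ranges, one per output blob."""
    bounds = [0] + list(slice_points) + [axis_size]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


# Three tops (label1, label2, label3) need two slice points.
print(slice_ranges(3, [1, 2]))  # [(0, 1), (1, 2), (2, 3)]
```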

Elementwise Operations: type: Eltwise

Argmax: type: ArgMax

Softmax: type: Softmax

Mean-Variance Normalization: type: MVN
