The Scale layer scales and shifts its input. It typically appears right after a BatchNorm layer: in Caffe, normalization is usually implemented as BatchNorm + Scale, which together are equivalent to PyTorch's BatchNorm (BatchNorm2d with affine=True), since Caffe's BatchNorm layer only normalizes and has no learnable affine part.
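To make that equivalence concrete, here is a minimal NumPy sketch (not Caffe code; the shapes and eps value are illustrative) of a normalize-only BatchNorm followed by a per-channel Scale:

import numpy as np

def caffe_batchnorm(x, mean, var, eps=1e-5):
    # Caffe BatchNorm: normalize only, no learnable affine part.
    return (x - mean.reshape(1, -1, 1, 1)) / np.sqrt(var.reshape(1, -1, 1, 1) + eps)

def caffe_scale(x, gamma, beta):
    # Caffe Scale with axis=1, bias_term=true: per-channel y = gamma*x + beta.
    return gamma.reshape(1, -1, 1, 1) * x + beta.reshape(1, -1, 1, 1)

x = np.random.randn(2, 3, 4, 4).astype(np.float32)
mean, var = x.mean(axis=(0, 2, 3)), x.var(axis=(0, 2, 3))
gamma, beta = np.random.randn(3), np.random.randn(3)

# Together these match an affine BatchNorm with the same statistics.
y = caffe_scale(caffe_batchnorm(x, mean, var), gamma, beta)

Let's first look at ScaleParameter (from caffe.proto):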
message ScaleParameter {
  // The first axis of bottom[0] (the first input Blob) along which to apply
  // bottom[1] (the second input Blob).  May be negative to index from the end
  // (e.g., -1 for the last axis).
  //
  // For example, if bottom[0] is 4D with shape 100x3x40x60, the output
  // top[0] will have the same shape, and bottom[1] may have any of the
  // following shapes (for the given value of axis):
  //    (axis == 0 == -4) 100; 100x3; 100x3x40; 100x3x40x60
  //    (axis == 1 == -3)          3;     3x40;     3x40x60
  //    (axis == 2 == -2)                   40;       40x60
  //    (axis == 3 == -1)                                60
  // Furthermore, bottom[1] may have the empty shape (regardless of the value
  // of "axis") -- a scalar multiplier.
  optional int32 axis = 1 [default = 1];

  // (num_axes is ignored unless just one bottom is given and the scale is
  // a learned parameter of the layer.  Otherwise, num_axes is determined by
  // the number of axes of the second bottom.)
  // The number of axes of the input (bottom[0]) covered by the scale
  // parameter, or -1 to cover all axes of bottom[0] starting from `axis`.
  // Set num_axes := 0, to multiply with a zero-axis Blob: a scalar.
  optional int32 num_axes = 2 [default = 1];

  // (filler is ignored unless just one bottom is given and the scale is
  // a learned parameter of the layer.)
  // The initialization for the learned scale parameter.
  // Default is the unit (1) initialization, resulting in the ScaleLayer
  // initially performing the identity operation.
  optional FillerParameter filler = 3;

  // Whether to also learn a bias (equivalent to a ScaleLayer+BiasLayer, but
  // may be more efficient).  Initialized with bias_filler (defaults to 0).
  optional bool bias_term = 4 [default = false];
  optional FillerParameter bias_filler = 5;
}
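The broadcasting rule described by the axis comment can be summarized in a short NumPy sketch (an illustration of the semantics, not Caffe's implementation): bottom[1] is aligned with bottom[0] starting at axis, with singleton dimensions padded on both sides.

import numpy as np

def scale_forward(bottom0, bottom1, axis=1):
    # Reshape bottom[1] so its axes line up with bottom[0] starting at
    # `axis`, then let NumPy broadcasting do the elementwise multiply.
    if axis < 0:
        axis += bottom0.ndim
    new_shape = ([1] * axis + list(bottom1.shape)
                 + [1] * (bottom0.ndim - axis - bottom1.ndim))
    return bottom0 * bottom1.reshape(new_shape)

x = np.ones((100, 3, 40, 60), dtype=np.float32)
print(scale_forward(x, np.full((3,), 2.0), axis=1).shape)     # (100, 3, 40, 60)
print(scale_forward(x, np.full((3, 40), 2.0), axis=1).shape)  # (100, 3, 40, 60)
print(scale_forward(x, np.array(2.0)).shape)                  # empty shape: a scalar multiplier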
A Scale layer is written in a prototxt as follows (note that bottom and top name the same blob, "conv1", so the scaling is performed in place):
layer {
  name: "scale_conv1"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  scale_param {
    bias_term: true
  }
}
For example, in MobileNet:
layer {
  name: "conv6_4/scale"
  type: "Scale"
  bottom: "conv6_4/bn"
  top: "conv6_4/bn"
  param {
    lr_mult: 1
    decay_mult: 0
  }
  param {
    lr_mult: 1
    decay_mult: 0
  }
  scale_param {
    bias_term: true
  }
}
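With bias_term: true the layer owns two learnable blobs, which is why there are two param blocks: they set the learning-rate and weight-decay multipliers for the scale and the bias, in that order (decay_mult: 0 exempts both from weight decay). A hedged pycaffe sketch to inspect them; the file names here are placeholders, not from the original post:

import caffe

# Blob [0] of the Scale layer is the per-channel scale (gamma),
# blob [1] is the bias (beta), matching the order of the param blocks.
net = caffe.Net('mobilenet_deploy.prototxt', 'mobilenet.caffemodel', caffe.TEST)
gamma = net.params['conv6_4/scale'][0].data  # shape: (C,)
beta  = net.params['conv6_4/scale'][1].data  # shape: (C,)
print(gamma.shape, beta.shape)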