Please credit the source when reposting:
http://www.cnblogs.com/darkknightzh/p/6065526.html
Experimenting with this section a few times is enough to work out exactly how each layer can be accessed.
step1. The network is defined as follows:
require "dpnn"

local net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 64, 7, 7, 2, 2, 3, 3))
net:add(nn.SpatialBatchNormalization(64))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(3, 3, 2, 2, 1, 1))
net:add(nn.Inception{
   inputSize = 64,
   kernelSize = {3, 5},
   kernelStride = {1, 1},
   outputSize = {128, 32},
   reduceSize = {96, 16, 32, 64},
   pool = nn.SpatialMaxPooling(3, 3, 1, 1, 1, 1),
   batchNorm = true
})
net:evaluate()
The network above consists of conv + BatchNorm + ReLU + MaxPool + Inception layers.
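As a quick sanity check (a minimal sketch; the 1x3x128x128 input shape is an assumption, chosen to match step14 below), a forward pass reveals the overall output shape:

-- Sketch: forward a dummy batch through the network.
local out = net:forward(torch.rand(1, 3, 128, 128))
-- expected 1x256x32x32: the DepthConcat stacks 128 + 32 + 32 + 64 channels
print(out:size())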
step2. Printing the network directly with print(net) yields its structure:
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
(1): nn.SpatialConvolution(3 -> 64, 7x7, 2,2, 3,3)
(2): nn.SpatialBatchNormalization
(3): nn.ReLU
(4): nn.SpatialMaxPooling(3x3, 2,2, 1,1)
(5): nn.Inception @ nn.DepthConcat {
input
|`-> (1): nn.Sequential {
| [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
| (1): nn.SpatialConvolution(64 -> 96, 1x1)
| (2): nn.SpatialBatchNormalization
| (3): nn.ReLU
| (4): nn.SpatialConvolution(96 -> 128, 3x3, 1,1, 1,1)
| (5): nn.SpatialBatchNormalization
| (6): nn.ReLU
| }
|`-> (2): nn.Sequential {
| [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
| (1): nn.SpatialConvolution(64 -> 16, 1x1)
| (2): nn.SpatialBatchNormalization
| (3): nn.ReLU
| (4): nn.SpatialConvolution(16 -> 32, 5x5, 1,1, 2,2)
| (5): nn.SpatialBatchNormalization
| (6): nn.ReLU
| }
|`-> (3): nn.Sequential {
| [input -> (1) -> (2) -> (3) -> (4) -> output]
| (1): nn.SpatialMaxPooling(3x3, 1,1, 1,1)
| (2): nn.SpatialConvolution(64 -> 32, 1x1)
| (3): nn.SpatialBatchNormalization
| (4): nn.ReLU
| }
|`-> (4): nn.Sequential {
[input -> (1) -> (2) -> (3) -> output]
(1): nn.SpatialConvolution(64 -> 64, 1x1)
(2): nn.SpatialBatchNormalization
(3): nn.ReLU
}
... -> output
}
}
In reality, though, the network object also holds fields such as output, gradInput, and train beyond what print(net) shows.
step3. The following code prints the network's fields in more detail:
for k, v in pairs(net) do print(k, v) end
step4. The output:
_type torch.DoubleTensor
output [torch.DoubleTensor with no dimension]
gradInput [torch.DoubleTensor with no dimension]
modules {
1 :
{
dH : 2
dW : 2
nInputPlane : 3
output : DoubleTensor - empty
kH : 7
train : false
gradBias : DoubleTensor - size: 64
padH : 3
bias : DoubleTensor - size: 64
weight : DoubleTensor - size: 64x3x7x7
_type : "torch.DoubleTensor"
gradWeight : DoubleTensor - size: 64x3x7x7
padW : 3
nOutputPlane : 64
kW : 7
gradInput : DoubleTensor - empty
}
2 :
{
gradBias : DoubleTensor - size: 64
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
running_var : DoubleTensor - size: 64
momentum : 0.1
gradWeight : DoubleTensor - size: 64
eps : 1e-05
_type : "torch.DoubleTensor"
affine : true
running_mean : DoubleTensor - size: 64
bias : DoubleTensor - size: 64
weight : DoubleTensor - size: 64
train : false
}
3 :
{
inplace : false
threshold : 0
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
train : false
val : 0
}
4 :
{
dH : 2
dW : 2
kW : 3
gradInput : DoubleTensor - empty
indices : DoubleTensor - empty
train : false
_type : "torch.DoubleTensor"
padH : 1
ceil_mode : false
output : DoubleTensor - empty
kH : 3
padW : 1
}
5 :
{
outputSize :
{
1 : 128
2 : 32
}
inputSize : 64
gradInput : DoubleTensor - empty
modules :
{
1 :
{
train : false
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules :
{
1 : {...}
2 : {...}
3 : {...}
4 : {...}
}
dimension : 2
size : LongStorage - size: 0
}
}
kernelStride :
{
1 : 1
2 : 1
}
_type : "torch.DoubleTensor"
module :
{
train : false
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules :
{
1 :
{
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules : {...}
train : false
}
2 :
{
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules : {...}
train : false
}
3 :
{
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules : {...}
train : false
}
4 :
{
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules : {...}
train : false
}
}
dimension : 2
size : LongStorage - size: 0
}
poolStride : 1
padding : true
reduceStride : {...}
transfer :
{
inplace : false
threshold : 0
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
val : 0
}
batchNorm : true
train : false
pool :
{
dH : 1
dW : 1
kW : 3
gradInput : DoubleTensor - empty
indices : DoubleTensor - empty
train : false
_type : "torch.DoubleTensor"
padH : 1
ceil_mode : false
output : DoubleTensor - empty
kH : 3
padW : 1
}
poolSize : 3
reduceSize :
{
1 : 96
2 : 16
3 : 32
4 : 64
}
kernelSize :
{
1 : 3
2 : 5
}
output : DoubleTensor - empty
}
}
train false
In the modules table above, the entries are the parameters of the conv, BatchNorm, ReLU, MaxPool, and Inception layers, respectively.
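To confirm this mapping, a minimal sketch that prints only each top-level layer's class name:

-- Sketch: list the class of every top-level layer.
for i, layer in ipairs(net.modules) do
   print(i, torch.type(layer))
end
-- expected: 1 nn.SpatialConvolution, 2 nn.SpatialBatchNormalization,
--           3 nn.ReLU, 4 nn.SpatialMaxPooling, 5 nn.Inception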
step5. nn.SpatialConvolution can be indexed as net.modules[1]; for example, print(net.modules[1]) gives:
nn.SpatialConvolution(3 -> 64, 7x7, 2,2, 3,3)
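Once a layer is indexed this way, its fields can be read directly; a small sketch (the sizes follow from the step4 dump):

-- Sketch: direct field access on the indexed conv layer.
print(net.modules[1].weight:size())  -- 64x3x7x7
print(net.modules[1].padH)           -- 3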
step6. To go a step further and print this layer's parameters, the following code can be used (step4 actually printed them already):
for k, v in pairs(net.modules[1]) do
   if type(v) ~= 'userdata' then
      print(k, v)
   else
      -- tensors are userdata here: print their sizes rather than their contents
      local strval = ' '
      for i = 1, v:dim() do
         strval = strval .. v:size(i) .. ' '
      end
      -- "\27[31m" is an ANSI escape that prints the size in red
      print(k .. ' ' .. type(v) .. ' ' .. string.format('\27[31m size: %s', strval))
   end
end
step7. The result is:
dH	2
dW	2
nInputPlane	3
output userdata  size:
kH	7
train	false
gradBias userdata  size: 64
padH	3
bias userdata  size: 64
weight userdata  size: 64 3 7 7
_type	torch.DoubleTensor
gradWeight userdata  size: 64 3 7 7
padW	3
nOutputPlane	64
kW	7
gradInput userdata  size:
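As an aside, the standard nn method parameters() returns the same learnable tensors without iterating fields (a sketch; the sizes follow from the dump above):

-- Sketch: parameters() returns {weight, bias} and {gradWeight, gradBias}.
local params, gradParams = net.modules[1]:parameters()
print(#params, params[1]:size(), params[2]:size())  -- 2, 64x3x7x7, 64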
step8. The Inception layer is not fully expanded in step4's output. Following step5's approach, net.modules[5] retrieves the Inception layer; adapting step6's loop to it, as sketched below, produces this output:
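A minimal sketch of the adapted loop (only the indexed module changes; table-valued fields fall through to Lua's default table printer):

-- Sketch: step6's loop applied to the Inception layer.
for k, v in pairs(net.modules[5]) do
   if type(v) ~= 'userdata' then
      print(k, v)
   else
      local strval = ' '
      for i = 1, v:dim() do
         strval = strval .. v:size(i) .. ' '
      end
      print(k .. ' ' .. type(v) .. string.format(' size: %s', strval))
   end
end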
outputSize {
1 : 128
2 : 32
}
inputSize 64
gradInput userdata size:
modules {
1 :
{
train : false
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules :
{
1 :
{
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules :
{
1 : {...}
2 : {...}
3 : {...}
4 : {...}
5 : {...}
6 : {...}
}
train : false
}
2 :
{
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules :
{
1 : {...}
2 : {...}
3 : {...}
4 : {...}
5 : {...}
6 : {...}
}
train : false
}
3 :
{
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules :
{
1 : {...}
2 : {...}
3 : {...}
4 : {...}
}
train : false
}
4 :
{
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
modules :
{
1 : {...}
2 : {...}
3 : {...}
}
train : false
}
}
dimension : 2
size : LongStorage - size: 0
}
}
kernelStride {
1 : 1
2 : 1
}
_type torch.DoubleTensor
module nn.DepthConcat {
input
|`-> (1): nn.Sequential {
| [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
| (1): nn.SpatialConvolution(64 -> 96, 1x1)
| (2): nn.SpatialBatchNormalization
| (3): nn.ReLU
| (4): nn.SpatialConvolution(96 -> 128, 3x3, 1,1, 1,1)
| (5): nn.SpatialBatchNormalization
| (6): nn.ReLU
| }
|`-> (2): nn.Sequential {
| [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
| (1): nn.SpatialConvolution(64 -> 16, 1x1)
| (2): nn.SpatialBatchNormalization
| (3): nn.ReLU
| (4): nn.SpatialConvolution(16 -> 32, 5x5, 1,1, 2,2)
| (5): nn.SpatialBatchNormalization
| (6): nn.ReLU
| }
|`-> (3): nn.Sequential {
| [input -> (1) -> (2) -> (3) -> (4) -> output]
| (1): nn.SpatialMaxPooling(3x3, 1,1, 1,1)
| (2): nn.SpatialConvolution(64 -> 32, 1x1)
| (3): nn.SpatialBatchNormalization
| (4): nn.ReLU
| }
|`-> (4): nn.Sequential {
[input -> (1) -> (2) -> (3) -> output]
(1): nn.SpatialConvolution(64 -> 64, 1x1)
(2): nn.SpatialBatchNormalization
(3): nn.ReLU
}
... -> output
}
poolStride 1
padding true
reduceStride {}
transfer nn.ReLU
batchNorm true
train false
pool nn.SpatialMaxPooling(3x3, 1,1, 1,1)
poolSize 3
reduceSize {
1 : 96
2 : 16
3 : 32
4 : 64
}
kernelSize {
1 : 3
2 : 5
}
output userdata size:
step9. In step8's output, modules[1] is the internal nn.DepthConcat, and its own sub-modules are the Inception branches (3x3 conv, 5x5 conv, pooling, 1x1 reduce). These branches can be reached through net.modules[5].module, which again carries train, output, gradInput, and modules fields, and can be displayed with print(net.modules[5].module).
step10. Following the idea in step5, net.modules[5].module.modules[1] gives the details of the 3x3 convolution branch:
_type torch.DoubleTensor
output userdata size:
gradInput userdata size:
modules {
1 :
{
dH : 1
dW : 1
nInputPlane : 64
output : DoubleTensor - empty
kH : 1
train : false
gradBias : DoubleTensor - size: 96
padH : 0
bias : DoubleTensor - size: 96
weight : DoubleTensor - size: 96x64x1x1
_type : "torch.DoubleTensor"
gradWeight : DoubleTensor - size: 96x64x1x1
padW : 0
nOutputPlane : 96
kW : 1
gradInput : DoubleTensor - empty
}
2 :
{
gradBias : DoubleTensor - size: 96
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
running_var : DoubleTensor - size: 96
momentum : 0.1
gradWeight : DoubleTensor - size: 96
eps : 1e-05
_type : "torch.DoubleTensor"
affine : true
running_mean : DoubleTensor - size: 96
bias : DoubleTensor - size: 96
weight : DoubleTensor - size: 96
train : false
}
3 :
{
inplace : false
threshold : 0
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
train : false
val : 0
}
4 :
{
dH : 1
dW : 1
nInputPlane : 96
output : DoubleTensor - empty
kH : 3
train : false
gradBias : DoubleTensor - size: 128
padH : 1
bias : DoubleTensor - size: 128
weight : DoubleTensor - size: 128x96x3x3
_type : "torch.DoubleTensor"
gradWeight : DoubleTensor - size: 128x96x3x3
padW : 1
nOutputPlane : 128
kW : 3
gradInput : DoubleTensor - empty
}
5 :
{
gradBias : DoubleTensor - size: 128
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
running_var : DoubleTensor - size: 128
momentum : 0.1
gradWeight : DoubleTensor - size: 128
eps : 1e-05
_type : "torch.DoubleTensor"
affine : true
running_mean : DoubleTensor - size: 128
bias : DoubleTensor - size: 128
weight : DoubleTensor - size: 128
train : false
}
6 :
{
inplace : false
threshold : 0
_type : "torch.DoubleTensor"
output : DoubleTensor - empty
gradInput : DoubleTensor - empty
train : false
val : 0
}
}
train false
Note: the layer carries both a module field and a modules table. Judging from the dumps above, module is a direct reference to the internal nn.DepthConcat, while modules is the standard container list whose single entry is that same object.
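This can be checked directly (a sketch; the expected results are inferred from the dumps above):

-- Sketch: module and modules[1] should reference the same DepthConcat.
print(net.modules[5].module == net.modules[5].modules[1])  -- expected: true
print(torch.type(net.modules[5].module))                   -- expected: nn.DepthConcat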
step11. The layers inside this branch can be examined further via net.modules[5].module.modules[1].modules:
1	nn.SpatialConvolution(64 -> 96, 1x1)
2	nn.SpatialBatchNormalization
3	nn.ReLU
4	nn.SpatialConvolution(96 -> 128, 3x3, 1,1, 1,1)
5	nn.SpatialBatchNormalization
6	nn.ReLU
As shown, this branch consists of a 1x1 conv, BatchNorm, ReLU, a 3x3 conv, BatchNorm, and ReLU.
step12. To inspect the 3x3 convolution layer from step11, use the following index:
net.modules[5].module.modules[1].modules[4]
The result is:
dH	1
dW	1
nInputPlane	96
output userdata  size:
kH	3
train	false
gradBias userdata  size: 128
padH	1
bias userdata  size: 128
weight userdata  size: 128 96 3 3
_type	torch.DoubleTensor
gradWeight userdata  size: 128 96 3 3
padW	1
nOutputPlane	128
kW	3
gradInput userdata  size:
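At this depth the layer's tensors can be used directly; for example (a sketch; the statistics are purely illustrative):

-- Sketch: read tensors off the deepest conv layer directly.
local conv = net.modules[5].module.modules[1].modules[4]
print(conv.weight:mean(), conv.weight:std())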
step13. With step12, we have reached the deepest layer of the network defined in step1. Every layer in the network carries its own output, gradInput, and related fields.
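For instance (a minimal sketch; the 1x3x128x128 input shape is the same assumption as in step14), after a forward pass each layer's output is populated and its size can be read off:

-- Sketch: intermediate outputs are available after a forward pass.
net:forward(torch.rand(1, 3, 128, 128))
print(net.modules[4].output:size())                               -- max-pooled feature map
print(net.modules[5].module.modules[1].modules[4].output:size())  -- 3x3-branch conv output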
step14. For the Inception layer net.modules[5], net.modules[5].output and net.modules[5].module.output hold identical results, for example (only a small slice is printed here for readability; printing all of net.modules[5].output may show many all-zero entries):
local imgBatch = torch.rand(1, 3, 128, 128)
local infer = net:forward(imgBatch)
print(net.modules[5].output[1][2][3])
print(net.modules[5].module.output[1][2][3])
The result is:
0.01 *
 2.7396  2.9070  3.1895  1.5040  1.9784  4.0125  3.2874  3.3137
 2.1326  2.3930  2.8170  3.5226  2.3162  2.7308  2.8511  2.5278
 3.3325  3.0819  3.2826  3.5363  2.5749  2.8816  2.2393  2.4765
 2.4803  3.2553  3.0837  3.1197  2.4632  1.5145  3.7101  2.1888
[torch.DoubleTensor of size 32]

0.01 *
 2.7396  2.9070  3.1895  1.5040  1.9784  4.0125  3.2874  3.3137
 2.1326  2.3930  2.8170  3.5226  2.3162  2.7308  2.8511  2.5278
 3.3325  3.0819  3.2826  3.5363  2.5749  2.8816  2.2393  2.4765
 2.4803  3.2553  3.0837  3.1197  2.4632  1.5145  3.7101  2.1888
[torch.DoubleTensor of size 32]
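The two printouts match presumably because nn.Inception decorates its internal nn.DepthConcat: the forward pass simply delegates to the inner module and reuses its output tensor, so both fields reference the same result.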
