Giving up on PyTorch, learning Caffe
This post is just a record of my personal take on things, so it no doubt contains plenty of mistakes.
Learning Caffe
Producing a Caffe model takes the following steps:
- write network.prototxt
- write solver.prototxt
- caffe train -solver=solver.prototxt
Writing network.prototxt
In Caffe, a Net is built out of Layers, and data is passed between them as Blobs.
Writing the network is essentially a matter of organizing layers.
For what each layer can contain, refer to caffe.proto.
The general form of a layer looks like this:
layer{
name: "layer name"
type: "layer type"
bottom: "bottom blob"
top: "top blob"
param{
...
}
include{ phase: ... }
exclude{ phase: ... }
# parameters specific to the layer type; InnerProduct (fully connected) as an example
inner_product_param{
num_output: 64
weight_filler{
type: "xavier"
}
bias_filler{
type: "constant"
value: 0
}
axis: 1
}
}
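Instead of writing prototxt by hand, the same kind of layer can also be generated from pycaffe's NetSpec. A minimal sketch — the data source and layer names here are placeholders for illustration, not part of the project:
import caffe
from caffe import layers as L

# Assemble a toy net programmatically and emit it as prototxt text.
n = caffe.NetSpec()
n.data, n.label = L.HDF5Data(source="train_list.txt", batch_size=64, ntop=2)
n.ip = L.InnerProduct(n.data, num_output=64,
                      weight_filler=dict(type="xavier"),
                      bias_filler=dict(type="constant", value=0))
n.loss = L.SoftmaxWithLoss(n.ip, n.label)

with open("network.prototxt", "w") as f:
    f.write(str(n.to_proto()))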
A quick word about the goal:
the project implements spam filtering with a Chinese word-segmentation RNN plus a Bayesian classifier.
I'll just paste my network here; nobody reads this anyway :(
# project for chinese segmentation
# T: 64, batch: 64
# label[T*batch, 1, 1, 1] cont[T*batch, 1, 1, 1]=0 or 1
# data[T*batch, 1, 1, 1] ->
# embed[T*batch, 2000, 1, 1](drop&reshape) -> [T, batch, 2000, 1]
# lstm[T, batch, 256, 1](drop) ->
# ip[T, batch, 64, 1](relu) ->
# ip[T, batch, 5, 1] ->
# Accuracy & SoftmaxWithLoss
# for output: 0-none, 1-Signal, 2-Begin, 3-Middle, 4-End
name: "Segment"
# train data
layer{
name: "train_data"
type: "HDF5Data"
top: "data"
top: "label"
top: "cont"
include{ phase: TRAIN }
hdf5_data_param{
source: "/home/tanglizi/caffe/projects/data_segment/h5_test.txt"
batch_size: 4096
shuffle: true
}
}
# test data
layer{
name: "test_data"
type: "HDF5Data"
top: "data"
top: "label"
top: "cont"
include{ phase: TEST }
hdf5_data_param{
source: "/home/tanglizi/caffe/projects/data_segment/h5_test.txt"
batch_size: 4096
shuffle: true
}
}
# embed
layer{
name: "embedding"
type: "Embed"
bottom: "data"
top: "embedding"
param{
lr_mult: 1
}
embed_param{
input_dim: 14000
num_output: 2000
weight_filler {
type: "uniform"
min: -0.08
max: 0.08
}
}
}
# embed-drop
layer{
name: "embed-drop"
type: "Dropout"
bottom: "embedding"
top: "embed-drop"
dropout_param{
dropout_ratio: 0.05
}
}
# reshape
# embed
# [T*batch, 2000, 1, 1] ->
# [T, batch, 2000, 1]
layer{
name: "embed-reshape"
type: "Reshape"
bottom: "embed-drop"
top: "embed-reshaped"
reshape_param{
shape{
dim: 64
dim: 64
dim: 2000
}
}
}
# label
layer{
name: "label-reshape"
type: "Reshape"
bottom: "label"
top: "label-reshaped"
reshape_param{
shape{
dim: 64
dim: 64
dim: 1
}
}
}
# cont
layer{
name: "cont-reshape"
type: "Reshape"
bottom: "cont"
top: "cont-reshaped"
reshape_param{
shape{
dim: 64
dim: 64
}
}
}
# lstm
layer{
name: "lstm"
type: "LSTM"
bottom: "embed-reshaped"
bottom: "cont-reshaped"
top: "lstm"
recurrent_param{
num_output: 256
weight_filler{
# type: "xavier"
type: "uniform"
min: -0.08
max: 0.08
}
bias_filler{
type: "constant"
value: 0
}
}
}
# lstm-drop
layer{
name: "lstm1-drop"
type: "Dropout"
bottom: "lstm"
top: "lstm-drop"
dropout_param{
dropout_ratio: 0.05
}
}
# connect
# ip1
layer{
name: "ip1"
type: "InnerProduct"
bottom: "lstm-drop"
top: "ip1"
param{
lr_mult: 1
decay_mult: 1
}
param{
lr_mult: 2
decay_mult: 0
}
inner_product_param{
num_output: 64
weight_filler{
type: "xavier"
}
bias_filler{
type: "constant"
value: 0
}
axis: 2
}
}
# relu
layer{
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "relu1"
relu_param{
negative_slope: 0
}
}
# ip2
layer{
name: "ip2"
type: "InnerProduct"
bottom: "relu1"
top: "ip2"
param{
lr_mult: 1
}
param{
lr_mult: 2
}
inner_product_param{
num_output: 5
weight_filler{
type: "xavier"
}
bias_filler{
type: "constant"
value: 0
}
axis: 2
}
}
# loss
layer{
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label-reshaped"
top: "loss"
softmax_param{
axis: 2
}
}
# accuracy
layer{
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label-reshaped"
top: "accuracy"
accuracy_param{
axis: 2
}
}
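Before moving on to the solver, the Reshape layers and the cont blob above are worth a closer look, since they are the easiest places to get wrong. Caffe's recurrent layers expect time-major blobs of shape (T, batch, ...), and cont marks sequence boundaries (0 at the first timestep of a sequence, 1 everywhere else). A small numpy sketch of how the flat HDF5 blobs map onto that layout (the 16-step sequence length is made up for illustration):
import numpy as np

T, batch = 64, 64                    # timesteps per chunk, sequences per chunk
flat = np.arange(T * batch)          # stand-in for the flat data blob [T*batch, 1, 1, 1]

# Reshape keeps row-major order, so flat index t*batch + n lands at (t, n);
# the HDF5 file therefore has to be packed timestep-first, not sequence-first.
time_major = flat.reshape(T, batch)
assert time_major[3, 5] == 3 * batch + 5

# cont is 0 where a new sequence begins and 1 where it continues.
cont = np.ones((T, batch), dtype=np.int32)
cont[::16, :] = 0                    # e.g. if every sequence were 16 steps long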
Writing solver.prototxt
The solver holds the hyperparameters that control training and related operations in Caffe.
Again, refer to caffe.proto for what it can contain.
A typical solver looks like this:
net: "network.proto"
test_iter: 100
test_interval: 500
type: "Adam"
base_lr: 0.01
weight_decay: 0.0005
lr_policy: "inv"
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "/home/tanglizi/caffe/projects/segment/"
solver_mode: CPU
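As an aside, the same solver file can also be driven from pycaffe instead of the caffe binary, which makes it easy to inspect blobs between iterations. A sketch, assuming a working pycaffe build:
import caffe

caffe.set_mode_cpu()
solver = caffe.get_solver("solver.prototxt")

solver.step(100)                       # run 100 training iterations
print(solver.net.blobs["loss"].data)   # current training loss
# solver.solve()                       # or run straight through to max_iter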
Training the model
caffe train -solver=solver.prototxt
At this point you may hit an error like:
Message type "caffe.MultiPhaseSolverParameter" has no field named "net".
Note that this does not actually mean the net field is missing; it means some other field in the solver is set incorrectly.
This particular message is specific to Intel Caffe.
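One way to hunt down the offending field is to run the solver file through the protobuf text parser yourself; it reports the exact field and line it chokes on. A sketch, assuming the caffe_pb2 module generated for pycaffe is importable:
from caffe.proto import caffe_pb2
from google.protobuf import text_format

solver_param = caffe_pb2.SolverParameter()
with open("solver.prototxt") as f:
    # Raises text_format.ParseError naming the bad field/line if anything is off.
    text_format.Merge(f.read(), solver_param)
print(solver_param)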
Using the caffemodel
The training results were quite bad: accuracy was very low, which made me suspect the network definition was wrong again.
So I wanted to look at what was going on inside the model.
A caffemodel can be used through the C++, Python, or MATLAB interfaces.
What follows is the giant rabbit hole that is Intel Caffe and Intel DevCloud.
Using pycaffe
Note: the Python code below was run on DevCloud.
First, recall that a Caffe model is just a trained network,
so we need caffe.Net() to load the caffemodel together with net.prototxt, and caffe.io to read the data.
import caffe
from caffe import io
# This raises:
#Traceback (most recent call last):
# File "<stdin>", line 1, in <module>
#ImportError: cannot import name 'io'
Quickly check what the caffe module actually contains:
dir(caffe)
# actual output: ['__doc__', '__loader__', '__name__', '__package__', '__path__', '__spec__']
# a normal build would show: ['AdaDeltaSolver', 'AdaGradSolver', 'AdamSolver', 'Classifier', 'Detector', 'Layer', 'NesterovSolver',
# 'Net', 'NetSpec', 'RMSPropSolver', 'SGDSolver', 'TEST', 'TRAIN', '__builtins__', '__doc__', '__file__', '__name__',
# '__package__', '__path__', '__version__', '_caffe', 'classifier', 'detector', 'get_solver', 'init_log', 'io', 'layer_type_list',
# 'layers', 'log', 'net_spec', 'params', 'proto', 'pycaffe', 'set_device', 'set_mode_cpu', 'set_mode_gpu', 'set_random_seed', 'to_proto']
Damn, there is basically nothing in it.
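For the record, had pycaffe been usable, loading the model would have looked roughly like this. It's only a sketch: deploy.prototxt and segment.caffemodel are placeholder names, and the deploy file is assumed to swap the HDF5Data layers for Input layers so the data and cont blobs can be filled by hand:
import caffe

caffe.set_mode_cpu()
net = caffe.Net("deploy.prototxt", "segment.caffemodel", caffe.TEST)

net.blobs["data"].data[...] = 0   # character ids for one chunk would go here
net.blobs["cont"].data[...] = 1   # sequence continuation indicators
out = net.forward()               # dict of output blobs, e.g. the ip2 scores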
Since the project has to run on the server, running everything on my local machine is not an option.
That leaves two paths: rebuild Caffe myself, or use the C++ interface.
Too lazy to fight with a rebuild, so C++ it is.
Using the caffemodel from C++
Note: the steps below use Intel Caffe.
Again, a Caffe model is just a trained network,
so we need caffe::Net to load the caffemodel and net.prototxt.
// predict.cpp
#include <caffe/caffe.hpp>
int main() {
  // load the network definition in TEST phase (weights would then be copied in with CopyTrainedLayersFrom)
  boost::shared_ptr<caffe::Net<float> > net(new caffe::Net<float>("network.prototxt", caffe::TEST));
}
- First attempt: compile it by hand
# caffe.hpp lives under the Caffe tree, so just add the include path
clang++ -I <caffe path>/include -lboost_system predict.cpp -o predict
# which unexpectedly fails:
#/tmp/predict-fea879.o: In function 'main':
#predict.cpp:(.text+0x35b): undefined reference to 'caffe::Net<int>::Net(std::__cxx11::basic_string<char, std::char_traits<char>,
#std::allocator<char> > const&, caffe::Phase, int, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>,
#std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const*,
# caffe::Net<int> const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
#clang: error: linker command failed with exit code 1 (use -v to see invocation)
# looks like libcaffe is not being linked, so add the library path as well
clang++ -I <caffe path>/include -lboost_system predict.cpp -o predict -L <caffe path>/build/lib -lcaffe
# same error again
- Give up on compiling by hand; drop predict.cpp under examples/ and rebuild Caffe.
  Same error again.
- Put it under tools/ (where caffe.cpp lives) and rebuild Caffe.
  The build simply skips over predict.cpp.
- Annoying. Give up on local C++ and compile by hand on DevCloud instead.
  Same error again.
- If it won't even build on the cloud, what am I even doing... rebuild Intel Caffe itself.
Reconfigure Makefile.config to match the environment and build.
The build fails with:
In file included from .build_release/src/caffe/proto/caffe.pb.cc:5:0:
.build_release/src/caffe/proto/caffe.pb.h:12:2: error: #error This file was generated by a newer version of protoc which is
#error This file was generated by a newer version of protoc which is
.build_release/src/caffe/proto/caffe.pb.h:13:2: error: #error incompatible with your Protocol Buffer headers. Please update
#error incompatible with your Protocol Buffer headers. Please update
.build_release/src/caffe/proto/caffe.pb.h:14:2: error: #error your headers.
#error your headers.
.build_release/src/caffe/proto/caffe.pb.h:22:35: fatal error: google/protobuf/arena.h: No such file or directory
#include <google/protobuf/arena.h>
A quick search suggests this build expects libprotoc 2.6.1, while DevCloud has libprotoc 3.2.0.
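A quick way to confirm what is actually installed (the protoc that generated caffe.pb.h has to match the protobuf headers and runtime used for the build):
import subprocess
import google.protobuf

print(subprocess.check_output(["protoc", "--version"]).decode().strip())
print("python protobuf runtime:", google.protobuf.__version__)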
So annoying.
Then I found this article, for which many thanks to @大黃老鼠!!!
All right, I'm giving up on Caffe completely now!
On to Chainer!