[Caffe] Accumulated Usage Notes

Reposted; see the original. 2017-09-03 19:33, Deep Learning / Caffe


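This post records commonly used commands and fixes for common errors once Caffe has been compiled. If you still have trouble with the build itself, see "The most complete Caffe installation guide" to set up the Caffe environment.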

1 Usage

Train a network

xxx/caffe/build/tools/caffe train --solver xx/solver.prototxt

Use an existing model as the pretrained weights

xxx/caffe/build/tools/caffe train --solver solver.prototxt --weights pre_training.caffemodel

Resume training from a previously saved state

xxx/caffe/build/tools/caffe train --solver solver.prototxt --snapshot=train_iter_95000.solverstate

Draw the network structure

python /caffe/python/draw_net.py train_alex.prototxt alexnet.png

Train on multiple GPUs

xxx/caffe/build/tools/caffe train --solver xx/solver.prototxt --gpu=0,1

Set an environment variable so that only the desired GPU is visible

export CUDA_VISIBLE_DEVICES=1
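
The training commands above differ only in their flags, so they can be assembled by a small helper. The following is a minimal Python sketch, not part of Caffe: `build_train_cmd` and `visible_gpu_env` are hypothetical names, and only the flags already shown above (`--solver`, `--weights`, `--snapshot`, `--gpu`) are used. It rejects `--weights` combined with `--snapshot`, on the assumption that fine-tuning and resuming are alternatives.

```python
import os


def build_train_cmd(caffe_bin, solver, weights=None, snapshot=None, gpus=None):
    """Assemble a `caffe train` command line as a list of arguments."""
    if weights and snapshot:
        # Assumption: fine-tuning (--weights) and resuming (--snapshot) are alternatives.
        raise ValueError("pass either weights or snapshot, not both")
    cmd = [caffe_bin, "train", "--solver", solver]
    if weights:
        cmd += ["--weights", weights]
    if snapshot:
        cmd += ["--snapshot", snapshot]
    if gpus:
        cmd += ["--gpu", ",".join(str(g) for g in gpus)]
    return cmd


def visible_gpu_env(device_ids, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(d) for d in device_ids)
    return env
```

The result can be passed to `subprocess.run(cmd, env=visible_gpu_env([1]))`. With `CUDA_VISIBLE_DEVICES=1`, the process sees that single GPU as device 0.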

Save the training log

nohup xxx/caffe/build/tools/caffe train --solver solver.prototxt > output.log 2>&1 &
tail -f output.log

Extract the training loss values from the log

cat output.log | grep "Train net output" | awk '{print $11}' > loss.log

Here, awk's '{print $11}' prints the 11th whitespace-separated field of each matching line.
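
The same extraction can be done in Python, which avoids depending on the field position. This is a minimal sketch: `extract_train_outputs` is a hypothetical helper, and the regular expression assumes log lines of the form `Train net output #0: loss = 0.693147 (* 1 = 0.693147 loss)`.

```python
import re

# Assumed line shape: "... Train net output #0: loss = 0.693147 (* 1 = 0.693147 loss)"
TRAIN_OUTPUT = re.compile(r"Train net output #\d+: (\S+) = (\S+)")


def extract_train_outputs(lines, name="loss"):
    """Return the float values of the named training output, in log order."""
    values = []
    for line in lines:
        match = TRAIN_OUTPUT.search(line)
        if match and match.group(1) == name:
            values.append(float(match.group(2)))
    return values
```

For example, `with open("output.log") as f: losses = extract_train_outputs(f)` collects the values that the awk command writes to loss.log.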

2 Common runtime errors and what they mean

(1) error: Check failed: error == cudaSuccess (2 vs. 0) out of memory
The GPU has run out of memory. Reducing batch_size is enough to fix it.

(2) error: ImportError: No module named pydot (when running python draw_net.py train_val.prototxt xxx.png)
This error is raised when drawing a network with draw_net.py. Both pydot and graphviz need to be installed:

pip install pydot
pip install graphviz
sudo apt-get install graphviz

(3) error: Cannot copy param 0 weights from layer 'fc8'; shape mismatch.
Source param shape is 5 4096 (20480); target param shape is 1000 4096 (4096000). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
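This usually means the blob dimensions of a layer do not match between the saved weights and the network definition. In the message above, fc8 has 5 outputs in the saved net but 1000 in the target net. The prototxt needs to be changed:

Change deploy.prototxt so that it matches train_val.prototxt.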

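(4) error: HDF5Data does not transform data (when using HDF5 as Caffe input)
transform_param { scale: 0.00392156862745098 }
This means that when HDF5 is used as the input, the scale operation in transform_param is not supported. Commenting out that line fixes the error.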

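(5) error: Loading list of HDF5 filenames from: failed to open source file
Reading the HDF5 data failed. Check the following:

In source, use an absolute path to the .txt file.
Inside the .txt file, use absolute paths to the .h5 files.
In the .prototxt, the block should be hdf5_data_param {} rather than data_param {}.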
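
The first two checks can be sidestepped by generating the list file programmatically. A minimal sketch using only the Python standard library follows; `write_hdf5_list` and the file names in the usage note are hypothetical.

```python
import os


def write_hdf5_list(h5_paths, list_path):
    """Write one absolute .h5 path per line; return the list file's absolute path."""
    list_path = os.path.abspath(list_path)
    with open(list_path, "w") as f:
        for path in h5_paths:
            f.write(os.path.abspath(path) + "\n")
    return list_path
```

Calling `write_hdf5_list(["train_0.h5", "train_1.h5"], "train_list.txt")` returns the absolute path to put in the source field of hdf5_data_param.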

(6) error: Top blob 'data' produced by multiple sources.
Check whether there is an extra data input layer, for example 'data' defined twice.

(7) Error: Check failed: shape[i] >= 0 (-1 vs. 0)

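The dimension order of the data is wrong. Blobs are ordered as [number of images N x channels C x image height H x image width W].
The kernel size does not fit the size of the feature map.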
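
The second cause can be checked by hand. The sketch below assumes the usual convolution output-size formula, (input + 2 * pad - kernel) / stride + 1 with integer division and no dilation; `conv_output_size` is a hypothetical helper, not a Caffe function.

```python
def conv_output_size(input_size, kernel_size, pad=0, stride=1):
    """Spatial output size of a convolution along one axis (no dilation)."""
    numerator = input_size + 2 * pad - kernel_size
    # Truncate toward zero, like C++ integer division.
    return int(numerator / stride) + 1
```

A 5-wide kernel on a 3-wide feature map without padding gives `conv_output_size(3, 5)`, which is -1, the same value reported in the "(-1 vs. 0)" check. A valid configuration such as `conv_output_size(227, 11, stride=4)` gives 55.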

(8) Error: Check failed: outer_num_ * inner_num_ == bottom[1]->count() (128 vs 128x51)
This error comes from the accuracy layer. Check whether the dimensions of its two bottoms match. If it really cannot be resolved, simply remove the layer.
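
To see what "match" means here, the sketch below reproduces the arithmetic of the check, assuming it multiplies the prediction dimensions before and after the class axis (axis 1 by default) and compares the result with the number of label values. `count` and `expected_label_count` are hypothetical helpers.

```python
from functools import reduce
from operator import mul


def count(shape):
    """Total number of elements in a blob of the given shape."""
    return reduce(mul, shape, 1)


def expected_label_count(pred_shape, axis=1):
    """Number of label values expected for predictions of the given shape."""
    outer = count(pred_shape[:axis])       # dimensions before the class axis
    inner = count(pred_shape[axis + 1:])   # dimensions after the class axis
    return outer * inner
```

For predictions of shape (128, 51), `expected_label_count((128, 51))` is 128, so the label blob should hold one class index per sample. A label blob of shape (128, 51), for example one-hot labels, holds 128 x 51 values and triggers the error.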
