Cannot find cublas....:
In the /etc/ld.so.conf.d directory, create a file named cuda.conf containing the line /usr/local/cuda/lib64, then run sudo /sbin/ldconfig -v.
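cannot find -lopencv_xxxx:
apt-cache search opencv
sudo apt-get install yyy
(yyy is the package name the search turns up.) Problem solved in one go? Not if you are using OpenCV 3+, where more pain awaits.
The likely cause there is that you ran make but never ran install. Run sudo gedit /etc/ld.so.conf.d/opencv.conf, add /usr/local/lib and /usr/local/lib/x86_64-linux-gnu, then run sudo /sbin/ldconfig -v.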
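To confirm that the two linker fixes above took effect, a quick lookup from Python can help. This is a minimal sketch using the standard library's ctypes.util.find_library; the library names are only examples, and find_library approximates what the linker sees rather than reproducing it exactly.

```python
from ctypes.util import find_library


def missing_libraries(names, finder=find_library):
    """Return the library names that the lookup function cannot resolve."""
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    # Example names only; substitute the ones from your own error message.
    print(missing_libraries(["cublas", "opencv_core"]))
```

An empty list means every name resolved; anything still listed probably needs its directory added under /etc/ld.so.conf.d followed by another ldconfig run.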
Check failed: fd != -1 (-1 vs. -1):
A file path is wrong. As a rule, treat [caffe]/ as the project root and give every other file as a path relative to it.
Check failed: net_->num_inputs() == 1 (0 vs. 1) Network should have exactly one input:
The wrong parameter file was passed in.
Check failed: error == cudaSuccess (2 vs. 0) out of memory:
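Reduce batch_size (the number of samples fed in per iteration), preferably to a multiple of 8.
Set test_iter (the number of batches loaded during testing) = total number of TEST samples / batch_size (of the TEST phase), rounded up.
Increase snapshot (a model is saved every xx iterations).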
Check failed: labels_.size() == output_layer->channels() (4 vs. 5) Number of labels is different from the output layer dimension:
The label counts in train_val.prototxt and deploy.prototxt do not agree.
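One way to spot this mismatch before running is to compare the counts directly. The sketch below assumes the last num_output in a prototxt belongs to the classifier layer and that the labels file holds one label per line; both are assumptions about your setup, and the regular expression is a rough text scan rather than a real prototxt parser.

```python
import re


def last_num_output(prototxt_text):
    """Return the last num_output value in the text, or None if absent."""
    matches = re.findall(r"num_output\s*:\s*(\d+)", prototxt_text)
    return int(matches[-1]) if matches else None


def count_labels(labels_text):
    """Count the non-empty lines, assuming one label per line."""
    return sum(1 for line in labels_text.splitlines() if line.strip())


if __name__ == "__main__":
    # Inline sample text; in practice read the files from disk.
    deploy = "inner_product_param { num_output: 4096 }\ninner_product_param { num_output: 5 }"
    labels = "cat\ndog\nbird\nfish\n"
    print(last_num_output(deploy), count_labels(labels))
```

If the two numbers differ, either the labels file or the final layer's num_output needs changing so that train_val.prototxt, deploy.prototxt, and the labels file all agree.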
Check failed: status == CUDNN_STATUS_SUCCESS (8 vs. 0) CUDNN_STATUS_EXECUTION_FAILED:
Refreshing the NVIDIA state (or something along those lines) and then training again fixed it. Not sure why.
Check failed: (11 vs. 0):
The CUDA architecture settings do not match the GPU's actual compute capability.
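In Caffe's Makefile.config the architecture is set through -gencode entries in CUDA_ARCH. As a small sketch, the helper below (a hypothetical name) builds one such entry from a compute capability string; "6.1" is only an example value, so look up the real figure for your card.

```python
def gencode_flag(compute_capability):
    """Turn a compute capability such as "6.1" into an nvcc -gencode flag."""
    major, minor = compute_capability.split(".")
    arch = f"{int(major)}{int(minor)}"
    return f"-gencode arch=compute_{arch},code=sm_{arch}"


if __name__ == "__main__":
    # Example value only; it is not read from the installed GPU.
    print(gencode_flag("6.1"))
```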
Check failed: mdb_status == 0 (13 vs. 0) Permission denied:
Run the command with sudo.
Check failed: error == cudaSuccess (30 vs. 0) unknown error, or Cannot create Cublas handle:
Run the command with sudo.
Check failed: error == cudaSuccess (73 vs. 0):
Rerunning a few times fixed it; the cause is unknown.
Check failed: error == cudaSuccess (74 vs. 0):
Increase max_iter, preferably keeping it a multiple of test_interval.
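Keeping max_iter a multiple of test_interval is a one-line rounding step. A minimal sketch, with a hypothetical helper name and made-up numbers:

```python
def round_up_to_multiple(max_iter, test_interval):
    """Smallest multiple of test_interval that is >= max_iter."""
    return ((max_iter + test_interval - 1) // test_interval) * test_interval


if __name__ == "__main__":
    # Illustrative values: bump 10500 up to the next multiple of 1000.
    print(round_up_to_multiple(10500, 1000))
```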
Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered:
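Check the running processes with nvidia-smi and sudo kill -9 [PID]. Then run a CUDA program once without the sudo prefix, and run it again with the sudo prefix.
Also check that the storage space and directories the prototxt requires actually exist.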
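The directory check can be automated for the snapshot location. The sketch below assumes the solver file sets snapshot_prefix as a quoted string and that paths are relative to a root you pass in. It is a rough text scan, not a real prototxt parser, and it does not check free disk space.

```python
import os
import re


def missing_parent_dirs(solver_text, root="."):
    """List parent directories of snapshot_prefix entries that do not exist."""
    prefixes = re.findall(r'snapshot_prefix\s*:\s*"([^"]+)"', solver_text)
    missing = []
    for prefix in prefixes:
        parent = os.path.dirname(os.path.join(root, prefix))
        if parent and not os.path.isdir(parent):
            missing.append(parent)
    return missing


if __name__ == "__main__":
    # Inline sample; in practice read the solver prototxt from disk.
    print(missing_parent_dirs('snapshot_prefix: "no_such_dir_12345/net"'))
```

Any directory it reports can be created with mkdir -p before training starts.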
Segmentation fault (core dumped), or malloc(): memory corruption:
Edit the source, track down the faulty line, and rewrite it with a safe method. (This is usually where the tragedy begins.)
corrupted size vs. prev_size:
??? (The tragedy reaches its climax.)
caffe/proto/caffe.pb.h: No such file or directory:
In the Qt project's .pro file, add INCLUDEPATH += [caffe]/build/src
or copy the file to [caffe]/src/caffe/proto
Error parsing text-format caffe.SolverParameter:
Look at the line number given in the error, and fix it with reference to http://www.cnblogs.com/denny402/p/5074212.html and http://www.cnblogs.com/denny402/p/5074049.html