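Caffe's build configuration is a real headache; I have lost count of how many times I tried it.
I reinstalled the system seven or eight times, which at least left me quite familiar with the common Linux commands.
I am a bit obsessive: if one step goes wrong, I have to fix it and then start over from scratch, because I worry that a small error at one point will affect something important elsewhere. (If you feel the same, quietly raise a paw ^_^)
After several more rounds and a lot of reference blogs, I put together a complete installation and configuration procedure!
Let's begin:
- Any working network connection will do; no need to fuss over it
- The default driver needs to be replaced and CUDA installed; however, if your GPU's CUDA compute capability is below 3.0, skip this part.
During driver installation you may hit an error saying that the nouveau kernel driver has not been disabled.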
sudo gedit /etc/modprobe.d/blacklist.conf
Append two lines at the end:
blacklist nouveau
options nouveau modeset=0
Then run:
sudo update-initramfs -u
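Then reboot. After the reboot you will notice that the fonts look larger,
which means the default driver has been disabled. Retry the driver installation.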
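The blacklist edit can also be scripted instead of done in gedit. A minimal sketch, assuming a hypothetical helper `ensure_line` (my own name, not part of any tool) that appends a line only when it is missing. It is shown on a scratch file; the real /etc/modprobe.d/blacklist.conf would need sudo.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
# (Hypothetical helper, for illustration only.)
ensure_line() {
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Demo on a scratch file rather than the real blacklist file.
conf=$(mktemp)
ensure_line "$conf" "blacklist nouveau"
ensure_line "$conf" "options nouveau modeset=0"
ensure_line "$conf" "blacklist nouveau"   # repeated call changes nothing
cat "$conf"
```

Because the helper checks before appending, running it again after a failed install does not pile up duplicate entries.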
sudo chmod +x NVIDIA-Linux-x86_64-367.44.run
sudo ./NVIDIA-Linux-x86_64-367.44.run
sudo dpkg -i cuda-repo-ubuntu1604-8-0-rc_8.0.27-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda
sudo dpkg -i cuda-misc-headers-8-0_8.0.27.1-1_amd64.deb
1. Set the environment variables:
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
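2. Edit the profile file:
sudo gedit /etc/profile
3. Append at the end of the file:
export PATH=/usr/local/cuda/bin:$PATH
4. Create the linker configuration file:
sudo gedit /etc/ld.so.conf.d/cuda.conf
5. Add the following to the opened file:
/usr/local/cuda/lib64
6. Finally run
sudo ldconfig
7. Run the sample test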
cd /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery
sudo make
sudo ./deviceQuery
If it prints information about the GPU, the installation succeeded.
8. Alternatively, the nvidia-smi command directly prints the list of CUDA-capable GPU devices.
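A note on the `${PATH:+:${PATH}}` form used in step 1: it adds the colon and the old value only when the variable is already set and non-empty, so an empty variable does not end up with a stray trailing colon. A small sketch of the idiom, using a hypothetical variable DEMO_PATH so the real PATH is left alone:

```shell
#!/bin/sh
# ${VAR:+word} expands to word only if VAR is set and non-empty.
unset DEMO_PATH
DEMO_PATH=/usr/local/cuda-8.0/bin${DEMO_PATH:+:${DEMO_PATH}}
first=$DEMO_PATH            # variable was empty, so no colon is added

DEMO_PATH=/opt/other/bin${DEMO_PATH:+:${DEMO_PATH}}
second=$DEMO_PATH           # old value is kept after a colon

printf '%s\n%s\n' "$first" "$second"
```

The plain `export PATH=/usr/local/cuda/bin:$PATH` from step 3 behaves the same way whenever PATH is already non-empty, which is the usual case.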
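- This part sets up cuDNN acceleration. Keep in mind the compute capability issue mentioned above; it comes up again later!
cd cuda
sudo cp ./include/cudnn.h /usr/local/cuda/include/ # copy the header file
sudo cp ./lib64/lib* /usr/local/cuda/lib64/ # copy the shared libraries
cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.5 # remove the existing symlinks
sudo ln -s libcudnn.so.5.0.5 libcudnn.so.5 # create the symlink
sudo ln -s libcudnn.so.5 libcudnn.so # create the symlink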
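To see what the symlink chain looks like without touching /usr/local/cuda, it can be rebuilt in a scratch directory. This is only a sketch: the library file is an empty stand-in, and it assumes GNU coreutils (`ln -n`, `readlink -f`). With `ln -sfn` an existing link is replaced directly, so the separate rm step is not needed in this variant.

```shell
#!/bin/sh
# Recreate the cuDNN symlink chain in a scratch directory (not /usr/local/cuda).
demo=$(mktemp -d)
: > "$demo/libcudnn.so.5.0.5"          # empty stand-in for the real library

# -s symbolic, -f replace an existing link, -n do not follow an existing link
ln -sfn libcudnn.so.5.0.5 "$demo/libcudnn.so.5"
ln -sfn libcudnn.so.5     "$demo/libcudnn.so"

# Both names should resolve to the versioned file.
resolved=$(basename "$(readlink -f "$demo/libcudnn.so")")
echo "libcudnn.so -> $resolved"
```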
# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1
Check the CUDA compute capability:
sudo /usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery
2.1
In Caffe's Makefile.config, find and modify:
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
    -gencode arch=compute_20,code=sm_21 \
    -gencode arch=compute_21,code=sm_21 \
    -gencode arch=compute_30,code=sm_30 \
    -gencode arch=compute_35,code=sm_35 \
    -gencode arch=compute_50,code=sm_50 \
    -gencode arch=compute_50,code=compute_50
I am not sure whether this makes any difference, but reportedly cuDNN acceleration requires a CUDA compute capability of 3.0 or higher!
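The 3.0 threshold can be turned into a quick check before deciding on USE_CUDNN. A minimal sketch, assuming a hypothetical helper `cudnn_supported` that takes the capability string by hand (for example the 2.1 reported by deviceQuery above); it does not query the GPU itself.

```shell
#!/bin/sh
# cudnn_supported CAP: succeed if CAP (e.g. "2.1") is at least 3.0.
# Hypothetical helper; CAP would be copied from the deviceQuery output.
cudnn_supported() {
    major=${1%%.*}          # text before the first dot, e.g. "2" from "2.1"
    [ "$major" -ge 3 ]
}

for cap in 2.1 3.0 5.2; do
    if cudnn_supported "$cap"; then
        echo "$cap: keep USE_CUDNN := 1"
    else
        echo "$cap: comment out USE_CUDNN := 1"
    fi
done
```

Only the major number matters here, since the threshold is exactly 3.0.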
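- Pay attention to the OpenCV version here: 2.4.13 is the safest choice; other versions gave me errors!
Make sure numpy is installed before compiling, otherwise cv2.so will not be generated at the end
sudo apt-get install python-numpy python3-numpy
You may hit the error: /usr/include/string.h:652:42: error: ‘memcpy’ was not declared in this scope
The cause is that the g++ version is too new; add the following within the first few lines of CMakeLists.txt
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_FORCE_INLINES")
Then rerun the cmake command from step 2.
Linux distributions usually split a library's header files and its pkg-config data into a separate xxx-dev(el) package.
Taking Python as an example, you need python-dev in the following cases:
You need to install a Python library from outside the repositories yourself, and that library contains C/C++ files that call the Python API and have to be compiled.
A program you wrote yourself needs to link against libpythonXX.(a|so) when it is built.
(Note: the above does not cover calling libpython.so directly through ctypes/ffi or a bare dlsym.)
Otherwise, for normal use of Python or for Python libraries installed from the repositories, python-dev is not needed.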
cython>=0.19.2
numpy>=1.7.1
scipy>=0.13.2
scikit-image>=0.9.3
matplotlib>=1.3.1
ipython>=3.0.0
h5py>=2.2.0
leveldb>=0.191
networkx>=1.8.1
nose>=1.3.0
pandas>=0.12.0
python-dateutil>=1.4,<2
protobuf>=2.5.0
python-gflags>=2.0
pyyaml>=3.10
Pillow>=2.3.0
six>=1.1.0
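A minimum-version constraint such as numpy>=1.7.1 can be checked in the shell by comparing version strings. A sketch, assuming GNU sort (for `-V`) and a hypothetical helper `version_ge`; the installed versions below are made-up inputs passed in by hand, not read from pip.

```shell
#!/bin/sh
# version_ge HAVE WANT: succeed if HAVE >= WANT under version ordering.
version_ge() {
    # sort -V orders version strings; if WANT sorts first (or ties), HAVE >= WANT.
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

version_ge 1.11.0 1.7.1 && echo "numpy 1.11.0 satisfies >=1.7.1"
version_ge 0.18.0 0.19.2 || echo "cython 0.18.0 is too old for >=0.19.2"
```

Plain string comparison would get 1.11.0 versus 1.7.1 wrong, which is why version ordering is used.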
- The Matlab engine is a very important step here
PASS: protobuf-test
PASS: protobuf-lazy-descriptor-test
PASS: protobuf-lite-test
PASS: google/protobuf/compiler/zip_output_unittest.sh
PASS: google/protobuf/io/gzip_stream_unittest.sh
=================================
Testsuite summary for Protocol Buffers 2.5.0
=================================
# TOTAL: 5
# PASS: 5
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
=================================

- The Makefile settings here are very important
CPU_ONLY := 1
USE_OPENCV := 0
USE_LEVELDB := 0
USE_LMDB := 0
USE_OPENCV := 1
USE_LEVELDB := 1
USE_LMDB := 1
CUSTOM_CXX := g++
WITH_PYTHON_LAYER := 1
# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial/
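Since Makefile.config is edited by hand, the same variable can end up assigned twice with different values; in GNU Make the later assignment replaces the earlier one. A sketch that lists such variables, assuming a hypothetical helper `conflicting_keys` and a small made-up file rather than the real Makefile.config:

```shell
#!/bin/sh
# conflicting_keys FILE: print each variable assigned with := more than once
# with different values (comment lines are ignored).
conflicting_keys() {
    awk '
        /^[ \t]*#/ { next }                      # skip comment lines
        /:=/ {
            split($0, kv, ":=")
            key = kv[1]; val = kv[2]
            gsub(/[ \t]/, "", key)               # strip blanks around the name
            gsub(/^[ \t]+|[ \t]+$/, "", val)     # trim the value
            if (key in seen && seen[key] != val) bad[key] = 1
            seen[key] = val
        }
        END { for (k in bad) print k }
    ' "$1" | sort
}

cfg=$(mktemp)
printf '%s\n' 'USE_OPENCV := 0' 'USE_LMDB := 1' 'USE_OPENCV := 1' > "$cfg"
conflicting_keys "$cfg"
```

Running it against your own Makefile.config shows which switches are set more than once, so you can confirm the last value is the one you intend.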
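You may hit the error: Check failed: status == CUDNN_STATUS_SUCCESS (6 vs. 0)
This means the GPU is not capable enough: cuDNN only supports GPUs with CUDA Capability 3.0 or higher, so cuDNN acceleration cannot be used and USE_CUDNN := 1 has to be commented out in Makefile.config
Be sure to check the compute capability of your own GPU hardware!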
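Commenting out the switch can be done with one sed call. A minimal sketch, assuming GNU sed (for `-i`) and a hypothetical helper `disable_cudnn`; it is demonstrated on a scratch file, so point it at your own Makefile.config to apply it.

```shell
#!/bin/sh
# disable_cudnn FILE: comment out an active "USE_CUDNN := 1" line in place.
disable_cudnn() {
    # & re-inserts the matched text, so the line becomes "# USE_CUDNN := 1".
    sed -i 's/^[[:space:]]*USE_CUDNN[[:space:]]*:=[[:space:]]*1/# &/' "$1"
}

cfg=$(mktemp)
printf '%s\n' 'USE_CUDNN := 1' 'CPU_ONLY := 1' > "$cfg"
disable_cudnn "$cfg"
cat "$cfg"
```

A line that is already commented does not match the pattern, so running it twice does no harm. Caffe needs to be rebuilt afterwards for the change to take effect.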
[----------] Global test environment tear-down
[==========] 996 tests from 141 test cases ran. (45874 ms total)
[ PASSED ] 996 tests.
- The pycaffe interface is very important; make sure it is configured and tested properly! (Build Caffe first, then build the pycaffe interface.)
LD -o .build_release/lib/libcaffe.so.1.0.0-rc3
CXX/LD -o python/caffe/_caffe.so python/caffe/_caffe.cpp
touch python/caffe/proto/__init__.py
PROTOC (python) src/caffe/proto/caffe.proto
Downloading...
--2016-10-07 23:44:11-- http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Resolving yann.lecun.com (yann.lecun.com)... 128.122.47.89
Connecting to yann.lecun.com (yann.lecun.com)|128.122.47.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9912422 (9.5M) [application/x-gzip]
Saving to: ‘train-images-idx3-ubyte.gz’
train-images-idx3-ubyte.gz 100%[=====================================>] 9.45M 39.5KB/s in 2m 42s
2016-10-07 23:46:53 (59.9 KB/s) - ‘train-images-idx3-ubyte.gz’ saved [9912422/9912422]
Creating lmdb...
I1007 23:47:04.655964 18706 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_train_lmdb
I1007 23:47:04.656126 18706 convert_mnist_data.cpp:88] A total of 60000 items.
I1007 23:47:04.656134 18706 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
I1007 23:47:09.992278 18706 convert_mnist_data.cpp:108] Processed 60000 files.
I1007 23:47:10.043660 18708 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_test_lmdb
I1007 23:47:10.043848 18708 convert_mnist_data.cpp:88] A total of 10000 items.
I1007 23:47:10.043862 18708 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
I1007 23:47:10.859005 18708 convert_mnist_data.cpp:108] Processed 10000 files.
Done.
USE_LEVELDB := 1
USE_LMDB := 1
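2. Modify the configuration
Edit the .prototxt configuration file in that directory
Edit ./examples/mnist/lenet_solver.prototxt
Go to the last line, solver_mode: GPU, and change GPU to CPU so that the first test runs on the CPU
3. Run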
I1007 23:53:09.915892 18795 caffe.cpp:210] Use CPU.
I1007 23:53:09.916203 18795 solver.cpp:48] Initializing solver from parameters:
test_iter: 100
test_interval: 500
base_lr: 0.01
display: 100
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
solver_mode: CPU
net: "examples/mnist/lenet_train_test.prototxt"
train_state {
level: 0
stage: ""
}
I1008 00:12:51.708220 18795 sgd_solver.cpp:106] Iteration 9800, lr = 0.00599102
I1008 00:13:02.717388 18795 solver.cpp:228] Iteration 9900, loss = 0.00611393
I1008 00:13:02.717483 18795 solver.cpp:244] Train net output #0: loss = 0.00611391 (* 1 = 0.00611391 loss)
I1008 00:13:02.717496 18795 sgd_solver.cpp:106] Iteration 9900, lr = 0.00596843
I1008 00:13:14.016697 18795 solver.cpp:454] Snapshotting to binary proto file examples/mnist/lenet_iter_10000.caffemodel
I1008 00:13:14.025446 18795 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate
I1008 00:13:14.084300 18795 solver.cpp:317] Iteration 10000, loss = 0.00241856
I1008 00:13:14.084349 18795 solver.cpp:337] Iteration 10000, Testing net (#0)
I1008 00:13:21.108484 18795 solver.cpp:404] Test net output #0: accuracy = 0.9905
I1008 00:13:21.108542 18795 solver.cpp:404] Test net output #1: loss = 0.0295916 (* 1 = 0.0295916 loss)
I1008 00:13:21.108553 18795 solver.cpp:322] Optimization Done.
I1008 00:13:21.108559 18795 caffe.cpp:254] Optimization Done.
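If the training output is saved to a file, the final test accuracy can be pulled out of it instead of scrolling through the log. A sketch, assuming a hypothetical helper `final_accuracy` and a scratch log that contains two of the lines shown above:

```shell
#!/bin/sh
# final_accuracy LOG: print the last "accuracy = X" value found in LOG.
final_accuracy() {
    sed -n 's/.*accuracy = \([0-9.]*\).*/\1/p' "$1" | tail -n 1
}

log=$(mktemp)
cat > "$log" <<'EOF'
I1008 00:13:21.108484 18795 solver.cpp:404] Test net output #0: accuracy = 0.9905
I1008 00:13:21.108542 18795 solver.cpp:404] Test net output #1: loss = 0.0295916 (* 1 = 0.0295916 loss)
EOF
acc=$(final_accuracy "$log")
echo "final test accuracy: $acc"
```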