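The GPU build of TensorFlow is without doubt a powerful tool for deep learning; other frameworks such as Caffe can of course also use the GPU to speed up training.
Note: the installation below assumes Python 2.7.
1. Install dependencies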
$ sudo apt-get install openjdk-8-jdk git python-dev python3-dev python-numpy python3-numpy python-six python3-six build-essential python-pip python3-pip python-virtualenv swig python-wheel python3-wheel libcurl3-dev libcupti-dev
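openjdk is required; without it, the later configuration step fails with an error.
2. Install CUDA and cuDNN
Both are developed by NVIDIA: CUDA is its general-purpose GPU computing platform, and cuDNN is a library of GPU-accelerated deep learning primitives built on top of it. This pairing of hardware and software is what lets NVIDIA cards far outperform Intel graphics for deep learning.
The CUDA and cuDNN versions to install depend on the graphics card you use; the matching versions can be looked up on NVIDIA's website. We use a GeForce GTX 1080 Ti, so the versions here are CUDA 8.0 (the .run installer) and cuDNN 6.0, both downloaded from NVIDIA.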
# Install CUDA
$ wget -O cuda_8.0.61_375.26_linux.run https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run
$ sudo sh cuda_8.0.61_375.26_linux.run --override --silent --toolkit # CUDA is installed under /usr/local/cuda
# Install cuDNN
$ cd /usr/local/cuda # extract cuDNN in this directory (the tarball must already be here)
$ sudo tar -xzvf cudnn-8.0-linux-x64-v6.0.tgz
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
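To confirm which cuDNN version actually ended up in /usr/local/cuda (useful later, when ./configure asks for it), the header can be inspected. The following is a minimal sketch, assuming the cuDNN release in use defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL in cudnn.h; the function name cudnn_version is only illustrative.

```python
from __future__ import print_function
import re


def cudnn_version(header_text):
    """Return (major, minor, patch) parsed from the text of cudnn.h, or None."""
    found = {}
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 6" (assumed header layout).
        match = re.search(r"^\s*#define\s+" + name + r"\s+(\d+)",
                          header_text, re.MULTILINE)
        if match is None:
            return None
        found[name] = int(match.group(1))
    return (found["CUDNN_MAJOR"], found["CUDNN_MINOR"], found["CUDNN_PATCHLEVEL"])


if __name__ == "__main__":
    path = "/usr/local/cuda/include/cudnn.h"  # target of the cp command above
    try:
        with open(path) as handle:
            print(cudnn_version(handle.read()))
    except IOError:
        print("cudnn.h not found at", path)
```

If the function returns None, the header is either missing those macros or belongs to a release that stores them elsewhere.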
Then add the following paths to your environment variables:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
That is, put the lines above into ~/.bashrc, save the file, and run source ~/.bashrc.
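A quick way to check that a new shell really picked up these settings is to inspect the environment from Python. This is a small sketch under the assumption that CUDA lives in /usr/local/cuda as above; missing_cuda_settings and REQUIRED_LIB_DIRS are hypothetical names chosen for illustration.

```python
from __future__ import print_function
import os

# Directories taken from the export lines above.
REQUIRED_LIB_DIRS = ["/usr/local/cuda/lib64", "/usr/local/cuda/extras/CUPTI/lib64"]


def missing_cuda_settings(env):
    """Return a list of problems found in env, a mapping of variable names to values."""
    problems = []
    entries = [p.rstrip("/") for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    for directory in REQUIRED_LIB_DIRS:
        if directory not in entries:
            problems.append("LD_LIBRARY_PATH lacks " + directory)
    if env.get("CUDA_HOME", "").rstrip("/") != "/usr/local/cuda":
        problems.append("CUDA_HOME is not /usr/local/cuda")
    return problems


if __name__ == "__main__":
    issues = missing_cuda_settings(os.environ)
    print("\n".join(issues) if issues else "CUDA environment variables look as expected")
```

The function takes a plain mapping rather than reading os.environ directly, so it can also be tried against a hand-written dictionary.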
3. Install TensorFlow
Since the official release of TensorFlow 1.0, installation has become very easy. Once CUDA and cuDNN are installed, the GPU build of tf takes only two steps:
$ sudo apt-get install python-pip python-dev
$ pip install tensorflow-gpu
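However, TensorFlow installed this way prints warnings at run time saying that the binary was not compiled to use certain CPU instruction sets (the SSE4.1, SSE4.2, AVX, AVX2, and FMA extensions covered in 3.4).
The default install is a prebuilt binary compiled without them, so enabling them requires building TensorFlow from source.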
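Before building from source, it can help to see which of the SSE4.1, SSE4.2, AVX, AVX2, and FMA extensions the local CPU actually offers. The following is a minimal sketch that assumes a Linux system exposing /proc/cpuinfo; the function name supported_extensions and the WANTED list are illustrative, with flag names spelled the way the Linux kernel reports them.

```python
from __future__ import print_function

# Kernel flag names for SSE4.1, SSE4.2, AVX, AVX2 and FMA.
WANTED = ["sse4_1", "sse4_2", "avx", "avx2", "fma"]


def supported_extensions(cpuinfo_text):
    """Return the entries of WANTED found in the first 'flags' line of cpuinfo_text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            return [name for name in WANTED if name in flags]
    return []


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print(supported_extensions(handle.read()))
    except IOError:
        print("/proc/cpuinfo is not readable on this system")
```

Any extension missing from the result should not be requested with a matching compiler flag at build time, since the resulting binary would not run on that CPU.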
3.1 Install bazel
$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get install bazel
3.2 clone tf
$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow
$ git checkout r1.1 # select the r1.1 release branch (tf 1.1)
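3.3 Configure tf
$ ./configure # example session below; at the CUDA and cuDNN prompts enter the versions you actually installed (this transcript shows cuDNN 5)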
Please specify the location of python. [Default is /usr/bin/python]: y
Invalid python path. y cannot be found
Please specify the location of python. [Default is /usr/bin/python]:
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n] y
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] y
Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] n
No XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python2.7/dist-packages]
Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 5
Please specify the location where cuDNN 5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 6.1
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
........
INFO: All external dependencies fetched successfully.
Configuration finished
3.4 build tf
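To compile in the SSE4.1, SSE4.2, AVX, AVX2, and FMA support mentioned earlier, pass the following options to bazel build (see the Stack Overflow question in the references for details):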
$ bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k //tensorflow/tools/pip_package:build_pip_package
3.5 Package tf
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
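3.6 Install tf
$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.2.1-cp27-cp27mu-linux_x86_64.whl # use the filename actually produced in 3.5; the version part depends on the branch you built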
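Because the wheel's filename depends on the branch that was built, it may be easier to look it up than to type it from memory. This is a small sketch assuming the package was written to /tmp/tensorflow_pkg as in 3.5 and that the file follows the usual name-version-tags.whl layout; built_wheels and wheel_version are hypothetical helper names.

```python
from __future__ import print_function
import glob
import os


def built_wheels(directory):
    """Return the TensorFlow wheel files found in directory, sorted by name."""
    return sorted(glob.glob(os.path.join(directory, "tensorflow-*.whl")))


def wheel_version(path):
    """Return the version field of a filename such as tensorflow-1.2.1-cp27-...whl."""
    return os.path.basename(path).split("-")[1]


if __name__ == "__main__":
    for wheel in built_wheels("/tmp/tensorflow_pkg"):
        print(wheel_version(wheel), wheel)
```

The printed path can then be passed to pip install unchanged.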
3.7 Test
$ python
>>> import tensorflow as tf
>>> sess = tf.Session()
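Creating a session shows that the import works, but it does not say directly whether a GPU was picked up. One possible check, sketched here under the assumption that the installed TensorFlow provides tensorflow.python.client.device_lib (as 1.x releases do), is to list the local devices and keep those of type GPU; gpu_names is an illustrative helper, not part of TensorFlow.

```python
from __future__ import print_function


def gpu_names(devices):
    """Return the names of the entries whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow is not importable in this environment")
    else:
        # list_local_devices() returns objects with name and device_type fields.
        print(gpu_names(device_lib.list_local_devices()))
```

An empty list suggests that TensorFlow is running on the CPU only, in which case the CUDA paths and the answers given to ./configure are worth rechecking.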
References
https://alliseesolutions.wordpress.com/2016/09/08/install-gpu-tensorflow-from-sources-w-ubuntu-16-04-and-cuda-8-0/
https://stackoverflow.com/questions/41293077/how-to-compile-tensorflow-with-sse4-2-and-avx-instructions
https://www.tensorflow.org/install/install_linux