本博文是對前面兩篇tensorflow的博文的一個繼續,對環境的更新。
基於tensorflow的MNIST手寫識別
安裝tensorflow,那叫一個坑啊
主要出發點:
上述兩篇博文的程序運行的環境,其實是沒有用到GPU的。本篇博文,介紹如何利用GPU。
首先通過pip重新安裝一個支持gpu的tensorflow,采用upgrade的方式進行。
[root@bogon tensorflow]# pip install --upgrade tensorflow-gpu Collecting tensorflow-gpu Downloading tensorflow_gpu-1.0.1-cp27-cp27mu-manylinux1_x86_64.whl (94.8MB) 100% |████████████████████████████████| 94.8MB 9.6kB/s Requirement already up-to-date: protobuf>=3.1.0 in /usr/lib64/python2.7/site-packages (from tensorflow-gpu) Requirement already up-to-date: six>=1.10.0 in /usr/lib/python2.7/site-packages (from tensorflow-gpu) Requirement already up-to-date: wheel in /usr/lib/python2.7/site-packages (from tensorflow-gpu) Requirement already up-to-date: mock>=2.0.0 in /usr/lib/python2.7/site-packages (from tensorflow-gpu) Requirement already up-to-date: numpy>=1.11.0 in /usr/lib64/python2.7/site-packages (from tensorflow-gpu) Requirement already up-to-date: setuptools in /usr/lib/python2.7/site-packages (from protobuf>=3.1.0->tensorflow-gpu) Requirement already up-to-date: funcsigs>=1; python_version < "3.3" in /usr/lib/python2.7/site-packages (from mock>=2.0.0->tensorflow-gpu) Requirement already up-to-date: pbr>=0.11 in /usr/lib/python2.7/site-packages (from mock>=2.0.0->tensorflow-gpu) Requirement already up-to-date: appdirs>=1.4.0 in /usr/lib/python2.7/site-packages (from setuptools->protobuf>=3.1.0->tensorflow-gpu) Requirement already up-to-date: packaging>=16.8 in /usr/lib/python2.7/site-packages (from setuptools->protobuf>=3.1.0->tensorflow-gpu) Requirement already up-to-date: pyparsing in /usr/lib/python2.7/site-packages (from packaging>=16.8->setuptools->protobuf>=3.1.0->tensorflow-gpu) Installing collected packages: tensorflow-gpu Successfully installed tensorflow-gpu-1.0.1
這個過程順利完成。
然后,將MNIST的手寫識別程序,在運行一下,驗證一下,是否啟用GPU。
[root@bogon tensorflow]# python mnist_demo1.py I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally I tensorflow/stream_executor/dso_loader.cc:126] Couldn't open CUDA library libcudnn.so.5. LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64: I tensorflow/stream_executor/cuda/cuda_dnn.cc:3517] Unable to load cuDNN DSO I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally Extracting MNIST_data/train-images-idx3-ubyte.gz Extracting MNIST_data/train-labels-idx1-ubyte.gz Extracting MNIST_data/t10k-images-idx3-ubyte.gz Extracting MNIST_data/t10k-labels-idx1-ubyte.gz W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate (GHz) 1.7335 pciBusID 0000:82:00.0 Total memory: 7.92GiB Free memory: 7.81GiB I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:82:00.0) F tensorflow/stream_executor/cuda/cuda_dnn.cc:222] Check failed: s.ok() could not find cudnnCreate in cudnn DSO; dlerror: /usr/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so: undefined symbol: cudnnCreate Aborted (core dumped)
上面紅色部分報錯了,找不到cudnn的so文件,進入到cuda的安裝路徑,查看是否有這個so。
[root@bogon lib64]# ll libcudnn libcudnn.so.5.1 libcudnn.so.5.1.5 libcudnn_static.a
的確沒有libcudnn.so.5的文件。
下面,建立一個軟連接,將libcudnn.so.5指向libcudnn.so.5.1。
[root@bogon lib64]# ln -s libcudnn.so.5.1 libcudnn.so.5 [root@bogon lib64]# ll libcudnn* lrwxrwxrwx. 1 root root 15 Mar 23 16:58 libcudnn.so.5 -> libcudnn.so.5.1 lrwxrwxrwx. 1 root root 17 Mar 20 17:12 libcudnn.so.5.1 -> libcudnn.so.5.1.5 -rwxr-xr-x. 1 root root 79337624 Mar 20 17:11 libcudnn.so.5.1.5 -rw-r--r--. 1 root root 69756172 Mar 20 17:12 libcudnn_static.a
現在,有了這個libcudnn.so.5的文件了。
再次驗證mnist的手寫識別程序。
[root@bogon tensorflow]# python mnist_demo1.py I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally Extracting MNIST_data/train-images-idx3-ubyte.gz Extracting MNIST_data/train-labels-idx1-ubyte.gz Extracting MNIST_data/t10k-images-idx3-ubyte.gz Extracting MNIST_data/t10k-labels-idx1-ubyte.gz W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate (GHz) 1.7335 pciBusID 0000:82:00.0 Total memory: 7.92GiB Free memory: 7.81GiB I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:82:00.0) 0.9092
到現在為止,我的tensorflow的運行環境,已經是基於GPU的了。
下面附上測試中的mnist_demo1.py的內容:
#!/usr/bin/env python # -*- coding: utf-8 -*- import tensorflow as tf import tensorflow.examples.tutorials.mnist.input_data as input_data mnist = input_data.read_data_sets("MNIST_data", one_hot=True) sess = tf.InteractiveSession() x = tf.placeholder("float", shape=[None, 784]) y_ = tf.placeholder("float", shape=[None, 10]) w = tf.Variable(tf.zeros([784,10])) b = tf.Variable(tf.zeros([10])) init = tf.global_variables_initializer() sess.run(init) y = tf.nn.softmax(tf.matmul(x, w) + b) cross_entropy = -tf.reduce_sum(y_*tf.log(y)) train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy) for i in range(1000): batch = mnist.train.next_batch(50) train_step.run(feed_dict={x: batch[0], y_: batch[1]}) correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) print accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels})
最后說明下,上述WARNING部分:
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
暫時沒有關注,所知道的處理辦法,就是用bazel進行源碼安裝tensorflow可以解決這個問題。由於不是太影響實驗,暫且不關注。