目前流行的深度學習庫有Caffe,Keras,Theano,本文采用谷歌開源的曾用來制作AlphaGo的深度學習系統Tensorflow。
1:安裝Tensorflow
最早TensorFlow只支持mac和Linux系統,目前也支持windows系統,但要求python3.5 (64bit)版本。TensorFlow有cpu和gpu版本,由於本文使用服務器是NVIDIA顯卡,因此安裝gpu版本,在cmd命令行鍵入
pip install --upgrade tensorflow-gpu
如果出現錯誤“Cannot remove entries from nonexistent file”,執行以下命令
“pip install --upgrade -I setuptools”,安裝成功出現以下界面
2:安裝CUDA庫
用gpu來運行Tensorflow還需要配置CUDA和cuDnn庫,
用以下link下載win10(64bit)版本CUDA安裝包,大小約為1.2G https://developer.nvidia.com/cuda-downloads
安裝cuda_8.0.61_win10.exe,完成后配置系統變量,在系統變量中的CUDA_PATH中,加上三個路徑, C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\lib\x64
3:安裝cuDnn庫
用以下link下載cuDnn庫
https://developer.nvidia.com/cudnn
下載解壓后,為了在運行tensorflow的時候也能將這個庫加載進去,我們要將解壓后的文件拷到CUDA對應的文件夾下C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0
4:測試安裝
在Pycharm新建一個py文件鍵入
import tensorflow as tf
hello = tf.constant('Hello world, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
如果能夠輸出'Hello, TensorFlow!'像下面這樣就代表配置成功了。
1: 導入必要的模塊
import sys
import cv2
import numpy as np
import tensorflow as tf
2: 定義CNN的基本組件
按照LeNet5 的定義,采用32*32 圖像輸入,CNN的基本組件包括卷積C1,采樣層S1、卷積C2,采樣層S2、全結合層1、分類層2
3:訓練CNN
將輸入圖像縮小至32*32大小,采用opencv中的resize函數
其變換參數有
CV_INTER_NN - 最近鄰插值,
CV_INTER_LINEAR - 雙線性插值 (缺省使用)
CV_INTER_AREA - 使用象素關系重采樣。當圖像縮小時候,該方法可以避免波紋出現。當圖像放大時,類似於 CV_INTER_NN 方法..
CV_INTER_CUBIC - 立方插值.
output=cv2.resize(img,(32,32),interpolation=cv2.INTER_CUBIC)
核心代碼如下:
class CNNetwork:
NUM_CLASSES = 2 #分兩類
IMAGE_SIZE = 28
IMAGE_PIXELS = IMAGE_SIZE*IMAGE_SIZE*3
def inference(images_placeholder, keep_prob):
def weight_variable(shape):
initial = tf.truncated_normal(shape, stddev=0.1)
return tf.Variable(initial)
def bias_variable(shape):
initial = tf.constant(0.1, shape=shape)
return tf.Variable(initial)
def conv2d(x, W):
return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
def max_pool_2x2(x):
return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME')
x_image = tf.reshape(images_placeholder, [-1, 28, 28, 1])
with tf.name_scope('conv1') as scope:
W_conv1 = weight_variable([5, 5, 3, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
with tf.name_scope('pool1') as scope:
h_pool1 = max_pool_2x2(h_conv1)
with tf.name_scope('conv2') as scope:
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
with tf.name_scope('pool2') as scope:
h_pool2 = max_pool_2x2(h_conv2)
with tf.name_scope('fc1') as scope:
W_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
with tf.name_scope('fc2') as scope:
W_fc2 = weight_variable([1024, NUM_CLASSES])
b_fc2 = bias_variable([NUM_CLASSES])
with tf.name_scope('softmax') as scope:
y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
return y_conv
if __name__ == '__main__':
test_image = []
filenames = []
for i in range(1, len(sys.argv)):
img = cv2.imread(sys.argv[i])
img = cv2.resize(img, (28, 28))
test_image.append(img.flatten().astype(np.float32)/255.0)
filenames.append(sys.argv[i])
test_image = np.asarray(test_image)
images_placeholder = tf.placeholder("float", shape=(None, IMAGE_PIXELS))
labels_placeholder = tf.placeholder("float", shape=(None, NUM_CLASSES))
keep_prob = tf.placeholder("float")
logits = inference(images_placeholder, keep_prob)
sess = tf.InteractiveSession()
saver = tf.train.Saver()
sess.run(tf.initialize_all_variables())
saver.restore(sess, "model.ckpt")
for i in range(len(test_image)):
pred = np.argmax(logits.eval(feed_dict={
images_placeholder: [test_image[i]],
keep_prob: 1.0 })[0])
pred2 = logits.eval(feed_dict={
images_placeholder: [test_image[i]],
keep_prob: 1.0 })[0]
print filenames[i],pred,"{0:10.8f}".format(pred2[0]),"{0:10.8f}".format(pred2[1])