Hardware environment:
RTX 2070 Super GPU

Software environment:
Ubuntu 18.04.5

TensorFlow 1.14.0

---------------------------------------------------------------------
The code being run:
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

def dense(x, size, scope):
    return tf.contrib.layers.fully_connected(x, size,
                                             activation_fn=None,
                                             scope=scope)

def dense_relu(x, size, scope):
    with tf.variable_scope(scope):
        h1 = dense(x, size, 'dense')
        return tf.nn.relu(h1, 'relu')

tf.reset_default_graph()
x = tf.placeholder('float32', (None, 784), name='x')
y = tf.placeholder('float32', (None, 10), name='y')
phase = tf.placeholder(tf.bool, name='phase')

h1 = dense_relu(x, 100, 'layer1')
h1 = tf.contrib.layers.batch_norm(h1, center=True, scale=True,
                                  is_training=phase, scope='bn_1')

h2 = dense_relu(h1, 100, 'layer2')
h2 = tf.contrib.layers.batch_norm(h2, center=True, scale=True,
                                  is_training=phase, scope='bn_2')

logits = dense(h2, 10, scope='logits')

with tf.name_scope('accuracy'):
    accuracy = tf.reduce_mean(tf.cast(
        tf.equal(tf.argmax(y, 1), tf.argmax(logits, 1)),
        'float32'))

with tf.name_scope('loss'):
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))

def train():
    # Batch norm keeps its moving-average updates in UPDATE_OPS, so the
    # train op must depend on them.
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

    sess = tf.Session()
    sess.run(tf.global_variables_initializer())

    history = []
    iterep = 500
    for i in range(iterep * 30):
        x_train, y_train = mnist.train.next_batch(100)
        # Note: feed_dict keys here are in-graph tensor names, not variables.
        sess.run(train_step,
                 feed_dict={'x:0': x_train, 'y:0': y_train, 'phase:0': 1})
        if (i + 1) % iterep == 0:
            epoch = (i + 1) / iterep
            tr = sess.run([loss, accuracy],
                          feed_dict={'x:0': mnist.train.images,
                                     'y:0': mnist.train.labels,
                                     'phase:0': 1})
            t = sess.run([loss, accuracy],
                         feed_dict={'x:0': mnist.test.images,
                                    'y:0': mnist.test.labels,
                                    'phase:0': 0})
            history += [[epoch] + tr + t]
            print(history[-1])
    return history

train()
It fails with the following error:

2020-08-09 21:03:53.837785: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-08-09 21:03:53.837987: W ./tensorflow/stream_executor/stream.h:1995] attempting to perform DNN operation using StreamExecutor without DNN support
Traceback (most recent call last):
  File "/home/devil/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/home/devil/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/devil/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: cuDNN launch failure : input shape ([100,100,1,1])
	 [[{{node bn_1/cond/FusedBatchNorm}}]]
During handling of the above exception, another exception occurred:
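Before changing anything, it can help to confirm whether TensorFlow sees the GPU at all. A minimal sketch using standard TF 1.x diagnostic calls (not part of the original script):

import tensorflow as tf
from tensorflow.python.client import device_lib

# True only if TensorFlow can actually use a CUDA device.
print(tf.test.is_gpu_available())
# Every device (CPU and GPU) that TensorFlow can see.
print([d.name for d in device_lib.list_local_devices()])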
Running the computation without the GPU, everything works normally.

The key statement:

CUDA_VISIBLE_DEVICES=-1

With this environment variable set, the program runs normally.
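A minimal sketch of applying the same setting from inside Python; it must run before TensorFlow initializes CUDA, i.e. before the first TensorFlow import:

import os

# Hide every CUDA device so TensorFlow falls back to the CPU.
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

import tensorflow as tf  # now only the CPU is visible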

If you still want to use the RTX card in this situation, add the statements below (that is, do not create the session with default settings; configure it explicitly):
When using a non-interactive session:
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

or:

gpu_options = tf.GPUOptions(allow_growth=True)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

or:

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5,
                            allow_growth=True)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
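For an interactive session, tf.InteractiveSession accepts the same config argument, so (assuming the same mechanism applies) the equivalent would be:

gpu_options = tf.GPUOptions(allow_growth=True)
sess = tf.InteractiveSession(config=tf.ConfigProto(gpu_options=gpu_options))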
In short, the session cannot be created with the default configuration; it has to be configured. Here,
per_process_gpu_memory_fraction=0.5
caps the GPU memory this program may use at 50% of the card's total memory, and allow_growth=True makes TensorFlow allocate GPU memory on demand rather than reserving it all at startup.
--------------------------------------------------------
The cause of the problem:
Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR is, in most cases, due to RTX-generation cards being incompatible with CUDA/cuDNN interfaces that predate them.
This explanation comes from:
https://blog.csdn.net/pkuyjxu/article/details/89402298
-------------
I was not very familiar with the feed_dict form used in the sess.run calls above, namely:

feed_dict={'x:0': x_train, 'y:0': y_train, 'phase:0': 1}

because the form I had always used before is:

feed_dict={x: x_train, y: y_train, phase: 1}
Curious, I changed the key 'x:0' to the bare operation name 'x':

sess.run(train_step, feed_dict={'x': x_train, 'y:0': y_train, 'phase:0': 1})

and found that it errors:

TypeError: Cannot interpret feed_dict key as Tensor: The name 'x' refers to an Operation, not a Tensor. Tensor names must be of the form "<op_name>:<output_index>".
The error message reveals that a feed_dict key may be written as the in-graph name of a tensor in the constructed graph (a tensor created by tf.placeholder), in the form "<op_name>:<output_index>", which here is "x:0".
The form I had used before instead passes the Python variable that holds the placeholder tensor, i.e. x, as the feed_dict key.
For example:

x = tf.placeholder('float32', (None, 784), name='xxx')

Here we have built a tensor (via tf.placeholder) whose in-graph name is 'xxx:0', while the Python variable that holds it is x.
In more detail: in
x = tf.placeholder('float32', (None, 784), name='xxx')
name='xxx' means that the operation this tf.placeholder call defines in the graph is named "xxx", and the 0th tensor produced by that operation is therefore named "xxx:0" in the graph. That tensor is then assigned to the Python variable x, so in feed_dict we can refer to it either through the variable x or through its in-graph name "xxx:0". Note that "xxx" is the name of the operation, not the name of the tensor.
This "<op_name>:<output_index>" notation for tensors is well worth knowing.
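A minimal standalone sketch of the three names involved (operation name, tensor name, Python variable); 'xxx' is just the illustrative name from above:

import tensorflow as tf

tf.reset_default_graph()
x = tf.placeholder('float32', (None, 2), name='xxx')
print(x.op.name)  # 'xxx'   : the operation's name
print(x.name)     # 'xxx:0' : the operation's 0th output tensor

double = 2.0 * x
with tf.Session() as sess:
    # Both keys denote the same placeholder tensor:
    print(sess.run(double, feed_dict={x: [[1.0, 2.0]]}))
    print(sess.run(double, feed_dict={'xxx:0': [[1.0, 2.0]]}))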
Note:
The keys fed to feed_dict here are all tensors created by tf.placeholder (or the Python variables holding them); these placeholders are the starting tensors on which the whole graph depends.
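As a side note, a minimal sketch showing that such a starting tensor can be recovered purely from its in-graph name, via tf.get_default_graph().get_tensor_by_name:

import tensorflow as tf

tf.reset_default_graph()
x = tf.placeholder('float32', (None, 784), name='x')

# Recover the placeholder from its name alone, no variable needed.
x_again = tf.get_default_graph().get_tensor_by_name('x:0')
assert x_again is x  # the very same Tensor object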
-----------------------------------------------------
Finally, here is the code with the two feed_dict key styles mixed together:

import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

def dense(x, size, scope):
    return tf.contrib.layers.fully_connected(x, size,
                                             activation_fn=None,
                                             scope=scope)

def dense_relu(x, size, scope):
    with tf.variable_scope(scope):
        h1 = dense(x, size, 'dense')
        return tf.nn.relu(h1, 'relu')

tf.reset_default_graph()
x = tf.placeholder('float32', (None, 784), name='x')
y = tf.placeholder('float32', (None, 10), name='y')
phase = tf.placeholder(tf.bool, name='phase')

h1 = dense_relu(x, 100, 'layer1')
h1 = tf.contrib.layers.batch_norm(h1, center=True, scale=True,
                                  is_training=phase, scope='bn_1')

h2 = dense_relu(h1, 100, 'layer2')
h2 = tf.contrib.layers.batch_norm(h2, center=True, scale=True,
                                  is_training=phase, scope='bn_2')

logits = dense(h2, 10, scope='logits')

with tf.name_scope('accuracy'):
    accuracy = tf.reduce_mean(tf.cast(
        tf.equal(tf.argmax(y, 1), tf.argmax(logits, 1)),
        'float32'))

with tf.name_scope('loss'):
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))

def train():
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

    # Configured session: cap usage at 50% of the card and grow on demand.
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5,
                                allow_growth=True)
    sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
    # sess = tf.Session()  # default config: fails on the RTX card
    sess.run(tf.global_variables_initializer())

    history = []
    iterep = 500
    for i in range(iterep * 30):
        x_train, y_train = mnist.train.next_batch(100)
        # Mixed keys: Python variables for x and phase, in-graph name for y.
        sess.run(train_step,
                 feed_dict={x: x_train, 'y:0': y_train, phase: 1})
        if (i + 1) % iterep == 0:
            epoch = (i + 1) / iterep
            tr = sess.run([loss, accuracy],
                          feed_dict={'x:0': mnist.train.images,
                                     y: mnist.train.labels,
                                     phase: 1})
            t = sess.run([loss, accuracy],
                         feed_dict={x: mnist.test.images,
                                    y: mnist.test.labels,
                                    'phase:0': 0})
            history += [[epoch] + tr + t]
            print(history[-1])
    return history

train()
