Hardware environment:
RTX 2070 Super GPU

Software environment:
Ubuntu 18.04.5

TensorFlow 1.14.0

---------------------------------------------------------------------
The code being run:
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

def dense(x, size, scope):
    return tf.contrib.layers.fully_connected(x, size,
                                             activation_fn=None,
                                             scope=scope)

def dense_relu(x, size, scope):
    with tf.variable_scope(scope):
        h1 = dense(x, size, 'dense')
        return tf.nn.relu(h1, 'relu')

tf.reset_default_graph()
x = tf.placeholder('float32', (None, 784), name='x')
y = tf.placeholder('float32', (None, 10), name='y')
phase = tf.placeholder(tf.bool, name='phase')

h1 = dense_relu(x, 100, 'layer1')
h1 = tf.contrib.layers.batch_norm(h1, center=True, scale=True,
                                  is_training=phase, scope='bn_1')

h2 = dense_relu(h1, 100, 'layer2')
h2 = tf.contrib.layers.batch_norm(h2, center=True, scale=True,
                                  is_training=phase, scope='bn_2')

logits = dense(h2, 10, scope='logits')

with tf.name_scope('accuracy'):
    accuracy = tf.reduce_mean(tf.cast(
        tf.equal(tf.argmax(y, 1), tf.argmax(logits, 1)),
        'float32'))

with tf.name_scope('loss'):
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))

def train():
    # Batch norm keeps its moving-average updates in UPDATE_OPS, so the
    # train op must depend on them.
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

    sess = tf.Session()
    sess.run(tf.global_variables_initializer())

    history = []
    iterep = 500
    for i in range(iterep * 30):
        x_train, y_train = mnist.train.next_batch(100)
        # Note: feed_dict keys here are in-graph tensor names, not variables.
        sess.run(train_step,
                 feed_dict={'x:0': x_train, 'y:0': y_train, 'phase:0': 1})
        if (i + 1) % iterep == 0:
            epoch = (i + 1) / iterep
            tr = sess.run([loss, accuracy],
                          feed_dict={'x:0': mnist.train.images,
                                     'y:0': mnist.train.labels,
                                     'phase:0': 1})
            t = sess.run([loss, accuracy],
                         feed_dict={'x:0': mnist.test.images,
                                    'y:0': mnist.test.labels,
                                    'phase:0': 0})
            history += [[epoch] + tr + t]
            print(history[-1])
    return history

train()
It fails with the following error:

2020-08-09 21:03:53.837785: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-08-09 21:03:53.837987: W ./tensorflow/stream_executor/stream.h:1995] attempting to perform DNN operation using StreamExecutor without DNN support
Traceback (most recent call last):
  File "/home/devil/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/home/devil/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/devil/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: cuDNN launch failure : input shape ([100,100,1,1])
	 [[{{node bn_1/cond/FusedBatchNorm}}]]
During handling of the above exception, another exception occurred:
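Before changing anything, it can help to confirm whether TensorFlow sees the GPU at all. A minimal sketch using standard TF 1.x diagnostic calls (not part of the original script):

import tensorflow as tf
from tensorflow.python.client import device_lib

# True only if TensorFlow can actually use a CUDA device.
print(tf.test.is_gpu_available())
# Every device (CPU and GPU) that TensorFlow can see.
print([d.name for d in device_lib.list_local_devices()])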
Running the computation without the GPU, everything works normally.

The key statement:

CUDA_VISIBLE_DEVICES=-1

With this environment variable set, the program runs normally.
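A minimal sketch of applying the same setting from inside Python; it must run before TensorFlow initializes CUDA, i.e. before the first TensorFlow import:

import os

# Hide every CUDA device so TensorFlow falls back to the CPU.
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

import tensorflow as tf  # now only the CPU is visible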

If you still want to use the RTX card in this situation, add the statements below (that is, do not create the session with default settings; configure it explicitly):
When using a non-interactive session:
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

or:

gpu_options = tf.GPUOptions(allow_growth=True)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

or:

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5,
                            allow_growth=True)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
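For an interactive session, tf.InteractiveSession accepts the same config argument, so (assuming the same mechanism applies) the equivalent would be:

gpu_options = tf.GPUOptions(allow_growth=True)
sess = tf.InteractiveSession(config=tf.ConfigProto(gpu_options=gpu_options))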
In short, the session cannot be created with the default configuration; it has to be configured. Here,
per_process_gpu_memory_fraction=0.5
caps the GPU memory this program may use at 50% of the card's total memory, and allow_growth=True makes TensorFlow allocate GPU memory on demand rather than reserving it all at startup.
--------------------------------------------------------
The cause of the problem:
Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR is, in most cases, due to RTX-generation cards being incompatible with CUDA/cuDNN interfaces that predate them.
This explanation comes from:
https://blog.csdn.net/pkuyjxu/article/details/89402298
-------------
I was not very familiar with the feed_dict form used in the sess.run calls above, namely:

feed_dict={'x:0': x_train, 'y:0': y_train, 'phase:0': 1}

because the form I had always used before is:

feed_dict={x: x_train, y: y_train, phase: 1}
Curious, I changed the key 'x:0' to the bare operation name 'x':

sess.run(train_step, feed_dict={'x': x_train, 'y:0': y_train, 'phase:0': 1})

and found that it errors:

TypeError: Cannot interpret feed_dict key as Tensor: The name 'x' refers to an Operation, not a Tensor. Tensor names must be of the form "<op_name>:<output_index>".
The error message reveals that a feed_dict key may be written as the in-graph name of a tensor in the constructed graph (a tensor created by tf.placeholder), in the form "<op_name>:<output_index>", which here is "x:0".
The form I had used before instead passes the Python variable that holds the placeholder tensor, i.e. x, as the feed_dict key.
For example:

x = tf.placeholder('float32', (None, 784), name='xxx')

Here we have built a tensor (via tf.placeholder) whose in-graph name is 'xxx:0', while the Python variable that holds it is x.
In more detail: in
x = tf.placeholder('float32', (None, 784), name='xxx')
name='xxx' means that the operation this tf.placeholder call defines in the graph is named "xxx", and the 0th tensor produced by that operation is therefore named "xxx:0" in the graph. That tensor is then assigned to the Python variable x, so in feed_dict we can refer to it either through the variable x or through its in-graph name "xxx:0". Note that "xxx" is the name of the operation, not the name of the tensor.
This "<op_name>:<output_index>" notation for tensors is well worth knowing.
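A minimal standalone sketch of the three names involved (operation name, tensor name, Python variable); 'xxx' is just the illustrative name from above:

import tensorflow as tf

tf.reset_default_graph()
x = tf.placeholder('float32', (None, 2), name='xxx')
print(x.op.name)  # 'xxx'   : the operation's name
print(x.name)     # 'xxx:0' : the operation's 0th output tensor

double = 2.0 * x
with tf.Session() as sess:
    # Both keys denote the same placeholder tensor:
    print(sess.run(double, feed_dict={x: [[1.0, 2.0]]}))
    print(sess.run(double, feed_dict={'xxx:0': [[1.0, 2.0]]}))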
Note:
The keys fed to feed_dict here are all tensors created by tf.placeholder (or the Python variables holding them); these placeholders are the starting tensors on which the whole graph depends.
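As a side note, a minimal sketch showing that such a starting tensor can be recovered purely from its in-graph name, via tf.get_default_graph().get_tensor_by_name:

import tensorflow as tf

tf.reset_default_graph()
x = tf.placeholder('float32', (None, 784), name='x')

# Recover the placeholder from its name alone, no variable needed.
x_again = tf.get_default_graph().get_tensor_by_name('x:0')
assert x_again is x  # the very same Tensor object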
-----------------------------------------------------
Finally, here is the code with the two feed_dict key styles mixed together:

import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

def dense(x, size, scope):
    return tf.contrib.layers.fully_connected(x, size,
                                             activation_fn=None,
                                             scope=scope)

def dense_relu(x, size, scope):
    with tf.variable_scope(scope):
        h1 = dense(x, size, 'dense')
        return tf.nn.relu(h1, 'relu')

tf.reset_default_graph()
x = tf.placeholder('float32', (None, 784), name='x')
y = tf.placeholder('float32', (None, 10), name='y')
phase = tf.placeholder(tf.bool, name='phase')

h1 = dense_relu(x, 100, 'layer1')
h1 = tf.contrib.layers.batch_norm(h1, center=True, scale=True,
                                  is_training=phase, scope='bn_1')

h2 = dense_relu(h1, 100, 'layer2')
h2 = tf.contrib.layers.batch_norm(h2, center=True, scale=True,
                                  is_training=phase, scope='bn_2')

logits = dense(h2, 10, scope='logits')

with tf.name_scope('accuracy'):
    accuracy = tf.reduce_mean(tf.cast(
        tf.equal(tf.argmax(y, 1), tf.argmax(logits, 1)),
        'float32'))

with tf.name_scope('loss'):
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))

def train():
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

    # Configured session: cap usage at 50% of the card and grow on demand.
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5,
                                allow_growth=True)
    sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
    # sess = tf.Session()  # default config: fails on the RTX card
    sess.run(tf.global_variables_initializer())

    history = []
    iterep = 500
    for i in range(iterep * 30):
        x_train, y_train = mnist.train.next_batch(100)
        # Mixed keys: Python variables for x and phase, in-graph name for y.
        sess.run(train_step,
                 feed_dict={x: x_train, 'y:0': y_train, phase: 1})
        if (i + 1) % iterep == 0:
            epoch = (i + 1) / iterep
            tr = sess.run([loss, accuracy],
                          feed_dict={'x:0': mnist.train.images,
                                     y: mnist.train.labels,
                                     phase: 1})
            t = sess.run([loss, accuracy],
                         feed_dict={x: mnist.test.images,
                                    y: mnist.test.labels,
                                    'phase:0': 0})
            history += [[epoch] + tr + t]
            print(history[-1])
    return history

train()
