[TensorFlow] InternalError: Failed copying input tensor

Reposted article. 2021-11-06 16:50. Tags: TensorFlow / Python

⚠ TensorFlow-GPU raises the following error during model training:

InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized.

Solution: see "TensorFlow: Dst tensor is not initialized" on Stack Overflow.

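The main cause is that batch_size is too large for the available memory (here, GPU memory, since the copy from CPU:0 to GPU:0 is what fails). Reducing batch_size appropriately lets training run normally.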
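
As a minimal sketch of "reduce batch_size until it fits", the helper below halves the batch size each time training fails with a memory-related exception. The names `train_with_backoff`, `train_fn`, `min_batch_size` and `retry_on` are hypothetical and not part of TensorFlow. With TensorFlow one would presumably pass `retry_on=(tf.errors.InternalError, tf.errors.ResourceExhaustedError)`. Whether a retry in the same process succeeds depends on the GPU memory actually having been released.

```python
def train_with_backoff(train_fn, batch_size, min_batch_size=1,
                       retry_on=(MemoryError,)):
    """Call train_fn(batch_size), halving batch_size on memory errors.

    train_fn is a user-supplied callable (hypothetical here) that builds and
    trains the model with the given batch size and returns any result.
    Returns (batch_size_that_worked, result_of_train_fn).
    """
    while True:
        try:
            return batch_size, train_fn(batch_size)
        except retry_on:
            if batch_size <= min_batch_size:
                # Nothing left to shrink: surface the original error.
                raise
            batch_size = max(min_batch_size, batch_size // 2)
```

Usage would look like `best_bs, history = train_with_backoff(my_train_fn, 256)`, where `my_train_fn` is whatever function wraps `model.fit(..., batch_size=bs)` in your own code.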

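[Note] By default, TF tries to allocate as much GPU memory as it can. Adjusting the GPU options (GPUOptions, for example gpu_options.allow_growth on a ConfigProto) switches it to allocating memory on demand. See the TensorFlow documentation and the TensorFlow source code.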
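
A rough sketch of on-demand allocation in TensorFlow 2.x: memory growth can be requested either through the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable or through `tf.config.experimental.set_memory_growth`. The helper name `enable_memory_growth` is made up for this example. Both settings have to be applied before TensorFlow initialises the GPU.

```python
import os

# Option 1: environment variable, read by TensorFlow when it starts using the GPU.
# It must be set before the GPU is initialised.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


def enable_memory_growth():
    """Option 2: enable on-demand GPU memory allocation through the TF 2.x API.

    Returns the number of GPUs that were configured.
    """
    # Imported here so the environment variable above is already in place.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Raises RuntimeError if the GPU has already been initialised.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

Call `enable_memory_growth()` once at the top of the script or notebook, before building any model or creating any tensor.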

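In addition, during long-running training in Jupyter Notebook, this error can appear because GPU memory is not released in time. Following this answer, the problem can be worked around by defining the function below: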

import gc

import tensorflow as tf
# TF1-style session helpers; in TensorFlow 2.x they live under tf.compat.v1
from tensorflow.compat.v1.keras.backend import set_session
from tensorflow.compat.v1.keras.backend import clear_session
from tensorflow.compat.v1.keras.backend import get_session

# Reset Keras Session
def reset_keras():
    global classifier  # without this, `del` below would target a local name and never free the model

    sess = get_session()
    clear_session()
    sess.close()
    sess = get_session()

    try:
        del classifier # this is from global space - change this as you need
    except NameError:
        pass  # no global `classifier` has been defined yet

    print(gc.collect()) # if it does something you should see a number as output

    # use the same config as you used to create the session
    config = tf.compat.v1.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 1
    config.gpu_options.visible_device_list = "0"
    set_session(tf.compat.v1.Session(config=config))

Whenever GPU memory needs to be cleared, simply call reset_keras. For example:

dense_layers = [0, 1, 2]
layer_sizes = [32, 64, 128]
conv_layers = [1, 2, 3]

for dense_layer in dense_layers:
    for layer_size in layer_sizes:
        for conv_layer in conv_layers:
            reset_keras()
            # training your model here
