How to do error checking in CUDA

Reposted 2018-03-23 14:15. Category: deep learning. Original article:

https://codeyarns.com/2011/03/02/how-to-do-error-checking-in-cuda/

Error checks in CUDA code help catch errors at their source. There are two sources of errors in CUDA code:

Errors from CUDA API calls. For example, a call to cudaMalloc() might fail.
Errors from CUDA kernel calls. For example, a kernel might perform an invalid memory access.

Nearly all CUDA runtime API calls return a cudaError value (type cudaError_t), so these calls are easy to check:

if ( cudaSuccess != cudaMalloc( &fooPtr, fooSize ) )
    printf( "Error!\n" );
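
The check above reports only that something failed. A minimal sketch that also reports why, using cudaGetErrorString(), might look like the following. The helper name tryAlloc is hypothetical and not part of the original article.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original article): attempts a device
// allocation and prints the reason if it fails. Returns true on success.
bool tryAlloc( void **ptr, size_t bytes )
{
    cudaError_t err = cudaMalloc( ptr, bytes );
    if ( cudaSuccess != err )
    {
        fprintf( stderr, "cudaMalloc of %zu bytes failed: %s\n",
                 bytes, cudaGetErrorString( err ) );
        return false;
    }
    return true;
}
```

A caller would pass the address of a device pointer, for example tryAlloc( &fooPtr, fooSize ) with fooPtr declared as void *, and release the memory with cudaFree() once it is no longer needed.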

CUDA kernel launches do not return a value. Because launches are asynchronous, an error that occurs while the kernel runs surfaces only at a later synchronizing call such as cudaDeviceSynchronize(). Launch errors, such as an invalid launch configuration, can be checked right after the launch by calling cudaGetLastError():

fooKernel<<< x, y >>>(); // Kernel call
if ( cudaSuccess != cudaGetLastError() )
    printf( "Error!\n" );
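
A kernel can fail either when it is launched or while it runs, and the two kinds of failure are reported at different times. The sketch below checks both in one place. The helper launchAndCheck and the kernel body are hypothetical, chosen only for illustration.

```cuda
#include <cuda_runtime.h>

// Illustrative kernel: each thread writes its own index.
__global__ void fooKernel( int *out )
{
    out[ threadIdx.x ] = threadIdx.x;
}

// Hypothetical helper: launches fooKernel with one block of the given size
// and returns the first error seen, or cudaSuccess.
cudaError_t launchAndCheck( int *devOut, int threadsPerBlock )
{
    fooKernel<<< 1, threadsPerBlock >>>( devOut );

    // Launch-time errors (for example an invalid configuration) are
    // available immediately after the launch.
    cudaError_t err = cudaGetLastError();
    if ( cudaSuccess != err )
        return err;

    // Errors raised while the kernel runs are reported only once the host
    // waits for the kernel to finish.
    return cudaDeviceSynchronize();
}
```

On GPUs that limit a block to 1024 threads, a call such as launchAndCheck( devOut, 4096 ) would be expected to fail at the first check without running the kernel, while an out-of-bounds write inside the kernel would be expected to show up at the second.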

These two types of checks can be wrapped in two simple error-checking functions like this:

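#include <cstdio>           // fprintf
#include <cstdlib>          // exit
#include <cuda_runtime.h>   // cudaError, cudaGetLastError, cudaGetErrorString

// Define this to turn on error checking
#define CUDA_ERROR_CHECK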

#define CudaSafeCall( err ) __cudaSafeCall( err, __FILE__, __LINE__ )
#define CudaCheckError()    __cudaCheckError( __FILE__, __LINE__ )

inline void __cudaSafeCall( cudaError err, const char *file, const int line )
{
#ifdef CUDA_ERROR_CHECK
    if ( cudaSuccess != err )
    {
        fprintf( stderr, "cudaSafeCall() failed at %s:%i : %s\n",
                 file, line, cudaGetErrorString( err ) );
        exit( -1 );
    }
#endif

    return;
}

inline void __cudaCheckError( const char *file, const int line )
{
#ifdef CUDA_ERROR_CHECK
    cudaError err = cudaGetLastError();
    if ( cudaSuccess != err )
    {
        fprintf( stderr, "cudaCheckError() failed at %s:%i : %s\n",
                 file, line, cudaGetErrorString( err ) );
        exit( -1 );
    }

    // More careful checking: wait for the kernel to finish so that errors
    // raised during execution are caught here. This affects performance.
    // Comment out if needed.
    err = cudaDeviceSynchronize();
    if( cudaSuccess != err )
    {
        fprintf( stderr, "cudaCheckError() with sync failed at %s:%i : %s\n",
                 file, line, cudaGetErrorString( err ) );
        exit( -1 );
    }
#endif

    return;
}

Using these error-checking functions is easy:

CudaSafeCall( cudaMalloc( &fooPtr, fooSize ) );
 
fooKernel<<< x, y >>>(); // Kernel call
CudaCheckError();
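
The snippet above leaves fooKernel, x, y, fooPtr and fooSize undefined. A self-contained sketch of the same pattern, from allocation through verification, might look like this. The CHECK macro is a hypothetical compact stand-in for the wrappers so that the block compiles by itself, and runFoo, the kernel body and the block size of 256 are likewise assumptions made for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical compact stand-in for the wrappers, so this sketch compiles alone.
#define CHECK( call )                                                    \
    do {                                                                 \
        cudaError_t err_ = ( call );                                     \
        if ( cudaSuccess != err_ )                                       \
        {                                                                \
            fprintf( stderr, "CUDA error at %s:%i : %s\n",               \
                     __FILE__, __LINE__, cudaGetErrorString( err_ ) );   \
            exit( -1 );                                                  \
        }                                                                \
    } while ( 0 )

// Illustrative kernel: element i receives the value 2 * i.
__global__ void fooKernel( int *out, int n )
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if ( i < n )                        // Skip threads past the end
        out[ i ] = 2 * i;
}

// Returns the number of wrong elements, or -1 if no CUDA device is usable.
int runFoo( int n )
{
    int deviceCount = 0;
    if ( cudaSuccess != cudaGetDeviceCount( &deviceCount ) || 0 == deviceCount )
        return -1;

    int *fooPtr = NULL;
    size_t fooSize = n * sizeof( int );
    CHECK( cudaMalloc( &fooPtr, fooSize ) );

    int y = 256;                        // Threads per block
    int x = ( n + y - 1 ) / y;          // Blocks needed to cover n elements
    fooKernel<<< x, y >>>( fooPtr, n ); // Kernel call
    CHECK( cudaGetLastError() );        // Launch errors
    CHECK( cudaDeviceSynchronize() );   // Errors raised during execution

    int *hostOut = (int *) malloc( fooSize );
    if ( NULL == hostOut )
        exit( -1 );
    CHECK( cudaMemcpy( hostOut, fooPtr, fooSize, cudaMemcpyDeviceToHost ) );

    int wrong = 0;
    for ( int i = 0; i < n; ++i )
        if ( hostOut[ i ] != 2 * i )
            ++wrong;

    free( hostOut );
    CHECK( cudaFree( fooPtr ) );
    return wrong;
}
```

Calling runFoo( 1000 ) from main() should return 0 on a working device. Checking the memcpy and the free as well as the allocation means that a failure is reported at the call where it is first observed.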

These functions are derived from similar functions that used to be available in cutil.h in old CUDA SDKs.
