Fixes for errors encountered in PyTorch

1. PyTorch runtime error: RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

Fix:

Add the following to your code:

torch.cuda.set_device(0)
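
A minimal sketch of how this call might be guarded so that the script still runs on a machine without a usable GPU. The helper name select_device is hypothetical (it is not part of PyTorch), and index 0 is assumed to be the GPU you want.

```python
import torch


def select_device(index: int = 0) -> torch.device:
    """Hypothetical helper: make GPU `index` current if usable, else fall back to CPU."""
    if torch.cuda.is_available() and index < torch.cuda.device_count():
        torch.cuda.set_device(index)
        return torch.device("cuda", index)
    return torch.device("cpu")


device = select_device(0)
print(device)
```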

2. Fixing a NaN loss when training an RNN

(1) If the cause is exploding gradients, gradient clipping can fix it:

GRAD_CLIP = 5
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), GRAD_CLIP)
optimizer.step()

(2) In testModel and evaluate, run the forward pass inside:

with torch.no_grad():

(3) Use a smaller learning rate.
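
The three points above can be put together roughly as follows. This is only a sketch under stated assumptions: TinyRNN, the random data, and the LEARNING_RATE value are invented for illustration and are not from the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

GRAD_CLIP = 5.0
LEARNING_RATE = 1e-3  # assumed value; how small is "small" depends on the model


class TinyRNN(nn.Module):
    """Toy model, invented for illustration."""

    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True)
        self.head = nn.Linear(8, 1)

    def forward(self, x):
        output, _ = self.rnn(x)
        return self.head(output[:, -1])  # predict from the last time step


model = TinyRNN()
optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE)
x, y = torch.randn(2, 5, 4), torch.randn(2, 1)  # random stand-in data

# (1) Training step with gradient clipping.
model.train()
optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
# Rescales gradients in place so their global L2 norm is at most GRAD_CLIP;
# the returned value is the norm measured before clipping.
norm_before = torch.nn.utils.clip_grad_norm_(model.parameters(), GRAD_CLIP)
optimizer.step()

# (2) Evaluation without gradient tracking.
model.eval()
with torch.no_grad():
    eval_loss = nn.functional.mse_loss(model(x), y)

print(float(norm_before), eval_loss.requires_grad)
```

Clipping has to sit between loss.backward() and optimizer.step(), because it modifies the gradients that the optimizer is about to apply.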

3. RuntimeError: Expected object of device type cuda but got device type cpu for argument

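There are three places in the code that need a cuda() conversion:

Is the model on CUDA? model = model.to(device)
Is the input data on CUDA? data = data.to(device)
Are tensors newly created inside the model on CUDA? p = torch.tensor([1]).to(device)

Regarding the first item: model = model.to(device) only affects layers that are instantiated in __init__() and assigned as attributes of the model. A layer that is instantiated and used directly inside forward is not registered with the model, so it is not moved to CUDA.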

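The following code is incorrect: the Linear layer is created inside forward, so it stays on the CPU while the input is on CUDA.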

import torch
import torch.nn as nn


data = torch.rand(1, 10).cuda()


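class TestModule(nn.Module):
    def __init__(self):
        super(TestModule, self).__init__()
        # Correct: create the layer here so that model.cuda() moves it.
        # self.linear = torch.nn.Linear(10, 2)

    def forward(self, x):
        # return self.linear(x)
        # Wrong: this layer is created on the CPU on every call and never moved to CUDA.
        return torch.nn.Linear(10, 2)(x)


model = TestModule()
model = model.cuda()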

print(model(data))
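
For comparison, here is a sketch of one way the example could be written so that all three checks from this section hold. FixedModule, the scale buffer, and the offset tensor are illustrative additions, not part of the original code; the device falls back to CPU so the snippet also runs without a GPU.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class FixedModule(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered submodule: moved by model.to(device).
        self.linear = nn.Linear(10, 2)
        # Registered buffer: a non-trainable tensor that also follows model.to(device).
        self.register_buffer("scale", torch.tensor([1.0]))

    def forward(self, x):
        # Tensors created inside forward take their device from the input.
        offset = torch.zeros(1, device=x.device)
        return self.linear(x) * self.scale + offset


model = FixedModule().to(device)
data = torch.rand(1, 10).to(device)
out = model(data)
print(out.device)
```

Taking the device from x.device inside forward avoids hard-coding cuda() calls in the model, so the same class works on either device.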

4. RuntimeError: CUDA error: an illegal memory access was encountered

One cause of this error is passing GPU data to an nn module that is still on the CPU, as in the following incorrect code:

import torch

data = torch.randn(1, 10).cuda()

layernorm = torch.nn.LayerNorm(10)  # wrong: the module stays on the CPU
# layernorm = torch.nn.LayerNorm(10).cuda()  # correct: move the module to the GPU

re_data = layernorm(data)
print(re_data)
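
A small check like the one below can catch this kind of mismatch before the forward pass. It is a sketch only: check_same_device is a hypothetical helper, not a PyTorch function, and the device falls back to CPU when no GPU is present.

```python
import torch
import torch.nn as nn


def check_same_device(module: nn.Module, data: torch.Tensor) -> bool:
    """Hypothetical helper: True if every parameter and buffer is on data's device."""
    tensors = list(module.parameters()) + list(module.buffers())
    return all(t.device == data.device for t in tensors)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
data = torch.randn(1, 10).to(device)
layernorm = torch.nn.LayerNorm(10).to(device)

assert check_same_device(layernorm, data), "module and data are on different devices"
re_data = layernorm(data)
print(re_data.shape)
```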

5. RuntimeError: CUDA error: device-side assert triggered

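The class targets do not line up with the class indices of the model's softmax output. Take a three-class problem as an example:

The targets take values 1-3, but the softmax output is indexed 0-2, which triggers the error above. Remapping the labels to 0-2 fixes it: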

import pandas as pd

df = pd.read_csv('data/reviews.csv')

def to_sentiment(score):
    score = int(score)
    if score <= 2:
        return 0
    elif score == 3:
        return 1
    else:
        return 2

df['sentiment'] = df.score.apply(to_sentiment)
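
Because the device-side assert gives little detail, it can help to validate the label range on the CPU before computing the loss. The sketch below assumes a 3-class problem; check_targets is a hypothetical helper and the tensors are made-up stand-ins.

```python
import torch


def check_targets(targets: torch.Tensor, num_classes: int) -> None:
    """Hypothetical helper: raise if any label falls outside 0 .. num_classes - 1."""
    lo, hi = int(targets.min()), int(targets.max())
    if lo < 0 or hi >= num_classes:
        raise ValueError(
            f"labels span [{lo}, {hi}] but must lie in [0, {num_classes - 1}]"
        )


logits = torch.randn(4, 3)                # stand-in for a 3-class model output
bad_targets = torch.tensor([1, 2, 3, 1])  # labels 1-3: out of range
good_targets = bad_targets - 1            # shifted to 0-2

check_targets(good_targets, num_classes=3)
loss = torch.nn.functional.cross_entropy(logits, good_targets)
print(loss.item())
```

Calling check_targets(bad_targets, num_classes=3) raises a ValueError that names the offending range.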

6. RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling xxxx

https://zhuanlan.zhihu.com/p/140954200

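Adding the following code during training fixes it (index 1 selects the second GPU, so the machine needs at least two):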

torch.cuda.set_device(1)

The same fix can also be used when the following error occurs:

RuntimeError: CUDA error: an illegal memory access was encountered

I don't know why, but it always works.
