Background

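In a multi-class classification problem (one object, one label), the usual setup is a single input whose corresponding output is a vector with one entry per class. When the input is fed through the trained network, the predicted class is the label of the largest entry in the output layer.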
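As a minimal sketch of that last step, the snippet below picks the predicted class from an output vector. The scores and the `predict_class` helper are made up for illustration and are not part of any library.

```python
def predict_class(scores):
    """Return the index of the largest entry in a list of class scores."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Hypothetical output vector for a 3-class problem (classes 0, 1, 2)
scores = [3.0, 1.0, -3.0]
predicted = predict_class(scores)
print("predicted class =", predicted)
```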

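Cross-entropy is the most widely used loss function for training multi-class neural networks. As a very simple example, take a three-class problem: for an input \(x\), the output of the network's last layer (\(y\)) is a \((3 \times 1)\) vector, and the ground truth for this \(x\) (\(y^{'}\)) is also a \((3 \times 1)\) vector.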

Cross-entropy

For example, let the three classes be class 0, 1 and 2, and let the input \(x\) belong to class 0. The ground truth (\(y^{'}\)) is then \((1,0,0)\). Let the network's predicted output (\(y\)) be \((3,1,-3)\).

\[y = (3,1,-3), \space y^{'} = (1,0,0) \]

The cross-entropy loss is defined by the following formula (in the example above, \(i\) runs from 0 to 2):

\[H (y,y^{'})= -\sum_i {y_i^{'}}\log(softmax(y_i)) \]

Softmax

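The softmax computation is shown in the accompanying figure; in formula form, each entry is \(softmax(y_i) = e^{y_i} / \sum_k e^{y_k}\). Note that the softmax input \((3,1,-3)\) is the output (\(y\)) of the network's last fc layer. After the softmax layer, \(y\) becomes \(softmax(y)=(0.88,0.12,0)\), rounded to two decimals. If each entry of \(y\) is read as the predicted score of a class, then each entry of \(softmax(y)\) is the predicted probability of that class.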

\[H (y,y^{'})= -{y_j^{'}}\log(softmax(y_j)) \]

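In the formula above, \(j\) is the index of the correct class, here \(j=0\). For the example, the classification loss for the current \(x\) is \(H(y,y^{'})=-1\times \log(0.88)\approx 0.13\) (note that the base of \(\log\) here is \(e\)).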
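The numbers in this example can be reproduced with a few lines of plain Python. The sketch below is only an illustration: the `softmax` and `cross_entropy` helpers are written here for the example and are not library functions.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, target):
    """Cross-entropy between softmax(scores) and a one-hot target (natural log)."""
    probs = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, probs))

y = [3.0, 1.0, -3.0]       # network output (scores) from the example
y_true = [1.0, 0.0, 0.0]   # one-hot ground truth: class 0

probs = softmax(y)
loss = cross_entropy(y, y_true)
print("softmax(y) =", [round(p, 2) for p in probs])
print("loss       =", round(loss, 4))
```

Because the target is one-hot, only the term for class 0 contributes to the sum.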

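Softmax is commonly used in multi-class classification: it normalizes the outputs of several neurons into the interval \((0, 1)\) so that they sum to 1, which is why the softmax output can be read as probabilities and used to pick a class.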

nn.CrossEntropyLoss() in PyTorch

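In the end, computing the cross-entropy loss needs only one term: the entry \(j\) of the softmax output that corresponds to the correct label in the ground truth, i.e. \(\log(softmax(y_j))\). (Of course, computing \(softmax(y_j)\) still requires the values of all entries of \(y\).)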

\[H (y,y^{'})= -{y_j^{'}}\log(softmax(y_j)) \]

This is because entry \(j\) corresponds to the correct class in the ground truth: \(y^{'}_i = 1\) only when \(i=j\), and \(y^{'}_i = 0\) otherwise.
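Expanding the softmax makes this single term easy to compute by hand. The short derivation below assumes only the standard softmax definition \(softmax(y_j) = e^{y_j} / \sum_i e^{y_i}\) and \(y^{'}_j = 1\):

\[H (y,y^{'}) = -\log\left(\frac{e^{y_j}}{\sum_i e^{y_i}}\right) = -y_j + \log\left(\sum_i e^{y_i}\right) \]

The first term is the negated score of the correct class, and the second is the log of the summed exponentials of all scores. The hand-written check later in this section splits the loss the same way, into `first` and `second`.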

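The listing further below that prints `my loss` and `pytorch loss` compares the result of torch.nn.CrossEntropyLoss() with the cross-entropy computed from the formula. The results show that the input to torch.nn.CrossEntropyLoss() only needs to be the output \(y\) of the network's fc layer: torch.nn.CrossEntropyLoss() converts \(y\) into \(softmax(y)\) internally and then computes the cross-entropy loss.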

So when building a classification network in PyTorch, there is no need to manually add a softmax layer after the last fc layer.
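To illustrate why the extra softmax layer should be left out, here is a small sketch with made-up logits and a made-up label. It uses only `torch.nn.CrossEntropyLoss` and `torch.nn.functional.softmax`. Passing already-softmaxed values means softmax is effectively applied twice, so the result no longer matches \(-\log(softmax(y_j))\).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[3.0, 1.0, -3.0]])  # raw fc output for one sample (illustrative values)
label = torch.tensor([0])                  # index of the correct class

criterion = nn.CrossEntropyLoss()

# Intended usage: pass the raw fc output directly
loss_raw = criterion(logits, label)

# Mistake: softmax applied by hand, then applied again inside the loss
loss_double = criterion(F.softmax(logits, dim=1), label)

print("loss on raw logits        =", loss_raw.item())
print("loss after manual softmax =", loss_double.item())
```

The two printed values differ, and only the first one is the cross-entropy of the network's prediction.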

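Note that there is an equivalent alternative when doing classification in PyTorch. Assuming the output of the fully connected layer is y, append y = torch.nn.functional.log_softmax(y, dim=1) when building the network and use torch.nn.functional.nll_loss(y, labels) during training. This gives exactly the same result as leaving out the log_softmax layer and using torch.nn.CrossEntropyLoss()(y, labels) as the loss function.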

import math

import torch
import torch.nn as nn

# Pretend this is the output of the network's last layer for a 5-class problem
output = torch.randn(1, 5, requires_grad=True)
# Pick a random class index in 0..4 as the ground-truth label
label = torch.empty(1, dtype=torch.long).random_(5)

print('Network Output is: ', output)
print('Ground Truth Label is: ', label)

# Logit (score) of the class given by the label
score = output[0, label.item()].item()
print('Score for the ground truth class = ', score)

# loss = -y_j + log(sum_i exp(y_i))
first = -score
second = 0.0
for i in range(5):
    second += math.exp(output[0, i].item())
second = math.log(second)

my_loss = first + second
print('-' * 20)
print('my loss = ', my_loss)

criterion = nn.CrossEntropyLoss()
print('pytorch loss = ', criterion(output, label))

The following additional code computes the cross-entropy loss for a network output (fc_output) and its label with both nn.functional.nll_loss() and nn.CrossEntropyLoss().

import torch
import torch.nn as nn
import torch.nn.functional as F

# raw output from the net, a 5-d vector (one score per class)
fc_output = torch.randn(1, 5, requires_grad=True)  # tensor of shape 1x5
label = torch.empty(1, dtype=torch.long).random_(5)  # tensor of shape 1, class index in 0..4

# softmax over the class dimension
softmax_output = F.softmax(fc_output, dim=1)

# log of the softmax, computed in one step
log_softmax_output = F.log_softmax(fc_output, dim=1)

print('               label = ', label)
print('         output(raw) = ', fc_output.detach())
print('     softmax(output) = ', softmax_output.detach())
print('log(softmax(output)) = ', log_softmax_output.detach())  # can use ln() to check

loss1 = F.nll_loss(log_softmax_output, label)

cross_entropy_loss = nn.CrossEntropyLoss()
loss2 = cross_entropy_loss(fc_output, label)
print()
print('loss from nn.functional.nll_loss() = ', loss1)
print('loss from    nn.CrossEntropyLoss() = ', loss2)

The second listing produces the following output:

label =  tensor([1])
output(raw) =  tensor([[-0.7707,  0.1843,  0.2211,  0.4185, -0.3662]])
softmax(output) =  tensor([[0.0903, 0.2346, 0.2434, 0.2965, 0.1353]])
log(softmax(output)) =  tensor([[-2.4050, -1.4500, -1.4131, -1.2157, -2.0005]])

loss from nn.functional.nll_loss() =  tensor(1.4500, grad_fn=<NllLossBackward>)
loss from    nn.CrossEntropyLoss() =  tensor(1.4500, grad_fn=<NllLossBackward>)
