Simple Visualization of a Convolutional Neural Network


This post walks through a simple visualization of the weights of a convolutional neural network.

In the first half of this tutorial, we define an extremely simple CNN model and manually assign a set of filter weights to serve as its convolution kernel parameters.

In the second half, we use the FashionMNIST dataset, define a 2-layer CNN model, train it to above 85% accuracy, and then visualize the trained model's convolution kernels.

1. Visualizing a Simple Convolutional Network Model

1.1 Visualizing a Convolutional Layer with Specified Filters

In the exercise below, we manually define several Sobel-like filters and assign them to an extremely simple convolutional neural network model. We then visualize the outputs (i.e., the feature maps) of the convolutional layer's four filters.
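For reference, here is what the classic Sobel operator the text alludes to looks like. The sketch below is a standalone illustration, not part of the exercise code; it applies the standard 3x3 Sobel kernels to the tutorial's sample image with cv2.filter2D:

import cv2
import numpy as np

# standard 3x3 Sobel kernels: sobel_x responds to vertical edges
# (horizontal intensity changes), sobel_y to horizontal edges
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float32)
sobel_y = sobel_x.T

gray = cv2.imread('images/udacity_sdc.png', cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255

# filter2D slides each kernel over the image, the same cross-correlation
# that a Conv2d layer computes
edges_x = cv2.filter2D(gray, -1, sobel_x)
edges_y = cv2.filter2D(gray, -1, sobel_y)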

Load the target image

import cv2
import matplotlib.pyplot as plt
%matplotlib inline

img_path = 'images/udacity_sdc.png'
# OpenCV loads images in BGR channel order
bgr_img = cv2.imread(img_path)

# convert to grayscale and normalize pixel values to [0, 1]
gray_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2GRAY)
gray_img = gray_img.astype("float32")/255

plt.imshow(gray_img, cmap='gray')
plt.show()

Manually define the filters

import numpy as np

filter_vals = np.array([[-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1]])

# create variations to get a richer set of filters
filter_1 = filter_vals
filter_2 = -filter_1
filter_3 = filter_1.T
filter_4 = -filter_3
filters = np.array([filter_1, filter_2, filter_3, filter_4])

fig = plt.figure(figsize=(10, 5))
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))
    width, height = filters[i].shape
    for x in range(width):
        for y in range(height):
            ax.annotate(str(filters[i][x][y]), xy=(y,x),
                       horizontalalignment='center',
                       verticalalignment='center', 
                       color='white' if filters[i][x][y] < 0 else 'black')

Define a simple convolutional neural network

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, weight):
        super(Net, self).__init__()
        k_height, k_width = weight.shape[2:]
        self.conv = nn.Conv2d(1, 4, kernel_size=(k_height, k_width), bias=False)
        self.conv.weight = torch.nn.Parameter(weight)
        self.pool = nn.MaxPool2d(4,4)
        
    def forward(self, x):
        conv_x = self.conv(x)
        activated_x = F.relu(conv_x)
        pooled_x = self.pool(activated_x)
        
        return conv_x, activated_x, pooled_x
    
# filters has shape (4, 4, 4)
# weight is expanded to shape (4, 1, 4, 4); the extra dimension of 1 corresponds to the single input channel
weight = torch.from_numpy(filters).unsqueeze(1).type(torch.FloatTensor)
model = Net(weight)

print('Filters shape: ', filters.shape)
print('weights shape: ', weight.shape)
print(model)
Filters shape:  (4, 4, 4)
weights shape:  torch.Size([4, 1, 4, 4])
Net(
  (conv): Conv2d(1, 4, kernel_size=(4, 4), stride=(1, 1), bias=False)
  (pool): MaxPool2d(kernel_size=4, stride=4, padding=0, dilation=1, ceil_mode=False)
)
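As a sanity check on the shapes: with a 4x4 kernel, stride 1, and no padding, the convolution maps an H x W input to (H - 4 + 1) x (W - 4 + 1). For the 213 x 320 grayscale image used below, the conv layer therefore produces four 210 x 317 feature maps, and the subsequent 4x4 max pool with stride 4 shrinks each to 52 x 79 (floor division).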

Visualize the convolution outputs

Define a function viz_layer that visualizes the output of a given convolutional layer.

def viz_layer(layer, n_filters=4):
    fig = plt.figure(figsize=(20, 20))
    
    for i in range(n_filters):
        ax = fig.add_subplot(1, n_filters, i+1, xticks=[], yticks=[])
        ax.imshow(np.squeeze(layer[0,i].data.numpy()), cmap='gray')
        ax.set_title('Output %s' % str(i+1))

# show the original image
plt.imshow(gray_img, cmap='gray')
# plot the filters (convolution kernels)
fig = plt.figure(figsize=(12, 6))
fig.subplots_adjust(left=0, right=1.5, bottom=0.8, top=1, hspace=0.05, wspace=0.05)
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))
    
# add a batch dimension and a channel dimension to gray img, and convert it to a tensor
gray_img_tensor = torch.from_numpy(gray_img).unsqueeze(0).unsqueeze(1)
print(gray_img.shape)
print(gray_img_tensor.shape)

# pass the input image through the model to get the outputs
conv_layer, activated_layer, pooled_layer = model(gray_img_tensor)

# visualize the convolution output
viz_layer(conv_layer)
(213, 320)
torch.Size([1, 1, 213, 320])

# visualize the output after the ReLU activation
viz_layer(activated_layer)

1.2 Visualizing the Pooling Layer with Specified Filters

Next, we visualize the output after the pooling layer.

# visualize the output after the pooling layer
viz_layer(pooled_layer)
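To make the pooling step concrete, here is a tiny standalone sketch (my own toy example, separate from the pipeline above) showing how a 4x4 max pool with stride 4 keeps only the largest activation in each non-overlapping 4x4 window:

import torch
import torch.nn.functional as F

# a toy (batch, channel, height, width) = (1, 1, 4, 8) activation map
x = torch.tensor([[[[1., 2., 3., 4., 0., 0., 9., 0.],
                    [5., 6., 7., 8., 0., 0., 0., 0.],
                    [0., 0., 0., 0., 0., 1., 0., 0.],
                    [0., 0., 0., 16., 2., 0., 0., 0.]]]])

# each non-overlapping 4x4 window collapses to its maximum
print(F.max_pool2d(x, kernel_size=4, stride=4))
# tensor([[[[16., 9.]]]])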

2. Visualizing a Multi-Layer Convolutional Network Model

In the exercise below, we define a somewhat more complex neural network, train it on the FashionMNIST dataset to above 85% accuracy, and then analyze the network visually.

2.1 Loading the FashionMNIST Dataset

FashionMNIST can be regarded as an upgrade of MNIST. Digit recognition on MNIST is by now a fairly easy pattern-recognition task, arguably too simple as a target dataset for deep neural networks. FashionMNIST swaps the image content for fashion items while keeping the image format unchanged, so it is used almost exactly like MNIST but provides a better test of a model's ability to learn the patterns in the data.

FashionMNIST class list:

0: T-shirt/top
1: Trouser
2: Pullover
3: Dress
4: Coat
5: Sandal
6: Shirt
7: Sneaker
8: Bag
9: Ankle boot

Load the FashionMNIST dataset

import torch
import torchvision

from torchvision.datasets import FashionMNIST
from torch.utils.data import DataLoader
from torchvision import transforms

data_transform = transforms.ToTensor()

# set download=True on the first run if the data is not yet on disk
train_data = FashionMNIST(root='./data', train=True,
                          download=False, transform=data_transform)
test_data = FashionMNIST(root='./data', train=False,
                         download=False, transform=data_transform)

# Print out some stats about the training and test data
print('Train data, number of images: ', len(train_data))
print('Test data, number of images: ', len(test_data))
Train data, number of images:  60000
Test data, number of images:  10000
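As a quick check (an optional sketch using the dataset just loaded), each sample is a 1 x 28 x 28 float tensor paired with an integer label:

# inspect a single training sample
img, label = train_data[0]
print(img.shape)                           # torch.Size([1, 28, 28])
print(img.min().item(), img.max().item())  # values normalized to [0, 1]
print(label)                               # an integer class index in 0..9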

Create the data loaders

batch_size = 20

train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=True)

# specify the image classes
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

Visualize a sample of the dataset

import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

dataiter = iter(train_loader)
images, labels = next(dataiter)
images = images.numpy()

# plot the images in the batch, along with the corresponding labels
fig = plt.figure(figsize=(25, 4))
for idx in np.arange(batch_size):
    ax = fig.add_subplot(2, batch_size//2, idx+1, xticks=[], yticks=[])
    ax.imshow(np.squeeze(images[idx]), cmap='gray')
    ax.set_title(classes[labels[idx]])

2.2 Training the Multi-Layer Convolutional Model

Define the model

Below we define a model with two convolutional layers; the dropout layer helps guard against overfitting to some extent. (A short sketch after the model definition illustrates how dropout behaves differently in training and evaluation mode.)

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        
        # two conv blocks: 28x28 -> 14x14 -> 7x7 after the two 2x2 max pools
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.pool2 = nn.MaxPool2d(2, 2)
        self.activation_l = nn.ReLU()
        
        # 32 channels x 7 x 7 spatial positions = 1568 flattened features
        self.fc = nn.Linear(32 * 7 * 7, 24)
        self.out = nn.Linear(24, 10)
        self.dropout = nn.Dropout(p=0.5)
        # note: nn.CrossEntropyLoss already applies log-softmax internally,
        # so feeding it softmax outputs is redundant and dampens gradients;
        # the model still trains, but the reported loss values are inflated
        self.activation_out = nn.Softmax(dim=1)
        
    def forward(self, x):
        x = self.activation_l(self.conv1(x))
        x = self.pool1(x)
        x = self.activation_l(self.conv2(x))
        x = self.pool2(x)
        
        # flatten to (batch, 32 * 7 * 7) for the fully connected layers
        x = x.view(x.size(0), -1)
        x = self.activation_l(self.fc(x))
        x = self.dropout(x)
        x = self.activation_out(self.out(x))
        
        return x
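The following minimal standalone sketch shows the training/evaluation distinction mentioned above: in train() mode nn.Dropout zeroes each element with probability p and rescales the survivors by 1/(1-p), while in eval() mode it is an identity, so inference stays deterministic:

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 8)

drop.train()    # training mode: roughly half the elements are zeroed,
print(drop(x))  # survivors are scaled by 1/(1-p) = 2

drop.eval()     # evaluation mode: dropout is a no-op
print(drop(x))  # a tensor of ones, unchanged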

Train the model

import torch.optim as optim

# instantiate the network before handing its parameters to the optimizer
net = Net()

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters())

def train(n_epochs):
    for epoch in range(n_epochs):
        running_loss = 0.0
        for batch_i, data in enumerate(train_loader):
            inputs, labels = data
            optimizer.zero_grad()
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            
            running_loss += loss.item()
            
            if batch_i % 1000 == 999:
                print('Epoch: {}, Batch: {}, Avg. Loss: {}'.format(epoch + 1, batch_i+1, running_loss/1000))
                running_loss = 0.0
                
    print('Finished Training')
    
n_epochs = 10

train(n_epochs)

import os

model_dir = 'saved_models/'
model_name = 'model_best.pt'

# make sure the target directory exists before saving
os.makedirs(model_dir, exist_ok=True)
torch.save(net.state_dict(), model_dir + model_name)

Load the trained model

net = Net()

net.load_state_dict(torch.load('saved_models/model_best.pt'))

print(net)
Net(
  (conv1): Conv2d(1, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (activation_l): ReLU()
  (fc): Linear(in_features=1568, out_features=24, bias=True)
  (out): Linear(in_features=24, out_features=10, bias=True)
  (dropout): Dropout(p=0.5)
  (activation_out): Softmax()
)
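A small portability note (an addition beyond the original exercise): if the checkpoint was saved on a GPU machine, loading it on a CPU-only machine needs a map_location argument:

net.load_state_dict(torch.load('saved_models/model_best.pt', map_location='cpu'))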

Evaluate the model on the test set

test_loss = torch.zeros(1)
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))

print(class_correct)
print(test_loss)
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
tensor([ 0.])
net.eval()

criterion = torch.nn.CrossEntropyLoss()

for batch_i, data in enumerate(test_loader):
    inputs, labels = data
    output = net(inputs)
    loss = criterion(output, labels)
    
    # incremental-mean update of the average test loss:
    # mean_k = mean_{k-1} + (x_k - mean_{k-1}) / k
    test_loss = test_loss + ( (torch.ones(1) / (batch_i+1)) * (loss.data - test_loss) )
    
    # the predicted class is the index of the largest output
    _, predicted = torch.max(output.data, 1)
    
    # compare predictions to the true labels
    correct = np.squeeze(predicted.eq(labels.data.view_as(predicted)))
    
    for i in range(batch_size):
        label = labels.data[i]
        class_correct[label] += correct[i].item()
        class_total[label] += 1
print('Test Loss: {:.6f}\n'.format(test_loss.numpy()[0]))

for i in range(10):
    if class_total[i] > 0:
        print('Test Accuracy of %5s: %2d%% (%2d/%2d)' % (
            classes[i], 100 * class_correct[i] / class_total[i],
            np.sum(class_correct[i]), np.sum(class_total[i])))
    else:
        print('Test Accuracy of %5s: N/A (no training examples)' % (classes[i]))

        
print('\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (
    100. * np.sum(class_correct) / np.sum(class_total),
    np.sum(class_correct), np.sum(class_total)))
Test Loss: 2.362950

Test Accuracy of T-shirt/top: 85% (850/1000)
Test Accuracy of Trouser: 96% (963/1000)
Test Accuracy of Pullover: 84% (842/1000)
Test Accuracy of Dress: 91% (911/1000)
Test Accuracy of  Coat: 85% (856/1000)
Test Accuracy of Sandal: 98% (989/1000)
Test Accuracy of Shirt: 49% (495/1000)
Test Accuracy of Sneaker: 94% (948/1000)
Test Accuracy of   Bag: 97% (978/1000)
Test Accuracy of Ankle boot: 93% (930/1000)

Test Accuracy (Overall): 87% (8762/10000)

2.3 Feature Visualization

The model is trained and reaches 87% accuracy on the test data; now let's move on to visualization.

The visualization strategy is to extract each convolutional layer's parameters from the model as standalone filters and apply them, using OpenCV's filter2D function, to an image sampled from the test set. We then observe what each filter does to the image and try to interpret what kind of filtering it performs on the original.

Extract a single image from the dataset

dataiter = iter(test_loader)
images, labels = next(dataiter)
images = images.numpy()
idx = 15
img = np.squeeze(images[idx])

import cv2
plt.imshow(img, cmap='gray')

Visualize the kernels of the first convolutional layer

weights = net.conv1.weight.data
w = weights.numpy()
print(w.shape)

fig = plt.figure(figsize=(30, 10))
columns = 4 * 2
row = 4
for i in range(0, columns * row):
    fig.add_subplot(row, columns, i+1)
    if ((i%2)==0):
        # even subplots: the kernel itself
        plt.imshow(w[int(i/2)][0], cmap='gray')
    else:
        # odd subplots: the kernel applied to the sample image via filter2D
        c = cv2.filter2D(img, -1, w[int((i-1)/2)][0])
        plt.imshow(c, cmap='gray')
plt.show()
(16, 1, 3, 3)

Visualize the kernels of the second convolutional layer

weights = net.conv2.weight.data
w = weights.numpy()
print(w.shape)

fig = plt.figure(figsize=(30, 20))
columns = 4 * 2
row = 8
for i in range(0, columns * row):
    fig.add_subplot(row, columns, i+1)
    if ((i%2)==0):
        # even subplots: channel 0 of each conv2 kernel (the kernels
        # actually have 16 input channels, so this is a partial view)
        plt.imshow(w[int(i/2)][0], cmap='gray')
    else:
        # odd subplots: that single-channel slice applied to the image
        c = cv2.filter2D(img, -1, w[int((i-1)/2)][0])
        plt.imshow(c, cmap='gray')
plt.show()
(32, 16, 3, 3)

We can see that some kernels act as edge detectors, and that different kernels are sensitive to different orientations, different textures, or, put another way, different kinds of image content.

This kind of subjective, eyeball-driven interpretation of the kernels still feels rather thin; it is perhaps best regarded as a basic visualization method for simple networks. Besides the convolution kernels, the fully connected layers can also be visualized.

As for visualizing the fully connected layers, some tutorials describe comparing the distances between the "embedding vectors" of individual samples from different classes, which likely requires reducing the FC-layer embeddings with t-SNE before plotting. If I come across related material later, I will add it to this article.
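As a rough illustration of that idea, here is a hedged sketch (my own addition, not from the original exercise) that captures the 24-dimensional fc activations with a forward hook and projects them to 2D with scikit-learn's TSNE; the choice of layer, sample count, and plotting details are all assumptions:

import torch
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

embeddings, all_labels = [], []

# forward hook: record the fc layer's output on every forward pass
hook = net.fc.register_forward_hook(
    lambda module, inp, out: embeddings.append(out.detach().numpy()))

net.eval()
with torch.no_grad():
    for batch_i, (inputs, labels) in enumerate(test_loader):
        net(inputs)
        all_labels.append(labels.numpy())
        if batch_i >= 49:  # ~1000 samples is plenty for a t-SNE plot
            break
hook.remove()

X = np.concatenate(embeddings)
y = np.concatenate(all_labels)

# reduce the 24-d embeddings to 2-d and color points by class
X_2d = TSNE(n_components=2).fit_transform(X)
plt.figure(figsize=(8, 8))
scatter = plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap='tab10', s=5)
plt.legend(*scatter.legend_elements(), title='class')
plt.show()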

Postscript

This article is based on the exercises from the Udacity Computer Vision Nanodegree; the official source code is available at:

https://github.com/udacity/CVND_Exercises/tree/master/1_5_CNN_Layers

