Dog Breed Identification with ResNet-50


Author: 如縷清風

This is the author's original article. Please do not repost it without permission: https://www.cnblogs.com/warren2123/p/15033224.html


 

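I. Introduction

This article fine-tunes a ResNet-50 residual network to identify dog breeds in an image dataset.

The Dog Breed Identification dataset contains 20,579 color images of varying sizes, covering 120 dog breeds. The training set has 10,222 images and the test set has 10,357. Sample images from the dataset are shown below.

II. Building a ResNet-50 Model with Fine-Tuning

As networks grow deeper they run into the degradation problem: accuracy saturates and then gets worse as more layers are added. Residual networks were proposed to address this. Fine-tuning an already trained model lets it be reused on a similar dataset without training from scratch. The model is built in four parts: data loading and preprocessing, building and fine-tuning ResNet-50, defining the hyperparameters and evaluation method, and parameter optimization.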

1. Data loading and preprocessing

pandas is used to preprocess the data, and a GPU is used to speed up computation when one is available.

import os, torch, torchvision
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import pandas as pd
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets, models, transforms
from PIL import Image
from sklearn.model_selection import StratifiedShuffleSplit
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

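Reading the CSV file with pandas and displaying the first five rows shows that the raw data has only two columns: id is the image file name (without the .jpg extension), and breed is the breed name. There are 120 breeds in total.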

data_root = 'data'
all_labels_df = pd.read_csv(os.path.join(data_root, 'labels.csv'))
all_labels_df.head()

Each breed name is mapped one-to-one to an integer id, and a new column, label_idx, is added to the data frame to hold the class id.

breeds = all_labels_df.breed.unique()
breed2idx = dict((breed, idx) for idx, breed in enumerate(breeds))
idx2breed = dict((idx, breed) for idx, breed in enumerate(breeds))
all_labels_df['label_idx'] = all_labels_df['breed'].map(breed2idx)
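One detail worth keeping in mind: `unique()` returns breeds in order of first appearance, so the ids above depend on the row order of labels.csv. The sketch below shows a deterministic variant that sorts the names first. It is only an illustration; `toy_breeds`, `toy_breed2idx` and `toy_idx2breed` are hypothetical names, not part of the original code.

```python
# Hypothetical toy breed column standing in for all_labels_df.breed.
toy_breeds = ["collie", "beagle", "collie", "pug", "beagle"]

# Sort the distinct names so the mapping does not depend on row order.
classes = sorted(set(toy_breeds))
toy_breed2idx = {breed: idx for idx, breed in enumerate(classes)}
toy_idx2breed = {idx: breed for breed, idx in toy_breed2idx.items()}

# Encode the column, then decode it again to check the round trip.
encoded = [toy_breed2idx[b] for b in toy_breeds]
decoded = [toy_idx2breed[i] for i in encoded]
print(encoded)
```

Either ordering works for training, as long as the same mapping is used again when predictions are decoded.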

10% of the training set is held out as a validation set, stratified by breed, which gives a training set and a validation set.

dataset_names = ['train', 'valid']
stratified_split = StratifiedShuffleSplit(n_splits=1, test_size=0.1, random_state=0)
train_split_idx, val_split_idx = next(iter(stratified_split.split(all_labels_df.id, all_labels_df.breed)))
train_df = all_labels_df.iloc[train_split_idx].reset_index()
val_df = all_labels_df.iloc[val_split_idx].reset_index()
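To see what the stratified split does, the following minimal sketch applies the same splitter to a hypothetical toy label array (`toy_labels`, three classes of 20 samples each) and counts the classes on each side. With `test_size=0.1` scikit-learn rounds the validation size up, which is why 10,222 labeled images give 1,023 validation images.

```python
from collections import Counter

import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical toy labels: 3 classes with 20 samples each.
toy_labels = np.repeat(["a", "b", "c"], 20)
toy_ids = np.arange(len(toy_labels))

splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.1, random_state=0)
train_idx, val_idx = next(splitter.split(toy_ids, toy_labels))

# Count how many samples of each class landed in each split.
train_counts = Counter(toy_labels[train_idx].tolist())
val_counts = Counter(toy_labels[val_idx].tolist())
print(dict(train_counts), dict(val_counts))
```

Each class keeps the same 90/10 proportion in both splits, so rare breeds are not accidentally left out of the validation set.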

The data frame in its current form cannot be fed directly to PyTorch, so a custom dataset is needed. The DogDataset class implements it as follows:

class DogDataset(Dataset):
    def __init__(self, labels_df, img_path, transform=None):
        self.labels_df = labels_df
        self.img_path = img_path
        self.transform = transform
    
    def __len__(self):
        return self.labels_df.shape[0]
    
    def __getitem__(self, idx):
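        # Image files are named <id>.jpg inside img_path.
        image_name = os.path.join(self.img_path, self.labels_df.id[idx]) + '.jpg'
        # Force three channels so Normalize always receives an RGB tensor.
        img = Image.open(image_name).convert('RGB')
        label = self.labels_df.label_idx[idx]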
        
        if self.transform:
            img = self.transform(img)
        
        return img, label

Next, the preprocessing is defined. The three main steps are Resize, Crop, and Normalize; data augmentation uses RandomResizedCrop, RandomHorizontalFlip, and RandomRotation. Images are loaded with PyTorch's DataLoader.

# IMG_SIZE, IMG_MEAN, IMG_STD and BATCHSIZE are defined in section 3;
# run those definitions before this block.

# Training: random crop, flip and rotation for augmentation.
train_transforms = transforms.Compose([
    transforms.Resize(IMG_SIZE),
    transforms.RandomResizedCrop(IMG_SIZE),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(30),
    transforms.ToTensor(),
    transforms.Normalize(IMG_MEAN, IMG_STD)
])

# Validation: deterministic resize and center crop, no augmentation.
val_transforms = transforms.Compose([
    transforms.Resize(IMG_SIZE),
    transforms.CenterCrop(IMG_SIZE),
    transforms.ToTensor(),
    transforms.Normalize(IMG_MEAN, IMG_STD)
])

image_transforms = {'train': train_transforms, 'valid': val_transforms}

# Both datasets read from data/train; they differ only in rows and transforms.
train_dataset = DogDataset(train_df, os.path.join(data_root, 'train'), transform=image_transforms['train'])
val_dataset = DogDataset(val_df, os.path.join(data_root, 'train'), transform=image_transforms['valid'])

image_dataset = {'train': train_dataset, 'valid': val_dataset}

image_dataloader = {x: DataLoader(image_dataset[x], batch_size=BATCHSIZE, shuffle=True, num_workers=0) for x in dataset_names}
dataset_sizes = {x: len(image_dataset[x]) for x in dataset_names}

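2. Building and fine-tuning the ResNet-50 model

A residual network is a stack of residual blocks. A layer of an ordinary network can be viewed as $y = H(x)$, whereas a residual block can be written as $H(x) = F(x) + x$, that is, $F(x) = H(x) - x$. Under an identity mapping, $y = x$ is the observed value and $H(x)$ is the predicted value, so $F(x)$ corresponds to the residual, which is where the name comes from. A diagram of the residual block is shown below.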
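The idea can be sketched in a few lines of PyTorch. Writing the learned branch of a block as F(x), the block outputs F(x) + x followed by an activation. The class below is a simplified illustration with two 3x3 convolutions; it is not the bottleneck block (1x1, 3x3, 1x1 convolutions) that torchvision's ResNet-50 actually uses, and `ToyResidualBlock` is a hypothetical name.

```python
import torch
import torch.nn as nn

class ToyResidualBlock(nn.Module):
    """Simplified residual block: output = relu(F(x) + x)."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        # F(x): the learned residual branch.
        residual = torch.relu(self.bn1(self.conv1(x)))
        residual = self.bn2(self.conv2(residual))
        # Skip connection: add the input back before the final activation.
        return torch.relu(residual + x)

torch.manual_seed(0)
block = ToyResidualBlock(8)
x = torch.randn(2, 8, 16, 16)
out = block(x)

# Zeroing the last BatchNorm scale forces F(x) = 0, so the block reduces to relu(x).
nn.init.zeros_(block.bn2.weight)
identity_out = block(x)
print(out.shape, torch.allclose(identity_out, torch.relu(x)))
```

The second call shows why deep stacks of such blocks are easier to train: when the residual branch contributes nothing, the block simply passes its input through.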

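ResNet-50 stacks residual blocks into a deep neural network of 50 layers. The model architecture is shown below.

torchvision provides a ready-made ResNet-50. The pretrained model was trained on ImageNet, which has 1000 classes, whereas this dataset has only 120 breeds, so the model has to be reconfigured. First all model parameters are frozen, then the output layer is replaced.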

model_ft = models.resnet50(pretrained=True)

for param in model_ft.parameters():
    param.requires_grad = False

num_fc_ftr = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_fc_ftr, len(breeds))
model_ft = model_ft.to(device)

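3. Defining the hyperparameters and evaluation method

The input image size and the mean and standard deviation used for normalization are defined below. The loss function is cross-entropy and the optimizer is Adam, applied only to the parameters of the new output layer.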

IMG_SIZE = 224
IMG_MEAN = [0.485, 0.456, 0.406]
IMG_STD = [0.229, 0.224, 0.225]

LR = 0.001
EPOCHES = 20
BATCHSIZE = 256

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam([{'params': model_ft.fc.parameters()}], lr=LR)
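`nn.CrossEntropyLoss` takes raw logits and computes the negative log-softmax of the target class. As a worked example, the pure-Python sketch below evaluates that formula for one sample; `cross_entropy` is a hypothetical helper written for illustration, not the PyTorch implementation.

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax of the target class, for one sample."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

num_classes = 120

# All logits equal: the model assigns probability 1/120 to every breed.
uniform_loss = cross_entropy([0.0] * num_classes, target=7)

# One logit much larger than the rest: a confident, correct prediction.
confident_logits = [0.0] * num_classes
confident_logits[7] = 20.0
confident_loss = cross_entropy(confident_logits, target=7)

print(uniform_loss, confident_loss)
```

A model that guesses uniformly over 120 classes has a loss of ln(120), about 4.79, which is a useful reference point when reading the training log.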

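4. Parameter optimization

The loop below optimizes the parameters for the configured number of epochs. Each epoch prints the training loss, the test loss (computed on the validation set), and the accuracy.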

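# image_dataloader and val_dataset come from section 1.
train_loader = image_dataloader['train']
test_loader = image_dataloader['valid']

# range(1, EPOCHES) runs EPOCHES - 1 = 19 epochs, matching the log below.
for epoch in range(1, EPOCHES):
    model_ft.train()
    for batch_idx, data in enumerate(train_loader):
        x, y = data
        x = x.to(device)
        y = y.to(device)
        optimizer.zero_grad()
        y_hat = model_ft(x)
        loss = criterion(y_hat, y)
        loss.backward()
        optimizer.step()
    # Loss of the last training batch in this epoch.
    print('Train Epoch: {}\t Loss: {:.6f}'.format(epoch, loss.item()))

    model_ft.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for i, data in enumerate(test_loader):
            x, y = data
            x = x.to(device)
            y = y.to(device)
            y_hat = model_ft(x)
            test_loss += criterion(y_hat, y).item()
            # Index of the largest logit is the predicted class.
            pred = y_hat.max(1, keepdim=True)[1]
            correct += pred.eq(y.view_as(pred)).sum().item()

    # Sum of per-batch mean losses divided by the sample count, so the printed
    # value is roughly the per-sample loss divided by the batch size.
    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(test_loss, correct, len(val_dataset), 100. * correct / len(val_dataset)))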
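The loop above adds up per-batch mean losses and divides by the number of samples, so its "Average loss" is much smaller than a true per-sample mean. A possible alternative, sketched below under that reading, weights each batch loss by its batch size. The `evaluate` helper, the toy linear model and the random tensors are all hypothetical and exist only to exercise the function.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, criterion, device):
    """Return (mean loss per sample, accuracy) over a data loader."""
    model.eval()
    total_loss, correct, seen = 0.0, 0, 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            logits = model(x)
            # criterion returns the batch mean, so weight it by the batch size.
            total_loss += criterion(logits, y).item() * y.size(0)
            correct += (logits.argmax(dim=1) == y).sum().item()
            seen += y.size(0)
    return total_loss / seen, correct / seen

# Hypothetical toy data and model, only to exercise the function.
torch.manual_seed(0)
features = torch.randn(10, 4)
targets = torch.randint(0, 3, (10,))
toy_loader = DataLoader(TensorDataset(features, targets), batch_size=4)
toy_model = nn.Linear(4, 3)

mean_loss, accuracy = evaluate(toy_model, toy_loader, nn.CrossEntropyLoss(), torch.device("cpu"))
print(mean_loss, accuracy)
```

Computed this way, the value does not change with the batch size and can be compared directly with the training loss.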

The training output is shown below:

Train Epoch: 1	 Loss: 2.652317
Test set: Average loss: 0.0080, Accuracy: 679/1023 (66%)
Train Epoch: 2	 Loss: 1.943187
Test set: Average loss: 0.0048, Accuracy: 765/1023 (75%)
Train Epoch: 3	 Loss: 1.791019
Test set: Average loss: 0.0038, Accuracy: 787/1023 (77%)
Train Epoch: 4	 Loss: 1.551752
Test set: Average loss: 0.0033, Accuracy: 815/1023 (80%)
Train Epoch: 5	 Loss: 1.534580
Test set: Average loss: 0.0030, Accuracy: 808/1023 (79%)
Train Epoch: 6	 Loss: 1.454419
Test set: Average loss: 0.0029, Accuracy: 803/1023 (78%)
Train Epoch: 7	 Loss: 1.365119
Test set: Average loss: 0.0027, Accuracy: 828/1023 (81%)
Train Epoch: 8	 Loss: 1.234558
Test set: Average loss: 0.0027, Accuracy: 812/1023 (79%)
Train Epoch: 9	 Loss: 1.290068
Test set: Average loss: 0.0026, Accuracy: 836/1023 (82%)
Train Epoch: 10	 Loss: 1.474308
Test set: Average loss: 0.0026, Accuracy: 826/1023 (81%)
Train Epoch: 11	 Loss: 1.262610
Test set: Average loss: 0.0026, Accuracy: 840/1023 (82%)
Train Epoch: 12	 Loss: 1.240969
Test set: Average loss: 0.0026, Accuracy: 815/1023 (80%)
Train Epoch: 13	 Loss: 1.098669
Test set: Average loss: 0.0025, Accuracy: 825/1023 (81%)
Train Epoch: 14	 Loss: 1.006308
Test set: Average loss: 0.0026, Accuracy: 822/1023 (80%)
Train Epoch: 15	 Loss: 1.144400
Test set: Average loss: 0.0025, Accuracy: 819/1023 (80%)
Train Epoch: 16	 Loss: 1.033793
Test set: Average loss: 0.0025, Accuracy: 829/1023 (81%)
Train Epoch: 17	 Loss: 1.105277
Test set: Average loss: 0.0024, Accuracy: 839/1023 (82%)
Train Epoch: 18	 Loss: 1.132692
Test set: Average loss: 0.0024, Accuracy: 833/1023 (81%)
Train Epoch: 19	 Loss: 1.225546
Test set: Average loss: 0.0025, Accuracy: 830/1023 (81%)

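III. Summary

By fine-tuning ResNet-50, this article classifies dog breed images with an accuracy of roughly 81% to 82%. Tuning the hyperparameters further or switching to a deeper residual network may improve on this. Breaking the predictions down by class shows 12 breeds with an accuracy of 50% or below, of which miniature_poodle, appenzeller, and siberian_husky are below 30%.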

Accuracy of            siberian_husky : 20 %
Accuracy of               appenzeller : 25 %
Accuracy of          miniature_poodle : 25 %
Accuracy of                toy_poodle : 37 %
Accuracy of              walker_hound : 42 %
Accuracy of                    collie : 44 %
Accuracy of          english_foxhound : 44 %
Accuracy of      bouvier_des_flandres : 44 %
Accuracy of                bloodhound : 50 %
Accuracy of           irish_wolfhound : 50 %
Accuracy of staffordshire_bullterrier : 50 %
Accuracy of                rottweiler : 50 %
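The code that produced this per-breed breakdown is not shown in the article. One way such figures could be computed is sketched below with a hypothetical `per_class_accuracy` helper and made-up toy predictions.

```python
from collections import defaultdict

def per_class_accuracy(preds, labels):
    """Map each class label to the fraction of its samples predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y in zip(preds, labels):
        total[y] += 1
        correct[y] += int(p == y)
    return {c: correct[c] / total[c] for c in total}

# Hypothetical predictions and labels for three classes.
toy_labels = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
toy_preds = [0, 0, 1, 2, 1, 1, 2, 0, 0, 0]

acc = per_class_accuracy(toy_preds, toy_labels)
# Classes at or below 50% accuracy, the threshold used in the list above.
weak = sorted(c for c, a in acc.items() if a <= 0.5)
print(acc, weak)
```

With 1,023 validation images spread over 120 breeds, each breed has only about eight or nine samples on average, so individual per-breed percentages are coarse estimates.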

 

