Author: 如縷清風
This is the author's original work; please do not repost without permission: https://www.cnblogs.com/warren2123/p/15033224.html
I. Introduction
This article fine-tunes a pretrained ResNet-50 residual network to identify dog breeds from images.
The Dog Breed Identification dataset contains 20,579 color images of varying sizes, covering 120 dog breeds. The training set holds 10,222 images and the test set holds 10,357 images. Sample images from the dataset are shown below.
II. Building a Fine-Tuned ResNet-50 Model
As model depth increases, plain networks begin to degrade, which is the problem residual networks were designed to solve. By fine-tuning an already trained model, we can apply it to a similar dataset without training from scratch. The model construction in this article is divided into four parts: data loading and preprocessing; building and fine-tuning the ResNet-50 model; defining hyperparameters and the evaluation method; and parameter optimization.
1. Data loading and preprocessing
This article uses pandas for data preprocessing and a GPU to speed up computation.
```python
import os
import torch
import torchvision
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import pandas as pd
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets, models, transforms
from PIL import Image
from sklearn.model_selection import StratifiedShuffleSplit

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```
Reading the csv file with pandas and displaying the first five rows shows that the raw data has only two columns: id, the image file name, and breed, the breed name (120 breeds in total).
```python
data_root = 'data'
all_labels_df = pd.read_csv(os.path.join(data_root, 'labels.csv'))
all_labels_df.head()
```
Next, map each breed name one-to-one to an integer id, and add a new column to the DataFrame holding this breed-id label.
```python
breeds = all_labels_df.breed.unique()
breed2idx = dict((breed, idx) for idx, breed in enumerate(breeds))
idx2breed = dict((idx, breed) for idx, breed in enumerate(breeds))
all_labels_df['label_idx'] = all_labels_df['breed'].map(breed2idx)
```
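The mapping logic above can be sketched in isolation; the breed names below are a hypothetical sample, not the real contents of labels.csv:

```python
# Toy stand-in for all_labels_df.breed.unique()
breeds = ['boston_bull', 'dingo', 'pekinese']

breed2idx = {breed: idx for idx, breed in enumerate(breeds)}
idx2breed = {idx: breed for idx, breed in enumerate(breeds)}

# The two dicts are inverses, so a label survives an encode/decode round trip.
labels = ['dingo', 'pekinese', 'dingo']
encoded = [breed2idx[b] for b in labels]   # [1, 2, 1]
decoded = [idx2breed[i] for i in encoded]
assert decoded == labels
```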
Split 10% of the training set off as a validation set, giving two datasets: training and validation.
```python
dataset_names = ['train', 'valid']
stratified_split = StratifiedShuffleSplit(n_splits=1, test_size=0.1, random_state=0)
train_split_idx, val_split_idx = next(iter(stratified_split.split(all_labels_df.id, all_labels_df.breed)))
train_df = all_labels_df.iloc[train_split_idx].reset_index()
val_df = all_labels_df.iloc[val_split_idx].reset_index()
```
Since the processed DataFrame cannot be fed to PyTorch directly, a custom dataset class is needed. It is implemented as the DogDataset class below:
```python
class DogDataset(Dataset):
    def __init__(self, labels_df, img_path, transform=None):
        self.labels_df = labels_df
        self.img_path = img_path
        self.transform = transform

    def __len__(self):
        return self.labels_df.shape[0]

    def __getitem__(self, idx):
        image_name = os.path.join(self.img_path, self.labels_df.id[idx]) + '.jpg'
        img = Image.open(image_name)
        label = self.labels_df.label_idx[idx]
        if self.transform:
            img = self.transform(img)
        return img, label
```
Next, define the preprocessing pipelines, built mainly from Resize, Crop, and Normalize; data augmentation uses RandomResizedCrop, RandomHorizontalFlip, and RandomRotation. Images are loaded in batches with PyTorch's DataLoader. (The constants IMG_SIZE, IMG_MEAN, IMG_STD, and BATCHSIZE are defined in the hyperparameters section.)
```python
train_transforms = transforms.Compose([
    transforms.Resize(IMG_SIZE),
    transforms.RandomResizedCrop(IMG_SIZE),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(30),
    transforms.ToTensor(),
    transforms.Normalize(IMG_MEAN, IMG_STD)
])
val_transforms = transforms.Compose([
    transforms.Resize(IMG_SIZE),
    transforms.CenterCrop(IMG_SIZE),
    transforms.ToTensor(),
    transforms.Normalize(IMG_MEAN, IMG_STD)
])
image_transforms = {'train': train_transforms, 'valid': val_transforms}

train_dataset = DogDataset(train_df, os.path.join(data_root, 'train'), transform=image_transforms['train'])
val_dataset = DogDataset(val_df, os.path.join(data_root, 'train'), transform=image_transforms['valid'])
image_dataset = {'train': train_dataset, 'valid': val_dataset}

image_dataloader = {x: DataLoader(image_dataset[x], batch_size=BATCHSIZE, shuffle=True, num_workers=0)
                    for x in dataset_names}
dataset_sizes = {x: len(image_dataset[x]) for x in dataset_names}
```
2. Building and fine-tuning the ResNet-50 model
A residual network is built from a stack of residual blocks. An ordinary network layer can be viewed as computing y = H(x), while a residual block instead computes H(x) = F(x) + x, that is, F(x) = H(x) - x. Under the identity mapping, x is the observation and H(x) is the prediction, so F(x) corresponds to the residual, which is where the name "residual network" comes from. A diagram of the residual block is shown below.
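The residual computation can be sketched as a minimal PyTorch module. This is a simplified two-convolution block, not ResNet-50's actual three-convolution bottleneck with batch normalization:

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Minimal residual block: output is F(x) + x, so the learned
    branch F only has to model the residual H(x) - x."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = self.conv2(self.relu(self.conv1(x)))  # F(x)
        return self.relu(x + residual)                   # H(x) = F(x) + x

block = BasicResidualBlock(16)
out = block(torch.randn(2, 16, 32, 32))
print(out.shape)  # torch.Size([2, 16, 32, 32])
```

Because the shortcut is an identity addition, input and output shapes match; in the real ResNet-50, blocks that change the channel count use a 1x1 convolution on the shortcut instead.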
The ResNet-50 model stacks such residual blocks into a 50-layer deep network. The model architecture is shown below.
torchvision provides a ready-made ResNet-50. Since the pretrained model was trained on ImageNet, which has 1,000 classes, while this task has only 120 breeds, the model needs to be reconfigured: first freeze all of its parameters, then replace the output layer.
```python
model_ft = models.resnet50(pretrained=True)
for param in model_ft.parameters():
    param.requires_grad = False

num_fc_ftr = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_fc_ftr, len(breeds))
model_ft = model_ft.to(device)
```
3. Defining hyperparameters and the evaluation method
The input image size and the normalization MEAN/STD are defined below; the loss function is cross-entropy, and the optimizer is Adam, updating only the parameters of the new fc layer.
```python
IMG_SIZE = 224
IMG_MEAN = [0.485, 0.456, 0.406]
IMG_STD = [0.229, 0.224, 0.225]
LR = 0.001
EPOCHES = 20
BATCHSIZE = 256

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam([{'params': model_ft.fc.parameters()}], lr=LR)
```
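The cross-entropy loss used here can be checked by hand on a single toy sample: it is the negative log of the softmax probability assigned to the true class.

```python
import math

logits = [2.0, 1.0, 0.1]  # toy scores for a 3-class example
target = 0                # index of the true class

# CE(logits, target) = -log(softmax(logits)[target])
exps = [math.exp(v) for v in logits]
softmax = [v / sum(exps) for v in exps]
loss = -math.log(softmax[target])
print(round(loss, 3))  # 0.417
```

nn.CrossEntropyLoss combines the softmax and the negative log-likelihood in one numerically stable call, which is why the model's raw fc outputs are passed to it directly.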
4. Parameter optimization
The following loop optimizes the model's parameters for the configured number of epochs, printing the training loss, validation loss, and accuracy after each epoch.
```python
train_loader = image_dataloader['train']
val_loader = image_dataloader['valid']

for epoch in range(1, EPOCHES):
    model_ft.train()
    for batch_idx, data in enumerate(train_loader):
        x, y = data
        x = x.to(device)
        y = y.to(device)
        optimizer.zero_grad()
        y_hat = model_ft(x)
        loss = criterion(y_hat, y)
        loss.backward()
        optimizer.step()
    print('Train Epoch: {}\t Loss: {:.6f}'.format(epoch, loss.item()))

    model_ft.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for i, data in enumerate(val_loader):
            x, y = data
            x = x.to(device)
            y = y.to(device)
            y_hat = model_ft(x)
            test_loss += criterion(y_hat, y).item()
            pred = y_hat.max(1, keepdim=True)[1]
            correct += pred.eq(y.view_as(pred)).sum().item()
    test_loss /= len(val_dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(val_dataset), 100. * correct / len(val_dataset)))
```
The training output is as follows:
```
Train Epoch: 1	 Loss: 2.652317
Test set: Average loss: 0.0080, Accuracy: 679/1023 (66%)

Train Epoch: 2	 Loss: 1.943187
Test set: Average loss: 0.0048, Accuracy: 765/1023 (75%)

Train Epoch: 3	 Loss: 1.791019
Test set: Average loss: 0.0038, Accuracy: 787/1023 (77%)

Train Epoch: 4	 Loss: 1.551752
Test set: Average loss: 0.0033, Accuracy: 815/1023 (80%)

Train Epoch: 5	 Loss: 1.534580
Test set: Average loss: 0.0030, Accuracy: 808/1023 (79%)

Train Epoch: 6	 Loss: 1.454419
Test set: Average loss: 0.0029, Accuracy: 803/1023 (78%)

Train Epoch: 7	 Loss: 1.365119
Test set: Average loss: 0.0027, Accuracy: 828/1023 (81%)

Train Epoch: 8	 Loss: 1.234558
Test set: Average loss: 0.0027, Accuracy: 812/1023 (79%)

Train Epoch: 9	 Loss: 1.290068
Test set: Average loss: 0.0026, Accuracy: 836/1023 (82%)

Train Epoch: 10	 Loss: 1.474308
Test set: Average loss: 0.0026, Accuracy: 826/1023 (81%)

Train Epoch: 11	 Loss: 1.262610
Test set: Average loss: 0.0026, Accuracy: 840/1023 (82%)

Train Epoch: 12	 Loss: 1.240969
Test set: Average loss: 0.0026, Accuracy: 815/1023 (80%)

Train Epoch: 13	 Loss: 1.098669
Test set: Average loss: 0.0025, Accuracy: 825/1023 (81%)

Train Epoch: 14	 Loss: 1.006308
Test set: Average loss: 0.0026, Accuracy: 822/1023 (80%)

Train Epoch: 15	 Loss: 1.144400
Test set: Average loss: 0.0025, Accuracy: 819/1023 (80%)

Train Epoch: 16	 Loss: 1.033793
Test set: Average loss: 0.0025, Accuracy: 829/1023 (81%)

Train Epoch: 17	 Loss: 1.105277
Test set: Average loss: 0.0024, Accuracy: 839/1023 (82%)

Train Epoch: 18	 Loss: 1.132692
Test set: Average loss: 0.0024, Accuracy: 833/1023 (81%)

Train Epoch: 19	 Loss: 1.225546
Test set: Average loss: 0.0025, Accuracy: 830/1023 (81%)
```
III. Summary
By fine-tuning a ResNet-50 model, this article classifies dog breed images with an accuracy of roughly 81%-82%; tuning the hyperparameters further or switching to a deeper residual network could raise it. Breaking the predictions down by breed, 12 breeds have an accuracy of 50% or below, and among them miniature_poodle, appenzeller, and siberian_husky fall below 30%.
```
Accuracy of siberian_husky : 20 %
Accuracy of appenzeller : 25 %
Accuracy of miniature_poodle : 25 %
Accuracy of toy_poodle : 37 %
Accuracy of walker_hound : 42 %
Accuracy of collie : 44 %
Accuracy of english_foxhound : 44 %
Accuracy of bouvier_des_flandres : 44 %
Accuracy of bloodhound : 50 %
Accuracy of irish_wolfhound : 50 %
Accuracy of staffordshire_bullterrier : 50 %
Accuracy of rottweiler : 50 %
```
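The per-breed breakdown above can be produced with a simple tally over the validation predictions; sketched here on toy prediction/label lists rather than real model output:

```python
from collections import defaultdict

# Toy stand-ins for the model's predicted and true breed labels.
preds  = ['husky', 'poodle', 'husky', 'collie', 'collie']
labels = ['husky', 'husky',  'husky', 'collie', 'poodle']

correct = defaultdict(int)
total = defaultdict(int)
for p, y in zip(preds, labels):
    total[y] += 1            # samples whose true class is y
    correct[y] += int(p == y)  # how many of those were predicted correctly

for breed in sorted(total):
    print('Accuracy of %s : %d %%' % (breed, 100 * correct[breed] / total[breed]))
```

Per-class accuracy like this exposes weak classes (e.g. visually similar breeds) that the single overall accuracy number hides.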