Transfer Learning


In this tutorial, you will learn how to train a network using transfer learning. You can read more about transfer learning in the cs231n course notes.

Quoting those notes:

  In practice, very few people train an entire convolutional network from scratch (with random initialization), because it is rare to have a dataset of sufficient size. Instead, it is common to pretrain a ConvNet on a very large dataset (e.g. ImageNet, which contains 1.2 million images in 1000 categories), and then use that ConvNet either as an initialization or as a fixed feature extractor for the task of interest.

 There are two main transfer learning scenarios:

  • Finetuning the ConvNet: instead of random initialization, we initialize the network with pretrained weights; the rest of the training looks as usual. (A minimal sketch contrasting the two scenarios follows this list.)
  • ConvNet as a fixed feature extractor: we freeze the weights of the whole network except the final fully connected layer. That last layer is replaced with a new, randomly initialized one, and only this layer is trained.
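A minimal sketch contrasting the two setups (resnet18 and the 2-class head are just examples here; both scenarios are built out in full later in this tutorial):

import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Scenario 1: finetuning -- initialize from pretrained weights, train all parameters
model_ft = models.resnet18(pretrained=True)
model_ft.fc = nn.Linear(model_ft.fc.in_features, 2)   # new 2-class head
opt_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Scenario 2: fixed feature extractor -- freeze the backbone, train only the new head
model_fx = models.resnet18(pretrained=True)
for p in model_fx.parameters():
    p.requires_grad = False                            # freeze everything
model_fx.fc = nn.Linear(model_fx.fc.in_features, 2)   # new layer defaults to requires_grad=True
opt_fx = optim.SGD(model_fx.fc.parameters(), lr=0.001, momentum=0.9)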

Import the required libraries:

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
import copy

plt.ion()   # interactive mode

 

Load the data:

We will use the torchvision and torch.utils.data packages to load the data.

The problem we are solving today is training a model to classify ants and bees. We have about 120 training images each for ants and bees, and 75 validation images for each class. Usually this is far too small a dataset to generalize from when training from scratch; with transfer learning, we should be able to generalize reasonably well.

This dataset is a very small subset of ImageNet.

Note: download the data here and extract it to the current directory.
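datasets.ImageFolder expects one sub-directory per class, so after extraction the layout should look roughly like this:

hymenoptera_data/
    train/
        ants/
            xxx.jpg ...
        bees/
            xxx.jpg ...
    val/
        ants/
            xxx.jpg ...
        bees/
            xxx.jpg ...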

# Data augmentation and normalization for training; just normalization for validation
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
}

data_dir = 'hymenoptera_data'
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                              shuffle=True, num_workers=0)
               for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
class_names = image_datasets['train'].classes
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
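As a quick sanity check (a minimal sketch), you can print the split sizes and the class order that ImageFolder inferred from the directory names:

print(dataset_sizes)  # number of images in each split
print(class_names)    # ['ants', 'bees'], in alphabetical order
print(device)         # cuda:0 if a GPU is available, otherwise cpu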

 

Visualize a few images

Let's visualize a few training images so as to understand the data.

import numpy as np

def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    # Undo the normalization: output = input * std + mean
    inp = std * inp + mean
    # Clip to the valid [0, 1] range for display
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # pause a bit so that plots are updated

# Get a batch of training data
inputs, classes = next(iter(dataloaders['train']))

# Make a grid from the batch
out = torchvision.utils.make_grid(inputs)

imshow(out, title=[class_names[x] for x in classes])

 

Train the model

Now let's write a general function to train a model. Here we will implement:

  • scheduling the learning rate
  • saving the best model

In the following, the parameter scheduler is an LR scheduler object from torch.optim.lr_scheduler.
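For intuition, here is a minimal, self-contained sketch (with a throwaway parameter and optimizer, not the ones used below) of how a StepLR scheduler decays the learning rate:

import torch
import torch.optim as optim
from torch.optim import lr_scheduler

param = torch.nn.Parameter(torch.zeros(1))
opt = optim.SGD([param], lr=0.1)
sched = lr_scheduler.StepLR(opt, step_size=3, gamma=0.1)
for epoch in range(7):
    print(epoch, opt.param_groups[0]['lr'])  # 0.1 for epochs 0-2, 0.01 for 3-5, 0.001 for 6
    opt.step()
    sched.step()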

def train_model(model, criterion, optimizer, scheduler, num_epoch=25):
    since = time.time()

    # Keep a copy of the best model weights
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epoch):
        print('Epoch {}/{}'.format(epoch, num_epoch - 1))
        print('-' * 10)

        # Each epoch has a training and a validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()   # set model to training mode
            else:
                model.eval()    # set model to evaluation mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over the data
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # Zero the parameter gradients
                optimizer.zero_grad()

                # Forward pass
                # Track history only in the training phase
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # Backward pass + optimize only in the training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # Statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            # Step the LR scheduler once per epoch, after the training phase
            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} loss:{:.4f} Acc:{:.4f}'.format(phase, epoch_loss, epoch_acc))

            # Deep-copy the model if it is the best so far
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # Load the best model weights
    model.load_state_dict(best_model_wts)
    return model

 

Visualize the model predictions

A generic function to display predictions for a few images:

def visualize_model(model, num_images=6):
    was_training = model.training
    model.eval()
    images_so_far = 0
    fig = plt.figure()

    with torch.no_grad():
        for i, (inputs, labels) in enumerate(dataloaders['val']):
            inputs = inputs.to(device)
            labels = labels.to(device)

            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)

            for j in range(inputs.size()[0]):
                images_so_far += 1
                # num_images // 2 rows, 2 columns; positions start at 1
                ax = plt.subplot(num_images // 2, 2, images_so_far)
                ax.axis('off')
                ax.set_title('predicted: {}'.format(class_names[preds[j]]))
                imshow(inputs.cpu().data[j])

                if images_so_far == num_images:
                    model.train(mode=was_training)
                    return

        # Restore the model to its previous mode after evaluation
        model.train(mode=was_training)

 

 

Finetune the ConvNet

Load a pretrained model and replace the final fully connected layer:

model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 2)

model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()

# All parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

Train and evaluate

model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler, num_epoch=25)
Out:
Epoch 0/24
----------
train loss:0.5504 Acc:0.7582
val loss:0.2265 Acc:0.9085

Epoch 1/24
----------
train loss:0.5732 Acc:0.7377
val loss:0.3290 Acc:0.8693

Epoch 2/24
----------
train loss:0.4693 Acc:0.7664
val loss:0.3554 Acc:0.8824

Epoch 3/24
----------
train loss:0.5210 Acc:0.8402
val loss:0.5970 Acc:0.7843

Epoch 4/24
----------
train loss:0.4709 Acc:0.8361
val loss:0.2318 Acc:0.8758

Epoch 5/24
----------
train loss:0.4802 Acc:0.8115
val loss:0.2669 Acc:0.8693

Epoch 6/24
----------
train loss:0.5484 Acc:0.7910
val loss:0.2531 Acc:0.8889

Epoch 7/24
----------
train loss:0.3559 Acc:0.8443
val loss:0.2264 Acc:0.9085

Epoch 8/24
----------
train loss:0.3517 Acc:0.8689
val loss:0.2716 Acc:0.8954

Epoch 9/24
----------
train loss:0.3331 Acc:0.8607
val loss:0.2068 Acc:0.9150

Epoch 10/24
----------
train loss:0.3589 Acc:0.8402
val loss:0.1906 Acc:0.9216

Epoch 11/24
----------
train loss:0.2559 Acc:0.9057
val loss:0.1732 Acc:0.9346

Epoch 12/24
----------
train loss:0.3467 Acc:0.8279
val loss:0.1772 Acc:0.9346

Epoch 13/24
----------
train loss:0.3280 Acc:0.8730
val loss:0.1722 Acc:0.9346

Epoch 14/24
----------
train loss:0.2826 Acc:0.8975
val loss:0.1651 Acc:0.9346

Epoch 15/24
----------
train loss:0.2317 Acc:0.9098
val loss:0.1984 Acc:0.9346

Epoch 16/24
----------
train loss:0.2708 Acc:0.8811
val loss:0.2128 Acc:0.9150

Epoch 17/24
----------
train loss:0.3020 Acc:0.8811
val loss:0.1926 Acc:0.9281

Epoch 18/24
----------
train loss:0.3015 Acc:0.8689
val loss:0.2119 Acc:0.8954

Epoch 19/24
----------
train loss:0.3194 Acc:0.8525
val loss:0.1766 Acc:0.9281

Epoch 20/24
----------
train loss:0.2548 Acc:0.8893
val loss:0.1709 Acc:0.9346

Epoch 21/24
----------
train loss:0.2439 Acc:0.8893
val loss:0.1996 Acc:0.9216

Epoch 22/24
----------
train loss:0.2203 Acc:0.9262
val loss:0.1606 Acc:0.9281

Epoch 23/24
----------
train loss:0.2406 Acc:0.9057
val loss:0.1681 Acc:0.9412

Epoch 24/24
----------
train loss:0.1897 Acc:0.9385
val loss:0.1548 Acc:0.9281

Training complete in 6m 39s
Best val Acc: 0.941176
visualize_model(model_ft)

 

 

ConvNet as a fixed feature extractor

Now we need to freeze the whole network except the final layer. We set requires_grad = False on the parameters to freeze them, so that their gradients are not computed in backward().

You can read more about this here.

model_conv = torchvision.models.resnet18(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

# Parameters of newly constructed modules have requires_grad=True by default
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 2)

model_conv = model_conv.to(device)
criterion = nn.CrossEntropyLoss()

# Only the parameters of the final layer are being optimized
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)
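As a sanity check (a small sketch), you can list which parameters will actually receive gradients; after freezing, only the new head should remain:

trainable = [name for name, p in model_conv.named_parameters() if p.requires_grad]
print(trainable)  # expected: ['fc.weight', 'fc.bias']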

Train and evaluate

model_conv = train_model(model_conv, criterion, optimizer_conv, exp_lr_scheduler, num_epoch=25)
Out:
Epoch 0/24
----------
train loss:0.6060 Acc:0.6311
val loss:0.3218 Acc:0.8562

Epoch 1/24
----------
train loss:0.6493 Acc:0.7418
val loss:0.5251 Acc:0.7712

Epoch 2/24
----------
train loss:0.5541 Acc:0.7623
val loss:0.1878 Acc:0.9477

Epoch 3/24
----------
train loss:0.4401 Acc:0.7828
val loss:0.3141 Acc:0.8758

Epoch 4/24
----------
train loss:0.5697 Acc:0.7869
val loss:0.3362 Acc:0.8627

Epoch 5/24
----------
train loss:0.3668 Acc:0.8402
val loss:0.4343 Acc:0.8301

Epoch 6/24
----------
train loss:0.4692 Acc:0.8238
val loss:0.2586 Acc:0.9085

Epoch 7/24
----------
train loss:0.2712 Acc:0.8770
val loss:0.1950 Acc:0.9477

Epoch 8/24
----------
train loss:0.3284 Acc:0.8443
val loss:0.1944 Acc:0.9542

Epoch 9/24
----------
train loss:0.3115 Acc:0.8852
val loss:0.2000 Acc:0.9542

Epoch 10/24
----------
train loss:0.3889 Acc:0.8402
val loss:0.1896 Acc:0.9542

Epoch 11/24
----------
train loss:0.3071 Acc:0.8689
val loss:0.1981 Acc:0.9542

Epoch 12/24
----------
train loss:0.2208 Acc:0.9098
val loss:0.1956 Acc:0.9542

Epoch 13/24
----------
train loss:0.3622 Acc:0.8320
val loss:0.2058 Acc:0.9477

Epoch 14/24
----------
train loss:0.3290 Acc:0.8525
val loss:0.2212 Acc:0.9412

Epoch 15/24
----------
train loss:0.3359 Acc:0.8525
val loss:0.2120 Acc:0.9542

Epoch 16/24
----------
train loss:0.3550 Acc:0.8279
val loss:0.1864 Acc:0.9477

Epoch 17/24
----------
train loss:0.3395 Acc:0.8402
val loss:0.2104 Acc:0.9542

Epoch 18/24
----------
train loss:0.2966 Acc:0.8811
val loss:0.2044 Acc:0.9477

Epoch 19/24
----------
train loss:0.3477 Acc:0.8320
val loss:0.1918 Acc:0.9542

Epoch 20/24
----------
train loss:0.3185 Acc:0.8607
val loss:0.1891 Acc:0.9477

Epoch 21/24
----------
train loss:0.4269 Acc:0.8238
val loss:0.2043 Acc:0.9412

Epoch 22/24
----------
train loss:0.3475 Acc:0.8566
val loss:0.1913 Acc:0.9542

Epoch 23/24
----------
train loss:0.4412 Acc:0.7951
val loss:0.1972 Acc:0.9542

Epoch 24/24
----------
train loss:0.3772 Acc:0.8279
val loss:0.2329 Acc:0.9346

Training complete in 3m 2s
Best val Acc: 0.954248

 

visualize_model(model_conv)
plt.ioff()
plt.show()
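If you want to keep the trained weights for later use (not part of the original tutorial, just a common follow-up), a minimal sketch; the file name here is an arbitrary choice:

# Save only the weights (the state_dict), not the whole module
torch.save(model_conv.state_dict(), 'model_conv.pth')

# To restore: rebuild the same architecture, then load the weights
model_restored = torchvision.models.resnet18(pretrained=False)
model_restored.fc = nn.Linear(model_restored.fc.in_features, 2)
model_restored.load_state_dict(torch.load('model_conv.pth'))
model_restored.eval()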

 

