Deep Learning Classification with MXNet Transfer Learning


Implementing an image classification task with MXNet


This post uses MXNet and its Gluon front end to implement a complete image classification task, covering the following aspects:

  • Image I/O
  • Building the network
  • Training
  • Validating the algorithm
  • Producing the results

1. Training data I/O

Read in the preprocessed training data so it can be used for training.

The training data is organized with one subfolder per class; see the MXNet data I/O documentation for details. A sketch of how such a layout can be built is shown below.
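As a concrete illustration, here is a minimal sketch of arranging raw images into the one-folder-per-class layout with shutil. The file names 'raw_imgs' and 'labels.csv' are hypothetical; only the resulting structure ./train_dis/<class>/<image>.jpg is what ImageFolderDataset expects.

# Minimal sketch: copy raw images into one subfolder per class.
# 'raw_imgs/' and 'labels.csv' are illustrative placeholders.
import os
import csv
import shutil

with open('labels.csv') as f:                       # rows like: img_0001.jpg,35
    for name, cls in csv.reader(f):
        dst_dir = os.path.join('./train_dis', cls)  # one subfolder per class
        os.makedirs(dst_dir, exist_ok=True)
        shutil.copy(os.path.join('raw_imgs', name), dst_dir)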

1.1 First, import the required packages
# Import the required packages
import sys
import collections
import datetime                  # for timing
import gluonbook as gb           # utility functions (e.g. try_gpu)
import math
import numpy as np
import mxnet as mx
from mxnet import autograd, gluon, init, nd, image   # autograd, the gluon front end, the image module, etc.
from mxnet.gluon import data as gdata, loss as gloss, model_zoo, nn   # data, losses, pretrained models, layers
import os
import shutil                    # for copying files during preprocessing
import zipfile
import matplotlib.pyplot as plt  # plotting
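The only gluonbook helper used later is gb.try_gpu(). If gluonbook is not installed in your environment (an assumption about your setup, not part of the original code), a minimal stand-in can be written with plain MXNet:

# Fallback sketch for environments without gluonbook: a minimal try_gpu()
# that returns mx.gpu() when a GPU is usable and mx.cpu() otherwise.
def try_gpu():
    try:
        ctx = mx.gpu()
        _ = mx.nd.zeros((1,), ctx=ctx)   # raises if no GPU is available
    except mx.base.MXNetError:
        ctx = mx.cpu()
    return ctx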


1.2 Next, define helper functions: the average precision calculation, image augmentation, and validation
# Image augmentation and helper functions

# Compute Average Precision
def calculate_ap(labels, outputs):
    cnt = 0
    ap = 0.
    for label, output in zip(labels, outputs):
        for lb, op in zip(label.asnumpy().astype(np.int), output.asnumpy()):
            op_argsort = np.argsort(op)[::-1]   # class indices sorted by descending score
            lb_int = int(lb)                    # integer class label
            ap += 1.0 / (1 + list(op_argsort).index(lb_int))   # reciprocal rank of the true class
            cnt += 1
    return ap, cnt

# Augmentation for training images
def transform_train(data, label):
    im = data.astype('float32') / 255   # scale pixels to [0, 1]
    # Augmenter list: random crop and mirror, plus normalization with the ImageNet mean/std
    auglist = image.CreateAugmenter(data_shape=(3, 224, 224), resize=256,
                                    rand_crop=True, rand_mirror=True,
                                    mean=np.array([0.485, 0.456, 0.406]),
                                    std=np.array([0.229, 0.224, 0.225]))
    for aug in auglist:
        im = aug(im)
    im = nd.transpose(im, (2, 0, 1))    # HWC -> CHW
    return (im, nd.array([label]).asscalar())

# Augmentation for validation images: no random crop or flip
def transform_val(data, label):
    im = data.astype('float32') / 255
    auglist = image.CreateAugmenter(data_shape=(3, 224, 224), resize=256,
                                    mean=np.array([0.485, 0.456, 0.406]),
                                    std=np.array([0.229, 0.224, 0.225]))
    for aug in auglist:
        im = aug(im)
    im = nd.transpose(im, (2, 0, 1))    # HWC -> CHW
    return (im, nd.array([label]).asscalar())

# Predict on the validation set and evaluate
def validate(net, val_data, ctx):
    metric = mx.metric.Accuracy()
    L = gluon.loss.SoftmaxCrossEntropyLoss()
    AP = 0.
    AP_cnt = 0
    val_loss = 0
    for i, batch in enumerate(val_data):
        data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0, even_split=False)
        label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0, even_split=False)
        outputs = [net(X) for X in data]
        metric.update(label, outputs)
        loss = [L(yhat, y) for yhat, y in zip(outputs, label)]
        val_loss += sum([l.mean().asscalar() for l in loss]) / len(loss)   # mean loss over devices
        ap, cnt = calculate_ap(label, outputs)
        AP += ap
        AP_cnt += cnt   # average precision is also averaged
    _, val_acc = metric.get()
    return val_acc, AP / AP_cnt, val_loss / len(val_data)


1.3 Reading the training and validation data

Gluon's built-in functions can now read the data; just point them at the corresponding data folders (see MXNet I/O).

# Read the prepared data sets
train_set = gdata.vision.ImageFolderDataset('./train_dis/', flag=1)
valid_set = gdata.vision.ImageFolderDataset('./valid_dis/', flag=1)

# Check the data and its classes
print(train_set)            # the dataset object; len(train_set) should equal the number of training images
print(train_set.synsets)    # class names discovered from the folder layout; should match the number of classes
print(valid_set)
print(valid_set.synsets)
<mxnet.gluon.data.vision.datasets.ImageFolderDataset object at 0x7fb3d6e06710>
['0', '1', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '2', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '3', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '4', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '5', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '6', '60', '7', '8', '9']
<mxnet.gluon.data.vision.datasets.ImageFolderDataset object at 0x7fb3d6e06668>
['0', '1', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '2', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '3', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '4', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '5', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '6', '60', '7', '8', '9']
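Note that ImageFolderDataset sorts the folder names as strings, so label index 2 corresponds to class '10', not '2', as the synsets output above shows. When a numeric prediction has to be reported as the original class name, map it back through synsets, for example:

# The internal label is a position in the sorted synsets list, not the
# integer written on the folder; map it back explicitly.
pred_idx = 2                               # example index produced by the model
class_name = train_set.synsets[pred_idx]   # -> '10', the real class name
print(pred_idx, '->', class_name)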

With the data sets in hand, wrap the images in iterators, setting the batch size according to the available GPU memory.

# Wrap the data sets in DataLoaders and apply the augmentations defined above
batch_size = 64    # 32 uses about 2821 MB of GPU memory, so 64 fits here
train_iter = gdata.DataLoader(train_set.transform(transform_train), batch_size,
                              shuffle=True, last_batch='keep', num_workers=4)
valid_iter = gdata.DataLoader(valid_set.transform(transform_val), batch_size,
                              shuffle=True, last_batch='keep', num_workers=4)

After loading, check the data coming out of the iterator and display an image as a visual sanity check.

# Check the data set in the iterator
print("train_iter length is: %d" % len(train_iter))
import matplotlib.pyplot as plt
for imgs, labels in train_iter:
    print(labels)        # the class labels of this batch
    print(imgs.shape)    # the shape of one batch of images
    break                # read a single batch

# Show an image (note: it is still ImageNet-normalized)
nor_parms = [[0.485, 0.456, 0.406], [0.229, 0.224, 0.225]]
#_, figs = plt.subplots(8, 4, figsize=(8, 4))
for i in range(8):
    for j in range(4):
        x = nd.transpose(imgs[i*4+j, :, :, :], (1, 2, 0)).asnumpy()
        print(x.shape, type(x))   # shape and type of a single image in the batch
        #x[:,:,0]*nor_parms[0][0]+nor_parms[1][0]
        #x[:,:,1]*nor_parms[0][1]+nor_parms[1][1]
        #x[:,:,2]*nor_parms[0][2]+nor_parms[1][2]
        plt.imshow(x)
        plt.show()
        break   # only look at the first image
    break
train_iter length is: 512    # 512 batches in total, 64 training images per batch

Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).

[35.  0. 31. 19. 38. 33. 35. 33. 19. 25. 16. 26. 36. 52. 18. 16. 27. 23.
 19.  4. 19. 38. 38. 11. 41. 36. 22. 36. 29. 57. 26. 55. 18. 55. 55. 16.
 27. 26. 55. 10. 19. 21. 23. 19. 50. 56. 31. 14. 20. 19.  8. 54. 57.  8.
 52. 19. 56. 57. 17. 42. 18.  0. 23. 55.]
<NDArray 64 @cpu_shared(0)>
(64, 3, 224, 224)
(224, 224, 3) <class 'numpy.ndarray'>
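The "Clipping input data" warning appears because the displayed image is still ImageNet-normalized. A small sketch for undoing the normalization before plotting, using the same nor_parms defined above (this de-normalization step is an addition, not part of the original code):

# Undo the ImageNet normalization so imshow receives values in [0, 1] again.
x = nd.transpose(imgs[0], (1, 2, 0)).asnumpy()            # CHW -> HWC
x = x * np.array(nor_parms[1]) + np.array(nor_parms[0])   # x * std + mean
plt.imshow(np.clip(x, 0, 1))
plt.show()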


2. Defining the model

Here we use transfer learning: a pretrained model extracts the basic image features, and only the final output layers need to be trained for classification.

# Define the network from a pretrained model
def get_net(ctx):
    # Use ResNet-50 v2 as the backbone for feature extraction
    resnet = model_zoo.vision.resnet50_v2(pretrained=True)
    # The pretrained output stays untouched; define a new output block on top of the features
    resnet.output_new = nn.HybridSequential(prefix='')
    # Add two fully connected layers for fine-tuning
    resnet.output_new.add(nn.Dense(256, activation='relu'))
    resnet.output_new.add(nn.Dense(61))   # 61 classes
    # Initialize only the new output block
    resnet.output_new.initialize(init.Xavier(), ctx=ctx)
    # Move the whole network (pretrained backbone + new head) to the target device
    resnet.collect_params().reset_ctx(ctx)
    return resnet

Define the loss function; here we use the softmax cross-entropy loss for classification.

# Classification loss: softmax cross-entropy
loss = gloss.SoftmaxCrossEntropyLoss()

def get_loss(data, net, ctx):
    l = 0.0
    for X, y in data:
        y = y.as_in_context(ctx)
        # Features produced by the pretrained backbone
        out_features = net.features(X.as_in_context(ctx))
        # Final output from the new head
        outputs = net.output_new(out_features)
        l += loss(outputs, y).mean().asscalar()
    return l / len(data)
2.1 Defining the training process

With the preparation done, the data loaded and the network and loss defined, we can start training. The training function is defined below; its inputs are the network, the data iterators, the number of epochs, the learning rate, the weight decay, and the learning-rate schedule:

# Training process: trainer, epoch loop, backprop on the loss, validation
def train(net, train_iter, valid_iter, num_epochs, lr, wd, ctx, lr_period, lr_decay):
    # Only the parameters of the new output block are given to the trainer
    trainer = gluon.Trainer(net.output_new.collect_params(), 'sgd',
                            {'learning_rate': lr, 'momentum': 0.9, 'wd': wd})
    plot_loss = []   # training loss per epoch, for plotting
    tic = datetime.datetime.now()
    print('Training is beginning, please wait......')
    for epoch in range(num_epochs):
        train_l = 0.0   # accumulated training loss
        counter = 0     # batch counter
        #if epoch > 0 and epoch % lr_period == 0:   # decay the learning rate every lr_period epochs
        trainer.set_learning_rate(trainer.learning_rate * lr_decay)   # here the rate is decayed every epoch
        #print("There are %d data could train network" % len(train_iter))
        for X, y in train_iter:   # e.g. 32 (batch) * 1024 (iterations) = 32768 images
            # Progress reporting
            counter += 1
            if counter % 256 == 0:
                print('processed %d images' % (counter * batch_size))
            y = y.astype('float32').as_in_context(ctx)
            # Forward pass through the pretrained backbone to get features
            # (this step could be done once up front for the whole data set)
            out_features = net.features(X.as_in_context(ctx))
            # Only the new head is recorded for autograd, so only the newly defined layers are trained
            with autograd.record():
                outputs = net.output_new(out_features)
                l = loss(outputs, y)
            l.backward()
            trainer.step(batch_size)
            train_l += l.mean().asscalar()
        # Timing
        toc = datetime.datetime.now()
        h, remainder = divmod((toc - tic).seconds, 3600)
        m, s = divmod(remainder, 60)
        time_s = "time %02d:%02d:%02d" % (h, m, s)
        # Validation
        if valid_iter is not None:
            valid_loss = get_loss(valid_iter, net, ctx)
            epoch_s = ("epoch %d, train loss is %f, valid loss is %f :D "
                       % (epoch + 1, train_l / len(train_iter), valid_loss))
        else:
            epoch_s = ("epoch %d, train loss is %f :D" % (epoch + 1, train_l / len(train_iter)))
        tic = toc
        print(epoch_s + time_s + ', lr ' + str(trainer.learning_rate))
        # Plot the training loss and save the figure
        plot_loss.append(train_l / len(train_iter))
        plt.plot(plot_loss)
        plt.savefig("./training_loss.png")
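The comment above notes that the backbone forward pass could be done once up front. A hedged sketch of that idea follows: cache the ResNet features for every batch once, then train only the small head on the cached tensors. The names feat_list and label_list, the hyperparameters, and the single-epoch loop are illustrative only; it assumes net = get_net(ctx) and ctx have been created as in section 2.2 below.

# Sketch: precompute backbone features once, then train only the new head.
feat_list, label_list = [], []
for X, y in train_iter:
    # one forward pass through the frozen backbone per batch, cached on CPU
    feat_list.append(net.features(X.as_in_context(ctx)).as_in_context(mx.cpu()))
    label_list.append(y.astype('float32'))

trainer = gluon.Trainer(net.output_new.collect_params(), 'sgd',
                        {'learning_rate': 0.01, 'momentum': 0.9, 'wd': 1e-4})
for feats, y in zip(feat_list, label_list):   # one epoch over cached features
    feats, y = feats.as_in_context(ctx), y.as_in_context(ctx)
    with autograd.record():
        l = loss(net.output_new(feats), y)
    l.backward()
    trainer.step(feats.shape[0])

The trade-off is memory: all cached feature tensors must fit in RAM, but each subsequent epoch skips the expensive backbone forward pass.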


2.2 Starting training
ctx = gb.try_gpu(); num_epochs = 1000; lr = 0.01; wd = 1e-4; lr_period = 10; lr_decay = 0.99
net = get_net(ctx)   # put the network on the GPU
train(net, train_iter, valid_iter, num_epochs, lr, wd, ctx, lr_period, lr_decay)   # train
net.output_new.collect_params().save('./output_new_2_1000.params')   # save the head's parameters after training
#net.output_new.save_params('./output_new_50.params')
Training is beginning, please wait......
processed xxxxx images
processed xxxxx images
epoch 1, train loss is 1.234988, valid loss is 0.776764 :D time 00:04:10, lr 0.0099
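To reuse the trained head later without retraining, the saved parameter file can be loaded back into a freshly built network; a brief sketch (the file name matches the save call above):

# Rebuild the network and load the previously saved head parameters.
net = get_net(ctx)
net.output_new.collect_params().load('./output_new_2_1000.params', ctx=ctx)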


3. Testing

Once training is complete and we have the model, we need to run it on the test data. As before, we read in the data and classify it with the network.

# Prepare the test data
test_set = gdata.vision.ImageFolderDataset('./test_dis/', flag=1)
print("There are %d test imgs" % len(test_set))
There are xxxx test imgs

Define a function to read an image:

def plot_image(img_path):
    with open(img_path, 'rb') as f:
        img = image.imdecode(f.read())   # decode the raw bytes into an image
    #plt.imshow(img.asnumpy())
    return img

Next comes the prediction loop:

# Prediction loop
preds = []
count_p = 0
for img_path, label in test_set.items:   # classify every image listed in the test set
    img = plot_image(img_path)
    data, _ = transform_val(img, 0)
    data = data.expand_dims(axis=0)
    #plt.imshow(img.asnumpy())
    #plt.show()
    #print(img_path)
    #break
    # Features from the pretrained backbone, i.e. the input to our new output block
    output_features = net.features(data.as_in_context(mx.gpu()))
    # Feed the features to the new head and turn the output into probabilities
    output = nd.softmax(net.output_new(output_features))
    preds.extend(output.asnumpy())
    count_p += 1
    #print(count_p)
    if count_p % 100 == 0:
        print("processed %d imgs" % count_p)
processed 100 imgs
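Classifying one image at a time leaves the GPU underutilized. An alternative sketch batches the test set through a DataLoader with the same transform_val; the loader name and batch size are illustrative, and shuffle=False keeps the prediction order aligned with test_set.items:

# Sketch: batched prediction over the test set instead of one image at a time.
test_iter = gdata.DataLoader(test_set.transform(transform_val), 64,
                             shuffle=False, last_batch='keep', num_workers=4)
preds = []
for X, _ in test_iter:
    feats = net.features(X.as_in_context(mx.gpu()))
    preds.extend(nd.softmax(net.output_new(feats)).asnumpy())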



If needed, the generated predictions in preds can be saved as a JSON file:

# Combine the test_set file names with the predicted classes
with open('submission.json', 'w') as f:
    f.write("[")
    for i in range(len(preds)):
        if i == len(preds) - 1:
            f.write("{" + "\"image_id\": " + "\"" + test_set.items[i][0].split('/')[-1] + "\"" + ',' +
                    "\"xxxx_class\":" + str(preds[i].argmax()) + '}')
        else:
            f.write("{" + "\"image_id\": " + "\"" + test_set.items[i][0].split('/')[-1] + "\"" + ',' +
                    "\"xxxx_class\":" + str(preds[i].argmax()) + '}' + ',')
    f.write("]")
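Building the JSON by hand is easy to get wrong (a missing comma or quote makes the file unreadable). An equivalent sketch lets the json module handle the formatting; the key name xxxx_class simply mirrors the placeholder used above:

# Equivalent sketch using json.dump so quoting and commas are handled for us.
import json
import os

results = [{"image_id": os.path.basename(test_set.items[i][0]),
            "xxxx_class": int(preds[i].argmax())}   # key name mirrors the placeholder above
           for i in range(len(preds))]
with open('submission.json', 'w') as f:
    json.dump(results, f)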

Finally, check that the number of generated entries matches the number of test images, and we are done.

# Check the format of the generated file
import json
user_result_list = json.load(open('./submission.json', encoding='utf-8'))
len(user_result_list)
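A quick assertion makes this final check explicit; a small sketch:

# The number of saved predictions should equal the number of test images.
assert len(user_result_list) == len(test_set), (len(user_result_list), len(test_set))
print("submission.json contains %d entries" % len(user_result_list))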


