Dataset download:
Link: https://pan.baidu.com/s/1l1AnBgkAAEhh0vI5_loWKw
Extraction code: 2xq4
Creating the dataset: https://www.cnblogs.com/xiximayou/p/12398285.html
Reading the dataset: https://www.cnblogs.com/xiximayou/p/12422827.html
Training: https://www.cnblogs.com/xiximayou/p/12448300.html
Saving the model and resuming training: https://www.cnblogs.com/xiximayou/p/12452624.html
Loading a saved model and testing: https://www.cnblogs.com/xiximayou/p/12459499.html
Splitting off a validation set and validating during training: https://www.cnblogs.com/xiximayou/p/12464738.html
Using a learning-rate decay schedule and testing during training: https://www.cnblogs.com/xiximayou/p/12468010.html
Visualizing training and testing with tensorboard: https://www.cnblogs.com/xiximayou/p/12482573.html
Accepting parameters from the command line: https://www.cnblogs.com/xiximayou/p/12488662.html
Measuring the model with top-1 and top-5 accuracy: https://www.cnblogs.com/xiximayou/p/12489069.html
Using a pretrained resnet18 model: https://www.cnblogs.com/xiximayou/p/12504579.html
The relationship between epoch, batchsize, and step: https://www.cnblogs.com/xiximayou/p/12405485.html
There are two ways to compute the mean and standard deviation of the dataset:
Method 1: create a new file count_mean_std.py under utils
from time import time

import numpy as np
import torchvision
from PIL import Image
from tqdm import tqdm


def compute_mean_and_std(dataset):
    # Input: a list of (image_path, label) pairs, e.g. ImageFolder(...).imgs
    # Output: per-channel mean and standard deviation in RGB order, scaled to [0, 1]
    mean_r = 0
    mean_g = 0
    mean_b = 0
    print("Computing mean >>>")
    for img_path, _ in tqdm(dataset, ncols=80):
        # PIL returns channels in RGB order; convert() guards against grayscale/RGBA files
        img = Image.open(img_path).convert("RGB")
        img = np.asarray(img)  # PIL Image -> numpy array of shape (H, W, 3)
        mean_r += np.mean(img[:, :, 0])
        mean_g += np.mean(img[:, :, 1])
        mean_b += np.mean(img[:, :, 2])
    # Average of the per-image means (equals the pixel mean when all images share one size)
    mean_r /= len(dataset)
    mean_g /= len(dataset)
    mean_b /= len(dataset)

    diff_r = 0
    diff_g = 0
    diff_b = 0
    N = 0  # total number of pixels per channel
    print("Computing standard deviation >>>")
    for img_path, _ in tqdm(dataset, ncols=80):
        img = Image.open(img_path).convert("RGB")
        img = np.asarray(img)
        diff_r += np.sum(np.power(img[:, :, 0] - mean_r, 2))
        diff_g += np.sum(np.power(img[:, :, 1] - mean_g, 2))
        diff_b += np.sum(np.power(img[:, :, 2] - mean_b, 2))
        N += np.prod(img[:, :, 0].shape)
    std_r = np.sqrt(diff_r / N)
    std_g = np.sqrt(diff_g / N)
    std_b = np.sqrt(diff_b / N)

    mean = (mean_r.item() / 255.0, mean_g.item() / 255.0, mean_b.item() / 255.0)
    std = (std_r.item() / 255.0, std_g.item() / 255.0, std_b.item() / 255.0)
    return mean, std


path = "/content/drive/My Drive/colab notebooks/data/dogcat"
train_path = path + "/train"
test_path = path + "/test"
val_path = path + "/val"
train_data = torchvision.datasets.ImageFolder(train_path)
val_data = torchvision.datasets.ImageFolder(val_path)
test_data = torchvision.datasets.ImageFolder(test_path)

# train_mean, train_std = compute_mean_and_std(train_data.imgs)
time_start = time()
val_mean, val_std = compute_mean_and_std(val_data.imgs)
time_end = time()
print("Time spent on the validation set:", round(time_end - time_start, 4), "s")
# test_mean, test_std = compute_mean_and_std(test_data.imgs)
# print("Training set mean: {}, std: {}".format(train_mean, train_std))
print("Validation set mean: {}".format(val_mean))
print("Validation set std: {}".format(val_std))
# print("Test set mean: {}, std: {}".format(test_mean, test_std))

The last print statement must format val_std. The version of the script that was originally run passed val_mean there by mistake, so it printed the mean twice.
Result: (screenshot of the run output; not reproduced here)
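Explanation: we read the dataset with PyTorch's datasets.ImageFolder, so to get at the images we take train_data.imgs (and likewise for the other splits). Its value is a list of the form [(image path 1, label), (image path 2, label), ...], and `for img_path, _ in dataset` in the code picks out each image path. Image.open() then opens the image, which is converted to a numpy array before the mean and standard deviation are computed. The run in the screenshot looks fast, but that is the result after several runs, with the data served from cache; the first run is much slower. Only the validation set is computed here. The training set has close to 20,000 images and would be slower still, so it is skipped.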
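The two passes over the files can also be folded into one. The following is a minimal sketch, not the method used in this post: it accumulates per-channel sums and sums of squares over every pixel, so each image is read once and images of different sizes are weighted by their pixel count. The function name `streaming_mean_std` and the synthetic images are assumptions made for illustration; in practice each array would come from `np.asarray(Image.open(img_path).convert("RGB"))`.

```python
import numpy as np


def streaming_mean_std(images):
    """Per-channel mean and std over all pixels, scaled to [0, 1].

    `images` is any iterable of H x W x 3 uint8 arrays in RGB order.
    """
    total = np.zeros(3, dtype=np.float64)      # sum of pixel values per channel
    total_sq = np.zeros(3, dtype=np.float64)   # sum of squared pixel values per channel
    count = 0                                  # number of pixels seen per channel
    for img in images:
        pixels = img.reshape(-1, 3).astype(np.float64) / 255.0
        total += pixels.sum(axis=0)
        total_sq += (pixels ** 2).sum(axis=0)
        count += pixels.shape[0]
    mean = total / count
    # Var[x] = E[x^2] - E[x]^2; clip guards against tiny negative rounding errors
    std = np.sqrt(np.clip(total_sq / count - mean ** 2, 0.0, None))
    return mean, std


# Synthetic stand-ins for decoded images of different sizes (hypothetical data)
rng = np.random.default_rng(0)
fake_images = [
    rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8)
    for h, w in [(32, 48), (64, 64), (20, 10)]
]
mean, std = streaming_mean_std(fake_images)
print("mean:", mean)
print("std:", std)
```

Because the sums are kept in float64 and the values are already scaled to [0, 1], the sum-of-squares formula is numerically safe here; for data with a much larger range, a two-pass or Welford-style update would be the more cautious choice.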
Once the mean and standard deviation are available, they can be used in data augmentation like this:
import torchvision

# The mean/std values below are the commonly used ImageNet statistics;
# replace them with the values computed for your own dataset if you prefer.
train_transform = torchvision.transforms.Compose([
    torchvision.transforms.RandomResizedCrop(size=224, scale=(0.08, 1.0)),
    torchvision.transforms.RandomHorizontalFlip(),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                     std=(0.229, 0.224, 0.225)),
])
val_transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                     std=(0.229, 0.224, 0.225)),
])
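Note that normalization comes last, after all the augmentation steps. The earlier augmentations operate on the image itself, so they all run before ToTensor(). After ToTensor(), the data is a tensor and the pixel values lie between 0 and 1, which is the scale the mean and standard deviation are expressed in.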
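To make the ordering concrete, here is a small numpy sketch of the arithmetic involved. It does not call torchvision; `to_tensor_like` and `normalize_like` are hypothetical helpers that only imitate the numeric effect of ToTensor() and Normalize() on a uint8 image, under the assumption that the input is an H x W x 3 RGB array.

```python
import numpy as np


def to_tensor_like(img_uint8):
    """Imitate the numeric effect of ToTensor(): HWC uint8 -> CHW float in [0, 1]."""
    return img_uint8.astype(np.float32).transpose(2, 0, 1) / 255.0


def normalize_like(chw, mean, std):
    """Imitate Normalize(): subtract the channel mean, divide by the channel std."""
    mean = np.asarray(mean, dtype=np.float32).reshape(3, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(3, 1, 1)
    return (chw - mean) / std


# A 2 x 2 RGB image whose pixels are all (255, 128, 0)
img = np.zeros((2, 2, 3), dtype=np.uint8)
img[..., 0] = 255
img[..., 1] = 128

scaled = to_tensor_like(img)
normalized = normalize_like(
    scaled, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)
)
print(scaled[:, 0, 0])      # per-channel values after scaling to [0, 1]
print(normalized[:, 0, 0])  # per-channel values after normalization
```

The mean and std are fractions of 1, not of 255, which is why they only make sense once the pixels have been scaled by the ToTensor() step.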
Method 2:
import os
import random

import cv2
import numpy as np
from tqdm import tqdm

# Compute per-channel mean and std from a random sample of images
train_txt_path = './train_val_list.txt'
CNum = 10000  # how many images to sample
img_h, img_w = 224, 224
imgs = []  # start empty so no all-zero placeholder image skews the statistics
means, stdevs = [], []

with open(train_txt_path, 'r') as f:
    lines = f.readlines()
random.shuffle(lines)  # shuffle so the first CNum lines are a random sample

for i in tqdm(range(min(CNum, len(lines)))):
    img_path = os.path.join('./train', lines[i].rstrip().split()[0])
    img = cv2.imread(img_path)
    img = cv2.resize(img, (img_w, img_h))  # dsize is (width, height)
    imgs.append(img)

# Stack to shape (H, W, 3, N) and scale to [0, 1]; all sampled images are held in memory
imgs = np.stack(imgs, axis=3).astype(np.float32) / 255.

for i in range(3):
    pixels = imgs[:, :, i, :].ravel()  # flatten one channel into a single row
    means.append(np.mean(pixels))
    stdevs.append(np.std(pixels))

# cv2 reads images as BGR; PIL/skimage read RGB and need no conversion
means.reverse()  # BGR --> RGB
stdevs.reverse()

print("normMean = {}".format(means))
print("normStd = {}".format(stdevs))
print('transforms.Normalize(mean={}, std={})'.format(means, stdevs))
This one is taken from the web and is provided for reference.
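The final `reverse()` calls in Method 2 matter because cv2 stores channels as B, G, R while `transforms.Normalize` is applied to RGB tensors. The short sketch below, using a random array as a stand-in for the output of `cv2.imread` (an assumption for illustration, no file is read), shows that reversing the list of per-channel statistics gives the same result as flipping the channel axis first.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for an image as returned by cv2.imread: H x W x 3, channels in B, G, R order
bgr = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)

# Option A: compute per-channel means in BGR order, then reverse the list
means_bgr = [float(bgr[:, :, c].mean()) / 255.0 for c in range(3)]
means_rgb_a = means_bgr[::-1]

# Option B: flip the channel axis first, then compute the means in RGB order
rgb = bgr[:, :, ::-1]
means_rgb_b = [float(rgb[:, :, c].mean()) / 255.0 for c in range(3)]

print(means_rgb_a)
print(means_rgb_b)
```

Method 1 reads images with PIL, which already yields RGB, so no reversal is needed there.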
So far we have always read the dataset with datasets.ImageFolder. In the next section we will read the cat-and-dog dataset in a second way.