[Cat and Dog Dataset] Computing the mean and standard deviation of the dataset


Dataset download:

Link: https://pan.baidu.com/s/1l1AnBgkAAEhh0vI5_loWKw
Extraction code: 2xq4

Creating the dataset: https://www.cnblogs.com/xiximayou/p/12398285.html

Reading the dataset: https://www.cnblogs.com/xiximayou/p/12422827.html

Training: https://www.cnblogs.com/xiximayou/p/12448300.html

Saving the model and resuming training: https://www.cnblogs.com/xiximayou/p/12452624.html

Loading a saved model and testing: https://www.cnblogs.com/xiximayou/p/12459499.html

Splitting off a validation set and validating while training: https://www.cnblogs.com/xiximayou/p/12464738.html

Using a learning-rate decay schedule and testing while training: https://www.cnblogs.com/xiximayou/p/12468010.html

Visualizing training and testing with tensorboard: https://www.cnblogs.com/xiximayou/p/12482573.html

Receiving arguments from the command line: https://www.cnblogs.com/xiximayou/p/12488662.html

Measuring the model with top1 and top5 accuracy: https://www.cnblogs.com/xiximayou/p/12489069.html

Using a pretrained resnet18 model: https://www.cnblogs.com/xiximayou/p/12504579.html

The relationship between epoch, batchsize, and step: https://www.cnblogs.com/xiximayou/p/12405485.html

 

There are two ways to compute the mean and standard deviation of the dataset:

Method 1: create a new file count_mean_std.py under utils

import numpy as np
from PIL import Image
import torchvision
from time import time
from tqdm import tqdm

def compute_mean_and_std(dataset):
    # Input: a list of (image_path, label) pairs, e.g. ImageFolder(...).imgs
    # Output: per-channel mean and standard deviation in RGB order, scaled to [0, 1]
    channel_sum = np.zeros(3, dtype=np.float64)
    num_pixels = 0
    print("Computing mean >>>")
    for img_path, _ in tqdm(dataset, ncols=80):
        # PIL loads images as RGB; convert() also handles grayscale and RGBA files
        img = np.asarray(Image.open(img_path).convert("RGB"), dtype=np.float64)
        channel_sum += img.sum(axis=(0, 1))
        num_pixels += img.shape[0] * img.shape[1]

    # Weight by pixel count so images of different sizes contribute correctly
    mean = channel_sum / num_pixels

    sq_diff_sum = np.zeros(3, dtype=np.float64)
    print("Computing standard deviation >>>")
    for img_path, _ in tqdm(dataset, ncols=80):
        img = np.asarray(Image.open(img_path).convert("RGB"), dtype=np.float64)
        sq_diff_sum += ((img - mean) ** 2).sum(axis=(0, 1))

    std = np.sqrt(sq_diff_sum / num_pixels)

    # Scale from [0, 255] to [0, 1] to match the output range of ToTensor()
    mean = tuple((mean / 255.0).tolist())
    std = tuple((std / 255.0).tolist())
    return mean, std

path = "/content/drive/My Drive/colab notebooks/data/dogcat"
train_path = path + "/train"
test_path = path + "/test"
val_path = path + "/val"
train_data = torchvision.datasets.ImageFolder(train_path)
val_data = torchvision.datasets.ImageFolder(val_path)
test_data = torchvision.datasets.ImageFolder(test_path)
#train_mean,train_std=compute_mean_and_std(train_data.imgs)
time_start = time()
val_mean, val_std = compute_mean_and_std(val_data.imgs)
time_end = time()
print("Validation set computation time:", round(time_end - time_start, 4), "s")
#test_mean,test_std=compute_mean_and_std(test_data.imgs)
#print("Training set mean: {}, std: {}".format(train_mean,train_std))
print("Validation set mean: {}".format(val_mean))
print("Validation set std: {}".format(val_std))
#print("Test set mean: {}, std: {}".format(test_mean,test_std))

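Result: [screenshot of the program output]

Note: we read the dataset with PyTorch's datasets.ImageFolder. To pass in the images, we take them out with train_data.imgs (and likewise for the other splits). The value of train_data.imgs has the form [(image path 1, label), (image path 2, label), ...]. In the code, for img_path, _ in dataset picks out exactly the image path. Image.open() then opens the image, which is converted to a NumPy array, and finally the mean and standard deviation are computed. The run in the screenshot looks fast, but that is the result after several runs, with the data served from cache; the first run is much slower. Only the validation set is computed here. The training set has nearly 20,000 images and would be slower still, so it is skipped.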
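
The approach above reads every image twice, once for the mean and once for the spread. As a rough sketch of an alternative, the statistics can be accumulated in a single pass by keeping a per-channel sum and sum of squares and using Var = E[x^2] - (E[x])^2. The function name running_mean_std and the synthetic arrays are hypothetical, introduced only for illustration; in practice each array would come from np.asarray(Image.open(img_path).convert("RGB")).

```python
import numpy as np

def running_mean_std(images):
    """Per-channel mean and std over an iterable of HxWx3 uint8 arrays, in one pass."""
    total = np.zeros(3, dtype=np.float64)     # sum of pixel values per channel
    total_sq = np.zeros(3, dtype=np.float64)  # sum of squared pixel values per channel
    count = 0                                 # number of pixels per channel seen so far
    for img in images:
        x = np.asarray(img, dtype=np.float64) / 255.0  # scale to [0, 1]
        total += x.sum(axis=(0, 1))
        total_sq += (x ** 2).sum(axis=(0, 1))
        count += x.shape[0] * x.shape[1]
    mean = total / count
    # Clamp at zero to guard against tiny negative values from rounding
    var = np.maximum(total_sq / count - mean ** 2, 0.0)
    return mean, np.sqrt(var)

# Synthetic stand-ins for images of different sizes (hypothetical data)
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8)
               for h, w in [(32, 48), (64, 64), (20, 10)]]
mean, std = running_mean_std(fake_images)
print("mean:", mean)
print("std:", std)
```

Because the sums are weighted by pixel count, images of different sizes contribute in proportion to their area, and memory use stays constant regardless of how many images are processed.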

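Once you have the mean and standard deviation, you can use them in the data augmentation pipeline as follows. The values shown here are the widely used ImageNet statistics; substitute the values computed for your own dataset where appropriate: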

import torchvision

train_transform = torchvision.transforms.Compose([
    torchvision.transforms.RandomResizedCrop(size=224,
                                             scale=(0.08, 1.0)),
    torchvision.transforms.RandomHorizontalFlip(),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
val_transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

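Note that normalization comes last, after all the data augmentation. The earlier augmentation steps operate on images, so they all come before ToTensor(). After ToTensor(), the pixel values lie in the range 0-1 and the data is a tensor, which is what Normalize expects.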
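
To make the ordering concrete, here is a small NumPy imitation of what ToTensor() and Normalize() do to the pixel values. It is only a sketch: the helper names to_tensor_like and normalize_like are hypothetical, and the real torchvision transforms return torch tensors rather than NumPy arrays.

```python
import numpy as np

def to_tensor_like(img_uint8):
    """Imitate ToTensor: HxWxC uint8 in [0, 255] -> CxHxW float32 in [0, 1]."""
    return img_uint8.astype(np.float32).transpose(2, 0, 1) / 255.0

def normalize_like(chw, mean, std):
    """Imitate Normalize: (x - mean[c]) / std[c] for each channel c."""
    mean = np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(-1, 1, 1)
    return (chw - mean) / std

# A tiny 4x4 test image: red and blue channels at 255, green channel at 0
img = np.full((4, 4, 3), 255, dtype=np.uint8)
img[:, :, 1] = 0

x = to_tensor_like(img)
y = normalize_like(x, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
print(x.shape, x.min(), x.max())  # channel-first layout, values scaled to [0, 1]
print(y[:, 0, 0])                 # normalized value of the first pixel, per channel
```

This is also why the mean and standard deviation are divided by 255 when they are computed: Normalize is applied to values that have already been scaled to the 0-1 range.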

Method 2:

import os
import random

import cv2
import numpy as np
from tqdm import tqdm

# calculate means and std
train_txt_path = './train_val_list.txt'

CNum = 10000   # how many images to sample for the computation

img_h, img_w = 224, 224
imgs = []
means, stdevs = [], []

with open(train_txt_path, 'r') as f:
    lines = f.readlines()
    random.shuffle(lines)  # shuffle to pick images at random

# Never index past the end of the list if it has fewer than CNum entries
for i in tqdm(range(min(CNum, len(lines)))):
    img_path = os.path.join('./train', lines[i].rstrip().split()[0])

    img = cv2.imread(img_path)
    img = cv2.resize(img, (img_w, img_h))  # cv2.resize expects (width, height)
    imgs.append(img)

# Stack into shape (N, H, W, 3) and scale to [0, 1]
imgs = np.stack(imgs).astype(np.float32) / 255.

for i in range(3):
    pixels = imgs[:, :, :, i].ravel()  # flatten one channel into a single row
    means.append(np.mean(pixels))
    stdevs.append(np.std(pixels))

# cv2 reads images as BGR; PIL/skimage read RGB and need no reversal
means.reverse()  # BGR --> RGB
stdevs.reverse()

print("normMean = {}".format(means))
print("normStd = {}".format(stdevs))
print('transforms.Normalize(mean = {}, std = {})'.format(means, stdevs))

This one is taken from the web and is provided for reference.

 

So far we have always read the dataset with datasets.ImageFolder. In the next section we will read the cat and dog dataset in a second way.

