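Tiny-ImageNet can be downloaded from: http://cs231n.stanford.edu/tiny-imagenet-200.zip
After the download finishes, unzip the archive. The extracted tiny-imagenet-200 directory contains the train, val, and test folders, along with wnids.txt and words.txt.
In the train folder, all images are already placed in folders named after their classes, just as in ImageNet, so nothing needs to be done there. The val folder, however, keeps all images in a single images folder, with their labels listed in val_annotations.txt. As with ImageNet, a script is needed to move each file into its class folder so that the layout matches what PyTorch expects when reading data. The following script does this: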
import glob
import os
from shutil import move

target_folder = './tiny-imagenet-200/val/'

# Map each validation image file name to its class ID.
val_dict = {}
with open(os.path.join(target_folder, 'val_annotations.txt'), 'r') as f:
    for line in f:
        split_line = line.split('\t')
        val_dict[split_line[0]] = split_line[1]

paths = glob.glob(os.path.join(target_folder, 'images', '*'))

# Move every image into val/<class ID>/images/, creating the folder when needed.
for path in paths:
    file = os.path.basename(path)  # also handles Windows path separators
    folder = val_dict[file]
    dest_dir = os.path.join(target_folder, folder, 'images')
    os.makedirs(dest_dir, exist_ok=True)
    move(path, os.path.join(dest_dir, file))

# The original images folder is now empty and can be removed.
os.rmdir(os.path.join(target_folder, 'images'))
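After running the script above, a quick sanity check can confirm that the files ended up where expected. The following is only a minimal sketch: `summarize_val` is a hypothetical helper name, not part of any library, and it assumes the val/<class ID>/images/ layout produced by the script. For the standard archive one would expect 200 class folders and 10,000 images in total.

```python
import os
from collections import Counter


def summarize_val(val_dir):
    """Count the images in each val/<class ID>/images/ folder (hypothetical helper)."""
    counts = Counter()
    for entry in sorted(os.listdir(val_dir)):
        images_dir = os.path.join(val_dir, entry, 'images')
        # Plain files such as val_annotations.txt are skipped here.
        if os.path.isdir(images_dir):
            counts[entry] = len(os.listdir(images_dir))
    return counts


if __name__ == '__main__':
    val_dir = './tiny-imagenet-200/val/'  # assumed location, same as in the script
    if os.path.isdir(val_dir):
        counts = summarize_val(val_dir)
        print('class folders:', len(counts))
        print('total images:', sum(counts.values()))
```

If the counts look wrong, the most likely cause is that the script was interrupted partway through. It can simply be run again as long as val/images still exists.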
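With this, the Tiny-ImageNet file layout is essentially the same as ImageNet's, so similar code can be used to build the DataLoader. Here the images are resized from their original 64x64 to 32x32: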
import os

import torch
from torchvision import datasets, transforms


def tiny_loader(batch_size, data_dir):
    num_label = 200
    # Per-channel mean and standard deviation used for normalization.
    normalize = transforms.Normalize((0.4802, 0.4481, 0.3975), (0.2770, 0.2691, 0.2821))
    # Training: random crop resized to 32x32 plus a random horizontal flip.
    transform_train = transforms.Compose(
        [transforms.RandomResizedCrop(32), transforms.RandomHorizontalFlip(), transforms.ToTensor(),
         normalize, ])
    # Evaluation: deterministic resize to 32x32, no augmentation.
    transform_test = transforms.Compose([transforms.Resize(32), transforms.ToTensor(), normalize, ])
    trainset = datasets.ImageFolder(root=os.path.join(data_dir, 'train'), transform=transform_train)
    testset = datasets.ImageFolder(root=os.path.join(data_dir, 'val'), transform=transform_test)
    train_loader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, pin_memory=True)
    test_loader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, pin_memory=True)
    return train_loader, test_loader, num_label
Training can then proceed as usual.
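One thing that may be worth checking before training: ImageFolder assigns label indices from the sorted names of the class folders, so train and val need to expose exactly the same set of folders for the labels to line up. The sketch below approximates that mapping with the standard library only. `class_to_idx` and `check_split_consistency` are hypothetical helper names, and the code assumes the train/ and val/ layout described above.

```python
import os


def class_to_idx(root):
    """Approximate the sorted-folder-name label mapping used by ImageFolder."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: index for index, name in enumerate(classes)}


def check_split_consistency(data_dir):
    """Return True when train/ and val/ yield the same class-to-index mapping."""
    train_map = class_to_idx(os.path.join(data_dir, 'train'))
    val_map = class_to_idx(os.path.join(data_dir, 'val'))
    return train_map == val_map


if __name__ == '__main__':
    data_dir = './tiny-imagenet-200'  # assumed location of the extracted archive
    if os.path.isdir(data_dir):
        print('splits consistent:', check_split_consistency(data_dir))
```

A result of False would typically mean that the val folder was not reorganized, or that a leftover val/images folder is being treated as an extra class.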