PyTorch DataLoader extremely slow on first epoch
While checking the loading speed of the PyTorch DataLoader, I found that the first pass over the dataset is very slow.
For example:
import time
from torch.utils.data import DataLoader

# data_set is the Dataset being loaded (defined elsewhere)
data_loader = DataLoader(dataset=data_set,
                         batch_size=64,
                         num_workers=2,
                         shuffle=True,
                         pin_memory=False,
                         drop_last=True)

TT = []  # wall-clock time of each full pass over the data
for i in range(3):
    S = time.time()
    for index, (input, target) in enumerate(data_loader):
        print(index)
    E = time.time()
    T = E - S
    TT.append(T)
print(TT)  # [75.70432996749878, 5.695326089859009, 5.47631311416626]
The first pass over the data took about 75 s; each subsequent pass took about 5 s.
PyTorch's DataLoader is too slow when processing a large dataset.
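To narrow down where the extra time in the first pass goes, it may help to record the time until the first batch arrives separately from the total time per pass. The following is only a minimal sketch: time_epochs is a hypothetical helper name (not part of PyTorch), and a plain list stands in for the DataLoader so the snippet runs without a dataset. Any iterable, including a real DataLoader, could be passed in instead.

```python
import time


def time_epochs(loader, n_epochs=3):
    """Time several full passes over `loader` (any iterable of batches).

    Returns one (time_to_first_batch, total_time) tuple per pass.
    time_to_first_batch is None if the loader yields nothing.
    """
    results = []
    for _ in range(n_epochs):
        start = time.perf_counter()
        first_batch = None
        for index, _batch in enumerate(loader):
            if index == 0:
                # Includes any start-up cost paid before the first batch
                first_batch = time.perf_counter() - start
        total = time.perf_counter() - start
        results.append((first_batch, total))
    return results


# Stand-in for a DataLoader (assumption): 10 fake (input, target) batches
fake_loader = [([0.0] * 4, 0) for _ in range(10)]

timings = time_epochs(fake_loader, n_epochs=3)
for epoch, (first, total) in enumerate(timings):
    print(f"pass {epoch}: first batch after {first:.6f}s, total {total:.6f}s")
```

If the first pass shows a large time to first batch but a per-batch rate similar to later passes, the cost is likely start-up related. If every batch in the first pass is slow, the data source itself is the more likely place to look. Both readings are guesses to be checked against the actual dataset.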
