PyTorch DataLoader extremely slow on first epoch
While checking the loading speed of the PyTorch DataLoader, I found that the first pass over the dataset is very slow.
For example:
import time
from torch.utils.data import DataLoader

data_loader = DataLoader(dataset=data_set, batch_size=64, num_workers=2, shuffle=True, pin_memory=False, drop_last=True)

TT = []  # wall-clock time of each full pass over the loader
for i in range(3):
    S = time.time()
    for index, (input, target) in enumerate(data_loader):
        print(index)
    E = time.time()
    T = E - S
    TT.append(T)
print(TT)  # [75.70432996749878, 5.695326089859009, 5.47631311416626]
The first pass over the data took about 75 s, while each subsequent pass took about 5 s.
PyTorch's DataLoader is too slow when processing a large dataset.
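To narrow down where the first-epoch time goes, a per-batch timer along these lines might help. This is only a rough sketch: `time_batches` and `summarize` are hypothetical helper names (not part of PyTorch), and the `slow_start` generator is a stand-in that merely simulates a slow start so the snippet runs without a dataset.

```python
import time


def time_batches(loader):
    """Return the seconds spent waiting for each batch in one pass over `loader`."""
    waits = []
    start = time.perf_counter()
    for _ in loader:
        now = time.perf_counter()
        waits.append(now - start)  # time blocked waiting for this batch
        start = now                # restart the clock for the next batch
    return waits


def summarize(waits):
    """Split one pass into the first-batch wait, the remaining wait, and the total."""
    first = waits[0] if waits else 0.0
    total = sum(waits)
    return {
        "first_batch": first,
        "rest": total - first,
        "total": total,
        "batches": len(waits),
    }


if __name__ == "__main__":
    # Hypothetical stand-in for a DataLoader: a one-off startup delay,
    # followed by a small fixed cost per item.
    def slow_start(n=5, startup=0.2, per_item=0.01):
        time.sleep(startup)
        for i in range(n):
            time.sleep(per_item)
            yield i

    for epoch in range(3):
        print(epoch, summarize(time_batches(slow_start())))
```

With the real loader, `summarize(time_batches(data_loader))` could be called once per epoch in place of the outer timing loop. If most of the extra 70 s shows up in `first_batch`, the cost is probably in startup. If it is spread over `rest`, the per-sample loading itself is likely slower the first time. Either reading is an assumption to verify, not a diagnosis.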