Problem Description
Code Walkthrough
Notes
If you run this in a Colab or Kaggle notebook, remove num_workers=8; otherwise memory will be exhausted after a while. The downside of removing it is that training time increases significantly.
Since training can take more than 6 hours, save the model's parameters so that training can be resumed later.
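Saving and resuming can be sketched as below. This is a minimal example, not the original code: the file name checkpoint.pth, the tiny stand-in model, and the use of Adam are all assumptions.

```python
import torch
import torch.nn as nn

# Tiny stand-in model; replace with your actual classifier.
model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Save everything needed to resume: weights, optimizer state, and epoch.
checkpoint = {
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "epoch": 10,
}
torch.save(checkpoint, "checkpoint.pth")

# ... later, possibly in a fresh Colab session ...
ckpt = torch.load("checkpoint.pth")
model.load_state_dict(ckpt["model"])
optimizer.load_state_dict(ckpt["optimizer"])
start_epoch = ckpt["epoch"] + 1
```

Saving the optimizer state matters for Adam, since its per-parameter moment estimates would otherwise restart from zero.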
medium baseline
It is enough to modify the contents of transforms.Compose. For this task, horizontal flips and rotations are reasonable augmentations. A good introductory blog post on data augmentation is also worth reading.
# It is important to do data augmentation in training.
# However, not every augmentation is useful.
# Please think about what kind of augmentation is helpful for food recognition.
train_tfm1 = transforms.Compose([
    # Resize the image into a fixed shape (height = width = 128)
    transforms.Resize((128, 128)),
    # You may add some transforms here.
    # ToTensor() should be the last one of the transforms.
    transforms.ToTensor(),
])
train_tfm2 = transforms.Compose([
    # Resize the image into a fixed shape (height = width = 128)
    transforms.Resize((128, 128)),
    # You may add some transforms here.
    # ToTensor() should be the last one of the transforms.
    transforms.RandomHorizontalFlip(p=1.0),
    transforms.ToTensor(),
])
train_tfm3 = transforms.Compose([
    # Resize the image into a fixed shape (height = width = 128)
    transforms.Resize((128, 128)),
    # You may add some transforms here.
    # ToTensor() should be the last one of the transforms.
    transforms.RandomRotation(30),
    transforms.ToTensor(),
])

# We don't need augmentations in testing and validation.
# All we need here is to resize the PIL image and transform it into Tensor.
test_tfm = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
])

# Batch size for training, validation, and testing.
# A greater batch size usually gives a more stable gradient.
# But the GPU memory is limited, so please adjust it carefully.
batch_size = 128

# Construct datasets.
# The argument "loader" tells how torchvision reads the data.
train_set1 = DatasetFolder("food-11/training/labeled", loader=lambda x: Image.open(x), extensions="jpg", transform=train_tfm1)
train_set2 = DatasetFolder("food-11/training/labeled", loader=lambda x: Image.open(x), extensions="jpg", transform=train_tfm2)
train_set3 = DatasetFolder("food-11/training/labeled", loader=lambda x: Image.open(x), extensions="jpg", transform=train_tfm3)
train_set = ConcatDataset([train_set1, train_set2, train_set3])
valid_set = DatasetFolder("food-11/validation", loader=lambda x: Image.open(x), extensions="jpg", transform=test_tfm)
unlabeled_set = DatasetFolder("food-11/training/unlabeled", loader=lambda x: Image.open(x), extensions="jpg", transform=train_tfm1)
test_set = DatasetFolder("food-11/testing", loader=lambda x: Image.open(x), extensions="jpg", transform=test_tfm)

# Construct data loaders.
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True, pin_memory=True)
valid_loader = DataLoader(valid_set, batch_size=batch_size, shuffle=True, pin_memory=True)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)
Colab took about 2.5 hours to run; the Kaggle score is as follows:

hard baseline
I did not reach it myself; I recommend this excellent implementation: https://github.com/1am9trash/Hung_Yi_Lee_ML_2021/blob/main/hw/hw3/hw3_code.ipynb
I tried a pre-trained ResNet and found it works very well: about 30 epochs of training is enough to pass the hard baseline.
The full code is in my GitHub repository; feel free to take it.
Hyperparameter Tuning Notes
https://github.com/lyfer233/deeplearning/blob/main/LiHongYi_ML2021Spring/HW3/煉丹總結和思路.md
