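I wanted to try replacing the final fully connected layer of resnet18 with a convolutional layer, to see whether it has any effect on the network's performance or size.
1. First, the change made in train.py:
The train.py code can be found in the post "Gender detection with PyTorch" (pytorch實現性別檢測).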
# model_conv.fc = nn.Linear(fc_features, 2)  # the previous version
model_conv.fc = nn.Conv2d(fc_features, 2, 1)
print(model_conv.fc)
But running it produced errors:
1)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [2, 512, 1, 1], but got 2-dimensional input of size [4, 512] instead
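[2, 512, 1, 1] is the shape of the convolution weight, [out_channels, in_channels, kernel_height, kernel_width], so the layer expects a 4-dimensional input of [batch_size, channels, height, width]. What it actually received is the flattened tensor [4, 512], i.e. [batch_size, features].
This happens because the tensor is flattened just before it is passed to the fc layer: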
x = x.view(x.size(0), -1)
Open the corresponding code in /anaconda3/envs/deeplearning/lib/python3.6/site-packages/torchvision/models/resnet.py and comment that line out.
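The shape mismatch can be reproduced without the full network. The following is a minimal sketch, assuming random tensors in place of the real resnet18 features; `conv_head`, `flat`, and `pooled` are illustrative names, not taken from train.py.

```python
import torch
import torch.nn as nn

# 1x1 convolution standing in for the replaced fc layer (512 features -> 2 classes)
conv_head = nn.Conv2d(512, 2, kernel_size=1)

# What the layer receives after x.view(x.size(0), -1): [batch_size, features]
flat = torch.randn(4, 512)
try:
    conv_head(flat)
    flat_failed = False
except RuntimeError as err:
    flat_failed = True
    print("flattened input rejected:", err)

# What avgpool produces before the flatten: [batch_size, channels, 1, 1]
pooled = torch.randn(4, 512, 1, 1)
out = conv_head(pooled)
print(out.shape)
```

The exact wording of the error depends on the PyTorch version, but the cause is the same: a batched Conv2d input needs four dimensions.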
2)
Traceback (most recent call last):
  File "train.py", line 192, in <module>
    model_train = train_model(model_conv, criterion, optimizer_conv, exp_lr_scheduler)
  File "train.py", line 135, in train_model
    loss = criterion(outputs, labels)
  File "/anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 904, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/nn/functional.py", line 1970, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/nn/functional.py", line 1792, in nll_loss
    ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: invalid argument 3: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1 at /Users/soumith/b101_2/2019_02_08/wheel_build_dirs/wheel_3.6/pytorch/aten/src/THNN/generic/SpatialClassNLLCriterion.c:59
To investigate, first print the outputs and the labels:
print(outputs,outputs.shape)
print(labels, labels.shape)
This prints:
tensor([[[[-0.8409]], [[ 0.3311]]], [[[-0.3910]], [[ 0.6904]]], [[[-0.4417]], [[ 0.3846]]], [[[-1.1002]], [[ 0.6044]]]], grad_fn=<ThnnConv2DBackward>) torch.Size([4, 2, 1, 1])
tensor([1, 1, 0, 0]) torch.Size([4])
As you can see, the output does not have the shape we ultimately want: channel*height*width = 2*1*1 needs to be collapsed to 2, giving an output of shape [4, 2].
Later on, the training code runs:
_, preds = torch.max(outputs, 1)
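which returns the index of the larger of the two values, so preds ends up with shape [4], matching labels.
The fix is to add one line to resnet.py right after the fc layer: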
x = x.view(x.size(0), -1)
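The effect of this extra flatten on the loss and on the predictions can be checked in isolation. This is a rough sketch, assuming random values in place of the real network outputs; the variable names mirror the ones used in train.py, but the tensors here are made up for illustration.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
outputs = torch.randn(4, 2, 1, 1)    # shape produced by the conv fc layer
labels = torch.tensor([1, 1, 0, 0])  # one class index per sample, shape [4]

# With 4-D outputs the loss expects spatial targets of shape [4, 1, 1],
# so passing labels of shape [4] is rejected.
try:
    criterion(outputs, labels)
except (RuntimeError, ValueError) as err:
    print("unflattened outputs rejected:", err)

# Flattening everything except the batch dimension gives [4, 2]
flat_outputs = outputs.view(outputs.size(0), -1)
loss = criterion(flat_outputs, labels)

# Index of the larger of the two class scores for each sample
_, preds = torch.max(flat_outputs, 1)
print(flat_outputs.shape, preds.shape, loss.item())
```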
With both changes, the forward() function of the resnet should end up as:
def forward(self, x):
    x = self.conv1(x)
    x = self.bn1(x)
    x = self.relu(x)
    x = self.maxpool(x)
    x = self.layer1(x)
    x = self.layer2(x)
    x = self.layer3(x)
    x = self.layer4(x)
    x = self.avgpool(x)
    # x = x.view(x.size(0), -1)  # original flatten before fc, now commented out
    x = self.fc(x)
    x = x.view(x.size(0), -1)  # flatten [N, 2, 1, 1] to [N, 2] after the conv fc
    return x
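Editing resnet.py inside site-packages changes torchvision for every project in that environment. A possible alternative, which is not what this post did, is to leave resnet.py untouched and do the reshaping inside the replacement layer itself. The sketch below assumes a PyTorch version that provides nn.Unflatten and nn.Flatten; `conv_fc` is a hypothetical name, and the head is tried on a random [4, 512] tensor rather than on the real model.

```python
import torch
import torch.nn as nn

fc_features = 512  # in_features of resnet18's original fc layer

# Drop-in head: accepts the already flattened [N, 512] features,
# restores the 1x1 spatial dimensions, applies the 1x1 convolution,
# then flattens back to [N, 2] for the loss.
conv_fc = nn.Sequential(
    nn.Unflatten(1, (fc_features, 1, 1)),      # [N, 512] -> [N, 512, 1, 1]
    nn.Conv2d(fc_features, 2, kernel_size=1),  # -> [N, 2, 1, 1]
    nn.Flatten(),                              # -> [N, 2]
)

# Assumed usage with the unmodified torchvision model:
# model_conv.fc = conv_fc

features = torch.randn(4, fc_features)  # stand-in for the flattened avgpool output
logits = conv_fc(features)
print(logits.shape)
```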
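2. After that, training runs without errors. In my case, however, the results were not noticeably different, and the trained network was about the same size.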
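One likely explanation for the similar size and results is that a 1x1 convolution applied to a 1x1 feature map is the same computation as a linear layer, with the same number of parameters. The check below is a small sketch using standalone layers rather than the trained model; `count_params` is a helper defined here for illustration.

```python
import torch
import torch.nn as nn

linear = nn.Linear(512, 2)
conv = nn.Conv2d(512, 2, kernel_size=1)

def count_params(module):
    """Total number of elements across all parameters of a module."""
    return sum(p.numel() for p in module.parameters())

linear_params = count_params(linear)  # 512 * 2 weights + 2 biases
conv_params = count_params(conv)      # 2 * 512 * 1 * 1 weights + 2 biases
print(linear_params, conv_params)     # 1026 1026

# Copy the linear weights into the convolution and compare the outputs
with torch.no_grad():
    conv.weight.copy_(linear.weight.view(2, 512, 1, 1))
    conv.bias.copy_(linear.bias)

x = torch.randn(4, 512)
same = torch.allclose(linear(x), conv(x.view(4, 512, 1, 1)).view(4, 2), atol=1e-5)
print(same)
```

The swap would only change the parameter count if the convolution had a kernel larger than 1x1 or the feature map entering it were larger than 1x1.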