Image Classification in a Jupyter Notebook with PyTorch's Pretrained ResNet Model


A pretrained model is a neural network that has already been trained on a large benchmark dataset such as ImageNet.

Here we take an existing model from PyTorch's torchvision.models module, ResNet, and use it to predict the class of a single image.

1. Downloading the resources

First, download an arbitrary picture of a dog from the web.

For the class labels (IMAGENET1000), copy the mapping from https://blog.csdn.net/weixin_34304013/article/details/93708121 into an empty txt file and remove the outermost {} braces.
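Alternatively, the labels can be fetched directly; this is only a sketch, assuming the plain-text label file used in the official PyTorch Hub examples is still reachable at the URL below. Note that this file contains bare class names such as golden retriever, whereas the txt built from the CSDN dump keeps lines like "207: 'golden retriever',", which is the format shown in the outputs later in this post.

import urllib.request

# Download a 1000-line label file, one class name per line (line index = class id).
url = "https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt"
urllib.request.urlretrieve(url, "imagenet_classes.txt")

with open("imagenet_classes.txt") as f:
    labels = [line.strip() for line in f]
print(len(labels), labels[207])   # expect 1000 and 'golden retriever' (if the file is the usual one)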

 

2. Loading the pretrained ResNet model with TorchVision

2.1 Import the models module from torchvision and take a look at the available models and network architectures.

from torchvision import models
dir(models)
 1 ['AlexNet',
 2  'DenseNet',
 3  'GoogLeNet',
 4  'GoogLeNetOutputs',
 5  'Inception3',
 6  'InceptionOutputs',
 7  'MNASNet',
 8  'MobileNetV2',
 9  'ResNet',
10  'ShuffleNetV2',
11  'SqueezeNet',
12  'VGG',
13  '_GoogLeNetOutputs',
14  '_InceptionOutputs',
15  '__builtins__',
16  '__cached__',
17  '__doc__',
18  '__file__',
19  '__loader__',
20  '__name__',
21  '__package__',
22  '__path__',
23  '__spec__',
24  '_utils',
25  'alexnet',
26  'densenet',
27  'densenet121',
28  'densenet161',
29  'densenet169',
30  'densenet201',
31  'detection',
32  'googlenet',
33  'inception',
34  'inception_v3',
35  'mnasnet',
36  'mnasnet0_5',
37  'mnasnet0_75',
38  'mnasnet1_0',
39  'mnasnet1_3',
40  'mobilenet',
41  'mobilenet_v2',
42  'quantization',
43  'resnet',
44  'resnet101',
45  'resnet152',
46  'resnet18',
47  'resnet34',
48  'resnet50',
49  'resnext101_32x8d',
50  'resnext50_32x4d',
51  'segmentation',
52  'shufflenet_v2_x0_5',
53  'shufflenet_v2_x1_0',
54  'shufflenet_v2_x1_5',
55  'shufflenet_v2_x2_0',
56  'shufflenetv2',
57  'squeezenet',
58  'squeezenet1_0',
59  'squeezenet1_1',
60  'utils',
61  'vgg',
62  'vgg11',
63  'vgg11_bn',
64  'vgg13',
65  'vgg13_bn',
66  'vgg16',
67  'vgg16_bn',
68  'vgg19',
69  'vgg19_bn',
70  'video',
71  'wide_resnet101_2',
72  'wide_resnet50_2']

Note: The uppercase names are Python classes that implement many popular models. They differ in architecture, that is, in how the operations between input and output are arranged.

The lowercase names are functions that return models instantiated from those classes, sometimes with different parameter sets. For example, resnet101 returns a 101-layer ResNet instance, resnet18 an 18-layer one, and so on, as the short sketch below illustrates.
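A quick sketch of this naming convention:

from torchvision import models

# Uppercase: the architecture class; lowercase: factory functions that build instances of it.
net18 = models.resnet18()     # 18-layer ResNet, randomly initialized
net101 = models.resnet101()   # 101-layer ResNet, randomly initialized
print(isinstance(net18, models.ResNet), isinstance(net101, models.ResNet))   # True True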

2.2 Load the pretrained model and create an instance

resnet = models.resnet101(pretrained=True)

Note: The downloaded model file is cached in the user's cache directory, here C:\Users\Dell\.cache\torch\hub\checkpoints\resnet101-5d3b4d8f.pth.
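If the download is slow or you already have the checkpoint, a possible alternative (a sketch, assuming the .pth path above) is to build the architecture without weights and load the local file yourself:

import torch
from torchvision import models

# Build the ResNet-101 architecture without downloading weights,
# then load the locally cached checkpoint into it.
resnet = models.resnet101(pretrained=False)
state_dict = torch.load(r"C:\Users\Dell\.cache\torch\hub\checkpoints\resnet101-5d3b4d8f.pth")
resnet.load_state_dict(state_dict)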

The network structure can be printed as follows (by evaluating resnet in a cell, or calling print(resnet)):

  1 ResNet(
  2   (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  3   (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  4   (relu): ReLU(inplace=True)
  5   (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  6   (layer1): Sequential(
  7     (0): Bottleneck(
  8       (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
  9       (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 10       (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
 11       (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 12       (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
 13       (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 14       (relu): ReLU(inplace=True)
 15       (downsample): Sequential(
 16         (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
 17         (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 18       )
 19     )
 20     (1): Bottleneck(
 21       (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
 22       (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 23       (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
 24       (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 25       (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
 26       (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 27       (relu): ReLU(inplace=True)
 28     )
 29     (2): Bottleneck(
 30       (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
 31       (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 32       (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
 33       (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 34       (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
 35       (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 36       (relu): ReLU(inplace=True)
 37     )
 38   )
 39   (layer2): Sequential(
 40     (0): Bottleneck(
 41       (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
 42       (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 43       (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
 44       (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 45       (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
 46       (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 47       (relu): ReLU(inplace=True)
 48       (downsample): Sequential(
 49         (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
 50         (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 51       )
 52     )
 53     (1): Bottleneck(
 54       (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
 55       (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 56       (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
 57       (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 58       (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
 59       (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 60       (relu): ReLU(inplace=True)
 61     )
 62     (2): Bottleneck(
 63       (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
 64       (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 65       (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
 66       (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 67       (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
 68       (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 69       (relu): ReLU(inplace=True)
 70     )
 71     (3): Bottleneck(
 72       (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
 73       (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 74       (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
 75       (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 76       (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
 77       (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 78       (relu): ReLU(inplace=True)
 79     )
 80   )
 81   (layer3): Sequential(
 82     (0): Bottleneck(
 83       (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
 84       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 85       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
 86       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 87       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
 88       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 89       (relu): ReLU(inplace=True)
 90       (downsample): Sequential(
 91         (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)
 92         (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 93       )
 94     )
 95     (1): Bottleneck(
 96       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
 97       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
 98       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
 99       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
100       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
101       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
102       (relu): ReLU(inplace=True)
103     )
104     (2): Bottleneck(
105       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
106       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
107       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
108       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
109       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
110       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
111       (relu): ReLU(inplace=True)
112     )
113     (3): Bottleneck(
114       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
115       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
116       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
117       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
118       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
119       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
120       (relu): ReLU(inplace=True)
121     )
122     (4): Bottleneck(
123       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
124       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
125       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
126       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
127       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
128       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
129       (relu): ReLU(inplace=True)
130     )
131     (5): Bottleneck(
132       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
133       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
134       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
135       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
136       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
137       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
138       (relu): ReLU(inplace=True)
139     )
140     (6): Bottleneck(
141       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
142       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
143       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
144       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
145       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
146       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
147       (relu): ReLU(inplace=True)
148     )
149     (7): Bottleneck(
150       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
151       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
152       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
153       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
154       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
155       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
156       (relu): ReLU(inplace=True)
157     )
158     (8): Bottleneck(
159       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
160       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
161       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
162       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
163       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
164       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
165       (relu): ReLU(inplace=True)
166     )
167     (9): Bottleneck(
168       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
169       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
170       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
171       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
172       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
173       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
174       (relu): ReLU(inplace=True)
175     )
176     (10): Bottleneck(
177       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
178       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
179       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
180       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
181       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
182       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
183       (relu): ReLU(inplace=True)
184     )
185     (11): Bottleneck(
186       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
187       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
188       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
189       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
190       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
191       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
192       (relu): ReLU(inplace=True)
193     )
194     (12): Bottleneck(
195       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
196       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
197       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
198       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
199       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
200       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
201       (relu): ReLU(inplace=True)
202     )
203     (13): Bottleneck(
204       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
205       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
206       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
207       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
208       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
209       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
210       (relu): ReLU(inplace=True)
211     )
212     (14): Bottleneck(
213       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
214       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
215       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
216       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
217       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
218       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
219       (relu): ReLU(inplace=True)
220     )
221     (15): Bottleneck(
222       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
223       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
224       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
225       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
226       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
227       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
228       (relu): ReLU(inplace=True)
229     )
230     (16): Bottleneck(
231       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
232       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
233       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
234       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
235       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
236       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
237       (relu): ReLU(inplace=True)
238     )
239     (17): Bottleneck(
240       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
241       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
242       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
243       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
244       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
245       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
246       (relu): ReLU(inplace=True)
247     )
248     (18): Bottleneck(
249       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
250       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
251       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
252       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
253       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
254       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
255       (relu): ReLU(inplace=True)
256     )
257     (19): Bottleneck(
258       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
259       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
260       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
261       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
262       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
263       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
264       (relu): ReLU(inplace=True)
265     )
266     (20): Bottleneck(
267       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
268       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
269       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
270       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
271       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
272       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
273       (relu): ReLU(inplace=True)
274     )
275     (21): Bottleneck(
276       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
277       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
278       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
279       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
280       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
281       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
282       (relu): ReLU(inplace=True)
283     )
284     (22): Bottleneck(
285       (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
286       (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
287       (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
288       (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
289       (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
290       (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
291       (relu): ReLU(inplace=True)
292     )
293   )
294   (layer4): Sequential(
295     (0): Bottleneck(
296       (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
297       (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
298       (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
299       (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
300       (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
301       (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
302       (relu): ReLU(inplace=True)
303       (downsample): Sequential(
304         (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
305         (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
306       )
307     )
308     (1): Bottleneck(
309       (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
310       (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
311       (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
312       (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
313       (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
314       (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
315       (relu): ReLU(inplace=True)
316     )
317     (2): Bottleneck(
318       (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
319       (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
320       (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
321       (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
322       (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
323       (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
324       (relu): ReLU(inplace=True)
325     )
326   )
327   (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
328   (fc): Linear(in_features=2048, out_features=1000, bias=True)
329 )

2.3 Image preprocessing

1 from torchvision import transforms
2 preprocess = transforms.Compose([
3     transforms.Resize(256),
4     transforms.CenterCrop(224),
5     transforms.ToTensor(),
6     transforms.Normalize(
7         mean=[0.485, 0.456, 0.406],
8         std=[0.229, 0.224, 0.225]
9     )])

Note: The transforms utilities in the TorchVision module are used to preprocess the input image.

Line 2: defines a variable that is the composition of all image transforms applied to the input image.

Line 3: resizes the image so that its shorter side is 256 pixels.

Line 4: crops a 224×224-pixel region from the center of the image.

Line 5: converts the image to a PyTorch tensor.

Lines 6-8: normalize the image with the specified per-channel mean and standard deviation; each channel becomes (x - mean) / std (a small sketch below makes this concrete).
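A minimal sketch of what Normalize does, using a made-up tensor in place of a real image:

import torch
from torchvision import transforms

x = torch.rand(3, 224, 224)    # a stand-in image tensor with values in [0, 1]
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
manual = (x - mean) / std      # per-channel (x - mean) / std, written out by hand

auto = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])(x)
print(torch.allclose(manual, auto))   # True: both ways give the same result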

2.4 Load the image and preprocess it into the input form the model expects

from PIL import Image
img = Image.open("C:/Users/Dell/Pictures/dog.jpg")
img

The image looks like this:

 

Note: PIL is the Python Imaging Library (nowadays provided by its fork, Pillow); it offers powerful image-processing capabilities from within the Python interpreter.

import torch
img_t = preprocess(img)
batch_t = torch.unsqueeze(img_t, 0)

Note: unsqueeze adds an extra dimension to the image tensor. A single image has only 3 dimensions, but the model expects a 4-dimensional tensor as input, i.e. a batch of images rather than a single one.

After this step, batch_t also represents a batch of images, just one that happens to contain a single image.
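Incidentally, indexing with None is an equivalent way of adding the batch dimension (a small check, assuming the cell above has been run):

import torch
print(torch.equal(batch_t, img_t[None]))   # True: img_t[None] adds the same leading dimension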

Check the dimensions of img_t and batch_t:

img_t.size()
torch.Size([3, 224, 224])

batch_t.size()
torch.Size([1, 3, 224, 224])

2.5 Model inference

Use the pretrained model to see what it thinks the image is. First put the model in eval mode (so that layers such as batch normalization and dropout switch to their inference behavior), then run inference.

resnet.eval()
out = resnet(batch_t)
out.size()

torch.Size([1, 1000])

Note: out is a 2-D tensor with 1 row and 1000 columns.

As mentioned above, the model expects a batch of images as input; if we fed it 5 images, out would have 5 rows and 1000 columns, the columns corresponding to the 1000 classes.

In other words, each row corresponds to one image and each of the 1000 columns to one class, holding a score for that class; the 1000 entries in a row indicate how likely the model thinks that image belongs to each class (a short sketch below demonstrates this).
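To make that concrete, here is a minimal sketch that feeds the model a batch of 5 copies of the same preprocessed image, wrapping the forward pass in torch.no_grad(), which avoids building the autograd graph during pure inference. It assumes the cells above have been run.

import torch

batch5 = batch_t.repeat(5, 1, 1, 1)   # shape [5, 3, 224, 224]: a batch of 5 identical images

with torch.no_grad():                 # gradients are not needed for inference
    out5 = resnet(batch5)

print(out5.shape)                     # torch.Size([5, 1000]): one row of 1000 scores per image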

 

Next we need the class-label file imagenet_classes.txt prepared at the beginning; read the labels from the text file and store them.

with open('C:/Users/Dell/Desktop/imagenet_classes.txt') as f:
    classes = [line.strip() for line in f.readlines()]

Note: classes is a list of 1000 class-name strings (the ImageNet classification task has 1000 classes).

The line number determines the class index, so the order must not be changed (a quick check is sketched below).
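A quick check that the file was prepared correctly; the expected strings assume the CSDN dump format described in step 1:

print(len(classes))   # expect 1000
print(classes[207])   # expect "207: 'golden retriever'," -- the line index doubles as the class id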

 

Next, find the position of the maximum score in the output tensor out and use that index to obtain the prediction.

That is, index the position of the largest predicted value:

_, index = torch.max(out, 1)
index

tensor([207])

Note: torch.max(out, 1) returns, for each row of out, the maximum value together with its column index; index is a tensor holding the index of each row's maximum (see the toy example below).
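A toy example (made-up numbers) of what torch.max(out, 1) returns:

import torch

scores = torch.tensor([[0.1, 2.5, 0.3],
                       [1.7, 0.2, 0.9]])
values, idx = torch.max(scores, 1)   # maximum over dim 1, i.e. per row
print(values)                        # tensor([2.5000, 1.7000])
print(idx)                           # tensor([1, 0]): column index of each row's maximum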

 

Next, convert the raw scores into probabilities.

1 percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100
2 classes[index[0]], percentage[index[0]].item()
1 ("207: 'golden retriever',", 41.10982131958008)

The model predicts that the image is a golden retriever, with a confidence of 41.11%.

Note:

Line 1: applies softmax (a commonly used normalized exponential function) to each row of out, then takes the first row and multiplies every element by 100, giving the confidence that the dog in this example belongs to each of the classes.

Line 2: prints the class name and its confidence.

classes[index[0]] is the class name with the highest confidence. In classes[index[0]], index[0] is the index of the first row's maximum, i.e. the highest-confidence class for the first image; index[1] would be the second image's, index[2] the third's, and so on. This is why the order of the classes list must not be changed.

In percentage[index[0]].item(), index[0] has the same meaning as above, percentage[index[0]] is the highest-confidence entry, and .item() extracts its value as a plain Python number. (A quick sanity check of the softmax step is sketched below.)
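A sanity check of the softmax step (assuming out and index from the cells above): each row of the softmax output sums to 1, so multiplying by 100 yields percentages.

import torch

probs = torch.nn.functional.softmax(out, dim=1)   # shape [1, 1000]
print(probs.sum(dim=1))                           # tensor([1.0000]) up to floating-point error
print((probs[0, index[0]] * 100).item())          # should reproduce the 41.1% figure above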

 

Finally, look at which other classes the model assigns some confidence to.

_, indices = torch.sort(out, descending=True)
[(classes[idx], percentage[idx].item()) for idx in indices[0][:5]]

[("207: 'golden retriever',", 41.10982131958008),
 ("151: 'Chihuahua',", 20.436004638671875),
 ("154: 'Pekinese, Pekingese, Peke',", 8.29426097869873),
 ("852: 'tennis ball',", 7.233486175537109),
 ("259: 'Pomeranian',", 5.713674068450928)]

Note:

torch.sort sorts out, by default along the last dimension (each row here); descending=True sorts in decreasing order.

In [(classes[idx], percentage[idx].item()) for idx in indices[0][:5]], the expression indices[0][:5] yields the first 5 elements of the first row of indices, i.e. the indices of the 5 highest-confidence classes (torch.topk, sketched below, is a more direct alternative).
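torch.topk performs the same top-5 lookup in one call and avoids sorting the whole row; a sketch assuming out, classes and percentage from the cells above:

import torch

values, idxs = torch.topk(out, k=5, dim=1)               # top-5 scores and their column indices
[(classes[i], percentage[i].item()) for i in idxs[0]]    # the same five (class, confidence) pairs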

 

References: 使用PyTorch中的預訓練模型進行圖像分類 (u013679159, CSDN blog)

           PyTorch預訓練模型圖像分類之一 (zhangzhifu2019, CSDN blog; covers loading a model from a local .pth file)

