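I. Introduction
The bvlc_reference_caffenet model is adapted from the AlexNet model. It takes a 227x227x3 input image and outputs the probability of that image belonging to each of 1000 classes.
Reference: caffe/models/bvlc_reference_caffenet at master · BVLC/caffe · GitHub https://github.com/BVLC/caffe/tree/master/models/bvlc_reference_caffenet
II. Visualizing the network structure with pycaffe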
caffe/python$ python draw_net.py ../models/bvlc_reference_caffenet/deploy.prototxt deploy.png
Network structure:
Full-size image download: link: https://pan.baidu.com/s/1ggeKlLstZQrOklvnZ03L5A password: x7r8
III. Visualization in MATLAB
1. Visualizing network weights: https://www.cnblogs.com/smbx-ztbz/p/9343874.html
2. Visualizing feature maps
(1)visualize_feature_maps.m
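function [] = visualize_feature_maps(w, s)
% Tile the channels of a feature-map blob into one square grid image.
%   w: blob data, width x height x channels x num
%   s: spacing in pixels between neighboring tiles
h = max(size(w, 1), size(w, 2));
g = h + s;
c = size(w, 3);
cv = ceil(sqrt(c));   % lay the tiles out on a square grid; ceil rounds up
W = zeros(g*cv, g*cv);
for u = 1:cv
    for v = 1:cv
        tw = zeros(h, h);
        idx = (u-1)*cv + v;
        if idx <= c
            tw = w(:, :, idx, 1)';   % visualize only the first sample (4th dimension = 1)
            tw = tw - min(tw(:));
            mx = max(tw(:));
            if mx > 0                % avoid dividing by zero on a constant map
                tw = tw / mx * 255;
            end
        end
        W(g*(u-1) + (1:h), g*(v-1) + (1:h)) = tw;
    end
end
W = uint8(W);
figure, imshow(W);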
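The tiling arithmetic in visualize_feature_maps.m can be checked by hand. The following is a minimal Python sketch of the same layout logic, not a port of the MATLAB function: the helper names grid_layout, normalize_to_255 and tile_origin are hypothetical, and indices are 0-based here while the MATLAB code is 1-based.

```python
import math

def grid_layout(height, width, channels, spacing):
    """Return (tiles_per_side, canvas_side) for a square grid of feature maps."""
    tile = max(height, width)                  # like h = max(size(w,1), size(w,2))
    cell = tile + spacing                      # one tile plus the gap after it
    per_side = math.ceil(math.sqrt(channels))  # smallest square grid that fits all channels
    return per_side, cell * per_side

def normalize_to_255(fmap):
    """Shift a 2-D map so its minimum is 0, then scale its maximum to 255."""
    lo = min(min(row) for row in fmap)
    shifted = [[v - lo for v in row] for row in fmap]
    hi = max(max(row) for row in shifted)
    if hi == 0:                                # constant map: keep it all zeros
        return shifted
    return [[v / hi * 255 for v in row] for row in shifted]

def tile_origin(index, per_side, cell):
    """Top-left (row, col) of the tile for a 0-based channel index, row by row."""
    return (index // per_side) * cell, (index % per_side) * cell

# conv1 produces 96 maps of 55x55; with a 1-pixel gap they fit a 10x10 grid.
print(grid_layout(55, 55, 96, 1))
print(normalize_to_255([[1.0, 3.0], [2.0, 5.0]]))
print(tile_origin(11, 10, 56))
```

With 96 channels the grid has 100 cells, so the last four tiles stay black, which matches the zero-initialized tw in the MATLAB loop.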
(2)fm_visual.m
clear; clc; close all;
addpath('matlab');
caffe.set_mode_cpu();
fprintf('Caffe Version = %s\n', caffe.version());

net = caffe.Net('models/bvlc_reference_caffenet/deploy.prototxt', ...
    'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel', 'test');
fprintf('Load net done. Net layers:\n');
disp(net.layer_names);
fprintf('Net blobs:\n');
disp(net.blob_names);

fprintf('Now preparing data...\n');
im = imread('examples/images/cat.jpg');
figure; imshow(im); title('Original Image');
d = load('matlab/+caffe/imagenet/ilsvrc_2012_mean.mat');
mean_data = d.mean_data;   % 256x256x3
IMAGE_DIM = 256;
CROPPED_DIM = 227;

% Convert an image returned by MATLAB's imread to im_data in Caffe's data
% format: W x H x C with BGR channels
im_data = im(:, :, [3, 2, 1]);            % permute channels from RGB to BGR
im_data = permute(im_data, [2, 1, 3]);    % flip width and height
im_data = single(im_data);                % convert from uint8 to single
im_data = imresize(im_data, [IMAGE_DIM IMAGE_DIM], 'bilinear');   % resize to match mean_data
im_data = im_data - mean_data;            % subtract mean_data (already in W x H x C, BGR)
im = imresize(im_data, [CROPPED_DIM CROPPED_DIM], 'bilinear');    % resize to the network input size

km = cat(4, im, im, im, im, im);   % stack along the 4th dimension: 227x227x3x5
pm = cat(4, km, km);               % stack again along the 4th dimension: 227x227x3x10
input_data = {pm};                 % the input is 10 copies of the same image

scores = net.forward(input_data);  % cell holding a 1000x10 matrix (10 input samples)
scores = scores{1};                % take the matrix out of the first cell
scores = mean(scores, 2);          % average the scores over the 10 samples
[~, maxlabel] = max(scores);       % index of the highest mean probability (282 here)
maxlabel                           % display the index of the most probable class
figure; plot(scores);

fm_data = net.blob_vec(1);         % input data blob
d1 = fm_data.get_data();
fprintf('Data size = \n');
size(d1)                           % 227x227x3x10
visualize_feature_maps(d1, 1);

fm_conv1 = net.blob_vec(2);
f1 = fm_conv1.get_data();
fprintf('Feature map conv1 size = \n');
% kernel_size: 11, stride: 4, pad: 0 (pad 0 means the border is not extended)
size(f1)                           % 55x55x96x10
visualize_feature_maps(f1, 1);

fm_conv2 = net.blob_vec(5);
f2 = fm_conv2.get_data();
fprintf('Feature map conv2 size = \n');
% kernel_size: 5, stride: 1, pad: 2. The stride really is 1: the drop from
% 55 to 27 comes from pool1 (kernel 3, stride 2) in front of conv2.
size(f2)                           % 27x27x256x10
visualize_feature_maps(f2, 1);

fm_conv3 = net.blob_vec(8);
f3 = fm_conv3.get_data();
fprintf('Feature map conv3 size = \n');
% kernel_size: 3, stride: 1, pad: 1. The drop from 27 to 13 comes from
% pool2 (kernel 3, stride 2) in front of conv3.
size(f3)                           % 13x13x384x10
visualize_feature_maps(f3, 1);

fm_conv4 = net.blob_vec(9);
f4 = fm_conv4.get_data();
fprintf('Feature map conv4 size = \n');
% kernel_size: 3, stride: 1, pad: 1
size(f4)                           % 13x13x384x10
visualize_feature_maps(f4, 1);

fm_conv5 = net.blob_vec(10);
f5 = fm_conv5.get_data();
fprintf('Feature map conv5 size = \n');
% kernel_size: 3, stride: 1, pad: 1
size(f5)                           % 13x13x256x10
visualize_feature_maps(f5, 1);
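The feature map sizes printed by fm_visual.m (55, 27, 13) can be reproduced with the usual output-size formulas. The sketch below is a rough check under stated assumptions: the convolution parameters are the ones quoted in the script comments, while the pooling parameters (kernel 3, stride 2) are assumed from the standard deploy.prototxt and do not appear in the script. The helper names are hypothetical.

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution layer (the division rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride):
    """Spatial output size of an unpadded pooling layer (the division rounds up)."""
    return math.ceil((size - kernel) / stride) + 1

def caffenet_sizes(side=227):
    """Walk the spatial size from the input up to conv5."""
    sizes = {}
    side = sizes["conv1"] = conv_out(side, kernel=11, stride=4, pad=0)
    side = sizes["pool1"] = pool_out(side, kernel=3, stride=2)   # assumed parameters
    side = sizes["conv2"] = conv_out(side, kernel=5, stride=1, pad=2)
    side = sizes["pool2"] = pool_out(side, kernel=3, stride=2)   # assumed parameters
    side = sizes["conv3"] = conv_out(side, kernel=3, stride=1, pad=1)
    side = sizes["conv4"] = conv_out(side, kernel=3, stride=1, pad=1)
    side = sizes["conv5"] = conv_out(side, kernel=3, stride=1, pad=1)
    return sizes

for name, side in caffenet_sizes().items():
    print(name, side)
```

Under these assumptions conv2 and conv3 keep the size of their input, because the padding offsets the kernel at stride 1. The halving between conv1, conv2 and conv3 happens in the pooling layers, so the stride of 1 in the comments is consistent with the printed sizes.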
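(3) Notes
a. scores holds the probabilities of the input image for each of the 1000 classes, and maxlabel is the index of the largest probability, that is, the class the input image is assigned to. For this image the index of the largest probability is 282.
b. The table that maps class indices to class names can be downloaded and unpacked with data/ilsvrc12/get_ilsvrc_aux.sh. It is stored in synset_words.txt, where the line number is the class index.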
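The line-number lookup from note b can be sketched in a few lines. This is a minimal example that assumes each line of synset_words.txt has the layout "WordNet ID, space, label". The two sample lines stand in for the real 1000-line file, and the helper names are hypothetical.

```python
def parse_synset_line(line):
    """Split a line such as 'n02123045 tabby, tabby cat' into (wnid, label)."""
    wnid, label = line.rstrip("\n").split(" ", 1)
    return wnid, label

def label_for_index(lines, one_based_index):
    """Return the label on the given line; the index is 1-based, like MATLAB's maxlabel."""
    return parse_synset_line(lines[one_based_index - 1])[1]

def load_lines(path="data/ilsvrc12/synset_words.txt"):
    """Read the whole file; the default path is relative to the Caffe root."""
    with open(path, "r") as handle:
        return handle.readlines()

# Two sample lines in the assumed layout, used in place of the real file.
sample = ["n02123045 tabby, tabby cat\n", "n02123159 tiger cat\n"]
print(label_for_index(sample, 1))
```

With the real file the call would be label_for_index(load_lines(), 282). Splitting on the first space avoids relying on the WordNet ID always being 9 characters long.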
IV. Predicting the class of an input image
clear; clc; close all;
addpath('matlab');
caffe.set_mode_cpu();
fprintf('Caffe Version = %s\n', caffe.version());

net = caffe.Net('models/bvlc_reference_caffenet/deploy.prototxt', ...
    'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel', 'test');

im = imread('examples/images/cat.jpg');
% figure; imshow(im); title('Original Image');
d = load('matlab/+caffe/imagenet/ilsvrc_2012_mean.mat');
mean_data = d.mean_data;   % 256x256x3
IMAGE_DIM = 256;
CROPPED_DIM = 227;

% Convert an image returned by MATLAB's imread to im_data in Caffe's data
% format: W x H x C with BGR channels
im_data = im(:, :, [3, 2, 1]);            % permute channels from RGB to BGR
im_data = permute(im_data, [2, 1, 3]);    % flip width and height
im_data = single(im_data);                % convert from uint8 to single
im_data = imresize(im_data, [IMAGE_DIM IMAGE_DIM], 'bilinear');   % resize to match mean_data
im_data = im_data - mean_data;            % subtract mean_data (already in W x H x C, BGR)
im = imresize(im_data, [CROPPED_DIM CROPPED_DIM], 'bilinear');    % resize to the network input size

km = cat(4, im, im, im, im, im);   % stack along the 4th dimension: 227x227x3x5
pm = cat(4, km, km);               % stack again along the 4th dimension: 227x227x3x10
input_data = {pm};                 % the input is 10 copies of the same image

scores = net.forward(input_data);  % cell holding a 1000x10 matrix (10 input samples)
scores = scores{1};                % take the matrix out of the first cell
scores = mean(scores, 2);          % average the scores over the 10 samples
[~, maxlabel] = max(scores);       % index of the highest mean probability
maxlabel                           % display the index of the most probable class
figure; plot(scores);

% Print the label string for the predicted class
ffid = fopen('data/ilsvrc12/synset_words.txt', 'r');
for i = 1:1000
    tline = fgetl(ffid);
    if i == maxlabel
        break;
    end
end
fclose(ffid);
label_string = tline(11:end);      % skip the 9-character WordNet ID and the space after it
sprintf('predict value is: %s\n', label_string)
sprintf('probability is: %f\n', scores(maxlabel))
Output:
maxlabel = 282
ans = predict value is: tabby, tabby cat
ans = probability is: 0.288967
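The reduction from the 1000x10 score matrix to a single class index can be illustrated on a toy matrix. This is a small sketch with made-up numbers (3 classes, 2 samples) and hypothetical helper names. It returns a 1-based index to match MATLAB's maxlabel.

```python
def mean_over_batch(scores):
    """scores[k][n] is the score of class k for sample n; average over the samples."""
    return [sum(row) / len(row) for row in scores]

def argmax_one_based(values):
    """Index of the largest value, counted from 1 as in MATLAB."""
    best = max(range(len(values)), key=lambda i: values[i])
    return best + 1

# Toy scores: 3 classes (rows) by 2 samples (columns), values made up for illustration.
scores = [[0.25, 0.25],
          [0.5, 0.75],
          [0.25, 0.0]]
means = mean_over_batch(scores)
print(means)
print(argmax_one_based(means))
```

Because the script feeds ten identical copies of the image, each column of the real score matrix should be the same, so the mean is expected to equal any single column. Averaging only changes the result when the ten inputs are different crops.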
You can run the same test on other images, for example a panda picture downloaded from the web.
References:
The role of pad in Caffe - CSDN blog https://blog.csdn.net/xunan003/article/details/79110253
Comparison with AlexNet: Caffe study notes (2), the AlexNet model - CSDN blog https://blog.csdn.net/hong__fang/article/details/52080280
[AlexNet] A guide to model training and testing - CSDN blog https://blog.csdn.net/xiequnyi/article/details/52276240?locationNum=5
Training and testing on your own data in Caffe - CSDN blog https://blog.csdn.net/qqlu_did/article/details/47131549