To evaluate on the Caltech dataset you need to use the dbEval.m code from the official toolkit (code3.2.1). I didn't know this for a long time and kept plotting the log average miss rate curve myself, so the numbers I got were very high, the results looked poor, and I couldn't explain why... until I found this explanation on GitHub:
https://github.com/zhaoweicai/mscnn/issues/63
I finally managed to work this out... the devkit is so user-unfriendly
steps (the working dir of all the following steps is the folder of the devkit):
- I used the python script to generate detection results in #4 . The provided run_mscnn_detection.m was unreasonably slow for me
- create a folder "data-USA", and put the "annotations" folder of caltech in it (copy/soft-link/whatever)
- create a folder "data-USA/res, and place the unzipped results from other algorithms here (http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/datasets/USA/res/)
- put your own results under res following the format of the official results from other algorithms
- run dbEval, and a folder named "results" will be created to store the generated graphs
FYI: the devkit evaluates 1 based 30,60,90... frames, so in python they are 29, 59, 89... and it is already well handled
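To make the quoted steps concrete, below is a small sketch of the folder layout they describe, plus the frame-index remark at the end, written out in Python. The algorithm folder name "MyDetector" is only a placeholder, not something from the devkit.

# Layout implied by the steps above (inside the devkit folder):
#
#   data-USA/
#     annotations/              <- Caltech annotations (copied or soft-linked)
#     res/
#       <OtherAlgorithm>/ ...   <- unzipped official results from other algorithms
#       MyDetector/ ...         <- your own results, in the same format
#
# dbEval then writes its graphs into a "results" folder.
#
# The devkit evaluates every 30th frame with 1-based indices (30, 60, 90, ...),
# so the matching 0-based Python indices are 29, 59, 89, ...
frames_matlab = list(range(30, 301, 30))        # 1-based frame numbers (first ten)
frames_python = [f - 1 for f in frames_matlab]  # the same frames, 0-based
print(frames_matlab)  # [30, 60, 90, ..., 300]
print(frames_python)  # [29, 59, 89, ..., 299]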
Pedestrian detection: FPPI and miss rate
FPPI/FPPW
Miss Rate = (number of positive samples in the test set that are classified as negative) / (number of detected positives plus the number of missed positives), i.e. divided by the total number of ground-truth instances.
Pedestrian detection: A benchmark
FPPI (false positives per image) and FPPW (false positives per window) both focus on how often false positives (FP) occur.
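As a reference for these definitions, here is a minimal sketch of how miss rate and FPPI would be computed once the detections have already been matched against the ground truth (the counts below are made-up example numbers):

def miss_rate_and_fppi(num_true_positives, num_false_positives,
                       num_ground_truth, num_images):
    # miss rate: fraction of ground-truth boxes that were not detected
    # FPPI: average number of false positives per image
    false_negatives = num_ground_truth - num_true_positives
    miss_rate = false_negatives / float(num_ground_truth)
    fppi = num_false_positives / float(num_images)
    return miss_rate, fppi

# e.g. 900 of 1000 ground-truth boxes matched, 50 false positives over 500 images
print(miss_rate_and_fppi(900, 50, 1000, 500))  # (0.1, 0.1)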
I spent most of this week studying the log Average Miss Rate - FPPI curve, and the results came out yesterday. Here I summarize the useful things I looked up, so it won't be such a struggle the next time I need them.
(I) The function used to plot the log Average Miss Rate - FPPI curve and its arguments
Figure 0: log Average Miss Rate - FPPI curve
The log Average Miss Rate - FPPI curve is the metric used to measure how well a pedestrian detector performs; the lower the curve, the better the detector.
For plotting the log Average Miss Rate - FPPI curve, MATLAB 2017 provides a dedicated function; its detailed usage is given in the official MATLAB documentation.
% detectionResults: the bounding boxes your detector produced for each image and
%                   the score of each bbox, see Figure 1
% trainingData:     the ground-truth boxes labelled for each image, see Figure 2
[am, fppi, missRate] = evaluateDetectionMissRate(detectionResults, trainingData)
Figure 1-1: What detectionResults looks like
Figure 1-2: Boxes can hold a matrix of type double
Figure 1-3: Scr holds the scores matching Boxes one-to-one; Scr is also of type double
Figure 2: What trainingData looks like (the 3x4 double here is the same kind of matrix as in the figures above)
Once you have this data, pass it to evaluateDetectionMissRate() to obtain the result, and then plot the final curve from the miss rate and FPPI:
figure
loglog(fppi, missRate);
grid on
title(sprintf('log Average Miss Rate = %.5f', am))
My complete code is:

[am, fppi, missRate] = evaluateDetectionMissRate(get_scr_bbox, get_results(:,1), 0.5);  % 0.5 is the bbox overlap threshold
%% Plot log average miss rate - FPPI.
figure
loglog(fppi, missRate);
grid on
title(sprintf('log Average Miss Rate = %.5f', am))
(II) Converting the detection result files and the ground truth into a format MATLAB can read easily
From the detector output, extract the scores and bboxes that belong to each image. The detection result file is fairly large (200 MB+), so for text of this size the way you process it matters.
Each image has a one-to-many relationship with its (score, bbox) pairs; that is, one image can have several (score, bbox) pairs, so the storage needs some thought.
Here I use a Python dictionary (dict) to store the images and their (score, bbox) pairs: the key is the image name and the value is a list holding the (score, bbox) pairs. Thanks to the properties of a dict, checking whether the current image name is already stored is very fast.
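A minimal sketch of this storage pattern, assuming one detection per line in the format used by the detector output below ("image_name score x1 y1 x2 y2"); collections.defaultdict just saves the explicit membership check:

from collections import defaultdict

detections = defaultdict(list)  # image name -> list of [score, x1, y1, x2, y2]
with open("comp4_det_test_person.txt") as f:  # detector output file, path is just an example
    for line in f:
        fields = line.split()
        if not fields:
            continue
        detections[fields[0]].append(fields[1:6])

# dict lookup is O(1) on average, so checking/updating per line stays fast
print(len(detections), "images have at least one detection")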
(i) Format of the data produced by the detector
Figure 3: What the raw detector output looks like
(ii) The format to produce before processing in MATLAB
Figure 4-1: The generated out.txt file, which stores the image names; the ground truth is later matched against the names in this file.
Figure 4-2: The generated out1.txt file, which stores the score of every bbox of each image; a line can hold many scores.
Figure 4-3: The generated out2.txt file, which stores the bboxes of each image; the first two coordinates are the top-left point of the bbox and the last two are its width and height (a small conversion sketch follows below).
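A minimal sketch of that coordinate conversion, assuming the detector gives corner coordinates (x1, y1, x2, y2):

def corners_to_xywh(x1, y1, x2, y2):
    # (top-left, bottom-right) corners -> (top-left x, top-left y, width, height)
    return x1, y1, x2 - x1, y2 - y1

print(corners_to_xywh(856.0, 318.0, 895.0, 359.0))  # (856.0, 318.0, 39.0, 41.0)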
(iii) The code:

def generate_result(resource_path, des_path):
    """Parse one detector output file and append the image names to des_path (out.txt),
    the per-image scores to out1.txt and the per-image bboxes to out2.txt."""
    des_path1 = "/home/user/PycharmProjects/MissRate_FPPI_plot/out1.txt"
    des_path2 = "/home/user/PycharmProjects/MissRate_FPPI_plot/out2.txt"
    rf = open(resource_path)
    content = rf.readline()
    cnt = 0
    tmp_dict = {}
    while content:
        # each line: image_name score x1 y1 x2 y2
        res = content.replace("\n", "").split(" ")
        cls = str(res[0:1][0])
        bbox = res[1:6]
        if cls in tmp_dict:
            tmp_dict[cls].append(bbox)
        else:
            tmp_dict[cls] = [bbox]
        cnt += 1
        content = rf.readline()
    rf.close()

    wpath = resource_path.split("/")[-1]
    respath = wpath[-9:-4] + "/" + wpath[-4:]
    print(wpath, respath)

    wfname = open(des_path, "a+")
    wfscr = open(des_path1, "a+")
    wfbbox = open(des_path2, "a+")
    for key_ in tmp_dict:
        wfname.write(str(key_) + ',')
        for detail in tmp_dict[key_]:
            for j, index in enumerate(detail):
                if j == 0:                      # score
                    wfscr.write(str(index))
                else:
                    if j == 1:                  # x1
                        tmpp1 = index
                        wfbbox.write(str(int(float(index))))
                    if j == 2:                  # y1
                        tmpp2 = index
                        wfbbox.write(str(int(float(index))))
                    if j == 3:                  # x2 -> width
                        wfbbox.write(str(int(float(index) - float(tmpp1))))
                    if j == 4:                  # y2 -> height
                        wfbbox.write(str(int(float(index) - float(tmpp2))))
                    if j != len(detail) - 1:
                        wfbbox.write(",")
            if len(tmp_dict[key_]) > 1:
                if detail is not tmp_dict[key_][-1]:
                    wfscr.write(";")
                    wfbbox.write(";")
        wfname.write("\n")
        wfscr.write("\n")
        wfbbox.write("\n")
    wfname.close()
    wfscr.close()
    wfbbox.close()


generate_result("/home/user/PycharmProjects/MissRate_FPPI_plot/comp4_det_test_person.txt",
                "/home/user/PycharmProjects/MissRate_FPPI_plot/out.txt")


def generate_all_result(path):
    import os
    dirList = []
    fileList = []
    files = os.listdir(path)
    for f in files:
        if os.path.isdir(path + "/" + f):
            if f[0] != '.':
                dirList.append(f)
        if os.path.isfile(path + '/' + f):
            fileList.append(f)
    for fl in fileList:
        generate_result(path + fl, "/home/user/PycharmProjects/MissRate_FPPI_plot/out.txt")

# generate_all_result("/home/user/Downloads/caltech_data_set/test/")
We also need the detailed ground-truth information.
Figure 5: The ground-truth annotations
An image may also have multiple annotations, which need the same kind of processing. The idea is the same as above: use the image name as the key and a list as the value to store the ground-truth bboxes.

def generate_result(resource_path, des_path):
    """Parse one ground-truth annotation file and append it to des_path,
    one image per line: "set/name,x y w h,x y w h,..."."""
    supname = resource_path[-9:-4] + "/" + resource_path[-4:] + "/"
    print(supname)
    rf = open(resource_path)
    content = rf.readline()
    cnt = 0
    tmp_dict = {}
    while content:
        # each line: image_name x y w h
        res = content.replace("\n", "").split(" ")
        cls = supname + str(res[0:1][0])
        bbox = res[1:5]
        if cls in tmp_dict:
            tmp_dict[cls].append(bbox)
        else:
            tmp_dict[cls] = [bbox]
        cnt += 1
        content = rf.readline()
    rf.close()

    wpath = resource_path.split("/")[-1]
    respath = wpath[-9:-4] + "/" + wpath[-4:]
    print(wpath, respath)

    wfname = open(des_path, "a+")
    for key_ in tmp_dict:
        wfname.write(str(key_) + ',')
        for detail in tmp_dict[key_]:
            for j, index in enumerate(detail):
                if j == 0:          # x
                    tmpp1 = index
                    wfname.write(str(int(float(index))))
                if j == 1:          # y
                    tmpp2 = index
                    wfname.write(str(int(float(index))))
                if j == 2:          # w
                    wfname.write(str(int(float(index))))
                if j == 3:          # h
                    wfname.write(str(int(float(index))))
                if j != len(detail) - 1:
                    wfname.write(" ")
            if len(tmp_dict[key_]) > 1:
                if detail is not tmp_dict[key_][-1]:
                    wfname.write(",")
        wfname.write("\n")
    wfname.close()


def generate_all_result(path):
    import os
    dirList = []
    fileList = []
    files = os.listdir(path)
    for f in files:
        if os.path.isdir(path + "/" + f):
            if f[0] != '.':
                dirList.append(f)
        if os.path.isfile(path + '/' + f):
            fileList.append(f)
    for fl in fileList:
        generate_result(path + fl, "/home/user/PycharmProjects/MissRate_FPPI_plot/new_ground_truth.txt")


generate_all_result("/home/user/Downloads/caltech_data_set/data_reasonable_test/")
The final result:
Figure 6: The generated finally_ground_truth.txt file, the final ground-truth result
Finally, use finally_ground_truth.txt and out.txt to produce the ground-truth bboxes matched one-to-one with the detected images (a rough sketch of this pairing step is given after Figure 7).
Figure 7: The ground-truth bboxes paired with out.txt (the result_pair1.txt file) (compare with Figure 4-1: the image names appear in the same order in both files)
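The pairing code itself is not shown in this post, so the following is only a rough guess at what it could look like. The output format (image name, a '-', then bbox rows separated by ';') is inferred from the MATLAB reader in (i) below; the file paths are the ones used elsewhere in this post.

# Hypothetical pairing step: for every image name in out.txt, look up its
# ground-truth bboxes in finally_ground_truth.txt and write them to
# result_pair1.txt, one line per image, as "name-x y w h;x y w h;..."
gt = {}
with open("/home/user/PycharmProjects/MissRate_FPPI_plot/finally_ground_truth.txt") as f:
    for line in f:
        parts = line.strip().split(",")
        if parts and parts[0]:
            gt[parts[0]] = parts[1:]  # each entry is an "x y w h" string

with open("/home/user/PycharmProjects/MissRate_FPPI_plot/out.txt") as names, \
     open("/home/user/PycharmProjects/MissRate_FPPI_plot/result_pair1.txt", "w") as out:
    for line in names:
        name = line.strip().rstrip(",")  # out.txt stores the name followed by a comma
        out.write(name + "-" + ";".join(gt.get(name, [])) + "\n")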
With the bbox scores of each image (out1.txt), the bboxes of each image (out2.txt), and the ground-truth bboxes of each image (result_pair1.txt) in hand,
the next step is to read these files in MATLAB and build the detectionResults and trainingData arguments of [am,fppi,missRate] = evaluateDetectionMissRate(detectionResults,trainingData).
(i) The generated trainingData is stored in get_results

fid = fopen("/home/user/PycharmProjects/MissRate_FPPI_plot/result_pair1.txt", "rt");
data = textscan(fid, '%s', 'delimiter', '\n');
data = data{1,1};

% result_pair1.txt has 64933 lines: "image name-x y w h;x y w h;..."
get_results(64933) = struct('Boxes', [], 'name', []);
for i = 1:64933
    A = data{i};
    A = regexp(A, '\-', 'split');   % split the line into name and bboxes
    get_results(i).name = A(1);
    B = A(2);
    B = str2num(cell2mat(B));       % bbox rows as a numeric matrix
    get_results(i).Boxes = B;
end
% get_results = struct2table(get_results);
fclose(fid);
(ii) The generated detectionResults is stored in get_scr_bbox

fid1 = fopen("/home/user/PycharmProjects/MissRate_FPPI_plot/out2.txt", "rt");
fid2 = fopen("/home/user/PycharmProjects/MissRate_FPPI_plot/out1.txt", "rt");
data1 = textscan(fid1, '%s', 'delimiter', '\n');
data2 = textscan(fid2, '%s', 'delimiter', '\n');
data1 = data1{1,1};
data2 = data2{1,1};

% out2.txt holds the bboxes per image, out1.txt the corresponding scores
get_scr_bbox(64933) = struct('Boxes', [], 'Scr', []);
for i = 1:64933
    A = data1{i};
    A = cellstr(A);
    A = str2num(cell2mat(A));
    B = data2{i};
    B = cellstr(B);
    B = str2num(cell2mat(B));
    get_scr_bbox(i).Boxes = A;
    get_scr_bbox(i).Scr = B;
end
% get_scr_bbox = struct2table(get_scr_bbox);
fclose(fid1);
fclose(fid2);
(iii) Finally, calling [am,fppi,missRate] = evaluateDetectionMissRate(detectionResults,trainingData) gives the final result (the code is at the very beginning).
(III) Related notes
(1) Python debug: invalid literal for int() with base 10. int() cannot parse a string like "1.5" directly; go through float() first:
int(float("1.5"))   # 1
(2) Cell arrays, cellstr() and char() in MATLAB
(3) MATLAB cell arrays
(4) String splitting (split) with MATLAB
(5) MATLAB: string splitting (split)
S = regexp(str, '\s+', 'split')
(6) How to convert a matrix of numeric characters into a numeric matrix in MATLAB?
(7) A comprehensive introduction to cell arrays in MATLAB
cell2mat
(9) Converting a cell array to double in MATLAB
>> A=transpose(str2num(cell2mat(test')))
1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4
(10) How to take a specific row or column of a matrix in MATLAB
x = A(i,j) extracts the element in row i, column j of matrix A. (Extracting elements is one of the most common operations in MATLAB.) A(1,:) takes row 1, from the first column to the last; A(:,1) takes column 1, from the first row to the last. There are also operations such as taking the maximum or minimum; see help for details. For example, a = A(1,2) assigns the element in row 1, column 2 of A to a; a = A(1,:) takes just the first row; a = A(:,2) takes just the second column.
(11) MATLAB file operations [repost]
(12) MATLAB file operations and reading txt files
(13) textscan in MATLAB
fid = fopen('mydata1.txt');
C = textscan(fid, '%s%s%f32%d8%u%f%f%s%f');
fclose(fid);
(14) Drawing boxes and text on an image in Python
from matplotlib import pyplot as plt
import cv2

im = cv2.imread("/home/user/PycharmProjects/MissRate_FPPI_plot/image001.jpg")
cv2.rectangle(im, (int(856), int(318)), (int(856 + 39), int(318 + 41)), (0, 225, 0), 2)
plt.imshow(im)
plt.show()
(15) [Repost] [Python data analysis] Plots not displaying in PyCharm
(16) Using a list as the value of a Python dict
tmp_dict = {}
while content:
    # print(content)
    res = content.replace("\n", "").split(" ")
    cls = str(res[0:1][0])
    bbox = res[1:6]
    if cls in tmp_dict:
        tmp_dict[cls].append(bbox)
    else:
        tmp_dict[cls] = [bbox]
    cnt += 1
    content = rf.readline()
It feels like things are finally getting onto the right track, hahaha.