Deep SORT multi-object tracking needs to be combined with YOLOv3 to detect pedestrians.
Since my project needs to detect faces, I have to train a model specifically for faces.
The training samples come from the WIDER FACE dataset (30k+ images with annotated bounding boxes).
A blog post on combining Deep SORT with YOLOv3:
https://blog.csdn.net/weixin_42755375/article/details/85723239
Another blog post worth sharing (if you follow its steps, you can train your own face model without problems):
https://blog.csdn.net/caroline_wendy/article/details/80924371
Figure 1: detection with the original model; Figure 2: detection with the newly trained face model.
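The two blogs above cover the full procedure. For reference only, here is a minimal sketch of turning the WIDER FACE ground-truth file into the train.txt format expected by keras-yolo3 (one image path per line, followed by space-separated x_min,y_min,x_max,y_max,class_id groups). This is not the script used in this post; the paths wider_face_split/wider_face_train_bbx_gt.txt and WIDER_train/images, the output name train.txt, and class id 0 for "face" are assumptions based on the standard WIDER FACE download layout.

import os

# Sketch: convert WIDER FACE annotations to keras-yolo3 style train.txt.
# Assumed inputs: the official wider_face_train_bbx_gt.txt and WIDER_train/images.
gt_file = "wider_face_split/wider_face_train_bbx_gt.txt"
img_root = "WIDER_train/images"

with open(gt_file) as f, open("train.txt", "w") as out:
    lines = f.read().splitlines()
    i = 0
    while i < len(lines):
        img_path = os.path.join(img_root, lines[i])   # relative image path
        num = int(lines[i + 1])                       # number of faces in this image
        boxes = []
        # when num is 0 the file still contains one placeholder line of zeros
        for j in range(max(num, 1)):
            vals = lines[i + 2 + j].split()
            x, y, w, h = map(int, vals[:4])           # WIDER FACE stores x, y, w, h
            if w > 0 and h > 0:
                boxes.append("%d,%d,%d,%d,0" % (x, y, x + w, y + h))
        if boxes:
            out.write(img_path + " " + " ".join(boxes) + "\n")
        i += 2 + max(num, 1)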
------------------------------------------------------------
If you are not using the face data but your own data or VOC-format data, the steps are basically the same; only the way you obtain train.txt differs, so here is the method.
First, the Python script used to generate train.txt: voc_annotation.py
import xml.etree.ElementTree as ET
from os import getcwd

import sys
sys.path.append("/var/Data/pcr/ljf/project/rubbish_faster_rcnn/data/VOCself")
print(sys.path)
root_data = "/var/Data/pcr/ljf/project/rubbish_faster_rcnn/data/"
sets = [('self', 'train'), ('self', 'val'), ('self', 'test')]

classes = ["trash"]


def convert_annotation(year, image_id, list_file):
    in_file = open(root_data + 'VOC%s/Annotations/%s.xml' % (year, image_id))
    tree = ET.parse(in_file)
    root = tree.getroot()

    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (int(xmlbox.find('xmin').text), int(xmlbox.find('ymin').text),
             int(xmlbox.find('xmax').text), int(xmlbox.find('ymax').text))
        list_file.write(" " + ",".join([str(a) for a in b]) + ',' + str(cls_id))


wd = getcwd()

for year, image_set in sets:
    image_ids = open(root_data + 'VOC%s/ImageSets/Main/%s.txt' % (year, image_set)).read().strip().split()
    list_file = open('%s_%s.txt' % (year, image_set), 'w')
    for image_id in image_ids:
        # list_file.write('%s/VOC%s/JPEGImages/%s.jpg' % (wd, year, image_id))
        list_file.write('%sVOC%s/JPEGImages/%s.jpg' % (root_data, year, image_id))
        convert_annotation(year, image_id, list_file)
        list_file.write('\n')
    list_file.close()
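After running it, every line of the generated self_train.txt should contain the image path followed by one x_min,y_min,x_max,y_max,class_id group per box, for example (the box numbers below are made up purely for illustration):

/var/Data/pcr/ljf/project/rubbish_faster_rcnn/data/VOCself/JPEGImages/000001.jpg 48,240,195,371,0 8,12,352,498,0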
Note: here the directory was originally VOC2007, but it could just as well be called VOC2008 or similar, in which case the txt files generated here would be named 2008_xxx.txt. There is also one crucial thing that must be changed, otherwise the three new txt files generated here will contain only the image paths, i.e. nothing more than the txt files under Main, with no bounding-box information. Without box information the later training step cannot work properly: even if you do obtain a trained h5 file, when you use it for the image-detection test described earlier it will simply draw a box around the entire image instead of around the object you want to detect.
So what you need to do is: before running voc_annotation.py, open it and edit it. Change the sets at the top to your own, e.g. change 2012 to my 2007, so that it matches the year in the directory name mentioned earlier. You also need to change the contents of classes at the top to the values of the name field inside the object elements of your own xml files. List exactly the name values you actually have; otherwise the code that reads the box information from the xml files will never be executed. An example of the modified lines is given below.
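For example, assuming your directory is VOC2007 and the name value in your xml files is "face", the top of voc_annotation.py would be changed to something like this (adjust to your own year and class names):

sets = [('2007', 'train'), ('2007', 'val'), ('2007', 'test')]

classes = ["face"]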
------------------------------------------------------------
The last step is obtaining the anchor values. When training on your own dataset, different datasets have different anchor values, which consist of 9 pairs of numbers.
Based on the train.txt obtained in the previous step, you can get the anchor values with kmeans.py:
import numpy as np


class YOLO_Kmeans:

    def __init__(self, cluster_number, filename):
        self.cluster_number = cluster_number
        self.filename = filename

    def iou(self, boxes, clusters):  # 1 box -> k clusters
        n = boxes.shape[0]
        k = self.cluster_number

        box_area = boxes[:, 0] * boxes[:, 1]
        box_area = box_area.repeat(k)
        box_area = np.reshape(box_area, (n, k))

        cluster_area = clusters[:, 0] * clusters[:, 1]
        cluster_area = np.tile(cluster_area, [1, n])
        cluster_area = np.reshape(cluster_area, (n, k))

        box_w_matrix = np.reshape(boxes[:, 0].repeat(k), (n, k))
        cluster_w_matrix = np.reshape(np.tile(clusters[:, 0], (1, n)), (n, k))
        min_w_matrix = np.minimum(cluster_w_matrix, box_w_matrix)

        box_h_matrix = np.reshape(boxes[:, 1].repeat(k), (n, k))
        cluster_h_matrix = np.reshape(np.tile(clusters[:, 1], (1, n)), (n, k))
        min_h_matrix = np.minimum(cluster_h_matrix, box_h_matrix)
        inter_area = np.multiply(min_w_matrix, min_h_matrix)

        result = inter_area / (box_area + cluster_area - inter_area)
        return result

    def avg_iou(self, boxes, clusters):
        accuracy = np.mean([np.max(self.iou(boxes, clusters), axis=1)])
        return accuracy

    def kmeans(self, boxes, k, dist=np.median):
        box_number = boxes.shape[0]
        distances = np.empty((box_number, k))
        last_nearest = np.zeros((box_number,))
        np.random.seed()
        clusters = boxes[np.random.choice(
            box_number, k, replace=False)]  # init k clusters
        while True:
            # distance = 1 - IoU between each box and each cluster centre
            distances = 1 - self.iou(boxes, clusters)

            current_nearest = np.argmin(distances, axis=1)
            if (last_nearest == current_nearest).all():
                break  # clusters won't change
            for cluster in range(k):
                clusters[cluster] = dist(  # update clusters
                    boxes[current_nearest == cluster], axis=0)

            last_nearest = current_nearest

        return clusters

    def result2txt(self, data):
        f = open("yolo_anchors.txt", 'w')
        row = np.shape(data)[0]
        for i in range(row):
            if i == 0:
                x_y = "%d,%d" % (data[i][0], data[i][1])
            else:
                x_y = ", %d,%d" % (data[i][0], data[i][1])
            f.write(x_y)
        f.close()

    def txt2boxes(self):
        # read every "xmin,ymin,xmax,ymax,class" entry from train.txt and
        # collect the box widths and heights
        f = open(self.filename, 'r')
        dataSet = []
        for line in f:
            infos = line.split(" ")
            length = len(infos)
            for i in range(1, length):
                width = int(infos[i].split(",")[2]) - \
                    int(infos[i].split(",")[0])
                height = int(infos[i].split(",")[3]) - \
                    int(infos[i].split(",")[1])
                dataSet.append([width, height])
        result = np.array(dataSet)
        f.close()
        return result

    def txt2clusters(self):
        all_boxes = self.txt2boxes()
        result = self.kmeans(all_boxes, k=self.cluster_number)
        result = result[np.lexsort(result.T[0, None])]
        self.result2txt(result)
        print("K anchors:\n {}".format(result))
        print("Accuracy: {:.2f}%".format(
            self.avg_iou(all_boxes, result) * 100))


if __name__ == "__main__":
    cluster_number = 9
    filename = r"../self_trainval.txt"
    kmeans = YOLO_Kmeans(cluster_number, filename)
    kmeans.txt2clusters()
Usage: set filename in the __main__ block at the bottom to the path of the train.txt you generated, then run the script.
For my data the resulting anchors are:
[25 14]
[33 20]
[38 27]
[45 34]
[55 24]
[58 37]
[67 48]
[92 34]
[98 63]
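result2txt also writes these values into yolo_anchors.txt as a single comma-separated line. Assuming the common keras-yolo3 (qqwweee/keras-yolo3) project layout, the file contents would look like the line below, and you would copy it to model_data/yolo_anchors.txt (or point anchors_path in train.py at it) before training:

25,14, 33,20, 38,27, 45,34, 55,24, 58,37, 67,48, 92,34, 98,63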
Reference: https://blog.csdn.net/m0_37857151/article/details/81330699