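I signed up for a butterfly detection competition. The organizers provided a little over 700 images covering 94 butterfly species, and the task is to detect the butterflies in each image and classify them correctly.
1. After getting the dataset, the first step was to split the 700-odd images into 483 training samples and 238 test samples. (Because 15 of the species appear in only one image each, the test set covers only 79 species.)
2. I used an existing model that already has a butterfly class to detect the butterflies in the test set directly (effectively a binary classification). The model chosen was "faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28", which was trained on the Open Images dataset and covers 545 object classes.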
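I started from object_detection/object_detection_tutorial.ipynb, which imports a frozen model directly, so the threshold cannot be changed. A senior labmate therefore switched to a different way of loading the model, which allows the detection threshold to be lowered so that more butterflies are detected.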
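To make the effect of the threshold concrete, here is a minimal sketch of score filtering on one image's detections. The helper `filter_detections`, the sample boxes and scores, and the two thresholds (0.5 and 0.3) are made up for illustration; the class id 111 for Butterfly is the one quoted later in this post.

```python
def filter_detections(boxes, scores, classes, target_class, min_score):
    """Keep (box, score) pairs of one class whose score reaches min_score."""
    kept = []
    for box, score, cls in zip(boxes, scores, classes):
        if cls == target_class and score >= min_score:
            kept.append((box, score))
    return kept


# Hypothetical detections for one image, boxes as [ymin, xmin, ymax, xmax].
boxes = [[0.1, 0.1, 0.5, 0.5], [0.2, 0.6, 0.7, 0.9], [0.0, 0.0, 1.0, 1.0]]
scores = [0.82, 0.35, 0.60]
classes = [111, 111, 7]  # 111 = Butterfly; 7 is an arbitrary other class

strict = filter_detections(boxes, scores, classes, 111, 0.5)
loose = filter_detections(boxes, scores, classes, 111, 0.3)
print(len(strict), len(loose))
```

Lowering the threshold keeps the weaker butterfly detection as well, which raises the number of detected butterflies at the cost of admitting more low-confidence boxes.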
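My job was then to evaluate the detection results (compute precision), but I took a few detours.
The flow in the TensorFlow tutorial is to convert the test set to TFRecord format first, run inference, and then append the detections to each record. The problem appeared at the model input: the original frozen model takes a tensor as input, whereas the model as modified by my labmate takes an image directly, and I did not know how to convert between the two. That still needs further study. So I changed direction. Starting from my labmate's code, I read one image at a time from the folder and run inference on it; then, based on the image's information, I parse the corresponding XML annotation file and write it out as a tf_example, and add the model's output dets to that same example. This got around the earlier problem, and the annotation + detection results were saved in TFRecord format.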
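The annotation-parsing step can be sketched as follows. This assumes the XML files follow the Pascal VOC layout (`size`, `object`, `bndbox`), which the post does not actually state; `parse_annotation` and the sample XML are hypothetical. The boxes are normalized to [ymin, xmin, ymax, xmax], the order the Object Detection API uses.

```python
import xml.etree.ElementTree as ET


def parse_annotation(xml_text):
    """Parse a Pascal VOC-style annotation into normalized boxes and names."""
    root = ET.fromstring(xml_text)
    size = root.find('size')
    width = float(size.find('width').text)
    height = float(size.find('height').text)

    boxes, names = [], []
    for obj in root.findall('object'):
        bnd = obj.find('bndbox')
        # Normalize pixel coordinates to the [0, 1] range.
        boxes.append([
            float(bnd.find('ymin').text) / height,
            float(bnd.find('xmin').text) / width,
            float(bnd.find('ymax').text) / height,
            float(bnd.find('xmax').text) / width,
        ])
        names.append(obj.find('name').text)
    return {'filename': root.find('filename').text,
            'boxes': boxes,
            'names': names}


# Made-up annotation for a 200x100 image with one butterfly.
SAMPLE_XML = """<annotation>
  <filename>IMG_000001.jpg</filename>
  <size><width>200</width><height>100</height><depth>3</depth></size>
  <object>
    <name>butterfly</name>
    <bndbox><xmin>20</xmin><ymin>10</ymin><xmax>120</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""

record = parse_annotation(SAMPLE_XML)
print(record)
```

The resulting dictionary holds what would then be written into the tf_example alongside the detections.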
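The next step was to run the evaluation with object_detection/metrics/offline_eval_map_corloc.py, but two problems left me stuck for a while. The first was an error along the lines of "ground_truth_group_of.size: NoneType has no attribute size". I assumed my TFRecord was broken, but in the end the cause turned out to be this call: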
decoded_dict = data_parser.parse(example)
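When the example is parsed here, the standard_fields.TfExampleFields.object_group_of field had never been written into my TFRecord, so the parser filled that entry with None. None has no size, which produces the error above. The parser treats the field as optional: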
self.optional_items_to_handlers = {
    fields.InputDataFields.groundtruth_difficult:
        Int64Parser(fields.TfExampleFields.object_difficult),
    fields.InputDataFields.groundtruth_group_of:
        Int64Parser(fields.TfExampleFields.object_group_of)
}
Looking up what the group_of attribute, which is specific to Open Images, means: "Indicates that the box spans a group of objects (e.g., a bed of flowers or a crowd of people). We asked annotators to use this tag for cases with more than 5 instances which are heavily occluding each other and are physically touching."
In other words, a box flagged as group_of contains more than 5 objects that occlude and touch each other, such as a crowd of people or a bed of flowers.
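Given that meaning, an individually annotated butterfly would have group_of = 0. One possible way to avoid the None value (not necessarily what I ended up doing) is to fill the missing field with zeros before evaluation. The sketch below uses a plain dict and lists as stand-ins for the NumPy arrays the real parser returns; `ensure_group_of` is a hypothetical helper and the key names are illustrative.

```python
def ensure_group_of(decoded, boxes_key='groundtruth_boxes',
                    group_of_key='groundtruth_group_of'):
    """Return a copy of decoded where a missing group_of becomes all zeros."""
    fixed = dict(decoded)
    if fixed.get(group_of_key) is None:
        # One flag per ground-truth box; 0 means "not a group".
        fixed[group_of_key] = [0] * len(fixed[boxes_key])
    return fixed


# Stand-in for a decoded example whose record lacked the group_of field.
decoded = {
    'groundtruth_boxes': [[0.1, 0.1, 0.6, 0.6], [0.2, 0.5, 0.4, 0.9]],
    'groundtruth_classes': [111, 111],
    'groundtruth_group_of': None,
}

fixed = ensure_group_of(decoded)
print(fixed['groundtruth_group_of'])
```

The same effect could be achieved earlier in the pipeline by writing the group_of field into each tf_example when the TFRecord is created.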
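The second problem: my ground truth contains only two classes, butterfly and background, whereas the model's label_map contains 545 classes, so none of the other classes have any GT. The code has a check for exactly this: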
# object_detection/utils/object_detection_evaluation.py
if (self.num_gt_instances_per_class == 0).any():
  logging.warn(
      'The following classes have no ground truth examples: %s',
      np.squeeze(np.argwhere(self.num_gt_instances_per_class == 0)) +
      self.label_id_offset)
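Once I had tracked it down, I simply commented it out, and the evaluation finally ran end to end.
On the single butterfly class, the model reaches Precision = 0.728.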
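What that check reports can be reproduced in a few lines of plain Python. This is only a sketch: `classes_without_groundtruth` is a hypothetical helper, the count of 300 butterfly boxes is invented, and the offset of 1 assumes class ids start at 1 while array indices start at 0.

```python
def classes_without_groundtruth(num_gt_instances_per_class, label_id_offset=1):
    """List the class ids (index + offset) that have zero ground-truth boxes."""
    return [index + label_id_offset
            for index, count in enumerate(num_gt_instances_per_class)
            if count == 0]


num_classes = 545
counts = [0] * num_classes
counts[111 - 1] = 300  # made-up number of butterfly boxes; Butterfly has id 111

missing = classes_without_groundtruth(counts)
print(len(missing), 111 in missing)
```

With a butterfly-only ground truth, every class except Butterfly ends up in the list, which is why the message is so long for this dataset.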
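The rest of this post mainly explains the evaluation code, so that I do not forget it later. Assume the model output validation_detections.tfrecord has already been saved under
models/research/butterfly
The first step is to generate the config files: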
# From models/research/butterfly
SPLIT=validation  # or test
mkdir -p ${SPLIT}_eval_metrics
echo "
label_map_path: '../object_detection/data/oid_bbox_trainable_label_map.pbtxt'
tf_record_input_reader: { input_path: '${SPLIT}_detections.tfrecord' }
" > ${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt
echo "
metrics_set: 'open_images_detection_metrics'
" > ${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt
Then run the evaluation program:
# From tensorflow/models/research/butterfly
# --eval_dir is where the results are saved;
# --input_config_path points at the input config.
SPLIT=validation  # or test
PYTHONPATH=$PYTHONPATH:$(readlink -f ..) \
python -m object_detection/metrics/offline_eval_map_corloc \
  --eval_dir=${SPLIT}_eval_metrics \
  --eval_config_path=${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt \
  --input_config_path=${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt
First, the main program, models/research/object_detection/metrics/offline_eval_map_corloc.py:
import csv
import os
import re
import tensorflow as tf
from object_detection import evaluator
from object_detection.core import standard_fields
from object_detection.metrics import tf_example_parser
from object_detection.utils import config_util
from object_detection.utils import label_map_util
...
def read_data_and_evaluate(input_config, eval_config):
  ...  # omitted here

def write_metrics(metrics, output_dir):
  ...

def main(argv):
  del argv
  # These correspond to the three command-line arguments.
  required_flags = ['input_config_path', 'eval_config_path', 'eval_dir']
  for flag_name in required_flags:
    if not getattr(FLAGS, flag_name):
      raise ValueError('Flag --{} is required'.format(flag_name))

  configs = config_util.get_configs_from_multiple_files(
      eval_input_config_path=FLAGS.input_config_path,
      eval_config_path=FLAGS.eval_config_path)

  eval_config = configs['eval_config']
  input_config = configs['eval_input_config']

  # The main work happens here.
  metrics = read_data_and_evaluate(input_config, eval_config)

  # Save metrics
  write_metrics(metrics, FLAGS.eval_dir)
Now a closer look at read_data_and_evaluate(input_config, eval_config):
def read_data_and_evaluate(input_config, eval_config):
  """Reads pre-computed object detections and groundtruth from tf_record.

  Args:
    input_config: input config proto of type
      object_detection.protos.InputReader. (the input config file)
    eval_config: evaluation config proto of type
      object_detection.protos.EvalConfig. (the evaluation config file)

  Returns:
    Evaluated detections metrics. (the evaluation results)

  Raises:
    ValueError: if input_reader type is not supported or metric type is unknown.
  """
  if input_config.WhichOneof('input_reader') == 'tf_record_input_reader':
    input_paths = input_config.tf_record_input_reader.input_path

    # Load the label map.
    label_map = label_map_util.load_labelmap(input_config.label_map_path)
    # Largest class id in the label map (545).
    max_num_classes = max([item.id for item in label_map.item])
    # A list, e.g. categories[110] = {'id': 111, 'name': 'Butterfly'}.
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes)

    object_detection_evaluators = evaluator.get_evaluators(
        eval_config, categories)
    # Support a single evaluator
    # Here this is object_detection_evaluation.OpenImagesDetectionEvaluator.
    object_detection_evaluator = object_detection_evaluators[0]

    skipped_images = 0
    processed_images = 0
    for input_path in _generate_filenames(input_paths):
      tf.logging.info('Processing file: {0}'.format(input_path))

      # Read validation_detections.tfrecord.
      record_iterator = tf.python_io.tf_record_iterator(path=input_path)
      data_parser = tf_example_parser.TfExampleDetectionAndGTParser()

      # Iterator over the 238 test samples; each step reads the detection
      # results for one sample.
      for string_record in record_iterator:
        tf.logging.log_every_n(tf.logging.INFO, 'Processed %d images...', 1000,
                               processed_images)
        processed_images += 1

        example = tf.train.Example()
        # Parse the record; example.features.feature then holds the data
        # in dictionary form.
        example.ParseFromString(string_record)
        # Decode further to recover groundtruth_boxes, groundtruth_classes,
        # detection_boxes, detection_classes and detection_scores.
        decoded_dict = data_parser.parse(example)

        if decoded_dict:
          # These calls go to class OpenImagesDetectionEvaluator in
          # object_detection/utils/object_detection_evaluation.py;
          # the default iou_threshold is 0.5.
          object_detection_evaluator.add_single_ground_truth_image_info(
              decoded_dict[standard_fields.DetectionResultFields.key],
              decoded_dict)
          object_detection_evaluator.add_single_detected_image_info(
              decoded_dict[standard_fields.DetectionResultFields.key],
              decoded_dict)
        else:
          skipped_images += 1
          tf.logging.info('Skipped images: {0}'.format(skipped_images))

    return object_detection_evaluator.evaluate()

  raise ValueError('Unsupported input_reader_config.')
As this shows, the main evaluation work is in turn delegated to
object_detection/utils/object_detection_evaluation.py
The first class is class OpenImagesDetectionEvaluator(ObjectDetectionEvaluator), which inherits from class ObjectDetectionEvaluator(DetectionEvaluator), which in turn inherits from class DetectionEvaluator(object).
So we go through these classes from the top down, starting with the base class DetectionEvaluator(object) (line 42):
class DetectionEvaluator(object):
  """Interface for object detection evalution classes.

  Example usage of the Evaluator:
  ------------------------------
  evaluator = DetectionEvaluator(categories)

  (Note: that is, add the GT and detections one image at a time, then call
  evaluate() once at the end.)

  # Detections and groundtruth for image 1.
  evaluator.add_single_groundtruth_image_info(...)
  evaluator.add_single_detected_image_info(...)

  # Detections and groundtruth for image 2.
  evaluator.add_single_groundtruth_image_info(...)
  evaluator.add_single_detected_image_info(...)

  metrics_dict = evaluator.evaluate()
  """
  __metaclass__ = ABCMeta

  def __init__(self, categories):
    """Constructor.

    Args:
      categories: A list of dicts, each of which has the following keys -
        'id': (required) an integer id uniquely identifying this category.
        'name': (required) string representing category name e.g., 'cat', 'dog'.
    """
    self._categories = categories

  @abstractmethod
  def add_single_ground_truth_image_info(self, image_id, groundtruth_dict):
    """Adds groundtruth for a single image to be used for evaluation.

    Args:
      image_id: A unique string/integer identifier for the image.
      groundtruth_dict: A dictionary of groundtruth numpy arrays required
        for evaluations.
    """
    pass

  @abstractmethod
  def add_single_detected_image_info(self, image_id, detections_dict):
    """Adds detections for a single image to be used for evaluation.

    Args:
      image_id: A unique string/integer identifier for the image.
      detections_dict: A dictionary of detection numpy arrays required
        for evaluation.
    """
    pass

  @abstractmethod
  def evaluate(self):
    """Evaluates detections and returns a dictionary of metrics."""
    pass

  @abstractmethod
  def clear(self):
    """Clears the state to prepare for a fresh evaluation."""
    pass
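To show how this interface is meant to be used, here is a toy Python 3 sketch that follows the same four-method contract. Both classes are hypothetical stand-ins rather than the library's code: `CountingEvaluator` only counts boxes instead of computing real metrics, and the dictionary keys are illustrative.

```python
from abc import ABC, abstractmethod


class DetectionEvaluatorSketch(ABC):
    """Stand-in for the interface: add per-image data, then evaluate once."""

    def __init__(self, categories):
        self._categories = categories

    @abstractmethod
    def add_single_ground_truth_image_info(self, image_id, groundtruth_dict):
        pass

    @abstractmethod
    def add_single_detected_image_info(self, image_id, detections_dict):
        pass

    @abstractmethod
    def evaluate(self):
        pass

    @abstractmethod
    def clear(self):
        pass


class CountingEvaluator(DetectionEvaluatorSketch):
    """Toy evaluator that only counts images and boxes."""

    def __init__(self, categories):
        super().__init__(categories)
        self.clear()

    def add_single_ground_truth_image_info(self, image_id, groundtruth_dict):
        if image_id in self._image_ids:
            raise ValueError('Image with id {} already added.'.format(image_id))
        self._image_ids.add(image_id)
        self._num_gt += len(groundtruth_dict['groundtruth_boxes'])

    def add_single_detected_image_info(self, image_id, detections_dict):
        self._num_det += len(detections_dict['detection_boxes'])

    def evaluate(self):
        return {'num_images': len(self._image_ids),
                'num_groundtruth_boxes': self._num_gt,
                'num_detected_boxes': self._num_det}

    def clear(self):
        self._image_ids = set()
        self._num_gt = 0
        self._num_det = 0


toy = CountingEvaluator([{'id': 111, 'name': 'Butterfly'}])
toy.add_single_ground_truth_image_info('img1', {'groundtruth_boxes': [[0, 0, 1, 1]]})
toy.add_single_detected_image_info('img1', {'detection_boxes': [[0, 0, 1, 1], [0, 0, 0.5, 0.5]]})
toy.add_single_ground_truth_image_info('img2', {'groundtruth_boxes': []})
toy.add_single_detected_image_info('img2', {'detection_boxes': []})
result = toy.evaluate()
print(result)
```

The calling pattern is the same one read_data_and_evaluate uses: two add calls per image, and a single evaluate() after the loop.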
Next is class ObjectDetectionEvaluator(DetectionEvaluator) (line 104):
class ObjectDetectionEvaluator(DetectionEvaluator):
  """A class to evaluate detections."""

  def __init__(self,
               categories,
               matching_iou_threshold=0.5,
               evaluate_corlocs=False,
               metric_prefix=None,
               use_weighted_mean_ap=False,
               evaluate_masks=False):
    """Constructor.

    Args:
      ...  (omitted)

    Raises:
      ValueError: If the category ids are not 1-indexed.
    """
    ...
    # This is the most important part; it is used throughout what follows.
    self._evaluation = ObjectDetectionEvaluation(
        num_groundtruth_classes=self._num_classes,
        matching_iou_threshold=self._matching_iou_threshold,
        use_weighted_mean_ap=self._use_weighted_mean_ap,
        label_id_offset=self._label_id_offset)
    ...

  def add_single_ground_truth_image_info(self, image_id, groundtruth_dict):
    """Adds groundtruth for a single image to be used for evaluation."""
    ...  # omitted
    self._evaluation.add_single_ground_truth_image_info(...)

  def add_single_detected_image_info(self, image_id, detections_dict):
    """Adds detections for a single image to be used for evaluation."""
    ...
    self._evaluation.add_single_detected_image_info(...)

  def evaluate(self):
    """Compute evaluation result."""
    ...
    (per_class_ap, mean_ap, _, _, per_class_corloc, mean_corloc) = (
        self._evaluation.evaluate())
    ...

  def clear(self):
    """Clears the state to prepare for a fresh evaluation."""
    self._evaluation = ObjectDetectionEvaluation(
        num_groundtruth_classes=self._num_classes,
        matching_iou_threshold=self._matching_iou_threshold,
        use_weighted_mean_ap=self._use_weighted_mean_ap,
        label_id_offset=self._label_id_offset)
    self._image_ids.clear()
Finally, OpenImagesDetectionEvaluator(ObjectDetectionEvaluator) (line 376):
class OpenImagesDetectionEvaluator(ObjectDetectionEvaluator):
  """A class to evaluate detections using Open Images V2 metrics.

    Open Images V2 introduce group_of type of bounding boxes and this metric
    handles those boxes appropriately.
  """

  def __init__(self,
               categories,
               matching_iou_threshold=0.5,
               evaluate_corlocs=False):
    """Constructor.

    Args:
      categories: A list of dicts, each of which has the following keys -
        'id': (required) an integer id uniquely identifying this category.
        'name': (required) string representing category name e.g., 'cat', 'dog'.
      matching_iou_threshold: IOU threshold to use for matching groundtruth
        boxes to detection boxes.
      evaluate_corlocs: if True, additionally evaluates and returns CorLoc.
    """
    super(OpenImagesDetectionEvaluator, self).__init__(
        categories,
        matching_iou_threshold,
        evaluate_corlocs,
        metric_prefix='OpenImagesV2')

  def add_single_ground_truth_image_info(self, image_id, groundtruth_dict):
    """Adds groundtruth for a single image to be used for evaluation."""
    if image_id in self._image_ids:
      raise ValueError('Image with id {} already added.'.format(image_id))

    groundtruth_classes = (
        groundtruth_dict[standard_fields.InputDataFields.groundtruth_classes] -
        self._label_id_offset)
    # If the key is not present in the groundtruth_dict or the array is empty
    # (unless there are no annotations for the groundtruth on this image)
    # use values from the dictionary or insert None otherwise.
    if (standard_fields.InputDataFields.groundtruth_group_of in
        groundtruth_dict.keys() and
        (groundtruth_dict[standard_fields.InputDataFields.groundtruth_group_of]
         .size or not groundtruth_classes.size)):
      groundtruth_group_of = groundtruth_dict[
          standard_fields.InputDataFields.groundtruth_group_of]
    else:
      groundtruth_group_of = None
      if not len(self._image_ids) % 1000:
        logging.warn(
            'image %s does not have groundtruth group_of flag specified',
            image_id)
    self._evaluation.add_single_ground_truth_image_info(
        image_id,
        groundtruth_dict[standard_fields.InputDataFields.groundtruth_boxes],
        groundtruth_classes,
        groundtruth_is_difficult_list=None,
        groundtruth_is_group_of_list=groundtruth_group_of)
    self._image_ids.update([image_id])
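As this shows, apart from a constructor that sets metric_prefix, this class only overrides add_single_ground_truth_image_info; everything else is unchanged. Its parent class in turn hands the main work to class ObjectDetectionEvaluation(object), so the overall structure of the code is gradually becoming clear. At the end I will draw a diagram of how these pieces contain one another, which may make it easier to follow.
The class below is where the GT and detection results are actually stored!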
class ObjectDetectionEvaluation(object):
  """Internal implementation of Pascal object detection metrics."""

  def __init__(self,
               num_groundtruth_classes,
               matching_iou_threshold=0.5,
               nms_iou_threshold=1.0,
               nms_max_output_boxes=10000,
               use_weighted_mean_ap=False,
               label_id_offset=0):
    if num_groundtruth_classes < 1:
      raise ValueError('Need at least 1 groundtruth class for evaluation.')

    self.per_image_eval = per_image_evaluation.PerImageEvaluation(
        num_groundtruth_classes=num_groundtruth_classes,
        matching_iou_threshold=matching_iou_threshold,
        nms_iou_threshold=nms_iou_threshold,
        nms_max_output_boxes=nms_max_output_boxes)
    ...

  def clear_detections(self):
    self._initialize_detections()

  def add_single_ground_truth_image_info(self,
                                         image_key,
                                         groundtruth_boxes,
                                         groundtruth_class_labels,
                                         groundtruth_is_difficult_list=None,
                                         groundtruth_is_group_of_list=None,
                                         groundtruth_masks=None):
    ...  # body trimmed

  def add_single_detected_image_info(self, image_key, detected_boxes,
                                     detected_scores, detected_class_labels,
                                     detected_masks=None):
    ...
    scores, tp_fp_labels, is_class_correctly_detected_in_image = (
        self.per_image_eval.compute_object_detection_metrics(
            detected_boxes=detected_boxes,
            detected_scores=detected_scores,
            detected_class_labels=detected_class_labels,
            groundtruth_boxes=groundtruth_boxes,
            groundtruth_class_labels=groundtruth_class_labels,
            groundtruth_is_difficult_list=groundtruth_is_difficult_list,
            groundtruth_is_group_of_list=groundtruth_is_group_of_list,
            detected_masks=detected_masks,
            groundtruth_masks=groundtruth_masks))

    for i in range(self.num_class):
      if scores[i].shape[0] > 0:
        self.scores_per_class[i].append(scores[i])
        self.tp_fp_labels_per_class[i].append(tp_fp_labels[i])
    (self.num_images_correctly_detected_per_class
    ) += is_class_correctly_detected_in_image

  def evaluate(self):
    """Compute evaluation result.

    Returns:
      A named tuple with the following fields -
        average_precision: float numpy array of average precision for
            each class.
        mean_ap: mean average precision of all classes, float scalar
        precisions: List of precisions, each precision is a float numpy
            array
        recalls: List of recalls, each recall is a float numpy array
        corloc: numpy float array
        mean_corloc: Mean CorLoc score for each class, float scalar
    """
    ...  # trimmed; the per-class statements below run once per class_index
    scores = np.concatenate(self.scores_per_class[class_index])
    tp_fp_labels = np.concatenate(self.tp_fp_labels_per_class[class_index])
    precision, recall = metrics.compute_precision_recall(
        scores, tp_fp_labels, self.num_gt_instances_per_class[class_index])
    self.precisions_per_class.append(precision)
    self.recalls_per_class.append(recall)
    average_precision = metrics.compute_average_precision(precision, recall)
    self.average_precision_per_class[class_index] = average_precision

    self.corloc_per_class = metrics.compute_cor_loc(
        self.num_gt_imgs_per_class,
        self.num_images_correctly_detected_per_class)

    mean_ap = np.nanmean(self.average_precision_per_class)
    mean_corloc = np.nanmean(self.corloc_per_class)
    return ObjectDetectionEvalMetrics(
        self.average_precision_per_class, mean_ap, self.precisions_per_class,
        self.recalls_per_class, self.corloc_per_class, mean_corloc)
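The two metric calls in evaluate() can be illustrated with a pure-Python sketch. It mirrors the usual Pascal-style computation (sort by score, accumulate true positives, then integrate the precision envelope over recall). I am assuming that is what the library's functions of the same names do; the code below is a re-implementation for illustration, not the library source, and the sample scores and labels are made up.

```python
def compute_precision_recall(scores, tp_fp_labels, num_gt):
    """Precision and recall after each detection, in descending score order.

    Assumes num_gt > 0; classes without ground truth need separate handling.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    precision, recall = [], []
    true_positives = 0
    for rank, index in enumerate(order, start=1):
        if tp_fp_labels[index]:
            true_positives += 1
        precision.append(true_positives / rank)
        recall.append(true_positives / num_gt)
    return precision, recall


def compute_average_precision(precision, recall):
    """Area under the precision-recall curve using the precision envelope."""
    recall = [0.0] + list(recall) + [1.0]
    precision = [0.0] + list(precision) + [0.0]
    # Make precision non-increasing from left to right.
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    # Sum rectangle areas wherever recall increases.
    return sum((recall[i] - recall[i - 1]) * precision[i]
               for i in range(1, len(recall)))


# Four made-up detections of one class; two ground-truth boxes in total.
scores = [0.9, 0.8, 0.7, 0.6]
tp_fp_labels = [True, False, True, False]
precision, recall = compute_precision_recall(scores, tp_fp_labels, num_gt=2)
average_precision = compute_average_precision(precision, recall)
print(precision, recall, average_precision)
```

For this toy input the recall sequence is 0.5, 0.5, 1.0, 1.0, and the average precision is 0.5 * 1 + 0.5 * (2/3) = 5/6.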
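I have trimmed away the unimportant parts. Two modules do the important work:
1. object_detection/utils/per_image_evaluation.py evaluates a single image, labelling each detection as a true or false positive (the raw material for precision and recall)
2. object_detection/utils/metrics.py aggregates those results and computes mAP and the other values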
1. object_detection/utils/per_image_evaluation.py: per-image evaluation (true/false positive labels that feed precision and recall)
scores, tp_fp_labels, is_class_correctly_detected_in_image = compute_object_detection_metrics(...)
--> scores, tp_fp_labels = self._compute_tp_fp(...)
--> for i in range(self.num_groundtruth_classes): scores, tp_fp_labels = self._compute_tp_fp_for_single_class(...)
--> (iou, ioa, scores, num_detected_boxes) = self._get_overlaps_and_scores_box_mode(...)
--> detected_boxlist = np_box_list_ops.non_max_suppression(...)
-->
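The per-class true/false positive labelling at the bottom of this call chain can be sketched in plain Python. This is a simplified illustration under stated assumptions: it ignores difficult and group_of boxes and skips non-max suppression, boxes are [ymin, xmin, ymax, xmax], and `iou` and `label_detections` are hypothetical helper names rather than the library's functions.

```python
def iou(box_a, box_b):
    """Intersection over union of two [ymin, xmin, ymax, xmax] boxes."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def label_detections(detected_boxes, detected_scores, groundtruth_boxes,
                     iou_threshold=0.5):
    """Label each detection of one class as true (TP) or false (FP) positive.

    Detections are visited in descending score order. A detection is a TP if
    its best-overlapping ground-truth box reaches the IoU threshold and has
    not already been claimed by a higher-scoring detection.
    """
    order = sorted(range(len(detected_boxes)),
                   key=lambda i: detected_scores[i], reverse=True)
    gt_matched = [False] * len(groundtruth_boxes)
    scores, labels = [], []
    for index in order:
        overlaps = [iou(detected_boxes[index], gt) for gt in groundtruth_boxes]
        is_true_positive = False
        if overlaps:
            best = max(range(len(overlaps)), key=lambda j: overlaps[j])
            if overlaps[best] >= iou_threshold and not gt_matched[best]:
                gt_matched[best] = True
                is_true_positive = True
        scores.append(detected_scores[index])
        labels.append(is_true_positive)
    return scores, labels


# One ground-truth butterfly and three made-up detections.
groundtruth = [[0, 0, 10, 10]]
detections = [[0, 0, 10, 10], [0, 0, 10, 9], [20, 20, 30, 30]]
det_scores = [0.9, 0.8, 0.7]
sorted_scores, tp_fp = label_detections(detections, det_scores, groundtruth)
print(sorted_scores, tp_fp)
```

The second detection overlaps the ground truth well (IoU 0.9) but counts as a false positive because the first detection already claimed that box; duplicate detections are penalized this way.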