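I. Dataset format conversion
My dataset has 400 license-plate images in total: 360 for training and 40 for testing, stored in the train and test folders under D:\PythonNotebook\models-master\research\object_detection\images. LabelImg produces its annotations as XML files, so they first need to be converted to CSV and then to TFRecord format. The code for the XML-to-CSV conversion is as follows: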
import glob
import xml.etree.ElementTree as ET

import pandas as pd

# Folder holding the LabelImg XML files; use 'images/train' for the training set.
path = 'images/test'


def xml_to_csv(path):
    """Collect one row per labelled object from every XML file in `path`."""
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            # Positional indexing relies on LabelImg's element order:
            # size = (width, height, depth); member[0] = name; member[4] = bndbox.
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def main():
    xml_df = xml_to_csv(path)
    xml_df.to_csv('data/plate_test_labels.csv', index=None)
    print('Successfully converted xml to csv.')


# Without this guard main() is never called when the script is run.
if __name__ == '__main__':
    main()
Save this as xml_to_csv.py in the D:\PythonNotebook\models-master\research\object_detection folder. The CSV-to-TFRecord conversion is done by generate_tfrecord.py, which is also saved under D:\PythonNotebook\models-master\research\object_detection:
import io
import os
from collections import namedtuple

import pandas as pd
import tensorflow as tf  # uses the TensorFlow 1.x API (tf.gfile, tf.python_io)
from PIL import Image

from object_detection.utils import dataset_util


# TODO: replace this with the label map.
def class_text_to_int(row_label):
    """Map a class name to its integer id in plate.pbtxt."""
    if row_label == 'plate':
        return 1
    # Fail loudly instead of silently returning None for an unknown label.
    raise ValueError('Unknown label: {}'.format(row_label))


def split(df, group):
    """Group the CSV rows by `group` so that each image yields one record."""
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    """Build one tf.train.Example from an image file and its bounding boxes."""
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        # Box coordinates are stored normalized to the range [0, 1].
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main():
    output_path = 'data/test.record'
    writer = tf.python_io.TFRecordWriter(output_path)
    path = 'images/test'
    examples = pd.read_csv('data/plate_test_labels.csv')
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())

    writer.close()
    print('Successfully created the TFRecords: {}'.format(output_path))


if __name__ == '__main__':
    main()
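Then simply run the two scripts in turn. I put both the CSV files and the TFRecord files under object_detection/data. Note that the code above only converts the test set; for the training set, just change every test in the code to train.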
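Rather than editing test to train by hand, the split name can be passed as a parameter. The following is only a minimal sketch of that idea, not the script used in this post: the function names voc_xml_to_rows and convert_split and the parameters image_root and data_dir are made up for illustration, it writes the CSV with the standard csv module instead of pandas, and it looks elements up by tag name (filename, size, object, name, bndbox) on the assumption that the XML files follow the usual Pascal VOC layout that LabelImg writes.

```python
import csv
import glob
import os
import xml.etree.ElementTree as ET

COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def voc_xml_to_rows(xml_dir):
    """Return one row per <object> found in the XML files of xml_dir."""
    rows = []
    for xml_file in sorted(glob.glob(os.path.join(xml_dir, "*.xml"))):
        root = ET.parse(xml_file).getroot()
        size = root.find("size")
        for obj in root.findall("object"):
            box = obj.find("bndbox")
            # Look elements up by tag name instead of by position.
            rows.append([
                root.findtext("filename"),
                int(size.findtext("width")),
                int(size.findtext("height")),
                obj.findtext("name"),
                int(box.findtext("xmin")),
                int(box.findtext("ymin")),
                int(box.findtext("xmax")),
                int(box.findtext("ymax")),
            ])
    return rows


def convert_split(split, image_root="images", data_dir="data"):
    """Write <data_dir>/plate_<split>_labels.csv and return its path."""
    rows = voc_xml_to_rows(os.path.join(image_root, split))
    os.makedirs(data_dir, exist_ok=True)
    csv_path = os.path.join(data_dir, "plate_{}_labels.csv".format(split))
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    return csv_path
```

Calling convert_split("train") and then convert_split("test") from the object_detection folder would produce both CSV files in one run; generate_tfrecord.py could take the split name in the same way.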
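II. Choosing the model and the config file
On the TensorFlow object detection models page, pick your model; I used ssd_mobilenet_v1_coco.config here. Click ssd_mobilenet_v1_coco.config to open the code view and copy its contents. Under D:\PythonNotebook\models-master\research\object_detection create a new folder named training, create a new text file inside it named ssd_mobilenet_v1_coco.config, paste the copied contents into it, and make the following changes: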
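1. Change input_path to the paths of your own training and test TFRecord files (data/train.record and data/test.record here); be careful not to swap test and train.
2. Change num_classes to match your data; in this post it is 1.
3. Change batch_size from the original 24 to 1 or 2, because of my computer's limited memory.
4. Delete the following two lines (training then starts from scratch rather than from the pretrained checkpoint):
fine_tune_checkpoint: "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt"
from_detection_checkpoint: true
5. In eval_config, change num_examples to 40, the number of samples in your test set.
6. In the object_detection/data folder, make a copy of any .pbtxt file, name it plate.pbtxt, and replace its contents with the following: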
item {
name: "plate"
id: 1
}
This gives the label name and its corresponding number. Set label_map_path to the path of plate.pbtxt; here it is data/plate.pbtxt.
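The label map and the class_text_to_int function in generate_tfrecord.py have to agree on the ids, and the ids start at 1 because 0 is reserved for the background class. As a hedged sketch, both can be derived from a single list of class names; make_label_map and the two-argument class_text_to_int below are hypothetical helpers written for this illustration and are not part of the Object Detection API.

```python
def make_label_map(class_names):
    """Return label-map text with ids assigned from 1 in list order."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append('item {{\n  name: "{}"\n  id: {}\n}}\n'.format(name, index))
    return "\n".join(items)


def class_text_to_int(label, class_names):
    """Return the 1-based id that make_label_map assigns to `label`."""
    # list.index raises ValueError for a label that is not in the list.
    return class_names.index(label) + 1


CLASS_NAMES = ["plate"]
label_map_text = make_label_map(CLASS_NAMES)
```

Writing label_map_text to data/plate.pbtxt would reproduce the single-item file shown above, and adding a second class later only means extending CLASS_NAMES.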
The complete config file is as follows:
# SSD with Mobilenet v1 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 2
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "data/plate.pbtxt"
}

eval_config: {
  num_examples: 40
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"
  }
  label_map_path: "data/plate.pbtxt"
  shuffle: false
  num_readers: 1
}
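Since a swapped train/test path or a forgotten num_classes only shows up once training is under way, it can help to read the edited fields back before starting. The sketch below pulls them out of the config text with regular expressions; read_config_fields is a hypothetical helper written for this post, and it deliberately avoids the Object Detection API's own config parser so that it runs with the standard library alone. It assumes each field appears as a plain name: value pair, as in the file above.

```python
import re


def read_config_fields(config_text):
    """Extract the fields edited in this post from pipeline config text."""
    # Drop '#' comments so commented-out settings are not picked up.
    text = re.sub(r"#.*", "", config_text)

    def first_int(name):
        match = re.search(r"\b{}:\s*(\d+)".format(re.escape(name)), text)
        return int(match.group(1)) if match else None

    return {
        "num_classes": first_int("num_classes"),
        "batch_size": first_int("batch_size"),
        "num_examples": first_int("num_examples"),
        # Paths are listed in file order: train reader first, eval reader second.
        "input_paths": re.findall(r'\binput_path:\s*"([^"]+)"', text),
        "label_map_paths": re.findall(r'\blabel_map_path:\s*"([^"]+)"', text),
        "has_fine_tune_checkpoint": re.search(r"\bfine_tune_checkpoint\s*:", text) is not None,
    }
```

For the config in this post, the expected result would be num_classes 1, batch_size 2, num_examples 40, input paths data/train.record followed by data/test.record, and has_fine_tune_checkpoint False.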
Next comes training the model.
III. Training the model
Open Anaconda Prompt, change to the object_detection folder (mine is D:\PythonNotebook\models-master\research\object_detection), and enter the following command:
python model_main.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config
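The command uses paths relative to the object_detection folder, so a missing file usually means the prompt is in the wrong directory or an earlier step was skipped. As a rough sketch, the files this post relies on can be checked up front; missing_files and REQUIRED_FILES are illustrative names, and the list simply mirrors the paths used above.

```python
import os

# Paths used in this post, relative to the object_detection folder.
REQUIRED_FILES = [
    "model_main.py",
    os.path.join("training", "ssd_mobilenet_v1_coco.config"),
    os.path.join("data", "train.record"),
    os.path.join("data", "test.record"),
    os.path.join("data", "plate.pbtxt"),
]


def missing_files(base_dir, required=REQUIRED_FILES):
    """Return the required paths that do not exist under base_dir."""
    return [p for p in required if not os.path.isfile(os.path.join(base_dir, p))]
```

Calling missing_files(".") from the object_detection folder should return an empty list before training is started.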
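I ran into quite a few problems along the way and looked up various fixes online; they are summarized below:
1. No module named pycocotools
The original pycocotools does not support Windows, but someone has adapted the code to work on Windows (link here). Download and unzip it, open Anaconda Prompt, change to cocoapi-master\PythonAPI under the unzipped path, and run python setup.py install to install it.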
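Whether the installation worked can be checked without starting a training run. A small sketch using the standard library follows; is_installed is a hypothetical helper name, and it only tests that a top-level module can be found in the active environment, not that it imports without errors.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located in this environment."""
    return importlib.util.find_spec(module_name) is not None
```

Running is_installed("pycocotools") in the same Anaconda environment used for training should return True once the install has succeeded.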
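2. TypeError: can't pickle dict_values objects
Find the object_detection folder that was copied into the Anaconda installation path earlier (mine is D:\Anaconda3\Lib\site-packages\object_detection), open model_lib.py, and modify the code:
Wrap category_index.values() in list(), that is, write list(category_index.values()).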
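The reason the fix works is that in Python 3 dict.values() returns a view object, which cannot be pickled, whereas a list of the same items can. The snippet below reproduces this outside TensorFlow; the category_index contents are a made-up stand-in for the real dictionary in model_lib.py, and can_pickle is a helper written only for this demonstration.

```python
import pickle

# Stand-in for the dictionary built from the label map.
category_index = {1: {"id": 1, "name": "plate"}}


def can_pickle(obj):
    """Return True if obj can be serialized with pickle."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


view_ok = can_pickle(category_index.values())        # dict view
list_ok = can_pickle(list(category_index.values()))  # plain list
```

Here view_ok is False and list_ok is True, which matches the behaviour seen before and after editing model_lib.py.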
Running python model_main.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config again, training now starts.
Parts of this post draw on the blogger dy_guox, to whom I am grateful.