Annotate the images with labelImg to get XML files, convert the XML to CSV, then convert the CSV to TFRecord. It just takes running a few scripts.
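As an illustration of the XML-to-CSV step, here is a minimal sketch that parses one labelImg (Pascal VOC style) annotation and emits one CSV row per bounding box. The sample annotation, the class names tv and vehicle, and the column layout are assumptions made for the example; the conversion scripts you actually run may use a different layout.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical labelImg (Pascal VOC) annotation, inlined so the sketch is self-contained.
SAMPLE_XML = """<annotation>
  <filename>img_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>tv</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
  <object>
    <name>vehicle</name>
    <bndbox><xmin>300</xmin><ymin>50</ymin><xmax>600</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""

# Assumed column layout for the CSV file.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def xml_to_rows(xml_text):
    """Return one row per <object> element of a Pascal VOC style annotation."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append([
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ])
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


rows = xml_to_rows(SAMPLE_XML)
print(rows_to_csv(rows))
```

In practice you would loop over every .xml file in the annotation folder and concatenate the rows before writing a single CSV.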
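Set up the configuration file
Look for a sample configuration file in the Object Detection GitHub repository (the samples/configs folder).
If you downloaded ssd_mobilenet_v1_coco_2017_11_17.tar.gz, find ssd_mobilenet_v1_coco.config, open it, and make the following changes: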
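- PATH_TO_BE_CONFIGURED: change every occurrence to your own paths
- num_classes: the number of classes (not counting the background)
- batch_size
- If you train from scratch on your own data, delete the following two lines:
fine_tune_checkpoint: "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt"
from_detection_checkpoint: true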
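The edits listed above can also be applied with a small script instead of by hand. The sketch below works on the config as plain text; patch_config and the SAMPLE fragment are hypothetical names invented for the example, and the regular expressions assume each field sits on its own line, as in the sample configs.

```python
import re

# Stand-in fragment of a pipeline config, only for demonstrating the edits.
SAMPLE = """model {
  ssd {
    num_classes: 90
  }
}
train_config: {
  batch_size: 24
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  from_detection_checkpoint: true
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/train.record"
  }
}"""


def patch_config(text, data_dir, num_classes, batch_size, from_scratch=True):
    """Apply the edits from the list above to the text of a pipeline config."""
    text = text.replace("PATH_TO_BE_CONFIGURED", data_dir)
    text = re.sub(r"num_classes:\s*\d+", "num_classes: %d" % num_classes, text)
    text = re.sub(r"batch_size:\s*\d+", "batch_size: %d" % batch_size, text)
    if from_scratch:
        # Drop the two fine-tuning lines so training does not load a checkpoint.
        pattern = r"\s*(fine_tune_checkpoint|from_detection_checkpoint)\s*:"
        kept = [line for line in text.splitlines() if not re.match(pattern, line)]
        text = "\n".join(kept)
    return text


patched = patch_config(SAMPLE, data_dir="data", num_classes=2, batch_size=1)
print(patched)
```

Working on the raw text keeps the sketch free of any TensorFlow dependency; it does not validate that the result is still a well-formed config.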
# SSD with Mobilenet v1 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 2  # does not include the background class
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
          anchorwise_output: true
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          anchorwise_output: true
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 1
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  # Training from scratch on our own data, so these two lines are removed (commented out here).
  #fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  #from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"  # TFRecord file of the training set
  }
  label_map_path: "data/tv_vehicle_detection.pbtxt"  # label file
}

eval_config: {
  num_examples: 4
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"  # TFRecord file of the test set
  }
  label_map_path: "data/tv_vehicle_detection.pbtxt"  # label file
  shuffle: false
  num_readers: 1
  num_epochs: 1
}
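The note in the config says that with num_steps: 200000 the learning rate never decays. A quick sketch of the schedule makes this concrete. The function below is a plain-Python illustration, not the library's code, and it assumes a staircase schedule (the rate drops only at whole multiples of decay_steps), which is what the config comment implies.

```python
def exponential_decay(step, initial_learning_rate=0.004, decay_steps=800720,
                      decay_factor=0.95, staircase=True):
    """Learning rate at a given step for the values used in the config above."""
    if staircase:
        # Assumed behavior: the exponent only increases at multiples of decay_steps.
        exponent = step // decay_steps
    else:
        exponent = step / decay_steps
    return initial_learning_rate * decay_factor ** exponent


for step in (0, 199999, 800720):
    print(step, exponential_decay(step))
```

Because 200000 is smaller than decay_steps (800720), every training step uses the initial rate of 0.004. To see any decay within 200K steps you would have to lower decay_steps.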
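The label file referenced by label_map_path is one you create yourself, e.g. detection.pbtxt. It contains one item entry per class, each with an id and a name.
Note that the id numbers must match the ones used in the CSV file.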
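A label file in this format can be generated from a list of class names. The sketch below is one way to do it; build_label_map is a hypothetical helper, and the class names tv and vehicle are guesses based on the file name tv_vehicle_detection.pbtxt used in the config. Ids start at 1 because id 0 is reserved for the background.

```python
def build_label_map(class_names):
    """Return label map text with one item entry per class, ids starting at 1."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))
    return "\n".join(items)


# Hypothetical class names; replace them with the classes in your CSV file.
label_map_text = build_label_map(["tv", "vehicle"])
print(label_map_text)

# To save it next to the config, something like:
# with open("detection.pbtxt", "w") as f:
#     f.write(label_map_text)
```

The order of the list decides the ids, so keep it identical to the mapping used when the TFRecord files were generated.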
Train the model
Navigate to the models\research folder (the parent of object_detection; the paths in the command below are relative to it):
python object_detection/model_main.py \
--pipeline_config_path=object_detection/training/ssd_mobilenet_v1_coco.config \
--model_dir=object_detection/training \
--num_train_steps=50000 \
--num_eval_steps=2000 \
--alsologtostderr
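It does not matter if training is interrupted: run the Python command above again and it resumes from the last checkpoint.
Visualization
tensorboard --logdir=<path to the training folder>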
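Export the model as a .pb file
Run the export_inference_graph.py script. For --trained_checkpoint_prefix, give a checkpoint from your own training run, choosing the one with the highest step count (for example training/model.ckpt-31012). For --output_directory, give the name of the output folder (for example results).
python export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path training/ssd_mobilenet_v1_coco.config \
--trained_checkpoint_prefix training/model.ckpt-31012 \
--output_directory results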
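After it finishes, you will find a number of files in the results folder: saved_model, checkpoint, frozen_inference_graph.pb, and so on. The file ending in .pb is the most important one, the frozen model; the demo we tested with at the very beginning also used a frozen model.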
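Picking the checkpoint with the highest step count can be automated. The sketch below scans a training folder for files named model.ckpt-N.index and returns the prefix with the largest N; latest_checkpoint_prefix is a hypothetical helper, and the demo uses empty stand-in files in a temporary folder rather than real checkpoints.

```python
import os
import re
import tempfile


def latest_checkpoint_prefix(train_dir):
    """Return the model.ckpt-N prefix with the largest N in train_dir, or None."""
    best_step, best_prefix = -1, None
    for name in os.listdir(train_dir):
        match = re.match(r"(model\.ckpt-(\d+))\.index$", name)
        if match and int(match.group(2)) > best_step:
            best_step = int(match.group(2))
            best_prefix = os.path.join(train_dir, match.group(1))
    return best_prefix


# Demo on a throwaway folder containing empty stand-in checkpoint files.
demo_dir = tempfile.mkdtemp()
for step in (0, 9000, 31012):
    open(os.path.join(demo_dir, "model.ckpt-%d.index" % step), "w").close()

print(latest_checkpoint_prefix(demo_dir))
```

The returned prefix is what you would pass to --trained_checkpoint_prefix. Comparing the step numbers as integers matters here, since a plain string sort would rank model.ckpt-9000 above model.ckpt-31012.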
Test the model and write the results
# -*- coding: utf-8 -*-
# Slightly modified from the tutorial and turned into a .py file.
import time
start = time.time()
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import cv2

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
import pandas as pd

# Compare the version numerically; comparing the strings would rank '1.10' below '1.4'.
if tuple(int(part) for part in tf.__version__.split('.')[:2]) < (1, 4):
    raise ImportError('Please upgrade your tensorflow installation to v1.4.* or later!')

# Change the current working directory to the given path.
# By default the Python interpreter searches the current directory, all installed
# built-in modules, and third-party modules.
# sys.path returns the module search path list, something like:
# [
#   'F:\\', 'F:\\測試自己的圖像識別模型.py',
#   'D:\\anaconda\\anaconda3.4.2.0\\python35.zip',
#   'D:\\anaconda\\anaconda3.4.2.0\\DLLs',
#   'D:\\anaconda\\anaconda3.4.2.0\\lib'
# ]
os.chdir(r'E:/object_detection/')

# '..' or '../' is the parent directory; this adds the parent directory to sys.path.
# '.' or './' is the current directory.
sys.path.append("..")

from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

# Model preparation
# What model to download.
#MODEL_NAME = 'tv_vehicle_inference_graph'
#MODEL_NAME = 'tv_vehicle_inference_graph_fasterCNN'
# Name of the folder generated by the export step (must match its --output_directory).
MODEL_NAME = 'result'
#MODEL_NAME = 'ssd_mobilenet_v1_coco_2017_11_17'  #[30,21] best
#MODEL_NAME = 'ssd_inception_v2_coco_2017_11_17'  #[42,24]
#MODEL_NAME = 'faster_rcnn_inception_v2_coco_2017_11_08'  #[58,28]
#MODEL_NAME = 'faster_rcnn_resnet50_coco_2017_11_08'  #[89,30]
#MODEL_NAME = 'faster_rcnn_resnet50_lowproposals_coco_2017_11_08'  #[64, ]
#MODEL_NAME = 'rfcn_resnet101_coco_2017_11_08'  #[106,32]

#PATH_TO_CKPT = 'result' + '/<name of the .pb model file you generated>'
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# The label file is in the training folder; this script sits at the same level as that folder.
PATH_TO_LABELS = os.path.join('training', 'detection.pbtxt')

# Number of object classes
NUM_CLASSES = 2

# Load the trained graph into memory (no need to change).
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# Load the label map (no need to change).
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)


def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    # getdata() returns 2-D data of shape (width*height, 3), e.g.
    # [[1 2 3]
    #  [3 4 5]]
    # so reshape it into 3 dimensions.
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)


# Folder that CONTAINS the image folder: test_images sits inside PATH_TO_TEST_IMAGES_DIR.
# This script and PATH_TO_TEST_IMAGES_DIR are in the same directory, both under object_detection/.
# Use an absolute path for this placeholder, because the working directory changes below.
PATH_TO_TEST_IMAGES_DIR = 'PATH_TO_TEST_IMAGES_DIR'

# Make PATH_TO_TEST_IMAGES_DIR the current working directory.
os.chdir(PATH_TO_TEST_IMAGES_DIR)

# os.listdir() returns the names (with extensions) of the files and folders in the given folder.
# The order of the list is arbitrary, and it never includes '.' and '..'.
# Here the list has a single element: the test_images folder.
TEST_IMAGE_DIRS = os.listdir(PATH_TO_TEST_IMAGES_DIR)

# Size of the output image, in inches
IMAGE_SIZE = (12, 8)

# Where the images with drawn boxes are written; this folder is at the same level as test_images.
# (The original ended the string with a single backslash, which escapes the closing quote.)
output_image_path = "PATH_TO_TEST_IMAGES_DIR\\"
# Additionally, the coordinates of the detected boxes are saved as a .csv table.
output_csv_path = "\\path\\to\\the\\output\\csv\\folder\\"

# image_folder: the test_images folder
for image_folder in TEST_IMAGE_DIRS:
    with detection_graph.as_default():
        with tf.Session(graph=detection_graph) as sess:
            # Definite input and output Tensors for detection_graph
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular object was detected.
            detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Each score represents the level of confidence for each of the objects.
            # Score is shown on the result image, together with the class label.
            detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
            detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')

            # image_folder is relative to the current directory; listdir gives the
            # file names of all test images in it.
            TEST_IMAGE_PATHS = os.listdir(os.path.join(image_folder))
            # Create a new folder PATH_TO_TEST_IMAGES_DIR\draw_test_images
            os.makedirs(output_image_path + 'draw_' + image_folder)
            data = pd.DataFrame()
            for image_path in TEST_IMAGE_PATHS:
                image = Image.open(image_folder + '//' + image_path)
                width, height = image.size
                # the array based representation of the image will be used later in order
                # to prepare the result image with boxes and labels on it.
                image_np = load_image_into_numpy_array(image)  # [height, width, 3]
                # Add a dimension at axis=0, i.e. one more outer bracket:
                # shape = [1, height, width, 3], because the model expects 4-D input.
                image_np_expanded = np.expand_dims(image_np, axis=0)
                # Run the detection and fetch the results.
                (boxes, scores, classes, num) = sess.run(
                    [detection_boxes, detection_scores, detection_classes, num_detections],
                    feed_dict={image_tensor: image_np_expanded})
                # Visualization
                # vis_util is a module of the object_detection package.
                vis_util.visualize_boxes_and_labels_on_image_array(
                    image_np,
                    np.squeeze(boxes),
                    np.squeeze(classes).astype(np.int32),
                    np.squeeze(scores),
                    category_index,
                    use_normalized_coordinates=True,
                    line_thickness=8)
                # Save the image with the detections drawn on it into draw_test_images.
                # image_np is RGB (it comes from PIL) while cv2.imwrite expects BGR, so convert first.
                cv2.imwrite(
                    output_image_path + 'draw_' + image_folder + '\\' + image_path.split('\\')[-1],
                    cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))

                s_boxes = boxes[scores > 0.5]
                s_classes = classes[scores > 0.5]
                s_scores = scores[scores > 0.5]

                # write table
                # Save the box coordinates to the .csv table, one row per detection.
                for i in range(len(s_classes)):
                    newdata = pd.DataFrame([[
                        image_path.split("\\")[-1].split('.')[0],
                        s_boxes[i][0] * height,  # ymin
                        s_boxes[i][1] * width,   # xmin
                        s_boxes[i][2] * height,  # ymax
                        s_boxes[i][3] * width,   # xmax
                        s_scores[i],
                        s_classes[i],
                    ]])
                    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0.
                    data = pd.concat([data, newdata])
            data.to_csv(output_csv_path + image_folder + '.csv', index=False)

end = time.time()
print("Execution Time: ", end - start)
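Open eval_util.py in the object_detection folder. In the visualize_detection_results function, min_score_thresh=.5 can be lowered so that boxes with lower scores are also output, and max_num_predictions=20 sets the maximum number of boxes output.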
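The effect of these two parameters can be illustrated without TensorFlow. The sketch below is not the code in eval_util.py; it is a plain-Python stand-in with made-up detections that filters by score in the same way as the scores > 0.5 test in the script above, caps the number of results, and converts a normalized [ymin, xmin, ymax, xmax] box to pixels.

```python
def select_detections(boxes, scores, classes, min_score_thresh=0.5,
                      max_num_predictions=20):
    """Keep the highest-scoring detections whose score exceeds the threshold."""
    ranked = sorted(zip(scores, boxes, classes), key=lambda d: d[0], reverse=True)
    kept = [d for d in ranked if d[0] > min_score_thresh]
    return kept[:max_num_predictions]


def to_pixels(box, width, height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel coordinates."""
    ymin, xmin, ymax, xmax = box
    return (ymin * height, xmin * width, ymax * height, xmax * width)


# Made-up detections for a 640x480 image.
boxes = [[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0], [0.25, 0.25, 0.75, 0.75]]
scores = [0.9, 0.3, 0.6]
classes = [1, 2, 1]

default_kept = select_detections(boxes, scores, classes)
lenient_kept = select_detections(boxes, scores, classes, min_score_thresh=0.2)
print(len(default_kept), len(lenient_kept))
print(to_pixels(boxes[2], width=640, height=480))
```

Lowering the threshold from 0.5 to 0.2 lets the third, low-confidence box through, which is the behavior described above; max_num_predictions then limits how many of the ranked boxes survive.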