[Project Practice] Training YOLOv5 on Your Own Dataset

This article is a repost. 2021-03-02 09:28. Tags: YOLOv5 / Project Practice / Object Detection


Source code: https://github.com/ultralytics/yolov5

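The previous section covered the setup and basic usage of YOLOv5. This section focuses on solving a real problem in your own scenario and takes a closer look at how the code is organized.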

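Source Structure

The source tree of the PyTorch version of YOLOv5 is laid out as follows.

train.py / test.py / detect.py:
- train.py: training
- test.py: evaluation with the COCO metrics
- detect.py: runs detection on a batch of images and saves the annotated result images
data folder: contains the two bundled sample images used for verification, plus configuration such as the training data paths, the number of classes, and the class names; it is comparable to voc.data in Darknet
models folder: contains the YAML files that define the network structures and the scripts that implement the networks
runs folder: contains the results of every run of detect.py / test.py / train.py
utils folder: contains the helper scripts needed for training, validation, and testing
weights folder: contains the .sh script that downloads the pretrained weights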
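
As a rough illustration of the dataset configuration kept in the data folder, the sketch below builds the text of a small dataset YAML file. The keys train, val, nc, and names follow the dataset YAML files shipped with the YOLOv5 repository; the helper build_data_yaml, the relative paths, and the class names YY1 / YY2 are placeholders chosen for this example, not part of the original post.

```python
def build_data_yaml(train_dir, val_dir, class_names):
    """Return the text of a minimal YOLOv5-style dataset YAML (illustrative only)."""
    lines = [
        f"train: {train_dir}",              # folder with the training images
        f"val: {val_dir}",                  # folder with the validation images
        f"nc: {len(class_names)}",          # number of classes
        f"names: {list(class_names)}",      # class names, in class-index order
    ]
    return "\n".join(lines) + "\n"


# Hypothetical paths and class names, used only for this example.
yaml_text = build_data_yaml(
    "COCOdevkit/COCO2022/images/train2022",
    "COCOdevkit/COCO2022/images/val2022",
    ["YY1", "YY2"],
)
print(yaml_text)
```

The resulting text can be saved under the data folder (for example as a file named custom.yaml, a hypothetical name) and passed to train.py as the dataset configuration.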

Overall Workflow

Similar to configuring Darknet YOLOv3/v4, training the PyTorch version of YOLOv5 on your own dataset involves the following steps.

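Preparing Your Own Dataset

The dataset is laid out in much the same way as for Darknet: everything is placed under a single COCOdevkit folder (named this way because the annotation files are JSON rather than VOC-style XML). The rough structure is shown in the figure below.

The left panel shows the basic structure of the dataset and the files generated during preparation:
- the split images (images)
- the split labels (labels, in txt format)
- the summary txt files
- the generated anchors
The middle panel shows the directory structure of COCOdevkit.
The right panel shows the scripts needed during preparation:
- remove_img_without_jsonlabel.py: removes images that have no annotation file, and gathers the images and annotation files scattered across different folders into one folder each;
- show_labels.py: prints the labels that appear in the annotation files. It is meant for datasets with many label types, for example label names that also encode a state and therefore need to be summarized; with only a few label types this script can be skipped;
- create_txt.py: splits the dataset into training, validation, and test sets and generates the corresponding txt files (one txt per image plus the summary txt files for training, validation, and testing), including the label-to-class mapping;
- kmeans.py: the anchor clustering script from Darknet YOLOv3. It needs the summary txt file to have been generated already.

The overall workflow is similar to that of Darknet YOLOv3/v4, as shown in the figure below: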

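remove_img_without_jsonlabel.py

During annotation, images and annotation files may end up spread across different folders, and some images may contain no target at all (so no annotation file exists for them). The images and annotation files therefore need to be gathered into their own directories, and images without an annotation file need to be removed, so that images and annotations match one to one.

Before running the code below, change the source image path path, the destination image path move_path_img, and the destination annotation path move_path_anno: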

remove_img_without_jsonlabel.py

# -*- coding: utf-8 -*-
# @Time : 2021/1/15 9:06
# @Author : smw
# @Site : jnsenter
# @File : remove_img_without_jsonlabel.py
# @Software: PyCharm

# Purpose: 1. Keep images and annotations consistent by deleting images without an annotation.
#          2. Gather images and annotation files from different folders into one folder each.

import os
import shutil


def image_remove(image_file_path, delete_num=0, residue_num=0):
    # Raw strings keep the backslashes in Windows paths from being read as escape sequences.
    move_path_img = r"E:\image_dataset\yaban_train_images\JPEGImages"
    move_path_anno = r"E:\image_dataset\yaban_train_images\Annotations"
    os.makedirs(move_path_img, exist_ok=True)
    os.makedirs(move_path_anno, exist_ok=True)

    # os.walk already descends into every subdirectory, so no manual recursion is needed.
    for root, _dirs, file_names in os.walk(image_file_path):
        for file_name in file_names:
            if not file_name.endswith(".jpg"):   # skip non-jpg files
                continue
            image_path = os.path.join(root, file_name)
            print("Process on {}".format(image_path))
            json_path = os.path.splitext(image_path)[0] + ".json"
            if not os.path.isfile(json_path):   # no annotation file: delete the image
                os.remove(image_path)
                delete_num += 1
            else:    # annotation exists: move both files into the collection folders
                shutil.move(image_path, os.path.join(move_path_img, file_name))
                shutil.move(json_path, os.path.join(move_path_anno, os.path.basename(json_path)))
                residue_num += 1
    return delete_num, residue_num


if __name__ == '__main__':
    path = r"E:\image_dataset\yaban"
    delete_num, residue_num = image_remove(path)
    print(delete_num, residue_num)
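
Because the script above deletes files, it can help to list the affected images first. The following is a minimal dry-run sketch, not part of the original script: the helper find_unlabeled_images is a hypothetical name, and it assumes, like the script, that each annotation sits next to its image and differs only in the .json extension.

```python
from pathlib import Path
import tempfile


def find_unlabeled_images(root):
    """Return the .jpg files under root that have no .json file of the same name beside them."""
    root = Path(root)
    return sorted(p for p in root.rglob("*.jpg") if not p.with_suffix(".json").is_file())


# Self-contained demo in a temporary directory (nothing is deleted or moved).
with tempfile.TemporaryDirectory() as tmp:
    sub = Path(tmp) / "batch1"
    sub.mkdir()
    (sub / "a.jpg").touch()
    (sub / "a.json").write_text("{}", encoding="utf-8")
    (sub / "b.jpg").touch()   # image without an annotation file
    unlabeled = [p.name for p in find_unlabeled_images(tmp)]
    print(unlabeled)
```

Running the helper on the real image folder shows which images would be deleted before anything is changed on disk.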

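show_labels.py

This script is useful in the following cases:

The annotation files contain many label types that need to be summarized;
Only the classes to be trained need to be extracted from the annotation files (in real projects, all objects in an image are usually annotated in one pass to avoid rework, so the annotations may include objects that are not needed for the current training run).

It assumes that the prefixes of the target label names are already known.

Change the JSON directory json_path and the list of label prefixes to filter by, candidate_label.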

import json
import os

# This script prints the label names that appear in the JSON annotation files.
# Labels are sometimes added ad hoc during annotation, so the complete set of
# label names is not known in advance.
# Once the label names are known, move on to create_txt.py to generate the labels.
if __name__ == "__main__":
    json_path = "/home/smw/Project/yolov5/COCOdevkit/COCO2021/labels_json"
    json_dir = [os.path.join(json_path, json_name) for json_name in os.listdir(json_path)]

    # Label prefixes to keep
    candidate_label = ['XX1', 'XX2']

    # Distinct labels that match one of the prefixes
    label_list = []

    # Read every JSON file and collect its labels
    for js in json_dir:
        with open(js, "r", encoding="utf-8") as f:
            json_info = json.load(f)

        shapes_info = json_info.get("shapes") or []
        for label in shapes_info:
            label_name = label.get("label")
            # Skip labels that match none of the candidate prefixes
            if not any(label_name.startswith(m) for m in candidate_label):
                continue
            # Otherwise record the label once
            if label_name not in label_list:
                label_list.append(label_name)
    print(label_list)
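
When deciding which labels to train on, the number of boxes per label is often as useful as the list of names. The sketch below is a small variation on the script above that counts occurrences with collections.Counter. The helper count_labels and the sample label XX1_open are hypothetical, and the JSON is passed in as strings so that the example runs without any files.

```python
import json
from collections import Counter


def count_labels(json_texts, prefixes):
    """Count the labels whose name starts with any of the given prefixes."""
    counts = Counter()
    for text in json_texts:
        info = json.loads(text)
        for shape in info.get("shapes") or []:
            name = shape.get("label", "")
            if name.startswith(tuple(prefixes)):
                counts[name] += 1
    return counts


# Two made-up annotation files with only the fields this sketch needs.
samples = [
    '{"shapes": [{"label": "XX1_open"}, {"label": "XX2"}, {"label": "other"}]}',
    '{"shapes": [{"label": "XX1_open"}]}',
]
label_counts = count_labels(samples, ["XX1", "XX2"])
print(label_counts.most_common())
```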

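create_txt.py

YOLOv5 training needs txt files generated from the JSON annotations: one txt file per image, plus summary txt files for the training, validation, and test sets.

Example of a per-image txt file

Each line contains the class index followed by the box center x, y and the box width and height w, h, all expressed relative to the image size.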
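
To make the per-image format concrete, here is a small worked example of the normalization. It is a sketch under the assumption of 0-based pixel coordinates; the helper to_yolo_box and the sample numbers are made up for illustration.

```python
def to_yolo_box(img_w, img_h, x1, y1, x2, y2):
    """Convert two pixel corners to normalized (x_center, y_center, width, height)."""
    x_min, x_max = min(x1, x2), max(x1, x2)
    y_min, y_max = min(y1, y2), max(y1, y2)
    xc = (x_min + x_max) / 2.0 / img_w
    yc = (y_min + y_max) / 2.0 / img_h
    return xc, yc, (x_max - x_min) / img_w, (y_max - y_min) / img_h


# A 320x240 box centered in a 640x480 image, class index 0.
box = to_yolo_box(640, 480, 160, 120, 480, 360)
line = " ".join(["0"] + [f"{v:.6f}" for v in box])
print(line)   # 0 0.500000 0.500000 0.500000 0.500000
```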

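Example of a summary txt file

Each line contains the image name followed by one entry per box. An entry holds the top-left corner, the bottom-right corner, and the class index, separated by commas (x1, y1, x2, y2, classid); entries are separated from each other by spaces.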
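
The summary line format can be sketched as a pair of helpers that build and parse one line. Both function names and the sample file name img_001.jpg are hypothetical; the sketch also assumes that image names contain no spaces, since spaces separate the entries.

```python
def format_summary_line(image_name, boxes):
    """Build one summary line from (x1, y1, x2, y2, class_id) integer tuples."""
    parts = [",".join(str(v) for v in box) for box in boxes]
    return " ".join([image_name] + parts)


def parse_summary_line(line):
    """Split a summary line back into the image name and its list of boxes."""
    name, *fields = line.split()
    boxes = [tuple(int(v) for v in field.split(",")) for field in fields]
    return name, boxes


line = format_summary_line("img_001.jpg", [(10, 20, 110, 220, 0), (30, 40, 90, 100, 1)])
print(line)   # img_001.jpg 10,20,110,220,0 30,40,90,100,1
name, boxes = parse_summary_line(line)
```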

Images and JSON files that contain none of the target classes are filtered out.

create_txt.py

# -*- coding: utf-8 -*-
# @Time : 2021/1/15 16:06
# @Author : smw
# @Site : jnsenter
# @File : create_txt.py
# @Software: PyCharm

import json
import os
import sys
import shutil
import math
import numpy as np

from functools import reduce
import logging
import logging.handlers

# Purpose: split the dataset into training, validation, and test sets and generate the
# matching txt files for train{year}, val{year}, and test{year}.

classes = ['XX1', 'XX2']
sets = [('2022', 'train'), ('2022', 'val'), ("2022", "test")]
ratio = [0.8, 0.1, 0.1]   # split ratios for train / val / test; they must sum to 1
names_reflect = {"XX1": "YY1", "XX2": "YY2"}   # annotation label -> training class name


def initLog(logfile):
    logger = logging.getLogger()
    # Lowest level that gets logged (the default is WARNING)
    logger.setLevel(logging.INFO)
    file_handler = logging.handlers.TimedRotatingFileHandler(
        logfile, when='M', interval=1, backupCount=0, encoding="utf-8")
    # Output format of the log records
    formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
    file_handler.setFormatter(formatter)
    # Attach the handler to the logger
    logger.addHandler(file_handler)
    return logger


def convert(size, box):
    """Convert (xmin, xmax, ymin, ymax) in pixels to normalized (x_center, y_center, w, h)."""
    dw = 1. / size[0]
    dh = 1. / size[1]
    # The Darknet voc_label.py version subtracts 1 from the center because VOC XML
    # coordinates are 1-based. The JSON points used here are assumed to be 0-based,
    # so no offset is applied.
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)

def image_datasets_split(year, image_set, json_path, data, names_reflect, f_main, logger):
    """Copy the images of one subset into its folder and write the matching label txt files."""
    move_path_image = "/home/smw/Project/yolov5/COCOdevkit/COCO{}/images/{}".format(year, image_set+year)
    os.makedirs(move_path_image, exist_ok=True)
    save_label_path = "/home/smw/Project/yolov5/COCOdevkit/COCO{}/labels/{}".format(year, image_set+year)
    os.makedirs(save_label_path, exist_ok=True)
    no_object_num = 0
    main_info_list = []   # entries for the summary txt
    label_names = list(names_reflect.keys())
    for path in data:  # loop over the images of this subset
        image_name = os.path.split(path)[1]
        if not image_name.endswith(".jpg"):
            logger.warning("{} does not end with .jpg".format(path))
            print("{} does not end with .jpg".format(path))
            continue

        # Path of the per-image label txt
        save_txt_dir = os.path.join(save_label_path, image_name.replace(".jpg", ".txt"))
        # 1. Read the JSON annotation
        json_name = image_name.replace(".jpg", ".json")
        json_dir = os.path.join(json_path, json_name)
        # Check once more that the JSON file exists
        if not os.path.isfile(json_dir):
            logger.info("{} does not exist".format(json_dir))
            print("{} does not exist".format(json_dir))
            continue

        # 2. Filter the labels
        with open(json_dir, "r", encoding="utf-8") as f:
            json_info = json.load(f)
        shapes_info = json_info.get("shapes") or []

        info_num = 0
        info_list = []  # rows of the per-image label txt
        main_info_list_pick = [image_name]
        h = json_info.get("imageHeight")
        w = json_info.get("imageWidth")
        img_size = (w, h)

        for label_info in shapes_info:
            # Skip shapes that were drawn with the wrong tool but carry the same class name
            if label_info.get("shape_type") != 'rectangle':
                continue

            label_name = label_info.get("label")
            if label_name not in label_names:    # keep only the requested labels
                continue
            info_num += 1  # one more matching box
            label_class_id = label_names.index(label_name)
            points = label_info.get("points")
            # The two corners may be stored in either order, so sort them first.
            x_min, x_max = sorted((points[0][0], points[1][0]))
            y_min, y_max = sorted((points[0][1], points[1][1]))

            x, y, _w, _h = convert(img_size, [x_min, x_max, y_min, y_max])   # to normalized xywh
            info_list.append([label_class_id, x, y, _w, _h])
            main_info_list_pick.append(
                [str(math.floor(v)) for v in (x_min, y_min, x_max, y_max)] + [str(label_class_id)])

        if info_num > 0:  # targets found: copy the image and write its txt
            # Copy the image unless it is already in place
            if not os.path.isfile(os.path.join(move_path_image, image_name)):
                shutil.copy(path, os.path.join(move_path_image, image_name))
            # Write the per-image label txt
            if not os.path.isfile(save_txt_dir):
                with open(save_txt_dir, "w") as f_txt:
                    for write_info in info_list:
                        f_txt.write(" ".join(map(str, write_info)))
                        f_txt.write("\n")
        else:  # no target: neither copy the image nor write a txt, only log it
            no_object_num += 1
            logger.info("{} file does not have object!".format(image_name))
            print("{} file does not have object!".format(image_name))
        if len(main_info_list_pick) != 1:
            main_info_list.append(main_info_list_pick)  # collect the entry for this image
    # After all images are processed, write the summary txt of this subset
    main_txt_write(main_info_list, f_main)
    return no_object_num

def main_txt_write(info_list, f):
    """Write one line per image: the image name, then space-separated x1,y1,x2,y2,classid entries."""
    for info in info_list:
        image_name = info[0]
        bbox_list = [",".join(info_bbox) for info_bbox in info[1:]]
        f.write(image_name + " " + " ".join(bbox_list))
        f.write("\n")

if __name__ == "__main__":
    pwd = os.getcwd()
    print(pwd)
    logfile = os.path.join(pwd, "run.log")
    logger = initLog(logfile)

    image_path = "/home/smw/Project/yolov5/COCOdevkit/COCO2021/images_ori"
    json_path = "/home/smw/Project/yolov5/COCOdevkit/COCO2021/labels_json"

    main_txt_path = "/home/smw/Project/yolov5/COCOdevkit/COCO{}".format(sets[0][0])
    image_dir = [os.path.join(image_path, image_name) for image_name in os.listdir(image_path)]
    json_dir = [os.path.join(json_path, json_name) for json_name in os.listdir(json_path)]
    # The image and JSON listings do not line up by index, so only the image list drives the split.
    image_info = np.array(image_dir)
    # Check that the counts match
    image_num = len(image_dir)
    json_num = len(json_dir)
    if image_num != json_num:
        msg = ("The number of images is not equal to the number of json files, "
               "please execute remove_img_without_jsonlabel.py first")
        logger.error(msg)
        print(msg)
        sys.exit(1)   # images without a json file could be deleted here instead of exiting
    else:
        logger.info("The number of images is equal to the number of json files")
        logger.info("The number of images is {}, the number of json files is {}".format(image_num, json_num))
        print("The number of images is equal to the number of json files")
    # Split the dataset by image; a missing json file is handled later per image
    image_num_index = list(range(image_num))
    np.random.shuffle(image_num_index)
    train_end_num = int(image_num * ratio[0])
    val_end_num = int(image_num * (ratio[0] + ratio[1]))     # an empty val or test set is not handled

    train_info = image_info[image_num_index[:train_end_num]]
    val_info = image_info[image_num_index[train_end_num: val_end_num]]
    test_info = image_info[image_num_index[val_end_num:]]
    logger.info("train num: {}".format(train_info.shape[0]))
    logger.info("val num: {}".format(val_info.shape[0]))
    logger.info("test num: {}".format(test_info.shape[0]))
    print("train num: {}".format(train_info.shape[0]))
    print("val num: {}".format(val_info.shape[0]))
    print("test num: {}".format(test_info.shape[0]))
    # The three subsets together should add up to the total number of images
    print("The number of total split data is {}".format(train_info.shape[0]+val_info.shape[0]+test_info.shape[0]))
    no_object_list = []
    for path_info, data in zip(sets, (train_info, val_info, test_info)):
        year, image_set = path_info
        main_txt_dir = os.path.join(main_txt_path, image_set+year+".txt")
        with open(main_txt_dir, "w") as f:
            no_object_num = image_datasets_split(year, image_set, json_path, data, names_reflect, f, logger)
        no_object_list.append(no_object_num)
        print("{} finished!".format(image_set))
    print("No object Number:")
    print("train: {}".format(no_object_list[0]))
    print("val: {}".format(no_object_list[1]))
    print("test: {}".format(no_object_list[2]))
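
The split in the script is random on every run, so repeated runs put different images in each subset. If a repeatable split is wanted, one possible approach (an assumption of this example, not something the original script does) is to shuffle with a seeded generator. The helper split_indices and its seed parameter are hypothetical names.

```python
import numpy as np


def split_indices(n, ratio=(0.8, 0.1, 0.1), seed=0):
    """Shuffle range(n) reproducibly and cut it into train / val / test index arrays."""
    if abs(sum(ratio) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    rng = np.random.default_rng(seed)      # seeded generator: same order on every run
    order = rng.permutation(n)
    train_end = int(n * ratio[0])
    val_end = int(n * (ratio[0] + ratio[1]))
    return order[:train_end], order[train_end:val_end], order[val_end:]


train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))   # 80 10 10
```

The cut points are computed the same way as in the script, so the subset sizes match; only the source of randomness differs.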

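kmeans.py

This script optimizes the anchors; the kmeans.py from Darknet can be used as is. It needs the summary txt file to exist already, and the corresponding path has to be changed in the code, for example: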

if __name__ == "__main__":	
    cluster_number = 9
    filename = "/home/smw/Project/yolov5/COCOdevkit/COCO2022/train2022.txt"
    kmeans = YOLO_Kmeans(cluster_number, filename)
    kmeans.txt2clusters()

Note also that every YOLOv5 model, from the lightweight YOLOv5s to the large YOLOv5l, uses 9 anchors, unlike the tiny models of YOLOv3/v4, which use 6.
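
The nine clustered anchors eventually go into the anchors section of a YOLOv5 model YAML, which lists three rows of three width,height pairs, from the smallest anchors to the largest. The sketch below shows one way to format them. Sorting by area is a choice made for this example, the helper anchors_to_yaml_rows is a hypothetical name, and the sample values are the commonly used default COCO anchors, included only as input data.

```python
def anchors_to_yaml_rows(anchors):
    """Sort nine (w, h) anchors by area and format them as three YAML rows of three."""
    if len(anchors) != 9:
        raise ValueError("expected 9 anchors")
    ordered = sorted(anchors, key=lambda wh: wh[0] * wh[1])
    rows = []
    for i in range(0, 9, 3):
        flat = ",".join(f"{w},{h}" for w, h in ordered[i:i + 3])
        rows.append(f"  - [{flat}]")
    return "anchors:\n" + "\n".join(rows)


example = [(116, 90), (10, 13), (33, 23), (16, 30), (62, 45),
           (30, 61), (59, 119), (373, 326), (156, 198)]
yaml_rows = anchors_to_yaml_rows(example)
print(yaml_rows)
```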

kmeans.py

import numpy as np

class YOLO_Kmeans:
    def __init__(self, cluster_number, filename):
        self.cluster_number = cluster_number
        self.filename = filename

    def iou(self, boxes, clusters):  # 1 box -> k clusters
        n = boxes.shape[0]
        k = self.cluster_number

        box_area = boxes[:, 0] * boxes[:, 1]    # width * height of every box to be clustered
        box_area = box_area.repeat(k)     # repeat each area k times, once per cluster center
        box_area = np.reshape(box_area, (n, k))

        cluster_area = clusters[:, 0] * clusters[:, 1]
        cluster_area = np.tile(cluster_area, [1, n])
        cluster_area = np.reshape(cluster_area, (n, k))
        # Arrange the box widths and the cluster widths as n-by-k matrices and take the
        # element-wise minimum: the smaller width of each box and each of the k clusters.
        box_w_matrix = np.reshape(boxes[:, 0].repeat(k), (n, k))
        cluster_w_matrix = np.reshape(np.tile(clusters[:, 0], (1, n)), (n, k))
        min_w_matrix = np.minimum(cluster_w_matrix, box_w_matrix)
        # Do the same for the heights.
        box_h_matrix = np.reshape(boxes[:, 1].repeat(k), (n, k))
        cluster_h_matrix = np.reshape(np.tile(clusters[:, 1], (1, n)), (n, k))
        min_h_matrix = np.minimum(cluster_h_matrix, box_h_matrix)
        # Intersection area = smaller width * smaller height
        inter_area = np.multiply(min_w_matrix, min_h_matrix)

        result = inter_area / (box_area + cluster_area - inter_area)
        return result

    def avg_iou(self, boxes, clusters):
        accuracy = np.mean([np.max(self.iou(boxes, clusters), axis=1)])
        return accuracy

    def kmeans(self, boxes, k, dist=np.median):
        box_number = boxes.shape[0]
        distances = np.empty((box_number, k))
        last_nearest = np.zeros((box_number,))
        np.random.seed()
        clusters = boxes[np.random.choice(
            box_number, k, replace=False)]  # pick k boxes at random as the initial centers
        while True:

            distances = 1 - self.iou(boxes, clusters)

            current_nearest = np.argmin(distances, axis=1)
            if (last_nearest == current_nearest).all():
                break  # clusters won't change
            for cluster in range(k):
                members = boxes[current_nearest == cluster]
                if len(members) == 0:   # keep the old center if no box was assigned to it
                    continue
                # The center is updated with the median by default, not the mean
                clusters[cluster] = dist(members, axis=0)  # update clusters
            last_nearest = current_nearest

        return clusters

    def result2txt(self, data):
        # Write the anchors as "w,h, w,h, ..." on a single line
        with open("/home/smw/Project/yolov5/COCOdevkit/COCO2022/huikonggui2022.txt", 'w') as f:
            row = np.shape(data)[0]
            for i in range(row):
                if i == 0:
                    x_y = "%d,%d" % (data[i][0], data[i][1])
                else:
                    x_y = ", %d,%d" % (data[i][0], data[i][1])
                f.write(x_y)

    def txt2boxes(self):
        # Read the summary txt and return the width and height of every box
        dataSet = []
        with open(self.filename, 'r') as f:
            for line in f:
                infos = line.split()
                for info in infos[1:]:   # infos[0] is the image name
                    x1, y1, x2, y2 = [int(v) for v in info.split(",")[:4]]
                    dataSet.append([x2 - x1, y2 - y1])
        result = np.array(dataSet)
        return result

    def txt2clusters(self):
        all_boxes = self.txt2boxes()      # box widths and heights parsed from the txt
        result = self.kmeans(all_boxes, k=self.cluster_number)
        result = result[np.lexsort(result.T[0, None])]   # sort the anchors by width
        self.result2txt(result)
        print("K anchors:\n {}".format(result))
        print("Accuracy: {:.2f}%".format(
            self.avg_iou(all_boxes, result) * 100))

if  __name__ == "__main__":
    cluster_number = 9
    filename = "/home/smw/Project/yolov5/COCOdevkit/COCO2022/train2022.txt"
    kmeans = YOLO_Kmeans(cluster_number, filename)
    kmeans.txt2clusters()
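
The distance used by the clustering is 1 - IoU, where the IoU is computed from widths and heights only, as if every box shared the same top-left corner. The sketch below computes the same quantity with NumPy broadcasting on a tiny hand-checkable input. The function name wh_iou and the sample boxes are made up for this example.

```python
import numpy as np


def wh_iou(boxes, clusters):
    """IoU between (w, h) pairs, assuming all boxes share the same top-left corner.

    boxes: (n, 2) array-like, clusters: (k, 2) array-like -> (n, k) array.
    """
    boxes = np.asarray(boxes, dtype=float)[:, None, :]         # shape (n, 1, 2)
    clusters = np.asarray(clusters, dtype=float)[None, :, :]   # shape (1, k, 2)
    inter = np.minimum(boxes, clusters).prod(axis=2)           # min width * min height
    union = boxes.prod(axis=2) + clusters.prod(axis=2) - inter
    return inter / union


ious = wh_iou([[10, 20], [40, 40]], [[10, 20], [20, 20]])
print(ious)
# [[1.    0.5  ]
#  [0.125 0.25 ]]
```

For instance, the 10x20 box against the 20x20 cluster has an intersection of 10 * 20 = 200 and a union of 200 + 400 - 200 = 400, giving an IoU of 0.5 and a clustering distance of 0.5.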

In summary, once the steps above have been carried out, the preparation for YOLOv5 is essentially complete.