All the Python Scripts You Need to Build a Dataset from Scratch


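I have been building image datasets recently and have learned a lot along the way. Here I collect all the Python scripts I use. Most of them can be found online, but they are scattered across many places.

I have uploaded all the code to GitHub. If you find it useful, please give it a star.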

https://github.com/gzz1529657064/Python-scripts-used-to-make-datasets

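My dataset consists of footage of objects on the road, so it comes in two forms: video and still images. The video is 1920x1080 at 60 fps, and the images are 1920x1080. Shooting still images alone is slow. Shooting video is much faster, because a video can be split into frames, which yields a large number of images in a short time. One tip: while recording, pan the camera slowly up and down and left and right, so that the resulting images are varied rather than near duplicates of one another.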

1. Split the video into frames: video_to_picture.py

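import cv2

vc = cv2.VideoCapture('E:/HDV-2019-5-8/Movie/20190508_0095.MP4')
c = 0          # number of frames read so far
timeF = 30     # save one frame out of every timeF frames
rval = vc.isOpened()
while rval:
    rval, frame = vc.read()
    if not rval:   # end of video or read error: frame is None, so stop here
        break
    c = c + 1
    if c % timeF == 0:
        # The output folder must already exist; cv2.imwrite does not create it
        cv2.imwrite('E:/HDV-2019-5-8/digital_light/95/' + str(c).zfill(5) + '.jpg', frame)

vc.release()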

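Here timeF is the sampling interval in frames, not the frame rate: one frame is saved out of every timeF frames. For a 60 fps video, timeF = 30 saves 2 frames per second and timeF = 15 saves 4. Somewhere between 2 and 4 frames per second is about right, so lower timeF if you want more images. zfill(5) pads the frame number to 5 digits (00000~99999). If the video is very long, use a larger width.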
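The relationship between the video frame rate and timeF can be sketched as a small calculation. This is only an illustration: frame_interval and saved_frame_numbers are hypothetical helper names, not part of the original script. In a real run the frame rate could be read from the opened video with vc.get(cv2.CAP_PROP_FPS) instead of being typed in by hand.

```python
def frame_interval(video_fps, wanted_per_second):
    """Return the timeF value that keeps about `wanted_per_second` frames (hypothetical helper)."""
    if video_fps <= 0 or wanted_per_second <= 0:
        raise ValueError("both rates must be positive")
    # Never go below 1, otherwise `c % interval` would divide by zero
    return max(1, round(video_fps / wanted_per_second))


def saved_frame_numbers(total_frames, interval):
    """1-based frame numbers that the test `c % interval == 0` would save."""
    return [c for c in range(1, total_frames + 1) if c % interval == 0]


print(frame_interval(60, 2))          # 30
print(frame_interval(60, 4))          # 15
print(saved_frame_numbers(100, 30))   # [30, 60, 90]
```

With a 60 fps source, the two values above correspond to the 2 to 4 frames per second suggested in the text.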

 

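2. Manually delete the images you do not need

 

3. Follow the VOC dataset format. For details, see my previous blog post: Making Your Own VOC Dataset in Ubuntu

 

4. Put all the images into the JPEGImages folder. The extension is usually .jpg, .png or .JPG. The image files in the folder need to be renamed in batch. Use rename.py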

# -*- coding:utf8 -*-

import os


class BatchRename():
    '''
    Batch-rename the image files in a folder
    '''
    def __init__(self):
        self.path = '/home/z/work/train'     # folder that holds the images

    def rename(self):
        filelist = sorted(os.listdir(self.path))   # sort so the numbering is reproducible
        total_num = len(filelist)
        i = 1
        for item in filelist:
            if item.endswith('.jpg') or item.endswith('.JPG'):  # image formats: jpg, JPG
                src = os.path.join(os.path.abspath(self.path), item)
                dst = os.path.join(os.path.abspath(self.path), str(i).zfill(5) + '.jpg')      # new image name
                try:
                    os.rename(src, dst)
                    print("converting %s to %s ..." % (src, dst))
                    i = i + 1
                except OSError as e:
                    # Report the failure instead of silently swallowing every exception
                    print("skipping %s: %s" % (src, e))
                    continue

        # i starts at 1, so i - 1 files were actually renamed
        print("total %d to rename & converted %d jpgs" % (total_num, i - 1))


if __name__ == '__main__':
    demo = BatchRename()
    demo.rename()

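You only need to change the image path and add any other image extensions you use. zfill(5) means the images are named 00001~99999. Adjust the width to match the number of images you have.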
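Before renaming files on disk, it can help to preview the result. The sketch below builds the same old-name to new-name mapping on a plain list of file names and simulates the renames to flag any target name that is still occupied by another file at that moment (on Linux, os.rename would silently overwrite it). plan_renames and find_collisions are hypothetical helpers for illustration, and the file names are made up. The names are processed in the order given, since os.listdir returns entries in arbitrary order.

```python
def plan_renames(names, width=5, exts=('.jpg', '.JPG')):
    """Return (old, new) pairs for the image files, numbered from 1 in the given order."""
    images = [n for n in names if n.endswith(exts)]
    return [(old, str(i).zfill(width) + '.jpg') for i, old in enumerate(images, start=1)]


def find_collisions(names, plan):
    """Simulate the renames and return the pairs whose target name is still taken."""
    present = set(names)
    clashes = []
    for old, new in plan:
        if new != old and new in present:
            clashes.append((old, new))   # this rename would overwrite another file
            continue
        present.discard(old)
        present.add(new)
    return clashes


names = ['cat.jpg', '00001.jpg', 'notes.txt']   # made-up folder listing
plan = plan_renames(names)
print(plan)                          # [('cat.jpg', '00001.jpg'), ('00001.jpg', '00002.jpg')]
print(find_collisions(names, plan))  # [('cat.jpg', '00001.jpg')]
```

If the second list is not empty, one option is to rename into an empty folder, or to rename twice through temporary names.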

 

5. Annotate with labelImg. Annotation is a very long and tedious process, so hang in there!

Each image produces one xml file.
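For orientation, here is a minimal sketch of what such an xml file looks like and how the later scripts read it. The sample is a trimmed Pascal VOC style annotation with made-up values (real labelImg output contains a few more tags, such as path and pose), and read_boxes is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Trimmed, made-up example of a Pascal VOC style annotation
SAMPLE_XML = """<annotation>
    <folder>JPEGImages</folder>
    <filename>00001.jpg</filename>
    <size><width>1920</width><height>1080</height><depth>3</depth></size>
    <object>
        <name>pedestrian</name>
        <difficult>0</difficult>
        <bndbox><xmin>453</xmin><ymin>369</ymin><xmax>473</xmax><ymax>391</ymax></bndbox>
    </object>
</annotation>"""


def read_boxes(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        coords = tuple(int(box.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        boxes.append((obj.find('name').text, coords))
    return boxes


print(read_boxes(SAMPLE_XML))   # [('pedestrian', (453, 369, 473, 391))]
```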

 

6. Check the xml files. check_annotations.py

import os
import xml.etree.ElementTree as ET


def getFilePathList(dirPath, partOfFileName=''):
    # Files directly inside dirPath (no recursion) whose name contains partOfFileName
    allFileName_list = next(os.walk(dirPath))[2]
    fileName_list = [k for k in allFileName_list if partOfFileName in k]
    filePath_list = [os.path.join(dirPath, k) for k in fileName_list]
    return filePath_list


def check_1(dirPath):
    # Every jpg must have an xml file with the same base name
    jpgFilePath_list = getFilePathList(dirPath, '.jpg')
    allFileMarked = True
    for jpgFilePath in jpgFilePath_list:
        xmlFilePath = os.path.splitext(jpgFilePath)[0] + '.xml'
        if not os.path.exists(xmlFilePath):
            print('%s: this picture is not marked.' % jpgFilePath)
            allFileMarked = False
    if allFileMarked:
        print('Congratulations! All jpg files have been verified as marked.')


def check_2(dirPath, className_list):
    # Every <name> inside an <object> must be one of the known class names
    className_set = set(className_list)
    xmlFilePath_list = getFilePathList(dirPath, '.xml')
    allFileCorrect = True
    for xmlFilePath in xmlFilePath_list:
        with open(xmlFilePath, 'rb') as file:
            fileContent = file.read()
        root = ET.XML(fileContent)
        object_list = root.findall('object')
        for object_item in object_list:
            name = object_item.find('name')
            # A missing <name> tag is reported as a wrong class name (None)
            className = name.text if name is not None else None
            if className not in className_set:
                print('%s: this xml file has wrong class name "%s"' % (xmlFilePath, className))
                allFileCorrect = False
    if allFileCorrect:
        print('Congratulations! All xml files have been verified as correct.')


if __name__ == '__main__':
    dirPath = 'Picture/'
    className_list = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
    check_1(dirPath)
    check_2(dirPath, className_list)

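At this point the images and the xml files are in the same folder, whose path is dirPath.

The script does two things: 1. It checks whether any image was left unannotated. 2. It checks whether any annotated class name is misspelled. Fill in all the correct class names in className_list.

If an image is unannotated or a class name is misspelled, the path of the offending file is printed.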

 

7. If a class name is misspelled in a large number of files, for example pedestrian written as pedestrain, you can use replace_xml_label.py

# coding=utf-8
import os
import xml.dom.minidom

path = 'Annotations'
files = os.listdir(path)
for xmlFile in files:
    xmlPath = os.path.join(path, xmlFile)
    # Skip sub-folders and anything that is not an xml file
    if os.path.isdir(xmlPath) or not xmlFile.endswith('.xml'):
        continue

    dom = xml.dom.minidom.parse(xmlPath)
    root = dom.documentElement
    name = root.getElementsByTagName('name')

    changed = False
    for i in range(len(name)):
        node = name[i].firstChild
        if node is not None and node.data == 'pedestrain':
            node.data = 'pedestrian'
            changed = True

    # Write the file back only if a label was actually replaced
    if changed:
        with open(xmlPath, 'w', encoding='UTF-8') as fh:
            dom.writexml(fh)
        print('replaced label in %s' % xmlFile)
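If several different typos turn up, the same idea can be driven by a lookup table. The following is a minimal sketch using xml.etree.ElementTree on an in-memory string. FIXES and fix_labels are hypothetical names, and the second typo in the table is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical typo table: wrong label -> correct label
FIXES = {'pedestrain': 'pedestrian', 'bycicle': 'bicycle'}


def fix_labels(xml_text, fixes=FIXES):
    """Return (new_xml_text, number_of_labels_changed)."""
    root = ET.fromstring(xml_text)
    changed = 0
    for name in root.iter('name'):      # every <name> element at any depth
        if name.text in fixes:
            name.text = fixes[name.text]
            changed += 1
    return ET.tostring(root, encoding='unicode'), changed


sample = ("<annotation><object><name>pedestrain</name></object>"
          "<object><name>car</name></object></annotation>")
fixed, n = fix_labels(sample)
print(n)       # 1
print(fixed)
```

To apply it to real files, read each xml file into a string, call fix_labels, and write the result back only when the returned count is greater than zero.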

 

8. Count the objects in each class to check whether the data is balanced. getClasses.py

import os
import xml.etree.ElementTree as ET
from PIL import Image


def parse_obj(xml_path, filename):
    # Return one dict per <object> in the xml file, holding its class name
    tree = ET.parse(os.path.join(xml_path, filename))
    objects = []
    for obj in tree.findall('object'):
        obj_struct = {}
        obj_struct['name'] = obj.find('name').text
        objects.append(obj_struct)
    return objects


def read_image(image_path, filename):
    # Helper returning [width, height, area]; not used by the class count below
    im = Image.open(os.path.join(image_path, filename))
    W = im.size[0]
    H = im.size[1]
    area = W * H
    im_info = [W, H, area]
    return im_info


if __name__ == '__main__':
    xml_path = 'Annotations/'
    # Keep only the xml files and strip the extension
    filenames = [name[:-4] for name in os.listdir(xml_path) if name.endswith('.xml')]
    recs = {}
    classnames = []
    num_objs = {}
    for name in filenames:
        recs[name] = parse_obj(xml_path, name + '.xml')
    for name in filenames:
        for obj in recs[name]:   # renamed from `object` to avoid shadowing the built-in
            if obj['name'] not in num_objs:
                num_objs[obj['name']] = 1
            else:
                num_objs[obj['name']] += 1
            if obj['name'] not in classnames:
                classnames.append(obj['name'])
    for name in classnames:
        print('{}: {}'.format(name, num_objs[name]))
    print('Statistics complete.')
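The same count can be written more compactly with collections.Counter, and a single number makes the balance check easier to read. This is only a sketch on made-up in-memory annotations. count_classes and imbalance_ratio are hypothetical helpers, and the ratio of the largest to the smallest class count is just one possible way to summarise balance.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def count_classes(xml_texts):
    """Count objects per class over a list of annotation strings."""
    counts = Counter()
    for text in xml_texts:
        root = ET.fromstring(text)
        counts.update(obj.findtext('name') for obj in root.findall('object'))
    return counts


def imbalance_ratio(counts):
    """Largest class count divided by the smallest (1.0 means perfectly balanced)."""
    return max(counts.values()) / min(counts.values())


# Made-up annotations, trimmed to the tags that matter here
docs = [
    "<annotation><object><name>car</name></object><object><name>car</name></object></annotation>",
    "<annotation><object><name>car</name></object><object><name>pedestrian</name></object></annotation>",
]
counts = count_classes(docs)
print(counts.most_common())      # [('car', 3), ('pedestrian', 1)]
print(imbalance_ratio(counts))   # 3.0
```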

 

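9. Generate the four txt files in the ImageSets\Main folder: test.txt, train.txt, trainval.txt, val.txt

These four files store the file names (without extension) of the xml files from the previous steps. trainval and test together cover all the xml files, and train and val together make up trainval. Generate them with CreateTxt.py, which must be placed in the same directory as ImageSets and Annotations.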

import os
import random

trainval_percent = 0.8  # fraction of all data that goes into trainval
train_percent = 0.5  # fraction of trainval that goes into train
xmlfilepath = 'Annotations'
txtsavepath = 'ImageSets/Main'
# Keep only the xml files; sort so the result does not depend on os.listdir order
total_xml = sorted(f for f in os.listdir(xmlfilepath) if f.endswith('.xml'))

num = len(total_xml)
print('total number is ', num)
indices = range(num)   # renamed from `list` to avoid shadowing the built-in
tv = int(num * trainval_percent)
print('trainVal number is ', tv)
tr = int(tv * train_percent)
print('train number is ', tr)
print('test number is ', num - tv)
trainval_list = random.sample(indices, tv)
train_list = random.sample(trainval_list, tr)
# Sets make the membership tests below fast
trainval = set(trainval_list)
train = set(train_list)

os.makedirs(txtsavepath, exist_ok=True)
with open(os.path.join(txtsavepath, 'trainval.txt'), 'w') as ftrainval, \
        open(os.path.join(txtsavepath, 'test.txt'), 'w') as ftest, \
        open(os.path.join(txtsavepath, 'train.txt'), 'w') as ftrain, \
        open(os.path.join(txtsavepath, 'val.txt'), 'w') as fval:
    for i in indices:
        name = total_xml[i][:-4] + '\n'   # file name without the .xml extension
        if i in trainval:
            ftrainval.write(name)
            if i in train:
                ftrain.write(name)
            else:
                fval.write(name)
        else:
            ftest.write(name)
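To see what the two percentages mean in numbers, and to get the same split on every run, the logic can be sketched as a pure function with a fixed random seed. split_names and the seed parameter are assumptions added for illustration and are not part of CreateTxt.py.

```python
import random


def split_names(names, trainval_percent=0.8, train_percent=0.5, seed=0):
    """Split file names into train / val / trainval / test lists (hypothetical helper)."""
    rng = random.Random(seed)      # fixed seed gives a reproducible split
    shuffled = sorted(names)
    rng.shuffle(shuffled)
    tv = int(len(shuffled) * trainval_percent)
    tr = int(tv * train_percent)
    trainval = shuffled[:tv]
    return {
        'train': trainval[:tr],
        'val': trainval[tr:],
        'trainval': trainval,
        'test': shuffled[tv:],
    }


names = [str(i).zfill(5) for i in range(1, 1001)]   # 00001 ... 01000
parts = split_names(names)
print({k: len(v) for k, v in parts.items()})
# {'train': 400, 'val': 400, 'trainval': 800, 'test': 200}
```

With 1000 annotated images and the default percentages, 800 go to trainval (400 train, 400 val) and 200 to test.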

 

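10. Convert test.txt, train.txt, trainval.txt and val.txt into the format below, using voc_annotation.py

path class_id xmin ymin xmax ymax

where class_id is the index of the class name in the classes list. For example:

xxx/xxx/a.jpg 0 453 369 473 391 1 588 245 608 268

xxx/xxx/b.jpg 1 466 403 485 422 2 793 300 809 320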

import os
import xml.etree.ElementTree as ET

sets = [('2018', 'train'), ('2018', 'val'), ('2018', 'test'), ('2018', 'trainval')]

classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]


def convert_annotation(year, image_id, list_file):
    # os.path.join avoids backslash escapes in strings and works on both Windows and Linux
    xml_path = os.path.join('VOCdevkit', 'VOC%s' % year, 'Annotations', '%s.xml' % image_id)
    with open(xml_path, encoding='utf-8') as in_file:
        tree = ET.parse(in_file)
    root = tree.getroot()

    for obj in root.iter('object'):
        # Treat a missing <difficult> tag as "not difficult"
        difficult = obj.findtext('difficult', default='0')
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (int(xmlbox.find('xmin').text), int(xmlbox.find('ymin').text), int(xmlbox.find('xmax').text), int(xmlbox.find('ymax').text))
        #list_file.write(" " + ",".join([str(a) for a in b]) + ',' + str(cls_id))
        list_file.write(" " + str(cls_id) + ' ' + " ".join([str(a) for a in b]))


wd = os.getcwd()

for year, image_set in sets:
    ids_path = os.path.join('VOCdevkit', 'VOC%s' % year, 'ImageSets', 'Main', '%s.txt' % image_set)
    with open(ids_path) as f:
        image_ids = f.read().strip().split()
    with open('%s_%s.txt' % (year, image_set), 'w') as list_file:
        for image_id in image_ids:
            # Absolute image path first, then the boxes of that image on the same line
            list_file.write(os.path.join(wd, 'VOCdevkit', 'VOC%s' % year, 'JPEGImages', '%s.jpg' % image_id))
            convert_annotation(year, image_id, list_file)
            list_file.write('\n')

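As before, fill in your own actual classes in classes.

If your training code expects the input as: path xmin ymin xmax ymax class_id, just swap the order in the list_file.write call inside convert_annotation: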

list_file.write(" " + " ".join([str(a) for a in b]) + ' ' + str(cls_id))
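As a quick sanity check on the generated files, a line can be parsed back into boxes. The sketch below handles both orderings. parse_line and its class_first flag are hypothetical, and it assumes that the image path contains no spaces.

```python
def parse_line(line, class_first=True):
    """Split one annotation line into (image_path, [(cls_id, xmin, ymin, xmax, ymax), ...])."""
    parts = line.split()
    path, numbers = parts[0], [int(p) for p in parts[1:]]
    if len(numbers) % 5 != 0:
        raise ValueError('expected 5 numbers per box: %r' % line)
    boxes = []
    for k in range(0, len(numbers), 5):
        group = numbers[k:k + 5]
        if class_first:      # class_id xmin ymin xmax ymax
            cls_id, box = group[0], group[1:]
        else:                # xmin ymin xmax ymax class_id
            cls_id, box = group[4], group[:4]
        boxes.append((cls_id, *box))
    return path, boxes


print(parse_line('xxx/xxx/a.jpg 0 453 369 473 391 1 588 245 608 268'))
# ('xxx/xxx/a.jpg', [(0, 453, 369, 473, 391), (1, 588, 245, 608, 268)])
```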

 

Summary

Later I may add scripts that convert the images into tfrecord files for TensorFlow training and lmdb files for Caffe training. More scripts will follow.

 

