Object Detection with the VOC Dataset

Reposted article. 2019-05-02 00:34. Categories: deep learning / object detection / computer vision and deep learning

I have recently been working with object detection models, and many of them require the dataset to be in VOC format.

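The PASCAL VOC Challenge (The PASCAL Visual Object Classes) is a world-class computer vision competition. PASCAL stands for Pattern Analysis, Statistical Modelling and Computational Learning, a network organization funded by the European Union. Many models were introduced on this dataset, for example YOLO and SSD in object detection.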

VOC dataset structure

Let's look at the directory layout.

:~/git_projects/models/research/VOCdevkit/VOC2012$ tree -d 
.
├── Annotations
├── ImageSets
│   ├── Action
│   ├── Layout
│   ├── Main
│   └── Segmentation
├── JPEGImages
├── SegmentationClass
└── SegmentationObject

1. JPEGImages

This directory holds the image files.

2. Annotations

This directory holds XML files that describe the images.

~/git_projects/models/research/VOCdevkit/VOC2012/Annotations$ cat 2012_004331.xml
<annotation>
	<filename>2012_004331.jpg</filename>
	<folder>VOC2012</folder>
	<object>
		<name>person</name>
		<actions>
			<jumping>1</jumping>
			<other>0</other>
			<phoning>0</phoning>
			<playinginstrument>0</playinginstrument>
			<reading>0</reading>
			<ridingbike>0</ridingbike>
			<ridinghorse>0</ridinghorse>
			<running>0</running>
			<takingphoto>0</takingphoto>
			<usingcomputer>0</usingcomputer>
			<walking>0</walking>
		</actions>
		<bndbox>
			<xmax>208</xmax>
			<xmin>102</xmin>
			<ymax>230</ymax>
			<ymin>25</ymin>
		</bndbox>
		<difficult>0</difficult>
		<pose>Unspecified</pose>
		<point>
			<x>155</x>
			<y>119</y>
		</point>
	</object>
	<segmented>0</segmented>
	<size>
		<depth>3</depth>
		<height>375</height>
		<width>500</width>
	</size>
	<source>
		<annotation>PASCAL VOC2012</annotation>
		<database>The VOC2012 Database</database>
		<image>flickr</image>
	</source>
</annotation>

The corresponding image is 2012_004331.jpg.

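What we mainly care about is the data under the <object> node, especially the data under bndbox. xmin and ymin give the top-left corner of the bounding box; xmax and ymax give the bottom-right corner.
What is a bounding box? When a model detects a target, it draws a box around it and treats whatever is inside that box as one object.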
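
To make the bndbox fields concrete, here is a minimal sketch that reads the boxes out of an annotation with Python's standard xml.etree.ElementTree. The function name read_boxes and the trimmed SAMPLE string are my own illustration, not part of any VOC tooling; the values are taken from the annotation shown above.

```python
import xml.etree.ElementTree as ET

def read_boxes(xml_text):
    """Return a list of (name, xmin, ymin, xmax, ymax) tuples from VOC annotation XML."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter('object'):
        name = obj.findtext('name')
        bb = obj.find('bndbox')
        # Look the tags up by name: their order inside <bndbox> is not fixed
        # (in the file above, xmax comes before xmin).
        xmin, ymin, xmax, ymax = (
            int(float(bb.findtext(tag))) for tag in ('xmin', 'ymin', 'xmax', 'ymax')
        )
        boxes.append((name, xmin, ymin, xmax, ymax))
    return boxes

# Trimmed copy of 2012_004331.xml, kept inline so the sketch is self-contained.
SAMPLE = """<annotation>
  <filename>2012_004331.jpg</filename>
  <object>
    <name>person</name>
    <bndbox><xmax>208</xmax><xmin>102</xmin><ymax>230</ymax><ymin>25</ymin></bndbox>
  </object>
</annotation>"""

print(read_boxes(SAMPLE))  # [('person', 102, 25, 208, 230)]
```

For a file on disk, ET.parse(path).getroot() can replace ET.fromstring.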

3. ImageSets

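Action holds data on human actions (running, jumping and so on; this is also part of the VOC challenge).
Layout holds data with human body parts (head, hand, feet and so on; this is also part of the VOC challenge).
Segmentation holds data that can be used for segmentation.
Main holds the data for image object recognition, divided into 20 classes.
We mainly care about the files under Main.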

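There are 63 files in total. train.txt, val.txt and trainval.txt list the image names of the corresponding split. The remaining 60 files = 20 * 3: there are 20 classes, and each class has xxx_train.txt, xxx_val.txt and xxx_trainval.txt.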

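1 marks a positive sample and -1 a negative sample. (In the official files a 0 can also appear; it marks images where the class is present only as a "difficult" instance.)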

Here is part of aeroplane_train.txt:
2011_003177  1    // 2011_003177.jpg contains an aeroplane
2011_003183 -1    // 2011_003183.jpg does not contain an aeroplane
2011_003184 -1
2011_003187 -1
2011_003188 -1
2011_003192 -1
2011_003194 -1
2011_003216 -1
2011_003223 -1
2011_003230 -1
2011_003236 -1
2011_003238 -1
2011_003246 -1
2011_003247 -1
2011_003253 -1
2011_003255 -1
2011_003259 -1
2011_003274 -1
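
As a rough illustration of how these per-class files can be consumed, the sketch below turns such a listing into a dict and picks out the positive images. parse_class_list is a hypothetical helper of mine, and it assumes the real files hold only the image id and the flag (the "//" remarks above are my annotations, not file content).

```python
def parse_class_list(text):
    """Map image id -> flag (1, -1 or 0) from the content of a <class>_<split>.txt file."""
    labels = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        image_id, flag = parts
        labels[image_id] = int(flag)
    return labels

# A few rows in the same shape as aeroplane_train.txt.
SAMPLE = """2011_003177  1
2011_003183 -1
2011_003184 -1
"""

labels = parse_class_list(SAMPLE)
positives = [image_id for image_id, flag in labels.items() if flag == 1]
print(positives)  # ['2011_003177']
```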

Here is part of train.txt, which contains image names only:
2011_003187
2011_003188
2011_003192
2011_003194
2011_003216
2011_003223
2011_003230
2011_003236
2011_003238

Building your own VOC dataset

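Prepare the data.
Label the images: generate label files containing the class and bounding-box information.
Generate the files the VOC format requires, mainly Annotations/*.xml and ImageSets/Main/*.txt.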

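For the data preparation step, your data may come from a public dataset or from a partner's private data.
For the labeling step, you can use labelImg to annotate your own images: https://github.com/tzutalin/labelImg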
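
Before generating any files it helps to have the target folders in place. The following is only a sketch of one way to create them: make_voc_skeleton and VOC_SUBDIRS are names I made up, the folder list covers just the detection-related directories mentioned above, and the demo writes into a temporary directory rather than a real dataset path.

```python
import os
import tempfile

# Only the folders needed for detection; the segmentation folders are left out.
VOC_SUBDIRS = ['Annotations', 'ImageSets/Main', 'JPEGImages']

def make_voc_skeleton(root):
    """Create the VOC-style sub-directories under root and return its top-level entries."""
    for sub in VOC_SUBDIRS:
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    return sorted(os.listdir(root))

# Demo in a throwaway directory; point root at your own dataset folder instead.
demo_root = os.path.join(tempfile.mkdtemp(), 'VOC2007')
print(make_voc_skeleton(demo_root))  # ['Annotations', 'ImageSets', 'JPEGImages']
```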

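Converting a dataset between formats inevitably means writing a lot of scripts. Everyone's needs differ, and the data format you start from differs, so the scripts differ too. Here are a few that I use myself.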

# Dataset split
import os
import random

root_dir = './park_voc/VOC2007/'

# 80% trainval / 20% test; trainval is then split 70% train / 30% val,
# i.e. roughly 0.56 train / 0.24 val / 0.2 test of the whole dataset.
# For 0.7 / 0.1 / 0.2 overall, set train_percent = 0.875.
trainval_percent = 0.8
train_percent = 0.7
xmlfilepath = root_dir + 'Annotations'
txtsavepath = root_dir + 'ImageSets/Main'
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)  # e.g. 100
indices = range(num)  # renamed from `list` to avoid shadowing the built-in
tv = int(num * trainval_percent)  # 80
tr = int(tv * train_percent)  # 80 * 0.7 = 56
trainval = random.sample(indices, tv)
train = set(random.sample(trainval, tr))
trainval = set(trainval)  # sets make the membership tests below fast

with open(txtsavepath + '/trainval.txt', 'w') as ftrainval, \
     open(txtsavepath + '/test.txt', 'w') as ftest, \
     open(txtsavepath + '/train.txt', 'w') as ftrain, \
     open(txtsavepath + '/val.txt', 'w') as fval:
    for i in indices:
        # Strip the .xml extension to get the image name
        name = os.path.splitext(total_xml[i])[0] + '\n'
        if i in trainval:
            ftrainval.write(name)
            if i in train:
                ftrain.write(name)
            else:
                fval.write(name)
        else:
            ftest.write(name)
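
The two percentages in the split script are easy to misread, because train_percent applies to the trainval subset and not to the whole dataset. The sketch below reproduces the same arithmetic without touching the disk; split_ids and its seed parameter are my own additions for illustration (a fixed seed makes the split repeatable).

```python
import random

def split_ids(ids, trainval_percent=0.8, train_percent=0.7, seed=0):
    """Shuffle ids and cut them into train / val / test with the script's arithmetic."""
    rng = random.Random(seed)  # local generator so the split is reproducible
    ids = list(ids)
    rng.shuffle(ids)
    n_trainval = int(len(ids) * trainval_percent)
    n_train = int(n_trainval * train_percent)  # a fraction of trainval, not of the total
    return {
        'train': ids[:n_train],
        'val': ids[n_train:n_trainval],
        'test': ids[n_trainval:],
    }

parts = split_ids(['img%03d' % i for i in range(100)])
print({k: len(v) for k, v in parts.items()})  # {'train': 56, 'val': 24, 'test': 20}
```

With 100 images the defaults give 56 / 24 / 20; passing train_percent=0.875 gives 70 / 10 / 20.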

# .txt --> .xml
#!/usr/bin/python
# -*- coding:UTF-8 -*-
import os
import glob
from PIL import Image

# Location of the VEDAI images
src_img_dir = "/home/train/dataset-expand/park_voc/VOC2007/JPEGImages"
# Location of the ground-truth txt files for the VEDAI images
src_txt_dir = "/home/train/dataset-expand/label_expand"
# Output location for the generated XML annotations
src_xml_dir = "/home/train/dataset-expand/park_voc/VOC2007/Annotations"
 
# Image names without the extension, e.g. "100" for 100.jpg
img_names = [os.path.splitext(os.path.basename(p))[0]
             for p in glob.glob(src_img_dir + '/*.jpg')]
 
for img in img_names:
    im = Image.open((src_img_dir + '/' + img + '.jpg'))
    width, height = im.size
 
    # Open the corresponding txt file (an image named "img..." maps to "label...")
    with open(src_txt_dir + '/' + img.replace('img', 'label', 1) + '.txt') as f:
        gt = f.read().splitlines()
    #gt = open(src_txt_dir + '/gt_' + img + '.txt').read().splitlines()
 
    # Write the XML file
    #os.mknod(src_xml_dir + '/' + img + '.xml')
    xml_file = open((src_xml_dir + '/' + img + '.xml'), 'w')
    xml_file.write('<annotation>\n')
    xml_file.write('    <folder>VOC2007</folder>\n')
    xml_file.write('    <filename>' + str(img) + '.jpg' + '</filename>\n')
    xml_file.write('    <size>\n')
    xml_file.write('        <width>' + str(width) + '</width>\n')
    xml_file.write('        <height>' + str(height) + '</height>\n')
    xml_file.write('        <depth>3</depth>\n')
    xml_file.write('    </size>\n')
 
    # Write each labeled region into the XML file
    for img_each_label in gt:
        # Each row is comma-separated: xmin,ymin,xmax,ymax,class.
        # If your txt is whitespace-separated instead, use img_each_label.split().
        spt = img_each_label.split(',')
        xml_file.write('    <object>\n')
        xml_file.write('        <name>' + str(spt[4]) + '</name>\n')
        xml_file.write('        <pose>Unspecified</pose>\n')
        xml_file.write('        <truncated>0</truncated>\n')
        xml_file.write('        <difficult>0</difficult>\n')
        xml_file.write('        <bndbox>\n')
        xml_file.write('            <xmin>' + str(spt[0]) + '</xmin>\n')
        xml_file.write('            <ymin>' + str(spt[1]) + '</ymin>\n')
        xml_file.write('            <xmax>' + str(spt[2]) + '</xmax>\n')
        xml_file.write('            <ymax>' + str(spt[3]) + '</ymax>\n')
        xml_file.write('        </bndbox>\n')
        xml_file.write('    </object>\n')
 
    xml_file.write('</annotation>')
    xml_file.close()  # flush and release the file before moving to the next image
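
The script above builds the XML by concatenating strings, which works as long as class names contain no characters such as & or <. As an alternative sketch (not the script I actually run), the same structure can be built with xml.etree.ElementTree, which escapes text automatically. build_annotation and the sample values are made up for illustration; the row layout xmin, ymin, xmax, ymax, name mirrors the txt rows used above.

```python
import xml.etree.ElementTree as ET

def build_annotation(filename, width, height, objects, folder='VOC2007', depth=3):
    """Build a VOC-style <annotation> element.

    objects: iterable of (xmin, ymin, xmax, ymax, name) tuples.
    """
    ann = ET.Element('annotation')
    ET.SubElement(ann, 'folder').text = folder
    ET.SubElement(ann, 'filename').text = filename
    size = ET.SubElement(ann, 'size')
    for tag, value in (('width', width), ('height', height), ('depth', depth)):
        ET.SubElement(size, tag).text = str(value)
    for xmin, ymin, xmax, ymax, name in objects:
        obj = ET.SubElement(ann, 'object')
        ET.SubElement(obj, 'name').text = str(name)
        ET.SubElement(obj, 'pose').text = 'Unspecified'
        ET.SubElement(obj, 'truncated').text = '0'
        ET.SubElement(obj, 'difficult').text = '0'
        box = ET.SubElement(obj, 'bndbox')
        for tag, value in (('xmin', xmin), ('ymin', ymin), ('xmax', xmax), ('ymax', ymax)):
            ET.SubElement(box, tag).text = str(value)
    return ann

# The "&" in the class name is there only to show that escaping is handled.
ann = build_annotation('100.jpg', 500, 375, [(102, 25, 208, 230, 'car & van')])
xml_text = ET.tostring(ann, encoding='unicode')
print(xml_text)
```

To write to disk, ET.ElementTree(ann).write(path, encoding='utf-8') can be used in place of ET.tostring.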

Evaluation criteria for object detection

That's all for today; to be continued.
