A homemade COCO API that reads compressed COCO-style annotation files directly

Reposted article, 2018-12-13 22:28, category: API design

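Chapter 6 Using the COCO API

COCO is a large-scale image dataset released by Microsoft, designed for object detection, segmentation, person keypoint detection, semantic segmentation, and caption generation. For details on the dataset, see:

MS COCO dataset home page: http://mscoco.org/
My modified COCO API: https://github.com/Xinering/cocoapi
Data download: http://mscoco.org/dataset/#download

COCO API^[1] provides Matlab, Python, and Lua interfaces for loading, parsing, and visualizing the image annotation data. The website also hosts papers and tutorials related to the data.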

Before using the API and the demos that come with COCO, download the COCO images and annotations:

Download the images into the coco/images/ folder
Download the annotations into the coco/annotations/ folder

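In this chapter:

Introducing and using the official API: how to use cocoapi on Linux and on Windows
Rewriting the official API: use Python language features to rewrite the API and add support for reading compressed files directly
Extending the API: generalize the API to other datasets

Let's look at how to work with the COCO dataset from Python.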

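6.1 Configuring and introducing the COCO API

For convenience, we first fork the official COCO API, download it locally, and switch to the directory that contains the Python API, for example D:\API\cocoapi\PythonAPI.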

cd D:\API\cocoapi\PythonAPI

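Open the Makefile in the current directory to see how to install and use the API.

6.1.1 Configuration on Windows

On Windows, running python setup.py build_ext --inplace directly raises an error:

Windows (where you generally need Visual Studio installed) has many pitfalls. Following "Pitfalls of compiling Pycocotools on Windows 10"^[2], the blunt fix is to delete the compiler arguments -Wno-cpp and -Wno-unused-function from setup.py, as shown in the figure below:

After that, pycocotools can be used from Python, but each time you want to import it you first have to add its location to the module search path: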

import sys
sys.path.append(r'D:\API\cocoapi\PythonAPI')   # add the directory that contains `pycocotools` to the module search path

If you would rather avoid this, you can install pycocotools into your main environment:

cd D:\API\cocoapi\PythonAPI

python setup.py build_ext install
REM delete the build directory and everything in it
rd /s /q build

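This does not solve the underlying problem, however, and there are still many bugs you would have to work around yourself. Section 6.2 therefore introduces an implementation of cocoapi that is friendlier to Windows.

6.1.2 Configuration on Linux

On Linux, none of these build steps are needed. Entering the following commands in a terminal is enough to use the COCO API: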

pip3 install -U Cython
pip3 install -U pycocotools

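Of course, you can also use the same approach as on Windows; see the Makefile^[3] for the exact steps.

6.1.3 API overview

The COCO API helps you load, parse, and visualize annotations. It supports multiple annotation formats (see data format^[4]). For more details on the API see coco.py^[5]; the Python API demo^[6] also provides usage instructions in English.

The figure below, taken from the official page, explains the notation used by the COCO API:

[Figure: cocoapi]

COCO also provides a segmentation mask for every object instance. This raises two challenges: storing masks compactly and performing mask computations efficiently. The Mask API addresses both with a custom Run Length Encoding (RLE) scheme. The size of the RLE representation is proportional to the number of boundary pixels of a mask, and operations such as area, union, or intersection can be computed efficiently directly on the RLE. Specifically, assuming fairly simple shapes, the RLE representation is \(O(\sqrt{n})\), where \(n\) is the number of pixels in the object, and common computations are likewise \(O(\sqrt{n})\). Naively computing the same operations on the decoded masks (stored as arrays) would be \(O(n)\).^[4:1]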

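The Mask API provides an interface for manipulating masks stored in RLE format. It is defined in mask.py^[7]. Finally, most ground truth masks are stored as polygons (which are quite compact); these polygons are converted to RLE when needed.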
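To make the idea of run-length encoding concrete, here is a minimal pure-Python sketch. It is not the pycocotools implementation (which is written in C and works on masks flattened column by column); the function names rle_encode, rle_decode, and rle_area are hypothetical and chosen only for illustration. The one convention borrowed from COCO is that the first count always refers to a run of zeros.

```python
from itertools import groupby


def rle_encode(bits):
    """Encode a flat sequence of 0/1 values as a list of run lengths.

    The first count always refers to zeros, so a mask that starts
    with 1 gets a leading run of length 0.
    """
    counts = []
    expected = 0
    for value, run in groupby(bits):
        if value != expected:
            counts.append(0)  # empty run keeps zeros and ones alternating
        counts.append(len(list(run)))
        expected = 1 - value
    return counts


def rle_decode(counts):
    """Expand run lengths back into the flat 0/1 sequence."""
    bits = []
    value = 0
    for n in counts:
        bits.extend([value] * n)
        value = 1 - value
    return bits


def rle_area(counts):
    """Number of foreground pixels, computed without decoding."""
    return sum(counts[1::2])  # odd positions hold the runs of ones


mask = [0, 0, 1, 1, 1, 0, 1, 1]
counts = rle_encode(mask)
print(counts)            # [2, 3, 1, 2]
print(rle_area(counts))  # 5
```

The area is read off the counts alone, which is why such operations scale with the number of runs rather than with the number of pixels.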

[Figure: MASK]

That concludes the introduction to cocoapi. For usage details see the demo pycocoDemo.ipynb^[8]; I have translated it into Chinese, available as "Using the COCO dataset"^[9].

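6.2 Why rewrite the COCO API

I have said repeatedly that cocoapi is unfriendly to Windows, and anyone who has used cocoapi on Windows will surely agree.

6.2.1 Why? The purpose of the rewrite

To make cocoapi friendlier on Windows and avoid the debugging headaches, rewriting it is well worth the effort. Rewriting the source from scratch is a daunting prospect, but Python is a powerful language, and its inheritance mechanism lets us change only the parts we need.

6.2.2 What? What the API can do

You may feel that rewriting the API is wasted effort: cocoapi works without these problems on Linux, so why rewrite it at all? Because the rewritten API, besides working smoothly on Windows, can also fetch annotations and images directly from the archives without decompressing them.

6.2.3 How? Designing the API

We create a file cocoz.py in the cocoapi directory D:\API\cocoapi\PythonAPI\pycocotools and fill it in step by step. To make debugging easier, we first design the API in a notebook and package it into cocoz.py once it works. To make cocoapi importable, first set up the path: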

import sys
sys.path.append(r'D:\API\cocoapi\PythonAPI')

from pycocotools.coco import COCO

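Because we need to read compressed files directly, we need zipfile. To reduce the amount of code we have to write, we reuse cocoapi's COCO class. The annotations are stored as .json, so json is needed too, and numpy and cv2, the essential tools for handling image data, are of course required as well.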

import os
import zipfile
import numpy as np
import cv2
import json
import time

To make it easy to see how long a function takes to run, we need a timer:

def timer(func):
    '''
    Define a timer, pass in one, and
    return another method with the timing feature attached
    '''
    def wrapper(*args):
        start = time.time()
        print('Loading json in memory ...')
        value = func(*args)
        end = time.time()
        print('used time: {0:g} s'.format(end - start))
        return value

    return wrapper
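The decorator above accepts only positional arguments and prints the same message for every function it wraps. As a hedged alternative, not the version used in cocoz.py, the sketch below forwards keyword arguments, keeps the wrapped function's name via functools.wraps, and reports that name. The names timed and slow_sum are hypothetical.

```python
import time
from functools import wraps


def timed(func):
    """Report how long each call to func takes."""
    @wraps(func)  # keep func.__name__ and its docstring on the wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        value = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print('{}: used time {:g} s'.format(func.__name__, elapsed))
        return value
    return wrapper


@timed
def slow_sum(n, offset=0):
    # hypothetical workload, used only to exercise the decorator
    return sum(range(n)) + offset


result = slow_sum(1000, offset=1)
print(result)  # 499501
```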

I have downloaded all of the COCO data to disk, which we can inspect as follows:

root = r'E:\Data\coco'   # COCO data root directory

dataDir = os.path.join(root, 'images')  # directory containing the images
annDir = os.path.join(root, 'annotations')  # directory containing the annotations

print('images:\n', os.listdir(dataDir))
print('='*50)
print('annotations:\n', os.listdir(annDir))

Output:

images:
['test2014.zip', 'test2015.zip', 'test2017.zip', 'train2014.zip', 'train2017.zip', 'unlabeled2017.zip', 'val2014.zip', 'val2017.zip']
==================================================
annotations:
['annotations_trainval2014.zip', 'annotations_trainval2017.zip', 'image_info_test2014.zip', 'image_info_test2015.zip', 'image_info_test2017.zip', 'image_info_unlabeled2017.zip', 'panoptic_annotations_trainval2017.zip', 'stuff_annotations_trainval2017.zip']

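As you can see, none of the data has been decompressed. Next we design an interface that retrieves the data without decompressing it.

6.3 Designing and using ImageZ

We first design a class for handling the image archives under coco/images/: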

class ImageZ(dict):
    '''
    Working with compressed files under the images
    '''

    def __init__(self, root, dataType, *args, **kwds):
        '''
        root:: root dir
        dataType in ['test2014', 'test2015',
                    'test2017', 'train2014',
                    'train2017', 'unlabeled2017',
                    'val2014', 'val2017']
        '''
        super().__init__(*args, **kwds)
        self.__dict__ = self
        self.shuffle = True if dataType.startswith('train') else False
        self.Z = self.__get_Z(root, dataType)
        self.names = self.__get_names(self.Z)
        self.dataType = self.Z.namelist()[0]

    @staticmethod
    def __get_Z(root, dataType):
        '''
        Get the file name of the compressed file under the images
        '''
        dataType = dataType + '.zip'
        img_root = os.path.join(root, 'images')
        return zipfile.ZipFile(os.path.join(img_root, dataType))

    @staticmethod
    def __get_names(Z):
        names = [
            name.split('/')[1] for name in Z.namelist()
            if not name.endswith('/')
        ]
        return names

    def buffer2array(self, image_name):
        '''
        Get picture data directly without decompression

        Parameters
        ===========
        image_name:: file name of a picture inside the archive,
            as listed in self.names
        '''
        image_name = self.dataType + image_name
        buffer = self.Z.read(image_name)
        image = np.frombuffer(buffer, dtype="B")  # view the buffer as an np.uint8 array
        img_cv = cv2.imdecode(image, cv2.IMREAD_COLOR)  # decoded in BGR order
        img = cv2.cvtColor(img_cv, cv2.COLOR_BGR2RGB)
        return img

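The code is fairly long and may look confusing at first; the details are left for you to work through. Let's go straight to what it can do:

dataDir = r'E:\Data\coco'   # COCO data root directory
dataType = 'val2017'
imgZ = ImageZ(dataDir, dataType)

Because imgZ inherits from dict, it has nearly all the attributes and functionality of a dictionary: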

imgZ.keys()

Output:

dict_keys(['shuffle', 'Z', 'names', 'dataType'])

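names: the file names of all the images in val2017.zip
shuffle: whether this is a training set
Z: a ZipFile object used to operate on the whole val2017.zip file
dataType: the first entry in the archive's name list, used as the folder prefix for file names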
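The reason attributes such as names show up as dictionary keys is the line self.__dict__ = self in __init__. The following stripped-down sketch isolates that trick; the class name AttrDict and the variable cfg are hypothetical and not part of cocoz.py.

```python
class AttrDict(dict):
    """A dict whose keys are also reachable as attributes."""

    def __init__(self, *args, **kwds):
        super().__init__(*args, **kwds)
        # The instance's attribute namespace is the dict itself,
        # so attribute writes become keys and vice versa.
        self.__dict__ = self


cfg = AttrDict(dataType='val2017')
cfg.shuffle = False                  # attribute assignment creates a key
print(list(cfg.keys()))              # ['dataType', 'shuffle']
print(cfg['shuffle'], cfg.dataType)  # False val2017
```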

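There is also an instance method, buffer2array, which returns the pixel data of an image given its file name.

fname = imgZ.names[77]  # file name of one image
img = imgZ.buffer2array(fname)  # get the pixel data

Because img is a NumPy array, we can apply all the familiar operations to it, such as displaying the image: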

from matplotlib import pyplot as plt

plt.imshow(img)
plt.show()

Output:

At this point we can read images directly without decompressing the archive.
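The zipfile part of ImageZ can be tried without the real dataset. The sketch below is an illustration using only the standard library: it builds a tiny archive in memory that mimics the folder layout assumed by ImageZ (a directory entry first, then the files), lists the member names, and reads one member's bytes without extracting anything. The file names and payloads are made up, and the payloads are not real JPEG data, so the cv2 decoding step is left out.

```python
import io
import zipfile

# Build a tiny archive in memory that mimics the layout of val2017.zip:
# one directory entry followed by the files inside it.
archive = io.BytesIO()
with zipfile.ZipFile(archive, 'w') as z:
    z.writestr('val2017/', '')                    # directory entry
    z.writestr('val2017/a.jpg', b'fake-bytes-a')  # placeholder payloads,
    z.writestr('val2017/b.jpg', b'fake-bytes-b')  # not real JPEG data

with zipfile.ZipFile(archive) as z:
    # Skip directory entries and strip the leading folder name.
    names = [n.split('/')[1] for n in z.namelist() if not n.endswith('/')]
    prefix = z.namelist()[0]             # 'val2017/'
    payload = z.read(prefix + names[0])  # bytes of one member, nothing is extracted

print(names)    # ['a.jpg', 'b.jpg']
print(payload)  # b'fake-bytes-a'
```

Note that this layout is an assumption: an archive whose first entry is not the folder itself would need a different way to find the prefix.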

6.4 Designing and using AnnZ

The code is as follows:

class AnnZ(dict):
    '''
    Working with compressed files under annotations
    '''

    def __init__(self, root, annType, *args, **kwds):
        '''
        annType in [
              'annotations_trainval2014',
              'annotations_trainval2017',
              'image_info_test2014',
              'image_info_test2015',
              'image_info_test2017',
              'image_info_unlabeled2017',
              'panoptic_annotations_trainval2017',
              'stuff_annotations_trainval2017'
        ]
        '''
        super().__init__(*args, **kwds)
        self.__dict__ = self
        self.Z = self.__get_Z(root, annType)
        self.names = self.__get_names(self.Z)

    @staticmethod
    def __get_Z(root, annType):
        '''
        Get the file name of the compressed file under the annotations
        '''
        annType = annType + '.zip'
        annDir = os.path.join(root, 'annotations')
        return zipfile.ZipFile(os.path.join(annDir, annType))

    @staticmethod
    def __get_names(Z):
        names = [name for name in Z.namelist() if not name.endswith('/')]
        return names

    @timer
    def json2dict(self, name):
        with self.Z.open(name) as fp:
            dataset = json.load(fp)
        return dataset

Let's see how to use it:

root = r'E:\Data\coco'   # root directory of the COCO dataset
annType = 'annotations_trainval2017'   # type of COCO annotation data

annZ = AnnZ(root, annType)

Let's look at the annotation files this archive contains:

annZ.names

Output:

['annotations/instances_train2017.json',
 'annotations/instances_val2017.json',
 'annotations/captions_train2017.json',
 'annotations/captions_val2017.json',
 'annotations/person_keypoints_train2017.json',
 'annotations/person_keypoints_val2017.json']

Next we load the contents of 'annotations/instances_val2017.json' as a dict:

annFile = 'annotations/instances_val2017.json'
dataset = annZ.json2dict(annFile)

Output:

Loading json in memory ...
used time: 1.052 s
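json2dict works because ZipFile.open returns a file-like object that json.load can read directly. Here is a self-contained sketch of that step using an in-memory archive; the member name instances_toy.json and its content are made up for illustration.

```python
import io
import json
import zipfile

# In-memory stand-in for annotations_trainval2017.zip; the member name
# mirrors the real layout but the content is invented.
archive = io.BytesIO()
with zipfile.ZipFile(archive, 'w') as z:
    z.writestr('annotations/instances_toy.json',
               json.dumps({'images': [{'id': 1, 'file_name': '1.jpg'}],
                           'annotations': [], 'categories': []}))

with zipfile.ZipFile(archive) as z:
    with z.open('annotations/instances_toy.json') as fp:
        dataset = json.load(fp)  # json.load accepts the binary file object

print(sorted(dataset.keys()))  # ['annotations', 'categories', 'images']
```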

We can also look at the keys of dataset:

dataset.keys()

Output:

dict_keys(['info', 'licenses', 'images', 'annotations', 'categories'])

This makes it easy to use ordinary dict operations to get the information we want:

dataset['images'][7]  # view the metadata of one image

Output:

{'license': 6,
'file_name': '000000480985.jpg',
'coco_url': 'http://images.cocodataset.org/val2017/000000480985.jpg',
'height': 500,
'width': 375,
'date_captured': '2013-11-15 13:09:24',
'flickr_url': 'http://farm3.staticflickr.com/2336/1634911562_703ff01cff_z.jpg',
'id': 480985}

We can use 'coco_url' to fetch the image directly from the web:

from matplotlib import pyplot as plt
import skimage.io as sio

coco_url = dataset['images'][7]['coco_url']
# use url to load image
I = sio.imread(coco_url)
plt.axis('off')
plt.imshow(I)
plt.show()

Output:

Reading an image from local disk with ImageZ:

from matplotlib import pyplot as plt
imgType = 'val2017'
imgZ = ImageZ(root, imgType)

I = imgZ.buffer2array(dataset['images'][100]['file_name'])

plt.axis('off')
plt.imshow(I)
plt.show()

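Output:

6.5 Designing and using COCOZ

ImageZ and AnnZ are handy, but they are very free-form, and existing open-source code is built around the COCO class. To fit in better with cocoapi we need an adapter class, COCOZ, that provides nearly the same functionality as COCO and keeps its usage as unchanged as possible. The code is as follows: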

class COCOZ(COCO, dict):
    def __init__(self, annZ, annFile, *args, **kwds):
        '''
        print(coco):: show the 'info' field of the loaded dataset

        example
        ==========
        annZ = AnnZ(annDir, annType)
        '''
        super().__init__(*args, **kwds)
        self.__dict__ = self
        self.dataset = annZ.json2dict(annFile)
        self.createIndex()

    @timer
    def createIndex(self):
        # create index
        print('creating index...')
        cats, anns, imgs = {}, {}, {}
        imgToAnns, catToImgs = {}, {}
        if 'annotations' in self.dataset:
            for ann in self.dataset['annotations']:
                imgToAnns[ann['image_id']] = imgToAnns.get(
                    ann['image_id'], []) + [ann]
                anns[ann['id']] = ann
        if 'images' in self.dataset:
            for img in self.dataset['images']:
                imgs[img['id']] = img
        if 'categories' in self.dataset:
            for cat in self.dataset['categories']:
                cats[cat['id']] = cat
        if 'annotations' in self.dataset and 'categories' in self.dataset:
            for ann in self.dataset['annotations']:
                catToImgs[ann['category_id']] = catToImgs.get(
                    ann['category_id'], []) + [ann['image_id']]

        print('index created!')

        # create class members
        self.anns = anns
        self.imgToAnns = imgToAnns
        self.catToImgs = catToImgs
        self.imgs = imgs
        self.cats = cats

    def __str__(self):
        """
        Print information about the annotation file.
        """
        S = [
            '{}: {}'.format(key, value)
            for key, value in self.dataset['info'].items()
        ]
        return '\n'.join(S)
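createIndex above grows each list with .get(key, []) + [value], which builds a new list for every annotation. As a possible alternative, and only a sketch on a toy dataset rather than the code in cocoz.py, collections.defaultdict lets the same indexes be built with in-place appends. The ids and names below are invented.

```python
from collections import defaultdict

# Toy dataset in COCO layout; ids and names are made up.
dataset = {
    'images': [{'id': 1}, {'id': 2}],
    'categories': [{'id': 17, 'name': 'cat'}, {'id': 18, 'name': 'dog'}],
    'annotations': [
        {'id': 100, 'image_id': 1, 'category_id': 17},
        {'id': 101, 'image_id': 1, 'category_id': 18},
        {'id': 102, 'image_id': 2, 'category_id': 18},
    ],
}

imgToAnns = defaultdict(list)  # image id -> annotations on that image
catToImgs = defaultdict(list)  # category id -> ids of images containing it
for ann in dataset['annotations']:
    imgToAnns[ann['image_id']].append(ann)  # in-place, no list copy
    catToImgs[ann['category_id']].append(ann['image_id'])

print([a['id'] for a in imgToAnns[1]])  # [100, 101]
print(catToImgs[18])                    # [1, 2]
```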

Let's see how to use COCOZ:

root = r'E:\Data\coco'   # root directory of the COCO dataset
annType = 'annotations_trainval2017'   # type of COCO annotation data
annFile = 'annotations/instances_val2017.json'

annZ = AnnZ(root, annType)
coco = COCOZ(annZ, annFile)

Output:

Loading json in memory ...
used time: 1.036 s
Loading json in memory ...
creating index...
index created!
used time: 0.421946 s

To get a quick summary of the COCO dataset you have loaded, use print():

print(coco)

Output:

description: COCO 2017 Dataset
url: http://cocodataset.org
version: 1.0
year: 2017
contributor: COCO Consortium
date_created: 2017/09/01

Looking at the keys again:

coco.keys()

Output:

dict_keys(['dataset', 'anns', 'imgToAnns', 'catToImgs', 'imgs', 'cats'])

6.5.1 Displaying the COCO categories and supercategories

cats = coco.loadCats(coco.getCatIds())
nms = set([cat['name'] for cat in cats])  # collect the category names
print('COCO categories: \n{}\n'.format(' '.join(nms)))
# ============================================================
snms = set([cat['supercategory'] for cat in cats])  # collect the supercategory names
print('COCO supercategories: \n{}'.format(' '.join(snms)))

Output:

COCO categories:
kite sports ball horse banana toilet mouse frisbee bed donut clock sheep keyboard tv cup elephant cake potted plant snowboard train zebra fire hydrant handbag cow wine glass bowl sink parking meter umbrella giraffe suitcase skis surfboard stop sign bear cat chair traffic light fork truck orange carrot broccoli couch remote hair drier sandwich laptop tie person tennis racket apple spoon pizza hot dog bird refrigerator microwave scissors backpack airplane knife baseball glove vase toothbrush book bottle motorcycle bicycle car skateboard bus dining table cell phone toaster boat teddy bear dog baseball bat bench oven

COCO supercategories:
animal kitchen food appliance indoor accessory person sports furniture outdoor electronic vehicle

6.5.2 Getting images that satisfy given conditions

Get all images that contain the given categories

# get all images containing given categories, select one at random
catIds = coco.getCatIds(catNms=['cat', 'dog'])  # ids of the named categories
imgIds = coco.getImgIds(catIds=catIds)  # ids of images that contain all of these categories
img = coco.loadImgs(imgIds)
# pick the metadata of one image at random
img = coco.loadImgs(imgIds[np.random.randint(0, len(imgIds))])[0]

img

Output:

{'license': 3,
'file_name': '000000179392.jpg',
'coco_url': 'http://images.cocodataset.org/val2017/000000179392.jpg',
'height': 640,
'width': 480,
'date_captured': '2013-11-18 04:07:31',
'flickr_url': 'http://farm5.staticflickr.com/4027/4329554124_1ce02506f8_z.jpg',
'id': 179392}
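When several category ids are passed, getImgIds keeps only the images that contain every one of them, that is, it intersects the per-category image sets. The sketch below reproduces that behaviour on a toy index; images_with_all is a hypothetical helper rather than a cocoapi function, and the ids are invented.

```python
# Toy category -> image-id index; the ids are made up.
catToImgs = {
    17: [1, 2, 3],  # images with a cat
    18: [2, 3, 4],  # images with a dog
}


def images_with_all(cat_ids):
    """Return the ids of images that contain every category in cat_ids."""
    ids = None
    for cat_id in cat_ids:
        found = set(catToImgs.get(cat_id, []))
        ids = found if ids is None else ids & found
    return sorted(ids or [])


print(images_with_all([17, 18]))  # [2, 3]
print(images_with_all([17]))      # [1, 2, 3]
```

One consequence is that adding a rare category to the query can make the result empty, in which case picking a random index from imgIds would fail.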

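6.5.3 Drawing an image's annotations on the image

First read the image selected above from local disk:

from matplotlib import pyplot as plt
imgType = 'val2017'
imgZ = ImageZ(root, imgType)

I = imgZ.buffer2array(img['file_name'])  # same image whose annotations are loaded below

plt.axis('off')
plt.imshow(I)
plt.show()

Output:

Overlay the annotations on the image: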

# load and display instance annotations
plt.imshow(I)
plt.axis('off')
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
coco.showAnns(anns)

Output:

6.5.4 Keypoint detection

Load the annotations:

# initialize COCO api for person keypoints annotations
root = r'E:\Data\coco'   # root directory of the COCO dataset
annType = 'annotations_trainval2017'   # type of COCO annotation data
annFile = 'annotations/person_keypoints_val2017.json'

annZ = AnnZ(root, annType)
coco_kps = COCOZ(annZ, annFile)

Output:

Loading json in memory ...
used time: 0.924155 s
Loading json in memory ...
creating index...
index created!
used time: 0.378003 s

First pick an image that contains a person:

from matplotlib import pyplot as plt

catIds = coco.getCatIds(catNms=['person'])  # id of the person category
imgIds = coco.getImgIds(catIds=catIds)  
img = coco.loadImgs(imgIds)[99]
# use url to load image
I = sio.imread(img['coco_url'])
plt.axis('off')
plt.imshow(I)
plt.show()

Output:

Overlay the annotations on the image:

# load and display keypoints annotations
plt.imshow(I); plt.axis('off')
ax = plt.gca()
annIds = coco_kps.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco_kps.loadAnns(annIds)
coco_kps.showAnns(anns)

Output:

6.5.5 Image captioning

Load the annotations:

# initialize COCO api for caption annotations
root = r'E:\Data\coco'   # root directory of the COCO dataset
annType = 'annotations_trainval2017'   # type of COCO annotation data
annFile = 'annotations/captions_val2017.json'

annZ = AnnZ(root, annType)
coco_caps = COCOZ(annZ, annFile)

Output:

Loading json in memory ...
used time: 0.0760329 s
Loading json in memory ...
creating index...
index created!
used time: 0.0170002 s

Display the captions for the image:

A man riding on a skateboard on the sidewalk.
a kid riding on a skateboard on the cement
There is a skateboarder riding his board on the sidewalk
A skateboarder with one fut on a skateboard raising it up.
A pavement where a person foot are seen having skates.

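At this point we have reached our goal.

6.6 Making the API more general

Although we have met the goal, cocoz.py still has plenty of room for improvement. For example, we can make ImageZ behave like a list, supporting indexing and slicing. To improve the structure, we can also expose it as a generator. These ideas lead to an improved ImageZ that only needs a few more instance methods: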

def __getitem__(self, item):
    names = self.names[item]
    if isinstance(item, slice):
        return [self.buffer2array(name) for name in names]
    else:
        return self.buffer2array(names)


def __len__(self):
    return len(self.names)


def __iter__(self):
    for name in self.names:
        yield self.buffer2array(name)

The full code is at: cocoz.py^[10].
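The three methods can be tried in isolation. The sketch below is an illustration, not the contents of cocoz.py: it puts the same __getitem__, __len__, and __iter__ logic on a small class whose loader is an arbitrary function instead of buffer2array. LazyItems and loader are hypothetical names.

```python
class LazyItems:
    """List-like view that loads an item only when it is requested."""

    def __init__(self, names, loader):
        self.names = names
        self.loader = loader  # hypothetical callable: name -> data

    def __getitem__(self, item):
        names = self.names[item]
        if isinstance(item, slice):
            return [self.loader(name) for name in names]
        return self.loader(names)  # a single name for an integer index

    def __len__(self):
        return len(self.names)

    def __iter__(self):
        for name in self.names:
            yield self.loader(name)


items = LazyItems(['a.jpg', 'b.jpg', 'c.jpg'], loader=str.upper)
print(items[1])          # B.JPG
print(items[0:2])        # ['A.JPG', 'B.JPG']
print(len(items))        # 3
print(list(items)[-1])   # C.JPG
```

Because each item is loaded on access, iterating over the object never holds more than one decoded item at a time unless the caller keeps them.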

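Also, so that other datasets can use the ImageZ class, we change its input parameter to the path where the images are located; everything else stays the same. Since ImageZ, AnnZ, and COCOZ are now packaged in cocoz.py, we can import them directly: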

import sys

# add cocoapi to the module search path
sys.path.append(r'D:\API\cocoapi\PythonAPI')

from pycocotools.cocoz import AnnZ, ImageZ, COCOZ

To avoid repetition, this time we look at an image from train2017.zip:

dataDir = r'E:\Data\coco\images'   # directory containing the COCO image archives
dataType = 'train2017'
imgZ = ImageZ(dataDir, dataType)

Now we view an image by index:

from matplotlib import pyplot as plt

img = imgZ[78]
plt.imshow(img)
plt.show()

It is displayed as follows:

We can also fetch several images with a slice. To make visualization easier, we first define a display function:

from IPython import display


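def use_svg_display():
    # render figures as SVG for sharper output
    # (newer IPython releases deprecate this call in favor of
    # matplotlib_inline.backend_inline.set_matplotlib_formats)
    display.set_matplotlib_formats('svg')


def show_imgs(imgs, is_first_channel=False):
    '''
    Show several images in a grid with 4 rows
    '''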
    if is_first_channel:
        imgs = imgs.transpose((0, 2, 3, 1))
    n = len(imgs)
    h, w = 4, int(n / 4)
    use_svg_display()
    _, ax = plt.subplots(h, w, figsize=(5, 5))  # set the figure size
    K = np.arange(n).reshape((h, w))
    for i in range(h):
        for j in range(w):
            img = imgs[K[i, j]]
            ax[i][j].imshow(img)
            ax[i][j].axes.get_yaxis().set_visible(False)
            ax[i][j].set_xticks([])
    plt.show()

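Let's look at 16 of the images:

show_imgs(imgZ[100:116])

The result is displayed as follows:

So far we have only looked at the COCO dataset, but ImageZ is not limited to COCO. It can handle other datasets packaged as .zip archives, such as the humpback whale dataset from the Kaggle competition Humpback Whale Identification^[11]. The dataset is about 5 GB, and decompressing all of it before processing is cumbersome, so we read the images directly with ImageZ.

First, we unpack the downloaded all.zip (only this outer archive; the image archives inside it stay compressed):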

import zipfile
import os

dataDir = r'E:\Data\Kaggle'

fname = 'all.zip'

with zipfile.ZipFile(os.path.join(dataDir, fname)) as z:
    z.extractall(os.path.join(dataDir, 'HumpbackWhale'))

After unpacking, we look at what the dataset contains:

dataDir = r'E:\Data\Kaggle\HumpbackWhale'

os.listdir(dataDir)

Output:

['sample_submission.csv', 'test.zip', 'train.csv', 'train.zip']

As you can see, the image data are in 'test.zip' and 'train.zip'. We can read the images directly with ImageZ:

dataDir = r'E:\Data\Kaggle\HumpbackWhale'
dataType = 'train'
imgZ = ImageZ(dataDir, dataType)

Let's also look at 16 of these images:

show_imgs(imgZ[100:116])

Displayed:

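6.7 Summary

This chapter introduced the COCO dataset and its API, cocoapi. To make cocoapi easier to use, we also built cocoz.py, an interface that reads .zip datasets directly. cocoz.py can also read other datasets packaged in the COCO annotation format.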

The demo code can also be downloaded directly.
