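The code below is adapted from the official COCO API. The rewritten module, cocoz.py, lives in Xinering/cocoapi. My main improvements are:
- Added support for Windows.
- Replaced defaultdict with dict.get(), which resolves an encoding problem on Windows.
- Skipped the decompression step (whether direct or indirect) and operate directly on the image data (images) and the annotation data (annotations) inside the archives.
- Because nothing needs to be unpacked, the API is more convenient and more efficient to use.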
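The swap from defaultdict to dict.get() can be pictured with the minimal sketch below. It is an illustration only, not the code in cocoz.py; the function name group_by_key and the sample records are hypothetical.

```python
def group_by_key(records, key):
    """Group dicts by record[key] using a plain dict and dict.get()."""
    groups = {}
    for record in records:
        k = record[key]
        # dict.get() returns None for an unseen key, so the empty list that
        # defaultdict(list) would create implicitly is created explicitly here.
        bucket = groups.get(k)
        if bucket is None:
            bucket = groups[k] = []
        bucket.append(record)
    return groups


# Hypothetical sample annotations (not taken from the real dataset).
sample = [
    {'id': 1, 'image_id': 10},
    {'id': 2, 'image_id': 10},
    {'id': 3, 'image_id': 20},
]
grouped = group_by_key(sample, 'image_id')
print({k: [r['id'] for r in v] for k, v in grouped.items()})
```

The result is an ordinary dict, so looking up a missing key raises KeyError instead of silently inserting an empty list.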
Details on how to use the API follow.
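0 Preparation
To use cocoz, download Xinering/cocoapi. Then either place it in the root directory of the project or program you want to run, or add it to the module search path (for the current session only) with the following commands:
import sys
sys.path.append(r'D:\API\cocoapi\PythonAPI')  # path to your downloaded cocoapi (raw string, so the backslashes stay literal)
from pycocotools.cocoz import AnnZ, ImageZ, COCOZ  # load cocoz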
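Now we can use the cocoz.AnnZ, cocoz.ImageZ, and cocoz.COCOZ classes of this API to work with COCO images and annotations. The examples below use Windows; Linux works the same way.
1 cocoz.AnnZ and cocoz.ImageZ
root = r'E:\Data\coco'  # root directory of the COCO dataset
annType = 'annotations_trainval2017'  # which COCO annotation archive to use
annZ = AnnZ(root, annType)
Let's see which annotation files this archive contains: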
annZ.names
['annotations/instances_train2017.json',
'annotations/instances_val2017.json',
'annotations/captions_train2017.json',
'annotations/captions_val2017.json',
'annotations/person_keypoints_train2017.json',
'annotations/person_keypoints_val2017.json']
Load the contents of 'annotations/instances_val2017.json' as a dict:
annFile = 'annotations/instances_val2017.json'
dataset = annZ.json2dict(annFile)
Loading json in memory ...
used time: 0.890035 s
dataset.keys()
dict_keys(['info', 'licenses', 'images', 'annotations', 'categories'])
dataset['images'][0]  # the metadata recorded for one image
{'license': 4,
'file_name': '000000397133.jpg',
'coco_url': 'http://images.cocodataset.org/val2017/000000397133.jpg',
'height': 427,
'width': 640,
'date_captured': '2013-11-14 17:02:52',
'flickr_url': 'http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg',
'id': 397133}
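For readers curious how an annotation file can be loaded without unpacking the archive, here is a minimal sketch using only the standard library. It is an assumption about the general approach, not the actual AnnZ.json2dict implementation; load_json_from_zip and the tiny demo archive are hypothetical.

```python
import io
import json
import zipfile


def load_json_from_zip(archive, member):
    """Parse one JSON member of a zip archive without extracting it to disk."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as fp:
            return json.load(fp)


# Build a tiny in-memory archive so the sketch runs without the real dataset.
demo = io.BytesIO()
with zipfile.ZipFile(demo, 'w') as zf:
    zf.writestr('annotations/demo.json',
                json.dumps({'images': [], 'annotations': []}))

# List the members, then load one of them straight from the archive.
with zipfile.ZipFile(demo) as zf:
    member_names = zf.namelist()

dataset_demo = load_json_from_zip(demo, 'annotations/demo.json')
print(member_names, sorted(dataset_demo))
```

With the real dataset, the first argument would be the path to the annotation zip file instead of the in-memory buffer.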
1.1 Loading an image from the web
%pylab inline
import skimage.io as sio
coco_url = dataset['images'][0]['coco_url']
# use url to load image
I = sio.imread(coco_url)
plt.axis('off')
plt.imshow(I)
plt.show()
Populating the interactive namespace from numpy and matplotlib
1.2 Reading an image from local disk
To avoid unpacking the dataset, I use the zipfile module:
imgType = 'val2017'
imgZ = ImageZ(root, imgType)
I = imgZ.buffer2array(imgZ.names[0])
plt.axis('off')
plt.imshow(I)
plt.show()
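The idea of reading a picture straight from the zip file can be sketched as below. This is an illustration under assumptions, not the actual buffer2array code: read_member_bytes and decode_image are hypothetical helpers, and the decoding step relies on cv2.imdecode and np.frombuffer, so it needs OpenCV and NumPy installed.

```python
import io
import zipfile


def read_member_bytes(archive, member):
    """Return the raw bytes of one archive member without extracting it."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(member)


def decode_image(buffer):
    """Decode encoded image bytes (e.g. JPEG) into a BGR array with OpenCV."""
    import cv2  # imported lazily so the byte-reading part works without OpenCV
    import numpy as np
    return cv2.imdecode(np.frombuffer(buffer, dtype=np.uint8), cv2.IMREAD_COLOR)


# Demo with a stand-in payload; a real call would pass the image archive
# and the name of a .jpg member, then hand the bytes to decode_image().
demo = io.BytesIO()
with zipfile.ZipFile(demo, 'w') as zf:
    zf.writestr('val2017/demo.bin', b'\x00\x01\x02')

payload = read_member_bytes(demo, 'val2017/demo.bin')
print(len(payload))
```

Only the requested member is read into memory, which is why no extraction of the whole archive is needed.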
2 cocoz.COCOZ
root = r'E:\Data\coco'  # root directory of the COCO dataset
annType = 'annotations_trainval2017'  # which COCO annotation archive to use
annFile = 'annotations/instances_val2017.json'
annZ = AnnZ(root, annType)
coco = COCOZ(annZ, annFile)
Loading json in memory ...
used time: 1.02004 s
Loading json in memory ...
creating index...
index created!
used time: 0.431003 s
To preview the COCO dataset you have loaded, use print():
print(coco)
description: COCO 2017 Dataset
url: http://cocodataset.org
version: 1.0
year: 2017
contributor: COCO Consortium
date_created: 2017/09/01
coco.keys()
dict_keys(['dataset', 'anns', 'imgToAnns', 'catToImgs', 'imgs', 'cats'])
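The keys listed above correspond to lookup tables of the kind the official COCO API builds when it creates its index. A rough sketch of how such tables can be derived from the dataset dict follows; build_index and the sample data are hypothetical, and the real cocoz code may differ.

```python
def build_index(dataset):
    """Build id-based lookup tables from a COCO-style dataset dict."""
    anns = {ann['id']: ann for ann in dataset.get('annotations', [])}
    imgs = {img['id']: img for img in dataset.get('images', [])}
    cats = {cat['id']: cat for cat in dataset.get('categories', [])}
    img_to_anns, cat_to_imgs = {}, {}
    for ann in anns.values():
        img_to_anns.setdefault(ann['image_id'], []).append(ann)
        if 'category_id' in ann:  # caption annotations carry no category
            cat_to_imgs.setdefault(ann['category_id'], []).append(ann['image_id'])
    return {'anns': anns, 'imgs': imgs, 'cats': cats,
            'imgToAnns': img_to_anns, 'catToImgs': cat_to_imgs}


# Hypothetical miniature dataset (ids 17 and 18 are COCO's cat and dog).
sample_dataset = {
    'images': [{'id': 10}, {'id': 20}],
    'categories': [{'id': 17, 'name': 'cat'}, {'id': 18, 'name': 'dog'}],
    'annotations': [
        {'id': 1, 'image_id': 10, 'category_id': 17},
        {'id': 2, 'image_id': 10, 'category_id': 18},
        {'id': 3, 'image_id': 20, 'category_id': 17},
    ],
}
index = build_index(sample_dataset)
print(index['catToImgs'])
```

Once these tables exist, calls such as loadImgs or getImgIds reduce to dictionary lookups rather than scans over the full annotation list.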
2.1 Displaying COCO categories and supercategories
cats = coco.loadCats(coco.getCatIds())
nms = set([cat['name'] for cat in cats])  # collect the name of each category
print('COCO categories: \n{}\n'.format(' '.join(nms)))
# ============================================================
snms = set([cat['supercategory'] for cat in cats])  # collect the supercategory of each category
print('COCO supercategories: \n{}'.format(' '.join(snms)))
COCO categories:
kite potted plant handbag clock umbrella sports ball bird frisbee toilet toaster spoon car snowboard banana fire hydrant skis chair tv skateboard wine glass tie cell phone cake zebra baseball glove stop sign airplane bed surfboard cup knife apple broccoli bicycle train carrot remote cat bear teddy bear person bench horse dog couch orange hair drier backpack giraffe sandwich book donut sink oven refrigerator boat mouse laptop toothbrush keyboard truck motorcycle bottle pizza traffic light cow microwave scissors bus baseball bat elephant fork bowl tennis racket suitcase vase sheep parking meter dining table hot dog
COCO supercategories:
accessory furniture sports vehicle appliance electronic animal indoor outdoor person kitchen food
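The two printed lists do not show which category belongs to which supercategory, and because they come from sets their order is arbitrary. A small sketch that groups names by supercategory and sorts them is given below; names_by_supercategory is a hypothetical helper, and sample_cats is a hand-written excerpt in the same shape as the entries returned by loadCats.

```python
def names_by_supercategory(cats):
    """Map each supercategory to the sorted names of its categories."""
    grouped = {}
    for cat in cats:
        grouped.setdefault(cat['supercategory'], []).append(cat['name'])
    return {sup: sorted(names) for sup, names in sorted(grouped.items())}


# Hand-written excerpt for illustration; pass the real `cats` list instead.
sample_cats = [
    {'supercategory': 'person', 'id': 1, 'name': 'person'},
    {'supercategory': 'vehicle', 'id': 2, 'name': 'bicycle'},
    {'supercategory': 'animal', 'id': 17, 'name': 'cat'},
    {'supercategory': 'animal', 'id': 18, 'name': 'dog'},
]
by_super = names_by_supercategory(sample_cats)
for sup, names in by_super.items():
    print('{}: {}'.format(sup, ', '.join(names)))
```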
2.2 Selecting images by given conditions
Get all images that contain the given categories:
# get all images containing given categories, select one at random
catIds = coco.getCatIds(catNms=['cat', 'dog'])  # ids of the named categories
imgIds = coco.getImgIds(catIds=catIds)  # ids of the images that contain them
img = coco.loadImgs(imgIds)
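In the official COCO API, passing several category ids to getImgIds returns only the images that contain all of them. Assuming cocoz keeps that behaviour, the selection can be sketched as a set intersection; images_with_all_categories and the sample mapping are hypothetical.

```python
def images_with_all_categories(cat_to_imgs, cat_ids):
    """Return sorted ids of images that contain every category in cat_ids."""
    result = None
    for cat_id in cat_ids:
        ids = set(cat_to_imgs.get(cat_id, []))
        # The first category seeds the set; later ones narrow it down.
        result = ids if result is None else result & ids
    return sorted(result or [])


# Hypothetical category-to-image table (17 = cat, 18 = dog in COCO).
sample_cat_to_imgs = {17: [10, 20, 30], 18: [10, 30, 40]}
matching = images_with_all_categories(sample_cat_to_imgs, [17, 18])
print(matching)
```

Adding more categories can only shrink the result, so a very specific combination may match no image at all.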
Pick one image at random and show its metadata:
img = coco.loadImgs(imgIds[np.random.randint(0,len(imgIds))])[0]
img
{'license': 4,
'file_name': '000000318238.jpg',
'coco_url': 'http://images.cocodataset.org/val2017/000000318238.jpg',
'height': 640,
'width': 478,
'date_captured': '2013-11-21 00:01:06',
'flickr_url': 'http://farm8.staticflickr.com/7402/9964003514_84ce7550c9_z.jpg',
'id': 318238}
2.2.1 Loading the image
From the web:
coco_url = img['coco_url']
I = sio.imread(coco_url)
plt.axis('off')
plt.imshow(I)
plt.show()
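From local disk:
There is a catch here: cv2 stores images in BGR channel order by default rather than RGB, so passing I straight to plt would display the wrong colors. To correct this, convert the array with cv2.COLOR_BGR2RGB.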
imgType = 'val2017'
imgZ = ImageZ(root, imgType)
I = imgZ.buffer2array(img['file_name'])
plt.axis('off')
plt.imshow(I)
plt.show()
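The BGR versus RGB point can be checked on a single pixel. The sketch below uses NumPy slicing to reverse the channel axis, which gives the same result as cv2.cvtColor(I, cv2.COLOR_BGR2RGB) for a three-channel image; bgr_to_rgb is a hypothetical helper, and whether the array returned by buffer2array still needs this conversion is not something this sketch can confirm.

```python
import numpy as np


def bgr_to_rgb(image):
    """Reverse the channel axis of an H x W x 3 array (BGR <-> RGB)."""
    return image[..., ::-1]


# One pure-blue pixel in BGR order: (B, G, R) = (255, 0, 0).
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)
print(rgb[0, 0].tolist())
```

If the colors in the plot look swapped (blue skin tones, orange skies), applying this conversion before plt.imshow is the usual remedy.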
2.3 Drawing an image's anns on the image
# load and display instance annotations
plt.imshow(I)
plt.axis('off')
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
coco.showAnns(anns)
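The getAnnIds call above combines three filters: image id, category ids, and the iscrowd flag, where None means "do not filter on this field". A minimal sketch of that filtering logic follows; filter_ann_ids and the sample annotations are hypothetical and only mimic the documented behaviour of the official API.

```python
def filter_ann_ids(anns, img_id=None, cat_ids=(), iscrowd=None):
    """Return ids of annotations that match every filter that is given."""
    selected = []
    for ann in anns:
        if img_id is not None and ann['image_id'] != img_id:
            continue
        if cat_ids and ann['category_id'] not in cat_ids:
            continue
        if iscrowd is not None and ann['iscrowd'] != iscrowd:
            continue
        selected.append(ann['id'])
    return selected


# Hypothetical annotations (1 = person, 17 = cat, 18 = dog in COCO).
sample_anns = [
    {'id': 1, 'image_id': 10, 'category_id': 17, 'iscrowd': 0},
    {'id': 2, 'image_id': 10, 'category_id': 18, 'iscrowd': 0},
    {'id': 3, 'image_id': 10, 'category_id': 1, 'iscrowd': 1},
    {'id': 4, 'image_id': 20, 'category_id': 17, 'iscrowd': 0},
]
pets_in_10 = filter_ann_ids(sample_anns, img_id=10, cat_ids=[17, 18])
crowds_in_10 = filter_ann_ids(sample_anns, img_id=10, iscrowd=1)
print(pets_in_10, crowds_in_10)
```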
2.4 Keypoint detection
# initialize COCO api for person keypoints annotations
root = r'E:\Data\coco'  # root directory of the COCO dataset
annType = 'annotations_trainval2017'  # which COCO annotation archive to use
annFile = 'annotations/person_keypoints_val2017.json'
annZ = AnnZ(root, annType)
coco_kps = COCOZ(annZ, annFile)
Loading json in memory ...
used time: 0.882997 s
Loading json in memory ...
creating index...
index created!
used time: 0.368036 s
First pick an image that contains a person:
catIds = coco.getCatIds(catNms=['person'])  # id of the person category
imgIds = coco.getImgIds(catIds=catIds)
img = coco.loadImgs(imgIds)[77]
# use url to load image
I = sio.imread(img['coco_url'])
plt.axis('off')
plt.imshow(I)
plt.show()
# load and display keypoints annotations
plt.imshow(I); plt.axis('off')
ax = plt.gca()
annIds = coco_kps.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco_kps.loadAnns(annIds)
coco_kps.showAnns(anns)
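Each person annotation in the keypoints file stores its points in a flat 'keypoints' list of (x, y, v) triplets, where v is 0 for an unlabeled point, 1 for a labeled but occluded point, and 2 for a labeled and visible point. The sketch below unpacks such a list; split_keypoints, labeled_points, and the sample values are hypothetical.

```python
def split_keypoints(flat):
    """Turn [x1, y1, v1, x2, y2, v2, ...] into a list of (x, y, v) tuples."""
    if len(flat) % 3 != 0:
        raise ValueError('keypoints length must be a multiple of 3')
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def labeled_points(flat):
    """Keep the (x, y) position of every keypoint that was labeled (v > 0)."""
    return [(x, y) for x, y, v in split_keypoints(flat) if v > 0]


# Made-up values; a real list has 17 triplets, e.g. anns[0]['keypoints'].
sample_keypoints = [120, 80, 2, 0, 0, 0, 131, 77, 1]
triplets = split_keypoints(sample_keypoints)
points = labeled_points(sample_keypoints)
print(triplets, points)
```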
2.5 Image captioning
# initialize COCO api for caption annotations
root = r'E:\Data\coco'  # root directory of the COCO dataset
annType = 'annotations_trainval2017'  # which COCO annotation archive to use
annFile = 'annotations/captions_val2017.json'
annZ = AnnZ(root, annType)
coco_caps = COCOZ(annZ, annFile)
Loading json in memory ...
used time: 0.435748 s
Loading json in memory ...
creating index...
index created!
used time: 0.0139964 s
# load and display caption annotations
annIds = coco_caps.getAnnIds(imgIds=img['id'])
anns = coco_caps.loadAnns(annIds)
coco_caps.showAnns(anns)
plt.imshow(I)
plt.axis('off')
plt.show()
show:
A brown horse standing next to a woman in front of a house.
a person standing next to a horse next to a building
A woman stands beside a large brown horse.
The woman stands next to the large brown horse.
A woman hold a brown horse while a woman watches.
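If you need to use the official API instead, see the post on using the COCO dataset (COCO 數據集的使用).
If you found this helpful, please give the project a star on GitHub: datasetsome. The code for this tutorial is on GitHub: COCOZ 使用說明書 (COCOZ user guide).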