yolov3 kmeans
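When YOLOv3 predicts bounding boxes it relies on anchor boxes. An anchor is a (width, height) pair describing one of the most likely object shapes, and the anchors are obtained in advance by clustering. Take a single feature map cell: in principle the network could predict an object of any shape around that cell, but the prediction is not arbitrary. It is made relative to the anchor box sizes, that is, the most likely object shapes as measured by clustering the labeled data.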
The configuration in the .cfg file looks like this:
[yolo]
mask = 3,4,5
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
Before training on your own data, change the anchors to match that data. The anchor sizes are obtained by clustering.
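To make the cfg values concrete, here is a minimal sketch that splits the anchors string into (width, height) pairs and picks out the ones listed in mask. The helper name parse_anchors is hypothetical and is not part of darknet; in darknet the mask indices select which anchors a given [yolo] layer uses.

```python
def parse_anchors(anchors_value, mask_value):
    """Split a cfg-style anchors string into (w, h) pairs and select the masked ones."""
    numbers = [int(v) for v in anchors_value.split(',')]   # int() ignores surrounding spaces
    pairs = list(zip(numbers[0::2], numbers[1::2]))         # consecutive values form (w, h)
    mask = [int(v) for v in mask_value.split(',')]
    return pairs, [pairs[i] for i in mask]

all_anchors, used_anchors = parse_anchors(
    "10,14, 23,27, 37,58, 81,82, 135,169, 344,319", "3,4,5")
print(all_anchors)
print(used_anchors)   # the anchors at indices 3, 4 and 5
```

With the values from the cfg excerpt, this layer works with the three largest anchors.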
Put simply, clustering groups together data points that lie close to each other.
The idea behind k-means is simple:
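- Pick k cluster centers arbitrarily
- Assign each point to its nearest cluster
- The clusters obtained this way are certainly poor, because the initial centers were chosen at random
- Update each cluster center to the mean of the points currently in that cluster.
The cluster centers are now more accurate. Why? Suppose a cluster currently holds 3 points, 2 of them close together and 1 a little farther away. Taking the mean effectively gives the 2 nearby points more votes, so the new center moves toward them. You might object: what if the randomly chosen initial center happened to be very good, and taking the mean makes it worse? This is in fact a known weakness of k-means: the result depends heavily on the initial positions of the k cluster centers. A poorly chosen k can also give bad clusters, which is why you should inspect the data before deciding how many clusters to use.
- Repeat the process:
- Assign each point to its nearest cluster
- Update each cluster center to the mean of the points currently in that cluster
- Keep repeating until the cluster centers barely change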
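The steps above can be sketched in a few lines of NumPy. This is a generic k-means with plain Euclidean distance, not the YOLO-specific version discussed later; the function name kmeans_euclidean, the seed, the max_iter cap and the sample points are all assumptions made for illustration.

```python
import numpy as np

def kmeans_euclidean(points, k, seed=0, max_iter=100):
    """Plain k-means: random initial centers, then assign/update until assignments stop changing."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct samples as the initial cluster centers.
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    assignments = None
    for _ in range(max_iter):
        # Step 2: distance from every point to every center, shape (N, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_assignments = np.argmin(dists, axis=1)
        if assignments is not None and np.array_equal(new_assignments, assignments):
            break  # nothing moved, so the centers are stable
        assignments = new_assignments
        # Step 3: move each center to the mean of its members (skip empty clusters).
        for j in range(k):
            members = points[assignments == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, assignments

points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
centroids, assignments = kmeans_euclidean(points, k=2)
print(centroids)
print(assignments)
```

On this toy data the two groups are far apart, so the first three points end up in one cluster and the last three in the other whichever samples are drawn as initial centers.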
Label file format required by YOLOv3
<object-class> <x_center> <y_center> <width> <height>
Where:
<object-class> - integer object number from 0 to (classes-1)
<x_center> <y_center> <width> <height> - float values relative to the width and height of the image, in the range (0.0, 1.0]
> for example: <x> = <absolute_x> / <image_width> or <height> = <absolute_height> / <image_height>
attention: <x_center> <y_center> are the center of the rectangle (not the top-left corner)
Example:
1 0.716797 0.395833 0.216406 0.147222
Every value after the class id is a ratio: (center x, center y, object width, object height).
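As an illustration of the format, the following sketch converts a pixel-coordinate box into one label line. The function name to_yolo_label and the sample box and image size are made up for this example.

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, image_w, image_h):
    """Convert a pixel box (corner coordinates) into a YOLO label line of ratios."""
    x_center = (x_min + x_max) / 2.0 / image_w   # box center, relative to image width
    y_center = (y_min + y_max) / 2.0 / image_h   # box center, relative to image height
    width = (x_max - x_min) / image_w
    height = (y_max - y_min) / image_h
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x_center, y_center, width, height)

label = to_yolo_label(1, 100, 50, 300, 150, 400, 200)
print(label)   # 1 0.500000 0.500000 0.500000 0.500000
```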
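k-means implementation
Ordinarily, the distance from a sample point to a centroid is simply the distance between the two points, and each sample is assigned to its nearest centroid.
In YOLOv3 the sample data has a concrete meaning: what we ultimately want to know is the shape of the bounding box for the most likely objects. So instead of the point-to-point distance we compute the IoU of the two boxes, which measures how similar they are: d = 1 - iou(box1, box_cluster). The smaller d is, the more box1 resembles box_cluster, and box1 is assigned to the box_cluster with the smallest d.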
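Written out, and assuming the two boxes are aligned at a common corner so that only width and height matter (this matches the case analysis in the IOU function used in this post), the distance between a sample box x = (w, h) and a centroid box c = (c_w, c_h) is:

```latex
\mathrm{IoU}(x, c) = \frac{\min(w, c_w)\,\min(h, c_h)}{w h + c_w c_h - \min(w, c_w)\,\min(h, c_h)},
\qquad
d(x, c) = 1 - \mathrm{IoU}(x, c)
```

For example, with x = (0.2, 0.2) and c = (0.4, 0.1) the intersection is 0.2 * 0.1 = 0.02 and the union is 0.04 + 0.04 - 0.02 = 0.06, so IoU = 1/3 and d = 2/3.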
Loading the data
with open(args.filelist) as f:
    lines = [line.rstrip('\n') for line in f.readlines()]
annotation_dims = []
for line in lines:
    # turn the image path into the path of its label file
    #line = line.replace('images','labels')
    #line = line.replace('img1','labels')
    line = line.replace('JPEGImages','labels')
    line = line.replace('.jpg','.txt')
    line = line.replace('.png','.txt')
    print(line)
    with open(line) as f2:
        for line in f2.readlines():
            line = line.rstrip('\n')
            w,h = line.split(' ')[3:]  # the last two fields are width and height
            #print(w,h)
            annotation_dims.append(tuple(map(float,(w,h))))
annotation_dims = np.array(annotation_dims)
It looks like a lot, but the key part is just these two lines:
w,h = line.split(' ')[3:]
annotation_dims.append(tuple(map(float,(w,h))))
This uses Python's map; see https://www.runoob.com/python/python-func-map.html
The result is an N*2 matrix, where N is the number of samples (labeled boxes).
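Here is a small stand-alone sketch of that parsing step, run on in-memory strings instead of files. The first label line is the example from this post; the second one is invented so that the resulting matrix has two rows.

```python
import numpy as np

label_lines = [
    "1 0.716797 0.395833 0.216406 0.147222",   # example line from this post
    "0 0.500000 0.500000 0.100000 0.300000",   # hypothetical second box
]

dims = []
for line in label_lines:
    w, h = line.split()[3:]                    # keep only width and height (as strings)
    dims.append(tuple(map(float, (w, h))))     # map applies float to both values
dims = np.array(dims)

print(dims.shape)   # (2, 2)
print(dims)
```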
- Define the distance from a sample point to a centroid
Compute the IoU between the box represented by sample x and each of the k centroid boxes (that is, compare how similar the box shapes are).
This uses the concept of IoU (intersection over union): the intersection area divided by the union area.
def IOU(x,centroids):
    similarities = []
    k = len(centroids)
    for centroid in centroids:
        c_w,c_h = centroid
        w,h = x
        if c_w>=w and c_h>=h: # box(c_w,c_h) fully contains box(w,h)
            similarity = w*h/(c_w*c_h)
        elif c_w>=w and c_h<=h: # box(c_w,c_h) is wider and flatter
            similarity = w*c_h/(w*h + (c_w-w)*c_h)
        elif c_w<=w and c_h>=h: # box(c_w,c_h) is narrower and taller
            similarity = c_w*h/(w*h + c_w*(c_h-h))
        else: #means both w,h are bigger than c_w and c_h respectively
            similarity = (c_w*c_h)/(w*h)
        similarities.append(similarity) # will become (k,) shape
    return np.array(similarities)
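The four branches can also be collapsed into a single expression with np.minimum. The sketch below is an alternative formulation, not the original author's code; the name iou_wh and the sample centroids are assumptions.

```python
import numpy as np

def iou_wh(box, centroids):
    """IoU between one (w, h) box and every row of centroids, with boxes aligned at a corner."""
    box = np.asarray(box, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Overlap is the smaller width times the smaller height.
    inter = np.minimum(box[0], centroids[:, 0]) * np.minimum(box[1], centroids[:, 1])
    union = box[0] * box[1] + centroids[:, 0] * centroids[:, 1] - inter
    return inter / union

centroids = np.array([[0.2, 0.2], [0.4, 0.1], [0.1, 0.4], [0.05, 0.05]])
ious = iou_wh((0.2, 0.2), centroids)
print(ious)        # one IoU per centroid
print(1 - ious)    # the k-means distance d = 1 - IoU
```

Each of the four sample centroids exercises one branch of the original function: identical box, wider and flatter, narrower and taller, and fully contained.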
k-means implementation
def kmeans(X,centroids,eps,anchor_file):
    N = X.shape[0]
    k,dim = centroids.shape
    prev_assignments = np.ones(N)*(-1)
    iter = 0
    old_D = np.zeros((N,k)) # distance matrix: N points, each with a distance to k centroids, N*k distances in total
    while True:
        D = []
        iter+=1
        for i in range(N):
            d = 1 - IOU(X[i],centroids) # d has k entries
            D.append(d)
        D = np.array(D) # D.shape = (N,k)
        print("iter {}: dists = {}".format(iter,np.sum(np.abs(old_D-D))))
        #assign samples to centroids
        assignments = np.argmin(D,axis=1) # index of the minimum in each row, i.e. which of the k centroids the sample belongs to
        if (assignments == prev_assignments).all() : # assignments, and therefore centroids, no longer change
            print("Centroids = ",centroids)
            write_anchors_to_file(centroids,X,anchor_file)
            return
        #calculate new centroids
        centroid_sums=np.zeros((k,dim),float) #(k,2); np.float was removed in NumPy 1.24, so use float
        for i in range(N):
            centroid_sums[assignments[i]]+=X[i] # add each sample to the sum of its centroid
        for j in range(k):
            centroids[j] = centroid_sums[j]/(np.sum(assignments==j)) # update the centroid
        prev_assignments = assignments.copy()
        old_D = D.copy()
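- Compute the distance from each sample point to every cluster centroid. d = 1 - IOU(X[i],centroids) gives the distance from the sample to each cluster centroid.
- np.argmin(D,axis=1) gives, for each sample point, the nearest cluster centroid
For argmin usage see https://docs.scipy.org/doc/numpy/reference/generated/numpy.argmin.html
- Sum the sample points in each cluster, take the mean, and update the cluster centroid.
for i in range(N):
    centroid_sums[assignments[i]]+=X[i] # add each sample to the sum of its centroid
for j in range(k):
    centroids[j] = centroid_sums[j]/(np.sum(assignments==j)) # update the centroid
- Repeat the above until the assignments, and therefore the centroids, no longer change. Clustering is then complete.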
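The assignment and update steps can be traced by hand on a tiny made-up example. The distance matrix D and the samples X below are invented numbers (3 samples, 2 centroids), chosen only to show what np.argmin(D, axis=1) returns and how the means are formed.

```python
import numpy as np

# Hypothetical distances: row i holds the distance from sample i to each of 2 centroids.
D = np.array([[0.1, 0.9],
              [0.2, 0.7],
              [0.8, 0.3]])
# Hypothetical (w, h) samples matching the rows of D.
X = np.array([[0.10, 0.20],
              [0.30, 0.40],
              [0.60, 0.80]])

assignments = np.argmin(D, axis=1)   # nearest centroid index for each sample
new_centroids = np.array([X[assignments == j].mean(axis=0) for j in range(D.shape[1])])

print(assignments)      # [0 0 1]
print(new_centroids)    # mean (w, h) of the samples in each cluster
```

Samples 0 and 1 fall into cluster 0, so its new centroid is their mean (0.2, 0.3); cluster 1 contains only sample 2.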
Saving the anchor box sizes found by clustering
def write_anchors_to_file(centroids,X,anchor_file):
    anchors = centroids.copy()
    print(anchors.shape)
    for i in range(anchors.shape[0]):
        anchors[i][0]*=width_in_cfg_file/32.
        anchors[i][1]*=height_in_cfg_file/32.
    widths = anchors[:,0]
    sorted_indices = np.argsort(widths) # write anchors in order of increasing width
    print('Anchors = ', anchors[sorted_indices])
    with open(anchor_file,'w') as f:
        for i in sorted_indices[:-1]:
            f.write('%0.2f,%0.2f, '%(anchors[i,0],anchors[i,1]))
        #there should not be comma after last anchor, that's why
        last = sorted_indices[-1] # a scalar index, so the format receives plain floats
        f.write('%0.2f,%0.2f\n'%(anchors[last,0],anchors[last,1]))
        f.write('%f\n'%(avg_IOU(X,centroids)))
    print()
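Because the YOLO label files store values as ratios of the image width and height, the resulting anchor box sizes must be multiplied by the size of the model's input image.
In the code above,
anchors[i][0]*=width_in_cfg_file/32.
anchors[i][1]*=height_in_cfg_file/32.
the division by 32 is required by YOLOv2. YOLOv3 does not need it! Check whether you are using YOLOv2 or v3; for v3, remove the /32. See https://github.com/pjreddie/darknet/issues/911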
for Yolo v2: width=704 height=576 in cfg-file
./darknet detector calc_anchors data/hand.data -num_of_clusters 5 -width 22 -height 18 -show
for Yolo v3: width=704 height=576 in cfg-file
./darknet detector calc_anchors data/hand.data -num_of_clusters 9 -width 704 -height 576 -show
And you can use any images with any sizes.
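The difference between the two versions comes down to one scale factor, which the sketch below makes explicit. The function scale_anchors and its yolo_version parameter are hypothetical names for this illustration; the 704x576 input size is taken from the quoted example, and the centroid is an invented value.

```python
import numpy as np

def scale_anchors(centroids, net_w, net_h, yolo_version):
    """Scale normalized (w, h) centroids to the units each YOLO version expects."""
    # Assumption for this sketch: v2 anchors are in units of 32-pixel grid cells,
    # v3 anchors are in pixels of the network input.
    stride = 32.0 if yolo_version == 2 else 1.0
    scaled = np.array(centroids, dtype=float)   # copy, leaves the input untouched
    scaled[:, 0] *= net_w / stride
    scaled[:, 1] *= net_h / stride
    return scaled

centroids = np.array([[0.25, 0.5]])             # hypothetical normalized centroid
v3_anchors = scale_anchors(centroids, 704, 576, yolo_version=3)
v2_anchors = scale_anchors(centroids, 704, 576, yolo_version=2)
print(v3_anchors)   # [[176. 288.]]
print(v2_anchors)   # [[5.5 9. ]]
```

This agrees with the commands quoted above: 704/32 = 22 and 576/32 = 18 are the width and height passed for YOLOv2.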
For the full code see https://github.com/AlexeyAB/darknet/tree/master/scripts
Usage: python3 gen_anchors.py -filelist ../build/darknet/x64/data/park_train.txt
/20190822***************/
Complete code with detailed comments
'''
Created on Feb 20, 2017
@author: jumabek
'''
from os.path import join
import argparse
#import cv2
import numpy as np
import sys
import os
import random
# network input size, must match width/height in the cfg file
width_in_cfg_file = 320.
height_in_cfg_file = 320.
def IOU(x,centroids):
    similarities = []
    k = len(centroids)
    for centroid in centroids:
        c_w,c_h = centroid
        w,h = x
        if c_w>=w and c_h>=h:
            similarity = w*h/(c_w*c_h)
        elif c_w>=w and c_h<=h:
            similarity = w*c_h/(w*h + (c_w-w)*c_h)
        elif c_w<=w and c_h>=h:
            similarity = c_w*h/(w*h + c_w*(c_h-h))
        else: #means both w,h are bigger than c_w and c_h respectively
            similarity = (c_w*c_h)/(w*h)
        similarities.append(similarity) # will become (k,) shape
    return np.array(similarities)
def avg_IOU(X,centroids):
    n,d = X.shape
    sum = 0.
    for i in range(X.shape[0]):
        #note IOU() will return array which contains IoU for each centroid and X[i] // slightly ineffective, but I am too lazy
        sum+= max(IOU(X[i],centroids))
    return sum/n
def write_anchors_to_file(centroids,X,anchor_file):
    anchors = centroids.copy()
    print(anchors.shape)
    for i in range(anchors.shape[0]):
        anchors[i][0]*=width_in_cfg_file/32.
        anchors[i][1]*=height_in_cfg_file/32.
    widths = anchors[:,0]
    sorted_indices = np.argsort(widths)
    print('Anchors = ', anchors[sorted_indices])
    with open(anchor_file,'w') as f:
        for i in sorted_indices[:-1]:
            f.write('%0.2f,%0.2f, '%(anchors[i,0],anchors[i,1]))
        #there should not be comma after last anchor, that's why
        last = sorted_indices[-1] # a scalar index, so the format receives plain floats
        f.write('%0.2f,%0.2f\n'%(anchors[last,0],anchors[last,1]))
        f.write('%f\n'%(avg_IOU(X,centroids)))
    print()
def kmeans(X,centroids,eps,anchor_file):
    """
    X.shape = N * dim  N is the total number of samples, dim is the number of dimensions per sample
    centroids.shape = k * dim  k is the number of clusters, dim is the number of dimensions per sample
    """
    print("X.shape=",X.shape,"centroids.shape=",centroids.shape)
    N = X.shape[0]
    k,dim = centroids.shape
    prev_assignments = np.ones(N)*(-1)
    iter = 0
    old_D = np.zeros((N,k))
    while True:
        # D.shape = N * k  N is the total number of samples, the k columns hold the distances to the k centroids
        # 1. Compute D
        # 2. Find which cluster each sample currently belongs to
        #    assignments = np.argmin(D,axis=1)
        #    assignments.shape = (N,)  one entry per sample: the cluster it currently belongs to
        #    In NumPy axis 0 runs down the rows and axis 1 runs across the columns,
        #    so np.argmin(D,axis=1) returns the index of the minimum in each row
        # 3. After assigning the samples to clusters, recompute each cluster's centroid
        #    centroid_sums.shape = k * dim  k clusters, the dim columns hold the per-dimension sum of that cluster's samples
        #    centroid_sums=np.zeros((k,dim),float)
        #    for i in range(N):
        #        centroid_sums[assignments[i]]+=X[i]  # assignments[i] is the cluster of sample i; add the sample to that cluster
        #    for j in range(k):
        #        centroids[j] = centroid_sums[j]/(np.sum(assignments==j))  # np.sum(assignments==j) is the number of samples in cluster j
        D = []
        iter+=1
        for i in range(N):
            d = 1 - IOU(X[i],centroids)
            D.append(d)
        D = np.array(D) # D.shape = (N,k)
        print("iter {}: dists = {}".format(iter,np.sum(np.abs(old_D-D))))
        assignments = np.argmin(D,axis=1)
        # stop once no sample changes cluster any more
        if (assignments == prev_assignments).all() :
            print("Centroids = ",centroids)
            write_anchors_to_file(centroids,X,anchor_file)
            return
        #calculate new centroids
        centroid_sums=np.zeros((k,dim),float) # np.float was removed in NumPy 1.24, so use float
        for i in range(N):
            centroid_sums[assignments[i]]+=X[i]
        for j in range(k):
            print("cluster{} has {} sample".format(j,np.sum(assignments==j)))
            centroids[j] = centroid_sums[j]/(np.sum(assignments==j))
        prev_assignments = assignments.copy()
        old_D = D.copy()
def main(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('-filelist', default = '\\path\\to\\voc\\filelist\\train.txt',
                        help='path to filelist\n' )
    parser.add_argument('-output_dir', default = 'generated_anchors/anchors', type = str,
                        help='Output anchor directory\n' )
    parser.add_argument('-num_clusters', default = 0, type = int,
                        help='number of clusters\n' )
    args = parser.parse_args()
    if not os.path.exists(args.output_dir):
        os.makedirs(args.output_dir) # makedirs also creates the parent directory of the nested default path
    with open(args.filelist) as f:
        lines = [line.rstrip('\n') for line in f.readlines()]
    # store the w_ratio,h_ratio of every object in the label files in annotation_dims
    annotation_dims = []
    for line in lines:
        #line = line.replace('images','labels')
        #line = line.replace('img1','labels')
        line = line.replace('JPEGImages','labels')
        line = line.replace('.jpg','.txt')
        line = line.replace('.png','.txt')
        print(line)
        with open(line) as f2:
            for line in f2.readlines():
                line = line.rstrip('\n')
                w,h = line.split(' ')[3:]
                #print(w,h)
                annotation_dims.append(tuple(map(float,(w,h))))
    annotation_dims = np.array(annotation_dims)
    eps = 0.005
    if args.num_clusters == 0:
        for num_clusters in range(1,11): #we make 1 through 10 clusters
            print(num_clusters)
            anchor_file = join( args.output_dir,'anchors%d.txt'%(num_clusters))
            indices = [ random.randrange(annotation_dims.shape[0]) for i in range(num_clusters)]
            centroids = annotation_dims[indices]
            kmeans(annotation_dims,centroids,eps,anchor_file)
            print('centroids.shape', centroids.shape)
    else:
        anchor_file = join( args.output_dir,'anchors%d.txt'%(args.num_clusters))
        ## randomly pick args.num_clusters samples as the initial centroids
        indices = [ random.randrange(annotation_dims.shape[0]) for i in range(args.num_clusters)]
        print("indices={}".format(indices))
        centroids = annotation_dims[indices]
        print("centroids=",centroids)
        ##
        kmeans(annotation_dims,centroids,eps,anchor_file)
        print('centroids.shape', centroids.shape)
if __name__=="__main__":
    main(sys.argv)
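If the objects in your training images come in only a few shapes, say 2 or 3, a cluster may end up with no samples assigned to it at all, and the centroid update then divides by zero.
That means you asked for too many clusters. In that case, set num_clusters on the command line: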
python3 gen_anchors.py -filelist ./train.txt -num_clusters 3
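If you would rather keep the requested number of clusters, one option is to guard the update step so that an empty cluster keeps its previous centroid instead of dividing by zero. This is a sketch of one possible workaround, not part of gen_anchors.py; the name update_centroids and the sample data are assumptions.

```python
import numpy as np

def update_centroids(X, assignments, centroids):
    """Mean update that leaves the centroid of an empty cluster unchanged."""
    new_centroids = np.array(centroids, dtype=float)   # copy of the old centroids
    for j in range(len(new_centroids)):
        members = X[assignments == j]
        if len(members) > 0:                            # skip clusters with no samples
            new_centroids[j] = members.mean(axis=0)
    return new_centroids

X = np.array([[0.2, 0.2], [0.4, 0.4], [0.1, 0.3]])
assignments = np.array([0, 0, 1])                       # cluster 2 receives no samples
old_centroids = np.array([[0.3, 0.3], [0.1, 0.1], [0.9, 0.9]])
updated = update_centroids(X, assignments, old_centroids)
print(updated)
```

An empty cluster still signals that k is probably too large for the data, so lowering num_clusters remains the simpler fix.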