Training YOLOv3 on your own dataset in Colab
The dataset comes from the Tianchi competition 零基礎入門CV - 街景字符編碼識別 (introductory CV: street-view character recognition).
The darknet project actually ships configs for YOLOv3, YOLOv4 and others, but only YOLOv3 is tested here; YOLOv4 is used in much the same way.
For that competition, another author has already written up how to get good results with YOLO: https://tianchi.aliyun.com/notebook-ai/detail?spm=5176.12586969.1002.108.2ce879de4cKZcz&postId=118780
My focus here, however, is simply getting the pipeline to run end to end; best practices can come afterwards.
References:
- https://www.cnblogs.com/monologuesmw/p/13035442.html
- https://blog.csdn.net/weixin_38353277/article/details/105841023
Since this post is a markdown export of an ipynb notebook, it may not read well; the source file can be downloaded here: https://files.cnblogs.com/files/jiading/yolo_in_colab.zip. Note that the source file does not include the conclusion section at the end of this post.
# Check the GPU Colab has allocated
!/opt/bin/nvidia-smi
Mon Sep 28 05:14:09 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |
| N/A 36C P8 9W / 70W | 0MiB / 15079MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
# Clone the project
!git clone https://github.com/AlexeyAB/darknet
Cloning into 'darknet'...
remote: Enumerating objects: 14321, done.
remote: Total 14321 (delta 0), reused 0 (delta 0), pack-reused 14321
Receiving objects: 100% (14321/14321), 12.87 MiB | 22.64 MiB/s, done.
Resolving deltas: 100% (9772/9772), done.
# Edit the Makefile to enable OpenCV and GPU support
%cd darknet
'''
The Linux sed command processes and edits text files according to a script of commands.
s/old/new/ : substitute, i.e. replace the first match of "old" on each line with "new";
-i : edit the file in place instead of printing the result to stdout.
'''
!sed -i 's/OPENCV=0/OPENCV=1/' Makefile
!sed -i 's/GPU=0/GPU=1/' Makefile
!sed -i 's/CUDNN=0/CUDNN=1/' Makefile
/content/darknet
# Verify the CUDA version
!/usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
Run the demo and show the predicted bounding boxes
This step only checks that the environment and the build work.
# Download the pretrained COCO weights into the darknet folder
#!wget https://pjreddie.com/media/files/yolov3.weights
# The weights can be cached on Google Drive and pulled from there later, which is faster than re-downloading
# save first:
#!cp /content/darknet/yolov3.weights '/content/drive/My Drive/cvComp1Realted/yolov3.weights'
# then pull:
!cp '/content/drive/My Drive/cvComp1Realted/yolov3.weights' /content/darknet/yolov3.weights
# Build the project to produce the darknet executable
!make
# Define imShow, which uses OpenCV and matplotlib to display an image
def imShow(path):
    import cv2
    import matplotlib.pyplot as plt
    %matplotlib inline
    image = cv2.imread(path)
    height, width = image.shape[:2]
    resized_image = cv2.resize(image, (3 * width, 3 * height), interpolation=cv2.INTER_CUBIC)
    fig = plt.gcf()
    fig.set_size_inches(18, 10)
    plt.axis("off")
    plt.imshow(cv2.cvtColor(resized_image, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR; matplotlib expects RGB
    plt.show()
# Run the demo
!./darknet detect cfg/yolov3.cfg yolov3.weights data/person.jpg
imShow('predictions.jpg')
CUDA-version: 10010 (10010), cuDNN: 7.6.5, GPU count: 1
OpenCV version: 3.2.0
0 : compute_capability = 750, cudnn_half = 0, GPU: Tesla T4
net.optimized_memory = 0
mini_batch = 1, batch = 1, time_steps = 1, train = 0
layer filters size/strd(dil) input output
0 conv 32 3 x 3/ 1 416 x 416 x 3 -> 416 x 416 x 32 0.299 BF
1 conv 64 3 x 3/ 2 416 x 416 x 32 -> 208 x 208 x 64 1.595 BF
2 conv 32 1 x 1/ 1 208 x 208 x 64 -> 208 x 208 x 32 0.177 BF
3 conv 64 3 x 3/ 1 208 x 208 x 32 -> 208 x 208 x 64 1.595 BF
4 Shortcut Layer: 1, wt = 0, wn = 0, outputs: 208 x 208 x 64 0.003 BF
5 conv 128 3 x 3/ 2 208 x 208 x 64 -> 104 x 104 x 128 1.595 BF
6 conv 64 1 x 1/ 1 104 x 104 x 128 -> 104 x 104 x 64 0.177 BF
7 conv 128 3 x 3/ 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BF
8 Shortcut Layer: 5, wt = 0, wn = 0, outputs: 104 x 104 x 128 0.001 BF
9 conv 64 1 x 1/ 1 104 x 104 x 128 -> 104 x 104 x 64 0.177 BF
10 conv 128 3 x 3/ 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BF
11 Shortcut Layer: 8, wt = 0, wn = 0, outputs: 104 x 104 x 128 0.001 BF
12 conv 256 3 x 3/ 2 104 x 104 x 128 -> 52 x 52 x 256 1.595 BF
13 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
14 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
15 Shortcut Layer: 12, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
16 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
17 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
18 Shortcut Layer: 15, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
19 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
20 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
21 Shortcut Layer: 18, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
22 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
23 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
24 Shortcut Layer: 21, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
25 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
26 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
27 Shortcut Layer: 24, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
28 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
29 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
30 Shortcut Layer: 27, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
31 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
32 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
33 Shortcut Layer: 30, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
34 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
35 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
36 Shortcut Layer: 33, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
37 conv 512 3 x 3/ 2 52 x 52 x 256 -> 26 x 26 x 512 1.595 BF
38 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
39 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
40 Shortcut Layer: 37, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
41 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
42 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
43 Shortcut Layer: 40, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
44 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
45 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
46 Shortcut Layer: 43, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
47 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
48 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
49 Shortcut Layer: 46, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
50 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
51 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
52 Shortcut Layer: 49, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
53 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
54 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
55 Shortcut Layer: 52, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
56 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
57 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
58 Shortcut Layer: 55, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
59 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
60 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
61 Shortcut Layer: 58, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
62 conv 1024 3 x 3/ 2 26 x 26 x 512 -> 13 x 13 x1024 1.595 BF
63 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
64 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
65 Shortcut Layer: 62, wt = 0, wn = 0, outputs: 13 x 13 x1024 0.000 BF
66 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
67 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
68 Shortcut Layer: 65, wt = 0, wn = 0, outputs: 13 x 13 x1024 0.000 BF
69 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
70 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
71 Shortcut Layer: 68, wt = 0, wn = 0, outputs: 13 x 13 x1024 0.000 BF
72 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
73 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
74 Shortcut Layer: 71, wt = 0, wn = 0, outputs: 13 x 13 x1024 0.000 BF
75 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
76 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
77 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
78 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
79 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
80 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
81 conv 255 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 255 0.088 BF
82 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.00
83 route 79 -> 13 x 13 x 512
84 conv 256 1 x 1/ 1 13 x 13 x 512 -> 13 x 13 x 256 0.044 BF
85 upsample 2x 13 x 13 x 256 -> 26 x 26 x 256
86 route 85 61 -> 26 x 26 x 768
87 conv 256 1 x 1/ 1 26 x 26 x 768 -> 26 x 26 x 256 0.266 BF
88 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
89 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
90 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
91 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
92 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
93 conv 255 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 255 0.177 BF
94 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.00
95 route 91 -> 26 x 26 x 256
96 conv 128 1 x 1/ 1 26 x 26 x 256 -> 26 x 26 x 128 0.044 BF
97 upsample 2x 26 x 26 x 128 -> 52 x 52 x 128
98 route 97 36 -> 52 x 52 x 384
99 conv 128 1 x 1/ 1 52 x 52 x 384 -> 52 x 52 x 128 0.266 BF
100 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
101 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
102 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
103 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
104 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
105 conv 255 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 255 0.353 BF
106 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 65.879
avg_outputs = 532444
Allocate additional workspace_size = 52.43 MB
Loading weights from yolov3.weights...
seen 64, trained: 32013 K-images (500 Kilo-batches_64)
Done! Loaded 107 layers from weights-file
Detection layer: 82 - type = 28
Detection layer: 94 - type = 28
Detection layer: 106 - type = 28
data/person.jpg: Predicted in 41.478000 milli-seconds.
dog: 99%
person: 100%
horse: 100%
Unable to init server: Could not connect: Connection refused
(predictions:1181): Gtk-WARNING **: 05:16:13.455: cannot open display:
(The Gtk warning is expected: Colab is headless, so darknet cannot pop up a result window; that is why imShow is used to display predictions.jpg instead.)
Unzip the competition data into the working directory
# Check the current location first
!pwd
/content/darknet
# Pull the raw data down from Google Drive
!mkdir /content/input
!mkdir /content/input/train
!mkdir /content/input/val
!mkdir /content/input/test
# Paths containing spaces must be quoted
!unzip '/content/drive/My Drive/cvComp1Realted/mchar_train.zip' -d /content/input/train
!unzip '/content/drive/My Drive/cvComp1Realted/mchar_val.zip' -d /content/input/val
!unzip '/content/drive/My Drive/cvComp1Realted/mchar_test_a.zip' -d /content/input/test
!cp '/content/drive/My Drive/cvComp1Realted/mchar_train.json' /content/input
!cp '/content/drive/My Drive/cvComp1Realted/mchar_val.json' /content/input
Build the dataset in the layout YOLO expects
Annotations holds the XML label files
JPEGImages holds the images
The Main folder inside ImageSets holds the generated image lists, for example:
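(In this notebook these lists hold full image paths, one per line, as written by the copy loops below:)
/content/VOCdevkit/VOC2007/JPEGImages/train_000000.png
/content/VOCdevkit/VOC2007/JPEGImages/train_000001.png
...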
# Create the folders following the VOC dataset layout
!mkdir /content/VOCdevkit/
!mkdir /content/VOCdevkit/VOC2007
!mkdir /content/VOCdevkit/VOC2007/Annotations
!mkdir /content/VOCdevkit/VOC2007/ImageSets
!mkdir /content/VOCdevkit/VOC2007/JPEGImages
!mkdir /content/VOCdevkit/VOC2007/ImageSets/Main
!mkdir /content/VOCdevkit/VOC2007/labels
import glob
# Gather the image paths; the generated name lists will go into ImageSets
test_path = glob.glob('/content/input/test/mchar_test_a/*.png')
train_path=glob.glob('/content/input/train/mchar_train/*.png')
val_path=glob.glob('/content/input/val/mchar_val/*.png')
# a path looks like this
train_path[0]
'/content/input/train/mchar_train/000653.png'
# Rename the original images (train/val/test each start from 000000.png, so merging them directly would cause name collisions), copy them all into JPEGImages, and write the new full paths into txt files
from shutil import copyfile
import os
path='/content/VOCdevkit/VOC2007/JPEGImages'
# training part
with open('/content/VOCdevkit/VOC2007/ImageSets/Main/train.txt', 'w') as f:
    for item in train_path:
        splited = item.split('/')
        filename = 'train_' + splited[5].split('.')[0]
        topath = os.path.join(path, filename + '.png')
        f.write(os.path.join(path, filename) + '.png\n')
        copyfile(item, topath)
# val part
with open('/content/VOCdevkit/VOC2007/ImageSets/Main/val.txt', 'w') as f:
    for item in val_path:
        splited = item.split('/')
        filename = 'val_' + splited[5].split('.')[0]
        topath = os.path.join(path, filename + '.png')
        f.write(os.path.join(path, filename) + '.png\n')
        copyfile(item, topath)
# test part
with open('/content/VOCdevkit/VOC2007/ImageSets/Main/test.txt', 'w') as f:
    for item in test_path:
        splited = item.split('/')
        filename = 'test_' + splited[5].split('.')[0]
        topath = os.path.join(path, filename + '.png')
        f.write(os.path.join(path, filename) + '.png\n')
        copyfile(item, topath)
Build the label files in XML format
The files follow the PASCAL VOC annotation layout; create_xml below writes one field at a time.
# Load our JSON-format label files
import json
train_labels=json.load(open('/content/input/mchar_train.json'))
val_labels=json.load(open('/content/input/mchar_val.json'))
# Inspect one label
train_labels['000000.png']
{'height': [219, 219],
'label': [1, 9],
'left': [246, 323],
'top': [77, 81],
'width': [81, 96]}
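Each key maps to parallel per-character arrays: box i runs from (left[i], top[i]) to (left[i] + width[i], top[i] + height[i]). A minimal sketch of that mapping (the helper name is mine, purely illustrative):
def to_corner_boxes(label):
    # one (xmin, ymin, xmax, ymax) tuple per character in the image
    return [(l, t, l + w, t + h)
            for l, t, w, h in zip(label['left'], label['top'],
                                  label['width'], label['height'])]
# to_corner_boxes(train_labels['000000.png']) -> [(246, 77, 327, 296), (323, 81, 419, 300)]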
# key is the file name without the prefix; type_name is train/test/val
# pic_path is the folder the renamed images were copied into
# This looks like a lot of code, but it just writes one XML file node by node; a simpler approach would be an XML template filled with values (see the sketch after the function)
def create_xml(key, value, type_name, xml_path, pic_path):
    from PIL import Image
    import os
    import xml.dom.minidom as minidom
    filename = type_name + '_' + key.split('.')[0]
    with open(os.path.join(xml_path, filename + '.xml'), 'w') as f:
        dom = minidom.Document()
        annotation_node = dom.createElement('annotation')
        # <folder>
        folder_node = dom.createElement('folder')
        name_text_value = dom.createTextNode("VOC2007")
        folder_node.appendChild(name_text_value)
        annotation_node.appendChild(folder_node)
        # <filename>
        filename_node = dom.createElement('filename')
        name_text_value = dom.createTextNode(filename + '.png')
        filename_node.appendChild(name_text_value)
        annotation_node.appendChild(filename_node)
        # <source>
        source_node = dom.createElement('source')
        database_node = dom.createElement('database')
        name_text_value = dom.createTextNode("My Database")
        database_node.appendChild(name_text_value)
        source_node.appendChild(database_node)
        annotation_node_2 = dom.createElement('annotation')
        name_text_value = dom.createTextNode("PASCAL VOC2007")
        annotation_node_2.appendChild(name_text_value)
        source_node.appendChild(annotation_node_2)
        image_node = dom.createElement('image')
        name_text_value = dom.createTextNode("flickr")
        image_node.appendChild(name_text_value)
        source_node.appendChild(image_node)
        flickrid_node = dom.createElement('flickrid')
        name_text_value = dom.createTextNode("NULL")
        flickrid_node.appendChild(name_text_value)
        source_node.appendChild(flickrid_node)
        annotation_node.appendChild(source_node)
        # <owner>
        owner_node = dom.createElement('owner')
        flickrid_node_2 = dom.createElement('flickrid')
        name_text_value = dom.createTextNode("NULL")
        flickrid_node_2.appendChild(name_text_value)
        owner_node.appendChild(flickrid_node_2)
        name_node = dom.createElement('name')
        name_text_value = dom.createTextNode("company")
        name_node.appendChild(name_text_value)
        owner_node.appendChild(name_node)
        annotation_node.appendChild(owner_node)
        # <size>: read width/height from the image itself
        size_node = dom.createElement('size')
        img = Image.open(os.path.join(pic_path, filename + '.png'))
        width_node = dom.createElement('width')
        name_text_value = dom.createTextNode(str(img.width))
        width_node.appendChild(name_text_value)
        height_node = dom.createElement('height')
        name_text_value = dom.createTextNode(str(img.height))
        height_node.appendChild(name_text_value)
        depth_node = dom.createElement('depth')
        name_text_value = dom.createTextNode(str(3))
        depth_node.appendChild(name_text_value)
        size_node.appendChild(width_node)
        size_node.appendChild(height_node)
        size_node.appendChild(depth_node)
        annotation_node.appendChild(size_node)
        # <segmented>
        segmented_node = dom.createElement('segmented')
        name_text_value = dom.createTextNode(str(0))
        segmented_node.appendChild(name_text_value)
        annotation_node.appendChild(segmented_node)
        # one <object> per labelled character (test images pass value=None)
        if value is not None:
            labels = value['label']
            index = 0
            for label in labels:
                object_node = dom.createElement('object')
                name_node_2 = dom.createElement('name')
                name_text_value = dom.createTextNode(str(label))
                name_node_2.appendChild(name_text_value)
                object_node.appendChild(name_node_2)
                pose_node = dom.createElement('pose')
                name_text_value = dom.createTextNode('Unspecified')
                pose_node.appendChild(name_text_value)
                object_node.appendChild(pose_node)
                truncated_node = dom.createElement('truncated')
                name_text_value = dom.createTextNode(str(0))
                truncated_node.appendChild(name_text_value)
                object_node.appendChild(truncated_node)
                difficult_node = dom.createElement('difficult')
                name_text_value = dom.createTextNode(str(0))
                difficult_node.appendChild(name_text_value)
                object_node.appendChild(difficult_node)
                # <bndbox>: convert left/top/width/height to corner coordinates
                bndbox_node = dom.createElement('bndbox')
                xmin_node = dom.createElement('xmin')
                name_text_value = dom.createTextNode(str(value['left'][index]))
                xmin_node.appendChild(name_text_value)
                bndbox_node.appendChild(xmin_node)
                ymin_node = dom.createElement('ymin')
                name_text_value = dom.createTextNode(str(value['top'][index]))
                ymin_node.appendChild(name_text_value)
                bndbox_node.appendChild(ymin_node)
                xmax_node = dom.createElement('xmax')
                name_text_value = dom.createTextNode(str(value['left'][index] + value['width'][index]))
                xmax_node.appendChild(name_text_value)
                bndbox_node.appendChild(xmax_node)
                ymax_node = dom.createElement('ymax')
                name_text_value = dom.createTextNode(str(value['top'][index] + value['height'][index]))
                ymax_node.appendChild(name_text_value)
                bndbox_node.appendChild(ymax_node)
                object_node.appendChild(bndbox_node)
                annotation_node.appendChild(object_node)
                index += 1
        dom.appendChild(annotation_node)
        dom.writexml(f, addindent='\n', encoding='utf-8')
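As noted in the comments above, a template-based version would be far shorter. A hedged sketch of that alternative, covering the fields the converter below needs (size, and per object: name, difficult, bndbox) plus a few standard VOC fields; the template names are mine:
# Fill a VOC XML skeleton with str.format instead of building a DOM node by node
VOC_TEMPLATE = ('<annotation><folder>VOC2007</folder>'
                '<filename>{filename}</filename>'
                '<size><width>{w}</width><height>{h}</height><depth>3</depth></size>'
                '<segmented>0</segmented>{objects}</annotation>')
OBJECT_TEMPLATE = ('<object><name>{name}</name><pose>Unspecified</pose>'
                   '<truncated>0</truncated><difficult>0</difficult>'
                   '<bndbox><xmin>{xmin}</xmin><ymin>{ymin}</ymin>'
                   '<xmax>{xmax}</xmax><ymax>{ymax}</ymax></bndbox></object>')

def create_xml_from_template(filename, w, h, boxes):
    # boxes: list of (name, xmin, ymin, xmax, ymax) tuples
    objects = ''.join(OBJECT_TEMPLATE.format(name=n, xmin=x0, ymin=y0, xmax=x1, ymax=y1)
                      for n, x0, y0, x1, y1 in boxes)
    return VOC_TEMPLATE.format(filename=filename, w=w, h=h, objects=objects)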
# Create one XML file as a sanity check
#def create_xml(key,value,type_name,xml_path,pic_path):
xml_path='/content/VOCdevkit/VOC2007/Annotations'
pic_path='/content/VOCdevkit/VOC2007/JPEGImages'
create_xml('000000.png',train_labels['000000.png'],'train',xml_path=xml_path,pic_path=pic_path)
# If that looks right, convert everything
for (key, value) in train_labels.items():
    try:
        create_xml(key, value, 'train', xml_path=xml_path, pic_path=pic_path)
    except FileNotFoundError:
        print(key)
        continue
for (key, value) in val_labels.items():
    try:
        create_xml(key, value, 'val', xml_path=xml_path, pic_path=pic_path)
    except FileNotFoundError:
        print(key)
        continue
# XML files for the test set are optional; I removed the test part from voc_label.py
'''
test_path = glob.glob('/content/input/test/mchar_test_a/*.png')
for i in test_path:
    name = i.split('/')
    try:
        create_xml(name[len(name)-1], None, 'test', xml_path=xml_path, pic_path=pic_path)
    except FileNotFoundError:
        print(name[len(name)-1])
        continue
'''
"\ntest_path = glob.glob('/content/input/test/mchar_test_a/*.png')\nfor i in test_path:\n name=i.split('/')\n try:\n create_xml(name[len(name)-1],None,'test',xml_path=xml_path,pic_path=pic_path)\n except (FileNotFoundError):\n print(name[len(name)-1])\n continue\n"
# opencv-python is already available in the Colab environment
!pip install opencv-python
Requirement already satisfied: opencv-python in /usr/local/lib/python3.6/dist-packages (4.1.2.30)
Requirement already satisfied: numpy>=1.11.3 in /usr/local/lib/python3.6/dist-packages (from opencv-python) (1.18.5)
Modify the parameters:
Go into the cfg folder and edit yolov3.cfg:
- batch and subdivisions at the top of the file (Colab provides a Tesla T4 with 16 GB of VRAM, so batch=128 is a reasonable setting)
- classes in each of the three [yolo] blocks near the end of the file, plus the filters of the convolutional layer directly before each [yolo] block, which must equal (5 + number of classes) * 3
These edits can also be scripted with sed, as sketched below.
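A sketch using the same sed approach as for the Makefile, assuming the stock yolov3.cfg values (classes=80 and filters=255 before each [yolo] block; with 10 digit classes, filters = (5 + 10) * 3 = 45):
!sed -i 's/classes=80/classes=10/g' cfg/yolov3.cfg
!sed -i 's/filters=255/filters=45/g' cfg/yolov3.cfg
# batch and subdivisions sit near the top of the file and can be edited the same way once the current values are known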
!pwd
/content/darknet
# Write the train.data file
with open('train.data', 'w') as f:
    f.write('classes=10\n'
            'train=/content/VOCdevkit/VOC2007/ImageSets/Main/train.txt\n'
            'valid=/content/VOCdevkit/VOC2007/ImageSets/Main/val.txt\n'
            'names=train.names\n'
            'backup=/content/drive/My Drive/cvComp1Realted/backup')
# Write the train.names file (one class name per line; the line order defines the class ids)
with open('train.names', 'w') as f:
    f.write('0\n1\n2\n3\n4\n5\n6\n7\n8\n9')
#!cp /content/darknet/scripts/voc_label.py /content/
voc_label.py was modified for this project; the modified file is as follows:
import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join

#sets=[('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
sets = [('2007', 'train'), ('2007', 'val')]
#classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
classes = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

def convert(size, box):
    # VOC corner box (xmin, xmax, ymin, ymax) -> YOLO (x_center, y_center, w, h), normalized to [0, 1]
    dw = 1./(size[0])
    dh = 1./(size[1])
    x = (box[0] + box[1])/2.0 - 1
    y = (box[2] + box[3])/2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x, y, w, h)

def convert_annotation(year, image_id):
    # image_id arrives as a full path; reduce it to the bare file name
    image_ids = image_id.split('/')
    image_id = image_ids[len(image_ids)-1]
    image_id = image_id.split('.')[0]
    in_file = open('/content/VOCdevkit/VOC%s/Annotations/%s.xml' % (year, image_id))
    out_file = open('/content/VOCdevkit/VOC%s/labels/%s.txt' % (year, image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

wd = getcwd()
for year, image_set in sets:
    if not os.path.exists('/content/VOCdevkit/VOC%s/labels/' % (year)):
        os.makedirs('/content/VOCdevkit/VOC%s/labels/' % (year))
    image_ids = open('/content/VOCdevkit/VOC%s/ImageSets/Main/%s.txt' % (year, image_set)).read().strip().split()
    list_file = open('%s_%s.txt' % (year, image_set), 'w')
    for image_id in image_ids:
        list_file.write(image_id + '\n')
        convert_annotation(year, image_id)
    list_file.close()

os.system("cat 2007_train.txt 2007_val.txt > train.txt")
# 2007_test.txt is no longer generated, so this line prints a harmless "No such file" warning
os.system("cat 2007_train.txt 2007_val.txt 2007_test.txt > train.all.txt")
# Save the modified voc_label.py
#!cp /content/voc_label.py '/content/drive/My Drive/cvComp1Realted/voc_label.py'
# load it back
!cp '/content/drive/My Drive/cvComp1Realted/voc_label.py' /content/voc_label.py
!python /content/voc_label.py
cat: 2007_test.txt: No such file or directory
# Start training
!./darknet detector train /content/darknet/train.data cfg/yolov3.cfg yolov3.weights -dont_show -map
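If Colab drops the session mid-training, darknet can resume from the latest checkpoint by passing it as the weights argument (standard darknet usage; the checkpoint path matches the backup copy handled below):
#!./darknet detector train /content/darknet/train.data cfg/yolov3.cfg backup/yolov3_last.weights -dont_show -map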
# save the last checkpoint to Drive
#!cp /content/darknet/backup/yolov3_last.weights '/content/drive/My Drive/cvComp1Realted/backup/yolov3_last.weights'
# load it back
!cp '/content/drive/My Drive/cvComp1Realted/backup/yolov3_last.weights' /content/darknet/backup/yolov3_last.weights
# Test one image
!./darknet detector test /content/darknet/train.data /content/darknet/cfg/yolov3.cfg /content/darknet/backup/yolov3_last.weights /content/input/val/mchar_val/000001.png -i 0 -thresh 0.05
imShow('predictions.jpg')
CUDA-version: 10010 (10010), cuDNN: 7.6.5, GPU count: 1
OpenCV version: 3.2.0
0 : compute_capability = 750, cudnn_half = 0, GPU: Tesla T4
net.optimized_memory = 0
mini_batch = 1, batch = 1, time_steps = 1, train = 0
layer filters size/strd(dil) input output
0 conv 32 3 x 3/ 1 416 x 416 x 3 -> 416 x 416 x 32 0.299 BF
1 conv 64 3 x 3/ 2 416 x 416 x 32 -> 208 x 208 x 64 1.595 BF
2 conv 32 1 x 1/ 1 208 x 208 x 64 -> 208 x 208 x 32 0.177 BF
3 conv 64 3 x 3/ 1 208 x 208 x 32 -> 208 x 208 x 64 1.595 BF
4 Shortcut Layer: 1, wt = 0, wn = 0, outputs: 208 x 208 x 64 0.003 BF
5 conv 128 3 x 3/ 2 208 x 208 x 64 -> 104 x 104 x 128 1.595 BF
6 conv 64 1 x 1/ 1 104 x 104 x 128 -> 104 x 104 x 64 0.177 BF
7 conv 128 3 x 3/ 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BF
8 Shortcut Layer: 5, wt = 0, wn = 0, outputs: 104 x 104 x 128 0.001 BF
9 conv 64 1 x 1/ 1 104 x 104 x 128 -> 104 x 104 x 64 0.177 BF
10 conv 128 3 x 3/ 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BF
11 Shortcut Layer: 8, wt = 0, wn = 0, outputs: 104 x 104 x 128 0.001 BF
12 conv 256 3 x 3/ 2 104 x 104 x 128 -> 52 x 52 x 256 1.595 BF
13 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
14 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
15 Shortcut Layer: 12, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
16 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
17 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
18 Shortcut Layer: 15, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
19 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
20 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
21 Shortcut Layer: 18, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
22 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
23 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
24 Shortcut Layer: 21, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
25 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
26 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
27 Shortcut Layer: 24, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
28 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
29 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
30 Shortcut Layer: 27, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
31 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
32 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
33 Shortcut Layer: 30, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
34 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
35 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
36 Shortcut Layer: 33, wt = 0, wn = 0, outputs: 52 x 52 x 256 0.001 BF
37 conv 512 3 x 3/ 2 52 x 52 x 256 -> 26 x 26 x 512 1.595 BF
38 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
39 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
40 Shortcut Layer: 37, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
41 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
42 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
43 Shortcut Layer: 40, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
44 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
45 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
46 Shortcut Layer: 43, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
47 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
48 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
49 Shortcut Layer: 46, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
50 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
51 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
52 Shortcut Layer: 49, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
53 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
54 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
55 Shortcut Layer: 52, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
56 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
57 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
58 Shortcut Layer: 55, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
59 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
60 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
61 Shortcut Layer: 58, wt = 0, wn = 0, outputs: 26 x 26 x 512 0.000 BF
62 conv 1024 3 x 3/ 2 26 x 26 x 512 -> 13 x 13 x1024 1.595 BF
63 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
64 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
65 Shortcut Layer: 62, wt = 0, wn = 0, outputs: 13 x 13 x1024 0.000 BF
66 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
67 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
68 Shortcut Layer: 65, wt = 0, wn = 0, outputs: 13 x 13 x1024 0.000 BF
69 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
70 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
71 Shortcut Layer: 68, wt = 0, wn = 0, outputs: 13 x 13 x1024 0.000 BF
72 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
73 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
74 Shortcut Layer: 71, wt = 0, wn = 0, outputs: 13 x 13 x1024 0.000 BF
75 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
76 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
77 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
78 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
79 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
80 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
81 conv 45 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 45 0.016 BF
82 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.00
83 route 79 -> 13 x 13 x 512
84 conv 256 1 x 1/ 1 13 x 13 x 512 -> 13 x 13 x 256 0.044 BF
85 upsample 2x 13 x 13 x 256 -> 26 x 26 x 256
86 route 85 61 -> 26 x 26 x 768
87 conv 256 1 x 1/ 1 26 x 26 x 768 -> 26 x 26 x 256 0.266 BF
88 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
89 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
90 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
91 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
92 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
93 conv 45 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 45 0.031 BF
94 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.00
95 route 91 -> 26 x 26 x 256
96 conv 128 1 x 1/ 1 26 x 26 x 256 -> 26 x 26 x 128 0.044 BF
97 upsample 2x 26 x 26 x 128 -> 52 x 52 x 128
98 route 97 36 -> 52 x 52 x 384
99 conv 128 1 x 1/ 1 52 x 52 x 384 -> 52 x 52 x 128 0.266 BF
100 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
101 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
102 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
103 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
104 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
105 conv 45 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 45 0.062 BF
106 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 65.370
avg_outputs = 518514
Allocate additional workspace_size = 52.43 MB
Loading weights from /content/darknet/backup/yolov3_last.weights...
seen 64, trained: 32038 K-images (500 Kilo-batches_64)
Done! Loaded 107 layers from weights-file
Detection layer: 82 - type = 28
Detection layer: 94 - type = 28
Detection layer: 106 - type = 28
/content/input/val/mchar_val/000001.png: Predicted in 40.634000 milli-seconds.
1: 6%
Unable to init server: Could not connect: Connection refused
(predictions:1358): Gtk-WARNING **: 05:21:04.194: cannot open display:
Conclusion
When I ran the final prediction with the default confidence threshold (0.25), no boxes came out at all, so I lowered the threshold to 0.05 with -thresh 0.05. That is obviously not usable in practice (far too many errors); the point was only to prove the pipeline itself works. Possible reasons for the poor accuracy:
- Insufficient training: in my tests, Colab disconnected every time after the training shell command had run for a while (I tried three times, same result each time), so what got saved is an undertrained model. Also, by default training only seems to save the weights on completion; I have not found a setting for auto-saving every N epochs, which still needs investigating. A workaround sketch for the disconnects follows this list.
- Input size mismatch: in theory images of any size can be fed in, but according to one blog post's discussion, YOLO performs best when the input images match the size set in its config file (cfg/yolov3.cfg). The images in this dataset all have different sizes, and none match the config. I suspect this is also why, during training, only the small-scale detection head could find objects while the larger-scale ones could not (each training step prints three values, one per YOLO output head, each covering a different scale). Resizing for detection also means transforming the bounding-box coordinates in the labels, which is a bit involved, so I skipped it here; it would not actually be hard: scale the image and the boxes in its labels by the same ratio (see the sketch after this list).
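A minimal sketch of that proportional rescaling, under the assumption that boxes are VOC-style corner tuples (the helper name is mine, not part of the project):
import cv2

def resize_with_boxes(image, boxes, to_size=(416, 416)):
    # scale the image to the network input size, and the boxes by the same factors
    h, w = image.shape[:2]
    sx, sy = to_size[0] / w, to_size[1] / h
    resized = cv2.resize(image, to_size, interpolation=cv2.INTER_CUBIC)
    scaled = [(xmin * sx, ymin * sy, xmax * sx, ymax * sy)
              for (xmin, ymin, xmax, ymax) in boxes]
    return resized, scaled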
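And for the disconnects, a hedged sketch of a workaround: mirror darknet's newest checkpoint to Google Drive from a background thread while training runs (paths are the ones used earlier in this notebook; adjust as needed):
import os, shutil, threading, time

SRC = '/content/darknet/backup/yolov3_last.weights'
DST = '/content/drive/My Drive/cvComp1Realted/backup/yolov3_last.weights'

def mirror_checkpoint(interval_sec=600):
    # copy the latest checkpoint to Drive every 10 minutes; a copy could
    # occasionally catch a partial write, but it works as a safety net
    while True:
        time.sleep(interval_sec)
        if os.path.exists(SRC):
            shutil.copyfile(SRC, DST)

threading.Thread(target=mirror_checkpoint, daemon=True).start()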
One more caveat: according to other posts (e.g. https://blog.csdn.net/qq_44166805/article/details/105876028), the txt files under the dataset's ImageSets/Main folder are supposed to contain only file names without extensions, while the model's configuration file (shown below) points at two other txt files, generated by voc_label.py along with the per-image label txts, which contain the full path of every image. Since this was my first time using the tool, I wrote full paths into my Main txt files directly, and then edited voc_label.py so that the list file it outputs is identical to mine. In practice the two setups are interchangeable; both work.
classes = 2                # total number of classes in the training set
train = scripts/train.txt  # path to the training list generated earlier
valid = scripts/test.txt   # path to the validation list generated earlier
names = data/safe.names    # path to the .names file
backup = backup/
Whatever the accuracy, the pipeline at least runs end to end.