YOLOv5: adding a detection layer
Reference 1: https://blog.csdn.net/weixin_41868104/article/details/111596851
Reference 2: https://blog.csdn.net/jacke121/article/details/118714043
Original model structure
Original configuration file
```yaml
# YOLOv5 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, C3, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)
   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
```
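A side note on the `depth_multiple` / `width_multiple` parameters above: when the model is built, `parse_model` in `models/yolo.py` multiplies each entry's repeat count by `depth_multiple` and its output channels by `width_multiple` (rounded up to a multiple of 8), which is why a `[-1, 9, C3, [512]]` entry becomes 3 C3 blocks with 256 channels in yolov5s. A simplified reproduction of that scaling:

```python
# a simplified reproduction of the layer scaling done by parse_model in models/yolo.py
import math

def make_divisible(x, divisor=8):
    # round channel count up to the nearest multiple of `divisor`
    return math.ceil(x / divisor) * divisor

def scale_layer(n, c_out, depth_multiple=0.33, width_multiple=0.50):
    # repeats scale with depth_multiple, output channels with width_multiple
    n = max(round(n * depth_multiple), 1) if n > 1 else n
    c_out = make_divisible(c_out * width_multiple, 8)
    return n, c_out

print(scale_layer(9, 512))  # the `[-1, 9, C3, [512]]` entry -> (3, 256) in yolov5s
```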
New model structure
New configuration file
```yaml
# YOLOv5 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [5,6, 8,14, 15,11]  # P2/4
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],  # 2
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, C3, [256]],  # 4
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],  # 6
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],  # 8
   [-1, 3, C3, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],  # 10
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 11
   [[-1, 6], 1, Concat, [1]],  # 12 cat backbone P4
   [-1, 3, C3, [512, False]],  # 13
   [-1, 1, Conv, [512, 1, 1]],  # 14
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 15
   [[-1, 4], 1, Concat, [1]],  # 16 cat backbone P3
   [-1, 3, C3, [512, False]],  # 17
   [-1, 1, Conv, [256, 1, 1]],  # 18
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 19
   [[-1, 2], 1, Concat, [1]],  # 20 cat backbone P2
   [-1, 3, C3, [256, False]],  # 21 (P2/4-xsmall)
   [-1, 1, Conv, [256, 3, 2]],  # 22
   [[-1, 18], 1, Concat, [1]],  # 23 cat head P3
   [-1, 3, C3, [256, False]],  # 24 (P3/8-small)
   [-1, 1, Conv, [256, 3, 2]],  # 25
   [[-1, 14], 1, Concat, [1]],  # 26 cat head P4
   [-1, 3, C3, [512, False]],  # 27 (P4/16-medium)
   [-1, 1, Conv, [512, 3, 2]],  # 28
   [[-1, 10], 1, Concat, [1]],  # 29 cat head P5
   [-1, 3, C3, [1024, False]],  # 30 (P5/32-large)
   [[21, 24, 27, 30], 1, Detect, [nc, anchors]],  # Detect(P2, P3, P4, P5)
  ]
```
Main modifications:
1. Add a set of anchors sized for small objects (a way to derive dataset-specific anchors is sketched after this list).
2. Start feature fusion from layer 2 of the backbone (the P2/4 feature map).
3. Add a detection head for that layer-2 feature map.
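The anchors added above ([5,6, 8,14, 15,11]) are hand-picked small boxes. If you would rather derive anchors from your own data, yolov5's autoanchor utility can cluster 12 anchors (4 scales × 3 anchors each). A minimal sketch, assuming it is run from the yolov5 repo root and that `data/coco128.yaml` stands in for your dataset yaml:

```python
# a minimal sketch: cluster 12 anchors (4 scales x 3 each) with yolov5's
# own k-means/genetic-evolution utility; the dataset yaml is a placeholder
from utils.autoanchor import kmean_anchors

anchors = kmean_anchors('data/coco128.yaml', n=12, img_size=640,
                        thr=4.0, gen=1000, verbose=False)
print(anchors.round())  # paste these into `anchors:`, smallest scale first
```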
As the original author (Reference 1) explains: after layer 17 the feature map is upsampled again so that it keeps growing, and at layer 20 the resulting 160×160 feature map is concatenated with the layer-2 feature map from the backbone, yielding a higher-resolution feature map on which small objects are detected.
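To sanity-check the new config, it can be built with the repo's `Model` class and inspected for four output scales (strides 4/8/16/32, i.e. 160×160 down to 20×20 grids at a 640×640 input). A minimal sketch, run from the yolov5 repo root; `models/yolov5s-4head.yaml` is just a placeholder for wherever the config above was saved:

```python
# build the modified model and confirm it now has four detection scales
import torch
from models.yolo import Model

model = Model('models/yolov5s-4head.yaml', ch=3, nc=80)  # placeholder path
detect = model.model[-1]                                 # the Detect() module
print(detect.nl, detect.stride.tolist())                 # expect 4 and [4.0, 8.0, 16.0, 32.0]

# in training mode the model returns one raw tensor per scale;
# the new P2 head produces a 160x160 grid for a 640x640 input
for out in model(torch.zeros(1, 3, 640, 640)):
    print(out.shape)  # e.g. torch.Size([1, 3, 160, 160, 85]) for the P2 scale
```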
Results
yolov5s1: the original yolov5s model
yolov5s2: the yolov5s model with the added detection head
Because yolov5s2 cannot fully reuse the existing pretrained weights, it has to be retrained, and its accuracy is not necessarily higher than yolov5s.
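Concretely, this is the partial transfer that `train.py` performs when given `--weights yolov5s.pt` together with `--cfg` pointing at the new yaml: only tensors whose names and shapes still match are copied over. A minimal sketch of that step, with placeholder paths:

```python
# partially reuse yolov5s.pt weights for the enlarged model; the backbone
# transfers cleanly, while much of the reshuffled head starts from scratch
import torch
from models.yolo import Model
from utils.torch_utils import intersect_dicts

model = Model('models/yolov5s-4head.yaml', ch=3, nc=80)  # placeholder path
ckpt = torch.load('yolov5s.pt', map_location='cpu')      # official yolov5s checkpoint
csd = ckpt['model'].float().state_dict()
csd = intersect_dicts(csd, model.state_dict())           # keep tensors with matching name and shape
model.load_state_dict(csd, strict=False)
print(f'transferred {len(csd)}/{len(model.state_dict())} items')
```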
The results below represent only my own experiments.
|            | yolov5s1 | yolov5s2 | yolov5m  |
|------------|----------|----------|----------|
| layers     | 283      | 341      | 391      |
| parameters | 7063542  | 7724296  | 21056406 |
| gpu_mem    | 10 GB    | 11.4 GB  | 12.6 GB  |
| weights    | 14 MB    | 15 MB    | 41 MB    |
| GFLOPs     | 16.4     | 27.5     | 50.4     |
| time       | 0.0017   | 0.0023   | 0.034    |
| max AP@0.5 | 0.8856   | 0.89846  | 0.9016   |
Additional notes
The paper "TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-Captured Scenarios", which targets detection of ground objects in high-altitude drone imagery, also adopts this four-detection-head structure.
Paper: https://arxiv.org/pdf/2108.11539.pdf
Paper walkthrough (Chinese): https://zhuanlan.zhihu.com/p/410166419