Hyperparameter Evolution

Reposted on 2021-04-15 15:08. Tags: PyTorch / Machine Learning / Deep Learning / Python

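Preface
YOLOv5 provides a method for hyperparameter optimization called Hyperparameter Evolution. It uses a genetic algorithm (GA) to optimize hyperparameters, and we can use it to find hyperparameters better suited to our own task.

The default hyperparameters shipped with YOLOv5 were themselves obtained by running hyperparameter evolution on the COCO dataset. Because evolution consumes a great deal of compute and time, sticking with the defaults is a sensible choice if they already give results that meet your needs.

Hyperparameters in ML control many aspects of training, and finding a set of optimal values can be a challenge. Traditional methods such as grid search can quickly become intractable for the following reasons:

the search space is high-dimensional;
the correlations between dimensions are unknown;
evaluating the fitness at each point is expensive.
These factors make genetic algorithms a suitable candidate for hyperparameter search.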
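To put a number on the first point, the size of a full grid grows exponentially with the number of hyperparameters. The sketch below is plain arithmetic: the count of 25 hyperparameters is the approximate figure given for YOLOv5 in this article, while 5 candidate values per hyperparameter is an assumption chosen only for illustration.

```python
def grid_size(num_hyperparameters: int, values_per_hyperparameter: int) -> int:
    """Number of training runs a full grid search would need."""
    return values_per_hyperparameter ** num_hyperparameters


# Assumed, illustrative numbers: 25 hyperparameters, 5 candidate values each.
runs = grid_size(25, 5)
print(f"full grid search: {runs:.3e} training runs")

# For comparison, evolving for 300 generations trains one model per generation.
print("300 generations:  300 training runs")
```

Even a coarse grid is far out of reach, which is why a search that only samples promising regions is attractive here.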
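1. Initialize the hyperparameters
YOLOv5 has about 25 hyperparameters for various training settings, defined in yaml files under the /data directory. Better initial values lead to better final results, so it is important to initialize them properly before evolving. If you are unsure how to initialize them, simply use the defaults, which were optimized for training on COCO.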

yolov5/data/hyp.scratch.yaml

# Hyperparameters for COCO training from scratch
# python train.py --batch 40 --cfg yolov5m.yaml --weights '' --data coco.yaml --img 640 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials


lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.2 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.5 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 1.0 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.5 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.0 # image mixup (probability)
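The comment on lrf states that the final learning rate is lr0 * lrf. The sketch below shows how that could play out over 300 epochs, assuming a half-cosine ramp from lr0 down to lr0 * lrf. The function name cosine_lr and the exact shape of the ramp are assumptions for illustration, not code taken from train.py.

```python
import math


def cosine_lr(epoch: int, epochs: int, lr0: float = 0.01, lrf: float = 0.2) -> float:
    """Learning rate at `epoch`, ramping from lr0 to lr0 * lrf along a half cosine."""
    ramp = (1 - math.cos(epoch * math.pi / epochs)) / 2  # 0 at the start, 1 at the end
    return lr0 * (1 + ramp * (lrf - 1))


# Start, midpoint and end of a 300-epoch run with the default lr0 and lrf.
for epoch in (0, 150, 300):
    print(epoch, round(cosine_lr(epoch, 300), 6))
```

With the defaults above, the rate starts at lr0 = 0.01 and ends at lr0 * lrf = 0.002.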

2. Define fitness
Fitness is the value we seek to maximize. YOLOv5 defines a fitness function that combines the metrics as a weighted sum.
yolov5/utils/metrics.py

def fitness(x):
    # Model fitness as a weighted combination of metrics
    w = [0.0, 0.0, 0.1, 0.9] # weights for [P, R, mAP@0.5, mAP@0.5:0.95]
    return (x[:, :4] * w).sum(1)
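A worked example may help show what these weights imply. The sketch below is a pure-Python equivalent of the weighted sum for a single result row; the helper name fitness_row and the metric values are made up for illustration.

```python
def fitness_row(metrics, weights=(0.0, 0.0, 0.1, 0.9)):
    """Weighted sum of [P, R, mAP@0.5, mAP@0.5:0.95] for one result row."""
    return sum(m * w for m, w in zip(metrics[:4], weights))


# Made-up metrics for two runs: [P, R, mAP@0.5, mAP@0.5:0.95]
run_a = [0.90, 0.80, 0.60, 0.30]
run_b = [0.60, 0.50, 0.55, 0.35]

print(fitness_row(run_a))  # 0.1 * 0.60 + 0.9 * 0.30
print(fitness_row(run_b))  # 0.1 * 0.55 + 0.9 * 0.35
```

Because precision and recall carry zero weight, run_b scores higher than run_a despite its lower P and R: only the two mAP columns matter, and mAP@0.5:0.95 dominates.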

3. Evolve
Fine-tune the pretrained yolov5s on COCO128:

python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache

To evolve hyperparameters for this scenario, add the --evolve flag:

# Single-GPU
python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve

# Multi-GPU
for i in 0 1 2 3; do
  nohup python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve --device $i > evolve_gpu_$i.log &
done

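# In the multi-GPU case, `nohup` stands for "no hang up": it keeps the command running in the background so that closing the terminal does not stop the program.
# The `&` symbol runs the command in the background.
# The two are usually used together: `nohup command &`.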

# List the running processes:
ps -aux | grep train.py

# Kill a process:
kill -9 <PID>

By default, the code runs the base scenario 300 times, i.e. for 300 generations:
yolov5/train.py

for _ in range(300): # generations to evolve

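The main genetic operators are crossover and mutation. This work uses mutation with a 90% probability and a variance of 0.04 to create new offspring from a combination of the best parents of all previous generations. Results are logged to yolov5/evolve.txt, and the offspring with the highest fitness is saved to yolov5/runs/evolve/hyp_evolved.yaml.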
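A variance of 0.04 corresponds to a standard deviation of 0.2, since 0.2 squared is 0.04. The sketch below shows one way such a mutation step could look: each hyperparameter is perturbed with probability 0.9 by multiplicative Gaussian noise, then clamped to its bounds. It is not the code from train.py. The function name mutate, the meta values, and the clamping details are assumptions for illustration; only the key: [gain, lower_bound, upper_bound] layout follows the description of the meta dictionary quoted later in this article.

```python
import random


def mutate(parent, meta, mp=0.9, sigma=0.2, seed=0):
    """Return a mutated copy of `parent`.

    parent: dict of hyperparameter name -> value
    meta:   dict of name -> (gain, lower_bound, upper_bound)
    mp:     probability that a given hyperparameter is mutated
    sigma:  standard deviation of the multiplicative noise (variance 0.04)
    """
    rng = random.Random(seed)
    child = dict(parent)
    changed = False
    while not changed:  # retry until at least one value changes
        for name, (gain, low, high) in meta.items():
            factor = 1.0
            if rng.random() < mp:
                factor = 1.0 + gain * rng.gauss(0.0, 1.0) * sigma
            value = min(max(parent[name] * factor, low), high)  # clamp to bounds
            changed = changed or value != parent[name]
            child[name] = value
    return child


# Hypothetical bounds for illustration; a gain of 0 keeps a parameter fixed.
meta = {"lr0": (1, 1e-5, 1e-1), "momentum": (0.3, 0.6, 0.98), "mosaic": (0, 0.0, 1.0)}
parent = {"lr0": 0.01, "momentum": 0.937, "mosaic": 1.0}
print(mutate(parent, meta))
```

In this sketch mosaic never changes because its gain is 0, while lr0 and momentum drift around the parent's values and stay inside their bounds.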

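4. Visualize
The results are saved to yolov5/evolve.png, with one subplot per hyperparameter. Hyperparameter values are on the x-axis and fitness is on the y-axis. Yellow indicates a higher concentration of points. A vertical line means the parameter has been fixed and does not mutate. This is controlled by the user through the meta dictionary in train.py, and it is useful for fixing parameters and preventing them from evolving.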

Errors
Error 1: KeyError: 'anchors'
issues/2485
issues/1411
pull/1135

I think commenting the same field in the meta dictionary can work… yes that should work, it will act as if the field does not exist at all. Anchor count will be fixed at 3, and autoanchor will be run if the Best Possible Recall (BPR) dips below threshold, which is set at 0.98 at the moment. Varying the hyps can cause your BPR to vary, so its possible some generations may use it and other not. - - glenn-jocher

EDIT: BTW the reason there are two dictionaries is that the meta dictionary contains gains and bounds applied to each hyperparameter during evolution as key: [gain, lower_bound, upper_bound]. meta is only ever used during evolution, I kept it separated to avoid complicating the hyp dictionary, again not sure if that’s the best design choice, we could merge them, but then each hyp.yaml would be busier and more complicated to read. - - glenn-jocher

The cause is that anchors is commented out in data/hyp.scratch.yaml. Uncommenting it and running again produces the following error.

Error 2: IndexError: index 34 is out of bounds for axis 0 with size 34
pull/1135

Comment out anchors in data/hyp.scratch.yaml, and also comment out anchors in the meta dictionary in train.py. The run then succeeds.
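The KeyError is what one would expect if the evolution code looks up every key of meta in hyp while hyp has no 'anchors' entry. A small pre-flight check can catch that kind of mismatch before a long evolution run. The helper name missing_keys and the shortened dictionaries below are hypothetical; they only illustrate the idea and are not part of train.py.

```python
def missing_keys(hyp: dict, meta: dict) -> list:
    """Keys that meta expects to evolve but hyp does not define."""
    return [k for k in meta if k not in hyp]


# Hypothetical, shortened dictionaries: 'anchors' is absent from hyp only.
hyp = {"lr0": 0.01, "momentum": 0.937}
meta = {"lr0": (1, 1e-5, 1e-1), "momentum": (0.3, 0.6, 0.98), "anchors": (2, 2.0, 10.0)}

print(missing_keys(hyp, meta))  # ['anchors']
```

An empty list means the two dictionaries agree, which is the state reached by commenting out anchors in both places.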

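If you set a value for hyp['anchors'], autoanchor creates new anchors that override any anchor information specified in model.yaml. For example, setting anchors: 5 forces autoanchor to create 5 new anchors per output layer, replacing the existing ones. Hyperparameter evolution will then use this parameter to evolve the optimal number of anchors for you. (issue)
----------------
Copyright notice: this is an original article by CSDN blogger "ayiya_Oese", licensed under CC 4.0 BY-SA. Please include a link to the original source and this notice when reposting.
Original link: https://blog.csdn.net/ayiya_Oese/article/details/115369068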

1. Hyperparameters

In YOLOv3 the hyperparameters are provided in train.py and include some data augmentation settings:

hyp = {'giou': 3.54,  # giou loss gain
       'cls': 37.4,  # cls loss gain
       'cls_pw': 1.0,  # cls BCELoss positive_weight
       'obj': 49.5,  # obj loss gain (*=img_size/320 if img_size != 320)
       'obj_pw': 1.0,  # obj BCELoss positive_weight
       'iou_t': 0.225,  # iou training threshold
       'lr0': 0.00579,  # initial learning rate (SGD=1E-3, Adam=9E-5)
       'lrf': -4.,  # final LambdaLR learning rate = lr0 * (10 ** lrf)
       'momentum': 0.937,  # SGD momentum
       'weight_decay': 0.000484,  # optimizer weight decay
       'fl_gamma': 0.5,  # focal loss gamma
       'hsv_h': 0.0138,  # image HSV-Hue augmentation (fraction)
       'hsv_s': 0.678,  # image HSV-Saturation augmentation (fraction)
       'hsv_v': 0.36,  # image HSV-Value augmentation (fraction)
       'degrees': 1.98,  # image rotation (+/- deg)
       'translate': 0.05,  # image translation (+/- fraction)
       'scale': 0.05,  # image scale (+/- gain)
       'shear': 0.641}  # image shear (+/- deg)
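Note that lrf here is an exponent rather than a multiplier: according to its comment, the final learning rate is lr0 * (10 ** lrf). A quick check of the arithmetic with the values above:

```python
lr0 = 0.00579  # initial learning rate from the hyp dict above
lrf = -4.0     # exponent: final lr = lr0 * (10 ** lrf)

final_lr = lr0 * (10 ** lrf)
print(f"{final_lr:.3e}")
```

So the schedule ends four orders of magnitude below where it starts. This differs from the YOLOv5 yaml shown earlier, where lrf is a plain factor (lr0 * lrf).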

2. Usage

During training, train.py offers an optional --evolve flag that determines whether hyperparameter search and evolution are performed (it is off by default).

Usage is simple:

python train.py --data data/voc.data --cfg cfg/yolov3-tiny.cfg --img-size 416 --epochs 273 --evolve

In practice you need to edit train.py, at around line 444:

for _ in range(1): # generations to evolve

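Change the 1 to the number of generations you want, for example 200. If you leave it unchanged, the result is as shown in the figure in the original post: in effect there is only a single generation.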

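3. How it works

The process is fairly simple: each new generation takes the fittest individual from all previous generations and mutates it. All parameters are mutated simultaneously, following a normal distribution with a 1-sigma of about 20%.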

s = 0.2 # sigma

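Two points need to be understood about the evolution process:

How is a generation judged to be good or bad?
How does the next generation evolve from the previous one?

**Question 1:** the criterion for judging a generation.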

def fitness(x):
    w = [0.0, 0.0, 0.8, 0.2]  # weights for [P, R, mAP, F1]@0.5
    return (x[:, :4] * w).sum(1)

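YOLOv3 judges evolution by the fitness function above: the higher the fitness, the better that generation performs. The fitness is built from four metrics: Precision, Recall, mAP, and F1.

Here w holds the weights. If you care more about mAP, raise the mAP weight; if you care more about F1, put more weight on F1. The code above assigns a weight of 0.8 to mAP and 0.2 to F1.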
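The choice of weights can change which generation counts as the best. The sketch below uses a pure-Python version of the weighted sum; the helper name weighted_fitness and the metric values are made up for illustration.

```python
def weighted_fitness(metrics, weights):
    """Weighted sum of [P, R, mAP, F1] for one generation."""
    return sum(m * w for m, w in zip(metrics, weights))


# Made-up results for two generations: [P, R, mAP, F1]
candidates = {
    "gen_a": [0.50, 0.70, 0.62, 0.55],  # higher mAP
    "gen_b": [0.65, 0.66, 0.58, 0.65],  # higher F1
}

winners = []
for weights in ([0.0, 0.0, 0.8, 0.2], [0.0, 0.0, 0.2, 0.8]):
    best = max(candidates, key=lambda name: weighted_fitness(candidates[name], weights))
    winners.append(best)
    print(weights, "->", best)
```

With the mAP-heavy weights gen_a wins; swapping the two weights makes gen_b the winner, even though the underlying metrics have not changed.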

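**Question 2:** how does evolution proceed?

Two parameters matter during evolution:

The first is parent, which can be single or weighted and determines how the parent is chosen from earlier generations. With single, only the best individual from the earlier generations is selected: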

if parent == 'single' or len(x) == 1:
    x = x[fitness(x).argmax()]

With weighted, the top 10 results by fitness are combined into a weighted average that becomes the parent of the next generation:

elif parent == 'weighted':  # weighted combination
    n = min(10, len(x))  # number to merge
    x = x[np.argsort(-fitness(x))][:n]  # top n mutations
    w = fitness(x) - fitness(x).min()  # weights
    x = (x * w.reshape(n, 1)).sum(0) / w.sum()  # new parent
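A small worked example may make the weighting clearer. The sketch below is a pure-Python rendering of the same idea, not the original NumPy code. The names row_fitness and weighted_parent, the data, and the guard for equal fitness values are all assumptions added for illustration.

```python
def row_fitness(row, weights=(0.0, 0.0, 0.8, 0.2)):
    """Fitness from the first four columns [P, R, mAP, F1] of a result row."""
    return sum(m * w for m, w in zip(row, weights))  # zip stops after 4 columns


def weighted_parent(rows, n_max=10):
    """Fitness-weighted average of the top rows, column by column."""
    top = sorted(rows, key=row_fitness, reverse=True)[:n_max]
    scores = [row_fitness(r) for r in top]
    weights = [s - min(scores) for s in scores]  # worst of the top n gets weight 0
    total = sum(weights)
    if total == 0:  # guard added in this sketch: all candidates equally fit
        return list(top[0])
    return [sum(w * r[i] for w, r in zip(weights, top)) / total
            for i in range(len(top[0]))]


# Made-up rows: [P, R, mAP, F1, lr0]
rows = [
    [0.5, 0.5, 0.60, 0.50, 0.010],  # fitness 0.58
    [0.5, 0.5, 0.50, 0.50, 0.020],  # fitness 0.50
    [0.5, 0.5, 0.40, 0.50, 0.040],  # fitness 0.42
]
parent = weighted_parent(rows)
print(parent[4])  # merged lr0
```

Because the minimum fitness is subtracted, the weakest of the selected rows contributes nothing: the merged lr0 is a 2:1 blend of 0.010 and 0.020, and 0.040 is ignored.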

The second parameter is method, which can be 1, 2, or 3 and selects one of three mutation modes:

# Mutate
method = 2
s = 0.2  # 20% sigma
np.random.seed(int(time.time()))
g = np.array([1, 1, 1, 1, 1, 1, 1, 0, .1,
              1, 0, 1, 1, 1, 1, 1, 1, 1])  # gains, acting like per-parameter weights
ng = len(g)
if method == 1:
    v = (np.random.randn(ng) * np.random.random() * g * s + 1) ** 2.0
elif method == 2:
    v = (np.random.randn(ng) * np.random.random(ng) * g * s + 1) ** 2.0
elif method == 3:
    v = np.ones(ng)
    while all(v == 1):  # repeat until something changes, to avoid duplicates
        r = (np.random.random(ng) < 0.1) * np.random.randn(ng)  # 10% mutation probability
        v = (g * s * r + 1) ** 2.0
for i, k in enumerate(hyp.keys()):
    hyp[k] = x[i + 7] * v[i]  # mutate
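To see what the gains do, the sketch below reproduces the method 2 formula, (randn * rand * g * s + 1) ** 2, with Python's random module in place of NumPy. The function name mutation_factors, the fixed seed, and the three example gains are assumptions for illustration.

```python
import random


def mutation_factors(gains, sigma=0.2, seed=0):
    """Per-parameter multipliers in the style of method 2."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, 1.0) * rng.random() * g * sigma + 1) ** 2.0
            for g in gains]


gains = [1, 0, 0.1]  # full gain, frozen, damped
factors = mutation_factors(gains)
print(factors)
```

A gain of 0 always yields a factor of exactly 1, so that hyperparameter is left untouched, and a small gain such as 0.1 keeps the factor close to 1. Squaring keeps every factor non-negative, so a mutation cannot flip the sign of a hyperparameter.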

In addition, to keep mutation from pushing parameters into clearly unreasonable ranges, each value is bounded by a range and anything outside it is clipped:

# Clip to limits
keys = ['lr0', 'iou_t', 'momentum', 'weight_decay', 'hsv_s', 'hsv_v', 'translate', 'scale', 'fl_gamma']
limits = [(1e-5, 1e-2), (0.00, 0.70), (0.60, 0.98), (0, 0.001), (0, .9), (0, .9), (0, .9), (0, .9), (0, 3)]
for k, v in zip(keys, limits):
    hyp[k] = np.clip(hyp[k], v[0], v[1])
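The effect of clipping is easy to check on a few values. The sketch below uses a plain-Python clamp in place of np.clip and three of the limits listed above; the mutated values are made up for illustration.

```python
def clip(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(value, high))


limits = {"lr0": (1e-5, 1e-2), "momentum": (0.60, 0.98), "fl_gamma": (0, 3)}
mutated = {"lr0": 0.025, "momentum": 0.45, "fl_gamma": 1.2}  # made-up values after mutation

clipped = {k: clip(v, *limits[k]) for k, v in mutated.items()}
print(clipped)  # {'lr0': 0.01, 'momentum': 0.6, 'fl_gamma': 1.2}
```

Values above the upper limit or below the lower limit snap to the nearest bound, and values already inside the range pass through unchanged.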

[Figure: visualization of the final hyperparameter search results]

References:

Official issue: https://github.com/ultralytics/yolov3/issues/392

Official code: https://github.com/ultralytics/yolov3

This article was shared from the WeChat public account GiantPandaCV (BBuf233). Author: pprp

Originally published: 2020-01-19
