[PocketFlow] Fixing the bug of extremely low mAP on COCO

This article is a repost. 2019-03-23 20:58. Deep learning / Object detection / PocketFlow

1. Problem

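After last time's bug where training hung, I have now hit a bug where the AP at evaluation time is extremely low. How low exactly? The Pelee paper reports that training on the COCO dataset with a batch size of 128 for 70K iterations gives an AP@0.5:0.95 of 22.4, whereas after repeated fine-tuning with a batch size of 32, my final AP only rose from 2.9 to 3.7... The figure below shows the training process:

[Figure: training process]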

2. Solution

The loss and accuracy actually look fine, but the AP just won't go up. I roughly came up with four places where the problem could be:

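Training is wrong
I tried a variety of learning rates, and at each one I trained until both AP and loss stopped changing before moving to the next, so I can't think of anything else to try on the training side... Ruled out.
Data is wrong
I converted the COCO data myself, and honestly I'm still a little uneasy about that part. Then again, I already checked the data while chasing the previous training bug, and it looked fine. Checking it again now would be a lot of work, so: doubtful, set aside for now.
Model is wrong
The model successfully reproduces the results on the VOC dataset. Ruled out.
Computation is wrong
The COCO evaluation script was adapted from the VOC one: it generates a JSON file, and the official cocoapi then computes the metrics. So there is a good chance that something is wrong in the script that generates the JSON file.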

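To sum up, the first thing to check is whether the evaluation has a computation error.
But I couldn't spot anything... So I figured I would modify the original voc_eval.py to fit COCO and then evaluate the VOC way. The two metrics are computed differently, but the numbers shouldn't be too far apart (VOC-style AP at ovthresh=0.5 is closest to COCO's AP at IoU=0.50). If the AP barely changes, the computation is fine and the data needs checking (the case I dread most, because it is a lot of work and data is easy to get wrong; luckily I had already fallen into the mislabeled-gt pit earlier, scary just to think about). Otherwise the evaluation computation is at fault.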
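For reference, the VOC-style metric mentioned here can be sketched as follows. This is a minimal pure-Python version of the 11-point interpolated AP, the variant that voc_eval.py selects with use_07_metric=True. The function name voc07_ap and the toy recall/precision values are illustrative assumptions, not code from this project.

```python
def voc07_ap(rec, prec):
    """11-point interpolated AP (VOC2007 style).

    rec, prec: equal-length sequences of cumulative recall and precision,
    one entry per detection after sorting detections by descending score.
    """
    ap = 0.0
    for i in range(11):
        t = i / 10.0  # recall thresholds 0.0, 0.1, ..., 1.0
        # Best precision among all operating points with recall >= t
        candidates = [p for r, p in zip(rec, prec) if r >= t]
        ap += (max(candidates) if candidates else 0.0) / 11.0
    return ap


# Toy example (made-up values): two detections, the first correct,
# the second covering the remaining ground truth at lower precision.
toy_ap = voc07_ap([0.5, 1.0], [1.0, 0.5])
print(round(toy_ap, 4))
```

COCO's headline AP instead averages over IoU thresholds from 0.50 to 0.95, so the two numbers are only roughly comparable, which is all the sanity check above needs.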

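# Structure of voc_eval.py
def do_python_eval(dataset_path, pred_path, use_07=True):
    aps = []
    # For each class, run:
    rec, prec, ap = voc_eval(filename,  # per-class prediction file result_x.txt
        os.path.join(dataset_path, anno_files),  # annotation file of each image
        os.path.join(dataset_path, all_images_file),  # file listing all image names
        cls_name,  # class name
        cache_path,  # used to cache the annotations of all images
        ovthresh=0.5,
        use_07_metric=use_07)
    aps += [ap]
    # Save rec, prec and ap to the class's xx_pr.pkl file
    # Print AP and mAP

def voc_eval(detpath,
             annopath,
             imagesetfile,
             classname,
             cachedir,
             ovthresh=0.5,
             use_07_metric=True):
    # 1. Read all image file names from imagesetfile
    # 2. If annots.pkl does not exist in cachedir, pack the annotations by file name and write them to annots.pkl; otherwise load the file into the recs dict (keyed by file name)
    # 3. Extract the annotations of this class and store them, grouped by file name, in the class_recs dict
    # 4. Read the detections of this class from the txt file (detpath)
    # 5. Compute rec and prec, then compute ap with voc_ap
    # 6. Return rec, prec, ap

# To adapt this to COCO, we need to:
# get all image_ids
# extract the annotations from minival.json by image_id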
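The two COCO-specific steps listed above could look roughly like the sketch below. It assumes the standard COCO annotation layout (top-level "images", "annotations" and "categories" lists) and builds the same kind of recs dict that voc_eval expects, keyed by image_id instead of file name. The helper name group_annotations and the tiny in-memory sample are hypothetical; in practice the dict would come from json.load on minival.json.

```python
from collections import defaultdict


def group_annotations(coco):
    """Group COCO ground-truth boxes by image_id.

    coco: dict in the standard COCO annotation layout.
    Returns {image_id: [{'name': str, 'bbox': [x, y, w, h]}, ...]}.
    """
    # Look names up by the ORIGINAL (non-contiguous) COCO category id
    id_to_name = {c['id']: c['name'] for c in coco['categories']}
    grouped = defaultdict(list)
    for ann in coco['annotations']:
        grouped[ann['image_id']].append({
            'name': id_to_name[ann['category_id']],
            'bbox': ann['bbox'],
        })
    # Keep images without annotations too, with an empty list
    return {img['id']: grouped.get(img['id'], []) for img in coco['images']}


# Made-up miniature dataset, for illustration only
sample = {
    'images': [{'id': 213035}, {'id': 1}],
    'categories': [{'id': 1, 'name': 'person'}, {'id': 87, 'name': 'scissors'}],
    'annotations': [
        {'image_id': 213035, 'category_id': 87, 'bbox': [314.25, 168.05, 79.57, 53.75]},
        {'image_id': 213035, 'category_id': 1, 'bbox': [0.0, 110.12, 311.59, 235.62]},
    ],
}
recs = group_annotations(sample)
print(recs[213035][0]['name'])
```

Building the name lookup from the "categories" list, rather than from a hand-written table, keeps the grouping independent of any training-time relabeling.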

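After working through the analysis I finished changing the code, and it ran (very slowly). The painful part: once it had finally read all the files, it threw an error, a KeyError: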

# Example of the per-image records, keyed by image_id
dic = {213035: [{'name': 'scissors', 'bbox': [314.25, 168.05, 79.57, 53.75]}, {'name': 'scissors', 'bbox': [238.06, 170.62, 89.75, 64.11]}, {'name': 'person', 'bbox': [0.0, 110.12, 311.59, 235.62]}, {'name': 'person', 'bbox': [195.75, 2.88, 260.04, 109.39]}, {'name': 'bowl', 'bbox': [177.04, 0.0, 66.69, 135.89]}, {'name': 'person', 'bbox': [305.0, 53.24, 330.51, 373.76]}]}
r = [obj for obj in dic[213035] if obj['name'] == 'person']
# The KeyError roughly means that some obj has no 'name' key
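One way to track down such a KeyError is to separate the records that lack a 'name' key instead of indexing blindly. The sketch below assumes the same record structure as above; the helper name split_by_name and the nameless entry with category_id 82 are made up for illustration.

```python
def split_by_name(recs, image_id, cls_name):
    """Return (objects of class cls_name, objects that have no 'name' key)."""
    matched, unnamed = [], []
    for obj in recs.get(image_id, []):
        if 'name' not in obj:
            unnamed.append(obj)  # these are the records that raise KeyError
        elif obj['name'] == cls_name:
            matched.append(obj)
    return matched, unnamed


# Illustrative data: the last entry is a hypothetical record whose
# category id could not be translated into a name.
recs = {213035: [
    {'name': 'person', 'bbox': [0.0, 110.12, 311.59, 235.62]},
    {'name': 'bowl', 'bbox': [177.04, 0.0, 66.69, 135.89]},
    {'category_id': 82, 'bbox': [1.0, 2.0, 3.0, 4.0]},
]}
matched, unnamed = split_by_name(recs, 213035, 'person')
print(len(matched), unnamed)
```

Printing the nameless records, along with their image_id, shows directly which annotations to look up in the JSON file.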

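I printed the image_id of that obj and looked it up in minival.json, and found that the category of this annotation was 82 or 83... Doesn't COCO have 80 classes? I was completely baffled...
Then it suddenly clicked, and I seemed to have found the problem. The COCO dataset does have 80 classes, but they are not numbered consecutively: some IDs are skipped, so the real IDs run from 1 to 90. This project had applied a conversion earlier: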

labelmap = {
    "none_of_the_above": 0,
    "1": 1,
    "2": 2,
    "3": 3,
    "4": 4,
    "5": 5,
    "6": 6,
    "7": 7,
    "8": 8,
    "9": 9,
    "10": 10,
    "11": 11,
    "13": 12,
    "14": 13,
    "15": 14,
    "16": 15,
    "17": 16,
    "18": 17,
    "19": 18,
    "20": 19,
    "21": 20,
    "22": 21,
    "23": 22,
    "24": 23,
    "25": 24,
    "27": 25,
    "28": 26,
    "31": 27,
    "32": 28,
    "33": 29,
    "34": 30,
    "35": 31,
    "36": 32,
    "37": 33,
    "38": 34,
    "39": 35,
    "40": 36,
    "41": 37,
    "42": 38,
    "43": 39,
    "44": 40,
    "46": 41,
    "47": 42,
    "48": 43,
    "49": 44,
    "50": 45,
    "51": 46,
    "52": 47,
    "53": 48,
    "54": 49,
    "55": 50,
    "56": 51,
    "57": 52,
    "58": 53,
    "59": 54,
    "60": 55,
    "61": 56,
    "62": 57,
    "63": 58,
    "64": 59,
    "65": 60,
    "67": 61,
    "70": 62,
    "72": 63,
    "73": 64,
    "74": 65,
    "75": 66,
    "76": 67,
    "77": 68,
    "78": 69,
    "79": 70,
    "80": 71,
    "81": 72,
    "82": 73,
    "84": 74,
    "85": 75,
    "86": 76,
    "87": 77,
    "88": 78,
    "89": 79,
    "90": 80
}
COCO_LABELS = {
    "bench": (14, 'outdoor'),
    "skateboard": (37, 'sports'),
    "toothbrush": (80, 'indoor'),
    "person": (1, 'person'),
    "donut": (55, 'food'),
    "none": (0, 'background'),
    "refrigerator": (73, 'appliance'),
    "horse": (18, 'animal'),
    "elephant": (21, 'animal'),
    "book": (74, 'indoor'),
    "car": (3, 'vehicle'),
    "keyboard": (67, 'electronic'),
    "cow": (20, 'animal'),
    "microwave": (69, 'appliance'),
    "traffic light": (10, 'outdoor'),
    "tie": (28, 'accessory'),
    "dining table": (61, 'furniture'),
    "toaster": (71, 'appliance'),
    "baseball glove": (36, 'sports'),
    "giraffe": (24, 'animal'),
    "cake": (56, 'food'),
    "handbag": (27, 'accessory'),
    "scissors": (77, 'indoor'),
    "bowl": (46, 'kitchen'),
    "couch": (58, 'furniture'),
    "chair": (57, 'furniture'),
    "boat": (9, 'vehicle'),
    "hair drier": (79, 'indoor'),
    "airplane": (5, 'vehicle'),
    "pizza": (54, 'food'),
    "backpack": (25, 'accessory'),
    "kite": (34, 'sports'),
    "sheep": (19, 'animal'),
    "umbrella": (26, 'accessory'),
    "stop sign": (12, 'outdoor'),
    "truck": (8, 'vehicle'),
    "skis": (31, 'sports'),
    "sandwich": (49, 'food'),
    "broccoli": (51, 'food'),
    "wine glass": (41, 'kitchen'),
    "surfboard": (38, 'sports'),
    "sports ball": (33, 'sports'),
    "cell phone": (68, 'electronic'),
    "dog": (17, 'animal'),
    "bed": (60, 'furniture'),
    "toilet": (62, 'furniture'),
    "fire hydrant": (11, 'outdoor'),
    "oven": (70, 'appliance'),
    "zebra": (23, 'animal'),
    "tv": (63, 'electronic'),
    "potted plant": (59, 'furniture'),
    "parking meter": (13, 'outdoor'),
    "spoon": (45, 'kitchen'),
    "bus": (6, 'vehicle'),
    "laptop": (64, 'electronic'),
    "cup": (42, 'kitchen'),
    "bird": (15, 'animal'),
    "sink": (72, 'appliance'),
    "remote": (66, 'electronic'),
    "bicycle": (2, 'vehicle'),
    "tennis racket": (39, 'sports'),
    "baseball bat": (35, 'sports'),
    "cat": (16, 'animal'),
    "fork": (43, 'kitchen'),
    "suitcase": (29, 'accessory'),
    "snowboard": (32, 'sports'),
    "clock": (75, 'indoor'),
    "apple": (48, 'food'),
    "mouse": (65, 'electronic'),
    "bottle": (40, 'kitchen'),
    "frisbee": (30, 'sports'),
    "carrot": (52, 'food'),
    "bear": (22, 'animal'),
    "hot dog": (53, 'food'),
    "teddy bear": (78, 'indoor'),
    "knife": (44, 'kitchen'),
    "train": (7, 'vehicle'),
    "vase": (76, 'indoor'),
    "banana": (47, 'food'),
    "motorcycle": (4, 'vehicle'),
    "orange": (50, 'food')
}
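Note that labelmap only goes one way, from the original COCO category id to a contiguous training index. The official cocoapi matches detections to ground truth by the original category_id, so results written in the contiguous 1..80 space need the inverse mapping. A minimal sketch of that inverse follows; the names SKIPPED_IDS, TRAIN_TO_COCO and to_coco_result are assumptions made here, and the score value is made up. Only the results layout (image_id, category_id, bbox as [x, y, w, h], score) is the standard COCO detection format.

```python
# Category ids in 1..90 that COCO detection does not use
SKIPPED_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}

# Same content as labelmap above, built programmatically:
# original COCO id -> contiguous training index (1..80)
COCO_TO_TRAIN = {}
for coco_id in range(1, 91):
    if coco_id not in SKIPPED_IDS:
        COCO_TO_TRAIN[coco_id] = len(COCO_TO_TRAIN) + 1

# Inverse mapping, needed when writing the results JSON
TRAIN_TO_COCO = {train: coco for coco, train in COCO_TO_TRAIN.items()}


def to_coco_result(image_id, train_label, bbox_xywh, score):
    """Build one entry of a COCO detection results file."""
    return {
        'image_id': image_id,
        'category_id': TRAIN_TO_COCO[train_label],  # back to the original id
        'bbox': list(bbox_xywh),
        'score': float(score),
    }


# Training indices that happen to equal their COCO id
unchanged = [t for t, c in sorted(TRAIN_TO_COCO.items()) if t == c]
print(unchanged)

# 'scissors' is training index 77 in COCO_LABELS; the score is made up
entry = to_coco_result(213035, 77, [314.25, 168.05, 79.57, 53.75], 0.9)
print(entry['category_id'])
```

Without the inverse mapping, every class after the first skipped id (12) is written under the wrong category_id, so only the classes with indices 1 to 11 can match the ground truth.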

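Damn, I forgot to convert the IDs back when generating the JSON file (thinking about it, it would probably work without converting back and forth at all)... So only the classes with IDs 1 to 11 could match. After a simple fix, I got the real prediction results: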

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.199
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.343
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.201
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.030
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.200
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.365
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.201
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.295
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.314
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.054
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.342
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.548

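There is still a gap compared with the 22.4 and 38.3 in the paper, but at least the numbers are no longer as frightening as at the start, and further fine-tuning guided by the correct AP should push them up a bit more.
I guess I solved it by accident?
This bug also took a week to track down, but I was clearly less panicked than last time.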
