Training caffe-ssd on your own data

Reposted article, 2019-08-05 16:44. Tags: SSD / deep learning / object detection / Caffe

1. Setting up the environment

See the previous post: installing caffe for python3 in a cuda:9.0-cudnn7-devel-ubuntu16.04 docker image

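2. Preparing the data

2.1 Collecting data

This project is about detecting banners on vehicle bodies. Most of the data was crawled from Baidu Images; a small part came from Weibo or from frames extracted from videos. Because images of non-compliant vehicle banners are hard to obtain, most of the pictures show vehicles transporting disaster-relief supplies.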

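2.2 Annotating the data

I annotated the images with labelImg. Since there is only one class to detect, annotation goes fairly quickly once a default label and auto-save are set. Two things need attention:

labelImg shows each image full-screen by default when it is opened (this may just be a setting). Objects in some small images are already hard to make out, and annotating them after forced enlargement may hurt model accuracy, because I did not set the difficult attribute. I therefore switched every image to its original size before annotating. You can edit labelImg.py to add a shortcut key for displaying the original size.
The annotation boxes are all horizontal (axis-aligned) rectangles, so as soon as a banner is tilted slightly, most of the area inside the box contains no object. If the later augmentation includes rotation, the earlier boxes no longer fit and the images have to be annotated again. I recommend doing the rotation augmentation first, annotating everything in one pass, and only then applying the other augmentations.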
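The second note can be checked with a little geometry. The sketch below is not part of the original workflow; it assumes an idealized banner that is an exact `width` x `height` rectangle and computes how much of its tight horizontal bounding box it still fills after being tilted. The function names and the 400x40 example size are made up for illustration.

```python
import math


def rotated_box_aabb(width, height, angle_deg):
    """Size of the tight horizontal box around a width x height rectangle
    that has been rotated by angle_deg about its centre."""
    theta = math.radians(angle_deg)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    return width * c + height * s, width * s + height * c


def fill_ratio(width, height, angle_deg):
    """Fraction of the horizontal box that the rotated rectangle covers."""
    aabb_w, aabb_h = rotated_box_aabb(width, height, angle_deg)
    return (width * height) / (aabb_w * aabb_h)


if __name__ == "__main__":
    # A long, thin banner loses coverage quickly as the tilt grows.
    for angle in (0, 5, 10, 20):
        print(angle, round(fill_ratio(400, 40, angle), 3))
```

For a long thin shape like this, a tilt of only 10 degrees already leaves well under half of the horizontal box covered by the banner, which is why boxes drawn before a rotation cannot simply be reused afterwards.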

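2.3 Data augmentation

I used this repo for data augmentation. The augmentations I applied were small-angle rotation, horizontal mirroring, and changes to the HSV values. The quick-start.ipynb in the repo is easy to adapt to your own data. The result was 3041 training images and 605 test images. An excerpt of the code: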

import os
import xml.etree.ElementTree as ET

import cv2
import numpy as np

# RandomHorizontalFlip is provided by the augmentation repo;
# import it the same way its quick-start.ipynb does.


# Parse a Pascal VOC xml file and return its bounding boxes
def getAnnotBoxLoc(AnotPath):
    tree = ET.ElementTree(file=AnotPath)
    root = tree.getroot()
    ObjectSet = root.findall('object')
    ObjBndBoxSet = {}
    for Object in ObjectSet:
        ObjName = Object.find('name').text
        BndBox = Object.find('bndbox')
        x1 = float(BndBox.find('xmin').text)
        y1 = float(BndBox.find('ymin').text)
        x2 = float(BndBox.find('xmax').text)
        y2 = float(BndBox.find('ymax').text)
        BndBoxLoc = [x1, y1, x2, y2]
        # Group the boxes by class name
        ObjBndBoxSet.setdefault(ObjName, []).append(BndBoxLoc)
    # One class: return an (N, 4) array; otherwise return the dict keyed by class name
    if len(ObjBndBoxSet) == 1:
        return np.array(next(iter(ObjBndBoxSet.values())))
    return ObjBndBoxSet


data_path = ''      # directory of the original images
xml_path = ''       # directory of the original annotations
aug_data_path = ''  # output directory for the augmented images
aug_xml_path = ''   # output directory for the augmented annotations
train_image_list = os.listdir(data_path)
print(len(train_image_list))

for image_name in train_image_list:
    # splitext keeps file names that contain extra dots intact
    stem = os.path.splitext(image_name)[0]
    img = cv2.imread(os.path.join(data_path, image_name))
    bboxes = getAnnotBoxLoc(os.path.join(xml_path, stem + '.xml'))
    # Probability 1: every image is flipped
    img_, bboxes_ = RandomHorizontalFlip(1)(img.copy(), bboxes.copy())
    # astype returns a new array, so the result has to be assigned
    bboxes_ = bboxes_.astype(np.int64)
    # Save the image
    cv2.imwrite(os.path.join(aug_data_path, stem + '_hf.jpg'), img_)
    # Save the annotation: reuse the original xml, then overwrite file name and coordinates
    tree = ET.ElementTree(file=os.path.join(xml_path, stem + '.xml'))
    root = tree.getroot()

    name = root.find('filename')
    name.text = stem + '_hf.jpg'

    objectSet = root.findall('object')
    for i, obj in enumerate(objectSet):
        bndBox = obj.find('bndbox')
        bndBox.find('xmin').text = str(int(bboxes_[i][0]))
        bndBox.find('ymin').text = str(int(bboxes_[i][1]))
        bndBox.find('xmax').text = str(int(bboxes_[i][2]))
        bndBox.find('ymax').text = str(int(bboxes_[i][3]))
    tree.write(os.path.join(aug_xml_path, stem + '_hf.xml'))
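A quick way to sanity-check the flipped annotations is to mirror a box by hand and compare it with what was written to the xml. The helper below is a minimal sketch, not the augmentation repo's implementation: `hflip_boxes` is a hypothetical name, and it assumes the simple convention x' = W - x (some libraries mirror around the image centre with a one-pixel offset, so small differences are possible).

```python
def hflip_boxes(boxes, image_width):
    """Mirror [xmin, ymin, xmax, ymax] boxes around the vertical centre line.

    Assumes the convention x' = image_width - x.
    """
    flipped = []
    for xmin, ymin, xmax, ymax in boxes:
        # The old right edge becomes the new left edge and vice versa,
        # which keeps xmin <= xmax after the flip.
        flipped.append([image_width - xmax, ymin, image_width - xmin, ymax])
    return flipped


if __name__ == "__main__":
    print(hflip_boxes([[10, 20, 110, 60]], image_width=640))
```

Flipping twice should give back the original boxes, which makes a convenient check when adapting the loop to other augmentations.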

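3. Processing the data

Under caffe's data directory, create two new directories, banner (your own dataset) and VOCdevkit, so that the data directory looks like this: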

.
├── banner
├── cifar10
├── coco
├── ilsvrc12
├── ILSVRC2016
├── mnist
├── VOC0712
└── VOCdevkit
    └── banner
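        ├── Annotations (all annotation files)
        ├── ImageSets
        │   └── Main (contains test.txt and trainval.txt, which list the file names of all test and training images respectively, one name per line; these are usually generated by a matlab script, but here I split them myself)
        ├── JPEGImages (all images)
        └── lmdb (where the generated lmdb files are stored)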
            ├── banner_test_lmdb
            └── banner_trainval_lmdb
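The post does not show how the file names in ImageSets/Main were split. One possible way, sketched below, is a seeded random split; `split_names`, `write_list`, and the 20% test fraction are assumptions made for illustration, not the author's procedure. If augmented copies are in the same folder, keep each copy on the same side of the split as its source image.

```python
import os
import random


def split_names(image_dir, test_fraction=0.2, seed=0):
    """Return (trainval_names, test_names) without file extensions."""
    names = sorted(
        os.path.splitext(f)[0]
        for f in os.listdir(image_dir)
        if f.lower().endswith((".jpg", ".jpeg", ".png"))
    )
    # A fixed seed makes the split reproducible.
    random.Random(seed).shuffle(names)
    n_test = int(len(names) * test_fraction)
    return names[n_test:], names[:n_test]


def write_list(names, path):
    """Write one file name per line, the format expected in ImageSets/Main."""
    with open(path, "w") as f:
        for name in names:
            f.write(name + "\n")


# Example (paths are placeholders):
# trainval, test = split_names("VOCdevkit/banner/JPEGImages")
# write_list(trainval, "VOCdevkit/banner/ImageSets/Main/trainval.txt")
# write_list(test, "VOCdevkit/banner/ImageSets/Main/test.txt")
```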

data/banner contains the following files:

labelmap_voc_banner.prototxt

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "banner"
  label: 1
  display_name: "banner"
}

test_name_size.txt (generated by create_list_banner.sh)

test0259 360 480
test0375 1457 1080
test0118 350 500
test0056 750 1334
...

test.txt (likewise)

banner/JPEGImages/test0259.jpg banner/Annotations/test0259.xml
banner/JPEGImages/test0375.jpg banner/Annotations/test0375.xml
banner/JPEGImages/test0118.jpg banner/Annotations/test0118.xml
banner/JPEGImages/test0056.jpg banner/Annotations/test0056.xml
...

trainval.txt (likewise)

banner/JPEGImages/train_2878.jpg banner/Annotations/train_2878.xml
banner/JPEGImages/train_2904.jpg banner/Annotations/train_2904.xml
banner/JPEGImages/train_0786.jpg banner/Annotations/train_0786.xml
banner/JPEGImages/train_1818.jpg banner/Annotations/train_1818.xml
...

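create_list_banner.sh is adapted from create_list.sh in data/VOC0712 (I understand the general idea, although some details are still unclear to me). After editing it, mv it to data/banner. Before running it, ImageSets/Main must already contain test.txt and trainval.txt. Running it generates four files: test.txt, trainval.txt, test_name_size.txt, and trainval_name_size.txt.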
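To make the list format concrete, the sketch below builds the same "image path, annotation path" lines from a list of names. It only mirrors the examples shown in this section and is not a replacement for the shell script (the name/size files are not covered); `list_lines` and its parameters are hypothetical.

```python
def list_lines(names, dataset="banner", image_ext=".jpg"):
    """Build lines pairing each image with its annotation,
    using paths relative to the VOCdevkit directory."""
    lines = []
    for name in names:
        image_path = "%s/JPEGImages/%s%s" % (dataset, name, image_ext)
        xml_path = "%s/Annotations/%s.xml" % (dataset, name)
        lines.append(image_path + " " + xml_path)
    return lines


if __name__ == "__main__":
    for line in list_lines(["test0259", "test0375"]):
        print(line)
```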

Modify create_data.sh from the same directory to get create_data_banner.sh, which generates the lmdb files used for training.

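4. Training

Download the pretrained model from the SSD page: 07++12+COCO: SSD300*. (07++12 means that the 10k images of trainval2007+test2007 and the 16k images of trainval2012 are used as the training set, and test2012 as the test set.) After downloading, move the whole VGGNet folder into caffe's models directory.
Then modify the finetune_ssd_pascal.py inside it, including:

File paths
Default box aspect ratios (to fit banners)
The suffix of the mbox_source_layers (the original model was trained on 21 classes while my dataset has only two, so giving these layers different names means their parameters are not loaded from the pretrained model)
Number of classes and training parameters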
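Why the aspect ratios matter for banners can be seen from the default box shapes. In SSD, a default box of scale s and aspect ratio a has width s*sqrt(a) and height s/sqrt(a). The sketch below evaluates that formula; the scale of 60 and the ratios listed are illustrative values, not the ones used in this project, and `default_box_sizes` is a hypothetical helper rather than a function from the SSD scripts.

```python
import math


def default_box_sizes(min_size, aspect_ratios):
    """Return (width, height) for each aspect ratio at a given box scale.

    Every box keeps the same area, min_size ** 2; only its shape changes.
    """
    return [
        (min_size * math.sqrt(ar), min_size / math.sqrt(ar))
        for ar in aspect_ratios
    ]


if __name__ == "__main__":
    # Larger ratios give wider, flatter boxes, closer to the shape of a banner.
    for ar, (w, h) in zip([1, 2, 3, 4], default_box_sizes(60, [1, 2, 3, 4])):
        print(ar, round(w, 1), round(h, 1))
```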

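Because the original ssd scripts were written for python2, a few places need changes for python3:

finetune_ssd_pascal.py: Python 3 has no xrange(); range() takes its place. Change every xrange in the file to range.
finetune_ssd_pascal.py: wrap the division in int(): test_iter = int(num_test_image/test_batch_size)
model_libs.py: comment out the assert statement in the definition of UnpackVariable: assert len > 0
model_libs.py: in statements such as pad=int((3+(dilation-1)*2)-1)/2, change /2 to //2; there are 3 of them (in Python 3, / always returns a float, while // on ints returns an int)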
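The effect of the two division fixes can be reproduced in isolation. In the snippet below, dilation = 6 and test_batch_size = 8 are assumed values chosen only for illustration; 605 is the test-set size mentioned earlier.

```python
# Illustrative values; dilation and test_batch_size are assumptions.
dilation = 6
kernel_extent = 3 + (dilation - 1) * 2         # 13

pad_true_div = int(kernel_extent - 1) / 2      # Python 3: 6.0, a float
pad_floor_div = int(kernel_extent - 1) // 2    # Python 3: 6, an int

num_test_image = 605
test_batch_size = 8
# int() truncates 75.625 to 75, giving an integer iteration count.
test_iter = int(num_test_image / test_batch_size)

if __name__ == "__main__":
    print(type(pad_true_div).__name__, type(pad_floor_div).__name__, test_iter)
```

Under Python 2 the first expression would already have produced an int, which is why the original scripts did not need the change.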

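Starting from the model pretrained on 07++12+COCO speeds up training considerably: for example, the mAP already reaches 59.8% after 200 iterations. The final model reached an mAP of 85.8% after 44000 iterations in total.

Full finetune_ssd_pascal.py file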

5. Testing

Modify ssd_detect.ipynb in caffe/examples to draw the detection results on the images and save them:

import os

import numpy as np
import caffe

# net and transformer are created earlier in ssd_detect.ipynb


# Return the detection results for every image in a directory
def get_result(images_dir_path):
    images_list = os.listdir(images_dir_path)
    # Maps image name -> array of [conf, xmin, ymin, xmax, ymax] rows
    res_dict = dict()
    for image_name in images_list:
        image = caffe.io.load_image(os.path.join(images_dir_path, image_name))
        transformed_image = transformer.preprocess('data', image)
        net.blobs['data'].data[...] = transformed_image

        # Forward pass.
        detections = net.forward()['detection_out']

        # Parse the outputs.
        det_label = detections[0,0,:,1]
        det_conf = detections[0,0,:,2]
        det_xmin = detections[0,0,:,3]
        det_ymin = detections[0,0,:,4]
        det_xmax = detections[0,0,:,5]
        det_ymax = detections[0,0,:,6]

        # Keep detections with a confidence of at least 0.3.
        top_indices = [i for i, conf in enumerate(det_conf) if conf >= 0.3]

        # Scale the normalized coordinates to pixels.
        top_conf = det_conf[top_indices]
        top_xmin = det_xmin[top_indices] * image.shape[1]
        top_ymin = det_ymin[top_indices] * image.shape[0]
        top_xmax = det_xmax[top_indices] * image.shape[1]
        top_ymax = det_ymax[top_indices] * image.shape[0]

        res = np.stack([top_conf, top_xmin, top_ymin, top_xmax, top_ymax], -1).reshape(len(top_conf), 5)
        res_dict[image_name] = res

    return res_dict

anno_data = get_result('')  # directory of the test images

import os
import cv2

image_dir = ""  # the same directory of test images
out_dir = ""    # directory for the annotated output images
images_list = os.listdir(image_dir)
for image_name in images_list:
    res = anno_data[image_name]
    image = cv2.imread(os.path.join(image_dir, image_name))
    for data in res:
        # data = [conf, xmin, ymin, xmax, ymax] in pixels
        pt1 = int(data[1]), int(data[2])
        pt2 = int(data[3]), int(data[4])
        # Label text and its white background sit just above the box
        pt_txt = int(data[1]), int(data[2])-6
        pt_bg_tl = int(data[1]), int(data[2])-20
        pt_bg_br = int(data[1]) + 90, int(data[2])
        # image = cv2.rectangle(image, pt1, pt2, (0,255,0), int(max(image.shape[:2])/200))
        image = cv2.rectangle(image, pt1, pt2, (0,255,0), 2)
        image = cv2.rectangle(image, pt_bg_tl, pt_bg_br, (255, 255, 255), -1)
        cv2.putText(image, 'banner: %.2f' % data[0], pt_txt, cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0, 0, 0), 1)
    # Write to an explicit path instead of changing the working directory inside the loop
    cv2.imwrite(os.path.join(out_dir, image_name), image)
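The parsing step can also be tried without Caffe. The sketch below assumes, as in the code above, that each detection row is [image_id, label, conf, xmin, ymin, xmax, ymax] with coordinates normalized to the image size. `to_pixel_boxes` is a hypothetical helper; the clamping to [0, 1] is an addition that the original code does not perform, included here because predicted boxes may extend slightly past the image border.

```python
def to_pixel_boxes(detections, image_width, image_height, conf_threshold=0.3):
    """Filter detection rows by confidence and convert them to pixel boxes.

    Returns tuples of (label, conf, xmin, ymin, xmax, ymax).
    """
    boxes = []
    for _, label, conf, xmin, ymin, xmax, ymax in detections:
        if conf < conf_threshold:
            continue
        # Clamp to the image before scaling (not done in the original code).
        xmin, ymin = max(0.0, xmin), max(0.0, ymin)
        xmax, ymax = min(1.0, xmax), min(1.0, ymax)
        boxes.append((
            int(label), conf,
            int(xmin * image_width), int(ymin * image_height),
            int(xmax * image_width), int(ymax * image_height),
        ))
    return boxes


if __name__ == "__main__":
    fake = [
        [0, 1, 0.9, 0.25, 0.5, 0.75, 1.0],  # kept
        [0, 1, 0.1, 0.0, 0.0, 1.0, 1.0],    # below the threshold, dropped
    ]
    print(to_pixel_boxes(fake, image_width=400, image_height=200))
```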

Test results:

Test results (GIF):
