Caffe study notes 1: Running the Faster R-CNN demo (Caffe-based) on Ubuntu 16.04. (Tested and working; a record of two grueling days.)

https://www.cnblogs.com/elitphil/p/11527732.html

Caffe study notes 2: Configuring py-faster-rcnn and running faster_rcnn_end2end with VGG_CNN_M_1024 (Ubuntu 16.04)

https://www.cnblogs.com/elitphil/p/11547429.html

Once you have the two setups above working, training Faster R-CNN on your own data goes much more smoothly.

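Step 1: Prepare your own dataset

(1). First, the images in your own dataset (whether you shot them yourself or downloaded them) may have too high a resolution, which is bad for training, so shrink them to roughly the size of the images in VOC.

In /py-faster-rcnn/data/VOCdevkit2007/VOC2007 (use the corresponding directory on your own machine), create a Python file (named, for example, trans2voc_format.py).

Paste the following into it and run it to downscale your images proportionally: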

# coding=utf-8
from __future__ import print_function

import os  # needed for listing the directory and joining paths

from PIL import Image

start_path = './JPEGImages/'    # The only place you need to change: point this at your image directory.
max_width = 333                 # maximum image width
max_height = 500                # maximum image height

count = 0
for pic in os.listdir(start_path):
    path = os.path.join(start_path, pic)
    print(path)
    im = Image.open(path)
    w, h = im.size
    # If the image exceeds either limit, shrink it proportionally so it fits both.
    if w > max_width or h > max_height:
        if w * max_height >= h * max_width:
            # Width is the tighter constraint.
            w_new, h_new = max_width, max(1, h * max_width // w)
        else:
            # Height is the tighter constraint.
            w_new, h_new = max(1, w * max_height // h), max_height
        print("Resizing image " + pic)
        out = im.resize((w_new, h_new), Image.LANCZOS)  # LANCZOS is the filter formerly named ANTIALIAS
        # Save next to the original as <name>_new<ext>; the original file is kept.
        name, ext = os.path.splitext(pic)
        out.save(os.path.join(start_path, name + '_new' + ext))
        count = count + 1

print('END')
print("%d images were resized in total" % count)
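The proportional-shrink arithmetic can be checked on its own, without opening any files. This is a minimal sketch, assuming the same 333 x 500 limits as the script above; `fit_within` is a hypothetical helper name and not part of PIL.

```python
def fit_within(w, h, max_w=333, max_h=500):
    """Return (new_w, new_h) fitting inside max_w x max_h with the aspect ratio kept.

    Images that already fit are returned unchanged (never upscaled).
    Integer arithmetic is used so the result does not depend on float rounding.
    """
    if w <= max_w and h <= max_h:
        return w, h
    if w * max_h >= h * max_w:
        # Width is the tighter constraint.
        return max_w, max(1, h * max_w // w)
    # Height is the tighter constraint.
    return max(1, w * max_h // h), max_h


print(fit_within(1136, 640))   # landscape image, limited by width
print(fit_within(640, 1136))   # portrait image, limited by height
print(fit_within(300, 200))    # already small enough, returned as-is
```

Picking the tighter of the two constraints in one step means an image that is both too wide and too tall is resized once, to a size that respects both limits.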



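(2). Now that the images are ready, rename them (in theory, skipping the rename does no harm).
    Again in /py-faster-rcnn/data/VOCdevkit2007/VOC2007 (use the corresponding directory on your own machine), create a Python file (named, for example, pic_rename.py).
    Paste the following into it and run it to rename the images (if you have one hundred images, they will be renamed 000001 to 000100):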

# coding=utf-8
from __future__ import print_function

import os  # needed for listing and renaming files


class BatchRename(object):
    def __init__(self):
        self.path = './JPEGImages'    # Likewise, change this to your image directory.

    def rename(self):
        filelist = os.listdir(self.path)
        total_num = len(filelist)
        i = 1                         # First image number. Make sure it does not clash with the original VOC numbering.
        converted = 0
        for item in filelist:
            if item.endswith('.jpg'):
                src = os.path.join(os.path.abspath(self.path), item)
                # Zero-pad the number to six digits, e.g. 1 -> 000001.jpg
                dst = os.path.join(os.path.abspath(self.path), '%06d.jpg' % i)
                try:
                    os.rename(src, dst)
                    print('converting %s to %s ...' % (src, dst))
                    i = i + 1
                    converted = converted + 1
                except OSError:
                    continue
        print('total %d to rename & converted %d jpgs' % (total_num, converted))


if __name__ == '__main__':
    demo = BatchRename()
    demo.rename()
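Because a rename is hard to undo, it can help to preview the mapping first. The following is a rough sketch of a dry run, not part of the original script: `plan_renames` and `clashes` are hypothetical helper names, and sorting the file names is an assumption added here to make the numbering reproducible.

```python
def plan_renames(filenames, start=1, ext='.jpg'):
    """Return a list of (old_name, new_name) pairs without touching the disk."""
    plan = []
    number = start
    for name in sorted(filenames):
        if name.endswith(ext):
            plan.append((name, '%06d%s' % (number, ext)))
            number += 1
    return plan


def clashes(plan):
    """Conservatively flag pairs whose target name is also the name of another source file."""
    sources = set(old for old, _ in plan)
    return [(old, new) for old, new in plan if new != old and new in sources]


plan = plan_renames(['b.jpg', 'a.jpg', 'notes.txt'])
for old, new in plan:
    print('%s -> %s' % (old, new))
print('possible clashes:', clashes(plan))
```

In practice the list passed in would come from `os.listdir('./JPEGImages')`. If `clashes` reports anything, choosing a different `start` value or renaming into a separate directory avoids overwriting a file that has not been processed yet.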

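(3). Next, the images have to be annotated by hand. The labelImg tool is recommended; it is simple and convenient.

Download: https://github.com/tzutalin/labelImg

Usage is very simple: set the location where the XML files are saved, open your image directory, then annotate the images one by one.

(Screenshot borrowed from the second reference link.)

Annotate every image so that each one has a corresponding .xml file.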
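Before moving on, it can be worth reading a few of the generated XML files back to confirm the boxes look sane. The sketch below assumes the Pascal VOC layout that labelImg writes (`object/name` and `object/bndbox` with `xmin`, `ymin`, `xmax`, `ymax`); the sample XML, the class name `cat`, and the helper names `read_boxes` and `box_problems` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up annotation in Pascal VOC layout, used only for this example.
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <size><width>333</width><height>250</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>48</xmin><ymin>30</ymin><xmax>195</xmax><ymax>210</ymax></bndbox>
  </object>
</annotation>"""


def read_boxes(xml_text):
    """Return [(label, xmin, ymin, xmax, ymax), ...] from one annotation."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall('object'):
        label = obj.findtext('name')
        bb = obj.find('bndbox')
        coords = tuple(int(float(bb.findtext(tag)))
                       for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        boxes.append((label,) + coords)
    return boxes


def box_problems(xml_text):
    """Return the boxes that are empty or extend past the image size stored in the XML."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext('size/width'))
    height = int(root.findtext('size/height'))
    return [box for box in read_boxes(xml_text)
            if not (0 <= box[1] < box[3] <= width and 0 <= box[2] < box[4] <= height)]


print(read_boxes(SAMPLE_XML))
print(box_problems(SAMPLE_XML))
```

For real files, the text would come from reading each file in the Annotations directory; an empty list from `box_problems` means every box lies inside the image.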

Next, go to VOC2007 and delete (or back up) the original images and XML files. They are located at:

/home/py-faster-rcnn/data/VOCdevkit2007/VOC2007/JPEGImages
/home/py-faster-rcnn/data/VOCdevkit2007/VOC2007/Annotations

They are removed because we do not need the other dataset and only want to train on our own data, which also makes training faster.
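If you prefer a backup over deletion, one option is to move each directory aside and recreate it empty. This is only a sketch, assuming the directory still holds nothing but the original VOC files at this point; `backup_dir` is a hypothetical helper and the timestamped backup name is an arbitrary choice.

```python
import os
import shutil
import time


def backup_dir(path, suffix=None):
    """Move `path` to `<path>_backup_<suffix>` and recreate `path` empty.

    Returns the backup location. The suffix defaults to the current timestamp.
    """
    path = path.rstrip('/')
    suffix = suffix or time.strftime('%Y%m%d_%H%M%S')
    backup = '%s_backup_%s' % (path, suffix)
    shutil.move(path, backup)   # keeps every original file, just under a new name
    os.makedirs(path)           # fresh, empty directory for your own data
    return backup
```

Calling `backup_dir` once on the JPEGImages path and once on the Annotations path listed above leaves both directories empty while keeping the originals recoverable.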

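(4). With the images and annotations in place, generate the txt index files needed for training and testing; the program uses these indexes to locate the images.

In /py-faster-rcnn/data/VOCdevkit2007/VOC2007 (use the corresponding directory on your own machine), create a Python file (named, for example, xml2txt.py).

Paste the following into it and run it to generate the index files: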

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import random

trainval_percent = 0.8           # fraction of all samples that go into trainval (the rest become test)
train_percent = 0.7              # fraction of trainval that goes into train (the rest become val)
xmlfilepath = 'Annotations'
txtsavepath = 'ImageSets/Main'   # directory where the index files are written
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'   # strip the '.xml' extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
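The two percentages are easy to misread, so here is a small sketch of how the subset sizes fall out and how the resulting lists can be sanity-checked. It mirrors the arithmetic of the script above; `split_sizes` and `check_split` are hypothetical helper names, not part of py-faster-rcnn.

```python
def split_sizes(num, trainval_percent=0.8, train_percent=0.7):
    """Return the number of samples in each subset for `num` annotation files."""
    tv = int(num * trainval_percent)   # trainval is taken from the whole set
    tr = int(tv * train_percent)       # train is taken from trainval, not from the whole set
    return {'trainval': tv, 'train': tr, 'val': tv - tr, 'test': num - tv}


def check_split(train, val, trainval, test):
    """Return a list of problems; an empty list means the four index lists are consistent."""
    train, val, trainval, test = set(train), set(val), set(trainval), set(test)
    problems = []
    if train & val:
        problems.append('train and val overlap')
    if (train | val) != trainval:
        problems.append('trainval is not the union of train and val')
    if trainval & test:
        problems.append('trainval and test overlap')
    return problems


print(split_sizes(10))
print(check_split(['000001', '000002'], ['000003'],
                  ['000001', '000002', '000003'], ['000004']))
```

With 10 annotation files, for instance, the split is 8 for trainval (5 train and 3 val) and 2 for test. To check real output, pass in the lines read from the four txt files in ImageSets/Main.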


The generated index files are located here