Training LeNet on Your Own Handwritten Image Data

This article is a repost. 2017-11-02 10:11

I. Preface

This post tries to convert my own dataset into lmdb format and feed it to LeNet for training and testing. It draws on two blog posts: http://blog.csdn.net/liuweizj12/article/details/52149743 and http://blog.csdn.net/xiaoxiao_huitailang/article/details/51361036

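II. From Training the Model to Predicting Image Classes

(1) Preparing your own image data

The goal is to train the LeNet model on my own images. My image data has 10 classes, 0 to 9, stored in folders named 0 to 9. Under /home/<your username>/ create a folder char_images to hold the image data, and under /home/<your username>/char_images/ create two folders named train and val, each containing folders named 0 to 9; for example, folder 0 holds images of the character "0". The resulting layout is char_images/train/0 through char_images/train/9 and char_images/val/0 through char_images/val/9.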
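
The folder layout described above can also be created by a short script rather than by hand. The following is a minimal sketch, not part of the original post; the function name make_layout is hypothetical, and the demo writes into a temporary directory so it is safe to run anywhere.

```python
import os
import tempfile


def make_layout(root, splits=("train", "val"), classes=range(10)):
    """Create <root>/<split>/<class>/ for every split and class; return the paths."""
    created = []
    for split in splits:
        for cls in classes:
            path = os.path.join(root, split, str(cls))
            os.makedirs(path, exist_ok=True)  # no error if it already exists
            created.append(path)
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory; pass the real char_images path instead.
    with tempfile.TemporaryDirectory() as tmp:
        dirs = make_layout(os.path.join(tmp, "char_images"))
        print(len(dirs), "class folders created")
```

For real use, call make_layout with the path of char_images and then copy the images of each digit into the matching class folder.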

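(2) Resize all images to 28*28 and generate the txt label files

Computing the mean file requires all images to have the same size. In the directory that contains the train and val folders, create a Python file named getPath.py with the following content: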

#coding:utf-8
import os

import cv2


def IsSubString(SubStrList, Str):
    # Return True only if every element of SubStrList occurs in Str.
    return all(substr in Str for substr in SubStrList)


def GetFileList(FindPath, FlagStr=None):
    # Return the sorted paths of the entries directly under FindPath.
    # If FlagStr is given, keep only names that contain every substring
    # in it (for example, a file extension); otherwise keep everything.
    FlagStr = FlagStr or []
    FileList = []
    for fn in os.listdir(FindPath):
        if IsSubString(FlagStr, fn):
            FileList.append(os.path.join(FindPath, fn))
    FileList.sort()
    return FileList


classList = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']


def BuildList(subset):
    # Resize every image under <subset>/<class>/ to 28x28 in place and
    # write one "<path> <label>" line per image to <subset>.txt.
    # The dataset must sit in the same directory as this .py file.
    with open(subset + '.txt', 'w') as list_txt:
        for label in classList:
            for img in GetFileList(subset + '/' + label):
                srcImg = cv2.imread(img)
                if srcImg is None:  # skip files OpenCV cannot decode
                    continue
                resizedImg = cv2.resize(srcImg, (28, 28))
                cv2.imwrite(img, resizedImg)
                # a space, not \t, separates the path from the label
                list_txt.write(img + ' ' + label + '\n')


BuildList('train')  # writes train.txt
BuildList('val')    # writes val.txt
print("File lists generated successfully")

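Running this script resizes every image to 28*28 and writes the label files train.txt and val.txt, for the training and test images, into the directory that contains the train and val folders. Each line holds an image path relative to that directory, a space, and the class label, in the form train/0/<image file name> 0.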
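
Before building the lmdb it can help to confirm that every label in train.txt and val.txt matches the folder the image sits in. The following is a minimal sketch under the line format described above; the helper names parse_label_line and check_label_lines are hypothetical and not part of getPath.py.

```python
import os


def parse_label_line(line):
    """Split '<path> <label>' on the last space so paths with spaces survive."""
    path, label = line.rstrip("\n").rsplit(" ", 1)
    return path, int(label)


def check_label_lines(lines):
    """Return the lines whose label differs from the image's parent folder name."""
    bad = []
    for line in lines:
        path, label = parse_label_line(line)
        folder = os.path.basename(os.path.dirname(path))
        if folder != str(label):
            bad.append(line)
    return bad
```

A typical use would be check_label_lines(open('train.txt')), which should return an empty list for a list produced by the script above.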

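(3) Generate the dataset in lmdb format

First create a folder My_File under the caffe root, create two folders Build_lmdb and Data_label inside My_File, and move the text files train.txt and val.txt generated in (2) into Data_label.

Copy examples/imagenet/create_imagenet.sh from the caffe root into the Build_lmdb folder.

Open create_imagenet.sh and modify it as follows: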

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
set -e

EXAMPLE=My_File/Build_lmdb   # where the generated lmdb data is saved
DATA=My_File/Data_label      # where the two txt label files are
TOOLS=build/tools            # Caffe's bundled tools; leave as is

# Root of the prepared training images; this path combined with a path
# written in train.txt gives the full image path
TRAIN_DATA_ROOT=/home/zjy/char_images/
# Root of the prepared test images, ...
VAL_DATA_ROOT=/home/zjy/char_images/

# Set RESIZE=true to resize the images to 28x28. Leave as false if images have
# already been resized using another tool.
RESIZE=false
if $RESIZE; then
  RESIZE_HEIGHT=28
  RESIZE_WIDTH=28
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

# --gray is added below because the images are grayscale
echo "Creating train lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    --gray \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/train_lmdb   # folder that receives the training lmdb

echo "Creating val lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    --gray \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/val_lmdb     # folder that receives the validation lmdb

echo "Done."

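The comments above only mark what was changed. Keep Chinese characters out of the actual sh file, and never put a comment after a trailing backslash, since that breaks the line continuation. Because the paths in the script are relative, run it from the caffe root. It creates two folders, train_lmdb and val_lmdb, inside Build_lmdb, each containing two lmdb files (data.mdb and lock.mdb).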
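
A common reason for convert_imageset to fail is a mismatch between the data root and the paths in the list file. The following is a minimal sketch of a pre-flight check, assuming the tool forms each image path by prepending the root to the listed path as a plain string (which is why the roots above end with a slash); the function name missing_images is hypothetical.

```python
import os


def missing_images(data_root, list_lines):
    """Return listed paths that do not exist once data_root is prepended.

    The root and the listed path are joined by plain string concatenation
    (an assumption about the tool), so data_root should end with a slash.
    """
    missing = []
    for line in list_lines:
        rel_path = line.rstrip("\n").rsplit(" ", 1)[0]  # drop the label
        if not os.path.isfile(data_root + rel_path):
            missing.append(rel_path)
    return missing
```

For example, missing_images('/home/zjy/char_images/', open('My_File/Data_label/train.txt')) should return an empty list before the script is run.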

(4) Modify lenet_solver.prototxt and lenet_train_test.prototxt
Copy the three files train_lenet.sh, lenet_solver.prototxt and lenet_train_test.prototxt from caffe/examples/mnist into My_File. First edit train_lenet.sh as follows; only the path to the solver prototxt changes:

#!/usr/bin/env sh
set -e
./build/tools/caffe train --solver=My_File/lenet_solver.prototxt $@ # path changed

Then edit lenet_solver.prototxt as follows:

# The train/test net protocol buffer definition
net: "My_File/lenet_train_test.prototxt" # change this
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "My_File/" # change this
# solver mode: CPU or GPU
solver_mode: GPU
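
Two of the solver settings above are easier to reason about with numbers. The following is a minimal sketch, assuming the "inv" policy computes base_lr * (1 + gamma * iter) ^ (-power) and that test_iter multiplied by the TEST batch size should cover the validation set; the function names are hypothetical, and the defaults are the values from the solver file above.

```python
import math


def inv_learning_rate(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under lr_policy "inv" at a given iteration."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


def needed_test_iter(num_val_images, test_batch_size=100):
    """Smallest test_iter whose forward passes cover every validation image."""
    return math.ceil(num_val_images / test_batch_size)


if __name__ == "__main__":
    for it in (0, 5000, 10000):
        print(it, inv_learning_rate(it))
```

The solver comments still describe MNIST's 10,000 test images; with a smaller validation set of your own, needed_test_iter gives a value for test_iter that avoids evaluating the same images many times over.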

Finally, edit lenet_train_test.prototxt as follows:

name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "My_File/Build_lmdb/train_lmdb" # change to your own path
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "My_File/Build_lmdb/val_lmdb" # change to your own path
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
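
The layer sizes in this network explain why the images had to be 28*28: each convolution and pooling layer shrinks the feature map by a fixed rule. The following is a minimal sketch that walks through that arithmetic using the kernel sizes, strides and channel counts from the prototxt above; the function names are hypothetical.

```python
def out_size(size, kernel, stride):
    """Spatial output size of an unpadded convolution or pooling window.

    Floor division is used here; every division in this network is exact,
    so the rounding mode does not affect the result.
    """
    return (size - kernel) // stride + 1


def lenet_shapes(input_size=28):
    """Return (layer name, channels, height/width) for each spatial layer."""
    shapes = []
    size = input_size
    for name, channels, kernel, stride in [
        ("conv1", 20, 5, 1),
        ("pool1", 20, 2, 2),
        ("conv2", 50, 5, 1),
        ("pool2", 50, 2, 2),
    ]:
        size = out_size(size, kernel, stride)
        shapes.append((name, channels, size))
    return shapes


if __name__ == "__main__":
    for name, channels, size in lenet_shapes():
        print(name, channels, "x", size, "x", size)
```

The last entry, pool2, has 50 channels of 4 x 4, so ip1 receives 800 inputs and maps them to 500 outputs, and ip2 maps those to the 10 digit classes. The scale value 0.00390625 in transform_param equals 1/256, which brings pixel values from 0..255 into the range [0, 1).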

Run My_File/train_lenet.sh to obtain the final training result; the trained caffemodel and solverstate files are generated under My_File.

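(5) Generate the mean file
The mean file is mainly used at prediction time and is produced by caffe/build/tools/compute_image_mean. Create a folder Mean under My_File to hold the mean file, then run from caffe/:
build/tools/compute_image_mean My_File/Build_lmdb/train_lmdb My_File/Mean/mean.binaryproto
This generates the mean file mean.binaryproto under My_File/Mean/.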
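
To make concrete what a mean file holds, the following is a minimal sketch of a per-pixel average over a set of equally sized grayscale images, written with plain Python lists. It assumes the tool's output is this per-pixel average of the images in the lmdb; it does not read or write the binaryproto format, and the function name pixel_mean is hypothetical.

```python
def pixel_mean(images):
    """Average a list of equally sized 2-D images (lists of rows) pixel by pixel."""
    count = len(images)
    height, width = len(images[0]), len(images[0][0])
    total = [[0.0] * width for _ in range(height)]
    for img in images:
        for r in range(height):
            for c in range(width):
                total[r][c] += img[r][c]  # accumulate the sum first
    return [[value / count for value in row] for row in total]


if __name__ == "__main__":
    a = [[0, 2], [4, 6]]
    b = [[2, 4], [6, 8]]
    print(pixel_mean([a, b]))
```

At prediction time the mean is subtracted from the input image, which is why the images had to share one size before this step.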
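
(6) Generate deploy.prototxt
deploy.prototxt is lenet_train_test.prototxt with the TRAIN and TEST data layers at the top and the Accuracy and SoftmaxWithLoss layers at the end removed, a data layer description added at the start, and a Softmax layer added at the end. It can be generated with Python following the post http://blog.csdn.net/lanxuecc/article/details/52474476, or edited by hand from lenet_train_test.prototxt. Create a folder Deploy under My_File, copy lenet_train_test.prototxt into it, rename the copy to deploy.prototxt, and change its content as follows: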

name: "LeNet"
layer { # the original TRAIN and TEST data layers are removed and one data layer is added
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 1 dim: 28 dim: 28 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer { # Softmax layer added
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}
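
The Softmax layer added at the end turns the 10 raw scores from ip2 into class probabilities. The following is a minimal sketch of that computation in plain Python, not Caffe's implementation; the helper top_class and the example scores are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_class(scores, labels):
    """Return (label, probability) of the most likely class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


if __name__ == "__main__":
    labels = [str(d) for d in range(10)]
    scores = [0.0] * 10
    scores[7] = 5.0  # made-up scores favouring the digit 7
    print(top_class(scores, labels))
```

During training the SoftmaxWithLoss layer plays this role together with the loss, which is why the deploy network needs a plain Softmax layer in its place.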


(7) Predict images
Create a folder Pic under My_File to hold the test images. Create another folder Synset under My_File, add a file synset_words.txt in it, and enter the following:
0
1
2
3
4
5
6
7
8
9

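At this point My_File contains the folders Build_lmdb, Data_label, Mean, Deploy, Pic and Synset, alongside train_lenet.sh, lenet_solver.prototxt, lenet_train_test.prototxt and the caffemodel and solverstate files produced by training.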

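Finally, run caffe/build/examples/cpp_classification/classification.bin from a terminal to classify an image, passing the deploy prototxt, the trained caffemodel, the mean file, the label file and the image path, in that order.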
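
The argument order is easy to get wrong, so the following is a minimal sketch that assembles the command from the folders created in the earlier steps. The snapshot name My_File/_iter_10000.caffemodel is an assumption derived from snapshot_prefix "My_File/" and max_iter 10000, and the image name test.png is hypothetical; check the actual file names in My_File before running.

```python
import shlex


def classification_command(model_iter=10000, image="My_File/Pic/test.png"):
    """Build the argument list for classification.bin, to be run from the caffe root."""
    return [
        "build/examples/cpp_classification/classification.bin",
        "My_File/Deploy/deploy.prototxt",            # network definition from step (6)
        "My_File/_iter_%d.caffemodel" % model_iter,  # assumed snapshot file name
        "My_File/Mean/mean.binaryproto",             # mean file from step (5)
        "My_File/Synset/synset_words.txt",           # label names from step (7)
        image,                                       # hypothetical test image
    ]


if __name__ == "__main__":
    # Print a shell-ready command line; nothing is executed here.
    print(" ".join(shlex.quote(arg) for arg in classification_command()))
```

The printed line can be pasted into a terminal opened at the caffe root, or the list can be passed to subprocess.run.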

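III. Closing Remarks

This turned out to be a long and tedious post. Experts can skip it; those just getting started may find it worth a look!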
