Learning Caffe: running the two simple examples bundled with Caffe


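To keep the distribution small, Caffe does not ship with any practice data, so you have to download it yourself. The authors have already written download scripts in the data folder under the Caffe root directory; all we need is a network connection to run them.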

Note: run every command below from the Caffe root directory, because the example scripts use paths relative to it.

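1. MNIST example

MNIST is a database of handwritten digits. It was originally used for recognizing handwritten digits on checks and has since become the standard introductory exercise for deep learning. The model designed for MNIST recognition is LeNet, one of the earliest CNN models.

MNIST has 60,000 training samples and 10,000 test samples. Each sample is a 28*28 grayscale image of a handwritten digit from 0 to 9, so there are 10 classes.

First download the MNIST data: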

(caffe_src) root@ranxf-TEST:/workdisk/caffe# sh data/mnist/get_mnist.sh
Downloading...
--2019-09-12 13:09:21--  http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Resolving yann.lecun.com (yann.lecun.com)... 216.165.22.6
Connecting to yann.lecun.com (yann.lecun.com)|216.165.22.6|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9912422 (9.5M) [application/x-gzip]
Saving to: ‘train-images-idx3-ubyte.gz’

train-images-idx3-ubyte. 100%[===============================>]   9.45M  23.5KB/s    in 14m 22s 

2019-09-12 13:23:44 (11.2 KB/s) - ‘train-images-idx3-ubyte.gz’ saved [9912422/9912422]

--2019-09-12 13:23:44--  http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Resolving yann.lecun.com (yann.lecun.com)... 216.165.22.6
Connecting to yann.lecun.com (yann.lecun.com)|216.165.22.6|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 28881 (28K) [application/x-gzip]
Saving to: ‘train-labels-idx1-ubyte.gz’

train-labels-idx1-ubyte. 100%[===============================>]  28.20K  54.8KB/s    in 0.5s    

2019-09-12 13:23:46 (54.8 KB/s) - ‘train-labels-idx1-ubyte.gz’ saved [28881/28881]

--2019-09-12 13:23:46--  http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Resolving yann.lecun.com (yann.lecun.com)... 216.165.22.6
Connecting to yann.lecun.com (yann.lecun.com)|216.165.22.6|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1648877 (1.6M) [application/x-gzip]
Saving to: ‘t10k-images-idx3-ubyte.gz’

t10k-images-idx3-ubyte.g 100%[===============================>]   1.57M  32.0KB/s    in 84s     

2019-09-12 13:25:10 (19.3 KB/s) - ‘t10k-images-idx3-ubyte.gz’ saved [1648877/1648877]

--2019-09-12 13:25:10--  http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Resolving yann.lecun.com (yann.lecun.com)... 216.165.22.6
Connecting to yann.lecun.com (yann.lecun.com)|216.165.22.6|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4542 (4.4K) [application/x-gzip]
Saving to: ‘t10k-labels-idx1-ubyte.gz’

t10k-labels-idx1-ubyte.g 100%[===============================>]   4.44K  --.-KB/s    in 0s      

2019-09-12 13:25:11 (121 MB/s) - ‘t10k-labels-idx1-ubyte.gz’ saved [4542/4542]

After the script finishes, the data/mnist/ directory contains four data files:

(caffe_src) root@ranxf-TEST:/workdisk/caffe/data/mnist# ls
t10k-images-idx3-ubyte   test set images
t10k-labels-idx1-ubyte   test set labels
train-images-idx3-ubyte  training set images
train-labels-idx1-ubyte  training set labels
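
The idx3-ubyte and idx1-ubyte suffixes refer to the IDX binary layout: a big-endian 4-byte magic number, followed by one big-endian 4-byte size per dimension, followed by the raw bytes. As a minimal sketch (the function name read_idx_header and the synthetic demo header are illustrative and not part of Caffe), the header can be decoded like this to confirm the sample count and image size:

```python
import struct

# Magic numbers defined by the IDX format used for the MNIST files.
IMAGE_MAGIC = 2051  # idx3-ubyte: unsigned-byte data, 3 dimensions
LABEL_MAGIC = 2049  # idx1-ubyte: unsigned-byte data, 1 dimension


def read_idx_header(raw: bytes) -> dict:
    """Decode the big-endian header at the start of an IDX file."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic == IMAGE_MAGIC:
        rows, cols = struct.unpack(">II", raw[8:16])
        return {"kind": "images", "count": count, "rows": rows, "cols": cols}
    if magic == LABEL_MAGIC:
        return {"kind": "labels", "count": count}
    raise ValueError(f"unexpected IDX magic number: {magic}")


# Synthetic header mimicking train-images-idx3-ubyte (no pixel data attached).
demo_header = struct.pack(">IIII", IMAGE_MAGIC, 60000, 28, 28)
print(read_idx_header(demo_header))
```

To inspect a real file, pass the first 16 bytes of the unpacked download, for example the result of open("data/mnist/train-images-idx3-ubyte", "rb").read(16).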

Caffe cannot use these files directly; they have to be converted to LMDB first:

(caffe_src) root@ranxf-TEST:/workdisk/caffe# sh examples/mnist/create_mnist.sh
Creating lmdb...
I0912 13:36:06.644217  3799 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_train_lmdb
I0912 13:36:06.644412  3799 convert_mnist_data.cpp:88] A total of 60000 items.
I0912 13:36:06.644423  3799 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
I0912 13:36:11.209887  3799 convert_mnist_data.cpp:108] Processed 60000 files.
I0912 13:36:11.485198  3801 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_test_lmdb
I0912 13:36:11.485344  3801 convert_mnist_data.cpp:88] A total of 10000 items.
I0912 13:36:11.485355  3801 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
I0912 13:36:12.264843  3801 convert_mnist_data.cpp:108] Processed 10000 files.
Done.

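If you want to work with LevelDB data instead, run the programs under examples/siamese/; the examples/mnist/ folder works with LMDB data.

After the conversion, two folders are created under examples/mnist/: mnist_train_lmdb and mnist_test_lmdb. The data.mdb and lock.mdb files inside them are the data we need for training.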

(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/mnist# cd mnist_test_lmdb/
(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/mnist/mnist_test_lmdb# ls
data.mdb  lock.mdb
(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/mnist/mnist_test_lmdb# cd ../mnist_train_lmdb/
(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/mnist/mnist_train_lmdb# ls
data.mdb  lock.mdb
(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/mnist/mnist_train_lmdb# 

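The next step is to edit the configuration. If you have a GPU and it is fully set up, you can skip this step; otherwise you need to modify the solver configuration file.

Two configuration files are involved: lenet_solver.prototxt (the solver settings) and lenet_train_test.prototxt (the network definition that the solver references through its net field).

First open lenet_solver.prototxt:

(caffe_src) root@ranxf-TEST:/workdisk/caffe# vim examples/mnist/lenet_solver.prototxt

Set the maximum number of iterations at max_iter as needed, and decide whether solver_mode on the last line should be changed to CPU. (I do not have a GPU yet, so I had to change it to CPU.)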
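
If you would rather not edit the file by hand, the same change can be scripted. The following is only a sketch: set_solver_mode is a hypothetical helper, not a Caffe tool, and it treats the solver file as plain text, assuming solver_mode appears on a line of its own.

```python
import re


def set_solver_mode(solver_text: str, mode: str) -> str:
    """Return the solver text with its solver_mode line set to CPU or GPU."""
    if mode not in ("CPU", "GPU"):
        raise ValueError("mode must be 'CPU' or 'GPU'")
    new_text, replaced = re.subn(
        r"^(\s*solver_mode\s*:\s*)(CPU|GPU)\b",
        lambda m: m.group(1) + mode,
        solver_text,
        flags=re.MULTILINE,
    )
    if replaced == 0:
        # No active solver_mode line found: append one instead of doing nothing.
        new_text = solver_text.rstrip("\n") + f"\nsolver_mode: {mode}\n"
    return new_text


# Illustrative excerpt; the real file contains more fields.
sample = 'max_iter: 10000\nsnapshot_prefix: "examples/mnist/lenet"\nsolver_mode: GPU\n'
print(set_solver_mode(sample, "CPU"))
```

To apply it to the real file, read examples/mnist/lenet_solver.prototxt into a string, pass it through the function, and write the result back. Commented-out lines such as "# solver_mode: GPU" are left untouched because the pattern only matches lines that start with the field name.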

After saving and exiting, the example is ready to run:

(caffe_src) root@ranxf-TEST:/workdisk/caffe# time sh examples/mnist/train_lenet.sh
I0912 13:53:03.622133  4864 caffe.cpp:197] Use CPU.
I0912 13:53:03.622301  4864 solver.cpp:45] Initializing solver from parameters: 
test_iter: 100
test_interval: 500
base_lr: 0.01
display: 100
max_iter: 10000
………………

………………
I0912 14:05:13.225632  4867 data_layer.cpp:73] Restarting data prefetching from start.
I0912 14:05:13.387485  4864 solver.cpp:414]     Test net output #0: accuracy = 0.9913
I0912 14:05:13.387523  4864 solver.cpp:414]     Test net output #1: loss = 0.0285459 (* 1 = 0.0285459 loss)
I0912 14:05:13.387529  4864 solver.cpp:332] Optimization Done.
I0912 14:05:13.387535  4864 caffe.cpp:250] Optimization Done.

real    12m9.863s
user    12m12.844s
sys    0m0.236s

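On the CPU the run takes about 12 minutes, and the accuracy is about 99%.

For comparison, the same script on a machine with a GeForce GTX TITAN X: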

(caffe) root@test:/opt/caffe# time sh examples/mnist/train_lenet.sh
I0924 10:57:55.730465 10305 caffe.cpp:204] Using GPUs 0
I0924 10:57:55.754664 10305 caffe.cpp:209] GPU 0: GeForce GTX TITAN X
I0924 10:57:56.068701 10305 solver.cpp:45] Initializing solver from parameters: 
test_iter: 100
test_interval: 500
base_lr: 0.01
display: 100
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
solver_mode: GPU
device_id: 0
net: "examples/mnist/lenet_train_test.prototxt"
train_state {
  level: 0
  stage: ""
}

…………
…………

I0924 10:58:42.541656 10305 sgd_solver.cpp:284] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate
I0924 10:58:42.546165 10305 solver.cpp:327] Iteration 10000, loss = 0.00301418
I0924 10:58:42.546185 10305 solver.cpp:347] Iteration 10000, Testing net (#0)
I0924 10:58:42.783694 10311 data_layer.cpp:73] Restarting data prefetching from start.
I0924 10:58:42.792412 10305 solver.cpp:414]     Test net output #0: accuracy = 0.9917
I0924 10:58:42.792433 10305 solver.cpp:414]     Test net output #1: loss = 0.0296816 (* 1 = 0.0296816 loss)
I0924 10:58:42.792439 10305 solver.cpp:332] Optimization Done.
I0924 10:58:42.792444 10305 caffe.cpp:250] Optimization Done.

real    0m47.216s
user    0m48.420s
sys    0m10.966s

On the GPU the run takes about 47 seconds, and the accuracy is again about 99%.
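
The solver parameters printed in the log above (lr_policy: "inv", base_lr: 0.01, gamma: 0.0001, power: 0.75) mean the learning rate decays smoothly during training. Caffe's "inv" policy computes base_lr * (1 + gamma * iter) ^ (-power); the sketch below (inv_learning_rate is an illustrative name, not a Caffe function) evaluates it at a few iterations:

```python
def inv_learning_rate(base_lr: float, gamma: float, power: float, iteration: int) -> float:
    """Learning rate under the "inv" policy: base_lr * (1 + gamma * iter)^(-power)."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# Values taken from the solver log above.
for it in (0, 2500, 5000, 7500, 10000):
    print(it, inv_learning_rate(0.01, 0.0001, 0.75, it))
```

With these settings the rate starts at 0.01 and has dropped to a little under 0.006 by iteration 10000.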

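2. CIFAR-10 example

The CIFAR-10 dataset has 50,000 training samples and 10,000 test samples. Each sample is a 32*32 three-channel color image, and there are 10 classes.

Download the data: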

(caffe_src) root@ranxf-TEST:/workdisk/caffe# sh data/cifar10/get_cifar10.sh 
Downloading...
--2019-09-12 14:08:51--  http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz
Resolving www.cs.toronto.edu (www.cs.toronto.edu)... 128.100.3.30
Connecting to www.cs.toronto.edu (www.cs.toronto.edu)|128.100.3.30|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 170052171 (162M) [application/x-gzip]
Saving to: ‘cifar-10-binary.tar.gz’

cifar-10-binary.tar.gz                 100%[==========================================================================>] 162.17M  39.8KB/s    in 1h 43m  

2019-09-12 15:52:50 (26.6 KB/s) - ‘cifar-10-binary.tar.gz’ saved [170052171/170052171]

Unzipping...
Done.

After the script finishes, a set of .bin files appears under data/cifar10/:

(caffe_src) root@ranxf-TEST:/workdisk/caffe/data/cifar10# ll
total 180092
drwxr-xr-x 2 root root     4096 Sep 12 15:52 ./
drwxr-xr-x 6 root root     4096 Sep 10 15:30 ../
-rw-r--r-- 1 2156 1103       61 Jun  5  2009 batches.meta.txt
-rw-r--r-- 1 2156 1103 30730000 Jun  5  2009 data_batch_1.bin
-rw-r--r-- 1 2156 1103 30730000 Jun  5  2009 data_batch_2.bin
-rw-r--r-- 1 2156 1103 30730000 Jun  5  2009 data_batch_3.bin
-rw-r--r-- 1 2156 1103 30730000 Jun  5  2009 data_batch_4.bin
-rw-r--r-- 1 2156 1103 30730000 Jun  5  2009 data_batch_5.bin
-rwxr-xr-x 1 root root      506 Sep 10 10:26 get_cifar10.sh*
-rw-r--r-- 1 2156 1103       88 Jun  5  2009 readme.html
-rw-r--r-- 1 2156 1103 30730000 Jun  5  2009 test_batch.bin
(caffe_src) root@ranxf-TEST:/workdisk/caffe/data/cifar10#
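
The size of each .bin file in the listing, 30730000 bytes, follows directly from the CIFAR-10 binary layout: every record is 1 label byte followed by 32*32*3 = 3072 pixel bytes (the red plane, then green, then blue), and each batch file holds 10,000 records. A small sketch of that arithmetic, with a helper for splitting one record (the function names are illustrative):

```python
IMAGE_SIDE = 32
CHANNELS = 3
LABEL_BYTES = 1
PIXEL_BYTES = IMAGE_SIDE * IMAGE_SIDE * CHANNELS  # 3072 bytes of pixel data
RECORD_BYTES = LABEL_BYTES + PIXEL_BYTES          # 3073 bytes per sample


def records_in_file(file_size: int) -> int:
    """Number of samples in a batch file of the given size in bytes."""
    if file_size % RECORD_BYTES != 0:
        raise ValueError("file size is not a whole number of records")
    return file_size // RECORD_BYTES


def split_record(record: bytes):
    """Split one record into (label, red plane, green plane, blue plane)."""
    if len(record) != RECORD_BYTES:
        raise ValueError("a record must be exactly 3073 bytes")
    plane = IMAGE_SIDE * IMAGE_SIDE
    pixels = record[LABEL_BYTES:]
    red, green, blue = (pixels[i * plane:(i + 1) * plane] for i in range(CHANNELS))
    return record[0], red, green, blue


# File size taken from the directory listing above.
print(records_in_file(30730000))
```

Five training batches of 10,000 records give the 50,000 training samples, and test_batch.bin provides the 10,000 test samples.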

Convert the data to LMDB format:

(caffe_src) root@ranxf-TEST:/workdisk/caffe# sh examples/cifar10/create_cifar10.sh
Creating lmdb...

After the conversion, two folders appear under examples/cifar10/, cifar10_train_lmdb and cifar10_test_lmdb; the files inside them are what we need.

 

(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/cifar10# cd cifar10_train_lmdb/
(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/cifar10/cifar10_train_lmdb# ls
data.mdb  lock.mdb
(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/cifar10/cifar10_train_lmdb# cd ../cifar10_test_lmdb/
(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/cifar10/cifar10_test_lmdb# ls
data.mdb  lock.mdb
(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/cifar10/cifar10_test_lmdb# 

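To save time we use the quick training setup (train_quick), which runs in two stages. The first stage runs up to iteration 4000 with the configuration file cifar10_quick_solver.prototxt and a learning rate (base_lr) of 0.001.

The second stage resumes from there and runs up to iteration 5000 (that is, 1,000 more iterations) with the configuration file cifar10_quick_solver_lr1.prototxt and a learning rate (base_lr) of 0.0001.

The two configuration files differ only in the learning rate (base_lr) and the maximum number of iterations (max_iter); everything else is the same.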

(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/cifar10# vim cifar10_quick_solver.prototxt 
(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/cifar10# 
(caffe_src) root@ranxf-TEST:/workdisk/caffe/examples/cifar10# vim cifar10_quick_solver_lr1.prototxt 

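Once you are familiar with the configuration files, the two can actually be merged into one by setting lr_policy to multistep, together with a gamma factor and a stepvalue at which the rate drops:

base_lr: 0.001
momentum: 0.9
weight_decay: 0.004
# The learning rate policy
# lr_policy: "fixed"
lr_policy: "multistep"
# Multiply the learning rate by gamma at each stepvalue (0.001 -> 0.0001 at iteration 4000)
gamma: 0.1
stepvalue: 4000
# Display every 100 iterations
display: 100
# The maximum number of iterations (covers both stages)
max_iter: 5000
# snapshot intermediate results
snapshot: 4000
snapshot_prefix: "examples/cifar10/cifar10_quick"
# solver mode: CPU or GPU
solver_mode: CPU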
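
To see how a multistep schedule can reproduce the two-stage rates quoted earlier (0.001 up to iteration 4000, then 0.0001), here is a small sketch. It assumes gamma: 0.1 and a single stepvalue: 4000, which are values chosen for this illustration, and multistep_learning_rate is a hypothetical helper rather than Caffe code. Under the multistep policy the rate is base_lr * gamma^k, where k is the number of step values already reached.

```python
from bisect import bisect_right


def multistep_learning_rate(base_lr, gamma, stepvalues, iteration):
    """Learning rate after multiplying by gamma once per step value reached."""
    # Count the step values <= iteration (stepvalues must be sorted ascending).
    steps_reached = bisect_right(stepvalues, iteration)
    return base_lr * gamma ** steps_reached


# Assumed settings: drop the rate by 10x at iteration 4000.
for it in (0, 3999, 4000, 4999):
    print(it, multistep_learning_rate(0.001, 0.1, [4000], it))
```

With a merged solver of this kind, max_iter would have to cover both stages (5000) so that training continues past the step.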

Run the example:

(caffe_src) root@ranxf-TEST:/workdisk/caffe# time sh examples/cifar10/train_quick.sh
I0912 16:23:04.298250  9363 caffe.cpp:197] Use CPU.
I0912 16:23:04.298424  9363 solver.cpp:45] Initializing solver from parameters: 
test_iter: 100
test_interval: 500
base_lr: 0.001
display: 100
max_iter: 4000
lr_policy: "fixed"
momentum: 0.9
weight_decay: 0.004
snapshot: 4000
snapshot_prefix: "examples/cifar10/cifar10_quick"
solver_mode: CPU
………………

I0912 17:10:29.430344 10100 solver.cpp:474] Snapshotting to HDF5 file examples/cifar10/cifar10_quick_iter_5000.caffemodel.h5
I0912 17:10:29.526800 10100 sgd_solver.cpp:296] Snapshotting solver state to HDF5 file examples/cifar10/cifar10_quick_iter_5000.solverstate.h5
I0912 17:10:29.745208 10100 solver.cpp:327] Iteration 5000, loss = 0.480207
I0912 17:10:29.745240 10100 solver.cpp:347] Iteration 5000, Testing net (#0)
I0912 17:10:49.806242 10103 data_layer.cpp:73] Restarting data prefetching from start.
I0912 17:10:50.642014 10100 solver.cpp:414]     Test net output #0: accuracy = 0.7558
I0912 17:10:50.642050 10100 solver.cpp:414]     Test net output #1: loss = 0.739888 (* 1 = 0.739888 loss)
I0912 17:10:50.642055 10100 solver.cpp:332] Optimization Done.
I0912 17:10:50.642061 10100 caffe.cpp:250] Optimization Done.

real    47m46.393s
user    47m50.689s
sys    0m0.312s

On the CPU this takes about 48 minutes, and the accuracy is about 75%.

The GPU run, for comparison:

real    2m6.112s
user    1m23.442s
sys    0m21.771s
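
To put the timings side by side, the "real" lines printed by time can be converted to seconds and divided. This is only a rough indication, since the CPU and GPU runs shown above were not necessarily made on the same machine; parse_real_time is an illustrative helper.

```python
import re


def parse_real_time(line: str) -> float:
    """Convert a line such as 'real    12m9.863s' to seconds."""
    match = re.fullmatch(r"\s*real\s+(\d+)m(\d+(?:\.\d+)?)s\s*", line)
    if match is None:
        raise ValueError(f"not a 'real' timing line: {line!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Timing lines copied from the runs above.
mnist_speedup = parse_real_time("real    12m9.863s") / parse_real_time("real    0m47.216s")
cifar_speedup = parse_real_time("real    47m46.393s") / parse_real_time("real    2m6.112s")
print(f"MNIST: {mnist_speedup:.1f}x, CIFAR-10 quick: {cifar_speedup:.1f}x")
```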

 

