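Caffe2 Tutorial for Beginners (Python Version)

Reposted 2017-10-17 15:35. Tags: ai, caffe2

Learning approach

1. Start with the official documentation and learn how to call the caffe2 package from Python, including: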

Basics of Caffe2 - Workspaces, Operators, and Nets
Toy Regression
Image Pre-Processing
Loading Pre-Trained Models
MNIST - Create a CNN from Scratch

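The official Caffe2 tutorials are written mainly for Python and show how to call Caffe2 from it. They cover, in order: the core classes and concepts of Caffe2, how to build a small network from the basic classes, how to preprocess data, how to use a pre-trained model, and how to construct a more complex network. Beginners can start with the official caffe2-tutorials to understand how Caffe2 approaches network construction and training, and to get a feel for how Caffe2 differs from Caffe in overall design.

2. Read the Caffe2 source alongside to see which C++ classes Python actually calls

In Python, most classes and functions in the caffe2 package wrap C++ sources under caffe2/caffe2/core, such as the basic data class Tensor and the operation class Operator. Starting from how a class is used in Python, you can locate the construction and implementation of the corresponding C++ classes and functions, which prepares you for building and training networks directly in C++.

The summary below is based on the official documentation and some online material.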

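Fundamentals

Let's start from our own point of view. Suppose we had to write and train a simple multi-layer neural network ourselves. Logically, we would need to think about how data is defined, how it flows, and how it is updated.

How data is defined: in what form the training data and network parameters are stored
How data flows: which computations turn the training data into outputs, which is the definition of the network
How data is updated: which gradient update method and hyperparameters are used, which is how training is done

In Caffe, data is stored in instances of the Blob class. A blob can be thought of as a NumPy array: its job is to hold data. Input blobs are passed forward through a sequence of layers to produce output blobs, and the layer is the basic unit of computation in Caffe. Each layer defines its own computation, so data is transformed as it passes through each one. Layers combined together form a net, which is essentially a computation graph. Once the forward data flow has been built, the way gradients are computed in the backward pass is determined as well. On top of this, Caffe uses the Solver class to specify the gradient update rule: under the solver's control, the network repeatedly runs the forward pass, runs the backward pass to compute gradients, and uses the gradients to update the weights.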

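Correspondingly, Caffe has four basic building blocks:

blob: stores data and weights
layer: takes blobs as input and produces blobs as output; the layer defines the computation
net: composed of multiple layers, forming the whole network
solver: defines the training rules

Now look at Caffe2:

In Caffe2, the operator is the distinctive feature: it replaces Caffe's layer as the basic building block of a net. As the figure below shows, a single inner-product operator (FC, used later in this article) can do the job of Caffe's InnerProductLayer. The operator interface is defined in caffe2/proto/caffe2.proto. Generally speaking, an operator takes a list of inputs and produces a list of outputs.

[Figure: operator]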

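Because the operator abstraction is so basic and general, weight initialization, the forward pass, the backward pass, and gradient updates can all be expressed as operators in Caffe2, so solver and layer classes are not needed. The corresponding basic building blocks in Caffe2 are:

blob: stores data
operator: takes blobs as input, produces blobs as output, and defines the computation
net: composed of multiple operators
workspace: has no counterpart in Caffe; it can be thought of as a namespace for variables that makes nets and variables easier to manage

The details of how to use and understand them follow, starting with Python:

Before using Caffe2 we import core and workspace from caffe2.python, which contain the basic classes and functions. We also import caffe2_pb2 from caffe2.proto to work with protobuf objects where needed.

# We'll also import a few standard python libraries
from matplotlib import pyplot
import numpy as np
import time

# These are the droids you are looking for.
from caffe2.python import core, workspace
from caffe2.proto import caffe2_pb2

# Let's show all plots inline.
%matplotlib inline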

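1. Workspace

A workspace can be thought of as the variable storage area in MATLAB. Blobs and nets that have been defined can all be placed in a single workspace, or separated into different workspaces.

Let's print the blobs in the current workspace. Blobs() lists the blobs, and HasBlob(name) checks whether a blob with the given name exists.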

print("Current blobs in the workspace: {}".format(workspace.Blobs()))
print("Workspace has blob 'X'? {}".format(workspace.HasBlob("X")))

At the start, of course, there is nothing.

We use FeedBlob to add a blob to the current workspace, then print again:

X = np.random.randn(2, 3).astype(np.float32)
print("Generated X from numpy:\n{}".format(X))
workspace.FeedBlob("X", X)

Generated X from numpy:
[[-0.56927377 -1.28052795 -0.95808828]
 [-0.44225693 -0.0620895  -0.50509363]]

print("Current blobs in the workspace: {}".format(workspace.Blobs()))
print("Workspace has blob 'X'? {}".format(workspace.HasBlob("X")))
print("Fetched X:\n{}".format(workspace.FetchBlob("X")))

Current blobs in the workspace: [u'X']
Workspace has blob 'X'? True
Fetched X:
[[-0.56927377 -1.28052795 -0.95808828]
 [-0.44225693 -0.0620895  -0.50509363]]

We can also define multiple workspaces under different names and switch between them. CurrentWorkspace() returns the current workspace, and SwitchWorkspace(name) switches to another one.

print("Current workspace: {}".format(workspace.CurrentWorkspace()))
print("Current blobs in the workspace: {}".format(workspace.Blobs()))

# Switch the workspace. The second argument "True" means creating
# the workspace if it is missing.
workspace.SwitchWorkspace("gutentag", True)

# Let's print the current workspace. Note that there is nothing in the
# workspace yet.
print("Current workspace: {}".format(workspace.CurrentWorkspace()))
print("Current blobs in the workspace: {}".format(workspace.Blobs()))

Current workspace: default
Current blobs in the workspace: ['X']
Current workspace: gutentag
Current blobs in the workspace: []

To summarize, a workspace works like the MATLAB workspace: variables are stored in it, and nets and blobs are accessed through it.
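The workspace behaviour described above can be pictured with a toy model in plain Python. This is only a sketch of the semantics (named blobs grouped under named workspaces), not how Caffe2 implements it. The class ToyWorkspaces and its method names are hypothetical and merely mirror the workspace functions used above.

```python
class ToyWorkspaces:
    """Hypothetical stand-in: named workspaces, each holding named blobs."""

    def __init__(self):
        self._spaces = {"default": {}}
        self._current = "default"

    def current_workspace(self):
        return self._current

    def switch_workspace(self, name, create_if_missing=False):
        # Mirrors the idea of SwitchWorkspace(name, True): create on demand.
        if name not in self._spaces:
            if not create_if_missing:
                raise KeyError("no such workspace: " + name)
            self._spaces[name] = {}
        self._current = name

    def feed_blob(self, name, value):
        self._spaces[self._current][name] = value

    def fetch_blob(self, name):
        return self._spaces[self._current][name]

    def has_blob(self, name):
        return name in self._spaces[self._current]

    def blobs(self):
        return sorted(self._spaces[self._current])


ws = ToyWorkspaces()
ws.feed_blob("X", [[1.0, -2.0, 3.0]])
blobs_in_default = ws.blobs()

# A freshly created workspace starts out empty.
ws.switch_workspace("gutentag", create_if_missing=True)
blobs_in_gutentag = ws.blobs()

# Switching back shows that the blobs of the first workspace were kept.
ws.switch_workspace("default")
still_there = ws.has_blob("X")
```

The point of the sketch is that switching workspaces changes which set of names is visible; it does not copy or delete any blob.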

2. Operators

In Python, an operator can be created directly with core.CreateOperator, through core.Net, or through ModelHelper. Here we use core.CreateOperator to get a basic understanding of operators. In practice we do not create every operator by hand when building a network, since that is too tedious; ModelHelper is normally used to help build the network.

# Create an operator.
op = core.CreateOperator(
    "Relu", # The type of operator that we want to run
    ["X"], # A list of input blobs by their names
    ["Y"], # A list of output blobs by their names
)
# and we are done!

The code above creates a Relu operator. Note that creating an operator in Python only defines it; the operator has not been run. The op created above is actually a protobuf object.

print("Type of the created op is: {}".format(type(op)))
print("Content:\n")
print(str(op))

Type of the created op is: <class 'caffe2.proto.caffe2_pb2.OperatorDef'>
Content:

input: "X"
output: "Y"
name: ""
type: "Relu"

After creating the op, we add the input X to the current workspace and run the operator with RunOperatorOnce. Then we compare the result with the expected value.

workspace.FeedBlob("X", np.random.randn(2, 3).astype(np.float32))
workspace.RunOperatorOnce(op)

print("Current blobs in the workspace: {}\n".format(workspace.Blobs()))
print("X:\n{}\n".format(workspace.FetchBlob("X")))
print("Y:\n{}\n".format(workspace.FetchBlob("Y")))
print("Expected:\n{}\n".format(np.maximum(workspace.FetchBlob("X"), 0)))

Current blobs in the workspace: ['X', 'Y']

X:
[[ 1.03125858  1.0038228   0.0066975 ]
 [ 1.33142471  1.80271244 -0.54222912]]

Y:
[[ 1.03125858  1.0038228   0.0066975 ]
 [ 1.33142471  1.80271244  0.        ]]

Expected:
[[ 1.03125858  1.0038228   0.0066975 ]
 [ 1.33142471  1.80271244  0.        ]]
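The split between defining an operator and running it can be illustrated without Caffe2. The sketch below treats an operator definition as pure data and applies a kernel only when it is run. ToyOperatorDef, KERNELS, and run_operator_once are hypothetical names chosen for this illustration and are not part of the Caffe2 API.

```python
from collections import namedtuple

# Hypothetical stand-in for an operator definition: data only, no computation.
ToyOperatorDef = namedtuple("ToyOperatorDef", ["type", "inputs", "outputs"])


def relu(rows):
    """Element-wise max(v, 0) over a list of rows."""
    return [[max(v, 0.0) for v in row] for row in rows]


# Maps an operator type name to the function that implements it.
KERNELS = {"Relu": relu}


def run_operator_once(op, blobs):
    """Look up the kernel for op.type, apply it, and store the output blob."""
    args = [blobs[name] for name in op.inputs]
    blobs[op.outputs[0]] = KERNELS[op.type](*args)


blobs = {"X": [[1.5, -0.5, 0.0], [-2.0, 3.0, -0.1]]}
op = ToyOperatorDef("Relu", ["X"], ["Y"])

# Defining the operator has not produced any output yet.
output_exists_before_run = "Y" in blobs

run_operator_once(op, blobs)
```

After the run, blobs["Y"] holds the input with every negative entry replaced by zero, which is the same comparison the tutorial makes against np.maximum.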

In addition, an operator is more abstract than a layer. Operators not only replace the layer class, they can also take no inputs at all and still produce outputs, which makes them useful for generating data. This is commonly used to initialize weights, as in the following snippet.

op = core.CreateOperator(
    "GaussianFill",
    [], # GaussianFill does not need any parameters.
    ["W"],
    shape=[100, 100], # shape argument as a list of ints.
    mean=1.0, # mean as a single float
    std=1.0, # std as a single float
)
print("Content of op:\n")
print(str(op))

Content of op:

output: "W"
name: ""
type: "GaussianFill"
arg {
  name: "std"
  f: 1.0
}
arg {
  name: "shape"
  ints: 100
  ints: 100
}
arg {
  name: "mean"
  f: 1.0
}

workspace.RunOperatorOnce(op)
# The operator above writes its output to the blob "W", so that is the blob to fetch.
temp = workspace.FetchBlob("W")
pyplot.hist(temp.flatten(), bins=50)
pyplot.title("Distribution of W")

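3. Nets

A net is a collection of operators; in essence it is a computation graph made of operators. In Caffe2, core.Net wraps the NetDef class from the source. As an example, we create a network that implements the following formula.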

X = np.random.randn(2, 3)
W = np.random.randn(5, 3)
b = np.ones(5)
Y = X * W^T + b
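To make the shapes in this formula concrete, here is a small plain-Python sketch of the same computation. It is not the Caffe2 FC operator; the function fc and the small integer-valued matrices are assumptions chosen so the result can be checked by hand. X is 2 x 3, W is 5 x 3, b has 5 entries, so Y is 2 x 5.

```python
def fc(X, W, b):
    """Compute Y = X * W^T + b, i.e. Y[i][j] = sum_k X[i][k] * W[j][k] + b[j]."""
    return [
        [sum(x_k * w_k for x_k, w_k in zip(x_row, w_row)) + b_j
         for w_row, b_j in zip(W, b)]
        for x_row in X
    ]


# 2 x 3 input: two samples with three features each.
X = [[1.0, 2.0, 3.0],
     [0.0, -1.0, 1.0]]

# 5 x 3 weights: one row per output unit.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 1.0, 1.0],
     [0.5, 0.5, 0.5]]

# One bias per output unit, all ones as in the formula above.
b = [1.0] * 5

Y = fc(X, W, b)
shape_of_Y = (len(Y), len(Y[0]))
```

Each row of W is multiplied against a row of X, which is why the weight matrix is stored as 5 x 3 rather than 3 x 5.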

First create the network:

net = core.Net("my_first_net")
print("Current network proto:\n\n{}".format(net.Proto()))

Current network proto:

name: "my_first_net"

Next, generate the weights and the input. Here the operators are created through core.Net:

X = net.GaussianFill([], ["X"], mean=0.0, std=1.0, shape=[2, 3], run_once=0)
print("New network proto:\n\n{}".format(net.Proto()))
W = net.GaussianFill([], ["W"], mean=0.0, std=1.0, shape=[5, 3], run_once=0)
b = net.ConstantFill([], ["b"], shape=[5,], value=1.0, run_once=0)

Generate the output:

Y = net.FC([X, W, b], ["Y"])

Let's print the current network:

print("Current network proto:\n\n{}".format(net.Proto()))

Current network proto:

name: "my_first_net"
op {
  output: "X"
  name: ""
  type: "GaussianFill"
  arg {
    name: "std"
    f: 1.0
  }
  arg {
    name: "run_once"
    i: 0
  }
  arg {
    name: "shape"
    ints: 2
    ints: 3
  }
  arg {
    name: "mean"
    f: 0.0
  }
}
op {
  output: "W"
  name: ""
  type: "GaussianFill"
  arg {
    name: "std"
    f: 1.0
  }
  arg {
    name: "run_once"
    i: 0
  }
  arg {
    name: "shape"
    ints: 5
    ints: 3
  }
  arg {
    name: "mean"
    f: 0.0
  }
}
op {
  output: "b"
  name: ""
  type: "ConstantFill"
  arg {
    name: "run_once"
    i: 0
  }
  arg {
    name: "shape"
    ints: 5
  }
  arg {
    name: "value"
    f: 1.0
  }
}
op {
  input: "X"
  input: "W"
  input: "b"
  output: "Y"
  name: ""
  type: "FC"
}

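We can also draw the network we have defined:

from caffe2.python import net_drawer
from IPython import display
graph = net_drawer.GetPydotGraph(net, rankdir="LR")
display.Image(graph.create_png(), width=800)

As with operators, we have only defined a net here; its computation has not been run. When a net is run from Python, two things happen at the C++ level:

A C++ net object is instantiated from the protobuf definition
The Run function of the instantiated net is called

There are two ways to run a net from Python:

Method 1: use workspace.RunNetOnce, which instantiates the net, runs it, and then destroys it.
Method 2: first instantiate the net with workspace.CreateNet, then run it with workspace.RunNet.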

Method 1:

workspace.ResetWorkspace()
print("Current blobs in the workspace: {}".format(workspace.Blobs()))
workspace.RunNetOnce(net)
print("Blobs in the workspace after execution: {}".format(workspace.Blobs()))
# Let's dump the contents of the blobs
for name in workspace.Blobs():
    print("{}:\n{}".format(name, workspace.FetchBlob(name)))

Current blobs in the workspace: []
Blobs in the workspace after execution: ['W', 'X', 'Y', 'b']
W:
[[-0.29295802  0.02897477 -1.25667715]
 [-1.82299471  0.92877913  0.33613944]
 [-0.64382178 -0.68545657 -0.44015241]
 [ 1.10232282  1.38060772 -2.29121733]
 [-0.55766547  1.97437167  0.39324901]]
X:
[[-0.47522315 -0.40166432  0.7179445 ]
 [-0.8363331  -0.82451206  1.54286408]]
Y:
[[ 0.22535783  1.73460138  1.2652775  -1.72335696  0.7543118 ]
 [-0.71776152  2.27745867  1.42452145 -4.59527397  0.4452306 ]]
b:
[ 1.  1.  1.  1.  1.]

Method 2:

workspace.ResetWorkspace()
print("Current blobs in the workspace: {}".format(workspace.Blobs()))
workspace.CreateNet(net)
workspace.RunNet(net.Proto().name)
print("Blobs in the workspace after execution: {}".format(workspace.Blobs()))
for name in workspace.Blobs():
    print("{}:\n{}".format(name, workspace.FetchBlob(name)))

Current blobs in the workspace: []
Blobs in the workspace after execution: ['W', 'X', 'Y', 'b']
W:
[[-0.29295802  0.02897477 -1.25667715]
 [-1.82299471  0.92877913  0.33613944]
 [-0.64382178 -0.68545657 -0.44015241]
 [ 1.10232282  1.38060772 -2.29121733]
 [-0.55766547  1.97437167  0.39324901]]
X:
[[-0.47522315 -0.40166432  0.7179445 ]
 [-0.8363331  -0.82451206  1.54286408]]
Y:
[[ 0.22535783  1.73460138  1.2652775  -1.72335696  0.7543118 ]
 [-0.71776152  2.27745867  1.42452145 -4.59527397  0.4452306 ]]
b:
[ 1.  1.  1.  1.  1.]

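You may wonder why there are two ways to run a net. The reason becomes clear in the regression example later in this article; for now, just remember that both exist.

To summarize, in Caffe2:

The workspace is the working area; it stores Net objects (network structure) and Blob objects (data).
Input data, weights, and output data are all stored in blobs.
The Operator class defines how data is computed. Multiple operators make up a Net, and operators cover initialization and updates as well as forward computation.
A Net is the whole formed by its operators.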

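Examples

The fundamentals section covered the basic concepts of workspace, operator, and net. Here we walk through a few simple examples based on the official Caffe2 documentation.

Example 1: a toy regression

The first example illustrates the overall ideas of the Caffe2 framework: building a network, initializing parameters, training, and the role of graphs.

Suppose we want to train a simple network to fit the following regression function:

y = wx + b
where w = [2.0, 1.5] and b = 0.5

Training data is normally read from an external source. Here we generate it directly with Caffe2 operators; later examples show how to read data from outside.

First import the required packages:

from caffe2.python import core, cnn, net_drawer, workspace, visualize
import numpy as np
from IPython import display
from matplotlib import pyplot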

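We first need to build two network graphs:

One that generates the training data and initializes the weights
One that performs training and updates the gradients

Caffe2's approach here differs from Caffe's. In Caffe, the parameter initialization is specified in the training network definition; when the network is loaded, the program initializes the weights automatically according to that definition, and we only need to let the solver run forward and backward passes repeatedly to update the parameters. In Caffe2, network construction, initialization, gradient generation, and gradient updates are all expressed with operators, and every piece of data that is generated or flows through the model has to appear in a graph. Initialization therefore needs operators of its own. The net formed by these operators is kept separate and is called the initialization network. The code makes this concrete.

First, we create a network that generates the training data and initializes the weights.

init_net = core.Net("init")
# The ground truth parameters.
W_gt = init_net.GivenTensorFill(
    [], "W_gt", shape=[1, 2], values=[2.0, 1.5])
B_gt = init_net.GivenTensorFill([], "B_gt", shape=[1], values=[0.5])
# Constant value ONE is used in weighted sum when updating parameters.
ONE = init_net.ConstantFill([], "ONE", shape=[1], value=1.)
# ITER is the iterator count.
ITER = init_net.ConstantFill([], "ITER", shape=[1], value=0, dtype=core.DataType.INT32)

# For the parameters to be learned: we randomly initialize weight
# from [-1, 1] and init bias with 0.0.
W = init_net.UniformFill([], "W", shape=[1, 2], min=-1., max=1.)
B = init_net.ConstantFill([], "B", shape=[1], value=0.0)
print('Created init net.')

Next, we define a network for training.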

train_net = core.Net("train")
# First, we generate random samples of X and create the ground truth.
X = train_net.GaussianFill([], "X", shape=[64, 2], mean=0.0, std=1.0, run_once=0)
Y_gt = X.FC([W_gt, B_gt], "Y_gt")
# We add Gaussian noise to the ground truth
noise = train_net.GaussianFill([], "noise", shape=[64, 1], mean=0.0, std=1.0, run_once=0)
Y_noise = Y_gt.Add(noise, "Y_noise")
# Note that we do not need to propagate the gradients back through Y_noise,
# so we mark StopGradient to notify the auto differentiating algorithm
# to ignore this path.
Y_noise = Y_noise.StopGradient([], "Y_noise")

# Now, for the normal linear regression prediction, this is all we need.
Y_pred = X.FC([W, B], "Y_pred")

# The loss function is computed by a squared L2 distance, and then averaged
# over all items in the minibatch.
dist = train_net.SquaredL2Distance([Y_noise, Y_pred], "dist")
loss = dist.AveragedLoss([], ["loss"])

Let's draw the training network we have defined:

graph = net_drawer.GetPydotGraph(train_net.Proto().op, "train", rankdir="LR")
display.Image(graph.create_png(), width=800)

Putting the two nets together: init_net produces the ground-truth parameters W_gt and B_gt, the initial weights W and B, and the constants ONE and ITER needed during computation, while train_net generates the random samples and builds the forward pass.

We have not yet defined the backward pass. Like many other deep learning frameworks, Caffe2 supports automatic differentiation and generates the gradient operators automatically.

Next, we add the gradient operators to train_net:

# Get gradients for all the computations above.
gradient_map = train_net.AddGradientOperators([loss])
graph = net_drawer.GetPydotGraph(train_net.Proto().op, "train", rankdir="LR")
display.Image(graph.create_png(), width=800)

We can see that the second half of the network computes gradients and outputs the gradient of each learnable parameter. Once we have these gradients and the learning rate for the current iteration, we can update the parameters by gradient descent.

Next, we add the SGD update to train_net:

# Increment the iteration by one.
train_net.Iter(ITER, ITER)
# Compute the learning rate that corresponds to the iteration.
LR = train_net.LearningRate(ITER, "LR", base_lr=-0.1,
                            policy="step", stepsize=20, gamma=0.9)
# Weighted sum
train_net.WeightedSum([W, ONE, gradient_map[W], LR], W)
train_net.WeightedSum([B, ONE, gradient_map[B], LR], B)
# Let's show the graph again.
graph = net_drawer.GetPydotGraph(train_net.Proto().op, "train", rankdir="LR")
display.Image(graph.create_png(), width=800)
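The arithmetic behind the update above can be sketched in plain Python. The sketch assumes the "step" policy computes base_lr * gamma ^ floor(iter / stepsize), and that WeightedSum adds up value * weight pairs, as the code above uses it. The function names step_learning_rate and weighted_sum are hypothetical; they are not Caffe2 calls.

```python
def step_learning_rate(iteration, base_lr=-0.1, stepsize=20, gamma=0.9):
    """Assumed form of the 'step' policy: decay by gamma every stepsize iterations."""
    return base_lr * gamma ** (iteration // stepsize)


def weighted_sum(pairs):
    """Sum of value * weight over (value, weight) pairs."""
    return sum(value * weight for value, weight in pairs)


lr_at_0 = step_learning_rate(0)     # no decay applied yet
lr_at_19 = step_learning_rate(19)   # still inside the first step
lr_at_20 = step_learning_rate(20)   # decayed once

# One parameter update: w_new = w * ONE + grad * LR.
# Because the learning rate is negative, a positive gradient lowers w,
# which is what gradient descent requires.
w, grad, ONE = 1.0, 0.5, 1.0
w_new = weighted_sum([(w, ONE), (grad, lr_at_0)])
```

This is why base_lr is written as -0.1 rather than 0.1: the sign of the descent step is carried by the learning rate, so the update itself can be a plain weighted sum.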

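At this point the model's parameter initialization, forward pass, backward pass, and gradient update are all defined with operators. This is the strength of operators in Caffe2, and it makes Caffe2 considerably more flexible than Caffe. Note that we have only defined the networks; we have not run them yet. Let's run them now: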

workspace.RunNetOnce(init_net)
workspace.CreateNet(train_net)
print("Before training, W is: {}".format(workspace.FetchBlob("W")))
print("Before training, B is: {}".format(workspace.FetchBlob("B")))

True
Before training, W is: [[-0.77634162 -0.88467366]]
Before training, B is: [ 0.]

# run the train net 100 times
for i in range(100):
    workspace.RunNet(train_net.Proto().name)

print("After training, W is: {}".format(workspace.FetchBlob("W")))
print("After training, B is: {}".format(workspace.FetchBlob("B")))
print("Ground truth W is: {}".format(workspace.FetchBlob("W_gt")))
print("Ground truth B is: {}".format(workspace.FetchBlob("B_gt")))

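Note that we used two different ways to run the networks, RunNetOnce and RunNet. Recall the two methods:

Method 1: use workspace.RunNetOnce, which instantiates the net, runs it, and then destroys it.
Method 2: first instantiate the net with workspace.CreateNet, then run it with workspace.RunNet.

At first I did not understand why there are two ways to run a net either, but with init_net and train_net side by side it becomes clear. RunNetOnce is for nets that generate weights and data, typically for initialization. Such a net is run once; its outputs (weights or data) remain in the current workspace, so the net itself is no longer needed and is destroyed right away. RunNet is for running a net repeatedly, as in training: call CreateNet once at the start, then call RunNet in each iteration to keep updating the parameters.

The training results are as follows:

After training, W is: [[ 1.95769441  1.47348857]]
After training, B is: [ 0.45236012]
Ground truth W is: [[ 2.   1.5]]
Ground truth B is: [ 0.5]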
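For readers who want to see the whole loop without Caffe2, the following plain-Python sketch repeats the same experiment: a one-off initialization followed by 100 iterations of mini-batch gradient descent with the step learning-rate schedule. It is an illustration, not a reimplementation of the nets above. The function name train_toy_regression and the seed are hypothetical, and the loss is assumed to be half the squared error averaged over the batch.

```python
import random


def train_toy_regression(iterations=100, batch_size=64, seed=0):
    rng = random.Random(seed)
    w_gt, b_gt = [2.0, 1.5], 0.5  # ground truth from the example above

    # "init" phase, run once: random weights in [-1, 1], zero bias.
    w = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    b = 0.0

    # "train" phase, run repeatedly.
    for it in range(iterations):
        lr = 0.1 * 0.9 ** (it // 20)  # step schedule: stepsize=20, gamma=0.9
        grad_w, grad_b = [0.0, 0.0], 0.0
        for _ in range(batch_size):
            x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
            noise = rng.gauss(0.0, 1.0)
            y = w_gt[0] * x[0] + w_gt[1] * x[1] + b_gt + noise
            err = (w[0] * x[0] + w[1] * x[1] + b) - y
            # Gradient of 0.5 * err**2 (assumed loss), averaged over the batch.
            grad_w[0] += err * x[0] / batch_size
            grad_w[1] += err * x[1] / batch_size
            grad_b += err / batch_size
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b


w, b = train_toy_regression()
print("After training, w is: {}".format(w))
print("After training, b is: {}".format(b))
```

Because fresh noisy samples are drawn every iteration, the learned values land near [2.0, 1.5] and 0.5 rather than exactly on them, which matches the behaviour of the Caffe2 run above.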

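To summarize:

Caffe2 uses operators for parameter initialization, the forward pass, the backward pass, and gradient updates
A Caffe2 model usually consists of an initialization network and a training network

One last point: in this example we built the networks directly from operators. For common deep networks, building everything from operators is very tedious, so Caffe2 provides the model_helper module to simplify network construction. For the common layers of a convolutional neural network, for instance, we can use model_helper directly. Later examples show this.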

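Example 2: image preprocessing

As is well known, training a network requires a series of data preprocessing steps, and Caffe and Caffe2 handle this the same way: both go through XXX and similar steps. Since there is no difference, no example is given here; refer to the official Image Pre-Processing tutorial, which explains it very clearly.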

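Example 3: loading a pre-trained model

First, we use the download module that ships with Caffe2 to fetch a pre-trained model. The following command downloads the pre-trained squeezenet model:

python -m caffe2.python.models.download -i squeezenet

When the download finishes, there is a squeezenet folder under caffe2/python/models. It contains two files, init_net.pb and predict_net.pb, which store the weights and the network definition respectively.

In Python we use the Caffe2 workspace to hold the model's network definition and weights, loading them into blobs, init_net, and predict_net. A workspace.Predictor takes the two protobufs, and Caffe2 handles the rest.

Loading a model for prediction therefore takes only a few steps:

1. Read the protobuf files:

with open("init_net.pb", "rb") as f:
    init_net = f.read()
with open("predict_net.pb", "rb") as f:
    predict_net = f.read()

2. Use the Predictor in workspace to load the blobs read from the protobufs:

p = workspace.Predictor(init_net, predict_net)

3. Run the network and get the result:

results = p.run([img])

Note that img here is an image that has already been preprocessed.

Below is a complete example from the official documentation:

First set up the file paths and import the usual packages: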

# where you installed caffe2. Probably '~/caffe2' or '~/src/caffe2'.
CAFFE2_ROOT = "~/caffe2"
# assumes being a subdirectory of caffe2
CAFFE_MODELS = "~/caffe2/caffe2/python/models"
# if you have a mean file, place it in the same dir as the model

%matplotlib inline
from caffe2.proto import caffe2_pb2
import numpy as np
import skimage.io
import skimage.transform
from matplotlib import pyplot
import os
from caffe2.python import core, workspace
import urllib2
print("Required modules imported.")

IMAGE_LOCATION = "https://cdn.pixabay.com/photo/2015/02/10/21/28/flower-631765_1280.jpg"

# What model are we using? You should have already converted or downloaded one.
# format below is the model's:
# folder, INIT_NET, predict_net, mean, input image size
# you can switch the comments on MODEL to try out different model conversions
MODEL = 'squeezenet', 'init_net.pb', 'predict_net.pb', 'ilsvrc_2012_mean.npy', 227

# codes - these help decypher the output and source from a list from AlexNet's object codes to provide an result like "tabby cat" or "lemon" depending on what's in the picture you submit to the neural network.
# The list of output codes for the AlexNet models (also squeezenet)
codes = "https://gist.githubusercontent.com/aaronmarkham/cd3a6b6ac071eca6f7b4a6e40e6038aa/raw/9edb4038a37da6b5a44c3b5bc52e448ff09bfe5b/alexnet_codes"
print "Config set!"

Define the preprocessing functions:

def crop_center(img, cropx, cropy):
    y, x, c = img.shape
    startx = x//2 - (cropx//2)
    starty = y//2 - (cropy//2)
    return img[starty:starty+cropy, startx:startx+cropx]

def rescale(img, input_height, input_width):
    print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!")
    print("Model's input shape is %dx%d" % (input_height, input_width))
    aspect = img.shape[1]/float(img.shape[0])
    print("Orginal aspect ratio: " + str(aspect))
    if(aspect>1):
        # landscape orientation - wide image
        res = int(aspect * input_height)
        imgScaled = skimage.transform.resize(img, (input_width, res))
    if(aspect<1):
        # portrait orientation - tall image
        res = int(input_width/aspect)
        imgScaled = skimage.transform.resize(img, (res, input_height))
    if(aspect == 1):
        imgScaled = skimage.transform.resize(img, (input_width, input_height))
    pyplot.figure()
    pyplot.imshow(imgScaled)
    pyplot.axis('on')
    pyplot.title('Rescaled image')
    print("New image shape:" + str(imgScaled.shape) + " in HWC")
    return imgScaled
print "Functions set."

# set paths and variables from model choice and prep image
CAFFE2_ROOT = os.path.expanduser(CAFFE2_ROOT)
CAFFE_MODELS = os.path.expanduser(CAFFE_MODELS)

# mean can be 128 or custom based on the model
# gives better results to remove the colors found in all of the training images
MEAN_FILE = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[3])
if not os.path.exists(MEAN_FILE):
    mean = 128
else:
    mean = np.load(MEAN_FILE).mean(1).mean(1)
    mean = mean[:, np.newaxis, np.newaxis]
print "mean was set to: ", mean

# some models were trained with different image sizes, this helps you calibrate your image
INPUT_IMAGE_SIZE = MODEL[4]

# make sure all of the files are around...
if not os.path.exists(CAFFE2_ROOT):
    print("Houston, you may have a problem.")
INIT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[1])
print 'INIT_NET = ', INIT_NET
PREDICT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[2])
print 'PREDICT_NET = ', PREDICT_NET
if not os.path.exists(INIT_NET):
    print(INIT_NET + " not found!")
else:
    print "Found ", INIT_NET, "...Now looking for", PREDICT_NET
    if not os.path.exists(PREDICT_NET):
        print "Caffe model file, " + PREDICT_NET + " was not found!"
    else:
        print "All needed files found! Loading the model in the next block."

# load and transform image
img = skimage.img_as_float(skimage.io.imread(IMAGE_LOCATION)).astype(np.float32)
img = rescale(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE)
img = crop_center(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE)
print "After crop: " , img.shape
pyplot.figure()
pyplot.imshow(img)
pyplot.axis('on')
pyplot.title('Cropped')

# switch to CHW
img = img.swapaxes(1, 2).swapaxes(0, 1)
pyplot.figure()
for i in range(3):
    # For some reason, pyplot subplot follows Matlab's indexing
    # convention (starting with 1). Well, we'll just follow it...
    pyplot.subplot(1, 3, i+1)
    pyplot.imshow(img[i])
    pyplot.axis('off')
    pyplot.title('RGB channel %d' % (i+1))

# switch to BGR
img = img[(2, 1, 0), :, :]

# remove mean for better results
img = img * 255 - mean

# add batch size
img = img[np.newaxis, :, :, :].astype(np.float32)
print "NCHW: ", img.shape

Running it produces the following output:

Functions set.
mean was set to:  128
INIT_NET = /home/aaron/models/squeezenet/init_net.pb
PREDICT_NET = /home/aaron/models/squeezenet/predict_net.pb
Found /home/aaron/models/squeezenet/init_net.pb ...Now looking for /home/aaron/models/squeezenet/predict_net.pb
All needed files found! Loading the model in the next block.
Original image shape:(751, 1280, 3) and remember it should be in H, W, C!
Model's input shape is 227x227
Orginal aspect ratio: 1.70439414115
New image shape:(227, 386, 3) in HWC
After crop: (227, 227, 3)
NCHW: (1, 3, 227, 227)

[Figure: image output]
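The shapes in this log follow directly from the arithmetic in rescale and crop_center. The plain-Python sketch below reproduces only that arithmetic, with no image library involved, using the 751 x 1280 image and the 227 x 227 model input from the log. The helper names rescaled_shape and crop_offsets are hypothetical.

```python
def rescaled_shape(height, width, input_height, input_width):
    """Target (rows, cols) passed to the resize call, following rescale above."""
    aspect = width / float(height)
    if aspect > 1:
        # landscape: fix the short side, let the width grow with the aspect ratio
        return (input_width, int(aspect * input_height))
    if aspect < 1:
        # portrait: fix the short side, let the height grow
        return (int(input_width / aspect), input_height)
    return (input_width, input_height)


def crop_offsets(height, width, cropy, cropx):
    """Top-left corner (starty, startx) of the centered crop, following crop_center."""
    startx = width // 2 - cropx // 2
    starty = height // 2 - cropy // 2
    return starty, startx


shape = rescaled_shape(751, 1280, 227, 227)
offsets = crop_offsets(shape[0], shape[1], 227, 227)
cropped = (min(227, shape[0] - offsets[0]), min(227, shape[1] - offsets[1]))
```

The rescaled image is 227 rows by 386 columns, and the crop starts 80 columns in from the left, so the final 227 x 227 window is centered horizontally.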

Once the image has been processed, we can load and run the network following the steps above.

# initialize the neural net
with open(INIT_NET, "rb") as f:
    init_net = f.read()
with open(PREDICT_NET, "rb") as f:
    predict_net = f.read()

p = workspace.Predictor(init_net, predict_net)

# run the net and return prediction
results = p.run([img])

# turn it into something we can play with and examine which is in a multi-dimensional array
results = np.asarray(results)
print "results shape: ", results.shape

results shape:  (1, 1, 1000, 1, 1)

The output contains 1000 values: the probabilities that the image belongs to each of the 1000 classes. We can pick the highest probability and look up its label:

# the rest of this is digging through the results
# np.delete with no axis flattens the array and drops the element at position 1
results = np.delete(results, 1)
index = 0
highest = 0
arr = np.empty((0,2), dtype=object)
arr[:,0] = int(10)
arr[:,1:] = float(10)
for i, r in enumerate(results):
    # imagenet index begins with 1!
    i=i+1
    arr = np.append(arr, np.array([[i,r]]), axis=0)
    if (r > highest):
        highest = r
        index = i

print index, " :: ", highest

# lookup the code and return the result
# top 3 results
# sorted(arr, key=lambda x: x[1], reverse=True)[:3]

# now we can grab the code list
response = urllib2.urlopen(codes)

# and lookup our result from the list
for line in response:
    code, result = line.partition(":")[::2]
    if (code.strip() == str(index)):
        print result.strip()[1:-2]

985 :: 0.979059
daisy
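The commented-out "top 3 results" line in the code above hints at a more general lookup. A minimal plain-Python sketch of that idea follows; the function top_k and the five made-up probabilities are assumptions for illustration only, and indices here are 0-based positions in the list.

```python
def top_k(probabilities, k=3):
    """Return (index, probability) pairs for the k largest entries, best first."""
    indexed = list(enumerate(probabilities))
    indexed.sort(key=lambda pair: pair[1], reverse=True)
    return indexed[:k]


# Made-up scores for five classes.
probs = [0.01, 0.7, 0.05, 0.2, 0.04]
best = top_k(probs, k=3)
best_index, best_probability = best[0]
```

Keeping the index paired with the probability before sorting is what lets the class label be looked up afterwards.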

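Example 4: creating a CNN model

1. Models, helper functions, and brew

So far we have covered the basic Caffe2 operations in Python.

In this example we build a simple CNN model. One point needs clarifying first:

In Caffe, a "model" usually means a single network, one Net.
In Caffe2, we usually use ModelHelper to represent a model, and this model contains multiple nets. As we saw earlier, there is an initialization network init_net and a training network net, and both graphs are part of the model.

Keep this distinction in mind, otherwise it is easy to get confused. For example, suppose we build a model with only one FC layer. We use ModelHelper to represent the model and operators to construct the network. A model generally has a param_init_net and a net, used for model initialization and training respectively:

model = model_helper.ModelHelper(name="train")
# initialize your weight
weight = model.param_init_net.XavierFill(
    [],
    blob_out + '_w',
    shape=[dim_out, dim_in],
    **kwargs  # maybe indicating weight should be on GPU here
)
# initialize your bias
bias = model.param_init_net.ConstantFill(
    [],
    blob_out + '_b',
    shape=[dim_out, ],
    **kwargs
)
# finally building FC
model.net.FC([blob_in, weight, bias], blob_out, **kwargs)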

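As mentioned earlier, in day-to-day work we do not build networks entirely from operators: every parameter would have to be initialized by hand and every operator constructed explicitly, which is too tedious. The natural idea is to wrap the operators that make up a common layer into a single function, so that we only pass in the necessary arguments and the function takes care of initializing the layer and constructing its operators.

To make network construction easier, Caffe2 provides many helper functions in python/helpers. The FC layer in the example above can be built with python/helpers/fc.py in a single line:

fcLayer = fc(model, blob_in, blob_out, **kwargs) # returns a blob reference

The helper function automatically splits weight initialization and computation into the two networks, which simplifies things a lot. For easier use and management, Caffe2 gathers these helper functions in the brew module, and they can be called by importing brew. The FC layer above then becomes:

from caffe2.python import brew
brew.fc(model, blob_in, blob_out, ...)

Building a network with brew is very simple. The following code constructs a LeNet model:

from caffe2.python import brew

def AddLeNetModel(model, data):
    conv1 = brew.conv(model, data, 'conv1', 1, 20, 5)
    pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2)
    conv2 = brew.conv(model, pool1, 'conv2', 20, 50, 5)
    pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2)
    fc3 = brew.fc(model, pool2, 'fc3', 50 * 4 * 4, 500)
    fc3 = brew.relu(model, fc3, fc3)
    pred = brew.fc(model, fc3, 'pred', 500, 10)
    softmax = brew.softmax(model, pred, 'softmax')

The helper functions in brew greatly simplify network construction. They are only wrappers, though: the underlying principle is the same as building the network from operators, as described earlier.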
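The value 50 * 4 * 4 passed to the first fc layer comes from tracking the image size through the convolution and pooling layers. The sketch below redoes that arithmetic in plain Python, assuming a 28 x 28 MNIST input, no padding, and stride 1 for the convolutions. The helper names conv_out and pool_out are hypothetical.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output width of a convolution along one spatial dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output width of a pooling layer along one spatial dimension."""
    return (size - kernel) // stride + 1


size = 28  # assumed MNIST input width and height
sizes = [size]

size = conv_out(size, kernel=5)            # conv1
sizes.append(size)
size = pool_out(size, kernel=2, stride=2)  # pool1
sizes.append(size)
size = conv_out(size, kernel=5)            # conv2
sizes.append(size)
size = pool_out(size, kernel=2, stride=2)  # pool2
sizes.append(size)

# conv2 has 50 output channels, each a size x size map.
fc3_dim_in = 50 * size * size
```

If the kernel size or the input resolution changes, dim_in of fc3 has to be recomputed the same way, since brew.fc is given the flattened size explicitly.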

2. Creating a CNN model for the MNIST handwritten digit dataset

First, import the required packages:

%matplotlib inline
from matplotlib import pyplot
import numpy as np
import os
import shutil

from caffe2.python import core, model_helper, net_drawer, workspace, visualize, brew

# If you would like to see some really detailed initializations,
# you can change --caffe2_log_level=0 to --caffe2_log_level=-1
core.GlobalInit(['caffe2', '--caffe2_log_level=0'])
print("Necessities imported!")

Download the MNIST dataset and convert it to leveldb:

./make_mnist_db --channel_first --db leveldb --image_file ~/Downloads/train-images-idx3-ubyte --label_file ~/Downloads/train-labels-idx1-ubyte --output_file ~/caffe2_notebooks/tutorial_data/mnist/mnist-train-nchw-leveldb

./make_mnist_db --channel_first --db leveldb --image_file ~/Downloads/t10k-images-idx3-ubyte --label_file ~/Downloads/t10k-labels-idx1-ubyte --output_file ~/caffe2_notebooks/tutorial_data/mnist/mnist-test-nchw-leveldb

# This section preps your image and test set in a leveldb
current_folder = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks')

data_folder = os.path.join(current_folder, 'tutorial_data', 'mnist')
root_folder = os.path.join(current_folder, 'tutorial_files', 'tutorial_mnist')
image_file_train = os.path.join(data_folder, "train-images-idx3-ubyte")
label_file_train = os.path.join(data_folder, "train-labels-idx1-ubyte")
image_file_test = os.path.join(data_folder, "t10k-images-idx3-ubyte")
label_file_test = os.path.join(data_folder, "t10k-labels-idx1-ubyte")

# Get the dataset if it is missing
def DownloadDataset(url, path):
    import requests, zipfile, StringIO
    print "Downloading... ", url, " to ", path
    r = requests.get(url, stream=True)
    z = zipfile.ZipFile(StringIO.StringIO(r.content))
    z.extractall(path)

def GenerateDB(image, label, name):
    name = os.path.join(data_folder, name)
    print 'DB: ', name
    if not os.path.exists(name):
        syscall = "/usr/local/bin/make_mnist_db --channel_first --db leveldb --image_file " + image + " --label_file " + label + " --output_file " + name
        # print "Creating database with: ", syscall
        os.system(syscall)
    else:
        print "Database exists already. Delete the folder if you have issues/corrupted DB, then rerun this."
        if os.path.exists(os.path.join(name, "LOCK")):
            # print "Deleting the pre-existing lock file"
            os.remove(os.path.join(name, "LOCK"))

if not os.path.exists(data_folder):
    os.makedirs(data_folder)
if not os.path.exists(label_file_train):
    DownloadDataset("https://download.caffe2.ai/datasets/mnist/mnist.zip", data_folder)

if os.path.exists(root_folder):
    print("Looks like you ran this before, so we need to cleanup those old files...")
    shutil.rmtree(root_folder)

os.makedirs(root_folder)
workspace.ResetWorkspace(root_folder)

# (Re)generate the leveldb database (known to get corrupted...)
GenerateDB(image_file_train, label_file_train, "mnist-train-nchw-leveldb")
GenerateDB(image_file_test, label_file_test, "mnist-test-nchw-leveldb")

print("training data folder:" + data_folder)
print("workspace root folder:" + root_folder)

Here we use ModelHelper to represent the model, and brew together with operators to build it. A ModelHelper contains two nets, param_init_net and net, which are the initialization network and the main training network respectively.

We construct the model step by step, in four parts:

(1) Input (AddInput function)
(2) Network computation (AddLeNetModel function)
(3) Training: adding gradient operators, updates, and so on (AddTrainingOperators function)
(4) Bookkeeping: printing some statistics for inspection (AddBookkeepingOperators function)

(1) Input (AddInput function)

AddInput loads data from the DB. When it finishes, it yields data and label:

- data with shape `(batch_size, num_channels, width, height)`
  - in this case `[batch_size, 1, 28, 28]` of data type *uint8*
- label with shape `[batch_size]` of data type *int*

def AddInput(model, batch_size, db, db_type):
    # load the data
    data_uint8, label = model.TensorProtosDBInput(
        [], ["data_uint8", "label"], batch_size=batch_size,
        db=db, db_type=db_type)
    # cast the data to float
    data = model.Cast(data_uint8, "data", to=core.DataType.FLOAT)
    # scale data from [0,255] down to [0,1]
    data = model.Scale(data, data, scale=float(1./256))
    # don't need the gradient for the backward pass
    data = model.StopGradient(data, data)
    return data, label

A brief explanation of the operations in AddInput. The data is first cast to float, because the computation is mostly floating point. For numerical stability, the image is scaled from [0, 255] to [0, 1]; this is done in place, since the unscaled values do not need to be kept. This part needs no gradient in the backward pass, so StopGradient is used to block gradient propagation: when gradients are generated automatically, this operator and the operators before it are left untouched.


On top of this we add the network with AddLeNetModel, and also an AddAccuracy function to track the model's accuracy:

def AddLeNetModel(model, data):
    # Image size: 28 x 28 -> 24 x 24
    conv1 = brew.conv(model, data, 'conv1', dim_in=1, dim_out=20, kernel=5)
    # Image size: 24 x 24 -> 12 x 12
    pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2)
    # Image size: 12 x 12 -> 8 x 8
    conv2 = brew.conv(model, pool1, 'conv2', dim_in=20, dim_out=50, kernel=5)
    # Image size: 8 x 8 -> 4 x 4
    pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2)
    # 50 * 4 * 4 stands for dim_out from previous layer multiplied by the image size
    fc3 = brew.fc(model, pool2, 'fc3', dim_in=50 * 4 * 4, dim_out=500)
    fc3 = brew.relu(model, fc3, fc3)
    pred = brew.fc(model, fc3, 'pred', 500, 10)
    softmax = brew.softmax(model, pred, 'softmax')
    return softmax

def AddAccuracy(model, softmax, label):
    accuracy = model.Accuracy([softmax, label], "accuracy")
    return accuracy

Next we add gradient generation and the parameter update, implemented by AddTrainingOperators. The principle is the same as in the earlier regression example.

def AddTrainingOperators(model, softmax, label):
    # something very important happens here
    xent = model.LabelCrossEntropy([softmax, label], 'xent')
    # compute the expected loss
    loss = model.AveragedLoss(xent, "loss")
    # track the accuracy of the model
    AddAccuracy(model, softmax, label)
    # use the average loss we just computed to add gradient operators to the model
    model.AddGradientOperators([loss])
    # do a simple stochastic gradient descent
    ITER = model.Iter("iter")
    # set the learning rate schedule
    LR = model.LearningRate(
        ITER, "LR", base_lr=-0.1, policy="step", stepsize=1, gamma=0.999 )
    # ONE is a constant value that is used in the gradient update. We only need
    # to create it once, so it is explicitly placed in param_init_net.
    ONE = model.param_init_net.ConstantFill([], "ONE", shape=[1], value=1.0)
    # Now, for each parameter, we do the gradient updates.
    for param in model.params:
        # Note how we get the gradient of each parameter - ModelHelper keeps
        # track of that.
        param_grad = model.param_to_grad[param]
        # The update is a simple weighted sum: param = param + param_grad * LR
        model.WeightedSum([param, ONE, param_grad, LR], param)
    # let's checkpoint every 20 iterations, which should probably be fine.
    # you may need to delete tutorial_files/tutorial-mnist to re-run the tutorial
    model.Checkpoint([ITER] + model.params, [],
                     db="mnist_lenet_checkpoint_%05d.leveldb",
                     db_type="leveldb", every=20)
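The quantities computed at the top of AddTrainingOperators can be worked through by hand. The plain-Python sketch below shows what a label cross-entropy, an averaged loss, and an accuracy amount to for a tiny batch. It ignores any numerical safeguards the real operators may apply, and the function names and the three-class softmax rows are made up for illustration.

```python
import math


def label_cross_entropy(softmax_rows, labels):
    """Per-sample loss: minus the log of the probability given to the true class."""
    return [-math.log(row[label]) for row, label in zip(softmax_rows, labels)]


def averaged_loss(values):
    """Mean of the per-sample losses."""
    return sum(values) / len(values)


def accuracy(softmax_rows, labels):
    """Fraction of samples whose most probable class equals the label."""
    hits = 0
    for row, label in zip(softmax_rows, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            hits += 1
    return hits / float(len(labels))


# Three samples, three classes; each row sums to 1 like a softmax output.
softmax = [[0.7, 0.2, 0.1],
           [0.1, 0.3, 0.6],
           [0.5, 0.25, 0.25]]
labels = [0, 2, 1]

xent = label_cross_entropy(softmax, labels)
loss = averaged_loss(xent)
acc = accuracy(softmax, labels)
```

The third sample is misclassified and contributes the largest term to the loss, which shows how the loss and the accuracy move together while measuring different things.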

Next, AddBookkeepingOperators prints some statistics for later inspection. This part does not affect training; it only collects statistics and writes logs.

def AddBookkeepingOperators(model):
    # Print basically prints out the content of the blob. to_file=1 routes the
    # printed output to a file. The file is going to be stored under
    #     root_folder/[blob name]
    model.Print('accuracy', [], to_file=1)
    model.Print('loss', [], to_file=1)
    # Summarizes the parameters. Different from Print, Summarize gives some
    # statistics of the parameter, such as mean, std, min and max.
    for param in model.params:
        model.Summarize(param, [], to_file=1)
        model.Summarize(model.param_to_grad[param], [], to_file=1)
    # Now, if we really want to be verbose, we can summarize EVERY blob
    # that the model produces; it is probably not a good idea, because that
    # is going to take time - summarization do not come for free. For this
    # demo, we will only show how to summarize the parameters and their
    # gradients.
print("Bookkeeping function created")

So far we have done four things:

(1) Input (AddInput function)
(2) Network computation (AddLeNetModel function)
(3) Training: adding gradient operators, updates, and so on (AddTrainingOperators function)
(4) Bookkeeping: printing some statistics for inspection (AddBookkeepingOperators function)

With the basic pieces defined, we now build the models: a training model, a test model for evaluation, and a deployment model:

arg_scope = {"order": "NCHW"}
train_model = model_helper.ModelHelper(name="mnist_train", arg_scope=arg_scope)
data, label = AddInput(
    train_model, batch_size=64,
    db=os.path.join(data_folder, 'mnist-train-nchw-leveldb'),
    db_type='leveldb')
softmax = AddLeNetModel(train_model, data)
AddTrainingOperators(train_model, softmax, label)
AddBookkeepingOperators(train_model)

# Testing model. We will set the batch size to 100, so that the testing
# pass is 100 iterations (10,000 images in total).
# For the testing model, we need the data input part, the main LeNetModel
# part, and an accuracy part. Note that init_params is set False because
# we will be using the parameters obtained from the train model.
test_model = model_helper.ModelHelper(
    name="mnist_test", arg_scope=arg_scope, init_params=False)
data, label = AddInput(
    test_model, batch_size=100,
    db=os.path.join(data_folder, 'mnist-test-nchw-leveldb'),
    db_type='leveldb')
softmax = AddLeNetModel(test_model, data)
AddAccuracy(test_model, softmax, label)

# Deployment model. We simply need the main LeNetModel part.
deploy_model = model_helper.ModelHelper(
    name="mnist_deploy", arg_scope=arg_scope, init_params=False)
AddLeNetModel(deploy_model, "data")
# You may wonder what happens with the param_init_net part of the deploy_model.
# No, we will not use them, since during deployment time we will not randomly
# initialize the parameters, but load the parameters from the db.

Run the network and plot the loss curve:

# The parameter initialization network only needs to be run once.
workspace.RunNetOnce(train_model.param_init_net)
# creating the network
workspace.CreateNet(train_model.net)
# set the number of iterations and track the accuracy & loss
total_iters = 200
accuracy = np.zeros(total_iters)
loss = np.zeros(total_iters)
# Now, we will manually run the network for 200 iterations.
for i in range(total_iters):
    workspace.RunNet(train_model.net.Proto().name)
    accuracy[i] = workspace.FetchBlob('accuracy')
    loss[i] = workspace.FetchBlob('loss')
# After the execution is done, let's plot the values.
pyplot.plot(loss, 'b')
pyplot.plot(accuracy, 'r')
pyplot.legend(('Loss', 'Accuracy'), loc='upper right')

We can also look at the predictions:

# Let's look at some of the data.
pyplot.figure()
data = workspace.FetchBlob('data')
_ = visualize.NCHW.ShowMultiple(data)
pyplot.figure()
softmax = workspace.FetchBlob('softmax')
_ = pyplot.plot(softmax[0], 'ro')
pyplot.title('Prediction for the first image')

Remember that we also defined a test_model. We can run it to get the accuracy on the test set. Although the weights of test_model come from train_model, its param_init_net still has to be run to set up the test data input.

# run a test pass on the test net
workspace.RunNetOnce(test_model.param_init_net)
workspace.CreateNet(test_model.net)
test_accuracy = np.zeros(100)
for i in range(100):
    workspace.RunNet(test_model.net.Proto().name)
    test_accuracy[i] = workspace.FetchBlob('accuracy')
# After the execution is done, let's plot the values.
pyplot.plot(test_accuracy, 'r')
pyplot.title('Accuracy over test batches.')
print('test_accuracy: %f' % test_accuracy.mean())

test_accuracy: 0.946700

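With that, we have completed a simple round of building, training, and deploying a model.

This is a tutorial on Caffe2's Python interface. Most of the examples come from the official tutorials; I have only added my own understanding and a brief comparison with Caffe. There may be oversights or misunderstandings, and corrections are welcome.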

2017.07.07 cskenken

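Author: 陸姚知馬力
Link: http://www.jianshu.com/p/5c0fd1c9fef9
Source: 簡書 (Jianshu)
Copyright belongs to the author. For commercial reposting, contact the author for permission; for non-commercial reposting, credit the source.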
