The overall workflow of Caffe is as follows.

Program entry point: main()
```cpp
int main(int argc, char** argv) {
  .....
  return GetBrewFunction(caffe::string(argv[1]))();
  ....
}
```
How g_brew_map works: first, a function-pointer type is defined with typedef: `typedef int (*BrewFunction)();`. In caffe.cpp, BrewFunction is the return type of GetBrewFunction(), and the pointer it returns is one of the four functions train(), test(), device_query(), or time(). train() and test() call into the Solver class, which in turn reaches the Net and every layer, so this is the path that drives the whole Caffe program. Each of these functions is then registered:
```cpp
RegisterBrewFunction(train);
RegisterBrewFunction(test);
RegisterBrewFunction(device_query);
RegisterBrewFunction(time);
```
- train: train or finetune a model
- test: evaluate a model on the test set
- device_query: print diagnostic information for a GPU
- time: benchmark the execution time of a model
If needed, additional commands can be added the same way; they only have to be registered with RegisterBrewFunction().
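Slightly abridged, the registration machinery in caffe.cpp looks roughly like this (a sketch; the real GetBrewFunction also prints the list of available actions on error):

```cpp
// caffe.cpp keeps a global map from command name ("train", "test", ...) to a
// function pointer of type BrewFunction.
typedef int (*BrewFunction)();
typedef std::map<caffe::string, BrewFunction> BrewMap;
BrewMap g_brew_map;

// RegisterBrewFunction(func) defines a tiny class whose constructor inserts
// &func into g_brew_map under the key "func"; a static instance of that class
// runs before main(), so the map is filled at program startup.
#define RegisterBrewFunction(func)                        \
  namespace {                                             \
  class __Registerer_##func {                             \
   public:                                                \
    __Registerer_##func() { g_brew_map[#func] = &func; }  \
  };                                                      \
  __Registerer_##func g_registerer_##func;                \
  }

// GetBrewFunction simply looks the requested command up in the map.
static BrewFunction GetBrewFunction(const caffe::string& name) {
  if (g_brew_map.count(name)) {
    return g_brew_map[name];
  }
  LOG(FATAL) << "Unknown action: " << name;
  return NULL;  // not reached
}
```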
train() is then invoked. It mainly consists of three steps: ReadSolverParamsFromTextFileOrDie, CreateSolver, and Solve.
```cpp
// Train / Finetune a model.
int train() {
  ......
  caffe::SolverParameter solver_param;
  // parse the file given by -solver into solver_param
  caffe::ReadSolverParamsFromTextFileOrDie(FLAGS_solver, &solver_param);
  ......
  // create the solver from the parameters; like the brew functions, this uses a
  // string-to-creator map (factory pattern)
  shared_ptr<caffe::Solver<float> >
      solver(caffe::SolverRegistry<float>::CreateSolver(solver_param));

  if (FLAGS_snapshot.size()) {  // resume training from a snapshot
    LOG(INFO) << "Resuming from " << FLAGS_snapshot;
    solver->Restore(FLAGS_snapshot.c_str());
  } else if (FLAGS_weights.size()) {  // finetuning: copy the given weights into the model
    CopyLayers(solver.get(), FLAGS_weights);
  }

  if (gpus.size() > 1) {
    caffe::P2PSync<float> sync(solver, NULL, solver->param());
    sync.Run(gpus);
  } else {
    LOG(INFO) << "Starting Optimization";
    solver->Solve();  // start training the network
  }
  LOG(INFO) << "Optimization Done.";
  return 0;
}
```
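The comment above mentions the factory pattern: SolverRegistry<Dtype> keeps a map from the solver type string (for example "SGD") to a creator function, and CreateSolver simply looks the type up; LayerRegistry creates layers in the same way. A simplified sketch of the idea, abridged from Caffe's solver_factory.hpp with error handling trimmed:

```cpp
template <typename Dtype>
class SolverRegistry {
 public:
  // A creator builds a concrete Solver (SGDSolver, AdamSolver, ...) from the parameters.
  typedef Solver<Dtype>* (*Creator)(const SolverParameter&);
  typedef std::map<string, Creator> CreatorRegistry;

  static CreatorRegistry& Registry() {
    static CreatorRegistry* g_registry_ = new CreatorRegistry();
    return *g_registry_;
  }

  // Called (indirectly) by the REGISTER_SOLVER_CLASS macro for every solver type.
  static void AddCreator(const string& type, Creator creator) {
    Registry()[type] = creator;
  }

  // Look up the creator registered for param.type() and invoke it.
  static Solver<Dtype>* CreateSolver(const SolverParameter& param) {
    const string& type = param.type();
    CHECK_EQ(Registry().count(type), 1) << "Unknown solver type: " << type;
    return Registry()[type](param);
  }
};
```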
ReadSolverParamsFromTextFileOrDie
caffe::ReadSolverParamsFromTextFileOrDie(FLAGS_solver, &solver_param) parses the solver.prototxt file specified by -solver into solver_param.
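For reference, solver.prototxt is a text-format SolverParameter message. A minimal illustrative example (the path and all values here are made up for illustration):

```protobuf
net: "examples/mnist/lenet_train_test.prototxt"   # net definition (illustrative path)
base_lr: 0.01          # base learning rate
lr_policy: "step"      # how the learning rate decays
gamma: 0.1
stepsize: 10000
momentum: 0.9
weight_decay: 0.0005
display: 100           # log the loss every 100 iterations
max_iter: 100000       # total number of training iterations
snapshot: 5000         # save a snapshot every 5000 iterations
snapshot_prefix: "examples/mnist/lenet"
solver_mode: GPU
```

The file is passed on the command line, e.g. `caffe train --solver=path/to/solver.prototxt`.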
CreateSolver
CreateSolver builds both the solver and the net; it is the entry point of initialization. It runs the Solver constructor, which calls void Solver<Dtype>::Init(const SolverParameter& param); Init() in turn calls InitTrainNet() and InitTestNets(). Inside InitTrainNet():
```cpp
......
net_.reset(new Net<Dtype>(net_param));
```
This invokes the Net constructor, which then runs Init(). The relevant parts of that function are shown in the source below:
```cpp
template <typename Dtype>
void Net<Dtype>::Init(const NetParameter& in_param) {
  ........  // filter and validate the parameters
  FilterNet(in_param, &filtered_param);
  .........  // insert Split layers
  InsertSplits(filtered_param, &param);
  .......  // build the storage for every layer's inputs and outputs
  bottom_vecs_.resize(param.layer_size());
  top_vecs_.resize(param.layer_size());
  bottom_id_vecs_.resize(param.layer_size());
  param_id_vecs_.resize(param.layer_size());
  top_id_vecs_.resize(param.layer_size());
  bottom_need_backward_.resize(param.layer_size());

  for (int layer_id = 0; layer_id < param.layer_size(); ++layer_id) {
    ...  // create the layer
    layers_.push_back(LayerRegistry<Dtype>::CreateLayer(layer_param));
    layer_names_.push_back(layer_param.name());
    LOG_IF(INFO, Caffe::root_solver())
        << "Creating Layer " << layer_param.name();
    bool need_backward = false;

    // Figure out this layer's input and output
    for (int bottom_id = 0; bottom_id < layer_param.bottom_size();
         ++bottom_id) {
      const int blob_id = AppendBottom(param, layer_id, bottom_id,
          &available_blobs, &blob_name_to_idx);
    }

    ........  // create the related blobs
    // If the layer specifies that AutoTopBlobs() -> true and the LayerParameter
    // specified fewer than the required number (as specified by
    // ExactNumTopBlobs() or MinTopBlobs()), allocate them here.
    Layer<Dtype>* layer = layers_[layer_id].get();
    if (layer->AutoTopBlobs()) {
      const int needed_num_top =
          std::max(layer->MinTopBlobs(), layer->ExactNumTopBlobs());
      for (; num_top < needed_num_top; ++num_top) {
        // Add "anonymous" top blobs -- do not modify available_blobs or
        // blob_name_to_idx as we don't want these blobs to be usable as input
        // to other layers.
        AppendTop(param, layer_id, num_top, NULL, NULL);
      }
    }

    .....  // run SetUp()
    // After this layer is connected, set it up.
    layers_[layer_id]->SetUp(bottom_vecs_[layer_id], top_vecs_[layer_id]);
    LOG_IF(INFO, Caffe::root_solver())
        << "Setting up " << layer_names_[layer_id];
    for (int top_id = 0; top_id < top_vecs_[layer_id].size(); ++top_id) {
      if (blob_loss_weights_.size() <= top_id_vecs_[layer_id][top_id]) {
        blob_loss_weights_.resize(top_id_vecs_[layer_id][top_id] + 1, Dtype(0));
      }
      blob_loss_weights_[top_id_vecs_[layer_id][top_id]] = layer->loss(top_id);
      LOG_IF(INFO, Caffe::root_solver())
          << "Top shape: " << top_vecs_[layer_id][top_id]->shape_string();
      if (layer->loss(top_id)) {
        LOG_IF(INFO, Caffe::root_solver())
            << " with loss weight " << layer->loss(top_id);
      }
      memory_used_ += top_vecs_[layer_id][top_id]->count();
    }
    LOG_IF(INFO, Caffe::root_solver())
        << "Memory required for data: " << memory_used_ * sizeof(Dtype);
    const int param_size = layer_param.param_size();
    const int num_param_blobs = layers_[layer_id]->blobs().size();
    CHECK_LE(param_size, num_param_blobs)
        << "Too many params specified for layer " << ......
```
How is SetUp() itself structured?
```cpp
virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {}

void SetUp(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  InitMutex();
  CheckBlobCounts(bottom, top);  // verify the number of bottom/top blobs
  LayerSetUp(bottom, top);       // layer-specific setup, overridden by each layer
  Reshape(bottom, top);          // shape the top blobs and internal buffers
  SetLossWeights(top);           // record the loss weights on the top blobs
}
```
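LayerSetUp() and Reshape() are the virtual hooks that every concrete layer overrides. Purely for illustration (this layer does not exist in Caffe), a minimal layer that copies its bottom blob to its top blob might look like this:

```cpp
// Hypothetical example layer, for illustration only: it copies bottom to top.
template <typename Dtype>
class IdentityLayer : public Layer<Dtype> {
 public:
  explicit IdentityLayer(const LayerParameter& param) : Layer<Dtype>(param) {}

  // One-time, layer-specific setup (read parameters, allocate internal state).
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    // nothing to configure for this toy layer
  }

  // Called whenever the input shapes may have changed: shape the top blobs.
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    top[0]->ReshapeLike(*bottom[0]);
  }

  virtual inline const char* type() const { return "Identity"; }
  // These counts are what CheckBlobCounts() verifies in SetUp().
  virtual inline int ExactNumBottomBlobs() const { return 1; }
  virtual inline int ExactNumTopBlobs() const { return 1; }

 protected:
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    caffe_copy(bottom[0]->count(), bottom[0]->cpu_data(),
               top[0]->mutable_cpu_data());
  }
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) {
    if (propagate_down[0]) {
      caffe_copy(top[0]->count(), top[0]->cpu_diff(),
                 bottom[0]->mutable_cpu_diff());
    }
  }
};
```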

In summary, the overall initialization flow is: create a Solver object; its constructor creates a Net instance; the Net constructor creates an instance of each layer and sets up each Blob. At that point the network initialization is complete.
Solve
After CreateSolver() returns in train(), the actual training takes place: Solve() is called, which calls Step(), and when Step() returns training is finished.
The contents of Solve():
```cpp
template <typename Dtype>
void Solver<Dtype>::Solve(const char* resume_file) {
  CHECK(Caffe::root_solver());
  LOG(INFO) << "Solving " << net_->name();
  LOG(INFO) << "Learning Rate Policy: " << param_.lr_policy();

  // For a network that is trained by the solver, no bottom or top vecs
  // should be given, and we will just provide dummy vecs.
  int start_iter = iter_;
  Step(param_.max_iter() - iter_);

  // overridden by setting snapshot_after_train := false
  if (param_.snapshot_after_train()
      && (!param_.snapshot() || iter_ % param_.snapshot() != 0)) {
    Snapshot();
  }

  // display the final loss
  if (param_.display() && iter_ % param_.display() == 0) {
    int average_loss = this->param_.average_loss();
    Dtype loss;
    net_->Forward(&loss);
    UpdateSmoothedLoss(loss, start_iter, average_loss);
  }
  // run the test nets one last time
  if (param_.test_interval() && iter_ % param_.test_interval() == 0) {
    TestAll();
  }
}
```

Step() is then executed; its contents:
```cpp
template <typename Dtype>
void Solver<Dtype>::Step(int iters) {
  // starting iteration
  const int start_iter = iter_;
  // final iteration
  const int stop_iter = iter_ + iters;

  // loop until the requested number of iterations has been run
  while (iter_ < stop_iter) {
    // zero the Blob gradients (diffs) of the parameters in net_
    net_->ClearParamDiffs();

    ...

    // accumulate the loss and gradient
    Dtype loss = 0;
    for (int i = 0; i < param_.iter_size(); ++i) {
      // forward and backward pass, computing the loss
      loss += net_->ForwardBackward();
    }
    loss /= param_.iter_size();

    // to smooth the reported loss, the last average_loss values are averaged
    // into the member variable smoothed_loss_
    UpdateSmoothedLoss(loss, start_iter, average_loss);

    // update the weights (back-propagation update step)
    ApplyUpdate();

    // Increment the internal iter_ counter -- its value should always indicate
    // the number of times the weights have been updated.
    ++iter_;
  }
}
```

Inside the while loop, Net::ForwardBackward() is called first to run the forward and backward passes and compute the loss:
```cpp
Dtype ForwardBackward() {
  Dtype loss;
  // forward pass
  Forward(&loss);
  // backward pass
  Backward();
  return loss;
}
```
Forward() calls ForwardFromTo(), and ForwardFromTo() in turn calls each layer's Forward(). Likewise, the backward pass Backward() calls BackwardFromTo(int start, int end). Once the forward and backward passes are finished, SGDSolver::ApplyUpdate() is called to update the weights.
- ForwardBackward: calls Forward and Backward in sequence.
- ForwardFromTo(int start, int end): runs the forward pass from layer start to layer end with a simple for loop over the layers; the forward pass is what computes the loss (see the sketch after this list).
- BackwardFromTo(int start, int end): analogous to ForwardFromTo, runs the backward pass from layer start down to layer end. The backward pass computes gradients from the loss; Caffe obtains the gradient of the whole network by composing each layer's gradients in reverse order.
- ToProto: serializes the network to a file by calling each layer's ToProto in a loop.
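Slightly abridged, the two traversal helpers look roughly like this (a sketch based on net.cpp, with checks and debug output omitted):

```cpp
template <typename Dtype>
Dtype Net<Dtype>::ForwardFromTo(int start, int end) {
  Dtype loss = 0;
  // simple for loop over the layers, accumulating each layer's loss contribution
  for (int i = start; i <= end; ++i) {
    loss += layers_[i]->Forward(bottom_vecs_[i], top_vecs_[i]);
  }
  return loss;
}

template <typename Dtype>
void Net<Dtype>::BackwardFromTo(int start, int end) {
  // walk the layers in reverse order, propagating gradients from top to bottom
  for (int i = start; i >= end; --i) {
    if (layer_need_backward_[i]) {
      layers_[i]->Backward(
          top_vecs_[i], bottom_need_backward_[i], bottom_vecs_[i]);
    }
  }
}
```

After the forward and backward passes, SGDSolver::ApplyUpdate() updates the weights: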
```cpp
template <typename Dtype>
void SGDSolver<Dtype>::ApplyUpdate() {
  // get the current learning rate
  Dtype rate = GetLearningRate();
  if (this->param_.display() && this->iter_ % this->param_.display() == 0) {
    LOG(INFO) << "Iteration " << this->iter_ << ", lr = " << rate;
  }

  // if the gradient exceeds the clip_gradients threshold, scale it down to the
  // threshold; with the default value of -1 this has no effect
  ClipGradients();

  // update every learnable parameter blob in the network
  for (int param_id = 0; param_id < this->net_->learnable_params().size();
       ++param_id) {
    // normalize the accumulated gradient
    Normalize(param_id);
    // add the weight-decay (L2 regularization) term
    Regularize(param_id);
    // compute the SGD update value
    ComputeUpdateValue(param_id, rate);
  }
  // apply the computed updates to the weights
  this->net_->Update();
}
```
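For the plain SGD solver, ComputeUpdateValue implements SGD with momentum: V gets momentum * V + lr * gradient, and Net::Update() then subtracts this value from the weights. A simplified per-element sketch of that idea (CPU path only, ignoring per-parameter learning-rate and decay multipliers; the helper name here is made up):

```cpp
// Hypothetical helper illustrating the momentum update in SGDSolver::ComputeUpdateValue.
//   history: V, the accumulated update from previous iterations
//   diff:    the current gradient dL/dW (already normalized and regularized)
// After this runs, Net::Update() effectively does  data[i] -= diff[i].
void MomentumUpdateSketch(int count, float lr, float momentum,
                          float* history, float* diff) {
  for (int i = 0; i < count; ++i) {
    history[i] = momentum * history[i] + lr * diff[i];  // V <- momentum*V + lr*grad
    diff[i] = history[i];  // the diff blob now holds the value to subtract
  }
}
```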
Finally, ++iter_ increments the iteration counter and the while loop continues until the requested number of iterations has been completed. That is the entire training process of the network.
