像大多數優化軟件包一樣,Ceres求解器依賴其能夠在任意參數值下評估目標函數中每一項的值和導數。 正確而高效地做到這一點是取得好結果的關鍵。Ceres提供了一系列解決方案,其中一個就是在Hello World中用到的Automatic Differentiation (自動微分算法)。我們將探討另外兩種可能性,即解析法(Analytic)和數值法(Numeric )求導。
1.Numeric Derivatives(數值法求導)
在某些情況下,只定義一個仿函數是不行的,例如,當殘差的求值涉及調用您無法控制的庫函數時。在這種情況下,可以使用數值微分法。
用戶定義一個仿函數(functor)來計算殘差值,並且構建一個數值微分代價函數(NumericDiffCostFunction)。比如對於 f(x)=10−x 對應函數體如下:
struct NumericDiffCostFunctor {
bool operator()(const double* const x, double* residual) const {
residual[0] = 10.0 - x[0];
return true;
}
};
/*
對比Hello world中的Functor的定義
可以發現,這次沒用模板類。
*/
然后繼續添加Problem
CostFunction* cost_function =
new NumericDiffCostFunction<NumericDiffCostFunctor, ceres::CENTRAL, 1, 1>(
new NumericDiffCostFunctor);
problem.AddResidualBlock(cost_function, nullptr, &x);
對比在Hello World中使用automatic算法
CostFunction* cost_function =
new AutoDiffCostFunction<CostFunctor, 1, 1>(new CostFunctor);
problem.AddResidualBlock(cost_function, nullptr, &x);
可以發現兩種算法在構建Problem時候基本差不多。但是在用Nummeric算法時需要額外給定一個參數ceres::CENTRAL 。這個參數告訴計算機如何計算導數。
更多具體介紹可以參看NumericDiffCostFunction的文檔(http://ceres-solver.org/numerical_derivatives.html?highlight=numericdiffcostfunction)
一般來說,Ceres官方更加推薦自動微分算法,因為C++模板類使自動算法有更高的效率。數值微分算法通常來說計算更復雜,收斂更緩慢。
代碼在ceres-solver-1.14.0/examples/helloworld_numeric_diff.cc
#include "ceres/ceres.h"
#include "glog/logging.h"
using ceres::NumericDiffCostFunction;
using ceres::CENTRAL;
using ceres::CostFunction;
using ceres::Problem;
using ceres::Solver;
using ceres::Solve;
// A cost functor that implements the residual r = 10 - x.
struct CostFunctor {
bool operator()(const double* const x, double* residual) const {
residual[0] = 10.0 - x[0];
return true;
}
};
int main(int argc, char** argv) {
google::InitGoogleLogging(argv[0]);
// The variable to solve for with its initial value. It will be
// mutated in place by the solver.
double x = 0.5;
const double initial_x = x;
// Build the problem.
Problem problem;
// Set up the only cost function (also known as residual). This uses
// numeric differentiation to obtain the derivative (jacobian).
CostFunction* cost_function =
new NumericDiffCostFunction<CostFunctor, CENTRAL, 1, 1> (new CostFunctor);
problem.AddResidualBlock(cost_function, NULL, &x);
// Run the solver!
Solver::Options options;
options.minimizer_progress_to_stdout = true;
options.linear_solver_type = ceres::DENSE_QR;
Solver::Summary summary;
Solve(options, &problem, &summary);
std::cout << summary.BriefReport() << "\n";
std::cout << "initial x : " << initial_x << ", "
<< "result x : " << x << "\n";
return 0;
}
結果
iter cost cost_change |gradient| |step| tr_ratio tr_radius ls_iter iter_time total_time
0 4.512500e+01 0.00e+00 9.50e+00 0.00e+00 0.00e+00 1.00e+04 0 1.44e-04 3.64e-04
1 4.511690e-07 4.51e+01 9.50e-04 9.50e+00 1.00e+00 3.00e+04 1 9.17e-05 5.92e-04
2 5.012669e-16 4.51e-07 3.17e-08 9.50e-04 1.00e+00 9.00e+04 1 8.76e-06 6.08e-04
trust_region_minimizer.cc:687 Terminating: Parameter tolerance reached. Relative step_norm: 3.166246e-09 <= 1.000000e-08.
Ceres Solver Report: Iterations: 3, Initial cost: 4.512500e+01, Final cost: 5.012669e-16, Termination: CONVERGENCE
initial x : 0.5, result x : 10
2.Analytic Derivatives(解析法求導)
有些時候,應用自動求解算法時不可能的。比如在某些情況下,計算導數的時候,使用閉合解(closed form,也被稱為解析解)會比使用自動微分算法中的鏈式法則(chain rule)更有效率。
解析解,數值解參考: https://blog.csdn.net/qianmianyuan/article/details/8998281
例如: 求解3÷7。解析解為3/7,數值解為0.429;
鏈式法則: 微積分中求導法則,用於求復合函數的導數
參考: https://blog.csdn.net/weixin_44010756/article/details/109662171
在這種情況下,你就可以自己編寫殘差和雅可比行列式的計算代碼了。為此,定義一個CostFunction的子類,或者在編譯時知道了參數的大小和殘差,則可以定義SizedCostFunction的子類。
例如,為了實現f(x) = 10 - x,有一個簡單的例子
class QuadraticCostFunction : public ceres::SizedCostFunction<1, 1> {
public:
virtual ~QuadraticCostFunction() {}
virtual bool Evaluate(double const* const* parameters,
double* residuals,
double** jacobians) const {
const double x = parameters[0][0];
residuals[0] = 10 - x;
// Compute the Jacobian if asked for.
if (jacobians != nullptr && jacobians[0] != nullptr) {
jacobians[0][0] = -1;
}
return true;
}
};
parameters:輸入數組
residuals/jacobians:輸出數組
Evaluate函數用於檢查jacobians是否為非空值。如果不為空,那么就把殘差方程的導數值賦值給它。在這種情況下,由於殘差函數是線性的,雅可比矩陣是常數。
從上面的代碼片段可以看出,實現CostFunction對象有點繁瑣。我們建議,除非有特殊的原因自己管理雅可比矩陣計算,否則您可以使用AutoDiffCostFunction或NumericDiffCostFunction來構造殘差塊。
代碼在ceres-solver-1.14.0/examples/helloworld_analytic_diff.cc
#include <vector>
#include "ceres/ceres.h"
#include "glog/logging.h"
using ceres::CostFunction;
using ceres::SizedCostFunction;
using ceres::Problem;
using ceres::Solver;
using ceres::Solve;
// A CostFunction implementing analytically derivatives for the
// function f(x) = 10 - x.
class QuadraticCostFunction
: public SizedCostFunction<1 /* number of residuals */,
1 /* size of first parameter */> {
public:
virtual ~QuadraticCostFunction() {}
virtual bool Evaluate(double const* const* parameters,
double* residuals,
double** jacobians) const {
double x = parameters[0][0];
// f(x) = 10 - x.
residuals[0] = 10 - x;
// f'(x) = -1. Since there's only 1 parameter and that parameter
// has 1 dimension, there is only 1 element to fill in the
// jacobians.
//
// Since the Evaluate function can be called with the jacobians
// pointer equal to NULL, the Evaluate function must check to see
// if jacobians need to be computed.
//
// For this simple problem it is overkill to check if jacobians[0]
// is NULL, but in general when writing more complex
// CostFunctions, it is possible that Ceres may only demand the
// derivatives w.r.t. a subset of the parameter blocks.
if (jacobians != NULL && jacobians[0] != NULL) {
jacobians[0][0] = -1;
}
return true;
}
};
int main(int argc, char** argv) {
google::InitGoogleLogging(argv[0]);
// The variable to solve for with its initial value. It will be
// mutated in place by the solver.
double x = 0.5;
const double initial_x = x;
// Build the problem.
Problem problem;
// Set up the only cost function (also known as residual).
CostFunction* cost_function = new QuadraticCostFunction;
problem.AddResidualBlock(cost_function, NULL, &x);
// Run the solver!
Solver::Options options;
options.minimizer_progress_to_stdout = true;
options.linear_solver_type = ceres::DENSE_QR;
Solver::Summary summary;
Solve(options, &problem, &summary);
std::cout << summary.BriefReport() << "\n";
std::cout << "initial x : " << initial_x << ", "
<< "result x : " << x << "\n";
return 0;
}
結果
iter cost cost_change |gradient| |step| tr_ratio tr_radius ls_iter iter_time total_time
0 4.512500e+01 0.00e+00 9.50e+00 0.00e+00 0.00e+00 1.00e+04 0 2.16e-05 7.73e-05
1 4.511598e-07 4.51e+01 9.50e-04 9.50e+00 1.00e+00 3.00e+04 1 3.16e-05 1.39e-04
2 5.012552e-16 4.51e-07 3.17e-08 9.50e-04 1.00e+00 9.00e+04 1 2.13e-05 1.88e-04
trust_region_minimizer.cc:687 Terminating: Parameter tolerance reached. Relative step_norm: 3.166209e-09 <= 1.000000e-08.
Ceres Solver Report: Iterations: 3, Initial cost: 4.512500e+01, Final cost: 5.012552e-16, Termination: CONVERGENCE
initial x : 0.5, result x : 10
3.More About Derivatives(其他導數計算方法)
到目前為止,計算導數其實是Ceres最復雜的部分了。根據需要,用戶有時候還需要更復雜的導數計算算法。這一節僅僅是大體介紹了如何使用Ceres進行導數計算最淺顯的部分。對NumericDiffCostFunction和AutoDiffCostFunction方法都很熟悉之后,還可以深入研究一下DynamicAutoDiffCostFunction, CostFunctionToFunctor, NumericDiffFunctor和ConditionedCostFunction,從而實現更高級的代價函數的計算方法。