File levenberg_marquardt_solver.h¶
-
namespace sym
-
template<typename ScalarType, typename _LinearSolverType = sym::SparseCholeskySolver<Eigen::SparseMatrix<ScalarType>>>
class LevenbergMarquardtSolver - #include <levenberg_marquardt_solver.h>
Fast Levenberg-Marquardt solver for nonlinear least squares problems specified by a linearization function. Supports on-manifold optimization and sparse solving, and attempts to minimize allocations after the first iteration.
This assumes the problem structure is the same for the lifetime of the object - if the problem structure changes, create a new LevenbergMarquardtSolver.
TODO(aaron): Analyze what repeated allocations we do have, and get rid of them if possible
Not thread safe! Create one per thread.
Example usage:
constexpr const int M = 9; constexpr const int N = 5; // Create a function that computes the residual (a linear residual for this example) const auto J_MN = sym::RandomNormalMatrix<double, M, N>(gen); const auto linearize_func = [&J_MN](const sym::Valuesd& values, sym::SparseLinearizationd* const linearization) { const auto state_vec = values.At<sym::Vector5d>('v'); linearization->residual = J_MN * state_vec; linearization->hessian_lower = (J_MN.transpose() * J_MN).sparseView(); linearization->jacobian = J_MN.sparseView(); linearization->rhs = J_MN.transpose() * linearization->residual; }; // Create a Values sym::Valuesd values_init{}; values_init.Set('v', (StateVector::Ones() * 100).eval()); // Create a Solver sym::LevenbergMarquardtSolverd solver(params, "", kEpsilon); solver.SetIndex(values_init.CreateIndex({'v'})); solver.Reset(values_init); // Iterate to convergence sym::optimization_stats_t stats; bool should_early_exit = false; while (!should_early_exit) { should_early_exit = solver.Iterate(linearize_func, &stats); } // Get the best values sym::Valuesd values_final = solver.GetBestValues();
The theory:
We start with a nonlinear vector-valued error function that defines an error residual for which we want to minimize the squared norm. The residual is dimension M, the state is N.
residual = f(x)
Define a least squares cost function as the squared norm of the residual:
e(x) = 0.5 * ||f(x)||**2 = 0.5 * f(x).T * f(x)
Take the first order taylor expansion for x around the linearization point x0:
f(x) = f(x0) + f'(x0) * (x - x0) + ...
Plug in to the cost function to get a quadratic:
e(x) ~= 0.5 * (x - x0).T * f'(x0).T * f'(x0) * (x - x0) + f(x0).T * f'(x0) * (x - x0) + 0.5 * f(x0).T * f(x0)
Take derivative with respect to x:
e'(x) = f'(x0).T * f'(x0) * (x - x0) + f'(x0).T * f(x0)
Set to zero to find the minimum value of the quadratic (paraboloid):
0 = f'(x0).T * f'(x0) * (x - x0) + f'(x0).T * f(x0) (x - x0) = - inv(f'(x0).T * f'(x0)) * f'(x0).T * f(x0) x = x0 - inv(f'(x0).T * f'(x0)) * f'(x0).T * f(x0)
Another way to write this is to create some helpful shorthand:
f'(x0) --> jacobian or J (shape = MxN) f (x0) --> bias or b (shape = Mx1) x - x0 --> dx (shape = Nx1)
Rederiving the Gauss-Newton solution:
e(x) ~= 0.5 * dx.T * J.T * J * dx + b.T * J * dx + 0.5 * b.T * b e'(x) = J.T * J * dx + J.T * b x = x0 - inv(J.T * J) * J.T * b
A couple more common names:
f'(x0).T * f'(x0) = J.T * J --> hessian approximation or H (shape = NxN) f'(x0).T * f (x0) = J.T * b --> right hand side or rhs (shape = Nx1)
So the iteration loop for optimization is:
J, b = linearize(f, x0) dx = -inv(J.T * J) * J.T * b x_new = x0 + dx
Technically what we’ve just described is the Gauss-Newton algorithm; the Levenberg-Marquardt algorithm is based on Gauss-Newton, but adds a term to J.T * J before inverting to make sure the matrix is invertible and make the optimization more stable. This additional term typically takes the form lambda * I, or lambda * diag(J.T * J), where lambda is another parameter updated by the solver at each iteration. Configuration of how this term is computed can be found in the optimizer params.
Public Types
-
using Scalar = ScalarType
-
using LinearSolverType = _LinearSolverType
-
using MatrixType = typename LinearSolverType::MatrixType
-
using StateType = internal::LevenbergMarquardtState<MatrixType>
-
using LinearizationType = Linearization<MatrixType>
-
using FailureReason = levenberg_marquardt_solver_failure_reason_t
-
using LinearizeFunc = std::function<void(const Values<Scalar>&, LinearizationType&)>
Public Functions
-
inline LevenbergMarquardtSolver(const optimizer_params_t &p, const std::string &id, const Scalar epsilon)
-
inline LevenbergMarquardtSolver(const optimizer_params_t &p, const std::string &id, const Scalar epsilon, const LinearSolverType &linear_solver)
-
inline void SetIndex(const index_t &index)
-
inline const optimizer_params_t &Params() const
-
void UpdateParams(const optimizer_params_t &p)
-
inline const LinearSolverType &LinearSolver() const
-
inline LinearSolverType &LinearSolver()
-
optional<std::pair<optimization_status_t, FailureReason>> Iterate(const LinearizeFunc &func, OptimizationStats<MatrixType> &stats)
-
inline const LinearizationType &GetBestLinearization() const
-
void ComputeCovariance(const MatrixType &hessian_lower, MatrixX<Scalar> &covariance)
Private Functions
-
void DampHessian(MatrixType &hessian_lower, bool &have_max_diagonal, VectorX<Scalar> &max_diagonal, Scalar lambda, VectorX<Scalar> &damping_vector, VectorX<Scalar> &undamped_diagonal) const¶
-
void CheckHessianDiagonal(const MatrixType &hessian_lower_damped, Scalar lambda)¶
-
using Scalar = ScalarType
-
template<typename ScalarType, typename _LinearSolverType = sym::SparseCholeskySolver<Eigen::SparseMatrix<ScalarType>>>