Class sym::LevenbergMarquardtSolver

template<typename ScalarType, typename LinearSolverType = sym::SparseCholeskySolver<Eigen::SparseMatrix<ScalarType>>>
class LevenbergMarquardtSolver

Fast Levenberg-Marquardt solver for nonlinear least squares problems specified by a linearization function. Supports on-manifold optimization and sparse solving, and attempts to minimize allocations after the first iteration.

This assumes the problem structure is the same for the lifetime of the object - if the problem structure changes, create a new LevenbergMarquardtSolver.

TODO(aaron): Analyze what repeated allocations we do have, and get rid of them if possible

Not thread safe! Create one per thread.

Example usage:

constexpr int M = 9;
constexpr int N = 5;

// Create a function that computes the residual (a linear residual for this example)
std::mt19937 gen(42);  // seeded RNG for the random test jacobian
const auto J_MN = sym::RandomNormalMatrix<double, M, N>(gen);
const auto linearize_func = [&J_MN](const sym::Valuesd& values,
                                    sym::SparseLinearizationd& linearization) {
  const auto state_vec = values.At<sym::Vector5d>('v');
  linearization.residual = J_MN * state_vec;
  linearization.hessian_lower = (J_MN.transpose() * J_MN).sparseView();
  linearization.jacobian = J_MN.sparseView();
  linearization.rhs = J_MN.transpose() * linearization.residual;
};

// Create a Values
sym::Valuesd values_init{};
values_init.Set('v', (sym::Vector5d::Ones() * 100).eval());

// Create a Solver (params is a sym::optimizer_params_t, e.g. the defaults)
const sym::optimizer_params_t params = sym::DefaultOptimizerParams();
constexpr double kEpsilon = 1e-10;
sym::LevenbergMarquardtSolverd solver(params, /* id */ "", kEpsilon);
solver.SetIndex(values_init.CreateIndex({'v'}));
solver.Reset(values_init);

// Iterate to convergence; Iterate() returns a termination status once the
// solver should stop
sym::OptimizationStatsd stats;
auto status = solver.Iterate(linearize_func, stats);
while (!status) {
  status = solver.Iterate(linearize_func, stats);
}

// Get the best values
sym::Valuesd values_final = solver.GetBestValues();

The theory:

We start with a nonlinear vector-valued error function that defines an error residual whose squared norm we want to minimize. The residual has dimension M and the state has dimension N.

residual = f(x)

Define a least squares cost function as the squared norm of the residual:

e(x) = 0.5 * ||f(x)||**2 = 0.5 * f(x).T * f(x)

Take the first-order Taylor expansion of f around the linearization point x0:

f(x) = f(x0) + f'(x0) * (x - x0) + ...

Plug this into the cost function to get a quadratic:

e(x) ~= 0.5 * (x - x0).T * f'(x0).T * f'(x0) * (x - x0) + f(x0).T * f'(x0) * (x - x0)
        + 0.5 * f(x0).T * f(x0)

Take derivative with respect to x:

e'(x) = f'(x0).T * f'(x0) * (x - x0) + f'(x0).T * f(x0)

Set the derivative to zero to find the minimum of the quadratic (a paraboloid):

0 = f'(x0).T * f'(x0) * (x - x0) + f'(x0).T * f(x0)
(x - x0) = - inv(f'(x0).T * f'(x0)) * f'(x0).T * f(x0)
x = x0 - inv(f'(x0).T * f'(x0)) * f'(x0).T * f(x0)
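
This is easy to check numerically. The following standalone Eigen snippet (illustrative only, not part of the solver API) verifies that the Gauss-Newton step zeroes the gradient J.T * (b + J * dx) of the quadratic:

#include <iostream>

#include <Eigen/Dense>

int main() {
  // Random full-column-rank problem, M = 9, N = 5
  const Eigen::Matrix<double, 9, 5> J = Eigen::Matrix<double, 9, 5>::Random();
  const Eigen::Matrix<double, 9, 1> b = Eigen::Matrix<double, 9, 1>::Random();

  // dx = -inv(J.T * J) * J.T * b, solved with a Cholesky factorization
  // instead of an explicit inverse
  const Eigen::Matrix<double, 5, 1> dx =
      (J.transpose() * J).llt().solve(-J.transpose() * b);

  // Gradient at the stepped point; prints ~0 up to numerical precision
  std::cout << (J.transpose() * (b + J * dx)).norm() << std::endl;
}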

Another way to write this is to create some helpful shorthand:

f'(x0) --> jacobian or J (shape = MxN)
f (x0) --> bias or b     (shape = Mx1)
x - x0 --> dx            (shape = Nx1)

Rederiving the Gauss-Newton solution:

e(x) ~= 0.5 * dx.T * J.T * J * dx + b.T * J * dx + 0.5 * b.T * b
e'(x) = J.T * J * dx + J.T * b
x = x0 - inv(J.T * J) * J.T * b

A couple more common names:

f'(x0).T * f'(x0) = J.T * J --> hessian approximation or H (shape = NxN)
f'(x0).T * f (x0) = J.T * b --> right hand side or rhs     (shape = Nx1)

So the iteration loop for optimization is:

J, b = linearize(f, x0)
dx = -inv(J.T * J) * J.T * b
x_new = x0 + dx
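
As a concrete (toy, hypothetical) illustration of this loop, here is the Gauss-Newton iteration applied to the scalar residual f(x) = exp(x) - 2, which converges to x = ln(2):

#include <cmath>
#include <iostream>

int main() {
  double x = 0.0;  // x0
  for (int i = 0; i < 10; ++i) {
    const double b = std::exp(x) - 2.0;    // b = f(x0)
    const double J = std::exp(x);          // J = f'(x0)
    const double dx = -(J * b) / (J * J);  // dx = -inv(J.T * J) * J.T * b
    x += dx;                               // x_new = x0 + dx
  }
  std::cout << x << std::endl;  // ~0.6931 = ln(2)
}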

Technically what we’ve just described is the Gauss-Newton algorithm; the Levenberg-Marquardt algorithm is based on Gauss-Newton, but adds a term to J.T * J before inverting to make sure the matrix is invertible and make the optimization more stable. This additional term typically takes the form lambda * I, or lambda * diag(J.T * J), where lambda is another parameter updated by the solver at each iteration. Configuration of how this term is computed can be found in the optimizer params.
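
For example, with the lambda * diag(J.T * J) form of the damping term, one damped step looks like the following dense Eigen sketch (illustrative only; the actual solver operates on sparse matrices and updates lambda each iteration):

#include <Eigen/Dense>

// One Levenberg-Marquardt step with diagonal damping (a sketch, not the
// SymForce internals)
Eigen::VectorXd LmStep(const Eigen::MatrixXd& J, const Eigen::VectorXd& b,
                       const double lambda) {
  const Eigen::MatrixXd H = J.transpose() * J;  // hessian approximation
  Eigen::MatrixXd H_damped = H;
  H_damped.diagonal() += lambda * H.diagonal();  // H + lambda * diag(H)
  // Solve (H + lambda * diag(H)) * dx = -J.T * b
  return H_damped.ldlt().solve(-J.transpose() * b);
}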

Public Types

using Scalar = ScalarType
using LinearSolver = LinearSolverType
using MatrixType = typename LinearSolverType::MatrixType
using StateType = internal::LevenbergMarquardtState<MatrixType>
using LinearizationType = Linearization<MatrixType>
using FailureReason = levenberg_marquardt_solver_failure_reason_t
using LinearizeFunc = std::function<void(const Values<Scalar>&, LinearizationType&)>
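
For the double-precision alias sym::LevenbergMarquardtSolverd used in the example above, LinearizationType resolves to sym::SparseLinearizationd (assuming the default sparse linear solver), so a LinearizeFunc can be written as:

const sym::LevenbergMarquardtSolverd::LinearizeFunc linearize_func =
    [](const sym::Valuesd& values, sym::SparseLinearizationd& linearization) {
      // Read the current state from values and fill in linearization.residual,
      // .jacobian, .hessian_lower, and .rhs here
    };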

Public Functions

inline LevenbergMarquardtSolver(const optimizer_params_t &p, const std::string &id, const Scalar epsilon)
inline LevenbergMarquardtSolver(const optimizer_params_t &p, const std::string &id, const Scalar epsilon, const LinearSolver &linear_solver)
inline void SetIndex(const index_t &index)
inline void Reset(const Values<Scalar> &values)
inline void ResetState(const Values<Scalar> &values)
inline const optimizer_params_t &Params() const
void UpdateParams(const optimizer_params_t &p)
optional<std::pair<optimization_status_t, FailureReason>> Iterate(const LinearizeFunc &func, OptimizationStats<MatrixType> &stats)
inline const Values<Scalar> &GetBestValues() const
inline const LinearizationType &GetBestLinearization() const
void ComputeCovariance(const MatrixType &hessian_lower, MatrixX<Scalar> &covariance)
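
Given a solver that has been iterated to convergence as in the example above, the state covariance can be recovered from the best linearization's hessian approximation; a minimal sketch using the declarations above:

sym::MatrixX<double> covariance;
solver.ComputeCovariance(solver.GetBestLinearization().hessian_lower, covariance);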