.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-nnls_solving:

================================
Solving Non-linear Least Squares
================================

Introduction
============

Effective use of Ceres requires some familiarity with the basic
components of a non-linear least squares solver, so before we describe
how to configure and use the solver, we will take a brief look at how
some of the core optimization algorithms in Ceres work.

Let :math:`x \in \mathbb{R}^n` be an :math:`n`-dimensional vector of
variables, and
:math:`F(x) = \left[f_1(x), ... , f_{m}(x) \right]^{\top}` be a
:math:`m`-dimensional function of :math:`x`. We are interested in
solving the optimization problem [#f1]_

.. math:: \arg \min_x \frac{1}{2}\|F(x)\|^2\ . \\
   L \le x \le U
   :label: nonlinsq

where :math:`L` and :math:`U` are lower and upper bounds on the
parameter vector :math:`x`.

Since the efficient global minimization of :eq:`nonlinsq` for
general :math:`F(x)` is an intractable problem, we will have to settle
for finding a local minimum.

In the following, the Jacobian :math:`J(x)` of :math:`F(x)` is an
:math:`m\times n` matrix, where :math:`J_{ij}(x) = \partial_j f_i(x)`
and the gradient vector is :math:`g(x) = \nabla \frac{1}{2}\|F(x)\|^2
= J(x)^\top F(x)`.

The general strategy when solving non-linear optimization problems is
to solve a sequence of approximations to the original problem
[NocedalWright]_. At each iteration, the approximation is solved to
determine a correction :math:`\Delta x` to the vector :math:`x`. For
non-linear least squares, an approximation can be constructed by using
the linearization :math:`F(x+\Delta x) \approx F(x) + J(x)\Delta x`,
which leads to the following linear least squares problem:

.. math:: \min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2
   :label: linearapprox

Unfortunately, naively solving a sequence of these problems and
updating :math:`x \leftarrow x+ \Delta x` leads to an algorithm that
may not converge. To get a convergent algorithm, we need to control
the size of the step :math:`\Delta x`. Depending on how the size of
the step :math:`\Delta x` is controlled, non-linear optimization
algorithms can be divided into two major categories [NocedalWright]_.

1. **Trust Region** The trust region approach approximates the
   objective function using a model function (often a quadratic) over
   a subset of the search space known as the trust region. If the
   model function succeeds in minimizing the true objective function,
   the trust region is expanded; otherwise it is contracted and the
   model optimization problem is solved again.

2. **Line Search** The line search approach first finds a descent
   direction along which the objective function will be reduced and
   then computes a step size that decides how far to move along
   that direction. The descent direction can be computed by various
   methods, such as gradient descent, Newton's method and quasi-Newton
   methods. The step size can be determined either exactly or
   inexactly.

Trust region methods are in some sense dual to line search methods:
trust region methods first choose a step size (the size of the trust
region) and then a step direction, while line search methods first
choose a step direction and then a step size. Ceres implements
multiple algorithms in both categories.

.. _section-trust-region-methods:

Trust Region Methods
====================

The basic trust region algorithm looks something like this.

 1. Given an initial point :math:`x` and a trust region radius :math:`\mu`.
 2. Solve

    .. math::
       \arg \min_{\Delta x}& \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\
       \text{such that} &\|D(x)\Delta x\|^2 \le \mu\\
       &L \le x + \Delta x \le U.

 3. :math:`\rho = \frac{\displaystyle \|F(x + \Delta x)\|^2 -
    \|F(x)\|^2}{\displaystyle \|J(x)\Delta x + F(x)\|^2 -
    \|F(x)\|^2}`
 4. if :math:`\rho > \epsilon` then :math:`x = x + \Delta x`.
 5. if :math:`\rho > \eta_1` then :math:`\mu = 2 \mu`
 6. else if :math:`\rho < \eta_2` then :math:`\mu = 0.5 * \mu`
 7. Go to 2.

Here, :math:`\mu` is the trust region radius, :math:`D(x)` is some
matrix used to define a metric on the domain of :math:`F(x)` and
:math:`\rho` measures the quality of the step :math:`\Delta x`, i.e.,
how well the linear model predicted the decrease in the value of the
non-linear objective. The idea is to increase or decrease the radius
of the trust region depending on how well the linearization predicts
the behavior of the non-linear objective, which in turn is reflected
in the value of :math:`\rho`.

The key computational step in a trust-region algorithm is the solution
of the constrained optimization problem

.. math::
   \arg \min_{\Delta x}&\quad \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\
   \text{such that} &\quad \|D(x)\Delta x\|^2 \le \mu\\
   &\quad L \le x + \Delta x \le U.
   :label: trp

There are a number of different ways of solving this problem, each
giving rise to a different concrete trust-region algorithm. Currently,
Ceres implements two trust-region algorithms - Levenberg-Marquardt
and Dogleg, each of which is augmented with a line search if bound
constraints are present [Kanzow]_. The user can choose between them by
setting :member:`Solver::Options::trust_region_strategy_type`.
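
For example, a minimal sketch of selecting a strategy (assuming a
:class:`Problem` named ``problem`` has already been constructed):

.. code-block:: c++

   // Assumes `problem` is an already constructed ceres::Problem.
   ceres::Solver::Options options;
   options.minimizer_type = ceres::TRUST_REGION;
   options.trust_region_strategy_type = ceres::LEVENBERG_MARQUARDT;  // or ceres::DOGLEG

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);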

.. rubric:: Footnotes

.. [#f1] At the level of the non-linear solver, the block structure is
         not relevant, therefore our discussion here is in terms of an
         optimization problem defined over a state vector of size
         :math:`n`. Similarly the presence of loss functions is also
         ignored as the problem is internally converted into a pure
         non-linear least squares problem.


.. _section-levenberg-marquardt:

Levenberg-Marquardt
-------------------

The Levenberg-Marquardt algorithm [Levenberg]_ [Marquardt]_ is the
most popular algorithm for solving non-linear least squares problems.
It was also the first trust region algorithm to be developed
[Levenberg]_ [Marquardt]_. Ceres implements an exact step [Madsen]_
and an inexact step variant of the Levenberg-Marquardt algorithm
[WrightHolt]_ [NashSofer]_.

It can be shown that the solution to :eq:`trp` can be obtained by
solving an unconstrained optimization problem of the form

.. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 +\lambda \|D(x)\Delta x\|^2

where :math:`\lambda` is a Lagrange multiplier that is inversely
related to :math:`\mu`. In Ceres, we solve for

.. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 + \frac{1}{\mu} \|D(x)\Delta x\|^2
   :label: lsqr

The matrix :math:`D(x)` is a non-negative diagonal matrix, typically
the square root of the diagonal of the matrix :math:`J(x)^\top J(x)`.

Before going further, let us make some notational simplifications. We
will assume that the matrix :math:`\frac{1}{\sqrt{\mu}} D` has been concatenated
at the bottom of the matrix :math:`J` and similarly a vector of zeros
has been added to the bottom of the vector :math:`f` and the rest of
our discussion will be in terms of :math:`J` and :math:`F`, i.e., the
linear least squares problem.

.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + F(x)\|^2 .
   :label: simple

For all but the smallest problems the solution of :eq:`simple` in
each iteration of the Levenberg-Marquardt algorithm is the dominant
computational cost in Ceres. Ceres provides a number of different
options for solving :eq:`simple`. There are two major classes of
methods - factorization and iterative.

The factorization methods are based on computing an exact solution of
:eq:`lsqr` using a Cholesky or a QR factorization and lead to an exact
step Levenberg-Marquardt algorithm. But it is not clear if an exact
solution of :eq:`lsqr` is necessary at each step of the LM algorithm
to solve :eq:`nonlinsq`. In fact, we have already seen evidence
that this may not be the case, as :eq:`lsqr` is itself a regularized
version of :eq:`linearapprox`. Indeed, it is possible to
construct non-linear optimization algorithms in which the linearized
problem is solved approximately. These algorithms are known as inexact
Newton or truncated Newton methods [NocedalWright]_.

An inexact Newton method requires two ingredients. First, a cheap
method for approximately solving systems of linear
equations. Typically an iterative linear solver like the Conjugate
Gradients method is used for this
purpose [NocedalWright]_. Second, a termination rule for
the iterative solver. A typical termination rule is of the form

.. math:: \|H(x) \Delta x + g(x)\| \leq \eta_k \|g(x)\|.
   :label: inexact

Here, :math:`k` indicates the Levenberg-Marquardt iteration number and
:math:`0 < \eta_k <1` is known as the forcing sequence. [WrightHolt]_
prove that a truncated Levenberg-Marquardt algorithm that uses an
inexact Newton step based on :eq:`inexact` converges for any
sequence :math:`\eta_k \leq \eta_0 < 1` and the rate of convergence
depends on the choice of the forcing sequence :math:`\eta_k`.

Ceres supports both exact and inexact step solution strategies. When
the user chooses a factorization based linear solver, the exact step
Levenberg-Marquardt algorithm is used. When the user chooses an
iterative linear solver, the inexact step Levenberg-Marquardt
algorithm is used.

.. _section-dogleg:

Dogleg
------

Another strategy for solving the trust region problem :eq:`trp` was
introduced by M. J. D. Powell. The key idea there is to compute two
vectors

.. math::

        \Delta x^{\text{Gauss-Newton}} &= \arg \min_{\Delta x}\frac{1}{2} \|J(x)\Delta x + f(x)\|^2.\\
        \Delta x^{\text{Cauchy}} &= -\frac{\|g(x)\|^2}{\|J(x)g(x)\|^2}g(x).

Note that the vector :math:`\Delta x^{\text{Gauss-Newton}}` is the
solution to :eq:`linearapprox` and :math:`\Delta
x^{\text{Cauchy}}` is the vector that minimizes the linear
approximation if we restrict ourselves to moving along the direction
of the gradient. The Dogleg method finds a vector :math:`\Delta x`
defined by :math:`\Delta x^{\text{Gauss-Newton}}` and :math:`\Delta
x^{\text{Cauchy}}` that solves the trust region problem. Ceres
supports two variants that can be chosen by setting
:member:`Solver::Options::dogleg_type`.
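
For example, a minimal sketch of selecting the subspace variant:

.. code-block:: c++

   ceres::Solver::Options options;
   options.trust_region_strategy_type = ceres::DOGLEG;
   options.dogleg_type = ceres::SUBSPACE_DOGLEG;  // or ceres::TRADITIONAL_DOGLEG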

``TRADITIONAL_DOGLEG`` as described by Powell, constructs two line
segments using the Gauss-Newton and Cauchy vectors and finds the point
farthest along this line shaped like a dogleg (hence the name) that is
contained in the trust-region. For more details on the exact reasoning
and computations, please see Madsen et al [Madsen]_.

``SUBSPACE_DOGLEG`` is a more sophisticated method that considers the
entire two dimensional subspace spanned by these two vectors and finds
the point that minimizes the trust region problem in this subspace
[ByrdSchnabel]_.

The key advantage of Dogleg over Levenberg-Marquardt is that if
the step computation for a particular choice of :math:`\mu` does not
result in sufficient decrease in the value of the objective function,
Levenberg-Marquardt solves the linear approximation from scratch with
a smaller value of :math:`\mu`. Dogleg, on the other hand, only needs
to compute the interpolation between the Gauss-Newton and the Cauchy
vectors, as neither of them depend on the value of :math:`\mu`.

The Dogleg method can only be used with the exact factorization based
linear solvers.

.. _section-inner-iterations:

Inner Iterations
----------------

Some non-linear least squares problems have additional structure in
the way the parameter blocks interact that makes it beneficial to modify
the way the trust region step is computed. For example, consider the
following regression problem

.. math:: y = a_1 e^{b_1 x} + a_2 e^{b_2 x^2 + c_1}


Given a set of pairs :math:`\{(x_i, y_i)\}`, the user wishes to estimate
:math:`a_1, a_2, b_1, b_2`, and :math:`c_1`.

Notice that the expression on the right is linear in :math:`a_1` and
:math:`a_2`, and given any value for :math:`b_1, b_2` and :math:`c_1`,
it is possible to use linear regression to estimate the optimal values
of :math:`a_1` and :math:`a_2`. It is possible to analytically
eliminate the variables :math:`a_1` and :math:`a_2` from the problem
entirely. Problems like these are known as separable least squares
problems and the most famous algorithm for solving them is the Variable
Projection algorithm invented by Golub & Pereyra [GolubPereyra]_.

Similar structure can be found in the matrix factorization with
missing data problem. There the corresponding algorithm is known as
Wiberg's algorithm [Wiberg]_.

Ruhe & Wedin present an analysis of various algorithms for solving
separable non-linear least squares problems and refer to *Variable
Projection* as Algorithm I in their paper [RuheWedin]_.

Implementing Variable Projection is tedious and expensive. Ruhe &
Wedin present a simpler algorithm with comparable convergence
properties, which they call Algorithm II. Algorithm II performs an
additional optimization step to estimate :math:`a_1` and :math:`a_2`
exactly after computing a successful Newton step.


This idea can be generalized to cases where the residual is not
linear in :math:`a_1` and :math:`a_2`, i.e.,

.. math:: y = f_1(a_1, e^{b_1 x}) + f_2(a_2, e^{b_2 x^2 + c_1})

In this case, we solve for the trust region step for the full problem,
and then use it as the starting point to further optimize just :math:`a_1`
and :math:`a_2`. For the linear case, this amounts to doing a single linear
least squares solve. For non-linear problems, any method for solving
the :math:`a_1` and :math:`a_2` optimization problems will do. The
only constraint on :math:`a_1` and :math:`a_2` (if they are two
different parameter blocks) is that they do not co-occur in a residual
block.

This idea can be further generalized, by not just optimizing
:math:`(a_1, a_2)`, but decomposing the graph corresponding to the
Hessian matrix's sparsity structure into a collection of
non-overlapping independent sets and optimizing each of them.

Setting :member:`Solver::Options::use_inner_iterations` to ``true``
enables the use of this non-linear generalization of Ruhe & Wedin's
Algorithm II. This version of Ceres has a higher iteration
complexity, but also displays better convergence behavior per
iteration.

Setting :member:`Solver::Options::num_threads` to the maximum number
possible is highly recommended.
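
For example, a minimal sketch (the thread count here is only
illustrative):

.. code-block:: c++

   ceres::Solver::Options options;
   options.use_inner_iterations = true;
   options.num_threads = 4;  // illustrative; use as many threads as the machine allows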

.. _section-non-monotonic-steps:

Non-monotonic Steps
-------------------

Note that the basic trust-region algorithm described in
:ref:`section-trust-region-methods` is a descent algorithm in that it
only accepts a point if it strictly reduces the value of the objective
function.

Relaxing this requirement allows the algorithm to be more efficient in
the long term at the cost of some local increase in the value of the
objective function.

This is because allowing for non-decreasing objective function values
in a principled manner allows the algorithm to *jump over boulders* as
the method is not restricted to move into narrow valleys while
preserving its convergence properties.

Setting :member:`Solver::Options::use_nonmonotonic_steps` to ``true``
enables the non-monotonic trust region algorithm as described by Conn,
Gould & Toint in [Conn]_.

Even though the value of the objective function may be larger
than the minimum value encountered over the course of the
optimization, the final parameters returned to the user are the
ones corresponding to the minimum cost over all iterations.

The option to take non-monotonic steps is available for all trust
region strategies.
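
A minimal sketch of enabling this behavior:

.. code-block:: c++

   ceres::Solver::Options options;
   options.use_nonmonotonic_steps = true;
   options.max_consecutive_nonmonotonic_steps = 5;  // the default window size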


.. _section-line-search-methods:

Line Search Methods
===================

The line search method in Ceres Solver cannot handle bounds
constraints right now, so it can only be used for solving
unconstrained problems.

The basic line search algorithm looks something like this.

 1. Given an initial point :math:`x`
 2. :math:`\Delta x = -H^{-1}(x) g(x)`
 3. :math:`\arg \min_\mu \frac{1}{2} \| F(x + \mu \Delta x) \|^2`
 4. :math:`x = x + \mu \Delta x`
 5. Goto 2.

Here :math:`H(x)` is some approximation to the Hessian of the
objective function, and :math:`g(x)` is the gradient at
:math:`x`. Depending on the choice of :math:`H(x)` we get a variety of
different search directions :math:`\Delta x`.

Step 3, which is a one dimensional optimization or `Line Search` along
:math:`\Delta x`, is what gives this class of methods its name.

Different line search algorithms differ in their choice of the search
direction :math:`\Delta x` and the method used for one dimensional
optimization along :math:`\Delta x`. The choice of :math:`H(x)` is the
primary source of computational complexity in these
methods. Currently, Ceres Solver supports four choices of search
directions, all aimed at large scale problems.

1. ``STEEPEST_DESCENT`` This corresponds to choosing :math:`H(x)` to
   be the identity matrix. This is not a good search direction for
   anything but the simplest of problems. It is only included here
   for completeness.

2. ``NONLINEAR_CONJUGATE_GRADIENT`` A generalization of the Conjugate
   Gradient method to non-linear functions. The generalization can be
   performed in a number of different ways, resulting in a variety of
   search directions. Ceres Solver currently supports
   ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and ``HESTENES_STIEFEL``
   directions.

3. ``BFGS`` A generalization of the Secant method to multiple
   dimensions in which a full, dense approximation to the inverse
   Hessian is maintained and used to compute a quasi-Newton step
   [NocedalWright]_. BFGS is currently the best known general
   quasi-Newton algorithm.

4. ``LBFGS`` A limited memory approximation to the full ``BFGS``
   method in which the last :math:`M` iterations are used to approximate the
   inverse Hessian used to compute a quasi-Newton step [Nocedal]_,
   [ByrdNocedal]_.

Currently Ceres Solver supports both a backtracking and interpolation
based Armijo line search algorithm, and a sectioning / zoom
interpolation (strong) Wolfe condition line search algorithm.
However, note that in order for the assumptions underlying the
``BFGS`` and ``LBFGS`` methods to be guaranteed to be satisfied, the
Wolfe line search algorithm should be used.
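
A minimal sketch of configuring the line search minimizer accordingly:

.. code-block:: c++

   ceres::Solver::Options options;
   options.minimizer_type = ceres::LINE_SEARCH;
   options.line_search_direction_type = ceres::LBFGS;
   options.line_search_type = ceres::WOLFE;  // recommended for BFGS/LBFGS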

.. _section-linear-solver:

LinearSolver
============

Recall that in both of the trust-region methods described above, the
key computational cost is the solution of a linear least squares
problem of the form

.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + f(x)\|^2 .
   :label: simple2

Let :math:`H(x)= J(x)^\top J(x)` and :math:`g(x) = -J(x)^\top
f(x)`. For notational convenience let us also drop the dependence on
:math:`x`. Then it is easy to see that solving :eq:`simple2` is
equivalent to solving the *normal equations*

.. math:: H \Delta x = g
   :label: normal

Ceres provides a number of different options for solving :eq:`normal`.

.. _section-qr:

``DENSE_QR``
------------

For small problems (a couple of hundred parameters and a few thousand
residuals) with relatively dense Jacobians, ``DENSE_QR`` is the method
of choice [Bjorck]_. Let :math:`J = QR` be the QR-decomposition of
:math:`J`, where :math:`Q` is an orthonormal matrix and :math:`R` is
an upper triangular matrix [TrefethenBau]_. Then it can be shown that
the solution to :eq:`normal` is given by

.. math:: \Delta x^* = -R^{-1}Q^\top f


Ceres uses ``Eigen`` 's dense QR factorization routines.
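
To make the algebra concrete, here is an illustrative (not
Ceres-internal) sketch of the same computation using ``Eigen``:

.. code-block:: c++

   #include "Eigen/Dense"

   // Illustrative only: solve min_dx ||J dx + f||^2 via a dense QR
   // factorization of J. Eigen's solve() returns the least squares
   // solution dx = -R^{-1} Q^T f.
   Eigen::VectorXd SolveViaDenseQR(const Eigen::MatrixXd& J,
                                   const Eigen::VectorXd& f) {
     return J.householderQr().solve(-f);
   }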

.. _section-cholesky:

``DENSE_NORMAL_CHOLESKY`` & ``SPARSE_NORMAL_CHOLESKY``
------------------------------------------------------

Large non-linear least squares problems are usually sparse. In such
cases, using a dense QR factorization is inefficient. Let :math:`H =
R^\top R` be the Cholesky factorization of the normal equations, where
:math:`R` is an upper triangular matrix, then the solution to
:eq:`normal` is given by

.. math::

   \Delta x^* = R^{-1} R^{-\top} g.


The observant reader will note that the :math:`R` in the Cholesky
factorization of :math:`H` is the same upper triangular matrix
:math:`R` in the QR factorization of :math:`J`. Since :math:`Q` is an
orthonormal matrix, :math:`J=QR` implies that :math:`J^\top J = R^\top
Q^\top Q R = R^\top R`. There are two variants of Cholesky
factorization -- sparse and dense.

``DENSE_NORMAL_CHOLESKY`` as the name implies performs a dense
Cholesky factorization of the normal equations. Ceres uses
``Eigen`` 's dense LDLT factorization routines.

``SPARSE_NORMAL_CHOLESKY``, as the name implies, performs a sparse
Cholesky factorization of the normal equations. This leads to
substantial savings in time and memory for large sparse
problems. Ceres uses the sparse Cholesky factorization routines in
Professor Tim Davis' ``SuiteSparse`` or ``CXSparse`` packages [Chen]_
or the sparse Cholesky factorization algorithm in ``Eigen`` (which
incidentally is a port of the algorithm implemented inside ``CXSparse``).
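
A minimal sketch of selecting a sparse Cholesky based solver (assuming
Ceres was built with ``SuiteSparse``):

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
   // Assumes Ceres was built with SuiteSparse support.
   options.sparse_linear_algebra_library_type = ceres::SUITE_SPARSE;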

.. _section-cgnr:

``CGNR``
--------

For general sparse problems, if the problem is too large for
``CHOLMOD`` or a sparse linear algebra library is not linked into
Ceres, another option is the ``CGNR`` solver. This solver uses the
Conjugate Gradients solver on the *normal equations*, but without
forming the normal equations explicitly. It exploits the relation

.. math::
   H x = J^\top J x = J^\top(J x)

The convergence of Conjugate Gradients depends on the condition
number :math:`\kappa(H)`. Usually :math:`H` is poorly conditioned and
a :ref:`section-preconditioner` must be used to get reasonable
performance. Currently only the ``JACOBI`` preconditioner is available
for use with ``CGNR``. It uses the block diagonal of :math:`H` to
precondition the normal equations.

When the user chooses ``CGNR`` as the linear solver, Ceres
automatically switches from the exact step algorithm to an inexact
step algorithm.
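
An illustrative (not Ceres-internal) ``Eigen`` sketch of evaluating
:math:`Hx` without forming :math:`H`:

.. code-block:: c++

   #include "Eigen/Sparse"

   // Illustrative only: evaluate H * x = J^T (J * x) using two sparse
   // matrix-vector products, without ever forming H = J^T J.
   Eigen::VectorXd HTimesX(const Eigen::SparseMatrix<double>& J,
                           const Eigen::VectorXd& x) {
     const Eigen::VectorXd Jx = J * x;
     return J.transpose() * Jx;
   }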

.. _section-schur:

``DENSE_SCHUR`` & ``SPARSE_SCHUR``
----------------------------------

While it is possible to use ``SPARSE_NORMAL_CHOLESKY`` to solve bundle
adjustment problems, bundle adjustment problems have a special
structure, and a more efficient scheme for solving :eq:`normal`
can be constructed.

Suppose that the SfM problem consists of :math:`p` cameras and
:math:`q` points and the variable vector :math:`x` has the block
structure :math:`x = [y_{1}, ... ,y_{p},z_{1}, ... ,z_{q}]`, where
:math:`y` and :math:`z` correspond to camera and point parameters,
respectively. Further, let the camera blocks be of size :math:`c` and
the point blocks be of size :math:`s` (for most problems :math:`c` =
:math:`6`--:math:`9` and :math:`s = 3`). Ceres does not impose any constancy
requirement on these block sizes, but choosing them to be constant
simplifies the exposition.

A key characteristic of the bundle adjustment problem is that there is
no term :math:`f_{i}` that includes two or more point blocks. This in
turn implies that the matrix :math:`H` is of the form

.. math:: H = \left[ \begin{matrix} B & E\\ E^\top & C \end{matrix} \right]\ ,
   :label: hblock

where :math:`B \in \mathbb{R}^{pc\times pc}` is a block sparse matrix
with :math:`p` blocks of size :math:`c\times c` and :math:`C \in
\mathbb{R}^{qs\times qs}` is a block diagonal matrix with :math:`q` blocks
of size :math:`s\times s`. :math:`E \in \mathbb{R}^{pc\times qs}` is a
general block sparse matrix, with a block of size :math:`c\times s`
for each observation. Let us now block partition :math:`\Delta x =
[\Delta y,\Delta z]` and :math:`g=[v,w]` to restate :eq:`normal`
as the block structured linear system

.. math:: \left[ \begin{matrix} B & E\\ E^\top & C \end{matrix}
          \right]\left[ \begin{matrix} \Delta y \\ \Delta z
          \end{matrix} \right] = \left[ \begin{matrix} v\\ w
          \end{matrix} \right]\ ,
   :label: linear2

and apply Gaussian elimination to it. As we noted above, :math:`C` is
a block diagonal matrix, with small diagonal blocks of size
:math:`s\times s`. Thus, calculating the inverse of :math:`C` by
inverting each of these blocks is cheap. This allows us to eliminate
:math:`\Delta z` by observing that :math:`\Delta z = C^{-1}(w - E^\top
\Delta y)`, giving us

.. math:: \left[B - EC^{-1}E^\top\right] \Delta y = v - EC^{-1}w\ .
   :label: schur

The matrix

.. math:: S = B - EC^{-1}E^\top

is the Schur complement of :math:`C` in :math:`H`. It is also known as
the *reduced camera matrix*, because the only variables
participating in :eq:`schur` are the ones corresponding to the
cameras. :math:`S \in \mathbb{R}^{pc\times pc}` is a block structured
symmetric positive definite matrix, with blocks of size :math:`c\times
c`. The block :math:`S_{ij}` corresponding to the pair of images
:math:`i` and :math:`j` is non-zero if and only if the two images
observe at least one common point.


Now, :eq:`linear2` can be solved by first forming :math:`S`, solving for
:math:`\Delta y`, and then back-substituting :math:`\Delta y` to
obtain the value of :math:`\Delta z`. Thus, the solution of what was
an :math:`n\times n`, :math:`n=pc+qs` linear system is reduced to the
inversion of the block diagonal matrix :math:`C`, a few matrix-matrix
and matrix-vector multiplies, and the solution of a block sparse
:math:`pc\times pc` linear system :eq:`schur`. For almost all
problems, the number of cameras is much smaller than the number of
points, :math:`p \ll q`, thus solving :eq:`schur` is
significantly cheaper than solving :eq:`linear2`. This is the
*Schur complement trick* [Brown]_.
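
The following dense ``Eigen`` sketch (illustrative only, ignoring the
block structure that makes this efficient in practice) spells out the
elimination and back-substitution steps:

.. code-block:: c++

   #include "Eigen/Dense"

   // Illustrative only: solve [B E; E^T C][dy; dz] = [v; w] by forming
   // the Schur complement S, solving the reduced camera system for dy,
   // and back-substituting for dz.
   void SchurSolve(const Eigen::MatrixXd& B, const Eigen::MatrixXd& E,
                   const Eigen::MatrixXd& C, const Eigen::VectorXd& v,
                   const Eigen::VectorXd& w, Eigen::VectorXd* dy,
                   Eigen::VectorXd* dz) {
     const Eigen::MatrixXd C_inv = C.inverse();  // cheap when C is block diagonal
     const Eigen::MatrixXd S = B - E * C_inv * E.transpose();
     *dy = S.ldlt().solve(v - E * C_inv * w);    // reduced camera system
     *dz = C_inv * (w - E.transpose() * (*dy));  // back substitution
   }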

This still leaves open the question of solving :eq:`schur`. The
method of choice for solving symmetric positive definite systems
exactly is via the Cholesky factorization [TrefethenBau]_ and
depending upon the structure of the matrix, there are, in general, two
options. The first is direct factorization, where we store and factor
:math:`S` as a dense matrix [TrefethenBau]_. This method has
:math:`O(p^2)` space complexity and :math:`O(p^3)` time complexity and
is only practical for problems with up to a few hundred cameras. Ceres
implements this strategy as the ``DENSE_SCHUR`` solver.


But, :math:`S` is typically a fairly sparse matrix, as most images
only see a small fraction of the scene. This leads us to the second
option: Sparse Direct Methods. These methods store :math:`S` as a
sparse matrix, use row and column re-ordering algorithms to maximize
the sparsity of the Cholesky decomposition, and focus their compute
effort on the non-zero part of the factorization [Chen]_. Sparse
direct methods, depending on the exact sparsity structure of the Schur
complement, allow bundle adjustment algorithms to significantly scale
up over those based on dense factorization. Ceres implements this
strategy as the ``SPARSE_SCHUR`` solver.

.. _section-iterative_schur:

``ITERATIVE_SCHUR``
-------------------

Another option for bundle adjustment problems is to apply
Preconditioned Conjugate Gradients to the reduced camera matrix
:math:`S` instead of :math:`H`. One reason to do this is that
:math:`S` is a much smaller matrix than :math:`H`, but more
importantly, it can be shown that :math:`\kappa(S)\leq \kappa(H)`.
Ceres implements Conjugate Gradients on :math:`S` as the
``ITERATIVE_SCHUR`` solver. When the user chooses ``ITERATIVE_SCHUR``
as the linear solver, Ceres automatically switches from the exact step
algorithm to an inexact step algorithm.

The key computational operation when using Conjugate Gradients is the
evaluation of the matrix vector product :math:`Sx` for an arbitrary
vector :math:`x`. There are two ways in which this product can be
evaluated, and this can be controlled using
``Solver::Options::use_explicit_schur_complement``. Depending on the
problem at hand, the performance difference between these two methods
can be quite substantial.

 1. **Implicit** This is the default. Implicit evaluation is suitable
    for large problems where the cost of computing and storing the
    Schur Complement :math:`S` is prohibitive. Because PCG only needs
    access to :math:`S` via its product with a vector, one way to
    evaluate :math:`Sx` is to observe that

    .. math:: x_1 &= E^\top x\\
              x_2 &= C^{-1} x_1\\
              x_3 &= Ex_2\\
              x_4 &= Bx\\
              Sx &= x_4 - x_3
       :label: schurtrick1

    Thus, we can run PCG on :math:`S` with the same computational
    effort per iteration as PCG on :math:`H`, while reaping the
    benefits of a more powerful preconditioner. In fact, we do not
    even need to compute :math:`H`, :eq:`schurtrick1` can be
    implemented using just the columns of :math:`J`.

    Equation :eq:`schurtrick1` is closely related to *Domain
    Decomposition methods* for solving large linear systems that
    arise in structural engineering and partial differential
    equations. In the language of Domain Decomposition, each point in
    a bundle adjustment problem is a domain, and the cameras form the
    interface between these domains. The iterative solution of the
    Schur complement then falls within the sub-category of techniques
    known as Iterative Sub-structuring [Saad]_ [Mathew]_.

 2. **Explicit** The complexity of implicit matrix-vector product
    evaluation scales with the number of non-zeros in the
    Jacobian. For small to medium sized problems, the cost of
    constructing the Schur Complement is small enough that it is
    better to construct it explicitly in memory and use it to
    evaluate the product :math:`Sx`.

.. NOTE::

   In exact arithmetic, the choice of implicit versus explicit Schur
   complement would have no impact on solution quality. However, in
   practice if the Jacobian is poorly conditioned, one may observe
   (usually small) differences in solution quality. This is a
   natural consequence of performing computations in finite arithmetic.
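
A minimal sketch of requesting the explicit variant:

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::ITERATIVE_SCHUR;
   // Worthwhile for small to medium sized problems.
   options.use_explicit_schur_complement = true;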
691
692
693.. _section-preconditioner:
694
695Preconditioner
Austin Schuh1d1e6ea2020-12-23 21:56:30 -0800696==============
Austin Schuh70cc9552019-01-21 19:46:48 -0800697
698The convergence rate of Conjugate Gradients for
699solving :eq:`normal` depends on the distribution of eigenvalues
700of :math:`H` [Saad]_. A useful upper bound is
701:math:`\sqrt{\kappa(H)}`, where, :math:`\kappa(H)` is the condition
702number of the matrix :math:`H`. For most bundle adjustment problems,
703:math:`\kappa(H)` is high and a direct application of Conjugate
704Gradients to :eq:`normal` results in extremely poor performance.
705
706The solution to this problem is to replace :eq:`normal` with a
707*preconditioned* system. Given a linear system, :math:`Ax =b` and a
708preconditioner :math:`M` the preconditioned system is given by
709:math:`M^{-1}Ax = M^{-1}b`. The resulting algorithm is known as
710Preconditioned Conjugate Gradients algorithm (PCG) and its worst case
711complexity now depends on the condition number of the *preconditioned*
712matrix :math:`\kappa(M^{-1}A)`.
713
714The computational cost of using a preconditioner :math:`M` is the cost
715of computing :math:`M` and evaluating the product :math:`M^{-1}y` for
716arbitrary vectors :math:`y`. Thus, there are two competing factors to
717consider: How much of :math:`H`'s structure is captured by :math:`M`
718so that the condition number :math:`\kappa(HM^{-1})` is low, and the
719computational cost of constructing and using :math:`M`. The ideal
720preconditioner would be one for which :math:`\kappa(M^{-1}A)
721=1`. :math:`M=A` achieves this, but it is not a practical choice, as
722applying this preconditioner would require solving a linear system
723equivalent to the unpreconditioned problem. It is usually the case
724that the more information :math:`M` has about :math:`H`, the more
725expensive it is use. For example, Incomplete Cholesky factorization
726based preconditioners have much better convergence behavior than the
727Jacobi preconditioner, but are also much more expensive.
728
Austin Schuh1d1e6ea2020-12-23 21:56:30 -0800729For a survey of the state of the art in preconditioning linear least
730squares problems with general sparsity structure see [GouldScott]_.
731
732Ceres Solver comes with an number of preconditioners suited for
733problems with general sparsity as well as the special sparsity
734structure encountered in bundle adjustment problems.
735
736``JACOBI``
737----------
738
Austin Schuh70cc9552019-01-21 19:46:48 -0800739The simplest of all preconditioners is the diagonal or Jacobi
740preconditioner, i.e., :math:`M=\operatorname{diag}(A)`, which for
741block structured matrices like :math:`H` can be generalized to the
Austin Schuh1d1e6ea2020-12-23 21:56:30 -0800742block Jacobi preconditioner. The ``JACOBI`` preconditioner in Ceres
743when used with :ref:`section-cgnr` refers to the block diagonal of
744:math:`H` and when used with :ref:`section-iterative_schur` refers to
745the block diagonal of :math:`B` [Mandel]_. For detailed performance
746data about the performance of ``JACOBI`` on bundle adjustment problems
747see [Agarwal]_.
748
749
750``SCHUR_JACOBI``
751----------------
Austin Schuh70cc9552019-01-21 19:46:48 -0800752
753Another obvious choice for :ref:`section-iterative_schur` is the block
754diagonal of the Schur complement matrix :math:`S`, i.e, the block
Austin Schuh1d1e6ea2020-12-23 21:56:30 -0800755Jacobi preconditioner for :math:`S`. In Ceres we refer to it as the
756``SCHUR_JACOBI`` preconditioner. For detailed performance data about
757the performance of ``SCHUR_JACOBI`` on bundle adjustment problems see
758[Agarwal]_.
759
760
761``CLUSTER_JACOBI`` and ``CLUSTER_TRIDIAGONAL``
762----------------------------------------------
Austin Schuh70cc9552019-01-21 19:46:48 -0800763
764For bundle adjustment problems arising in reconstruction from
765community photo collections, more effective preconditioners can be
766constructed by analyzing and exploiting the camera-point visibility
Austin Schuh1d1e6ea2020-12-23 21:56:30 -0800767structure of the scene.
768
769The key idea is to cluster the cameras based on the visibility
770structure of the scene. The similarity between a pair of cameras
771:math:`i` and :math:`j` is given by:
772
773 .. math:: S_{ij} = \frac{|V_i \cap V_j|}{|V_i| |V_j|}
774
775Here :math:`V_i` is the set of scene points visible in camera
776:math:`i`. This idea was first exploited by [KushalAgarwal]_ to create
777the ``CLUSTER_JACOBI`` and the ``CLUSTER_TRIDIAGONAL`` preconditioners
778which Ceres implements.
779
780The performance of these two preconditioners depends on the speed and
781clustering quality of the clustering algorithm used when building the
782preconditioner. In the original paper, [KushalAgarwal]_ used the
783Canonical Views algorithm [Simon]_, which while producing high quality
784clusterings can be quite expensive for large graphs. So, Ceres
785supports two visibility clustering algorithms - ``CANONICAL_VIEWS``
786and ``SINGLE_LINKAGE``. The former is as the name implies Canonical
787Views algorithm of [Simon]_. The latter is the the classic `Single
788Linkage Clustering
789<https://en.wikipedia.org/wiki/Single-linkage_clustering>`_
790algorithm. The choice of clustering algorithm is controlled by
791:member:`Solver::Options::visibility_clustering_type`.
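
A minimal sketch of selecting a visibility based preconditioner:

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::ITERATIVE_SCHUR;
   options.preconditioner_type = ceres::CLUSTER_TRIDIAGONAL;
   options.visibility_clustering_type = ceres::CANONICAL_VIEWS;  // or ceres::SINGLE_LINKAGE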

``SUBSET``
----------

This is a preconditioner for problems with general sparsity. Given a
subset of residual blocks of a problem, it uses the corresponding
subset of the rows of the Jacobian to construct a preconditioner
[Dellaert]_.

Suppose the Jacobian :math:`J` has been horizontally partitioned as

.. math:: J = \begin{bmatrix} P \\ Q \end{bmatrix}

where :math:`Q` is the set of rows corresponding to the residual
blocks in
:member:`Solver::Options::residual_blocks_for_subset_preconditioner`. The
preconditioner is the matrix :math:`(Q^\top Q)^{-1}`.

The efficacy of the preconditioner depends on how well the matrix
:math:`Q^\top Q` approximates :math:`J^\top J`, or how well the chosen
residual blocks approximate the full problem.
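
A minimal sketch of wiring this up with the ``CGNR`` solver
(``cost_function`` and ``x`` are hypothetical placeholders for an
actual residual block):

.. code-block:: c++

   ceres::Problem problem;
   std::unordered_set<ceres::ResidualBlockId> subset;
   // `cost_function` and `x` are hypothetical placeholders; keep the ids
   // of the residual blocks whose rows should form the preconditioner.
   subset.insert(problem.AddResidualBlock(cost_function, nullptr, &x));

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::CGNR;
   options.preconditioner_type = ceres::SUBSET;
   options.residual_blocks_for_subset_preconditioner = subset;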


.. _section-ordering:

Ordering
========

The order in which variables are eliminated in a linear solver can
have a significant impact on the efficiency and accuracy of the
method. For example when doing sparse Cholesky factorization, there
are matrices for which a good ordering will give a Cholesky factor
with :math:`O(n)` storage, whereas a bad ordering will result in a
completely dense factor.

Ceres allows the user to provide varying amounts of hints to the
solver about the variable elimination ordering to use. This can range
from no hints, where the solver is free to decide the best ordering
based on the user's choices like the linear solver being used, to an
exact order in which the variables should be eliminated, and a variety
of possibilities in between.

Instances of the :class:`ParameterBlockOrdering` class are used to
communicate this information to Ceres.

Formally an ordering is an ordered partitioning of the parameter
blocks. Each parameter block belongs to exactly one group, and each
group has a unique integer associated with it, that determines its
order in the set of groups. We call these groups *Elimination Groups*.

Given such an ordering, Ceres ensures that the parameter blocks in the
lowest numbered elimination group are eliminated first, and then the
parameter blocks in the next lowest numbered elimination group and so
on. Within each elimination group, Ceres is free to order the
parameter blocks as it chooses. For example, consider the linear system

.. math::
  x + y &= 3\\
  2x + 3y &= 7

There are two ways in which it can be solved. First eliminating
:math:`x` from the two equations, solving for :math:`y` and then back
substituting for :math:`x`, or first eliminating :math:`y`, solving
for :math:`x` and back substituting for :math:`y`. The user can
construct three orderings here.

1. :math:`\{0: x\}, \{1: y\}` : Eliminate :math:`x` first.
2. :math:`\{0: y\}, \{1: x\}` : Eliminate :math:`y` first.
3. :math:`\{0: x, y\}` : Solver gets to decide the elimination order.

Thus, to have Ceres determine the ordering automatically using
heuristics, put all the variables in the same elimination group. The
identity of the group does not matter. This is the same as not
specifying an ordering at all. To control the ordering for every
variable, create an elimination group per variable, ordering them in
the desired order.

If the user is using one of the Schur solvers (``DENSE_SCHUR``,
``SPARSE_SCHUR``, ``ITERATIVE_SCHUR``) and chooses to specify an
ordering, it must have one important property. The lowest numbered
elimination group must form an independent set in the graph
corresponding to the Hessian, or in other words, no two parameter
blocks in the first elimination group should co-occur in the same
residual block. For the best performance, this elimination group
should be as large as possible. For standard bundle adjustment
problems, this corresponds to the first elimination group containing
all the 3d points, and the second containing all the camera
parameter blocks.
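
A sketch for a hypothetical bundle adjustment problem, where
``points`` and ``cameras`` hold the parameter block pointers already
added to the problem:

.. code-block:: c++

   // `points` and `cameras` are hypothetical std::vector<double*> holding
   // the parameter blocks of the problem.
   ceres::ParameterBlockOrdering* ordering = new ceres::ParameterBlockOrdering;
   for (double* point : points)   ordering->AddElementToGroup(point, 0);   // eliminated first
   for (double* camera : cameras) ordering->AddElementToGroup(camera, 1);

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::SPARSE_SCHUR;
   options.linear_solver_ordering.reset(ordering);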

If the user leaves the choice to Ceres, then the solver uses an
approximate maximum independent set algorithm to identify the first
elimination group [LiSaad]_.

.. _section-solver-options:

:class:`Solver::Options`
========================

.. class:: Solver::Options

   :class:`Solver::Options` controls the overall behavior of the
   solver. We list the various settings and their default values below.

.. function:: bool Solver::Options::IsValid(string* error) const

   Validates the values in the options struct and returns true on
   success. If there is a problem, the method returns false with
   ``error`` containing a textual description of the cause.
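
For example, a typical usage sketch (assuming a :class:`Problem` named
``problem``):

.. code-block:: c++

   // Assumes `problem` is an already constructed ceres::Problem.
   ceres::Solver::Options options;
   options.max_num_iterations = 100;

   std::string error;
   if (!options.IsValid(&error)) {
     std::cerr << "Invalid solver options: " << error << "\n";
   }

   ceres::Solver::Summary summary;
   ceres::Solve(options, &problem, &summary);
   std::cout << summary.BriefReport() << "\n";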

.. member:: MinimizerType Solver::Options::minimizer_type

   Default: ``TRUST_REGION``

   Choose between ``LINE_SEARCH`` and ``TRUST_REGION`` algorithms. See
   :ref:`section-trust-region-methods` and
   :ref:`section-line-search-methods` for more details.

.. member:: LineSearchDirectionType Solver::Options::line_search_direction_type

   Default: ``LBFGS``

   Choices are ``STEEPEST_DESCENT``, ``NONLINEAR_CONJUGATE_GRADIENT``,
   ``BFGS`` and ``LBFGS``.

.. member:: LineSearchType Solver::Options::line_search_type

   Default: ``WOLFE``

   Choices are ``ARMIJO`` and ``WOLFE`` (strong Wolfe conditions).
   Note that in order for the assumptions underlying the ``BFGS`` and
   ``LBFGS`` line search direction algorithms to be guaranteed to be
   satisfied, the ``WOLFE`` line search should be used.

.. member:: NonlinearConjugateGradientType Solver::Options::nonlinear_conjugate_gradient_type

   Default: ``FLETCHER_REEVES``

   Choices are ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and
   ``HESTENES_STIEFEL``.

.. member:: int Solver::Options::max_lbfgs_rank

   Default: 20

   The L-BFGS Hessian approximation is a low rank approximation to the
   inverse of the Hessian matrix. The rank of the approximation
   determines (linearly) the space and time complexity of using the
   approximation. The higher the rank, the better the quality of the
   approximation. The increase in quality is however bounded for a
   number of reasons.

   1. The method only uses secant information and not actual
      derivatives.

   2. The Hessian approximation is constrained to be positive
      definite.

   So increasing this rank to a large number will cost time and space
   complexity without the corresponding increase in solution
   quality. There are no hard and fast rules for choosing the maximum
   rank. The best choice usually requires some problem specific
   experimentation.

.. member:: bool Solver::Options::use_approximate_eigenvalue_bfgs_scaling

   Default: ``false``

   As part of the ``BFGS`` update step / ``LBFGS`` right-multiply
   step, the initial inverse Hessian approximation is taken to be the
   Identity. However, [Oren]_ showed that using instead :math:`I *
   \gamma`, where :math:`\gamma` is a scalar chosen to approximate an
   eigenvalue of the true inverse Hessian, can result in improved
   convergence in a wide variety of cases. Setting
   ``use_approximate_eigenvalue_bfgs_scaling`` to true enables this
   scaling in ``BFGS`` (before first iteration) and ``LBFGS`` (at each
   iteration).

   Precisely, approximate eigenvalue scaling equates to

   .. math:: \gamma = \frac{y_k' s_k}{y_k' y_k}

   With:

   .. math:: y_k = \nabla f_{k+1} - \nabla f_k
   .. math:: s_k = x_{k+1} - x_k

   Where :math:`f()` is the line search objective and :math:`x` the
   vector of parameter values [NocedalWright]_.

   It is important to note that approximate eigenvalue scaling does
   **not** *always* improve convergence, and that it can in fact
   *significantly* degrade performance for certain classes of problem,
   which is why it is disabled by default. In particular it can
   degrade performance when the sensitivity of the problem to different
   parameters varies significantly, as in this case a single scalar
   factor fails to capture this variation and detrimentally downscales
   parts of the Jacobian approximation which correspond to
   low-sensitivity parameters. It can also reduce the robustness of the
   solution to errors in the Jacobians.

.. member:: LineSearchInterpolationType Solver::Options::line_search_interpolation_type

   Default: ``CUBIC``

   Degree of the polynomial used to approximate the objective
   function. Valid values are ``BISECTION``, ``QUADRATIC`` and
   ``CUBIC``.

.. member:: double Solver::Options::min_line_search_step_size

   The line search terminates if:

   .. math:: \|\Delta x_k\|_\infty < \text{min_line_search_step_size}

   where :math:`\|\cdot\|_\infty` refers to the max norm, and
   :math:`\Delta x_k` is the step change in the parameter values at
   the :math:`k`-th iteration.

.. member:: double Solver::Options::line_search_sufficient_function_decrease

   Default: ``1e-4``

   Solving the line search problem exactly is computationally
   prohibitive. Fortunately, line search based optimization algorithms
   can still guarantee convergence if instead of an exact solution,
   the line search algorithm returns a solution which decreases the
   value of the objective function sufficiently. More precisely, we
   are looking for a step size s.t.

   .. math:: f(\text{step_size}) \le f(0) + \text{sufficient_decrease} * [f'(0) * \text{step_size}]

   This condition is known as the Armijo condition.

.. member:: double Solver::Options::max_line_search_step_contraction

   Default: ``1e-3``

   In each iteration of the line search,

   .. math:: \text{new_step_size} >= \text{max_line_search_step_contraction} * \text{step_size}

   Note that by definition, for contraction:

   .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1

.. member:: double Solver::Options::min_line_search_step_contraction

   Default: ``0.6``

   In each iteration of the line search,

   .. math:: \text{new_step_size} <= \text{min_line_search_step_contraction} * \text{step_size}

   Note that by definition, for contraction:

   .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1

.. member:: int Solver::Options::max_num_line_search_step_size_iterations

   Default: ``20``

   Maximum number of trial step size iterations during each line
   search; if a step size satisfying the search conditions cannot be
   found within this number of trials, the line search will stop.

   The minimum allowed value is 0 for the trust region minimizer and 1
   otherwise. If 0 is specified for the trust region minimizer, then
   line search will not be used when solving constrained optimization
   problems.

   As this is an 'artificial' constraint (one imposed by the user, not
   the underlying math), if ``WOLFE`` line search is being used, *and*
   points satisfying the Armijo sufficient (function) decrease
   condition have been found during the current search (in :math:`<=`
   ``max_num_line_search_step_size_iterations``), then the step size
   with the lowest function value which satisfies the Armijo condition
   will be returned as the new valid step, even though it does *not*
   satisfy the strong Wolfe conditions. This behaviour protects
   against early termination of the optimizer at a sub-optimal point.

.. member:: int Solver::Options::max_num_line_search_direction_restarts

   Default: ``5``

   Maximum number of restarts of the line search direction algorithm
   before terminating the optimization. Restarts of the line search
   direction algorithm occur when the current algorithm fails to
   produce a new descent direction. This typically indicates a
   numerical failure, or a breakdown in the validity of the
   approximations used.

.. member:: double Solver::Options::line_search_sufficient_curvature_decrease

   Default: ``0.9``

   The strong Wolfe conditions consist of the Armijo sufficient
   decrease condition, and an additional requirement that the
   step size be chosen s.t. the *magnitude* ('strong' Wolfe
   conditions) of the gradient along the search direction
   decreases sufficiently. Precisely, this second condition
   is that we seek a step size s.t.

   .. math:: \|f'(\text{step_size})\| <= \text{sufficient_curvature_decrease} * \|f'(0)\|

   Where :math:`f()` is the line search objective and :math:`f'()` is the derivative
   of :math:`f` with respect to the step size: :math:`\frac{d f}{d~\text{step size}}`.

.. member:: double Solver::Options::max_line_search_step_expansion

   Default: ``10.0``

   During the bracketing phase of a Wolfe line search, the step size
   is increased until either a point satisfying the Wolfe conditions
   is found, or an upper bound for a bracket containing a point
   satisfying the conditions is found. Precisely, at each iteration
   of the expansion:

   .. math:: \text{new_step_size} <= \text{max_step_expansion} * \text{step_size}

   By definition for expansion

   .. math:: \text{max_step_expansion} > 1.0
1114
1115.. member:: TrustRegionStrategyType Solver::Options::trust_region_strategy_type
1116
1117 Default: ``LEVENBERG_MARQUARDT``
1118
1119 The trust region step computation algorithm used by
1120 Ceres. Currently ``LEVENBERG_MARQUARDT`` and ``DOGLEG`` are the two
1121 valid choices. See :ref:`section-levenberg-marquardt` and
1122 :ref:`section-dogleg` for more details.
1123
1124.. member:: DoglegType Solver::Options::dogleg_type
1125
1126 Default: ``TRADITIONAL_DOGLEG``
1127
1128 Ceres supports two different dogleg strategies.
1129 ``TRADITIONAL_DOGLEG`` method by Powell and the ``SUBSPACE_DOGLEG``
1130 method described by [ByrdSchnabel]_ . See :ref:`section-dogleg`
1131 for more details.
1132
1133.. member:: bool Solver::Options::use_nonmonotonic_steps
1134
1135 Default: ``false``
1136
1137 Relax the requirement that the trust-region algorithm take strictly
1138 decreasing steps. See :ref:`section-non-monotonic-steps` for more
1139 details.
1140
1141.. member:: int Solver::Options::max_consecutive_nonmonotonic_steps
1142
1143 Default: ``5``
1144
1145 The window size used by the step selection algorithm to accept
1146 non-monotonic steps.
1147
1148.. member:: int Solver::Options::max_num_iterations
1149
1150 Default: ``50``
1151
1152 Maximum number of iterations for which the solver should run.
1153
1154.. member:: double Solver::Options::max_solver_time_in_seconds
1155
1156 Default: ``1e6``
1157 Maximum amount of time for which the solver should run.
1158
1159.. member:: int Solver::Options::num_threads
1160
1161 Default: ``1``
1162
1163 Number of threads used by Ceres to evaluate the Jacobian.
1164
1165.. member:: double Solver::Options::initial_trust_region_radius
1166
1167 Default: ``1e4``
1168
1169 The size of the initial trust region. When the
1170 ``LEVENBERG_MARQUARDT`` strategy is used, the reciprocal of this
1171 number is the initial regularization parameter.
1172
1173.. member:: double Solver::Options::max_trust_region_radius
1174
1175 Default: ``1e16``
1176
1177 The trust region radius is not allowed to grow beyond this value.
1178
1179.. member:: double Solver::Options::min_trust_region_radius
1180
1181 Default: ``1e-32``
1182
1183 The solver terminates, when the trust region becomes smaller than
1184 this value.
1185
1186.. member:: double Solver::Options::min_relative_decrease
1187
1188 Default: ``1e-3``
1189
1190 Lower threshold for relative decrease before a trust-region step is
1191 accepted.
1192
1193.. member:: double Solver::Options::min_lm_diagonal
1194
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08001195 Default: ``1e-6``
Austin Schuh70cc9552019-01-21 19:46:48 -08001196
1197 The ``LEVENBERG_MARQUARDT`` strategy, uses a diagonal matrix to
1198 regularize the trust region step. This is the lower bound on
1199 the values of this diagonal matrix.
1200
1201.. member:: double Solver::Options::max_lm_diagonal
1202
1203 Default: ``1e32``
1204
1205 The ``LEVENBERG_MARQUARDT`` strategy, uses a diagonal matrix to
1206 regularize the trust region step. This is the upper bound on
1207 the values of this diagonal matrix.
1208
1209.. member:: int Solver::Options::max_num_consecutive_invalid_steps
1210
1211 Default: ``5``
1212
1213 The step returned by a trust region strategy can sometimes be
1214 numerically invalid, usually because of conditioning
1215 issues. Instead of crashing or stopping the optimization, the
1216 optimizer can go ahead and try solving with a smaller trust
1217 region/better conditioned problem. This parameter sets the number
1218 of consecutive retries before the minimizer gives up.
1219
1220.. member:: double Solver::Options::function_tolerance
1221
1222 Default: ``1e-6``
1223
1224 Solver terminates if
1225
1226 .. math:: \frac{|\Delta \text{cost}|}{\text{cost}} <= \text{function_tolerance}
1227
1228 where, :math:`\Delta \text{cost}` is the change in objective
1229 function value (up or down) in the current iteration of
1230 Levenberg-Marquardt.
1231
1232.. member:: double Solver::Options::gradient_tolerance
1233
1234 Default: ``1e-10``
1235
1236 Solver terminates if
1237
1238 .. math:: \|x - \Pi \boxplus(x, -g(x))\|_\infty <= \text{gradient_tolerance}
1239
1240 where :math:`\|\cdot\|_\infty` refers to the max norm, :math:`\Pi`
1241 is projection onto the bounds constraints and :math:`\boxplus` is
1242 Plus operation for the overall local parameterization associated
1243 with the parameter vector.
1244
1245.. member:: double Solver::Options::parameter_tolerance
1246
1247 Default: ``1e-8``
1248
1249 Solver terminates if
1250
1251 .. math:: \|\Delta x\| <= (\|x\| + \text{parameter_tolerance}) * \text{parameter_tolerance}
1252
1253 where :math:`\Delta x` is the step computed by the linear solver in
1254 the current iteration.
1255
1256.. member:: LinearSolverType Solver::Options::linear_solver_type
1257
1258 Default: ``SPARSE_NORMAL_CHOLESKY`` / ``DENSE_QR``
1259
1260 Type of linear solver used to compute the solution to the linear
1261 least squares problem in each iteration of the Levenberg-Marquardt
1262 algorithm. If Ceres is built with support for ``SuiteSparse`` or
1263 ``CXSparse`` or ``Eigen``'s sparse Cholesky factorization, the
1264 default is ``SPARSE_NORMAL_CHOLESKY``, it is ``DENSE_QR``
1265 otherwise.
1266
1267.. member:: PreconditionerType Solver::Options::preconditioner_type
1268
1269 Default: ``JACOBI``
1270
1271 The preconditioner used by the iterative linear solver. The default
1272 is the block Jacobi preconditioner. Valid values are (in increasing
1273 order of complexity) ``IDENTITY``, ``JACOBI``, ``SCHUR_JACOBI``,
1274 ``CLUSTER_JACOBI`` and ``CLUSTER_TRIDIAGONAL``. See
1275 :ref:`section-preconditioner` for more details.
1276
1277.. member:: VisibilityClusteringType Solver::Options::visibility_clustering_type
1278
1279 Default: ``CANONICAL_VIEWS``
1280
1281 Type of clustering algorithm to use when constructing a visibility
1282 based preconditioner. The original visibility based preconditioning
1283 paper and implementation only used the canonical views algorithm.
1284
   This algorithm gives high quality results, but for large dense
   graphs it can be particularly expensive, as its worst case
   complexity is cubic in the size of the graph.

   Another option is to use ``SINGLE_LINKAGE``, which is a simple
   thresholded single linkage clustering algorithm that only pays
   attention to tightly coupled blocks in the Schur complement. This
   is a fast algorithm that works well.
1293
1294 The optimal choice of the clustering algorithm depends on the
1295 sparsity structure of the problem, but generally speaking we
1296 recommend that you try ``CANONICAL_VIEWS`` first and if it is too
1297 expensive try ``SINGLE_LINKAGE``.
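
   A sketch of configuring visibility based preconditioning; the
   problem is assumed to have the camera/point structure for which
   these preconditioners are designed:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.linear_solver_type = ceres::ITERATIVE_SCHUR;
      options.preconditioner_type = ceres::CLUSTER_JACOBI;
      options.visibility_clustering_type = ceres::CANONICAL_VIEWS;
      // If CANONICAL_VIEWS proves too expensive, try:
      //   options.visibility_clustering_type = ceres::SINGLE_LINKAGE;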
1298
.. member:: std::unordered_set<ResidualBlockId> Solver::Options::residual_blocks_for_subset_preconditioner
1300
1301 ``SUBSET`` preconditioner is a preconditioner for problems with
1302 general sparsity. Given a subset of residual blocks of a problem,
1303 it uses the corresponding subset of the rows of the Jacobian to
1304 construct a preconditioner.
1305
1306 Suppose the Jacobian :math:`J` has been horizontally partitioned as
1307
1308 .. math:: J = \begin{bmatrix} P \\ Q \end{bmatrix}
1309
1310 Where, :math:`Q` is the set of rows corresponding to the residual
1311 blocks in
1312 :member:`Solver::Options::residual_blocks_for_subset_preconditioner`. The
1313 preconditioner is the matrix :math:`(Q^\top Q)^{-1}`.
1314
   The efficacy of the preconditioner depends on how well
   :math:`Q^\top Q` approximates :math:`J^\top J`, or how well the
   chosen residual blocks approximate the full problem.
1318
1319 If ``Solver::Options::preconditioner_type == SUBSET``, then
1320 ``residual_blocks_for_subset_preconditioner`` must be non-empty.
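
   A sketch of how the subset might be populated, here paired with the
   ``CGNR`` solver; ``problem``, ``cost_function`` and the parameter
   block ``x`` are assumed to be defined elsewhere:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.linear_solver_type = ceres::CGNR;
      options.preconditioner_type = ceres::SUBSET;

      // Remember the ids of the residual blocks that should make up Q.
      ceres::ResidualBlockId id =
          problem.AddResidualBlock(cost_function, nullptr, x);
      options.residual_blocks_for_subset_preconditioner.insert(id);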
1321
.. member:: DenseLinearAlgebraLibraryType Solver::Options::dense_linear_algebra_library_type

   Default: ``EIGEN``
1325
1326 Ceres supports using multiple dense linear algebra libraries for
1327 dense matrix factorizations. Currently ``EIGEN`` and ``LAPACK`` are
1328 the valid choices. ``EIGEN`` is always available, ``LAPACK`` refers
1329 to the system ``BLAS + LAPACK`` library which may or may not be
1330 available.
1331
1332 This setting affects the ``DENSE_QR``, ``DENSE_NORMAL_CHOLESKY``
   and ``DENSE_SCHUR`` solvers. For small to moderate sized problems
1334 ``EIGEN`` is a fine choice but for large problems, an optimized
1335 ``LAPACK + BLAS`` implementation can make a substantial difference
1336 in performance.
1337
.. member:: SparseLinearAlgebraLibraryType Solver::Options::sparse_linear_algebra_library_type
1339
1340 Default: The highest available according to: ``SUITE_SPARSE`` >
1341 ``CX_SPARSE`` > ``EIGEN_SPARSE`` > ``NO_SPARSE``
1342
1343 Ceres supports the use of three sparse linear algebra libraries,
1344 ``SuiteSparse``, which is enabled by setting this parameter to
1345 ``SUITE_SPARSE``, ``CXSparse``, which can be selected by setting
1346 this parameter to ``CX_SPARSE`` and ``Eigen`` which is enabled by
1347 setting this parameter to ``EIGEN_SPARSE``. Lastly, ``NO_SPARSE``
1348 means that no sparse linear solver should be used; note that this is
1349 irrespective of whether Ceres was compiled with support for one.
1350
1351 ``SuiteSparse`` is a sophisticated and complex sparse linear
1352 algebra library and should be used in general.
1353
1354 If your needs/platforms prevent you from using ``SuiteSparse``,
1355 consider using ``CXSparse``, which is a much smaller, easier to
1356 build library. As can be expected, its performance on large
1357 problems is not comparable to that of ``SuiteSparse``.
1358
   Last but not least, you can use the sparse linear algebra
1360 routines in ``Eigen``. Currently the performance of this library is
1361 the poorest of the three. But this should change in the near
1362 future.
1363
1364 Another thing to consider here is that the sparse Cholesky
1365 factorization libraries in Eigen are licensed under ``LGPL`` and
1366 building Ceres with support for ``EIGEN_SPARSE`` will result in an
1367 LGPL licensed library (since the corresponding code from Eigen is
1368 compiled into the library).
1369
1370 The upside is that you do not need to build and link to an external
1371 library to use ``EIGEN_SPARSE``.
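
   As a sketch (assuming Ceres was built with the corresponding
   library):

   .. code-block:: c++

      ceres::Solver::Options options;
      options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
      options.sparse_linear_algebra_library_type = ceres::SUITE_SPARSE;
      // or ceres::CX_SPARSE / ceres::EIGEN_SPARSE, depending on the build.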
1372
1373
1374.. member:: shared_ptr<ParameterBlockOrdering> Solver::Options::linear_solver_ordering
1375
1376 Default: ``NULL``
1377
   An instance of the ordering object informs the solver about the
   desired order in which parameter blocks should be eliminated by the
   linear solvers.
1381
1382 If ``NULL``, the solver is free to choose an ordering that it
1383 thinks is best.
1384
1385 See :ref:`section-ordering` for more details.
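
   A minimal sketch of supplying an ordering for a bundle adjustment
   style problem; ``points`` and ``cameras`` are assumed containers of
   parameter block pointers:

   .. code-block:: c++

      ceres::Solver::Options options;
      ceres::ParameterBlockOrdering* ordering =
          new ceres::ParameterBlockOrdering;
      for (double* point : points) {
        ordering->AddElementToGroup(point, 0);   // eliminated first
      }
      for (double* camera : cameras) {
        ordering->AddElementToGroup(camera, 1);  // eliminated second
      }
      options.linear_solver_ordering.reset(ordering);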
1386
1387.. member:: bool Solver::Options::use_explicit_schur_complement
1388
1389 Default: ``false``
1390
1391 Use an explicitly computed Schur complement matrix with
1392 ``ITERATIVE_SCHUR``.
1393
1394 By default this option is disabled and ``ITERATIVE_SCHUR``
   evaluates matrix-vector products between the Schur
1396 complement and a vector implicitly by exploiting the algebraic
1397 expression for the Schur complement.
1398
1399 The cost of this evaluation scales with the number of non-zeros in
1400 the Jacobian.
1401
1402 For small to medium sized problems there is a sweet spot where
1403 computing the Schur complement is cheap enough that it is much more
1404 efficient to explicitly compute it and use it for evaluating the
1405 matrix-vector products.
1406
1407 Enabling this option tells ``ITERATIVE_SCHUR`` to use an explicitly
1408 computed Schur complement. This can improve the performance of the
1409 ``ITERATIVE_SCHUR`` solver significantly.
1410
   .. NOTE::
1412
1413 This option can only be used with the ``SCHUR_JACOBI``
1414 preconditioner.
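
   A sketch of enabling this option:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.linear_solver_type = ceres::ITERATIVE_SCHUR;
      options.preconditioner_type = ceres::SCHUR_JACOBI;  // required
      options.use_explicit_schur_complement = true;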
1415
.. member:: bool Solver::Options::use_postordering
1417
1418 Default: ``false``
1419
1420 Sparse Cholesky factorization algorithms use a fill-reducing
1421 ordering to permute the columns of the Jacobian matrix. There are
1422 two ways of doing this.
1423
1424 1. Compute the Jacobian matrix in some order and then have the
1425 factorization algorithm permute the columns of the Jacobian.
1426
1427 2. Compute the Jacobian with its columns already permuted.
1428
   The first option incurs a significant memory penalty, since the
   factorization algorithm has to make a copy of the permuted Jacobian
   matrix. Thus Ceres pre-permutes the columns of the Jacobian matrix
   and, generally speaking, there is no performance penalty for doing
   so.
1434
1435 In some rare cases, it is worth using a more complicated reordering
1436 algorithm which has slightly better runtime performance at the
1437 expense of an extra copy of the Jacobian matrix. Setting
1438 ``use_postordering`` to ``true`` enables this tradeoff.
1439
1440.. member:: bool Solver::Options::dynamic_sparsity
1441
   Default: ``false``

   Some non-linear least squares problems are symbolically dense but
   numerically sparse, i.e., at any given state only a small number of
   Jacobian entries are non-zero, but the position and number of
   non-zeros is different depending on the state. For these problems
   it can be useful to factorize the sparse Jacobian at each solver
   iteration instead of including all of the zero entries in a single
   general factorization.
1449
1450 If your problem does not have this property (or you do not know),
1451 then it is probably best to keep this false, otherwise it will
1452 likely lead to worse performance.
1453
1454 This setting only affects the `SPARSE_NORMAL_CHOLESKY` solver.
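
   A sketch of enabling dynamic sparsity:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
      options.dynamic_sparsity = true;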
1455
1456.. member:: int Solver::Options::min_linear_solver_iterations
1457
1458 Default: ``0``
1459
1460 Minimum number of iterations used by the linear solver. This only
1461 makes sense when the linear solver is an iterative solver, e.g.,
1462 ``ITERATIVE_SCHUR`` or ``CGNR``.
1463
1464.. member:: int Solver::Options::max_linear_solver_iterations
1465
1466 Default: ``500``
1467
   Maximum number of iterations used by the linear solver. This only
1469 makes sense when the linear solver is an iterative solver, e.g.,
1470 ``ITERATIVE_SCHUR`` or ``CGNR``.
1471
1472.. member:: double Solver::Options::eta
1473
1474 Default: ``1e-1``
1475
1476 Forcing sequence parameter. The truncated Newton solver uses this
1477 number to control the relative accuracy with which the Newton step
1478 is computed. This constant is passed to
1479 ``ConjugateGradientsSolver`` which uses it to terminate the
1480 iterations when
1481
1482 .. math:: \frac{Q_i - Q_{i-1}}{Q_i} < \frac{\eta}{i}
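
   A sketch of controlling the inexactness of the step computation for
   an iterative linear solver:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.linear_solver_type = ceres::ITERATIVE_SCHUR;
      options.min_linear_solver_iterations = 5;
      options.max_linear_solver_iterations = 100;
      options.eta = 1e-2;  // tighter forcing sequence than the default 1e-1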
1483
1484.. member:: bool Solver::Options::jacobi_scaling
1485
1486 Default: ``true``
1487
1488 ``true`` means that the Jacobian is scaled by the norm of its
1489 columns before being passed to the linear solver. This improves the
1490 numerical conditioning of the normal equations.
1491
1492.. member:: bool Solver::Options::use_inner_iterations
1493
1494 Default: ``false``
1495
1496 Use a non-linear version of a simplified variable projection
1497 algorithm. Essentially this amounts to doing a further optimization
1498 on each Newton/Trust region step using a coordinate descent
1499 algorithm. For more details, see :ref:`section-inner-iterations`.
1500
   **Note** Inner iterations cannot be used with :class:`Problem`
1502 objects that have an :class:`EvaluationCallback` associated with
1503 them.
1504
.. member:: double Solver::Options::inner_iteration_tolerance
1506
1507 Default: ``1e-3``
1508
1509 Generally speaking, inner iterations make significant progress in
1510 the early stages of the solve and then their contribution drops
1511 down sharply, at which point the time spent doing inner iterations
1512 is not worth it.
1513
1514 Once the relative decrease in the objective function due to inner
1515 iterations drops below ``inner_iteration_tolerance``, the use of
1516 inner iterations in subsequent trust region minimizer iterations is
1517 disabled.
1518
1519.. member:: shared_ptr<ParameterBlockOrdering> Solver::Options::inner_iteration_ordering
1520
1521 Default: ``NULL``
1522
   If :member:`Solver::Options::use_inner_iterations` is true, then the
1524 user has two choices.
1525
1526 1. Let the solver heuristically decide which parameter blocks to
1527 optimize in each inner iteration. To do this, set
1528 :member:`Solver::Options::inner_iteration_ordering` to ``NULL``.
1529
   2. Specify a collection of ordered independent sets. The lower
      numbered groups are optimized before the higher numbered groups
1532 during the inner optimization phase. Each group must be an
1533 independent set. Not all parameter blocks need to be included in
1534 the ordering.
1535
1536 See :ref:`section-ordering` for more details.
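
   A sketch of enabling inner iterations with an explicit ordering;
   ``points`` and ``cameras`` are assumed containers of parameter
   block pointers, and each group must be an independent set:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.use_inner_iterations = true;
      options.inner_iteration_tolerance = 1e-3;

      ceres::ParameterBlockOrdering* ordering =
          new ceres::ParameterBlockOrdering;
      for (double* point : points) {
        ordering->AddElementToGroup(point, 0);
      }
      for (double* camera : cameras) {
        ordering->AddElementToGroup(camera, 1);
      }
      options.inner_iteration_ordering.reset(ordering);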
1537
1538.. member:: LoggingType Solver::Options::logging_type
1539
1540 Default: ``PER_MINIMIZER_ITERATION``
1541
1542.. member:: bool Solver::Options::minimizer_progress_to_stdout
1543
1544 Default: ``false``
1545
1546 By default the :class:`Minimizer` progress is logged to ``STDERR``
1547 depending on the ``vlog`` level. If this flag is set to true, and
1548 :member:`Solver::Options::logging_type` is not ``SILENT``, the logging
1549 output is sent to ``STDOUT``.
1550
1551 For ``TRUST_REGION_MINIMIZER`` the progress display looks like
1552
1553 .. code-block:: bash
1554
1555 iter cost cost_change |gradient| |step| tr_ratio tr_radius ls_iter iter_time total_time
1556 0 4.185660e+06 0.00e+00 1.09e+08 0.00e+00 0.00e+00 1.00e+04 0 7.59e-02 3.37e-01
1557 1 1.062590e+05 4.08e+06 8.99e+06 5.36e+02 9.82e-01 3.00e+04 1 1.65e-01 5.03e-01
1558 2 4.992817e+04 5.63e+04 8.32e+06 3.19e+02 6.52e-01 3.09e+04 1 1.45e-01 6.48e-01
1559
1560 Here
1561
1562 #. ``cost`` is the value of the objective function.
1563 #. ``cost_change`` is the change in the value of the objective
1564 function if the step computed in this iteration is accepted.
1565 #. ``|gradient|`` is the max norm of the gradient.
1566 #. ``|step|`` is the change in the parameter vector.
1567 #. ``tr_ratio`` is the ratio of the actual change in the objective
1568 function value to the change in the value of the trust
1569 region model.
1570 #. ``tr_radius`` is the size of the trust region radius.
1571 #. ``ls_iter`` is the number of linear solver iterations used to
1572 compute the trust region step. For direct/factorization based
1573 solvers it is always 1, for iterative solvers like
1574 ``ITERATIVE_SCHUR`` it is the number of iterations of the
1575 Conjugate Gradients algorithm.
   #. ``iter_time`` is the time taken by the current iteration.
1577 #. ``total_time`` is the total time taken by the minimizer.
1578
1579 For ``LINE_SEARCH_MINIMIZER`` the progress display looks like
1580
1581 .. code-block:: bash
1582
1583 0: f: 2.317806e+05 d: 0.00e+00 g: 3.19e-01 h: 0.00e+00 s: 0.00e+00 e: 0 it: 2.98e-02 tt: 8.50e-02
1584 1: f: 2.312019e+05 d: 5.79e+02 g: 3.18e-01 h: 2.41e+01 s: 1.00e+00 e: 1 it: 4.54e-02 tt: 1.31e-01
1585 2: f: 2.300462e+05 d: 1.16e+03 g: 3.17e-01 h: 4.90e+01 s: 2.54e-03 e: 1 it: 4.96e-02 tt: 1.81e-01
1586
1587 Here
1588
1589 #. ``f`` is the value of the objective function.
1590 #. ``d`` is the change in the value of the objective function if
1591 the step computed in this iteration is accepted.
1592 #. ``g`` is the max norm of the gradient.
1593 #. ``h`` is the change in the parameter vector.
1594 #. ``s`` is the optimal step length computed by the line search.
   #. ``it`` is the time taken by the current iteration.
1596 #. ``tt`` is the total time taken by the minimizer.
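
   A sketch of routing the per-iteration progress to ``STDOUT``:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.logging_type = ceres::PER_MINIMIZER_ITERATION;
      options.minimizer_progress_to_stdout = true;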
1597
1598.. member:: vector<int> Solver::Options::trust_region_minimizer_iterations_to_dump
1599
1600 Default: ``empty``
1601
1602 List of iterations at which the trust region minimizer should dump
1603 the trust region problem. Useful for testing and benchmarking. If
1604 ``empty``, no problems are dumped.
1605
1606.. member:: string Solver::Options::trust_region_problem_dump_directory
1607
1608 Default: ``/tmp``
1609
   Directory to which the problems should be written. Should be
1611 non-empty if
1612 :member:`Solver::Options::trust_region_minimizer_iterations_to_dump` is
1613 non-empty and
1614 :member:`Solver::Options::trust_region_problem_dump_format_type` is not
1615 ``CONSOLE``.
1616
1617.. member:: DumpFormatType Solver::Options::trust_region_problem_dump_format
1618
1619 Default: ``TEXTFILE``
1620
1621 The format in which trust region problems should be logged when
1622 :member:`Solver::Options::trust_region_minimizer_iterations_to_dump`
1623 is non-empty. There are three options:
1624
1625 * ``CONSOLE`` prints the linear least squares problem in a human
1626 readable format to ``stderr``. The Jacobian is printed as a
1627 dense matrix. The vectors :math:`D`, :math:`x` and :math:`f` are
1628 printed as dense vectors. This should only be used for small
1629 problems.
1630
1631 * ``TEXTFILE`` Write out the linear least squares problem to the
1632 directory pointed to by
1633 :member:`Solver::Options::trust_region_problem_dump_directory` as
1634 text files which can be read into ``MATLAB/Octave``. The Jacobian
1635 is dumped as a text file containing :math:`(i,j,s)` triplets, the
1636 vectors :math:`D`, `x` and `f` are dumped as text files
1637 containing a list of their values.
1638
1639 A ``MATLAB/Octave`` script called
1640 ``ceres_solver_iteration_???.m`` is also output, which can be
1641 used to parse and load the problem into memory.
1642
1643.. member:: bool Solver::Options::check_gradients
1644
1645 Default: ``false``
1646
1647 Check all Jacobians computed by each residual block with finite
1648 differences. This is expensive since it involves computing the
1649 derivative by normal means (e.g. user specified, autodiff, etc),
1650 then also computing it using finite differences. The results are
1651 compared, and if they differ substantially, the optimization fails
1652 and the details are stored in the solver summary.
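
   A sketch of enabling the gradient checker (typically only on a
   small test problem, given the cost):

   .. code-block:: c++

      ceres::Solver::Options options;
      options.check_gradients = true;
      // Optionally loosen the comparison threshold (default 1e-8).
      options.gradient_check_relative_precision = 1e-6;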
1653
1654.. member:: double Solver::Options::gradient_check_relative_precision
1655
   Default: ``1e-8``

   Precision to check for in the gradient checker. If the relative
   difference between an element in a user supplied Jacobian and its
   finite difference estimate exceeds this number, then the Jacobian
   for that cost term is dumped.
1661
1662.. member:: double Solver::Options::gradient_check_numeric_derivative_relative_step_size
1663
1664 Default: ``1e-6``
1665
   .. NOTE::

      This option only applies to the numeric differentiation used for
      checking the user provided derivatives when
      `Solver::Options::check_gradients` is true. If you are using
      :class:`NumericDiffCostFunction` and are interested in changing
      the step size for numeric differentiation in your cost function,
      please have a look at :class:`NumericDiffOptions`.
1674
1675 Relative shift used for taking numeric derivatives when
1676 `Solver::Options::check_gradients` is `true`.
1677
1678 For finite differencing, each dimension is evaluated at slightly
1679 shifted values, e.g., for forward differences, the numerical
1680 derivative is
1681
1682 .. math::
1683
1684 \delta &= gradient\_check\_numeric\_derivative\_relative\_step\_size\\
1685 \Delta f &= \frac{f((1 + \delta) x) - f(x)}{\delta x}
1686
1687 The finite differencing is done along each dimension. The reason to
1688 use a relative (rather than absolute) step size is that this way,
1689 numeric differentiation works for functions where the arguments are
1690 typically large (e.g. :math:`10^9`) and when the values are small
1691 (e.g. :math:`10^{-5}`). It is possible to construct *torture cases*
1692 which break this finite difference heuristic, but they do not come
1693 up often in practice.
1694
.. member:: bool Solver::Options::update_state_every_iteration

   Default: ``false``

   If ``update_state_every_iteration`` is ``true``, then Ceres Solver
   will guarantee that at the end of every iteration and before any
   user :class:`IterationCallback` is called, the parameter blocks are
   updated to the current best solution found by the solver. Thus the
   IterationCallback can inspect the values of the parameter blocks
   for purposes of computation, visualization or termination.

   If ``update_state_every_iteration`` is ``false`` then there is no
   such guarantee, and user provided :class:`IterationCallback` objects
   should not expect to look at the parameter blocks and interpret
   their values.

.. member:: vector<IterationCallback*> Solver::Options::callbacks

   Callbacks that are executed at the end of each iteration of the
   :class:`Minimizer`. They are executed in the order that they are
   specified in this vector.

   By default, parameter blocks are updated only at the end of the
   optimization, i.e., when the :class:`Minimizer` terminates. This
   means that by default, if an :class:`IterationCallback` inspects
   the parameter blocks, it will not see them changing in the course
   of the optimization.

   To tell Ceres to update the parameter blocks at the end of each
   iteration and before calling the user's callback, set
   :member:`Solver::Options::update_state_every_iteration` to
   ``true``.

   The solver does NOT take ownership of these pointers.
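
   As a sketch, a callback that prints a single (assumed) parameter
   block ``x`` after every iteration:

   .. code-block:: c++

      #include <iostream>
      #include "ceres/ceres.h"

      class PrintCallback : public ceres::IterationCallback {
       public:
        explicit PrintCallback(const double* x) : x_(x) {}
        ceres::CallbackReturnType operator()(
            const ceres::IterationSummary& summary) override {
          std::cout << "iteration " << summary.iteration
                    << " x = " << *x_ << std::endl;
          return ceres::SOLVER_CONTINUE;
        }
       private:
        const double* x_;
      };

      // Registration; `x` is the (assumed) parameter block being
      // optimized. The solver does not take ownership of the pointer.
      ceres::Solver::Options options;
      PrintCallback callback(&x);
      options.callbacks.push_back(&callback);
      // Needed for the printed values to reflect the current iterate.
      options.update_state_every_iteration = true;
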
1730:class:`ParameterBlockOrdering`
1731===============================
1732
1733.. class:: ParameterBlockOrdering
1734
1735 ``ParameterBlockOrdering`` is a class for storing and manipulating
1736 an ordered collection of groups/sets with the following semantics:
1737
1738 Group IDs are non-negative integer values. Elements are any type
1739 that can serve as a key in a map or an element of a set.
1740
1741 An element can only belong to one group at a time. A group may
1742 contain an arbitrary number of elements.
1743
1744 Groups are ordered by their group id.
1745
1746.. function:: bool ParameterBlockOrdering::AddElementToGroup(const double* element, const int group)
1747
1748 Add an element to a group. If a group with this id does not exist,
1749 one is created. This method can be called any number of times for
1750 the same element. Group ids should be non-negative numbers. Return
1751 value indicates if adding the element was a success.
1752
1753.. function:: void ParameterBlockOrdering::Clear()
1754
1755 Clear the ordering.
1756
1757.. function:: bool ParameterBlockOrdering::Remove(const double* element)
1758
1759 Remove the element, no matter what group it is in. If the element
1760 is not a member of any group, calling this method will result in a
1761 crash. Return value indicates if the element was actually removed.
1762
1763.. function:: void ParameterBlockOrdering::Reverse()
1764
1765 Reverse the order of the groups in place.
1766
1767.. function:: int ParameterBlockOrdering::GroupId(const double* element) const
1768
1769 Return the group id for the element. If the element is not a member
1770 of any group, return -1.
1771
1772.. function:: bool ParameterBlockOrdering::IsMember(const double* element) const
1773
1774 True if there is a group containing the parameter block.
1775
1776.. function:: int ParameterBlockOrdering::GroupSize(const int group) const
1777
   Return the number of elements in the group with id ``group``. This
   function always succeeds, i.e., implicitly there exists a group
   (possibly empty) for every integer.
1780
1781.. function:: int ParameterBlockOrdering::NumElements() const
1782
1783 Number of elements in the ordering.
1784
1785.. function:: int ParameterBlockOrdering::NumGroups() const
1786
1787 Number of groups with one or more elements.
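
As a small sketch of the API (the variable names are illustrative):

.. code-block:: c++

   double x[3];
   double y[2];

   ceres::ParameterBlockOrdering ordering;
   ordering.AddElementToGroup(x, 0);
   ordering.AddElementToGroup(y, 1);
   // ordering.NumGroups() == 2, ordering.NumElements() == 2,
   // ordering.GroupId(x) == 0 and ordering.IsMember(y) is true.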
1788
:class:`IterationCallback`
1790==========================
1791
1792.. class:: IterationSummary
1793
1794 :class:`IterationSummary` describes the state of the minimizer at
1795 the end of each iteration.
1796
.. member:: int IterationSummary::iteration

1799 Current iteration number.
1800
1801.. member:: bool IterationSummary::step_is_valid
1802
1803 Step was numerically valid, i.e., all values are finite and the
1804 step reduces the value of the linearized model.
1805
1806 **Note**: :member:`IterationSummary::step_is_valid` is `false`
1807 when :member:`IterationSummary::iteration` = 0.
1808
1809.. member:: bool IterationSummary::step_is_nonmonotonic
1810
1811 Step did not reduce the value of the objective function
1812 sufficiently, but it was accepted because of the relaxed
1813 acceptance criterion used by the non-monotonic trust region
1814 algorithm.
1815
   **Note**: :member:`IterationSummary::step_is_nonmonotonic` is
   `false` when :member:`IterationSummary::iteration` = 0.
1818
1819.. member:: bool IterationSummary::step_is_successful
1820
   Whether or not the minimizer accepted this step.
1822
1823 If the ordinary trust region algorithm is used, this means that the
1824 relative reduction in the objective function value was greater than
1825 :member:`Solver::Options::min_relative_decrease`. However, if the
1826 non-monotonic trust region algorithm is used
1827 (:member:`Solver::Options::use_nonmonotonic_steps` = `true`), then
1828 even if the relative decrease is not sufficient, the algorithm may
1829 accept the step and the step is declared successful.
1830
   **Note**: :member:`IterationSummary::step_is_successful` is `false`
   when :member:`IterationSummary::iteration` = 0.
1833
1834.. member:: double IterationSummary::cost
1835
1836 Value of the objective function.
1837
1838.. member:: double IterationSummary::cost_change
1839
1840 Change in the value of the objective function in this
1841 iteration. This can be positive or negative.
1842
1843.. member:: double IterationSummary::gradient_max_norm
1844
1845 Infinity norm of the gradient vector.
1846
1847.. member:: double IterationSummary::gradient_norm
1848
1849 2-norm of the gradient vector.
1850
1851.. member:: double IterationSummary::step_norm
1852
1853 2-norm of the size of the step computed in this iteration.
1854
1855.. member:: double IterationSummary::relative_decrease
1856
1857 For trust region algorithms, the ratio of the actual change in cost
1858 and the change in the cost of the linearized approximation.
1859
   This field is not used when a line search minimizer is used.
1861
1862.. member:: double IterationSummary::trust_region_radius
1863
   Size of the trust region at the end of the current iteration. For
   the Levenberg-Marquardt algorithm, the regularization parameter is
   1.0 / :member:`IterationSummary::trust_region_radius`.
1867
1868.. member:: double IterationSummary::eta
1869
1870 For the inexact step Levenberg-Marquardt algorithm, this is the
1871 relative accuracy with which the step is solved. This number is
1872 only applicable to the iterative solvers capable of solving linear
1873 systems inexactly. Factorization-based exact solvers always have an
1874 eta of 0.0.
1875
1876.. member:: double IterationSummary::step_size
1877
   Step size computed by the line search algorithm.
1879
1880 This field is not used when a trust region minimizer is used.
1881
1882.. member:: int IterationSummary::line_search_function_evaluations
1883
1884 Number of function evaluations used by the line search algorithm.
1885
1886 This field is not used when a trust region minimizer is used.
1887
1888.. member:: int IterationSummary::linear_solver_iterations
1889
1890 Number of iterations taken by the linear solver to solve for the
1891 trust region step.
1892
1893 Currently this field is not used when a line search minimizer is
1894 used.
1895
1896.. member:: double IterationSummary::iteration_time_in_seconds
1897
1898 Time (in seconds) spent inside the minimizer loop in the current
1899 iteration.
1900
1901.. member:: double IterationSummary::step_solver_time_in_seconds
1902
1903 Time (in seconds) spent inside the trust region step solver.
1904
1905.. member:: double IterationSummary::cumulative_time_in_seconds
1906
1907 Time (in seconds) since the user called Solve().
1908
1909
1910.. class:: IterationCallback
1911
1912 Interface for specifying callbacks that are executed at the end of
1913 each iteration of the minimizer.
1914
1915 .. code-block:: c++
1916
1917 class IterationCallback {
1918 public:
1919 virtual ~IterationCallback() {}
1920 virtual CallbackReturnType operator()(const IterationSummary& summary) = 0;
1921 };
1922
1923
1924 The solver uses the return value of ``operator()`` to decide whether
1925 to continue solving or to terminate. The user can return three
1926 values.
1927
1928 #. ``SOLVER_ABORT`` indicates that the callback detected an abnormal
1929 situation. The solver returns without updating the parameter
1930 blocks (unless ``Solver::Options::update_state_every_iteration`` is
1931 set true). Solver returns with ``Solver::Summary::termination_type``
1932 set to ``USER_FAILURE``.
1933
1934 #. ``SOLVER_TERMINATE_SUCCESSFULLY`` indicates that there is no need
1935 to optimize anymore (some user specified termination criterion
1936 has been met). Solver returns with
      ``Solver::Summary::termination_type`` set to ``USER_SUCCESS``.
1938
1939 #. ``SOLVER_CONTINUE`` indicates that the solver should continue
1940 optimizing.
1941
1942 For example, the following :class:`IterationCallback` is used
1943 internally by Ceres to log the progress of the optimization.
1944
1945 .. code-block:: c++
1946
1947 class LoggingCallback : public IterationCallback {
1948 public:
1949 explicit LoggingCallback(bool log_to_stdout)
1950 : log_to_stdout_(log_to_stdout) {}
1951
1952 ~LoggingCallback() {}
1953
1954 CallbackReturnType operator()(const IterationSummary& summary) {
1955 const char* kReportRowFormat =
1956 "% 4d: f:% 8e d:% 3.2e g:% 3.2e h:% 3.2e "
1957 "rho:% 3.2e mu:% 3.2e eta:% 3.2e li:% 3d";
1958 string output = StringPrintf(kReportRowFormat,
1959 summary.iteration,
1960 summary.cost,
1961 summary.cost_change,
1962 summary.gradient_max_norm,
1963 summary.step_norm,
1964 summary.relative_decrease,
1965 summary.trust_region_radius,
1966 summary.eta,
1967 summary.linear_solver_iterations);
1968 if (log_to_stdout_) {
1969 cout << output << endl;
1970 } else {
1971 VLOG(1) << output;
1972 }
1973 return SOLVER_CONTINUE;
1974 }
1975
1976 private:
1977 const bool log_to_stdout_;
1978 };
1979
1980
1981
1982:class:`CRSMatrix`
1983==================
1984
1985.. class:: CRSMatrix
1986
1987 A compressed row sparse matrix used primarily for communicating the
1988 Jacobian matrix to the user.
1989
1990.. member:: int CRSMatrix::num_rows
1991
1992 Number of rows.
1993
1994.. member:: int CRSMatrix::num_cols
1995
1996 Number of columns.
1997
1998.. member:: vector<int> CRSMatrix::rows
1999
2000 :member:`CRSMatrix::rows` is a :member:`CRSMatrix::num_rows` + 1
2001 sized array that points into the :member:`CRSMatrix::cols` and
2002 :member:`CRSMatrix::values` array.
2003
2004.. member:: vector<int> CRSMatrix::cols
2005
2006 :member:`CRSMatrix::cols` contain as many entries as there are
2007 non-zeros in the matrix.
2008
2009 For each row ``i``, ``cols[rows[i]]`` ... ``cols[rows[i + 1] - 1]``
2010 are the indices of the non-zero columns of row ``i``.
2011
2012.. member:: vector<int> CRSMatrix::values
2013
2014 :member:`CRSMatrix::values` contain as many entries as there are
2015 non-zeros in the matrix.
2016
2017 For each row ``i``,
2018 ``values[rows[i]]`` ... ``values[rows[i + 1] - 1]`` are the values
2019 of the non-zero columns of row ``i``.
2020
2021e.g., consider the 3x4 sparse matrix
2022
2023.. code-block:: c++
2024
2025 0 10 0 4
2026 0 2 -3 2
2027 1 2 0 0
2028
2029The three arrays will be:
2030
2031.. code-block:: c++
2032
2033 -row0- ---row1--- -row2-
2034 rows = [ 0, 2, 5, 7]
2035 cols = [ 1, 3, 1, 2, 3, 0, 1]
2036 values = [10, 4, 2, -3, 2, 1, 2]
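
A sketch of iterating over the non-zero entries of a :class:`CRSMatrix`
named ``jacobian`` (e.g. one obtained from ``Problem::Evaluate``):

.. code-block:: c++

   for (int row = 0; row < jacobian.num_rows; ++row) {
     for (int k = jacobian.rows[row]; k < jacobian.rows[row + 1]; ++k) {
       const int col = jacobian.cols[k];
       const double value = jacobian.values[k];
       // (row, col, value) is one non-zero entry of the matrix.
     }
   }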
2037
2038
2039:class:`Solver::Summary`
2040========================
2041
2042.. class:: Solver::Summary
2043
2044 Summary of the various stages of the solver after termination.
2045
2046.. function:: string Solver::Summary::BriefReport() const
2047
2048 A brief one line description of the state of the solver after
2049 termination.
2050
2051.. function:: string Solver::Summary::FullReport() const
2052
2053 A full multiline description of the state of the solver after
2054 termination.
2055
2056.. function:: bool Solver::Summary::IsSolutionUsable() const
2057
   Whether the solution returned by the optimization algorithm can be
   relied on to be numerically sane. This will be the case if
   `Solver::Summary::termination_type` is set to `CONVERGENCE`,
   `USER_SUCCESS` or `NO_CONVERGENCE`, i.e., either the solver
   converged by meeting one of the convergence tolerances, the user
   indicated that it had converged, or it ran to the maximum number of
   iterations or time.
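
   As a sketch of typical usage (``options`` and ``problem`` are
   assumed to be set up already):

   .. code-block:: c++

      ceres::Solver::Summary summary;
      ceres::Solve(options, &problem, &summary);
      std::cout << summary.BriefReport() << "\n";
      if (!summary.IsSolutionUsable()) {
        std::cout << summary.FullReport() << "\n";
      }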
2065
2066.. member:: MinimizerType Solver::Summary::minimizer_type
2067
2068 Type of minimization algorithm used.
2069
2070.. member:: TerminationType Solver::Summary::termination_type
2071
2072 The cause of the minimizer terminating.
2073
2074.. member:: string Solver::Summary::message
2075
2076 Reason why the solver terminated.
2077
2078.. member:: double Solver::Summary::initial_cost
2079
2080 Cost of the problem (value of the objective function) before the
2081 optimization.
2082
2083.. member:: double Solver::Summary::final_cost
2084
2085 Cost of the problem (value of the objective function) after the
2086 optimization.
2087
2088.. member:: double Solver::Summary::fixed_cost
2089
2090 The part of the total cost that comes from residual blocks that
2091 were held fixed by the preprocessor because all the parameter
2092 blocks that they depend on were fixed.
2093
2094.. member:: vector<IterationSummary> Solver::Summary::iterations
2095
2096 :class:`IterationSummary` for each minimizer iteration in order.
2097
2098.. member:: int Solver::Summary::num_successful_steps
2099
2100 Number of minimizer iterations in which the step was
   accepted. Unless :member:`Solver::Options::use_nonmonotonic_steps`
2102 is `true` this is also the number of steps in which the objective
2103 function value/cost went down.
2104
2105.. member:: int Solver::Summary::num_unsuccessful_steps
2106
2107 Number of minimizer iterations in which the step was rejected
2108 either because it did not reduce the cost enough or the step was
2109 not numerically valid.
2110
2111.. member:: int Solver::Summary::num_inner_iteration_steps
2112
2113 Number of times inner iterations were performed.
2114
.. member:: int Solver::Summary::num_line_search_steps
2116
2117 Total number of iterations inside the line search algorithm across
2118 all invocations. We call these iterations "steps" to distinguish
2119 them from the outer iterations of the line search and trust region
2120 minimizer algorithms which call the line search algorithm as a
2121 subroutine.
2122
2123.. member:: double Solver::Summary::preprocessor_time_in_seconds
2124
2125 Time (in seconds) spent in the preprocessor.
2126
2127.. member:: double Solver::Summary::minimizer_time_in_seconds
2128
2129 Time (in seconds) spent in the Minimizer.
2130
2131.. member:: double Solver::Summary::postprocessor_time_in_seconds
2132
2133 Time (in seconds) spent in the post processor.
2134
2135.. member:: double Solver::Summary::total_time_in_seconds
2136
2137 Time (in seconds) spent in the solver.
2138
2139.. member:: double Solver::Summary::linear_solver_time_in_seconds
2140
2141 Time (in seconds) spent in the linear solver computing the trust
2142 region step.
2143
2144.. member:: int Solver::Summary::num_linear_solves
2145
2146 Number of times the Newton step was computed by solving a linear
2147 system. This does not include linear solves used by inner
2148 iterations.
2149
2150.. member:: double Solver::Summary::residual_evaluation_time_in_seconds
2151
2152 Time (in seconds) spent evaluating the residual vector.
2153
2154.. member:: int Solver::Summary::num_residual_evaluations
2155
2156 Number of times only the residuals were evaluated.
2157
2158.. member:: double Solver::Summary::jacobian_evaluation_time_in_seconds
2159
2160 Time (in seconds) spent evaluating the Jacobian matrix.
2161
2162.. member:: int Solver::Summary::num_jacobian_evaluations
2163
2164 Number of times only the Jacobian and the residuals were evaluated.
2165
2166.. member:: double Solver::Summary::inner_iteration_time_in_seconds
2167
2168 Time (in seconds) spent doing inner iterations.
2169
2170.. member:: int Solver::Summary::num_parameter_blocks
2171
2172 Number of parameter blocks in the problem.
2173
2174.. member:: int Solver::Summary::num_parameters
2175
2176 Number of parameters in the problem.
2177
2178.. member:: int Solver::Summary::num_effective_parameters
2179
2180 Dimension of the tangent space of the problem (or the number of
2181 columns in the Jacobian for the problem). This is different from
2182 :member:`Solver::Summary::num_parameters` if a parameter block is
2183 associated with a :class:`LocalParameterization`.
2184
2185.. member:: int Solver::Summary::num_residual_blocks
2186
2187 Number of residual blocks in the problem.
2188
2189.. member:: int Solver::Summary::num_residuals
2190
2191 Number of residuals in the problem.
2192
2193.. member:: int Solver::Summary::num_parameter_blocks_reduced
2194
2195 Number of parameter blocks in the problem after the inactive and
2196 constant parameter blocks have been removed. A parameter block is
2197 inactive if no residual block refers to it.
2198
2199.. member:: int Solver::Summary::num_parameters_reduced
2200
2201 Number of parameters in the reduced problem.
2202
2203.. member:: int Solver::Summary::num_effective_parameters_reduced
2204
2205 Dimension of the tangent space of the reduced problem (or the
2206 number of columns in the Jacobian for the reduced problem). This is
2207 different from :member:`Solver::Summary::num_parameters_reduced` if
2208 a parameter block in the reduced problem is associated with a
2209 :class:`LocalParameterization`.
2210
2211.. member:: int Solver::Summary::num_residual_blocks_reduced
2212
2213 Number of residual blocks in the reduced problem.
2214
2215.. member:: int Solver::Summary::num_residuals_reduced
2216
2217 Number of residuals in the reduced problem.
2218
2219.. member:: int Solver::Summary::num_threads_given
2220
2221 Number of threads specified by the user for Jacobian and residual
2222 evaluation.
2223
2224.. member:: int Solver::Summary::num_threads_used
2225
2226 Number of threads actually used by the solver for Jacobian and
2227 residual evaluation. This number is not equal to
2228 :member:`Solver::Summary::num_threads_given` if none of `OpenMP`
   or `CXX_THREADS` is available.

2231.. member:: LinearSolverType Solver::Summary::linear_solver_type_given
2232
2233 Type of the linear solver requested by the user.
2234
2235.. member:: LinearSolverType Solver::Summary::linear_solver_type_used
2236
2237 Type of the linear solver actually used. This may be different from
2238 :member:`Solver::Summary::linear_solver_type_given` if Ceres
2239 determines that the problem structure is not compatible with the
2240 linear solver requested or if the linear solver requested by the
   user is not available, e.g. the user requested
2242 `SPARSE_NORMAL_CHOLESKY` but no sparse linear algebra library was
2243 available.
2244
2245.. member:: vector<int> Solver::Summary::linear_solver_ordering_given
2246
2247 Size of the elimination groups given by the user as hints to the
2248 linear solver.
2249
2250.. member:: vector<int> Solver::Summary::linear_solver_ordering_used
2251
2252 Size of the parameter groups used by the solver when ordering the
   columns of the Jacobian. This may be different from
2254 :member:`Solver::Summary::linear_solver_ordering_given` if the user
2255 left :member:`Solver::Summary::linear_solver_ordering_given` blank
2256 and asked for an automatic ordering, or if the problem contains
2257 some constant or inactive parameter blocks.
2258
2259.. member:: std::string Solver::Summary::schur_structure_given
2260
2261 For Schur type linear solvers, this string describes the template
2262 specialization which was detected in the problem and should be
2263 used.
2264
2265.. member:: std::string Solver::Summary::schur_structure_used
2266
2267 For Schur type linear solvers, this string describes the template
   specialization that was actually instantiated and used. The reason
   this will be different from
   :member:`Solver::Summary::schur_structure_given` is that the
   corresponding template specialization does not exist.
2272
2273 Template specializations can be added to ceres by editing
2274 ``internal/ceres/generate_template_specializations.py``
2275
2276.. member:: bool Solver::Summary::inner_iterations_given
2277
2278 `True` if the user asked for inner iterations to be used as part of
2279 the optimization.
2280
2281.. member:: bool Solver::Summary::inner_iterations_used
2282
2283 `True` if the user asked for inner iterations to be used as part of
2284 the optimization and the problem structure was such that they were
2285 actually performed. For example, in a problem with just one parameter
2286 block, inner iterations are not performed.
2287
.. member:: vector<int> Solver::Summary::inner_iteration_ordering_given
2289
2290 Size of the parameter groups given by the user for performing inner
2291 iterations.
2292
.. member:: vector<int> Solver::Summary::inner_iteration_ordering_used
2294
   Size of the parameter groups used by the solver for
   performing inner iterations. This may be different from
2297 :member:`Solver::Summary::inner_iteration_ordering_given` if the
2298 user left :member:`Solver::Summary::inner_iteration_ordering_given`
2299 blank and asked for an automatic ordering, or if the problem
2300 contains some constant or inactive parameter blocks.
2301
2302.. member:: PreconditionerType Solver::Summary::preconditioner_type_given
2303
2304 Type of the preconditioner requested by the user.
2305
2306.. member:: PreconditionerType Solver::Summary::preconditioner_type_used
2307
2308 Type of the preconditioner actually used. This may be different
   from :member:`Solver::Summary::preconditioner_type_given` if Ceres
2310 determines that the problem structure is not compatible with the
2311 linear solver requested or if the linear solver requested by the
2312 user is not available.
2313
2314.. member:: VisibilityClusteringType Solver::Summary::visibility_clustering_type
2315
2316 Type of clustering algorithm used for visibility based
2317 preconditioning. Only meaningful when the
   :member:`Solver::Summary::preconditioner_type_used` is
2319 ``CLUSTER_JACOBI`` or ``CLUSTER_TRIDIAGONAL``.
2320
2321.. member:: TrustRegionStrategyType Solver::Summary::trust_region_strategy_type
2322
2323 Type of trust region strategy.
2324
2325.. member:: DoglegType Solver::Summary::dogleg_type
2326
2327 Type of dogleg strategy used for solving the trust region problem.
2328
2329.. member:: DenseLinearAlgebraLibraryType Solver::Summary::dense_linear_algebra_library_type
2330
2331 Type of the dense linear algebra library used.
2332
2333.. member:: SparseLinearAlgebraLibraryType Solver::Summary::sparse_linear_algebra_library_type
2334
2335 Type of the sparse linear algebra library used.
2336
2337.. member:: LineSearchDirectionType Solver::Summary::line_search_direction_type
2338
2339 Type of line search direction used.
2340
2341.. member:: LineSearchType Solver::Summary::line_search_type
2342
2343 Type of the line search algorithm used.
2344
2345.. member:: LineSearchInterpolationType Solver::Summary::line_search_interpolation_type
2346
2347 When performing line search, the degree of the polynomial used to
2348 approximate the objective function.
2349
2350.. member:: NonlinearConjugateGradientType Solver::Summary::nonlinear_conjugate_gradient_type
2351
2352 If the line search direction is `NONLINEAR_CONJUGATE_GRADIENT`,
2353 then this indicates the particular variant of non-linear conjugate
2354 gradient used.
2355
2356.. member:: int Solver::Summary::max_lbfgs_rank
2357
2358 If the type of the line search direction is `LBFGS`, then this
2359 indicates the rank of the Hessian approximation.