.. highlight:: c++

.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-gradient_problem_solver:

==================================
General Unconstrained Minimization
==================================

Modeling
========

:class:`FirstOrderFunction`
---------------------------

.. class:: FirstOrderFunction

   Instances of :class:`FirstOrderFunction` implement the evaluation of
   a function and its gradient.

   .. code-block:: c++

     class FirstOrderFunction {
      public:
       virtual ~FirstOrderFunction() {}
       virtual bool Evaluate(const double* const parameters,
                             double* cost,
                             double* gradient) const = 0;
       virtual int NumParameters() const = 0;
     };

.. function:: bool FirstOrderFunction::Evaluate(const double* const parameters, double* cost, double* gradient) const

   Evaluate the cost/value of the function. If ``gradient`` is not
   ``nullptr`` then evaluate the gradient too. If the evaluation is
   successful, return ``true``, else return ``false``.

   ``cost`` is guaranteed to never be ``nullptr``; ``gradient`` can be ``nullptr``.

.. function:: int FirstOrderFunction::NumParameters() const

   Number of parameters in the domain of the function.

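As an illustrative sketch, a concrete :class:`FirstOrderFunction` for the
Rosenbrock function :math:`f(x, y) = (1 - x)^2 + 100(y - x^2)^2` might look as
follows; the class name ``Rosenbrock`` is only an example and not part of the
interface above.

.. code-block:: c++

  class Rosenbrock final : public ceres::FirstOrderFunction {
   public:
    bool Evaluate(const double* parameters,
                  double* cost,
                  double* gradient) const override {
      const double x = parameters[0];
      const double y = parameters[1];

      cost[0] = (1.0 - x) * (1.0 - x) + 100.0 * (y - x * x) * (y - x * x);
      if (gradient != nullptr) {
        // Analytic derivatives of the Rosenbrock function.
        gradient[0] = -2.0 * (1.0 - x) - 400.0 * x * (y - x * x);
        gradient[1] = 200.0 * (y - x * x);
      }
      return true;
    }

    int NumParameters() const override { return 2; }
  };
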

:class:`GradientProblem`
------------------------

.. class:: GradientProblem

.. code-block:: c++

  class GradientProblem {
   public:
    explicit GradientProblem(FirstOrderFunction* function);
    GradientProblem(FirstOrderFunction* function,
                    Manifold* manifold);
    int NumParameters() const;
    int NumTangentParameters() const;
    bool Evaluate(const double* parameters, double* cost, double* gradient) const;
    bool Plus(const double* x, const double* delta, double* x_plus_delta) const;
  };

Instances of :class:`GradientProblem` represent general non-linear
optimization problems that must be solved using just the value of the
objective function and its gradient. Unlike the :class:`Problem`
class, which can only be used to model non-linear least squares
problems, instances of :class:`GradientProblem` are not restricted in
the form of the objective function.

Structurally :class:`GradientProblem` is a composition of a
:class:`FirstOrderFunction` and optionally a :class:`Manifold`.

The :class:`FirstOrderFunction` is responsible for evaluating the cost
and gradient of the objective function.

The :class:`Manifold` is responsible for going back and forth between the
ambient space and the local tangent space. When a :class:`Manifold` is not
provided, then the tangent space is assumed to coincide with the ambient
Euclidean space that the gradient vector lives in.

The constructor takes ownership of the :class:`FirstOrderFunction` and
:class:`Manifold` objects passed to it.

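As a minimal sketch of the manifold case (the objective below is purely
illustrative, and the raw-pointer constructor shown above is assumed), a
problem whose parameters live on the unit sphere in :math:`\mathbb{R}^3`
could be assembled using ``SphereManifold``:

.. code-block:: c++

  // Illustrative objective over points in R^3: squared distance to (0, 0, 1).
  class SphereCost final : public ceres::FirstOrderFunction {
   public:
    bool Evaluate(const double* parameters,
                  double* cost,
                  double* gradient) const override {
      const double x = parameters[0];
      const double y = parameters[1];
      const double z = parameters[2];
      cost[0] = x * x + y * y + (z - 1.0) * (z - 1.0);
      if (gradient != nullptr) {
        // The gradient is expressed in the ambient space; the solver maps
        // it to the tangent space of the manifold.
        gradient[0] = 2.0 * x;
        gradient[1] = 2.0 * y;
        gradient[2] = 2.0 * (z - 1.0);
      }
      return true;
    }
    int NumParameters() const override { return 3; }  // Ambient dimension.
  };

  // The problem takes ownership of both objects. NumParameters() is 3,
  // while NumTangentParameters() is 2, the dimension of the sphere's
  // tangent space.
  ceres::GradientProblem problem(new SphereCost, new ceres::SphereManifold<3>);
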

.. function:: void Solve(const GradientProblemSolver::Options& options, const GradientProblem& problem, double* parameters, GradientProblemSolver::Summary* summary)

   Solve the given :class:`GradientProblem` using the values in
   ``parameters`` as the initial guess of the solution.

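Putting the pieces together, a minimal end-to-end sketch using the
illustrative ``Rosenbrock`` function defined earlier looks like this:

.. code-block:: c++

  #include <iostream>

  #include "ceres/ceres.h"

  int main(int argc, char** argv) {
    double parameters[2] = {-1.2, 1.0};  // A common Rosenbrock starting point.

    // The problem takes ownership of the FirstOrderFunction.
    ceres::GradientProblem problem(new Rosenbrock);

    ceres::GradientProblemSolver::Options options;
    options.minimizer_progress_to_stdout = true;

    ceres::GradientProblemSolver::Summary summary;
    ceres::Solve(options, problem, parameters, &summary);

    std::cout << summary.BriefReport() << "\n"
              << "x: " << parameters[0] << " y: " << parameters[1] << "\n";
    return 0;
  }
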

Solving
=======

:class:`GradientProblemSolver::Options`
---------------------------------------

.. class:: GradientProblemSolver::Options

   :class:`GradientProblemSolver::Options` controls the overall
   behavior of the solver. We list the various settings and their
   default values below.

.. function:: bool GradientProblemSolver::Options::IsValid(std::string* error) const

   Validates the values in the options struct and returns true on
   success. If there is a problem, the method returns false with
   ``error`` containing a textual description of the cause.

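For example, a configuration can be validated before solving; this is a
sketch and the particular settings are arbitrary:

.. code-block:: c++

  ceres::GradientProblemSolver::Options options;
  options.line_search_direction_type = ceres::LBFGS;
  options.max_num_iterations = 100;
  options.function_tolerance = 1e-8;

  std::string error;
  if (!options.IsValid(&error)) {
    // error now contains a textual description of the offending setting.
    std::cerr << "Invalid solver options: " << error << "\n";
  }
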
.. member:: LineSearchDirectionType GradientProblemSolver::Options::line_search_direction_type

   Default: ``LBFGS``

   Choices are ``STEEPEST_DESCENT``, ``NONLINEAR_CONJUGATE_GRADIENT``,
   ``BFGS`` and ``LBFGS``.

.. member:: LineSearchType GradientProblemSolver::Options::line_search_type

   Default: ``WOLFE``

   Choices are ``ARMIJO`` and ``WOLFE`` (strong Wolfe conditions).
   Note that in order for the assumptions underlying the ``BFGS`` and
   ``LBFGS`` line search direction algorithms to be guaranteed to be
   satisfied, the ``WOLFE`` line search should be used.

.. member:: NonlinearConjugateGradientType GradientProblemSolver::Options::nonlinear_conjugate_gradient_type

   Default: ``FLETCHER_REEVES``

   Choices are ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and
   ``HESTENES_STIEFEL``.

.. member:: int GradientProblemSolver::Options::max_lbfgs_rank

   Default: 20

   The L-BFGS Hessian approximation is a low rank approximation to the
   inverse of the Hessian matrix. The rank of the approximation
   determines (linearly) the space and time complexity of using the
   approximation. The higher the rank, the better the quality of the
   approximation. The increase in quality is however bounded for a
   number of reasons.

   1. The method only uses secant information and not actual
      derivatives.

   2. The Hessian approximation is constrained to be positive
      definite.

   So increasing this rank to a large number will cost time and space
   complexity without the corresponding increase in solution
   quality. There are no hard and fast rules for choosing the maximum
   rank. The best choice usually requires some problem specific
   experimentation.

.. member:: bool GradientProblemSolver::Options::use_approximate_eigenvalue_bfgs_scaling

   Default: ``false``

   As part of the ``BFGS`` update step / ``LBFGS`` right-multiply
   step, the initial inverse Hessian approximation is taken to be the
   Identity. However, [Oren]_ showed that using instead :math:`I *
   \gamma`, where :math:`\gamma` is a scalar chosen to approximate an
   eigenvalue of the true inverse Hessian, can result in improved
   convergence in a wide variety of cases. Setting
   ``use_approximate_eigenvalue_bfgs_scaling`` to true enables this
   scaling in ``BFGS`` (before first iteration) and ``LBFGS`` (at each
   iteration).

   Precisely, approximate eigenvalue scaling equates to

   .. math:: \gamma = \frac{y_k' s_k}{y_k' y_k}

   With:

   .. math:: y_k = \nabla f_{k+1} - \nabla f_k
   .. math:: s_k = x_{k+1} - x_k

   Where :math:`f()` is the line search objective and :math:`x` the
   vector of parameter values [NocedalWright]_.

   It is important to note that approximate eigenvalue scaling does
   **not** *always* improve convergence, and that it can in fact
   *significantly* degrade performance for certain classes of problem,
   which is why it is disabled by default. In particular it can
   degrade performance when the sensitivity of the problem to different
   parameters varies significantly, as in this case a single scalar
   factor fails to capture this variation and detrimentally downscales
   parts of the Jacobian approximation which correspond to
   low-sensitivity parameters. It can also reduce the robustness of the
   solution to errors in the Jacobians.

.. member:: LineSearchInterpolationType GradientProblemSolver::Options::line_search_interpolation_type

   Default: ``CUBIC``

   Degree of the polynomial used to approximate the objective
   function. Valid values are ``BISECTION``, ``QUADRATIC`` and
   ``CUBIC``.

.. member:: double GradientProblemSolver::Options::min_line_search_step_size

   The line search terminates if:

   .. math:: \|\Delta x_k\|_\infty < \text{min_line_search_step_size}

   where :math:`\|\cdot\|_\infty` refers to the max norm, and
   :math:`\Delta x_k` is the step change in the parameter values at
   the :math:`k`-th iteration.

.. member:: double GradientProblemSolver::Options::line_search_sufficient_function_decrease

   Default: ``1e-4``

   Solving the line search problem exactly is computationally
   prohibitive. Fortunately, line search based optimization algorithms
   can still guarantee convergence if instead of an exact solution,
   the line search algorithm returns a solution which decreases the
   value of the objective function sufficiently. More precisely, we
   are looking for a step size s.t.

   .. math:: f(\text{step_size}) \le f(0) + \text{sufficient_decrease} * [f'(0) * \text{step_size}]

   This condition is known as the Armijo condition.

.. member:: double GradientProblemSolver::Options::max_line_search_step_contraction

   Default: ``1e-3``

   In each iteration of the line search,

   .. math:: \text{new_step_size} \geq \text{max_line_search_step_contraction} * \text{step_size}

   Note that by definition, for contraction:

   .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1

.. member:: double GradientProblemSolver::Options::min_line_search_step_contraction

   Default: ``0.6``

   In each iteration of the line search,

   .. math:: \text{new_step_size} \leq \text{min_line_search_step_contraction} * \text{step_size}

   Note that by definition, for contraction:

   .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1

.. member:: int GradientProblemSolver::Options::max_num_line_search_step_size_iterations

   Default: ``20``

   Maximum number of trial step size iterations during each line
   search. If a step size satisfying the search conditions cannot be
   found within this number of trials, the line search will stop.

   As this is an 'artificial' constraint (one imposed by the user, not
   the underlying math), if ``WOLFE`` line search is being used, *and*
   points satisfying the Armijo sufficient (function) decrease
   condition have been found during the current search (in :math:`\leq`
   ``max_num_line_search_step_size_iterations``), then the step size
   with the lowest function value which satisfies the Armijo condition
   will be returned as the new valid step, even though it does *not*
   satisfy the strong Wolfe conditions. This behaviour protects
   against early termination of the optimizer at a sub-optimal point.

.. member:: int GradientProblemSolver::Options::max_num_line_search_direction_restarts

   Default: ``5``

   Maximum number of restarts of the line search direction algorithm
   before terminating the optimization. Restarts of the line search
   direction algorithm occur when the current algorithm fails to
   produce a new descent direction. This typically indicates a
   numerical failure, or a breakdown in the validity of the
   approximations used.

.. member:: double GradientProblemSolver::Options::line_search_sufficient_curvature_decrease

   Default: ``0.9``

   The strong Wolfe conditions consist of the Armijo sufficient
   decrease condition, and an additional requirement that the
   step size be chosen s.t. the *magnitude* ('strong' Wolfe
   conditions) of the gradient along the search direction
   decreases sufficiently. Precisely, this second condition
   is that we seek a step size s.t.

   .. math:: \|f'(\text{step_size})\| \leq \text{sufficient_curvature_decrease} * \|f'(0)\|

   Where :math:`f()` is the line search objective and :math:`f'()` is the derivative
   of :math:`f` with respect to the step size: :math:`\frac{d f}{d~\text{step size}}`.

.. member:: double GradientProblemSolver::Options::max_line_search_step_expansion

   Default: ``10.0``

   During the bracketing phase of a Wolfe line search, the step size
   is increased until either a point satisfying the Wolfe conditions
   is found, or an upper bound for a bracket containing a point
   satisfying the conditions is found. Precisely, at each iteration
   of the expansion:

   .. math:: \text{new_step_size} \leq \text{max_step_expansion} * \text{step_size}

   By definition for expansion

   .. math:: \text{max_step_expansion} > 1.0

.. member:: int GradientProblemSolver::Options::max_num_iterations

   Default: ``50``

   Maximum number of iterations for which the solver should run.

.. member:: double GradientProblemSolver::Options::max_solver_time_in_seconds

   Default: ``1e6``

   Maximum amount of time for which the solver should run.

.. member:: double GradientProblemSolver::Options::function_tolerance

   Default: ``1e-6``

   Solver terminates if

   .. math:: \frac{|\Delta \text{cost}|}{\text{cost}} \leq \text{function_tolerance}

   where :math:`\Delta \text{cost}` is the change in objective
   function value (up or down) in the current iteration of the line search.

.. member:: double GradientProblemSolver::Options::gradient_tolerance

   Default: ``1e-10``

   Solver terminates if

   .. math:: \|x - \Pi \boxplus(x, -g(x))\|_\infty \leq \text{gradient_tolerance}

   where :math:`\|\cdot\|_\infty` refers to the max norm, :math:`\Pi`
   is projection onto the bounds constraints and :math:`\boxplus` is
   the Plus operation for the manifold associated with the parameter
   vector.

.. member:: double GradientProblemSolver::Options::parameter_tolerance

   Default: ``1e-8``

   Solver terminates if

   .. math:: \|\Delta x\| \leq (\|x\| + \text{parameter_tolerance}) * \text{parameter_tolerance}

   where :math:`\Delta x` is the step computed by the linear solver in
   the current iteration of the line search.

.. member:: LoggingType GradientProblemSolver::Options::logging_type

   Default: ``PER_MINIMIZER_ITERATION``

.. member:: bool GradientProblemSolver::Options::minimizer_progress_to_stdout

   Default: ``false``

   By default the :class:`Minimizer` progress is logged to ``STDERR``
   depending on the ``vlog`` level. If this flag is set to true, and
   :member:`GradientProblemSolver::Options::logging_type` is not
   ``SILENT``, the logging output is sent to ``STDOUT``.

   The progress display looks like

   .. code-block:: bash

      0: f: 2.317806e+05 d: 0.00e+00 g: 3.19e-01 h: 0.00e+00 s: 0.00e+00 e: 0 it: 2.98e-02 tt: 8.50e-02
      1: f: 2.312019e+05 d: 5.79e+02 g: 3.18e-01 h: 2.41e+01 s: 1.00e+00 e: 1 it: 4.54e-02 tt: 1.31e-01
      2: f: 2.300462e+05 d: 1.16e+03 g: 3.17e-01 h: 4.90e+01 s: 2.54e-03 e: 1 it: 4.96e-02 tt: 1.81e-01

   Here

   #. ``f`` is the value of the objective function.
   #. ``d`` is the change in the value of the objective function if
      the step computed in this iteration is accepted.
   #. ``g`` is the max norm of the gradient.
   #. ``h`` is the change in the parameter vector.
   #. ``s`` is the optimal step length computed by the line search.
   #. ``e`` is the number of function evaluations used by the line search.
   #. ``it`` is the time taken by the current iteration.
   #. ``tt`` is the total time taken by the minimizer.

.. member:: std::vector<IterationCallback*> GradientProblemSolver::Options::callbacks

   Callbacks that are executed at the end of each iteration of the
   :class:`Minimizer`. They are executed in the order that they are
   specified in this vector. By default, parameter blocks are updated
   only at the end of the optimization, i.e., when the
   :class:`Minimizer` terminates. This behavior is controlled by
   :member:`GradientProblemSolver::Options::update_state_every_iteration`. If
   the user wishes to have access to the updated parameter blocks when
   their callbacks are executed, then set
   :member:`GradientProblemSolver::Options::update_state_every_iteration`
   to true.

   The solver does NOT take ownership of these pointers.

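As a minimal sketch (the ``LoggingCallback`` class name is illustrative), a
callback that prints the cost at every iteration could be registered like
this:

.. code-block:: c++

  class LoggingCallback final : public ceres::IterationCallback {
   public:
    ceres::CallbackReturnType operator()(
        const ceres::IterationSummary& summary) override {
      std::cout << "iteration: " << summary.iteration
                << " cost: " << summary.cost << "\n";
      return ceres::SOLVER_CONTINUE;
    }
  };

  LoggingCallback callback;
  ceres::GradientProblemSolver::Options options;
  options.callbacks.push_back(&callback);  // The solver does not take ownership.
  // Required if the callback needs to see the freshly updated parameter values.
  options.update_state_every_iteration = true;
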

.. member:: bool GradientProblemSolver::Options::update_state_every_iteration

   Default: ``false``

   Normally the parameter vector is only updated when the solver
   terminates. Setting this to true updates it every iteration. This
   setting is useful when building an interactive application using
   Ceres and using an :class:`IterationCallback`.

:class:`GradientProblemSolver::Summary`
---------------------------------------

.. class:: GradientProblemSolver::Summary

   Summary of the various stages of the solver after termination.

.. function:: std::string GradientProblemSolver::Summary::BriefReport() const

   A brief one line description of the state of the solver after
   termination.

.. function:: std::string GradientProblemSolver::Summary::FullReport() const

   A full multiline description of the state of the solver after
   termination.

.. function:: bool GradientProblemSolver::Summary::IsSolutionUsable() const

   Whether the solution returned by the optimization algorithm can be
   relied on to be numerically sane. This will be the case if
   :member:`GradientProblemSolver::Summary::termination_type` is set to
   ``CONVERGENCE``, ``USER_SUCCESS`` or ``NO_CONVERGENCE``, i.e., either
   the solver converged by meeting one of the convergence tolerances or
   because the user indicated that it had converged or it ran to the
   maximum number of iterations or time.

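For example, the summary can be consulted after ``Solve`` returns before
trusting the result; this is a sketch reusing the ``options``, ``problem``
and ``parameters`` objects from the earlier examples:

.. code-block:: c++

  ceres::GradientProblemSolver::Summary summary;
  ceres::Solve(options, problem, parameters, &summary);

  if (!summary.IsSolutionUsable()) {
    // Something went numerically wrong; inspect the detailed report.
    std::cout << summary.FullReport() << "\n";
  } else if (summary.termination_type == ceres::CONVERGENCE) {
    std::cout << summary.BriefReport() << "\n";
  }
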
.. member:: TerminationType GradientProblemSolver::Summary::termination_type

   The cause of the minimizer terminating.

.. member:: std::string GradientProblemSolver::Summary::message

   Reason why the solver terminated.

.. member:: double GradientProblemSolver::Summary::initial_cost

   Cost of the problem (value of the objective function) before the
   optimization.

.. member:: double GradientProblemSolver::Summary::final_cost

   Cost of the problem (value of the objective function) after the
   optimization.

.. member:: std::vector<IterationSummary> GradientProblemSolver::Summary::iterations

   :class:`IterationSummary` for each minimizer iteration in order.

.. member:: int GradientProblemSolver::Summary::num_cost_evaluations

   Number of times the cost (and not the gradient) was evaluated.

.. member:: int GradientProblemSolver::Summary::num_gradient_evaluations

   Number of times the gradient (and the cost) were evaluated.

.. member:: double GradientProblemSolver::Summary::total_time_in_seconds

   Time (in seconds) spent in the solver.

.. member:: double GradientProblemSolver::Summary::cost_evaluation_time_in_seconds

   Time (in seconds) spent evaluating the cost.

.. member:: double GradientProblemSolver::Summary::gradient_evaluation_time_in_seconds

   Time (in seconds) spent evaluating the gradient.

.. member:: int GradientProblemSolver::Summary::num_parameters

   Number of parameters in the problem.

.. member:: int GradientProblemSolver::Summary::num_tangent_parameters

   Dimension of the tangent space of the problem. This is different
   from :member:`GradientProblemSolver::Summary::num_parameters` if a
   :class:`Manifold` object is used.

.. member:: LineSearchDirectionType GradientProblemSolver::Summary::line_search_direction_type

   Type of line search direction used.

.. member:: LineSearchType GradientProblemSolver::Summary::line_search_type

   Type of the line search algorithm used.

.. member:: LineSearchInterpolationType GradientProblemSolver::Summary::line_search_interpolation_type

   When performing line search, the degree of the polynomial used to
   approximate the objective function.

.. member:: NonlinearConjugateGradientType GradientProblemSolver::Summary::nonlinear_conjugate_gradient_type

   If the line search direction is ``NONLINEAR_CONJUGATE_GRADIENT``,
   then this indicates the particular variant of non-linear conjugate
   gradient used.

.. member:: int GradientProblemSolver::Summary::max_lbfgs_rank

   If the type of the line search direction is ``LBFGS``, then this
   indicates the rank of the Hessian approximation.