.. highlight:: c++

.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-nnls_solving:

================================
Solving Non-linear Least Squares
================================

Introduction
============

Effective use of Ceres Solver requires some familiarity with the basic
components of a non-linear least squares solver, so before we describe
how to configure and use the solver, we will take a brief look at how
some of the core optimization algorithms in Ceres Solver work.

Let :math:`x \in \mathbb{R}^n` be an :math:`n`-dimensional vector of
variables, and
:math:`F(x) = \left[f_1(x), ... , f_{m}(x) \right]^{\top}` be an
:math:`m`-dimensional function of :math:`x`. We are interested in
solving the optimization problem [#f1]_

.. math:: \arg \min_x \frac{1}{2}\|F(x)\|^2\ . \\
          L \le x \le U
   :label: nonlinsq

where :math:`L` and :math:`U` are vectors of lower and upper bounds on
the parameter vector :math:`x`. The inequality holds component-wise.

Since the efficient global minimization of :eq:`nonlinsq` for
general :math:`F(x)` is an intractable problem, we will have to settle
for finding a local minimum.

In the following, the Jacobian :math:`J(x)` of :math:`F(x)` is an
:math:`m\times n` matrix, where :math:`J_{ij}(x) = D_j f_i(x)`
and the gradient vector is :math:`g(x) = \nabla \frac{1}{2}\|F(x)\|^2
= J(x)^\top F(x)`.

The general strategy when solving non-linear optimization problems is
to solve a sequence of approximations to the original problem
[NocedalWright]_. At each iteration, the approximation is solved to
determine a correction :math:`\Delta x` to the vector :math:`x`. For
non-linear least squares, an approximation can be constructed by using
the linearization :math:`F(x+\Delta x) \approx F(x) + J(x)\Delta x`,
which leads to the following linear least squares problem:

.. math:: \min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2
   :label: linearapprox

Unfortunately, naively solving a sequence of these problems and
updating :math:`x \leftarrow x+ \Delta x` leads to an algorithm that
may not converge. To get a convergent algorithm, we need to control
the size of the step :math:`\Delta x`. Depending on how the size of
the step :math:`\Delta x` is controlled, non-linear optimization
algorithms can be divided into two major categories [NocedalWright]_.
1. **Trust Region** The trust region approach approximates the
   objective function using a model function (often a quadratic) over
   a subset of the search space known as the trust region. If the
   model function succeeds in minimizing the true objective function
   the trust region is expanded; otherwise it is contracted and the
   model optimization problem is solved again.

2. **Line Search** The line search approach first finds a descent
   direction along which the objective function will be reduced and
   then computes a step size that decides how far to move along
   that direction. The descent direction can be computed by various
   methods, such as gradient descent, Newton's method and Quasi-Newton
   methods. The step size can be determined either exactly or
   inexactly.

Trust region methods are in some sense dual to line search methods:
trust region methods first choose a step size (the size of the trust
region) and then a step direction while line search methods first
choose a step direction and then a step size. Ceres Solver implements
multiple algorithms in both categories.

.. _section-trust-region-methods:

Trust Region Methods
====================

The basic trust region algorithm looks something like this.

   1. Given an initial point :math:`x` and a trust region radius :math:`\mu`.
   2. Solve

      .. math::
         \arg \min_{\Delta x}& \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\
         \text{such that} &\|D(x)\Delta x\|^2 \le \mu\\
         &L \le x + \Delta x \le U.

   3. :math:`\rho = \frac{\displaystyle \|F(x + \Delta x)\|^2 -
      \|F(x)\|^2}{\displaystyle \|J(x)\Delta x + F(x)\|^2 -
      \|F(x)\|^2}`
   4. if :math:`\rho > \epsilon` then :math:`x = x + \Delta x`.
   5. if :math:`\rho > \eta_1` then :math:`\mu = 2 \mu`
   6. else if :math:`\rho < \eta_2` then :math:`\mu = 0.5 * \mu`
   7. Go to 2.

Here, :math:`\mu` is the trust region radius, :math:`D(x)` is some
matrix used to define a metric on the domain of :math:`F(x)` and
:math:`\rho` measures the quality of the step :math:`\Delta x`, i.e.,
how well did the linear model predict the decrease in the value of the
non-linear objective. The idea is to increase or decrease the radius
of the trust region depending on how well the linearization predicts
the behavior of the non-linear objective, which in turn is reflected
in the value of :math:`\rho`.

The key computational step in a trust-region algorithm is the solution
of the constrained optimization problem

.. math::
   \arg \min_{\Delta x}&\quad \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\
   \text{such that} &\quad \|D(x)\Delta x\|^2 \le \mu\\
   &\quad L \le x + \Delta x \le U.
   :label: trp

There are a number of different ways of solving this problem, each
giving rise to a different concrete trust-region algorithm. Currently,
Ceres implements two trust-region algorithms - Levenberg-Marquardt
and Dogleg, each of which is augmented with a line search if bounds
constraints are present [Kanzow]_. The user can choose between them by
setting :member:`Solver::Options::trust_region_strategy_type`.
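
As a rough configuration sketch (assuming the usual
``ceres::Solver::Options`` interface), the choice might look like this:

.. code-block:: c++

   ceres::Solver::Options options;
   // LEVENBERG_MARQUARDT is the default trust region strategy.
   options.trust_region_strategy_type = ceres::LEVENBERG_MARQUARDT;
   // Alternatively, use Powell's Dogleg method.
   // options.trust_region_strategy_type = ceres::DOGLEG;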

.. rubric:: Footnotes

.. [#f1] At the level of the non-linear solver, the block structure is
   not relevant, therefore our discussion here is in terms of an
   optimization problem defined over a state vector of size
   :math:`n`. Similarly the presence of loss functions is also
   ignored as the problem is internally converted into a pure
   non-linear least squares problem.


.. _section-levenberg-marquardt:

Levenberg-Marquardt
-------------------

The Levenberg-Marquardt algorithm [Levenberg]_ [Marquardt]_ is the
most popular algorithm for solving non-linear least squares problems.
It was also the first trust region algorithm to be developed
[Levenberg]_ [Marquardt]_. Ceres implements an exact step [Madsen]_
and an inexact step variant of the Levenberg-Marquardt algorithm
[WrightHolt]_ [NashSofer]_.

It can be shown that the solution to :eq:`trp` can be obtained by
solving an unconstrained optimization of the form

.. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 +\lambda \|D(x)\Delta x\|^2
   :label: lsqr-naive

where :math:`\lambda` is a Lagrange multiplier that is inversely
related to :math:`\mu`. In Ceres, we solve for

.. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 + \frac{1}{\mu} \|D(x)\Delta x\|^2
   :label: lsqr

The matrix :math:`D(x)` is a non-negative diagonal matrix, typically
the square root of the diagonal of the matrix :math:`J(x)^\top J(x)`.

Before going further, let us make some notational simplifications.

We will assume that the matrix :math:`\frac{1}{\sqrt{\mu}} D` has been
concatenated at the bottom of the matrix :math:`J(x)` and a
corresponding vector of zeroes has been added to the bottom of
:math:`F(x)`, i.e.:

.. math:: J(x) = \begin{bmatrix} J(x) \\ \frac{1}{\sqrt{\mu}} D
          \end{bmatrix},\quad F(x) = \begin{bmatrix} F(x) \\ 0
          \end{bmatrix}.

This allows us to re-write :eq:`lsqr` as

.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + F(x)\|^2 .
   :label: simple

and only talk about :math:`J(x)` and :math:`F(x)` going forward.

For all but the smallest problems the solution of :eq:`simple` in each
iteration of the Levenberg-Marquardt algorithm is the dominant
computational cost. Ceres provides a number of different options for
solving :eq:`simple`. There are two major classes of methods -
factorization and iterative.

The factorization methods are based on computing an exact solution of
:eq:`lsqr` using a Cholesky or a QR factorization and lead to the so
called exact step Levenberg-Marquardt algorithm. But it is not clear
if an exact solution of :eq:`lsqr` is necessary at each step of the
Levenberg-Marquardt algorithm. We have already seen evidence that
this may not be the case, as :eq:`lsqr` is itself a regularized
version of :eq:`linearapprox`. Indeed, it is possible to construct
non-linear optimization algorithms in which the linearized problem is
solved approximately. These algorithms are known as inexact Newton or
truncated Newton methods [NocedalWright]_.

An inexact Newton method requires two ingredients. First, a cheap
method for approximately solving systems of linear
equations. Typically an iterative linear solver like the Conjugate
Gradients method is used for this purpose [NocedalWright]_. Second, a
termination rule for the iterative solver. A typical termination rule
is of the form

.. math:: \|H(x) \Delta x + g(x)\| \leq \eta_k \|g(x)\|.
   :label: inexact

Here, :math:`k` indicates the Levenberg-Marquardt iteration number and
:math:`0 < \eta_k <1` is known as the forcing sequence. [WrightHolt]_
prove that a truncated Levenberg-Marquardt algorithm that uses an
inexact Newton step based on :eq:`inexact` converges for any
sequence :math:`\eta_k \leq \eta_0 < 1` and the rate of convergence
depends on the choice of the forcing sequence :math:`\eta_k`.

Ceres supports both exact and inexact step solution strategies. When
the user chooses a factorization based linear solver, the exact step
Levenberg-Marquardt algorithm is used. When the user chooses an
iterative linear solver, the inexact step Levenberg-Marquardt
algorithm is used.

We will talk more about the various linear solvers that you can use in
:ref:`section-linear-solver`.

.. _section-dogleg:

Dogleg
------

Another strategy for solving the trust region problem :eq:`trp` was
introduced by
`M. J. D. Powell <https://en.wikipedia.org/wiki/Michael_J._D._Powell>`_. The
key idea there is to compute two vectors

.. math::

   \Delta x^{\text{Gauss-Newton}} &= \arg \min_{\Delta x}\frac{1}{2} \|J(x)\Delta x + F(x)\|^2.\\
   \Delta x^{\text{Cauchy}} &= -\frac{\|g(x)\|^2}{\|J(x)g(x)\|^2}g(x).

Note that the vector :math:`\Delta x^{\text{Gauss-Newton}}` is the
solution to :eq:`linearapprox` and :math:`\Delta
x^{\text{Cauchy}}` is the vector that minimizes the linear
approximation if we restrict ourselves to moving along the direction
of the gradient. The Dogleg method finds a vector :math:`\Delta x`
defined by :math:`\Delta x^{\text{Gauss-Newton}}` and :math:`\Delta
x^{\text{Cauchy}}` that solves the trust region problem. Ceres
supports two variants that can be chosen by setting
:member:`Solver::Options::dogleg_type`.
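
For example, a sketch of selecting the subspace variant described
below (option names as used elsewhere in this document):

.. code-block:: c++

   ceres::Solver::Options options;
   options.trust_region_strategy_type = ceres::DOGLEG;
   // TRADITIONAL_DOGLEG is the default; SUBSPACE_DOGLEG minimizes over
   // the full two dimensional subspace described below.
   options.dogleg_type = ceres::SUBSPACE_DOGLEG;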

``TRADITIONAL_DOGLEG``, as described by Powell, constructs two line
segments using the Gauss-Newton and Cauchy vectors and finds the point
farthest along this line shaped like a dogleg (hence the name) that is
contained in the trust-region. For more details on the exact reasoning
and computations, please see Madsen et al [Madsen]_.

``SUBSPACE_DOGLEG`` is a more sophisticated method that considers the
entire two dimensional subspace spanned by these two vectors and finds
the point that minimizes the trust region problem in this subspace
[ByrdSchnabel]_.

The key advantage of the Dogleg over Levenberg-Marquardt is that if
the step computation for a particular choice of :math:`\mu` does not
result in sufficient decrease in the value of the objective function,
Levenberg-Marquardt solves the linear approximation from scratch with
a smaller value of :math:`\mu`. Dogleg on the other hand, only needs
to compute the interpolation between the Gauss-Newton and the Cauchy
vectors, as neither of them depends on the value of :math:`\mu`. As a
result the Dogleg method only solves one linear system per successful
step, while Levenberg-Marquardt may need to solve an arbitrary number
of linear systems before it can make progress [LourakisArgyros]_.

A disadvantage of the Dogleg implementation in Ceres Solver is that it
can only be used with exact factorization based linear solvers.

.. _section-inner-iterations:

Inner Iterations
----------------

Some non-linear least squares problems have additional structure in
the way the parameter blocks interact, which makes it beneficial to
modify the way the trust region step is computed. For example, consider
the following regression problem

.. math:: y = a_1 e^{b_1 x} + a_2 e^{b_2 x^2 + c_1}


Given a set of pairs :math:`\{(x_i, y_i)\}`, the user wishes to estimate
:math:`a_1, a_2, b_1, b_2`, and :math:`c_1`.

Notice that the right hand side of this equation is linear in :math:`a_1` and
:math:`a_2`, and given any value for :math:`b_1, b_2` and :math:`c_1`,
it is possible to use linear regression to estimate the optimal values
of :math:`a_1` and :math:`a_2`. It's possible to analytically
eliminate the variables :math:`a_1` and :math:`a_2` from the problem
entirely. Problems like these are known as separable least squares
problems and the most famous algorithm for solving them is the Variable
Projection algorithm invented by Golub & Pereyra [GolubPereyra]_.

Similar structure can be found in the matrix factorization with
missing data problem. There the corresponding algorithm is known as
Wiberg's algorithm [Wiberg]_.

Ruhe & Wedin present an analysis of various algorithms for solving
separable non-linear least squares problems and refer to *Variable
Projection* as Algorithm I in their paper [RuheWedin]_.

Implementing Variable Projection is tedious and expensive. Ruhe &
Wedin present a simpler algorithm with comparable convergence
properties, which they call Algorithm II. Algorithm II performs an
additional optimization step to estimate :math:`a_1` and :math:`a_2`
exactly after computing a successful Newton step.


This idea can be generalized to cases where the residual is not
linear in :math:`a_1` and :math:`a_2`, i.e.,

.. math:: y = f_1(a_1, e^{b_1 x}) + f_2(a_2, e^{b_2 x^2 + c_1})

In this case, we solve for the trust region step for the full problem,
and then use it as the starting point to further optimize just :math:`a_1`
and :math:`a_2`. For the linear case, this amounts to doing a single linear
least squares solve. For non-linear problems, any method for solving
the :math:`a_1` and :math:`a_2` optimization problems will do. The
only constraint on :math:`a_1` and :math:`a_2` (if they are two
different parameter blocks) is that they do not co-occur in a residual
block.

This idea can be further generalized, by not just optimizing
:math:`(a_1, a_2)`, but decomposing the graph corresponding to the
Hessian matrix's sparsity structure into a collection of
non-overlapping independent sets and optimizing each of them.

Setting :member:`Solver::Options::use_inner_iterations` to ``true``
enables the use of this non-linear generalization of Ruhe & Wedin's
Algorithm II. This makes each iteration of Ceres more expensive, but
also results in better convergence behavior per iteration.

Setting :member:`Solver::Options::num_threads` to the maximum number
possible is highly recommended.
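
A minimal sketch of enabling inner iterations (by default Ceres
computes the decomposition into independent sets itself):

.. code-block:: c++

   ceres::Solver::Options options;
   options.use_inner_iterations = true;
   // Inner iterations benefit from parallelism; pick the number of
   // cores available on your machine.
   options.num_threads = 8;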

.. _section-non-monotonic-steps:

Non-monotonic Steps
-------------------

Note that the basic trust-region algorithm described in
:ref:`section-trust-region-methods` is a descent algorithm in that it
only accepts a point if it strictly reduces the value of the objective
function.

Relaxing this requirement allows the algorithm to be more efficient in
the long term at the cost of some local increase in the value of the
objective function.

This is because allowing for non-decreasing objective function values
in a principled manner lets the algorithm *jump over boulders*, as the
method is not restricted to moving through narrow valleys, while still
preserving its convergence properties.

Setting :member:`Solver::Options::use_nonmonotonic_steps` to ``true``
enables the non-monotonic trust region algorithm as described by Conn,
Gould & Toint in [Conn]_.

Even though the value of the objective function may be larger than the
minimum value encountered over the course of the optimization, the
final parameters returned to the user are the ones corresponding to
the minimum cost over all iterations.

The option to take non-monotonic steps is available for all trust
region strategies.
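
Enabling it is a one line change, e.g.:

.. code-block:: c++

   ceres::Solver::Options options;
   options.use_nonmonotonic_steps = true;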


.. _section-line-search-methods:

Line Search Methods
===================

.. NOTE::

   The line search method in Ceres Solver cannot handle bounds
   constraints right now, so it can only be used for solving
   unconstrained problems.

The basic line search algorithm looks something like this:

   1. Given an initial point :math:`x`
   2. :math:`\Delta x = -H^{-1}(x) g(x)`
   3. :math:`\arg \min_\mu \frac{1}{2} \| F(x + \mu \Delta x) \|^2`
   4. :math:`x = x + \mu \Delta x`
   5. Goto 2.

Here :math:`H(x)` is some approximation to the Hessian of the
objective function, and :math:`g(x)` is the gradient at
:math:`x`. Depending on the choice of :math:`H(x)` we get a variety of
different search directions :math:`\Delta x`.

Step 3, which is a one dimensional optimization or `Line Search` along
:math:`\Delta x`, is what gives this class of methods its name.

Different line search algorithms differ in their choice of the search
direction :math:`\Delta x` and the method used for one dimensional
optimization along :math:`\Delta x`. The choice of :math:`H(x)` is the
primary source of computational complexity in these
methods. Currently, Ceres Solver supports four choices of search
directions, all aimed at large scale problems.

1. ``STEEPEST_DESCENT`` This corresponds to choosing :math:`H(x)` to
   be the identity matrix. This is not a good search direction for
   anything but the simplest of the problems. It is only included here
   for completeness.

2. ``NONLINEAR_CONJUGATE_GRADIENT`` A generalization of the Conjugate
   Gradient method to non-linear functions. The generalization can be
   performed in a number of different ways, resulting in a variety of
   search directions. Ceres Solver currently supports
   ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and ``HESTENES_STIEFEL``
   directions.

3. ``BFGS`` A generalization of the Secant method to multiple
   dimensions in which a full, dense approximation to the inverse
   Hessian is maintained and used to compute a quasi-Newton step
   [NocedalWright]_. ``BFGS`` and its limited memory variant ``LBFGS``
   are currently the best known general quasi-Newton algorithms.

4. ``LBFGS`` A limited memory approximation to the full ``BFGS``
   method in which the last `M` iterations are used to approximate the
   inverse Hessian used to compute a quasi-Newton step [Nocedal]_,
   [ByrdNocedal]_.

Currently Ceres Solver supports both a backtracking and interpolation
based `Armijo line search algorithm
<https://en.wikipedia.org/wiki/Backtracking_line_search>`_ (``ARMIJO``),
and a sectioning / zoom interpolation (strong) `Wolfe condition line
search algorithm <https://en.wikipedia.org/wiki/Wolfe_conditions>`_
(``WOLFE``).

.. NOTE::

   In order for the assumptions underlying the ``BFGS`` and ``LBFGS``
   methods to be satisfied the ``WOLFE`` algorithm must be used.
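
Putting these choices together, a line search configuration might look
like the following sketch (enum spellings as listed above):

.. code-block:: c++

   ceres::Solver::Options options;
   options.minimizer_type = ceres::LINE_SEARCH;
   options.line_search_direction_type = ceres::LBFGS;
   // WOLFE is required for BFGS/LBFGS, as noted above.
   options.line_search_type = ceres::WOLFE;
   // For NONLINEAR_CONJUGATE_GRADIENT one would also pick a variant, e.g.
   // options.nonlinear_conjugate_gradient_type = ceres::POLAK_RIBIERE;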

.. _section-linear-solver:

Linear Solvers
==============

Observe that for both of the trust-region methods described above, the
key computational cost is the solution of a linear least squares
problem of the form

.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + F(x)\|^2 .
   :label: simple2

Let :math:`H(x)= J(x)^\top J(x)` and :math:`g(x) = -J(x)^\top
F(x)`. For notational convenience let us also drop the dependence on
:math:`x`. Then it is easy to see that solving :eq:`simple2` is
equivalent to solving the *normal equations*.

.. math:: H \Delta x = g
   :label: normal

Ceres provides a number of different options for solving :eq:`normal`.

.. _section-qr:

DENSE_QR
--------

For small problems (a couple of hundred parameters and a few thousand
residuals) with relatively dense Jacobians, QR-decomposition is the
method of choice [Bjorck]_. Let :math:`J = QR` be the QR-decomposition
of :math:`J`, where :math:`Q` is an orthonormal matrix and :math:`R`
is an upper triangular matrix [TrefethenBau]_. Then it can be shown
that the solution to :eq:`normal` is given by

.. math:: \Delta x^* = -R^{-1}Q^\top F

You can use QR-decomposition by setting
:member:`Solver::Options::linear_solver_type` to ``DENSE_QR``.

By default (``Solver::Options::dense_linear_algebra_library_type =
EIGEN``) Ceres Solver will use `Eigen Householder QR factorization
<https://eigen.tuxfamily.org/dox-devel/classEigen_1_1HouseholderQR.html>`_.

If Ceres Solver has been built with an optimized LAPACK
implementation, then the user can also choose to use LAPACK's
`DGEQRF`_ routine by setting
:member:`Solver::Options::dense_linear_algebra_library_type` to
``LAPACK``. Depending on the `LAPACK` and the underlying `BLAS`
implementation this may perform better than using Eigen's Householder
QR factorization.

.. _DGEQRF: https://netlib.org/lapack/explore-html/df/dc5/group__variants_g_ecomputational_ga3766ea903391b5cf9008132f7440ec7b.html


If an NVIDIA GPU is available and Ceres Solver has been built with
CUDA support enabled, then the user can also choose to perform the
QR-decomposition on the GPU by setting
:member:`Solver::Options::dense_linear_algebra_library_type` to
``CUDA``. Depending on the GPU this can lead to a substantial
speedup. Using CUDA only makes sense for moderate to large sized
problems. This is because to perform the decomposition on the GPU the
matrix :math:`J` needs to be transferred from the CPU to the GPU and
this incurs a cost. So unless the speedup from doing the decomposition
on the GPU is large enough to also account for the time taken to
transfer the Jacobian to the GPU, using CUDA will not be better than
just doing the decomposition on the CPU.
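
A configuration sketch combining these choices (whether ``LAPACK`` or
``CUDA`` is actually available depends on how Ceres was built):

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::DENSE_QR;
   // EIGEN is the default; LAPACK or CUDA may be faster if available.
   options.dense_linear_algebra_library_type = ceres::EIGEN;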

.. _section-dense-normal-cholesky:

DENSE_NORMAL_CHOLESKY
---------------------

It is often the case that the number of rows in the Jacobian :math:`J`
is much larger than the number of columns. The complexity of QR
factorization scales linearly with the number of rows, so beyond a
certain size it is more efficient to solve :eq:`normal` using a dense
`Cholesky factorization
<https://en.wikipedia.org/wiki/Cholesky_decomposition>`_.

Let :math:`H = R^\top R` be the Cholesky factorization of the normal
equations, where :math:`R` is an upper triangular matrix. Then the
solution to :eq:`normal` is given by

.. math::

   \Delta x^* = R^{-1} R^{-\top} g.


The observant reader will note that the :math:`R` in the Cholesky
factorization of :math:`H` is the same upper triangular matrix
:math:`R` in the QR factorization of :math:`J`. Since :math:`Q` is an
orthonormal matrix, :math:`J=QR` implies that :math:`J^\top J = R^\top
Q^\top Q R = R^\top R`.

Unfortunately, forming the matrix :math:`H = J^\top J` squares the
condition number. As a result while the cost of forming :math:`H` and
computing its Cholesky factorization is lower than computing the
QR-factorization of :math:`J`, we pay the price in terms of increased
numerical instability and potential failure of the Cholesky
factorization for ill-conditioned Jacobians.

You can use dense Cholesky factorization by setting
:member:`Solver::Options::linear_solver_type` to
``DENSE_NORMAL_CHOLESKY``.

By default (``Solver::Options::dense_linear_algebra_library_type =
EIGEN``) Ceres Solver will use `Eigen's LLT factorization`_ routine.

.. _Eigen's LLT Factorization: https://eigen.tuxfamily.org/dox/classEigen_1_1LLT.html

If Ceres Solver has been built with an optimized LAPACK
implementation, then the user can also choose to use LAPACK's
`DPOTRF`_ routine by setting
:member:`Solver::Options::dense_linear_algebra_library_type` to
``LAPACK``. Depending on the `LAPACK` and the underlying `BLAS`
implementation this may perform better than using Eigen's Cholesky
factorization.

.. _DPOTRF: https://www.netlib.org/lapack/explore-html/d1/d7a/group__double_p_ocomputational_ga2f55f604a6003d03b5cd4a0adcfb74d6.html

If an NVIDIA GPU is available and Ceres Solver has been built with
CUDA support enabled, then the user can also choose to perform the
Cholesky factorization on the GPU by setting
:member:`Solver::Options::dense_linear_algebra_library_type` to
``CUDA``. Depending on the GPU this can lead to a substantial speedup.
Using CUDA only makes sense for moderate to large sized problems. This
is because to perform the decomposition on the GPU the matrix
:math:`H` needs to be transferred from the CPU to the GPU and this
incurs a cost. So unless the speedup from doing the decomposition on
the GPU is large enough to also account for the time taken to transfer
the Jacobian to the GPU, using CUDA will not be better than just doing
the decomposition on the CPU.


.. _section-sparse-normal-cholesky:

SPARSE_NORMAL_CHOLESKY
----------------------

Large non-linear least squares problems are usually sparse. In such
cases, using a dense QR or Cholesky factorization is inefficient. For
such problems, Cholesky factorization routines which treat :math:`H`
as a sparse matrix and compute a sparse factor :math:`R` are better
suited [Davis]_. This can lead to substantial savings in memory and
CPU time for large sparse problems.

You can use sparse Cholesky factorization by setting
:member:`Solver::Options::linear_solver_type` to
``SPARSE_NORMAL_CHOLESKY``.

The use of this linear solver requires that Ceres is compiled with
support for at least one of:

 1. `SuiteSparse <https://people.engr.tamu.edu/davis/suitesparse.html>`_ (``SUITE_SPARSE``).
 2. `Apple's Accelerate framework
    <https://developer.apple.com/documentation/accelerate/sparse_solvers?language=objc>`_
    (``ACCELERATE_SPARSE``).
 3. `Eigen's sparse linear solvers
    <https://eigen.tuxfamily.org/dox/group__SparseCholesky__Module.html>`_
    (``EIGEN_SPARSE``).

SuiteSparse and Accelerate offer high performance sparse Cholesky
factorization routines as they use level-3 BLAS routines
internally. Eigen's sparse Cholesky routines are *simplicial* and do
not use dense linear algebra routines and as a result cannot compete
with SuiteSparse and Accelerate, especially on large problems. To get
the best performance out of SuiteSparse it should be linked to high
quality BLAS and LAPACK implementations, e.g. `ATLAS
<https://math-atlas.sourceforge.net/>`_, `OpenBLAS
<https://www.openblas.net/>`_ or `Intel MKL
<https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl.html>`_.

A critical part of a sparse Cholesky factorization routine is the use
of a fill-reducing ordering. By default Ceres Solver uses the Approximate
Minimum Degree (``AMD``) ordering, which usually performs well, but
there are other options that may perform better depending on the
actual sparsity structure of the Jacobian. See :ref:`section-ordering`
for more details.
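
A configuration sketch (which sparse backends are available depends on
how Ceres was built):

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
   options.sparse_linear_algebra_library_type = ceres::SUITE_SPARSE;
   // AMD is the default fill-reducing ordering; NESDIS is the alternative.
   options.linear_solver_ordering_type = ceres::AMD;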

.. _section-cgnr:

CGNR
----

For general sparse problems, if the problem is too large for sparse
Cholesky factorization or a sparse linear algebra library is not
linked into Ceres, another option is the ``CGNR`` solver. This solver
uses the `Conjugate Gradients
<https://en.wikipedia.org/wiki/Conjugate_gradient_method>`_ method on
the *normal equations*, but without forming the normal equations
explicitly. It exploits the relation

.. math::
   H x = J^\top J x = J^\top(J x)

Because ``CGNR`` never solves the linear system exactly, when the user
chooses ``CGNR`` as the linear solver, Ceres automatically switches
from the exact step algorithm to an inexact step algorithm. This also
means that ``CGNR`` can only be used with the ``LEVENBERG_MARQUARDT``
and not with the ``DOGLEG`` trust region strategy.

``CGNR`` by default runs on the CPU. However, if an NVIDIA GPU is
available and Ceres Solver has been built with CUDA support enabled,
then the user can also choose to run ``CGNR`` on the GPU by setting
:member:`Solver::Options::sparse_linear_algebra_library_type` to
``CUDA_SPARSE``. The key complexity of ``CGNR`` comes from evaluating
the two sparse-matrix vector products (SpMV) :math:`Jx` and
:math:`J^\top y`. GPUs are particularly well suited for doing sparse
matrix-vector products. As a result, for large problems using a GPU
can lead to a substantial speedup.

The convergence of Conjugate Gradients depends on the condition
number :math:`\kappa(H)`. Usually :math:`H` is quite poorly
conditioned and a `Preconditioner
<https://en.wikipedia.org/wiki/Preconditioner>`_ must be used to get
reasonable performance. See the section on :ref:`section-preconditioner`
for more details.
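
A configuration sketch for ``CGNR`` (the ``CUDA_SPARSE`` line only
applies to CUDA enabled builds):

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::CGNR;
   options.preconditioner_type = ceres::JACOBI;
   // Optionally run the Conjugate Gradients iterations on the GPU.
   // options.sparse_linear_algebra_library_type = ceres::CUDA_SPARSE;
   // Cap the number of CG iterations per Levenberg-Marquardt step.
   options.max_linear_solver_iterations = 100;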

.. _section-schur:

DENSE_SCHUR & SPARSE_SCHUR
--------------------------

While it is possible to use ``SPARSE_NORMAL_CHOLESKY`` to solve bundle
adjustment problems, they have a special sparsity structure that can
be exploited to solve the normal equations more efficiently.

Suppose that the bundle adjustment problem consists of :math:`p`
cameras and :math:`q` points and the variable vector :math:`x` has the
block structure :math:`x = [y_{1}, ... ,y_{p},z_{1},
... ,z_{q}]`, where :math:`y` and :math:`z` correspond to camera and
point parameters respectively. Further, let the camera blocks be of
size :math:`c` and the point blocks be of size :math:`s` (for most
problems :math:`c = 6`--:math:`9` and :math:`s = 3`). Ceres does not
impose any constancy requirement on these block sizes, but choosing
them to be constant simplifies the exposition.

The key property of bundle adjustment problems which we will exploit
is the fact that no term :math:`f_{i}` in :eq:`nonlinsq` includes two
or more point blocks at the same time. This in turn implies that the
matrix :math:`H` is of the form

.. math:: H = \left[ \begin{matrix} B & E\\ E^\top & C \end{matrix} \right]\ ,
   :label: hblock

where :math:`B \in \mathbb{R}^{pc\times pc}` is a block sparse matrix
with :math:`p` blocks of size :math:`c\times c` and :math:`C \in
\mathbb{R}^{qs\times qs}` is a block diagonal matrix with :math:`q` blocks
of size :math:`s\times s`. :math:`E \in \mathbb{R}^{pc\times qs}` is a
general block sparse matrix, with a block of size :math:`c\times s`
for each observation. Let us now block partition :math:`\Delta x =
[\Delta y,\Delta z]` and :math:`g=[v,w]` to restate :eq:`normal`
as the block structured linear system

.. math:: \left[ \begin{matrix} B & E\\ E^\top & C \end{matrix}
          \right]\left[ \begin{matrix} \Delta y \\ \Delta z
          \end{matrix} \right] = \left[ \begin{matrix} v\\ w
          \end{matrix} \right]\ ,
   :label: linear2

and apply Gaussian elimination to it. As we noted above, :math:`C` is
a block diagonal matrix, with small diagonal blocks of size
:math:`s\times s`. Thus, calculating the inverse of :math:`C` by
inverting each of these blocks is cheap. This allows us to eliminate
:math:`\Delta z` by observing that :math:`\Delta z = C^{-1}(w - E^\top
\Delta y)`, giving us

.. math:: \left[B - EC^{-1}E^\top\right] \Delta y = v - EC^{-1}w\ .
   :label: schur

The matrix

.. math:: S = B - EC^{-1}E^\top

is the Schur complement of :math:`C` in :math:`H`. It is also known as
the *reduced camera matrix*, because the only variables
participating in :eq:`schur` are the ones corresponding to the
cameras. :math:`S \in \mathbb{R}^{pc\times pc}` is a block structured
symmetric positive definite matrix, with blocks of size :math:`c\times
c`. The block :math:`S_{ij}` corresponding to the pair of images
:math:`i` and :math:`j` is non-zero if and only if the two images
observe at least one common point.


Now :eq:`linear2` can be solved by first forming :math:`S`, solving
for :math:`\Delta y`, and then back-substituting :math:`\Delta y` to
obtain the value of :math:`\Delta z`. Thus, the solution of what was
an :math:`n\times n`, :math:`n=pc+qs` linear system is reduced to the
inversion of the block diagonal matrix :math:`C`, a few matrix-matrix
and matrix-vector multiplies, and the solution of block sparse
:math:`pc\times pc` linear system :eq:`schur`. For almost all
problems, the number of cameras is much smaller than the number of
points, :math:`p \ll q`, thus solving :eq:`schur` is significantly
cheaper than solving :eq:`linear2`. This is the *Schur complement
trick* [Brown]_.

This still leaves open the question of solving :eq:`schur`. As we
discussed when considering the exact solution of the normal equations
using Cholesky factorization, we have two options.

1. ``DENSE_SCHUR`` - The first is **dense Cholesky factorization**,
where we store and factor :math:`S` as a dense matrix. This method has
:math:`O(p^2)` space complexity and :math:`O(p^3)` time complexity and
is only practical for problems with up to a few hundred cameras.

2. ``SPARSE_SCHUR`` - For large bundle adjustment problems :math:`S`
is typically a fairly sparse matrix, as most images only see a small
fraction of the scene. This leads us to the second option: **sparse
Cholesky factorization** [Davis]_. Here we store :math:`S` as a
sparse matrix, use row and column re-ordering algorithms to maximize
the sparsity of the Cholesky decomposition, and focus their compute
effort on the non-zero part of the factorization [Davis]_ [Chen]_.
Sparse direct methods, depending on the exact sparsity structure of
the Schur complement, allow bundle adjustment algorithms to scale to
scenes with thousands of cameras.
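
In terms of configuration, the two variants are selected as in this
sketch (the dense and sparse backends are chosen as described in the
previous sections):

.. code-block:: c++

   ceres::Solver::Options options;
   // A few hundred cameras or fewer: dense factorization of S.
   options.linear_solver_type = ceres::DENSE_SCHUR;
   // Larger problems: sparse factorization of S.
   // options.linear_solver_type = ceres::SPARSE_SCHUR;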


.. _section-iterative_schur:

ITERATIVE_SCHUR
---------------

Another option for bundle adjustment problems is to apply Conjugate
Gradients to the reduced camera matrix :math:`S` instead of
:math:`H`. One reason to do this is that :math:`S` is a much smaller
matrix than :math:`H`, but more importantly, it can be shown that
:math:`\kappa(S)\leq \kappa(H)` [Agarwal]_.

Ceres implements Conjugate Gradients on :math:`S` as the
``ITERATIVE_SCHUR`` solver. When the user chooses ``ITERATIVE_SCHUR``
as the linear solver, Ceres automatically switches from the exact step
algorithm to an inexact step algorithm.


The key computational operation when using Conjugate Gradients is the
evaluation of the matrix vector product :math:`Sx` for an arbitrary
vector :math:`x`. Because Conjugate Gradients only needs access to
:math:`S` via its product with a vector, one way to evaluate
:math:`Sx` is to observe that

.. math:: x_1 &= E^\top x\\
          x_2 &= C^{-1} x_1\\
          x_3 &= Ex_2\\
          x_4 &= Bx\\
          Sx &= x_4 - x_3
   :label: schurtrick1

Thus, we can run Conjugate Gradients on :math:`S` with the same
computational effort per iteration as Conjugate Gradients on
:math:`H`, while reaping the benefits of a more powerful
preconditioner. In fact, we do not even need to compute :math:`H`,
:eq:`schurtrick1` can be implemented using just the columns of
:math:`J`.

Equation :eq:`schurtrick1` is closely related to *Domain Decomposition
methods* for solving large linear systems that arise in structural
engineering and partial differential equations. In the language of
Domain Decomposition, each point in a bundle adjustment problem is a
domain, and the cameras form the interface between these domains. The
iterative solution of the Schur complement then falls within the
sub-category of techniques known as Iterative Sub-structuring [Saad]_
[Mathew]_.

While in most cases the above method for evaluating :math:`Sx` is the
way to go, for some problems it is better to compute the Schur
complement :math:`S` explicitly and then run Conjugate Gradients on
it. This can be done by setting
``Solver::Options::use_explicit_schur_complement`` to ``true``. This
option can only be used with the ``SCHUR_JACOBI`` preconditioner.
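
A configuration sketch for ``ITERATIVE_SCHUR`` (preconditioners are
discussed in :ref:`section-preconditioner`):

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::ITERATIVE_SCHUR;
   options.preconditioner_type = ceres::SCHUR_JACOBI;
   // Optionally form S explicitly; only valid with SCHUR_JACOBI.
   options.use_explicit_schur_complement = true;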


.. _section-schur_power_series_expansion:

SCHUR_POWER_SERIES_EXPANSION
----------------------------

It can be shown that the inverse of the Schur complement can be
written as an infinite power-series [Weber]_ [Zheng]_:

.. math:: S &= B - EC^{-1}E^\top\\
          &= B(I - B^{-1}EC^{-1}E^\top)\\
          S^{-1} &= (I - B^{-1}EC^{-1}E^\top)^{-1} B^{-1}\\
          & = \sum_{i=0}^\infty \left(B^{-1}EC^{-1}E^\top\right)^{i} B^{-1}

As a result a truncated version of this power series expansion can be
used to approximate the inverse and therefore the solution to
:eq:`schur`. Ceres allows the user to use Schur power series expansion
in three ways.

1. As a linear solver. This is what [Weber]_ calls **Power Bundle
   Adjustment** and corresponds to using the truncated power series to
   approximate the inverse of the Schur complement. This is done by
   setting the following options.

   .. code-block:: c++

      Solver::Options::linear_solver_type = ITERATIVE_SCHUR;
      Solver::Options::preconditioner_type = IDENTITY;
      Solver::Options::use_spse_initialization = true;
      Solver::Options::max_linear_solver_iterations = 0;

      // The following two settings are worth tuning for your application.
      Solver::Options::max_num_spse_iterations = 5;
      Solver::Options::spse_tolerance = 0.1;


2. As a preconditioner for ``ITERATIVE_SCHUR``. Any method for
   approximating the inverse of a matrix can also be used as a
   preconditioner. This is enabled by setting the following options.

   .. code-block:: c++

      Solver::Options::linear_solver_type = ITERATIVE_SCHUR;
      Solver::Options::preconditioner_type = SCHUR_POWER_SERIES_EXPANSION;
      Solver::Options::use_spse_initialization = false;

      // This is worth tuning for your application.
      Solver::Options::max_num_spse_iterations = 5;


3. As initialization for ``ITERATIVE_SCHUR`` with any
   preconditioner. This is a combination of the above two, where a few
   iterations of the Schur Power Series Expansion are used to compute
   the initial guess for the iterative solver, which then runs with the
   preconditioner of your choice.

   .. code-block:: c++

      Solver::Options::linear_solver_type = ITERATIVE_SCHUR;
      Solver::Options::preconditioner_type = ... // Preconditioner of your choice.
      Solver::Options::use_spse_initialization = true;
      Solver::Options::max_linear_solver_iterations = 0;

      // The following two settings are worth tuning for your application.
      Solver::Options::max_num_spse_iterations = 5;
      // This only affects the initialization but not the preconditioner.
      Solver::Options::spse_tolerance = 0.1;


.. _section-mixed-precision:

Mixed Precision Solves
======================

Generally speaking Ceres Solver does all its arithmetic in double
precision. Sometimes though, one can use single precision arithmetic
to get substantial speedups. Currently, for linear solvers that
perform Cholesky factorization (sparse or dense) the user has the
option to cast the linear system to single precision and then use
single precision Cholesky factorization routines to solve the
resulting linear system. This can be enabled by setting
:member:`Solver::Options::use_mixed_precision_solves` to ``true``.

Depending on the conditioning of the problem, the use of single
precision factorization may lead to some loss of accuracy. Some of
this accuracy can be recovered by performing `Iterative Refinement
<https://en.wikipedia.org/wiki/Iterative_refinement>`_. The number of
iterations of iterative refinement is controlled by
:member:`Solver::Options::max_num_refinement_iterations`. The default
value of this parameter is zero, which means if
:member:`Solver::Options::use_mixed_precision_solves` is ``true``,
then no iterative refinement is performed. Usually 2-3 refinement
iterations are enough.

Mixed precision solves are available in the following linear solver
configurations:

1. ``DENSE_NORMAL_CHOLESKY`` + ``EIGEN`` / ``LAPACK`` / ``CUDA``.
2. ``DENSE_SCHUR`` + ``EIGEN`` / ``LAPACK`` / ``CUDA``.
3. ``SPARSE_NORMAL_CHOLESKY`` + ``EIGEN_SPARSE`` / ``ACCELERATE_SPARSE``.
4. ``SPARSE_SCHUR`` + ``EIGEN_SPARSE`` / ``ACCELERATE_SPARSE``.

Mixed precision solves are not available when using ``SUITE_SPARSE``
as the sparse linear algebra backend because SuiteSparse/CHOLMOD does
not support single precision solves.
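
A configuration sketch (here with a sparse Cholesky solver and Eigen as
the sparse backend):

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
   options.sparse_linear_algebra_library_type = ceres::EIGEN_SPARSE;
   options.use_mixed_precision_solves = true;
   // A couple of refinement iterations usually recovers most of the
   // accuracy lost to the single precision factorization.
   options.max_num_refinement_iterations = 3;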

.. _section-preconditioner:

Preconditioners
===============

The convergence rate of Conjugate Gradients for solving :eq:`normal`
depends on the distribution of eigenvalues of :math:`H` [Saad]_. A
useful upper bound is :math:`\sqrt{\kappa(H)}`, where
:math:`\kappa(H)` is the condition number of the matrix :math:`H`. For
most non-linear least squares problems, :math:`\kappa(H)` is high and
a direct application of Conjugate Gradients to :eq:`normal` results in
extremely poor performance.

The solution to this problem is to replace :eq:`normal` with a
*preconditioned* system. Given a linear system, :math:`Ax =b` and a
preconditioner :math:`M` the preconditioned system is given by
:math:`M^{-1}Ax = M^{-1}b`. The resulting algorithm is known as the
Preconditioned Conjugate Gradients algorithm (PCG) and its worst case
complexity now depends on the condition number of the *preconditioned*
matrix :math:`\kappa(M^{-1}A)`.

The computational cost of using a preconditioner :math:`M` is the cost
of computing :math:`M` and evaluating the product :math:`M^{-1}y` for
arbitrary vectors :math:`y`. Thus, there are two competing factors to
consider: How much of :math:`H`'s structure is captured by :math:`M`
so that the condition number :math:`\kappa(HM^{-1})` is low, and the
computational cost of constructing and using :math:`M`. The ideal
preconditioner would be one for which :math:`\kappa(M^{-1}A)
=1`. :math:`M=A` achieves this, but it is not a practical choice, as
applying this preconditioner would require solving a linear system
equivalent to the unpreconditioned problem. It is usually the case
that the more information :math:`M` has about :math:`H`, the more
expensive it is to use. For example, Incomplete Cholesky factorization
based preconditioners have much better convergence behavior than the
Jacobi preconditioner, but are also much more expensive.

For a survey of the state of the art in preconditioning linear least
squares problems with general sparsity structure see [GouldScott]_.

Ceres Solver comes with a number of preconditioners suited for
problems with general sparsity as well as the special sparsity
structure encountered in bundle adjustment problems.

IDENTITY
--------

This is equivalent to using an identity matrix as a preconditioner,
i.e. no preconditioner at all.


JACOBI
------

The simplest of all preconditioners is the diagonal or Jacobi
preconditioner, i.e., :math:`M=\operatorname{diag}(A)`, which for
block structured matrices like :math:`H` can be generalized to the
block Jacobi preconditioner. The ``JACOBI`` preconditioner in Ceres
when used with :ref:`section-cgnr` refers to the block diagonal of
:math:`H` and when used with :ref:`section-iterative_schur` refers to
the block diagonal of :math:`B` [Mandel]_.

For detailed data about the performance of ``JACOBI`` on bundle
adjustment problems see [Agarwal]_.


SCHUR_JACOBI
------------

Another obvious choice for :ref:`section-iterative_schur` is the block
diagonal of the Schur complement matrix :math:`S`, i.e., the block
Jacobi preconditioner for :math:`S`. In Ceres we refer to it as the
``SCHUR_JACOBI`` preconditioner.

For detailed data about the performance of ``SCHUR_JACOBI`` on bundle
adjustment problems see [Agarwal]_.


CLUSTER_JACOBI and CLUSTER_TRIDIAGONAL
--------------------------------------

For bundle adjustment problems arising in reconstruction from
community photo collections, more effective preconditioners can be
constructed by analyzing and exploiting the camera-point visibility
structure of the scene.

The key idea is to cluster the cameras based on the visibility
structure of the scene. The similarity between a pair of cameras
:math:`i` and :math:`j` is given by:

.. math:: S_{ij} = \frac{|V_i \cap V_j|}{|V_i| |V_j|}

Here :math:`V_i` is the set of scene points visible in camera
:math:`i`. This idea was first exploited by [KushalAgarwal]_ to create
the ``CLUSTER_JACOBI`` and the ``CLUSTER_TRIDIAGONAL`` preconditioners
which Ceres implements.

The performance of these two preconditioners depends on the speed and
clustering quality of the clustering algorithm used when building the
preconditioner. In the original paper, [KushalAgarwal]_ used the
Canonical Views algorithm [Simon]_, which while producing high quality
clusterings can be quite expensive for large graphs. So, Ceres
supports two visibility clustering algorithms - ``CANONICAL_VIEWS``
and ``SINGLE_LINKAGE``. The former is, as the name implies, the
Canonical Views algorithm of [Simon]_. The latter is the classic
`Single Linkage Clustering
<https://en.wikipedia.org/wiki/Single-linkage_clustering>`_
algorithm. The choice of clustering algorithm is controlled by
:member:`Solver::Options::visibility_clustering_type`.
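
A configuration sketch using one of the visibility based
preconditioners:

.. code-block:: c++

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::ITERATIVE_SCHUR;
   options.preconditioner_type = ceres::CLUSTER_TRIDIAGONAL;
   // Choose between CANONICAL_VIEWS and the cheaper SINGLE_LINKAGE clustering.
   options.visibility_clustering_type = ceres::SINGLE_LINKAGE;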

SCHUR_POWER_SERIES_EXPANSION
----------------------------

As explained in :ref:`section-schur_power_series_expansion`, the Schur
complement matrix admits a power series expansion and a truncated
version of this power series can be used as a preconditioner for
``ITERATIVE_SCHUR``. When used as a preconditioner
:member:`Solver::Options::max_num_spse_iterations` controls the number
of terms in the power series that are used.


SUBSET
------

This is a preconditioner for problems with general sparsity. Given a
subset of residual blocks of a problem, it uses the corresponding
subset of the rows of the Jacobian to construct a preconditioner
[Dellaert]_.

Suppose the Jacobian :math:`J` has been horizontally partitioned as

.. math:: J = \begin{bmatrix} P \\ Q \end{bmatrix}

where :math:`Q` is the set of rows corresponding to the residual
blocks in
:member:`Solver::Options::residual_blocks_for_subset_preconditioner`. The
preconditioner is the matrix :math:`(Q^\top Q)^{-1}`.

The efficacy of the preconditioner depends on how well the matrix
:math:`Q^\top Q` approximates :math:`J^\top J`, or how well the chosen
residual blocks approximate the full problem.

This preconditioner is NOT available when running ``CGNR`` using
``CUDA``.
Austin Schuh70cc9552019-01-21 19:46:48 -08001067
1068.. _section-ordering:
1069
1070Ordering
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08001071========
Austin Schuh70cc9552019-01-21 19:46:48 -08001072
1073The order in which variables are eliminated in a linear solver can
1074have a significant of impact on the efficiency and accuracy of the
1075method. For example when doing sparse Cholesky factorization, there
1076are matrices for which a good ordering will give a Cholesky factor
Austin Schuh3de38b02024-06-25 18:25:10 -07001077with :math:`O(n)` storage, whereas a bad ordering will result in an
Austin Schuh70cc9552019-01-21 19:46:48 -08001078completely dense factor.
1079
1080Ceres allows the user to provide varying amounts of hints to the
1081solver about the variable elimination ordering to use. This can range
Austin Schuh3de38b02024-06-25 18:25:10 -07001082from no hints, where the solver is free to decide the best possible
1083ordering based on the user's choices like the linear solver being
1084used, to an exact order in which the variables should be eliminated,
1085and a variety of possibilities in between.
Austin Schuh70cc9552019-01-21 19:46:48 -08001086
Austin Schuh3de38b02024-06-25 18:25:10 -07001087The simplest thing to do is to just set
1088:member:`Solver::Options::linear_solver_ordering_type` to ``AMD``
1089(default) or ``NESDIS`` based on your understanding of the problem or
1090empirical testing.
Austin Schuh70cc9552019-01-21 19:46:48 -08001091
Austin Schuh70cc9552019-01-21 19:46:48 -08001092
Austin Schuh3de38b02024-06-25 18:25:10 -07001093More information can be commmuniucated by using an instance
1094:class:`ParameterBlockOrdering` class.
1095
Formally an ordering is an ordered partitioning of the
parameter blocks, i.e., each parameter block belongs to exactly
one group, and each group has a unique non-negative integer
associated with it that determines its order in the set of
groups.
1101
For example, consider the linear system
Austin Schuh70cc9552019-01-21 19:46:48 -08001103
1104.. math::
   x + y &= 3 \\
   2x + 3y &= 7

1108There are two ways in which it can be solved. First eliminating
1109:math:`x` from the two equations, solving for :math:`y` and then back
1110substituting for :math:`x`, or first eliminating :math:`y`, solving
1111for :math:`x` and back substituting for :math:`y`. The user can
1112construct three orderings here.
1113
1. :math:`\{0: x\}, \{1: y\}` - eliminate :math:`x` first.
2. :math:`\{0: y\}, \{1: x\}` - eliminate :math:`y` first.
3. :math:`\{0: x, y\}` - Solver gets to decide the elimination order.
Austin Schuh70cc9552019-01-21 19:46:48 -08001117
Thus, to have Ceres determine the ordering automatically, put all the
variables in group 0; to control the ordering for every variable,
create groups :math:`0 \dots N-1`, one per variable, in the desired
order.
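
For the small system above, a minimal sketch of specifying the second
ordering (eliminate :math:`y` first) might look as follows, where
``x`` and ``y`` are the user's parameter blocks:

.. code-block:: c++

   double x = 1.0, y = 2.0;  // The two parameter blocks.

   auto ordering = std::make_shared<ceres::ParameterBlockOrdering>();
   ordering->AddElementToGroup(&y, 0);  // Eliminated first.
   ordering->AddElementToGroup(&x, 1);  // Eliminated second.

   ceres::Solver::Options options;
   options.linear_solver_ordering = ordering;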
1122
1123``linear_solver_ordering == nullptr`` and an ordering where all the
1124parameter blocks are in one elimination group mean the same thing -
1125the solver is free to choose what it thinks is the best elimination
1126ordering using the ordering algorithm (specified using
1127:member:`Solver::Options::linear_solver_ordering_type`). Therefore in
1128the following we will only consider the case where
1129``linear_solver_ordering != nullptr``.
1130
1131The exact interpretation of the ``linear_solver_ordering`` depends on
1132the values of :member:`Solver::Options::linear_solver_ordering_type`,
1133:member:`Solver::Options::linear_solver_type`,
1134:member:`Solver::Options::preconditioner_type` and
1135:member:`Solver::Options::sparse_linear_algebra_library_type` as we will
1136explain below.
1137
1138Bundle Adjustment
1139-----------------
Austin Schuh70cc9552019-01-21 19:46:48 -08001140
1141If the user is using one of the Schur solvers (``DENSE_SCHUR``,
1142``SPARSE_SCHUR``, ``ITERATIVE_SCHUR``) and chooses to specify an
1143ordering, it must have one important property. The lowest numbered
1144elimination group must form an independent set in the graph
1145corresponding to the Hessian, or in other words, no two parameter
blocks in the first elimination group should co-occur in the same
residual block. For the best performance, this elimination group
should be as large as possible. For standard bundle adjustment
problems, this corresponds to the first elimination group containing
all the 3D points, and the second containing the parameter blocks for
all the cameras.
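
A minimal sketch of such an ordering, where ``points`` and ``cameras``
are assumed to be containers of the user's parameter block pointers:

.. code-block:: c++

   auto ordering = std::make_shared<ceres::ParameterBlockOrdering>();
   // Eliminate all the points first ...
   for (double* point : points) {
     ordering->AddElementToGroup(point, 0);
   }
   // ... and then the cameras.
   for (double* camera : cameras) {
     ordering->AddElementToGroup(camera, 1);
   }

   ceres::Solver::Options options;
   options.linear_solver_type = ceres::SPARSE_SCHUR;
   options.linear_solver_ordering = ordering;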
Austin Schuh70cc9552019-01-21 19:46:48 -08001152
1153If the user leaves the choice to Ceres, then the solver uses an
1154approximate maximum independent set algorithm to identify the first
1155elimination group [LiSaad]_.
1156
``sparse_linear_algebra_library_type = SUITE_SPARSE``
------------------------------------------------------
1159
1160**linear_solver_ordering_type = AMD**
1161
1162A constrained Approximate Minimum Degree (CAMD) ordering is used where
1163the parameter blocks in the lowest numbered group are eliminated
1164first, and then the parameter blocks in the next lowest numbered group
1165and so on. Within each group, CAMD is free to order the parameter blocks
1166as it chooses.
1167
1168**linear_solver_ordering_type = NESDIS**
1169
1170a. ``linear_solver_type = SPARSE_NORMAL_CHOLESKY`` or
1171 ``linear_solver_type = CGNR`` and ``preconditioner_type = SUBSET``
1172
1173 The value of ``linear_solver_ordering`` is ignored and a Nested
1174 Dissection algorithm is used to compute a fill reducing ordering.
1175
1176b. ``linear_solver_type = SPARSE_SCHUR/DENSE_SCHUR/ITERATIVE_SCHUR``
1177
   Only the lowest numbered elimination group is used to compute the
   Schur complement, and Nested Dissection is used to compute a fill
   reducing ordering for the Schur complement (or its preconditioner).
1181
1182``sparse_linear_algebra_library_type = EIGEN_SPARSE/ACCELERATE_SPARSE``
1183-----------------------------------------------------------------------
1184
1185a. ``linear_solver_type = SPARSE_NORMAL_CHOLESKY`` or
1186 ``linear_solver_type = CGNR`` and ``preconditioner_type = SUBSET``
1187
1188 The value of ``linear_solver_ordering`` is ignored and ``AMD`` or
1189 ``NESDIS`` is used to compute a fill reducing ordering as requested
1190 by the user.
1191
1192b. ``linear_solver_type = SPARSE_SCHUR/DENSE_SCHUR/ITERATIVE_SCHUR``
1193
   Only the lowest numbered elimination group is used to compute the
   Schur complement, and ``AMD`` or ``NESDIS`` is used to compute a
   fill reducing ordering for the Schur complement (or its
   preconditioner) as requested by the user.
1198
1199
Austin Schuh70cc9552019-01-21 19:46:48 -08001200.. _section-solver-options:
1201
1202:class:`Solver::Options`
1203========================
1204
1205.. class:: Solver::Options
1206
1207 :class:`Solver::Options` controls the overall behavior of the
1208 solver. We list the various settings and their default values below.
1209
.. function:: bool Solver::Options::IsValid(std::string* error) const
Austin Schuh70cc9552019-01-21 19:46:48 -08001211
1212 Validate the values in the options struct and returns true on
1213 success. If there is a problem, the method returns false with
1214 ``error`` containing a textual description of the cause.
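
   For example, a minimal sketch of validating the options before
   calling ``Solve`` (assuming ``<iostream>`` and ``<string>`` are
   included):

   .. code-block:: c++

      ceres::Solver::Options options;
      // ... configure options ...
      std::string error;
      if (!options.IsValid(&error)) {
        std::cerr << "Invalid solver options: " << error << "\n";
      }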
1215
1216.. member:: MinimizerType Solver::Options::minimizer_type
1217
1218 Default: ``TRUST_REGION``
1219
1220 Choose between ``LINE_SEARCH`` and ``TRUST_REGION`` algorithms. See
1221 :ref:`section-trust-region-methods` and
1222 :ref:`section-line-search-methods` for more details.
1223
1224.. member:: LineSearchDirectionType Solver::Options::line_search_direction_type
1225
1226 Default: ``LBFGS``
1227
1228 Choices are ``STEEPEST_DESCENT``, ``NONLINEAR_CONJUGATE_GRADIENT``,
1229 ``BFGS`` and ``LBFGS``.
1230
Austin Schuh3de38b02024-06-25 18:25:10 -07001231 See :ref:`section-line-search-methods` for more details.
1232
Austin Schuh70cc9552019-01-21 19:46:48 -08001233.. member:: LineSearchType Solver::Options::line_search_type
1234
1235 Default: ``WOLFE``
1236
1237 Choices are ``ARMIJO`` and ``WOLFE`` (strong Wolfe conditions).
1238 Note that in order for the assumptions underlying the ``BFGS`` and
Austin Schuh3de38b02024-06-25 18:25:10 -07001239 ``LBFGS`` line search direction algorithms to be satisfied, the
1240 ``WOLFE`` line search must be used.
1241
1242 See :ref:`section-line-search-methods` for more details.
Austin Schuh70cc9552019-01-21 19:46:48 -08001243
1244.. member:: NonlinearConjugateGradientType Solver::Options::nonlinear_conjugate_gradient_type
1245
1246 Default: ``FLETCHER_REEVES``
1247
1248 Choices are ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and
1249 ``HESTENES_STIEFEL``.
1250
1251.. member:: int Solver::Options::max_lbfgs_rank
1252
   Default: ``20``

   The LBFGS Hessian approximation is a low rank approximation to
   the inverse of the Hessian matrix. The rank of the approximation
   determines (linearly) the space and time complexity of using the
   approximation. The higher the rank, the better the quality of the
   approximation. The increase in quality is, however, bounded for a
   number of reasons.

   1. The method only uses secant information and not actual
      derivatives.
   2. The Hessian approximation is constrained to be positive
      definite.

   So increasing this rank to a large number will increase the time
   and space complexity without a corresponding increase in solution
   quality. There are no hard and fast rules for choosing the maximum
   rank. The best choice usually requires some problem specific
   experimentation.

   For more theoretical and implementation details of the LBFGS
   method, please see [Nocedal]_.
Austin Schuh70cc9552019-01-21 19:46:48 -08001275
1276.. member:: bool Solver::Options::use_approximate_eigenvalue_bfgs_scaling
1277
1278 Default: ``false``
1279
1280 As part of the ``BFGS`` update step / ``LBFGS`` right-multiply
1281 step, the initial inverse Hessian approximation is taken to be the
1282 Identity. However, [Oren]_ showed that using instead :math:`I *
1283 \gamma`, where :math:`\gamma` is a scalar chosen to approximate an
1284 eigenvalue of the true inverse Hessian can result in improved
1285 convergence in a wide variety of cases. Setting
1286 ``use_approximate_eigenvalue_bfgs_scaling`` to true enables this
1287 scaling in ``BFGS`` (before first iteration) and ``LBFGS`` (at each
1288 iteration).
1289
1290 Precisely, approximate eigenvalue scaling equates to
1291
1292 .. math:: \gamma = \frac{y_k' s_k}{y_k' y_k}
1293
1294 With:
1295
1296 .. math:: y_k = \nabla f_{k+1} - \nabla f_k
1297 .. math:: s_k = x_{k+1} - x_k
1298
1299 Where :math:`f()` is the line search objective and :math:`x` the
1300 vector of parameter values [NocedalWright]_.
1301
1302 It is important to note that approximate eigenvalue scaling does
1303 **not** *always* improve convergence, and that it can in fact
1304 *significantly* degrade performance for certain classes of problem,
1305 which is why it is disabled by default. In particular it can
1306 degrade performance when the sensitivity of the problem to different
1307 parameters varies significantly, as in this case a single scalar
1308 factor fails to capture this variation and detrimentally downscales
1309 parts of the Jacobian approximation which correspond to
1310 low-sensitivity parameters. It can also reduce the robustness of the
1311 solution to errors in the Jacobians.
1312
.. member:: LineSearchInterpolationType Solver::Options::line_search_interpolation_type
1314
1315 Default: ``CUBIC``
1316
1317 Degree of the polynomial used to approximate the objective
1318 function. Valid values are ``BISECTION``, ``QUADRATIC`` and
1319 ``CUBIC``.
1320
1321.. member:: double Solver::Options::min_line_search_step_size
1322
Austin Schuh3de38b02024-06-25 18:25:10 -07001323 Default: ``1e-9``
1324
Austin Schuh70cc9552019-01-21 19:46:48 -08001325 The line search terminates if:
1326
1327 .. math:: \|\Delta x_k\|_\infty < \text{min_line_search_step_size}
1328
1329 where :math:`\|\cdot\|_\infty` refers to the max norm, and
1330 :math:`\Delta x_k` is the step change in the parameter values at
1331 the :math:`k`-th iteration.
1332
1333.. member:: double Solver::Options::line_search_sufficient_function_decrease
1334
1335 Default: ``1e-4``
1336
1337 Solving the line search problem exactly is computationally
1338 prohibitive. Fortunately, line search based optimization algorithms
1339 can still guarantee convergence if instead of an exact solution,
1340 the line search algorithm returns a solution which decreases the
1341 value of the objective function sufficiently. More precisely, we
1342 are looking for a step size s.t.
1343
1344 .. math:: f(\text{step_size}) \le f(0) + \text{sufficient_decrease} * [f'(0) * \text{step_size}]
1345
1346 This condition is known as the Armijo condition.
1347
1348.. member:: double Solver::Options::max_line_search_step_contraction
1349
1350 Default: ``1e-3``
1351
1352 In each iteration of the line search,
1353
1354 .. math:: \text{new_step_size} >= \text{max_line_search_step_contraction} * \text{step_size}
1355
1356 Note that by definition, for contraction:
1357
1358 .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1
1359
1360.. member:: double Solver::Options::min_line_search_step_contraction
1361
1362 Default: ``0.6``
1363
1364 In each iteration of the line search,
1365
1366 .. math:: \text{new_step_size} <= \text{min_line_search_step_contraction} * \text{step_size}
1367
1368 Note that by definition, for contraction:
1369
1370 .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1
1371
1372.. member:: int Solver::Options::max_num_line_search_step_size_iterations
1373
1374 Default: ``20``
1375
1376 Maximum number of trial step size iterations during each line
1377 search, if a step size satisfying the search conditions cannot be
1378 found within this number of trials, the line search will stop.
1379
   The minimum allowed value is 0 for the trust region minimizer and
   1 otherwise. If 0 is specified for the trust region minimizer,
   then line search will not be used when solving constrained
   optimization problems.
1384
   As this is an 'artificial' constraint (one imposed by the user,
   not the underlying math), if ``WOLFE`` line search is being used,
   *and* points satisfying the Armijo sufficient (function) decrease
   condition have been found during the current search (in :math:`<=`
   ``max_num_line_search_step_size_iterations``), then the step size
   with the lowest function value which satisfies the Armijo
   condition will be returned as the new valid step, even though it
   does *not* satisfy the strong Wolfe conditions. This behaviour
   protects against early termination of the optimizer at a
   sub-optimal point.
1394
1395.. member:: int Solver::Options::max_num_line_search_direction_restarts
1396
1397 Default: ``5``
1398
1399 Maximum number of restarts of the line search direction algorithm
1400 before terminating the optimization. Restarts of the line search
1401 direction algorithm occur when the current algorithm fails to
1402 produce a new descent direction. This typically indicates a
1403 numerical failure, or a breakdown in the validity of the
1404 approximations used.
1405
1406.. member:: double Solver::Options::line_search_sufficient_curvature_decrease
1407
1408 Default: ``0.9``
1409
1410 The strong Wolfe conditions consist of the Armijo sufficient
1411 decrease condition, and an additional requirement that the
1412 step size be chosen s.t. the *magnitude* ('strong' Wolfe
1413 conditions) of the gradient along the search direction
1414 decreases sufficiently. Precisely, this second condition
1415 is that we seek a step size s.t.
1416
1417 .. math:: \|f'(\text{step_size})\| <= \text{sufficient_curvature_decrease} * \|f'(0)\|
1418
1419 Where :math:`f()` is the line search objective and :math:`f'()` is the derivative
1420 of :math:`f` with respect to the step size: :math:`\frac{d f}{d~\text{step size}}`.
1421
1422.. member:: double Solver::Options::max_line_search_step_expansion
1423
1424 Default: ``10.0``
1425
1426 During the bracketing phase of a Wolfe line search, the step size
1427 is increased until either a point satisfying the Wolfe conditions
1428 is found, or an upper bound for a bracket containing a point
1429 satisfying the conditions is found. Precisely, at each iteration
1430 of the expansion:
1431
1432 .. math:: \text{new_step_size} <= \text{max_step_expansion} * \text{step_size}
1433
1434 By definition for expansion
1435
1436 .. math:: \text{max_step_expansion} > 1.0
1437
1438.. member:: TrustRegionStrategyType Solver::Options::trust_region_strategy_type
1439
1440 Default: ``LEVENBERG_MARQUARDT``
1441
1442 The trust region step computation algorithm used by
1443 Ceres. Currently ``LEVENBERG_MARQUARDT`` and ``DOGLEG`` are the two
1444 valid choices. See :ref:`section-levenberg-marquardt` and
1445 :ref:`section-dogleg` for more details.
1446
1447.. member:: DoglegType Solver::Options::dogleg_type
1448
1449 Default: ``TRADITIONAL_DOGLEG``
1450
1451 Ceres supports two different dogleg strategies.
1452 ``TRADITIONAL_DOGLEG`` method by Powell and the ``SUBSPACE_DOGLEG``
1453 method described by [ByrdSchnabel]_ . See :ref:`section-dogleg`
1454 for more details.
1455
1456.. member:: bool Solver::Options::use_nonmonotonic_steps
1457
1458 Default: ``false``
1459
1460 Relax the requirement that the trust-region algorithm take strictly
1461 decreasing steps. See :ref:`section-non-monotonic-steps` for more
1462 details.
1463
1464.. member:: int Solver::Options::max_consecutive_nonmonotonic_steps
1465
1466 Default: ``5``
1467
1468 The window size used by the step selection algorithm to accept
1469 non-monotonic steps.
1470
1471.. member:: int Solver::Options::max_num_iterations
1472
1473 Default: ``50``
1474
1475 Maximum number of iterations for which the solver should run.
1476
1477.. member:: double Solver::Options::max_solver_time_in_seconds
1478
Austin Schuh3de38b02024-06-25 18:25:10 -07001479 Default: ``1e9``
1480
Austin Schuh70cc9552019-01-21 19:46:48 -08001481 Maximum amount of time for which the solver should run.
1482
1483.. member:: int Solver::Options::num_threads
1484
1485 Default: ``1``
1486
1487 Number of threads used by Ceres to evaluate the Jacobian.
1488
1489.. member:: double Solver::Options::initial_trust_region_radius
1490
1491 Default: ``1e4``
1492
1493 The size of the initial trust region. When the
1494 ``LEVENBERG_MARQUARDT`` strategy is used, the reciprocal of this
1495 number is the initial regularization parameter.
1496
1497.. member:: double Solver::Options::max_trust_region_radius
1498
1499 Default: ``1e16``
1500
1501 The trust region radius is not allowed to grow beyond this value.
1502
1503.. member:: double Solver::Options::min_trust_region_radius
1504
1505 Default: ``1e-32``
1506
1507 The solver terminates, when the trust region becomes smaller than
1508 this value.
1509
1510.. member:: double Solver::Options::min_relative_decrease
1511
1512 Default: ``1e-3``
1513
1514 Lower threshold for relative decrease before a trust-region step is
1515 accepted.
1516
1517.. member:: double Solver::Options::min_lm_diagonal
1518
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08001519 Default: ``1e-6``
Austin Schuh70cc9552019-01-21 19:46:48 -08001520
   The ``LEVENBERG_MARQUARDT`` strategy uses a diagonal matrix to
   regularize the trust region step. This is the lower bound on
   the values of this diagonal matrix.
1524
1525.. member:: double Solver::Options::max_lm_diagonal
1526
1527 Default: ``1e32``
1528
   The ``LEVENBERG_MARQUARDT`` strategy uses a diagonal matrix to
   regularize the trust region step. This is the upper bound on
   the values of this diagonal matrix.
1532
1533.. member:: int Solver::Options::max_num_consecutive_invalid_steps
1534
1535 Default: ``5``
1536
1537 The step returned by a trust region strategy can sometimes be
1538 numerically invalid, usually because of conditioning
1539 issues. Instead of crashing or stopping the optimization, the
1540 optimizer can go ahead and try solving with a smaller trust
1541 region/better conditioned problem. This parameter sets the number
1542 of consecutive retries before the minimizer gives up.
1543
1544.. member:: double Solver::Options::function_tolerance
1545
1546 Default: ``1e-6``
1547
1548 Solver terminates if
1549
1550 .. math:: \frac{|\Delta \text{cost}|}{\text{cost}} <= \text{function_tolerance}
1551
1552 where, :math:`\Delta \text{cost}` is the change in objective
1553 function value (up or down) in the current iteration of
1554 Levenberg-Marquardt.
1555
1556.. member:: double Solver::Options::gradient_tolerance
1557
1558 Default: ``1e-10``
1559
1560 Solver terminates if
1561
1562 .. math:: \|x - \Pi \boxplus(x, -g(x))\|_\infty <= \text{gradient_tolerance}
1563
1564 where :math:`\|\cdot\|_\infty` refers to the max norm, :math:`\Pi`
1565 is projection onto the bounds constraints and :math:`\boxplus` is
   the plus operation for the overall manifold associated with the
   parameter vector.
Austin Schuh70cc9552019-01-21 19:46:48 -08001568
1569.. member:: double Solver::Options::parameter_tolerance
1570
1571 Default: ``1e-8``
1572
1573 Solver terminates if
1574
1575 .. math:: \|\Delta x\| <= (\|x\| + \text{parameter_tolerance}) * \text{parameter_tolerance}
1576
1577 where :math:`\Delta x` is the step computed by the linear solver in
1578 the current iteration.
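
   For example, the three termination tolerances can be tightened
   together; the values below are purely illustrative:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.function_tolerance  = 1e-8;
      options.gradient_tolerance  = 1e-12;
      options.parameter_tolerance = 1e-10;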
1579
1580.. member:: LinearSolverType Solver::Options::linear_solver_type
1581
1582 Default: ``SPARSE_NORMAL_CHOLESKY`` / ``DENSE_QR``
1583
1584 Type of linear solver used to compute the solution to the linear
1585 least squares problem in each iteration of the Levenberg-Marquardt
1586 algorithm. If Ceres is built with support for ``SuiteSparse`` or
   ``Accelerate`` or ``Eigen``'s sparse Cholesky factorization, the
   default is ``SPARSE_NORMAL_CHOLESKY``; otherwise it is
   ``DENSE_QR``.
1590
1591.. member:: PreconditionerType Solver::Options::preconditioner_type
1592
1593 Default: ``JACOBI``
1594
1595 The preconditioner used by the iterative linear solver. The default
1596 is the block Jacobi preconditioner. Valid values are (in increasing
1597 order of complexity) ``IDENTITY``, ``JACOBI``, ``SCHUR_JACOBI``,
Austin Schuh3de38b02024-06-25 18:25:10 -07001598 ``CLUSTER_JACOBI``, ``CLUSTER_TRIDIAGONAL``, ``SUBSET`` and
1599 ``SCHUR_POWER_SERIES_EXPANSION``. See :ref:`section-preconditioner`
1600 for more details.
Austin Schuh70cc9552019-01-21 19:46:48 -08001601
1602.. member:: VisibilityClusteringType Solver::Options::visibility_clustering_type
1603
1604 Default: ``CANONICAL_VIEWS``
1605
1606 Type of clustering algorithm to use when constructing a visibility
1607 based preconditioner. The original visibility based preconditioning
1608 paper and implementation only used the canonical views algorithm.
1609
   This algorithm gives high quality results but can be particularly
   expensive for large dense graphs, as its worst case complexity is
   cubic in the size of the graph.
1613
1614 Another option is to use ``SINGLE_LINKAGE`` which is a simple
1615 thresholded single linkage clustering algorithm that only pays
1616 attention to tightly coupled blocks in the Schur complement. This
1617 is a fast algorithm that works well.
1618
1619 The optimal choice of the clustering algorithm depends on the
1620 sparsity structure of the problem, but generally speaking we
1621 recommend that you try ``CANONICAL_VIEWS`` first and if it is too
1622 expensive try ``SINGLE_LINKAGE``.
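
   As an illustration, a sketch of selecting a visibility based
   preconditioner together with the cheaper clustering algorithm:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.linear_solver_type = ceres::ITERATIVE_SCHUR;
      options.preconditioner_type = ceres::CLUSTER_TRIDIAGONAL;
      options.visibility_clustering_type = ceres::SINGLE_LINKAGE;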
1623
.. member:: std::unordered_set<ResidualBlockId> Solver::Options::residual_blocks_for_subset_preconditioner
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08001625
1626 ``SUBSET`` preconditioner is a preconditioner for problems with
1627 general sparsity. Given a subset of residual blocks of a problem,
1628 it uses the corresponding subset of the rows of the Jacobian to
1629 construct a preconditioner.
1630
1631 Suppose the Jacobian :math:`J` has been horizontally partitioned as
1632
1633 .. math:: J = \begin{bmatrix} P \\ Q \end{bmatrix}
1634
1635 Where, :math:`Q` is the set of rows corresponding to the residual
1636 blocks in
1637 :member:`Solver::Options::residual_blocks_for_subset_preconditioner`. The
1638 preconditioner is the matrix :math:`(Q^\top Q)^{-1}`.
1639
1640 The efficacy of the preconditioner depends on how well the matrix
1641 :math:`Q` approximates :math:`J^\top J`, or how well the chosen
1642 residual blocks approximate the full problem.
1643
1644 If ``Solver::Options::preconditioner_type == SUBSET``, then
1645 ``residual_blocks_for_subset_preconditioner`` must be non-empty.
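
   A sketch of configuring the ``SUBSET`` preconditioner with
   ``CGNR``; ``CreateCostFunction``, ``parameters`` and
   ``kNumStrongResiduals`` are hypothetical stand-ins for the user's
   own code:

   .. code-block:: c++

      ceres::Problem problem;
      std::unordered_set<ceres::ResidualBlockId> subset;
      for (int i = 0; i < kNumStrongResiduals; ++i) {
        // Remember the ids of the residual blocks the preconditioner
        // should be built from.
        subset.insert(problem.AddResidualBlock(
            CreateCostFunction(i), nullptr, parameters[i]));
      }

      ceres::Solver::Options options;
      options.linear_solver_type = ceres::CGNR;
      options.preconditioner_type = ceres::SUBSET;
      options.residual_blocks_for_subset_preconditioner = subset;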
1646
Austin Schuh70cc9552019-01-21 19:46:48 -08001647.. member:: DenseLinearAlgebraLibrary Solver::Options::dense_linear_algebra_library_type
1648
Austin Schuh3de38b02024-06-25 18:25:10 -07001649 Default: ``EIGEN``
Austin Schuh70cc9552019-01-21 19:46:48 -08001650
1651 Ceres supports using multiple dense linear algebra libraries for
Austin Schuh3de38b02024-06-25 18:25:10 -07001652 dense matrix factorizations. Currently ``EIGEN``, ``LAPACK`` and
1653 ``CUDA`` are the valid choices. ``EIGEN`` is always available,
1654 ``LAPACK`` refers to the system ``BLAS + LAPACK`` library which may
1655 or may not be available. ``CUDA`` refers to Nvidia's GPU based
1656 dense linear algebra library which may or may not be available.
Austin Schuh70cc9552019-01-21 19:46:48 -08001657
1658 This setting affects the ``DENSE_QR``, ``DENSE_NORMAL_CHOLESKY``
   and ``DENSE_SCHUR`` solvers. For small to moderate sized problems
   ``EIGEN`` is a fine choice, but for large problems an optimized
Austin Schuh3de38b02024-06-25 18:25:10 -07001661 ``LAPACK + BLAS`` or ``CUDA`` implementation can make a substantial
1662 difference in performance.
Austin Schuh70cc9552019-01-21 19:46:48 -08001663
1664.. member:: SparseLinearAlgebraLibrary Solver::Options::sparse_linear_algebra_library_type
1665
1666 Default: The highest available according to: ``SUITE_SPARSE`` >
Austin Schuh3de38b02024-06-25 18:25:10 -07001667 ``ACCELERATE_SPARSE`` > ``EIGEN_SPARSE`` > ``NO_SPARSE``
Austin Schuh70cc9552019-01-21 19:46:48 -08001668
1669 Ceres supports the use of three sparse linear algebra libraries,
1670 ``SuiteSparse``, which is enabled by setting this parameter to
   ``SUITE_SPARSE``, ``Accelerate``, which can be selected by setting
   this parameter to ``ACCELERATE_SPARSE``, and ``Eigen``, which is
1673 enabled by setting this parameter to ``EIGEN_SPARSE``. Lastly,
1674 ``NO_SPARSE`` means that no sparse linear solver should be used;
1675 note that this is irrespective of whether Ceres was compiled with
1676 support for one.
Austin Schuh70cc9552019-01-21 19:46:48 -08001677
Austin Schuh3de38b02024-06-25 18:25:10 -07001678 ``SuiteSparse`` is a sophisticated sparse linear algebra library
1679 and should be used in general. On MacOS you may want to use the
1680 ``Accelerate`` framework.
Austin Schuh70cc9552019-01-21 19:46:48 -08001681
1682 If your needs/platforms prevent you from using ``SuiteSparse``,
   consider using the sparse linear algebra routines in ``Eigen``. The
   sparse Cholesky algorithms currently included with ``Eigen`` are
   not as sophisticated as the ones in ``SuiteSparse`` and
   ``Accelerate``, and as a result their performance is considerably
   worse.
Austin Schuh70cc9552019-01-21 19:46:48 -08001688
.. member:: LinearSolverOrderingType Solver::Options::linear_solver_ordering_type

   Default: ``AMD``

   The order in which variables are eliminated in a linear solver can
   have a significant impact on the efficiency and accuracy of the
   method. e.g., when doing sparse Cholesky factorization, there are
   matrices for which a good ordering will give a Cholesky factor
   with :math:`O(n)` storage, whereas a bad ordering will result in
   a completely dense factor.
Austin Schuh70cc9552019-01-21 19:46:48 -08001699
Austin Schuh3de38b02024-06-25 18:25:10 -07001700 Sparse direct solvers like ``SPARSE_NORMAL_CHOLESKY`` and
1701 ``SPARSE_SCHUR`` use a fill reducing ordering of the columns and
1702 rows of the matrix being factorized before computing the numeric
1703 factorization.
Austin Schuh70cc9552019-01-21 19:46:48 -08001704
Austin Schuh3de38b02024-06-25 18:25:10 -07001705 This enum controls the type of algorithm used to compute this fill
1706 reducing ordering. There is no single algorithm that works on all
1707 matrices, so determining which algorithm works better is a matter
1708 of empirical experimentation.
Austin Schuh70cc9552019-01-21 19:46:48 -08001709
.. member:: std::shared_ptr<ParameterBlockOrdering> Solver::Options::linear_solver_ordering
1711
1712 Default: ``nullptr``
Austin Schuh70cc9552019-01-21 19:46:48 -08001713
1714 An instance of the ordering object informs the solver about the
1715 desired order in which parameter blocks should be eliminated by the
Austin Schuh3de38b02024-06-25 18:25:10 -07001716 linear solvers.
Austin Schuh70cc9552019-01-21 19:46:48 -08001717
Austin Schuh3de38b02024-06-25 18:25:10 -07001718 If ``nullptr``, the solver is free to choose an ordering that it
Austin Schuh70cc9552019-01-21 19:46:48 -08001719 thinks is best.
1720
1721 See :ref:`section-ordering` for more details.
1722
1723.. member:: bool Solver::Options::use_explicit_schur_complement
1724
1725 Default: ``false``
1726
1727 Use an explicitly computed Schur complement matrix with
1728 ``ITERATIVE_SCHUR``.
1729
   By default this option is disabled and ``ITERATIVE_SCHUR``
   evaluates matrix-vector products between the Schur
1732 complement and a vector implicitly by exploiting the algebraic
1733 expression for the Schur complement.
1734
1735 The cost of this evaluation scales with the number of non-zeros in
1736 the Jacobian.
1737
1738 For small to medium sized problems there is a sweet spot where
1739 computing the Schur complement is cheap enough that it is much more
1740 efficient to explicitly compute it and use it for evaluating the
1741 matrix-vector products.
1742
   .. NOTE::
Austin Schuh70cc9552019-01-21 19:46:48 -08001744
1745 This option can only be used with the ``SCHUR_JACOBI``
1746 preconditioner.
1747
.. member:: bool Solver::Options::dynamic_sparsity
Austin Schuh70cc9552019-01-21 19:46:48 -08001749
1750 Default: ``false``
1751
   Some non-linear least squares problems are symbolically dense but
   numerically sparse, i.e., at any given state only a small number of
   Jacobian entries are non-zero, but the position and number of
   non-zeros is different depending on the state. For these problems
   it can be useful to factorize the sparse Jacobian at each solver
   iteration instead of including all of the zero entries in a single
   general factorization.
1759
1760 If your problem does not have this property (or you do not know),
1761 then it is probably best to keep this false, otherwise it will
1762 likely lead to worse performance.
1763
1764 This setting only affects the `SPARSE_NORMAL_CHOLESKY` solver.
1765
.. member:: bool Solver::Options::use_mixed_precision_solves
1767
1768 Default: ``false``
1769
1770 If true, the Gauss-Newton matrix is computed in *double* precision, but
1771 its factorization is computed in **single** precision. This can result in
1772 significant time and memory savings at the cost of some accuracy in the
1773 Gauss-Newton step. Iterative refinement is used to recover some
1774 of this accuracy back.
1775
1776 If ``use_mixed_precision_solves`` is true, we recommend setting
1777 ``max_num_refinement_iterations`` to 2-3.
1778
1779 See :ref:`section-mixed-precision` for more details.
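
   A minimal sketch of enabling mixed precision solves; the number of
   refinement iterations follows the suggestion above:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.use_mixed_precision_solves = true;
      options.max_num_refinement_iterations = 3;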
1780
1781.. member:: int Solver::Options::max_num_refinement_iterations
1782
1783 Default: ``0``
1784
   Number of steps of the iterative refinement process to run when
1786 computing the Gauss-Newton step, see
1787 :member:`Solver::Options::use_mixed_precision_solves`.
1788
Austin Schuh70cc9552019-01-21 19:46:48 -08001789.. member:: int Solver::Options::min_linear_solver_iterations
1790
1791 Default: ``0``
1792
1793 Minimum number of iterations used by the linear solver. This only
1794 makes sense when the linear solver is an iterative solver, e.g.,
1795 ``ITERATIVE_SCHUR`` or ``CGNR``.
1796
1797.. member:: int Solver::Options::max_linear_solver_iterations
1798
1799 Default: ``500``
1800
   Maximum number of iterations used by the linear solver. This only
1802 makes sense when the linear solver is an iterative solver, e.g.,
1803 ``ITERATIVE_SCHUR`` or ``CGNR``.
1804
.. member:: int Solver::Options::max_num_spse_iterations
1806
1807 Default: `5`
1808
1809 Maximum number of iterations performed by
1810 ``SCHUR_POWER_SERIES_EXPANSION``. Each iteration corresponds to one
   more term in the power series expansion of the inverse of the Schur
1812 complement. This value controls the maximum number of iterations
1813 whether it is used as a preconditioner or just to initialize the
1814 solution for ``ITERATIVE_SCHUR``.
1815
.. member:: bool Solver::Options::use_spse_initialization
1817
1818 Default: ``false``
1819
1820 Use Schur power series expansion to initialize the solution for
1821 ``ITERATIVE_SCHUR``. This option can be set ``true`` regardless of
1822 what preconditioner is being used.
1823
1824.. member:: double Solver::Options::spse_tolerance
1825
1826 Default: `0.1`
1827
1828 When ``use_spse_initialization`` is ``true``, this parameter along
1829 with ``max_num_spse_iterations`` controls the number of
1830 ``SCHUR_POWER_SERIES_EXPANSION`` iterations performed for
1831 initialization. It is not used to control the preconditioner.
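
   A sketch of enabling power series initialization for
   ``ITERATIVE_SCHUR``; the values shown simply restate the defaults:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.linear_solver_type = ceres::ITERATIVE_SCHUR;
      options.use_spse_initialization = true;
      options.max_num_spse_iterations = 5;
      options.spse_tolerance = 0.1;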
1832
.. member:: double Solver::Options::eta
1834
1835 Default: ``1e-1``
1836
1837 Forcing sequence parameter. The truncated Newton solver uses this
1838 number to control the relative accuracy with which the Newton step
1839 is computed. This constant is passed to
1840 ``ConjugateGradientsSolver`` which uses it to terminate the
1841 iterations when
1842
1843 .. math:: \frac{Q_i - Q_{i-1}}{Q_i} < \frac{\eta}{i}
1844
1845.. member:: bool Solver::Options::jacobi_scaling
1846
1847 Default: ``true``
1848
1849 ``true`` means that the Jacobian is scaled by the norm of its
1850 columns before being passed to the linear solver. This improves the
1851 numerical conditioning of the normal equations.
1852
1853.. member:: bool Solver::Options::use_inner_iterations
1854
1855 Default: ``false``
1856
1857 Use a non-linear version of a simplified variable projection
1858 algorithm. Essentially this amounts to doing a further optimization
1859 on each Newton/Trust region step using a coordinate descent
1860 algorithm. For more details, see :ref:`section-inner-iterations`.
1861
   **Note** Inner iterations cannot be used with :class:`Problem`
   objects that have an :class:`EvaluationCallback` associated with
   them.
1865
.. member:: std::shared_ptr<ParameterBlockOrdering> Solver::Options::inner_iteration_ordering
1867
1868 Default: ``nullptr``
1869
1870 If :member:`Solver::Options::use_inner_iterations` true, then the
1871 user has two choices.
1872
1873 1. Let the solver heuristically decide which parameter blocks to
1874 optimize in each inner iteration. To do this, set
1875 :member:`Solver::Options::inner_iteration_ordering` to ``nullptr``.
1876
   2. Specify a collection of ordered independent sets. The lower
1878 numbered groups are optimized before the higher number groups
1879 during the inner optimization phase. Each group must be an
1880 independent set. Not all parameter blocks need to be included in
1881 the ordering.
1882
1883 See :ref:`section-ordering` for more details.
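
   A minimal sketch of the second choice, where ``points`` is a
   hypothetical container of parameter block pointers that form an
   independent set:

   .. code-block:: c++

      ceres::Solver::Options options;
      options.use_inner_iterations = true;

      auto inner_ordering = std::make_shared<ceres::ParameterBlockOrdering>();
      for (double* point : points) {
        inner_ordering->AddElementToGroup(point, 0);
      }
      options.inner_iteration_ordering = inner_ordering;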
1884
Austin Schuh70cc9552019-01-21 19:46:48 -08001885.. member:: double Solver::Options::inner_iteration_tolerance
1886
1887 Default: ``1e-3``
1888
1889 Generally speaking, inner iterations make significant progress in
1890 the early stages of the solve and then their contribution drops
1891 down sharply, at which point the time spent doing inner iterations
1892 is not worth it.
1893
1894 Once the relative decrease in the objective function due to inner
1895 iterations drops below ``inner_iteration_tolerance``, the use of
1896 inner iterations in subsequent trust region minimizer iterations is
1897 disabled.
1898
Austin Schuh70cc9552019-01-21 19:46:48 -08001899
1900.. member:: LoggingType Solver::Options::logging_type
1901
1902 Default: ``PER_MINIMIZER_ITERATION``
1903
Austin Schuh3de38b02024-06-25 18:25:10 -07001904 Valid values are ``SILENT`` and ``PER_MINIMIZER_ITERATION``.
1905
Austin Schuh70cc9552019-01-21 19:46:48 -08001906.. member:: bool Solver::Options::minimizer_progress_to_stdout
1907
1908 Default: ``false``
1909
Austin Schuh3de38b02024-06-25 18:25:10 -07001910 By default the Minimizer's progress is logged to ``STDERR``
Austin Schuh70cc9552019-01-21 19:46:48 -08001911 depending on the ``vlog`` level. If this flag is set to true, and
Austin Schuh3de38b02024-06-25 18:25:10 -07001912 :member:`Solver::Options::logging_type` is not ``SILENT``, the
1913 logging output is sent to ``STDOUT``.
Austin Schuh70cc9552019-01-21 19:46:48 -08001914
1915 For ``TRUST_REGION_MINIMIZER`` the progress display looks like
1916
1917 .. code-block:: bash
1918
1919 iter cost cost_change |gradient| |step| tr_ratio tr_radius ls_iter iter_time total_time
1920 0 4.185660e+06 0.00e+00 1.09e+08 0.00e+00 0.00e+00 1.00e+04 0 7.59e-02 3.37e-01
1921 1 1.062590e+05 4.08e+06 8.99e+06 5.36e+02 9.82e-01 3.00e+04 1 1.65e-01 5.03e-01
1922 2 4.992817e+04 5.63e+04 8.32e+06 3.19e+02 6.52e-01 3.09e+04 1 1.45e-01 6.48e-01
1923
1924 Here
1925
1926 #. ``cost`` is the value of the objective function.
1927 #. ``cost_change`` is the change in the value of the objective
1928 function if the step computed in this iteration is accepted.
1929 #. ``|gradient|`` is the max norm of the gradient.
1930 #. ``|step|`` is the change in the parameter vector.
1931 #. ``tr_ratio`` is the ratio of the actual change in the objective
1932 function value to the change in the value of the trust
1933 region model.
1934 #. ``tr_radius`` is the size of the trust region radius.
1935 #. ``ls_iter`` is the number of linear solver iterations used to
1936 compute the trust region step. For direct/factorization based
1937 solvers it is always 1, for iterative solvers like
1938 ``ITERATIVE_SCHUR`` it is the number of iterations of the
1939 Conjugate Gradients algorithm.
   #. ``iter_time`` is the time taken by the current iteration.
1941 #. ``total_time`` is the total time taken by the minimizer.
1942
1943 For ``LINE_SEARCH_MINIMIZER`` the progress display looks like
1944
1945 .. code-block:: bash
1946
1947 0: f: 2.317806e+05 d: 0.00e+00 g: 3.19e-01 h: 0.00e+00 s: 0.00e+00 e: 0 it: 2.98e-02 tt: 8.50e-02
1948 1: f: 2.312019e+05 d: 5.79e+02 g: 3.18e-01 h: 2.41e+01 s: 1.00e+00 e: 1 it: 4.54e-02 tt: 1.31e-01
1949 2: f: 2.300462e+05 d: 1.16e+03 g: 3.17e-01 h: 4.90e+01 s: 2.54e-03 e: 1 it: 4.96e-02 tt: 1.81e-01
1950
1951 Here
1952
1953 #. ``f`` is the value of the objective function.
1954 #. ``d`` is the change in the value of the objective function if
1955 the step computed in this iteration is accepted.
1956 #. ``g`` is the max norm of the gradient.
1957 #. ``h`` is the change in the parameter vector.
1958 #. ``s`` is the optimal step length computed by the line search.
   #. ``it`` is the time taken by the current iteration.
1960 #. ``tt`` is the total time taken by the minimizer.
1961
.. member:: std::vector<int> Solver::Options::trust_region_minimizer_iterations_to_dump
Austin Schuh70cc9552019-01-21 19:46:48 -08001963
1964 Default: ``empty``
1965
1966 List of iterations at which the trust region minimizer should dump
1967 the trust region problem. Useful for testing and benchmarking. If
1968 ``empty``, no problems are dumped.
1969
.. member:: std::string Solver::Options::trust_region_problem_dump_directory
Austin Schuh70cc9552019-01-21 19:46:48 -08001971
1972 Default: ``/tmp``
1973
1974 Directory to which the problems should be written to. Should be
1975 non-empty if
1976 :member:`Solver::Options::trust_region_minimizer_iterations_to_dump` is
1977 non-empty and
1978 :member:`Solver::Options::trust_region_problem_dump_format_type` is not
1979 ``CONSOLE``.
1980
.. member:: DumpFormatType Solver::Options::trust_region_problem_dump_format_type
Austin Schuh70cc9552019-01-21 19:46:48 -08001982
1983 Default: ``TEXTFILE``
1984
1985 The format in which trust region problems should be logged when
1986 :member:`Solver::Options::trust_region_minimizer_iterations_to_dump`
1987 is non-empty. There are three options:
1988
1989 * ``CONSOLE`` prints the linear least squares problem in a human
1990 readable format to ``stderr``. The Jacobian is printed as a
1991 dense matrix. The vectors :math:`D`, :math:`x` and :math:`f` are
1992 printed as dense vectors. This should only be used for small
1993 problems.
1994
1995 * ``TEXTFILE`` Write out the linear least squares problem to the
1996 directory pointed to by
1997 :member:`Solver::Options::trust_region_problem_dump_directory` as
1998 text files which can be read into ``MATLAB/Octave``. The Jacobian
1999 is dumped as a text file containing :math:`(i,j,s)` triplets, the
2000 vectors :math:`D`, `x` and `f` are dumped as text files
2001 containing a list of their values.
2002
2003 A ``MATLAB/Octave`` script called
2004 ``ceres_solver_iteration_???.m`` is also output, which can be
2005 used to parse and load the problem into memory.
2006
2007.. member:: bool Solver::Options::check_gradients
2008
2009 Default: ``false``
2010
2011 Check all Jacobians computed by each residual block with finite
2012 differences. This is expensive since it involves computing the
2013 derivative by normal means (e.g. user specified, autodiff, etc),
2014 then also computing it using finite differences. The results are
2015 compared, and if they differ substantially, the optimization fails
2016 and the details are stored in the solver summary.
2017
2018.. member:: double Solver::Options::gradient_check_relative_precision
2019
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08002020 Default: ``1e-8``
Austin Schuh70cc9552019-01-21 19:46:48 -08002021
2022 Precision to check for in the gradient checker. If the relative
2023 difference between an element in a Jacobian exceeds this number,
2024 then the Jacobian for that cost term is dumped.
2025
2026.. member:: double Solver::Options::gradient_check_numeric_derivative_relative_step_size
2027
2028 Default: ``1e-6``
2029
2030 .. NOTE::
2031
2032 This option only applies to the numeric differentiation used for
     checking the user provided derivatives when
2034 `Solver::Options::check_gradients` is true. If you are using
2035 :class:`NumericDiffCostFunction` and are interested in changing
2036 the step size for numeric differentiation in your cost function,
2037 please have a look at :class:`NumericDiffOptions`.
2038
2039 Relative shift used for taking numeric derivatives when
2040 `Solver::Options::check_gradients` is `true`.
2041
2042 For finite differencing, each dimension is evaluated at slightly
2043 shifted values, e.g., for forward differences, the numerical
2044 derivative is
2045
2046 .. math::
2047
2048 \delta &= gradient\_check\_numeric\_derivative\_relative\_step\_size\\
2049 \Delta f &= \frac{f((1 + \delta) x) - f(x)}{\delta x}
2050
2051 The finite differencing is done along each dimension. The reason to
2052 use a relative (rather than absolute) step size is that this way,
2053 numeric differentiation works for functions where the arguments are
2054 typically large (e.g. :math:`10^9`) and when the values are small
2055 (e.g. :math:`10^{-5}`). It is possible to construct *torture cases*
2056 which break this finite difference heuristic, but they do not come
2057 up often in practice.
2058
Austin Schuh70cc9552019-01-21 19:46:48 -08002059.. member:: bool Solver::Options::update_state_every_iteration
2060
2061 Default: ``false``
2062
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08002063 If ``update_state_every_iteration`` is ``true``, then Ceres Solver
2064 will guarantee that at the end of every iteration and before any
2065 user :class:`IterationCallback` is called, the parameter blocks are
2066 updated to the current best solution found by the solver. Thus the
2067 IterationCallback can inspect the values of the parameter blocks
2068 for purposes of computation, visualization or termination.
Austin Schuh70cc9552019-01-21 19:46:48 -08002069
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08002070 If ``update_state_every_iteration`` is ``false`` then there is no
2071 such guarantee, and user provided :class:`IterationCallback` s
2072 should not expect to look at the parameter blocks and interpret
2073 their values.
Austin Schuh70cc9552019-01-21 19:46:48 -08002074
.. member:: std::vector<IterationCallback*> Solver::Options::callbacks
2076
2077 Default: ``empty``
Austin Schuh70cc9552019-01-21 19:46:48 -08002078
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08002079 Callbacks that are executed at the end of each iteration of the
Austin Schuh3de38b02024-06-25 18:25:10 -07002080 minimizer. They are executed in the order that they are
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08002081 specified in this vector.
Austin Schuh70cc9552019-01-21 19:46:48 -08002082
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08002083 By default, parameter blocks are updated only at the end of the
Austin Schuh3de38b02024-06-25 18:25:10 -07002084 optimization, i.e., when the minimizer terminates. This means that
2085 by default, if an :class:`IterationCallback` inspects the parameter
2086 blocks, they will not see them changing in the course of the
2087 optimization.
Austin Schuh70cc9552019-01-21 19:46:48 -08002088
Austin Schuh1d1e6ea2020-12-23 21:56:30 -08002089 To tell Ceres to update the parameter blocks at the end of each
2090 iteration and before calling the user's callback, set
2091 :member:`Solver::Options::update_state_every_iteration` to
2092 ``true``.
Austin Schuh70cc9552019-01-21 19:46:48 -08002093
Austin Schuh3de38b02024-06-25 18:25:10 -07002094 See `examples/iteration_callback_example.cc
2095 <https://ceres-solver.googlesource.com/ceres-solver/+/master/examples/iteration_callback_example.cc>`_
2096 for an example of an :class:`IterationCallback` that uses
2097 :member:`Solver::Options::update_state_every_iteration` to log
2098 changes to the parameter blocks over the course of the
2099 optimization.
2100
   The solver does NOT take ownership of these pointers.
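
   A minimal sketch of registering a callback, where ``MyCallback``
   is a hypothetical subclass of :class:`IterationCallback` and
   ``problem`` is the user's :class:`Problem`:

   .. code-block:: c++

      MyCallback callback;
      ceres::Solver::Options options;
      options.update_state_every_iteration = true;
      options.callbacks.push_back(&callback);

      // The callback must outlive the call to Solve.
      ceres::Solver::Summary summary;
      ceres::Solve(options, &problem, &summary);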
Austin Schuh70cc9552019-01-21 19:46:48 -08002102
2103:class:`ParameterBlockOrdering`
2104===============================
2105
2106.. class:: ParameterBlockOrdering
2107
2108 ``ParameterBlockOrdering`` is a class for storing and manipulating
2109 an ordered collection of groups/sets with the following semantics:
2110
2111 Group IDs are non-negative integer values. Elements are any type
2112 that can serve as a key in a map or an element of a set.
2113
2114 An element can only belong to one group at a time. A group may
2115 contain an arbitrary number of elements.
2116
2117 Groups are ordered by their group id.
2118
2119.. function:: bool ParameterBlockOrdering::AddElementToGroup(const double* element, const int group)
2120
2121 Add an element to a group. If a group with this id does not exist,
2122 one is created. This method can be called any number of times for
2123 the same element. Group ids should be non-negative numbers. Return
2124 value indicates if adding the element was a success.
2125
2126.. function:: void ParameterBlockOrdering::Clear()
2127
2128 Clear the ordering.
2129
2130.. function:: bool ParameterBlockOrdering::Remove(const double* element)
2131
2132 Remove the element, no matter what group it is in. If the element
2133 is not a member of any group, calling this method will result in a
2134 crash. Return value indicates if the element was actually removed.
2135
2136.. function:: void ParameterBlockOrdering::Reverse()
2137
2138 Reverse the order of the groups in place.
2139
2140.. function:: int ParameterBlockOrdering::GroupId(const double* element) const
2141
2142 Return the group id for the element. If the element is not a member
2143 of any group, return -1.
2144
2145.. function:: bool ParameterBlockOrdering::IsMember(const double* element) const
2146
2147 True if there is a group containing the parameter block.
2148
2149.. function:: int ParameterBlockOrdering::GroupSize(const int group) const
2150
2151 This function always succeeds, i.e., implicitly there exists a
2152 group for every integer.
2153
2154.. function:: int ParameterBlockOrdering::NumElements() const
2155
2156 Number of elements in the ordering.
2157
2158.. function:: int ParameterBlockOrdering::NumGroups() const
2159
2160 Number of groups with one or more elements.
2161
:class:`IterationSummary`
=========================
Austin Schuh70cc9552019-01-21 19:46:48 -08002164
2165.. class:: IterationSummary
2166
2167 :class:`IterationSummary` describes the state of the minimizer at
2168 the end of each iteration.
2169
.. member:: int IterationSummary::iteration
Austin Schuh70cc9552019-01-21 19:46:48 -08002171
2172 Current iteration number.
2173
2174.. member:: bool IterationSummary::step_is_valid
2175
2176 Step was numerically valid, i.e., all values are finite and the
2177 step reduces the value of the linearized model.
2178
2179 **Note**: :member:`IterationSummary::step_is_valid` is `false`
2180 when :member:`IterationSummary::iteration` = 0.
2181
2182.. member:: bool IterationSummary::step_is_nonmonotonic
2183
2184 Step did not reduce the value of the objective function
2185 sufficiently, but it was accepted because of the relaxed
2186 acceptance criterion used by the non-monotonic trust region
2187 algorithm.
2188
2189 **Note**: :member:`IterationSummary::step_is_nonmonotonic` is
   `false` when :member:`IterationSummary::iteration` = 0.
2191
2192.. member:: bool IterationSummary::step_is_successful
2193
   Whether or not the minimizer accepted this step.
2195
2196 If the ordinary trust region algorithm is used, this means that the
2197 relative reduction in the objective function value was greater than
2198 :member:`Solver::Options::min_relative_decrease`. However, if the
2199 non-monotonic trust region algorithm is used
2200 (:member:`Solver::Options::use_nonmonotonic_steps` = `true`), then
2201 even if the relative decrease is not sufficient, the algorithm may
2202 accept the step and the step is declared successful.
2203
2204 **Note**: :member:`IterationSummary::step_is_successful` is `false`
   when :member:`IterationSummary::iteration` = 0.
2206
2207.. member:: double IterationSummary::cost
2208
2209 Value of the objective function.
2210
2211.. member:: double IterationSummary::cost_change
2212
2213 Change in the value of the objective function in this
2214 iteration. This can be positive or negative.
2215
2216.. member:: double IterationSummary::gradient_max_norm
2217
2218 Infinity norm of the gradient vector.
2219
2220.. member:: double IterationSummary::gradient_norm
2221
2222 2-norm of the gradient vector.
2223
2224.. member:: double IterationSummary::step_norm
2225
2226 2-norm of the size of the step computed in this iteration.
2227
2228.. member:: double IterationSummary::relative_decrease
2229
2230 For trust region algorithms, the ratio of the actual change in cost
2231 and the change in the cost of the linearized approximation.
2232
   This field is not used when a line search minimizer is used.
2234
2235.. member:: double IterationSummary::trust_region_radius
2236
2237 Size of the trust region at the end of the current iteration. For
2238 the Levenberg-Marquardt algorithm, the regularization parameter is
Austin Schuh3de38b02024-06-25 18:25:10 -07002239 1.0 / :member:`IterationSummary::trust_region_radius`.
Austin Schuh70cc9552019-01-21 19:46:48 -08002240
2241.. member:: double IterationSummary::eta
2242
2243 For the inexact step Levenberg-Marquardt algorithm, this is the
2244 relative accuracy with which the step is solved. This number is
2245 only applicable to the iterative solvers capable of solving linear
2246 systems inexactly. Factorization-based exact solvers always have an
2247 eta of 0.0.
2248
2249.. member:: double IterationSummary::step_size
2250
   Step size computed by the line search algorithm.
2252
2253 This field is not used when a trust region minimizer is used.
2254
2255.. member:: int IterationSummary::line_search_function_evaluations
2256
2257 Number of function evaluations used by the line search algorithm.
2258
2259 This field is not used when a trust region minimizer is used.
2260
2261.. member:: int IterationSummary::linear_solver_iterations
2262
2263 Number of iterations taken by the linear solver to solve for the
2264 trust region step.
2265
2266 Currently this field is not used when a line search minimizer is
2267 used.
2268
2269.. member:: double IterationSummary::iteration_time_in_seconds
2270
2271 Time (in seconds) spent inside the minimizer loop in the current
2272 iteration.
2273
2274.. member:: double IterationSummary::step_solver_time_in_seconds
2275
2276 Time (in seconds) spent inside the trust region step solver.
2277
2278.. member:: double IterationSummary::cumulative_time_in_seconds
2279
2280 Time (in seconds) since the user called Solve().
2281
:class:`IterationCallback`
==========================
Austin Schuh70cc9552019-01-21 19:46:48 -08002284
2285.. class:: IterationCallback
2286
2287 Interface for specifying callbacks that are executed at the end of
2288 each iteration of the minimizer.
2289
2290 .. code-block:: c++
2291
2292 class IterationCallback {
2293 public:
2294 virtual ~IterationCallback() {}
2295 virtual CallbackReturnType operator()(const IterationSummary& summary) = 0;
2296 };
2297
2298
2299 The solver uses the return value of ``operator()`` to decide whether
2300 to continue solving or to terminate. The user can return three
2301 values.
2302
2303 #. ``SOLVER_ABORT`` indicates that the callback detected an abnormal
2304 situation. The solver returns without updating the parameter
2305 blocks (unless ``Solver::Options::update_state_every_iteration`` is
2306 set true). Solver returns with ``Solver::Summary::termination_type``
2307 set to ``USER_FAILURE``.
2308
2309 #. ``SOLVER_TERMINATE_SUCCESSFULLY`` indicates that there is no need
2310 to optimize anymore (some user specified termination criterion
2311 has been met). Solver returns with
2312 ``Solver::Summary::termination_type``` set to ``USER_SUCCESS``.
2313
2314 #. ``SOLVER_CONTINUE`` indicates that the solver should continue
2315 optimizing.
2316
   The return values can be used to implement custom termination
   criteria that supersede the iteration/time/tolerance based
   termination implemented by Ceres.
2320
Austin Schuh70cc9552019-01-21 19:46:48 -08002321 For example, the following :class:`IterationCallback` is used
2322 internally by Ceres to log the progress of the optimization.
2323
2324 .. code-block:: c++
2325
2326 class LoggingCallback : public IterationCallback {
2327 public:
2328 explicit LoggingCallback(bool log_to_stdout)
2329 : log_to_stdout_(log_to_stdout) {}
2330
2331 ~LoggingCallback() {}
2332
2333 CallbackReturnType operator()(const IterationSummary& summary) {
2334 const char* kReportRowFormat =
2335 "% 4d: f:% 8e d:% 3.2e g:% 3.2e h:% 3.2e "
2336 "rho:% 3.2e mu:% 3.2e eta:% 3.2e li:% 3d";
2337 string output = StringPrintf(kReportRowFormat,
2338 summary.iteration,
2339 summary.cost,
2340 summary.cost_change,
2341 summary.gradient_max_norm,
2342 summary.step_norm,
2343 summary.relative_decrease,
2344 summary.trust_region_radius,
2345 summary.eta,
2346 summary.linear_solver_iterations);
2347 if (log_to_stdout_) {
2348 cout << output << endl;
2349 } else {
2350 VLOG(1) << output;
2351 }
2352 return SOLVER_CONTINUE;
2353 }
2354
2355 private:
2356 const bool log_to_stdout_;
2357 };
2358
2359
   See `examples/evaluation_callback_example.cc
   <https://ceres-solver.googlesource.com/ceres-solver/+/master/examples/evaluation_callback_example.cc>`_
2362 for another example that uses
2363 :member:`Solver::Options::update_state_every_iteration` to log
2364 changes to the parameter blocks over the course of the optimization.
2365

:class:`CRSMatrix`
==================

.. class:: CRSMatrix

   A compressed row sparse matrix used primarily for communicating the
   Jacobian matrix to the user.

.. member:: int CRSMatrix::num_rows

   Number of rows.

.. member:: int CRSMatrix::num_cols

   Number of columns.

.. member:: std::vector<int> CRSMatrix::rows

   :member:`CRSMatrix::rows` is a :member:`CRSMatrix::num_rows` + 1
   sized array that points into the :member:`CRSMatrix::cols` and
   :member:`CRSMatrix::values` arrays.

.. member:: std::vector<int> CRSMatrix::cols

   :member:`CRSMatrix::cols` contains as many entries as there are
   non-zeros in the matrix.

   For each row ``i``, ``cols[rows[i]]`` ... ``cols[rows[i + 1] - 1]``
   are the indices of the non-zero columns of row ``i``.

.. member:: std::vector<double> CRSMatrix::values

   :member:`CRSMatrix::values` contains as many entries as there are
   non-zeros in the matrix.

   For each row ``i``,
   ``values[rows[i]]`` ... ``values[rows[i + 1] - 1]`` are the values
   of the non-zero columns of row ``i``.

For example, consider the 3x4 sparse matrix

.. code-block:: c++

   0 10  0  4
   0  2 -3  2
   1  2  0  0

The three arrays will be:

.. code-block:: c++

            -row0-     ---row1---     -row2-
   rows   = [ 0,          2,            5,    7]
   cols   = [ 1,  3,     1,  2,  3,    0,  1]
   values = [10,  4,     2, -3,  2,    1,  2]

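
As a sketch of how these arrays are traversed, the following helper
prints every non-zero entry of a :class:`CRSMatrix`. The matrix is
assumed to have been filled in elsewhere, e.g. by
:func:`Problem::Evaluate`; the function name is illustrative only.

.. code-block:: c++

   #include <iostream>
   #include "ceres/crs_matrix.h"

   // Walk the compressed row structure: for row i, the non-zeros live
   // in the half-open index range [rows[i], rows[i + 1]).
   void PrintNonZeros(const ceres::CRSMatrix& m) {
     for (int i = 0; i < m.num_rows; ++i) {
       for (int idx = m.rows[i]; idx < m.rows[i + 1]; ++idx) {
         std::cout << "(" << i << ", " << m.cols[idx] << ") = "
                   << m.values[idx] << "\n";
       }
     }
   }

Applied to the 3x4 example above, this prints ``(0, 1) = 10``,
``(0, 3) = 4`` and so on, row by row.
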

:class:`Solver::Summary`
========================

.. class:: Solver::Summary

   Summary of the various stages of the solver after termination.

.. function:: std::string Solver::Summary::BriefReport() const

   A brief one line description of the state of the solver after
   termination.

.. function:: std::string Solver::Summary::FullReport() const

   A full multiline description of the state of the solver after
   termination.

.. function:: bool Solver::Summary::IsSolutionUsable() const

   Whether the solution returned by the optimization algorithm can be
   relied on to be numerically sane. This will be the case if
   :member:`Solver::Summary::termination_type` is set to `CONVERGENCE`,
   `USER_SUCCESS` or `NO_CONVERGENCE`, i.e., the solver converged by
   meeting one of the convergence tolerances, the user indicated that
   it had converged, or it ran to the maximum number of iterations or
   time.
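
   A typical usage sketch (the ``problem`` and ``options`` setup is
   assumed to exist elsewhere):

   .. code-block:: c++

      ceres::Solver::Summary summary;
      ceres::Solve(options, &problem, &summary);

      // Print a one line progress report and check whether the
      // returned solution can be trusted.
      std::cout << summary.BriefReport() << "\n";
      if (!summary.IsSolutionUsable()) {
        std::cout << "Solve failed: " << summary.message << "\n";
      }
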

.. member:: MinimizerType Solver::Summary::minimizer_type

   Type of minimization algorithm used.

.. member:: TerminationType Solver::Summary::termination_type

   The cause of the minimizer terminating.

.. member:: std::string Solver::Summary::message

   Reason why the solver terminated.

.. member:: double Solver::Summary::initial_cost

   Cost of the problem (value of the objective function) before the
   optimization.

.. member:: double Solver::Summary::final_cost

   Cost of the problem (value of the objective function) after the
   optimization.

.. member:: double Solver::Summary::fixed_cost

   The part of the total cost that comes from residual blocks that
   were held fixed by the preprocessor because all the parameter
   blocks that they depend on were fixed.

.. member:: std::vector<IterationSummary> Solver::Summary::iterations

   :class:`IterationSummary` for each minimizer iteration in order.

.. member:: int Solver::Summary::num_successful_steps

   Number of minimizer iterations in which the step was
   accepted. Unless :member:`Solver::Options::use_nonmonotonic_steps`
   is `true`, this is also the number of steps in which the objective
   function value/cost went down.

.. member:: int Solver::Summary::num_unsuccessful_steps

   Number of minimizer iterations in which the step was rejected
   either because it did not reduce the cost enough or because the
   step was not numerically valid.

.. member:: int Solver::Summary::num_inner_iteration_steps

   Number of times inner iterations were performed.

.. member:: int Solver::Summary::num_line_search_steps

   Total number of iterations inside the line search algorithm across
   all invocations. We call these iterations "steps" to distinguish
   them from the outer iterations of the line search and trust region
   minimizer algorithms, which call the line search algorithm as a
   subroutine.

.. member:: double Solver::Summary::preprocessor_time_in_seconds

   Time (in seconds) spent in the preprocessor.

.. member:: double Solver::Summary::minimizer_time_in_seconds

   Time (in seconds) spent in the minimizer.

.. member:: double Solver::Summary::postprocessor_time_in_seconds

   Time (in seconds) spent in the post processor.

.. member:: double Solver::Summary::total_time_in_seconds

   Time (in seconds) spent in the solver.

.. member:: double Solver::Summary::linear_solver_time_in_seconds

   Time (in seconds) spent in the linear solver computing the trust
   region step.

.. member:: int Solver::Summary::num_linear_solves

   Number of times the Newton step was computed by solving a linear
   system. This does not include linear solves used by inner
   iterations.

.. member:: double Solver::Summary::residual_evaluation_time_in_seconds

   Time (in seconds) spent evaluating the residual vector.

.. member:: int Solver::Summary::num_residual_evaluations

   Number of times only the residuals were evaluated.

.. member:: double Solver::Summary::jacobian_evaluation_time_in_seconds

   Time (in seconds) spent evaluating the Jacobian matrix.

.. member:: int Solver::Summary::num_jacobian_evaluations

   Number of times only the Jacobian and the residuals were evaluated.

.. member:: double Solver::Summary::inner_iteration_time_in_seconds

   Time (in seconds) spent doing inner iterations.

.. member:: int Solver::Summary::num_parameter_blocks

   Number of parameter blocks in the problem.

.. member:: int Solver::Summary::num_parameters

   Number of parameters in the problem.

.. member:: int Solver::Summary::num_effective_parameters

   Dimension of the tangent space of the problem (or the number of
   columns in the Jacobian for the problem). This is different from
   :member:`Solver::Summary::num_parameters` if a parameter block is
   associated with a :class:`Manifold`.

.. member:: int Solver::Summary::num_residual_blocks

   Number of residual blocks in the problem.

.. member:: int Solver::Summary::num_residuals

   Number of residuals in the problem.

.. member:: int Solver::Summary::num_parameter_blocks_reduced

   Number of parameter blocks in the problem after the inactive and
   constant parameter blocks have been removed. A parameter block is
   inactive if no residual block refers to it.

.. member:: int Solver::Summary::num_parameters_reduced

   Number of parameters in the reduced problem.

.. member:: int Solver::Summary::num_effective_parameters_reduced

   Dimension of the tangent space of the reduced problem (or the
   number of columns in the Jacobian for the reduced problem). This is
   different from :member:`Solver::Summary::num_parameters_reduced` if
   a parameter block in the reduced problem is associated with a
   :class:`Manifold`.

.. member:: int Solver::Summary::num_residual_blocks_reduced

   Number of residual blocks in the reduced problem.

.. member:: int Solver::Summary::num_residuals_reduced

   Number of residuals in the reduced problem.

.. member:: int Solver::Summary::num_threads_given

   Number of threads specified by the user for Jacobian and residual
   evaluation.

.. member:: int Solver::Summary::num_threads_used

   Number of threads actually used by the solver for Jacobian and
   residual evaluation.

.. member:: LinearSolverType Solver::Summary::linear_solver_type_given

   Type of the linear solver requested by the user.

.. member:: LinearSolverType Solver::Summary::linear_solver_type_used

   Type of the linear solver actually used. This may be different from
   :member:`Solver::Summary::linear_solver_type_given` if Ceres
   determines that the problem structure is not compatible with the
   linear solver requested or if the linear solver requested by the
   user is not available, e.g., the user requested
   `SPARSE_NORMAL_CHOLESKY` but no sparse linear algebra library was
   available.

.. member:: std::vector<int> Solver::Summary::linear_solver_ordering_given

   Size of the elimination groups given by the user as hints to the
   linear solver.

.. member:: std::vector<int> Solver::Summary::linear_solver_ordering_used

   Size of the parameter groups used by the solver when ordering the
   columns of the Jacobian. This may be different from
   :member:`Solver::Summary::linear_solver_ordering_given` if the user
   left :member:`Solver::Summary::linear_solver_ordering_given` blank
   and asked for an automatic ordering, or if the problem contains
   some constant or inactive parameter blocks.

.. member:: std::string Solver::Summary::schur_structure_given

   For Schur type linear solvers, this string describes the template
   specialization which was detected in the problem and should be
   used.

.. member:: std::string Solver::Summary::schur_structure_used

   For Schur type linear solvers, this string describes the template
   specialization that was actually instantiated and used. It will
   differ from :member:`Solver::Summary::schur_structure_given` if the
   corresponding template specialization does not exist.

   Template specializations can be added to Ceres by editing
   ``internal/ceres/generate_template_specializations.py``.

.. member:: bool Solver::Summary::inner_iterations_given

   `True` if the user asked for inner iterations to be used as part of
   the optimization.

.. member:: bool Solver::Summary::inner_iterations_used

   `True` if the user asked for inner iterations to be used as part of
   the optimization and the problem structure was such that they were
   actually performed. For example, in a problem with just one parameter
   block, inner iterations are not performed.

.. member:: std::vector<int> Solver::Summary::inner_iteration_ordering_given

   Size of the parameter groups given by the user for performing inner
   iterations.

.. member:: std::vector<int> Solver::Summary::inner_iteration_ordering_used

   Size of the parameter groups used by the solver for performing
   inner iterations. This may be different from
   :member:`Solver::Summary::inner_iteration_ordering_given` if the
   user left :member:`Solver::Summary::inner_iteration_ordering_given`
   blank and asked for an automatic ordering, or if the problem
   contains some constant or inactive parameter blocks.

.. member:: PreconditionerType Solver::Summary::preconditioner_type_given

   Type of the preconditioner requested by the user.

.. member:: PreconditionerType Solver::Summary::preconditioner_type_used

   Type of the preconditioner actually used. This may be different
   from :member:`Solver::Summary::preconditioner_type_given` if Ceres
   determines that the problem structure is not compatible with the
   preconditioner requested or if the preconditioner requested by the
   user is not available.

.. member:: VisibilityClusteringType Solver::Summary::visibility_clustering_type

   Type of clustering algorithm used for visibility based
   preconditioning. Only meaningful when
   :member:`Solver::Summary::preconditioner_type_used` is
   ``CLUSTER_JACOBI`` or ``CLUSTER_TRIDIAGONAL``.

.. member:: TrustRegionStrategyType Solver::Summary::trust_region_strategy_type

   Type of trust region strategy.

.. member:: DoglegType Solver::Summary::dogleg_type

   Type of dogleg strategy used for solving the trust region problem.

.. member:: DenseLinearAlgebraLibraryType Solver::Summary::dense_linear_algebra_library_type

   Type of the dense linear algebra library used.

.. member:: SparseLinearAlgebraLibraryType Solver::Summary::sparse_linear_algebra_library_type

   Type of the sparse linear algebra library used.

.. member:: LineSearchDirectionType Solver::Summary::line_search_direction_type

   Type of line search direction used.

.. member:: LineSearchType Solver::Summary::line_search_type

   Type of the line search algorithm used.

.. member:: LineSearchInterpolationType Solver::Summary::line_search_interpolation_type

   When performing line search, the degree of the polynomial used to
   approximate the objective function.

.. member:: NonlinearConjugateGradientType Solver::Summary::nonlinear_conjugate_gradient_type

   If the line search direction is `NONLINEAR_CONJUGATE_GRADIENT`,
   then this indicates the particular variant of non-linear conjugate
   gradient used.

.. member:: int Solver::Summary::max_lbfgs_rank

   If the type of the line search direction is `LBFGS`, then this
   indicates the rank of the Hessian approximation.