.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _`chapter-nnls_modeling`:

=================================
Modeling Non-linear Least Squares
=================================

Introduction
============

Ceres solver consists of two distinct parts: a modeling API, which
provides a rich set of tools to construct an optimization problem one
term at a time, and a solver API that controls the minimization
algorithm. This chapter is devoted to the task of modeling
optimization problems using Ceres. :ref:`chapter-nnls_solving` discusses
the various ways in which an optimization problem can be solved using
Ceres.

Ceres solves robustified bounds constrained non-linear least squares
problems of the form:

.. math:: :label: ceresproblem_modeling

   \min_{\mathbf{x}} &\quad \frac{1}{2}\sum_{i}
   \rho_i\left(\left\|f_i\left(x_{i_1},
   ... ,x_{i_k}\right)\right\|^2\right) \\
   \text{s.t.} &\quad l_j \le x_j \le u_j

In Ceres parlance, the expression
:math:`\rho_i\left(\left\|f_i\left(x_{i_1},...,x_{i_k}\right)\right\|^2\right)`
is known as a **residual block**, where :math:`f_i(\cdot)` is a
:class:`CostFunction` that depends on the **parameter blocks**
:math:`\left\{x_{i_1},... , x_{i_k}\right\}`.

In most optimization problems small groups of scalars occur
together. For example, the three components of a translation vector
and the four components of the quaternion that define the pose of a
camera. We refer to such a group of scalars as a **parameter
block**. Of course a parameter block can be just a single scalar too.

:math:`\rho_i` is a :class:`LossFunction`. A :class:`LossFunction` is
a scalar valued function that is used to reduce the influence of
outliers on the solution of non-linear least squares problems.

:math:`l_j` and :math:`u_j` are lower and upper bounds on the
parameter block :math:`x_j`.

As a special case, when :math:`\rho_i(x) = x`, i.e., the identity
function, and :math:`l_j = -\infty` and :math:`u_j = \infty` we get
the more familiar unconstrained `non-linear least squares problem
<http://en.wikipedia.org/wiki/Non-linear_least_squares>`_.

.. math:: :label: ceresproblemunconstrained

   \frac{1}{2}\sum_{i} \left\|f_i\left(x_{i_1}, ... ,x_{i_k}\right)\right\|^2.

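As a concrete illustration of this formulation, the following is a
minimal sketch of how such a problem is assembled. The functor name
``MyScalarCostFunctor`` and the constant ``1.0`` are placeholders;
:class:`AutoDiffCostFunction`, loss functions and the bounds API are
all described in detail later in this chapter.

.. code-block:: c++

   // Two 2-dimensional parameter blocks.
   double x[2] = {1.0, 1.0};
   double y[2] = {1.0, 1.0};

   ceres::Problem problem;

   // One residual block: rho(||f(x, y)||^2), with a Cauchy loss as rho.
   problem.AddResidualBlock(
       new ceres::AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
           new MyScalarCostFunctor(1.0)),
       new ceres::CauchyLoss(0.5),
       x, y);

   // Bounds constraints l_j <= x_j <= u_j on the first coordinate of x.
   problem.SetParameterLowerBound(x, 0, 0.0);
   problem.SetParameterUpperBound(x, 0, 2.0);
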
:class:`CostFunction`
=====================

For each term in the objective function, a :class:`CostFunction` is
responsible for computing a vector of residuals and Jacobian
matrices. Concretely, consider a function
:math:`f\left(x_{1},...,x_{k}\right)` that depends on parameter blocks
:math:`\left[x_{1}, ... , x_{k}\right]`.

Then, given :math:`\left[x_{1}, ... , x_{k}\right]`,
:class:`CostFunction` is responsible for computing the vector
:math:`f\left(x_{1},...,x_{k}\right)` and the Jacobian matrices

.. math:: J_i = D_i f(x_1, ..., x_k) \quad \forall i \in \{1, \ldots, k\}

.. class:: CostFunction

   .. code-block:: c++

      class CostFunction {
       public:
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const = 0;
        const vector<int32>& parameter_block_sizes();
        int num_residuals() const;

       protected:
        vector<int32>* mutable_parameter_block_sizes();
        void set_num_residuals(int num_residuals);
      };


   The signature of the :class:`CostFunction` (number and sizes of
   input parameter blocks and number of outputs) is stored in
   :member:`CostFunction::parameter_block_sizes_` and
   :member:`CostFunction::num_residuals_` respectively. User code
   inheriting from this class is expected to set these two members
   with the corresponding accessors. This information will be
   verified by the :class:`Problem` when added with
   :func:`Problem::AddResidualBlock`.

.. function:: bool CostFunction::Evaluate(double const* const* parameters, double* residuals, double** jacobians)

   Compute the residual vector and the Jacobian matrices.

   ``parameters`` is an array of arrays of size
   ``CostFunction::parameter_block_sizes_.size()`` and
   ``parameters[i]`` is an array of size ``parameter_block_sizes_[i]``
   that contains the :math:`i^{\text{th}}` parameter block that the
   ``CostFunction`` depends on.

   ``parameters`` is never ``nullptr``.

   ``residuals`` is an array of size ``num_residuals_``.

   ``residuals`` is never ``nullptr``.

   ``jacobians`` is an array of arrays of size
   ``CostFunction::parameter_block_sizes_.size()``.

   If ``jacobians`` is ``nullptr``, the user is only expected to
   compute the residuals.

   ``jacobians[i]`` is a row-major array of size ``num_residuals x
   parameter_block_sizes_[i]``.

   If ``jacobians[i]`` is **not** ``nullptr``, the user is required to
   compute the Jacobian of the residual vector with respect to
   ``parameters[i]`` and store it in this array, i.e.

   ``jacobians[i][r * parameter_block_sizes_[i] + c]`` =
   :math:`\frac{\displaystyle \partial \text{residual}[r]}{\displaystyle \partial \text{parameters}[i][c]}`

   If ``jacobians[i]`` is ``nullptr``, then this computation can be
   skipped. This is the case when the corresponding parameter block is
   marked constant.

   The return value indicates whether the computation of the residuals
   and/or jacobians was successful or not. This can be used to
   communicate numerical failures in Jacobian computations for
   instance.
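
   This contract is easiest to see in a complete example. The
   following is a minimal sketch (not part of the Ceres API) of a
   hand-written cost function for :math:`f(x) = 10 - x`, where
   :math:`x` is a single scalar parameter block; note how the
   constructor sets the signature via the protected accessors, and how
   ``Evaluate`` guards against ``nullptr`` before filling in the
   Jacobian.

   .. code-block:: c++

      class QuadraticCostFunction : public ceres::CostFunction {
       public:
        QuadraticCostFunction() {
          set_num_residuals(1);
          mutable_parameter_block_sizes()->push_back(1);
        }

        bool Evaluate(double const* const* parameters,
                      double* residuals,
                      double** jacobians) const override {
          const double x = parameters[0][0];
          residuals[0] = 10.0 - x;

          // Jacobians are only requested when jacobians != nullptr, and
          // jacobians[i] may be nullptr if block i is held constant.
          if (jacobians != nullptr && jacobians[0] != nullptr) {
            jacobians[0][0] = -1.0;  // d residual / d x
          }
          return true;
        }
      };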

:class:`SizedCostFunction`
==========================

.. class:: SizedCostFunction

   If the size of the parameter blocks and the size of the residual
   vector is known at compile time (this is the common case),
   :class:`SizedCostFunction` can be used where these values can be
   specified as template parameters and the user only needs to
   implement :func:`CostFunction::Evaluate`.

   .. code-block:: c++

      template<int kNumResiduals, int... Ns>
      class SizedCostFunction : public CostFunction {
       public:
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const = 0;
      };
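
   For instance, the hand-written cost function sketched above becomes
   shorter when the sizes are given as template parameters, since the
   constructor boilerplate disappears; again, this is only a sketch:

   .. code-block:: c++

      // One residual, one parameter block of size one.
      class QuadraticCostFunction
          : public ceres::SizedCostFunction<1, 1> {
       public:
        bool Evaluate(double const* const* parameters,
                      double* residuals,
                      double** jacobians) const override {
          const double x = parameters[0][0];
          residuals[0] = 10.0 - x;
          if (jacobians != nullptr && jacobians[0] != nullptr) {
            jacobians[0][0] = -1.0;
          }
          return true;
        }
      };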


:class:`AutoDiffCostFunction`
=============================

.. class:: AutoDiffCostFunction

   Defining a :class:`CostFunction` or a :class:`SizedCostFunction`
   can be a tedious and error prone task, especially when computing
   derivatives. To this end Ceres provides `automatic differentiation
   <http://en.wikipedia.org/wiki/Automatic_differentiation>`_.

   .. code-block:: c++

      template <typename CostFunctor,
                int kNumResiduals,  // Number of residuals, or ceres::DYNAMIC.
                int... Ns>          // Size of each parameter block
      class AutoDiffCostFunction : public
      SizedCostFunction<kNumResiduals, Ns...> {
       public:
        AutoDiffCostFunction(CostFunctor* functor,
                             Ownership ownership = TAKE_OWNERSHIP);
        // Ignore the template parameter kNumResiduals and use
        // num_residuals instead.
        AutoDiffCostFunction(CostFunctor* functor,
                             int num_residuals,
                             Ownership ownership = TAKE_OWNERSHIP);
      };

   To get an auto differentiated cost function, you must define a
   class with a templated ``operator()`` (a functor) that computes the
   cost function in terms of the template parameter ``T``. The
   autodiff framework substitutes appropriate ``Jet`` objects for
   ``T`` in order to compute the derivative when necessary, but this
   is hidden, and you should write the function as if ``T`` were a
   scalar type (e.g. a double-precision floating point number).

   The function must write the computed value in the last argument
   (the only non-``const`` one) and return true to indicate success.

   For example, consider a scalar error :math:`e = k - x^\top y`,
   where both :math:`x` and :math:`y` are two-dimensional vector
   parameters and :math:`k` is a constant. The form of this error,
   which is the difference between a constant and an expression, is a
   common pattern in least squares problems. For example, the value
   :math:`x^\top y` might be the model expectation for a series of
   measurements, where there is an instance of the cost function for
   each measurement :math:`k`.

   The actual cost added to the total problem is :math:`e^2`, or
   :math:`(k - x^\top y)^2`; however, the squaring is implicitly done
   by the optimization framework.

   To write an auto-differentiable cost function for the above model,
   first define the object

   .. code-block:: c++

      class MyScalarCostFunctor {
       public:
        MyScalarCostFunctor(double k) : k_(k) {}

        template <typename T>
        bool operator()(const T* const x, const T* const y, T* e) const {
          e[0] = k_ - x[0] * y[0] - x[1] * y[1];
          return true;
        }

       private:
        double k_;
      };


   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to
   arrays of ``T``. If there were three input parameters, then the
   third input parameter would come after ``y``. The output is always
   the last parameter, and is also a pointer to an array. In the
   example above, ``e`` is a scalar, so only ``e[0]`` is set.

   Then given this class definition, the auto differentiated cost
   function for it can be constructed as follows.

   .. code-block:: c++

      CostFunction* cost_function
          = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
              new MyScalarCostFunctor(1.0));             ^  ^  ^
                                                         |  |  |
                             Dimension of residual ------+  |  |
                             Dimension of x ----------------+  |
                             Dimension of y -------------------+


   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``<1, 2, 2>`` describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   By default :class:`AutoDiffCostFunction` will take ownership of the
   cost functor pointer passed to it, i.e., it will call ``delete`` on
   the cost functor when the :class:`AutoDiffCostFunction` itself is
   deleted. However, this may be undesirable in certain cases,
   therefore it is also possible to specify ``DO_NOT_TAKE_OWNERSHIP``
   as a second argument in the constructor, while passing a pointer to
   a cost functor which does not need to be deleted by the
   AutoDiffCostFunction. For example:

   .. code-block:: c++

      MyScalarCostFunctor functor(1.0);
      CostFunction* cost_function
          = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
              &functor, DO_NOT_TAKE_OWNERSHIP);

   :class:`AutoDiffCostFunction` also supports cost functions with a
   runtime-determined number of residuals. For example:

   .. code-block:: c++

      CostFunction* cost_function
          = new AutoDiffCostFunction<CostFunctorWithDynamicNumResiduals,
                                     DYNAMIC,  // Indicate dynamic number of residuals.
                                     2,        // Dimension of x.
                                     2>(       // Dimension of y.
              new CostFunctorWithDynamicNumResiduals(1.0),
              runtime_number_of_residuals);  // Actual number of residuals.

   **WARNING 1** A common beginner's error when first using
   :class:`AutoDiffCostFunction` is to get the sizing wrong. In
   particular, there is a tendency to set the template parameters to
   (dimension of residual, number of parameters) instead of passing a
   dimension parameter for *every parameter block*. In the example
   above, that would be ``<MyScalarCostFunctor, 1, 2>``, which is
   missing the 2 as the last template argument.


:class:`DynamicAutoDiffCostFunction`
====================================

.. class:: DynamicAutoDiffCostFunction

   :class:`AutoDiffCostFunction` requires that the number of parameter
   blocks and their sizes be known at compile time. In a number of
   applications, this is not enough, e.g., Bezier curve fitting,
   neural network training, etc.

   .. code-block:: c++

      template <typename CostFunctor, int Stride = 4>
      class DynamicAutoDiffCostFunction : public CostFunction {
      };

   In such cases :class:`DynamicAutoDiffCostFunction` can be
   used. Like :class:`AutoDiffCostFunction` the user must define a
   templated functor, but the signature of the functor differs
   slightly. The expected interface for the cost functors is:

   .. code-block:: c++

      struct MyCostFunctor {
        template<typename T>
        bool operator()(T const* const* parameters, T* residuals) const {
        }
      };

   Since the sizing of the parameters is done at runtime, you must
   also specify the sizes after creating the dynamic autodiff cost
   function. For example:

   .. code-block:: c++

      DynamicAutoDiffCostFunction<MyCostFunctor, 4>* cost_function =
          new DynamicAutoDiffCostFunction<MyCostFunctor, 4>(
              new MyCostFunctor());
      cost_function->AddParameterBlock(5);
      cost_function->AddParameterBlock(10);
      cost_function->SetNumResiduals(21);

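   As a concrete sketch matching the sizes used above (two parameter
   blocks of sizes 5 and 10, and 21 residuals), a functor might look
   like the following; the specific residual definition is made up
   purely for illustration:

   .. code-block:: c++

      struct MyCostFunctor {
        template <typename T>
        bool operator()(T const* const* parameters, T* residuals) const {
          const T* block0 = parameters[0];  // size 5
          const T* block1 = parameters[1];  // size 10
          // 21 residuals: the raw entries of both blocks plus the
          // difference of their sums. Purely illustrative.
          T sum0 = T(0.0), sum1 = T(0.0);
          for (int i = 0; i < 5; ++i) {
            residuals[i] = block0[i];
            sum0 += block0[i];
          }
          for (int i = 0; i < 10; ++i) {
            residuals[5 + i] = block1[i];
            sum1 += block1[i];
          }
          residuals[20] = sum0 - sum1;
          return true;
        }
      };
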
   Under the hood, the implementation evaluates the cost function
   multiple times, computing a small set of the derivatives (four by
   default, controlled by the ``Stride`` template parameter) with each
   pass. There is a performance tradeoff with the size of the passes;
   smaller sizes are more cache efficient but result in a larger
   number of passes, and larger stride lengths can destroy
   cache-locality while reducing the number of passes over the cost
   function. The optimal value depends on the number and sizes of the
   various parameter blocks.

   As a rule of thumb, try using :class:`AutoDiffCostFunction` before
   you use :class:`DynamicAutoDiffCostFunction`.

:class:`NumericDiffCostFunction`
================================

.. class:: NumericDiffCostFunction

   In some cases, it is not possible to define a templated cost
   functor, for example when the evaluation of the residual involves a
   call to a library function that you do not have control over. In
   such a situation, `numerical differentiation
   <http://en.wikipedia.org/wiki/Numerical_differentiation>`_ can be
   used.

   .. NOTE ::

     TODO(sameeragarwal): Add documentation for the constructor and for
     NumericDiffOptions. Update DynamicNumericDiffOptions in a similar
     manner.

   .. code-block:: c++

      template <typename CostFunctor,
                NumericDiffMethodType method = CENTRAL,
                int kNumResiduals,  // Number of residuals, or ceres::DYNAMIC.
                int... Ns>          // Size of each parameter block.
      class NumericDiffCostFunction : public
      SizedCostFunction<kNumResiduals, Ns...> {
      };

   To get a numerically differentiated :class:`CostFunction`, you must
   define a class with an ``operator()`` (a functor) that computes the
   residuals. The functor must write the computed value in the last
   argument (the only non-``const`` one) and return ``true`` to
   indicate success. Please see :class:`CostFunction` for details on
   how the return value may be used to impose simple constraints on
   the parameter block. e.g., an object of the form

   .. code-block:: c++

      struct ScalarFunctor {
       public:
        bool operator()(const double* const x1,
                        const double* const x2,
                        double* residuals) const;
      };

   For example, consider a scalar error :math:`e = k - x'y`, where
   both :math:`x` and :math:`y` are two-dimensional column vector
   parameters, the prime sign indicates transposition, and :math:`k`
   is a constant. The form of this error, which is the difference
   between a constant and an expression, is a common pattern in least
   squares problems. For example, the value :math:`x'y` might be the
   model expectation for a series of measurements, where there is an
   instance of the cost function for each measurement :math:`k`.

   To write a numerically-differentiable :class:`CostFunction` for the
   above model, first define the object

   .. code-block:: c++

      class MyScalarCostFunctor {
       public:
        MyScalarCostFunctor(double k) : k_(k) {}

        bool operator()(const double* const x,
                        const double* const y,
                        double* residuals) const {
          residuals[0] = k_ - x[0] * y[0] - x[1] * y[1];
          return true;
        }

       private:
        double k_;
      };

   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to
   arrays of ``double`` s. If there were three input parameters, then
   the third input parameter would come after ``y``. The output is
   always the last parameter, and is also a pointer to an array. In
   the example above, the residual is a scalar, so only
   ``residuals[0]`` is set.

   Then given this class definition, the numerically differentiated
   :class:`CostFunction` with central differences used for computing
   the derivative can be constructed as follows.

   .. code-block:: c++

      CostFunction* cost_function
          = new NumericDiffCostFunction<MyScalarCostFunctor, CENTRAL, 1, 2, 2>(
              new MyScalarCostFunctor(1.0));                    ^     ^  ^  ^
                                                                |     |  |  |
                                Finite Differencing Scheme -----+     |  |  |
                                Dimension of residual ----------------+  |  |
                                Dimension of x ---------------------------+  |
                                Dimension of y ------------------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``1, 2, 2``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   NumericDiffCostFunction also supports cost functions with a
   runtime-determined number of residuals. For example:

   .. code-block:: c++

      CostFunction* cost_function
          = new NumericDiffCostFunction<CostFunctorWithDynamicNumResiduals,
                                        CENTRAL,  // Finite differencing scheme.
                                        DYNAMIC,  // Indicate dynamic number of residuals.
                                        2,        // Dimension of x.
                                        2>(       // Dimension of y.
              new CostFunctorWithDynamicNumResiduals(1.0),
              TAKE_OWNERSHIP,
              runtime_number_of_residuals);  // Actual number of residuals.


   There are three available numeric differentiation schemes in
   ceres-solver:

   The ``FORWARD`` difference method, which approximates :math:`f'(x)`
   by computing :math:`\frac{f(x+h)-f(x)}{h}`, computes the cost
   function one additional time at :math:`x+h`. It is the fastest but
   least accurate method.

   The ``CENTRAL`` difference method is more accurate at the cost of
   twice as many function evaluations as forward difference,
   estimating :math:`f'(x)` by computing
   :math:`\frac{f(x+h)-f(x-h)}{2h}`.

   The ``RIDDERS`` difference method [Ridders]_ is an adaptive scheme
   that estimates derivatives by performing multiple central
   differences at varying scales. Specifically, the algorithm starts
   at a certain :math:`h` and as the derivative is estimated, this
   step size decreases. To conserve function evaluations and estimate
   the derivative error, the method performs Richardson extrapolations
   between the tested step sizes. The algorithm exhibits considerably
   higher accuracy, but does so by additional evaluations of the cost
   function.

   Consider using ``CENTRAL`` differences to begin with. Based on the
   results, either try forward difference to improve performance or
   Ridders' method to improve accuracy.
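
   Switching schemes is just a matter of changing the ``method``
   template parameter. For example, a sketch of the same cost function
   as above, but using Ridders' method:

   .. code-block:: c++

      CostFunction* cost_function
          = new NumericDiffCostFunction<MyScalarCostFunctor, RIDDERS, 1, 2, 2>(
              new MyScalarCostFunctor(1.0));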

   **WARNING** A common beginner's error when first using
   :class:`NumericDiffCostFunction` is to get the sizing wrong. In
   particular, there is a tendency to set the template parameters to
   (dimension of residual, number of parameters) instead of passing a
   dimension parameter for *every parameter block*. In the example
   above, that would be ``<MyScalarCostFunctor, 1, 2>``, which is
   missing the last ``2`` argument. Please be careful when setting
   the size parameters.


Numeric Differentiation & LocalParameterization
-----------------------------------------------

   If your cost function depends on a parameter block that must lie on
   a manifold and the functor cannot be evaluated for values of that
   parameter block not on the manifold then you may have problems
   numerically differentiating such functors.

   This is because numeric differentiation in Ceres is performed by
   perturbing the individual coordinates of the parameter blocks that
   a cost functor depends on. In doing so, we assume that the
   parameter blocks live in an Euclidean space and ignore the
   structure of the manifold that they live on. As a result some of
   the perturbations may not lie on the manifold corresponding to the
   parameter block.

   For example consider a four dimensional parameter block that is
   interpreted as a unit Quaternion. Perturbing the coordinates of
   this parameter block will violate the unit norm property of the
   parameter block.

   Fixing this problem requires that :class:`NumericDiffCostFunction`
   be aware of the :class:`LocalParameterization` associated with each
   parameter block and only generate perturbations in the local
   tangent space of each parameter block.

   For now this is not considered to be a serious enough problem to
   warrant changing the :class:`NumericDiffCostFunction` API. Further,
   in most cases it is relatively straightforward to project a point
   off the manifold back onto the manifold before using it in the
   functor. For example in case of the Quaternion, normalizing the
   4-vector before using it does the trick.
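
   As a sketch of that trick, a numeric-diff functor can renormalize
   the quaternion before feeding it to the rest of the computation;
   here ``ComputeResidualFromUnitQuaternion`` is a hypothetical helper
   standing in for the actual model:

   .. code-block:: c++

      struct QuaternionFunctor {
        bool operator()(const double* const q_raw, double* residuals) const {
          // Project the perturbed 4-vector back onto the unit sphere.
          const double norm =
              std::sqrt(q_raw[0] * q_raw[0] + q_raw[1] * q_raw[1] +
                        q_raw[2] * q_raw[2] + q_raw[3] * q_raw[3]);
          const double q[4] = {q_raw[0] / norm, q_raw[1] / norm,
                               q_raw[2] / norm, q_raw[3] / norm};
          return ComputeResidualFromUnitQuaternion(q, residuals);
        }
      };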

   **Alternate Interface**

   For a variety of reasons, including compatibility with legacy code,
   :class:`NumericDiffCostFunction` can also take
   :class:`CostFunction` objects as input. The following describes
   how.

   To get a numerically differentiated cost function, define a
   subclass of :class:`CostFunction` such that the
   :func:`CostFunction::Evaluate` function ignores the ``jacobians``
   parameter. The numeric differentiation wrapper will fill in the
   jacobian parameter if necessary by repeatedly calling the
   :func:`CostFunction::Evaluate` with small changes to the
   appropriate parameters, and computing the slope. For performance,
   the numeric differentiation wrapper class is templated on the
   concrete cost function, even though it could be implemented only in
   terms of the :class:`CostFunction` interface.

   The numerically differentiated version of a given cost function
   can be constructed as follows:

   .. code-block:: c++

      CostFunction* cost_function
          = new NumericDiffCostFunction<MyCostFunction, CENTRAL, 1, 4, 8>(
              new MyCostFunction(...), TAKE_OWNERSHIP);

   where ``MyCostFunction`` has 1 residual and 2 parameter blocks with
   sizes 4 and 8 respectively. Look at the tests for a more detailed
   example.

:class:`DynamicNumericDiffCostFunction`
=======================================

.. class:: DynamicNumericDiffCostFunction

   Like :class:`AutoDiffCostFunction`,
   :class:`NumericDiffCostFunction` requires that the number of
   parameter blocks and their sizes be known at compile time. In a
   number of applications, this is not enough.

   .. code-block:: c++

      template <typename CostFunctor, NumericDiffMethodType method = CENTRAL>
      class DynamicNumericDiffCostFunction : public CostFunction {
      };

   In such cases when numeric differentiation is desired,
   :class:`DynamicNumericDiffCostFunction` can be used.

   Like :class:`NumericDiffCostFunction` the user must define a
   functor, but the signature of the functor differs slightly. The
   expected interface for the cost functors is:

   .. code-block:: c++

      struct MyCostFunctor {
        bool operator()(double const* const* parameters, double* residuals) const {
        }
      };

   Since the sizing of the parameters is done at runtime, you must
   also specify the sizes after creating the dynamic numeric diff cost
   function. For example:

   .. code-block:: c++

      DynamicNumericDiffCostFunction<MyCostFunctor>* cost_function =
          new DynamicNumericDiffCostFunction<MyCostFunctor>(new MyCostFunctor);
      cost_function->AddParameterBlock(5);
      cost_function->AddParameterBlock(10);
      cost_function->SetNumResiduals(21);

   As a rule of thumb, try using :class:`NumericDiffCostFunction`
   before you use :class:`DynamicNumericDiffCostFunction`.

   **WARNING** The same caution about mixing local parameterizations
   with numeric differentiation applies as is the case with
   :class:`NumericDiffCostFunction`.

:class:`CostFunctionToFunctor`
==============================

.. class:: CostFunctionToFunctor

   :class:`CostFunctionToFunctor` is an adapter class that allows
   users to use :class:`CostFunction` objects in templated functors
   which are to be used for automatic differentiation. This allows
   the user to seamlessly mix analytic, numeric and automatic
   differentiation.

   For example, let us assume that

   .. code-block:: c++

      class IntrinsicProjection : public SizedCostFunction<2, 5, 3> {
       public:
        IntrinsicProjection(const double* observation);
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const;
      };

   is a :class:`CostFunction` that implements the projection of a
   point in its local coordinate system onto its image plane and
   subtracts it from the observed point projection. It can compute
   its residuals, and its Jacobians either via analytic or numerical
   differentiation.

   Now we would like to compose the action of this
   :class:`CostFunction` with the action of camera extrinsics, i.e.,
   rotation and translation. Say we have a templated function

   .. code-block:: c++

      template<typename T>
      void RotateAndTranslatePoint(const T* rotation,
                                   const T* translation,
                                   const T* point,
                                   T* result);

   Then we can do the following,

   .. code-block:: c++

      struct CameraProjection {
        CameraProjection(double* observation)
            : intrinsic_projection_(new IntrinsicProjection(observation)) {
        }

        template <typename T>
        bool operator()(const T* rotation,
                        const T* translation,
                        const T* intrinsics,
                        const T* point,
                        T* residual) const {
          T transformed_point[3];
          RotateAndTranslatePoint(rotation, translation, point, transformed_point);

          // Note that we call intrinsic_projection_, just like it was
          // any other templated functor.
          return intrinsic_projection_(intrinsics, transformed_point, residual);
        }

       private:
        CostFunctionToFunctor<2, 5, 3> intrinsic_projection_;
      };

   Note that :class:`CostFunctionToFunctor` takes ownership of the
   :class:`CostFunction` that was passed in to the constructor.

   In the above example, we assumed that ``IntrinsicProjection`` is a
   ``CostFunction`` capable of evaluating its value and its
   derivatives. Suppose, if that were not the case and
   ``IntrinsicProjection`` was defined as follows:

   .. code-block:: c++

      struct IntrinsicProjection {
        IntrinsicProjection(const double* observation) {
          observation_[0] = observation[0];
          observation_[1] = observation[1];
        }

        bool operator()(const double* calibration,
                        const double* point,
                        double* residuals) const {
          double projection[2];
          ThirdPartyProjectionFunction(calibration, point, projection);
          residuals[0] = observation_[0] - projection[0];
          residuals[1] = observation_[1] - projection[1];
          return true;
        }
        double observation_[2];
      };


   Here ``ThirdPartyProjectionFunction`` is some third party library
   function that we have no control over. So this function can compute
   its value and we would like to use numeric differentiation to
   compute its derivatives. In this case we can use a combination of
   ``NumericDiffCostFunction`` and ``CostFunctionToFunctor`` to get the
   job done.

   .. code-block:: c++

      struct CameraProjection {
        CameraProjection(double* observation)
            : intrinsic_projection_(
                  new NumericDiffCostFunction<IntrinsicProjection, CENTRAL, 2, 5, 3>(
                      new IntrinsicProjection(observation))) {}

        template <typename T>
        bool operator()(const T* rotation,
                        const T* translation,
                        const T* intrinsics,
                        const T* point,
                        T* residuals) const {
          T transformed_point[3];
          RotateAndTranslatePoint(rotation, translation, point, transformed_point);
          return intrinsic_projection_(intrinsics, transformed_point, residuals);
        }

       private:
        CostFunctionToFunctor<2, 5, 3> intrinsic_projection_;
      };

:class:`DynamicCostFunctionToFunctor`
=====================================

.. class:: DynamicCostFunctionToFunctor

   :class:`DynamicCostFunctionToFunctor` provides the same
   functionality as :class:`CostFunctionToFunctor` for cases where the
   number and size of the parameter vectors and residuals are not
   known at compile-time. The API provided by
   :class:`DynamicCostFunctionToFunctor` matches what would be
   expected by :class:`DynamicAutoDiffCostFunction`, i.e. it provides
   a templated functor of this form:

   .. code-block:: c++

      template<typename T>
      bool operator()(T const* const* parameters, T* residuals) const;

   Similar to the example given for :class:`CostFunctionToFunctor`,
   let us assume that

   .. code-block:: c++

      class IntrinsicProjection : public CostFunction {
       public:
        IntrinsicProjection(const double* observation);
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const;
      };

   is a :class:`CostFunction` that projects a point in its local
   coordinate system onto its image plane and subtracts it from the
   observed point projection.

   Using this :class:`CostFunction` in a templated functor would then
   look like this:

   .. code-block:: c++

      struct CameraProjection {
        CameraProjection(double* observation)
            : intrinsic_projection_(new IntrinsicProjection(observation)) {
        }

        template <typename T>
        bool operator()(T const* const* parameters,
                        T* residual) const {
          const T* rotation = parameters[0];
          const T* translation = parameters[1];
          const T* intrinsics = parameters[2];
          const T* point = parameters[3];

          T transformed_point[3];
          RotateAndTranslatePoint(rotation, translation, point, transformed_point);

          const T* projection_parameters[2];
          projection_parameters[0] = intrinsics;
          projection_parameters[1] = transformed_point;
          return intrinsic_projection_(projection_parameters, residual);
        }

       private:
        DynamicCostFunctionToFunctor intrinsic_projection_;
      };

   Like :class:`CostFunctionToFunctor`,
   :class:`DynamicCostFunctionToFunctor` takes ownership of the
   :class:`CostFunction` that was passed in to the constructor.

:class:`ConditionedCostFunction`
================================

.. class:: ConditionedCostFunction

   This class allows you to apply different conditioning to the
   residual values of a wrapped cost function. An example where this
   is useful is where you have an existing cost function that produces
   N values, but you want the total cost to be something other than
   just the sum of these squared values - maybe you want to apply a
   different scaling to some values, to change their contribution to
   the cost.

   Usage:

   .. code-block:: c++

      // my_cost_function produces N residuals
      CostFunction* my_cost_function = ...
      CHECK_EQ(N, my_cost_function->num_residuals());
      vector<CostFunction*> conditioners;

      // Make N 1x1 cost functions (1 parameter, 1 residual)
      CostFunction* f_1 = ...
      conditioners.push_back(f_1);

      CostFunction* f_N = ...
      conditioners.push_back(f_N);
      ConditionedCostFunction* ccf =
          new ConditionedCostFunction(my_cost_function, conditioners);

   Now ``ccf``'s ``residual[i]`` (i=0..N-1) will be passed through the
   :math:`i^{\text{th}}` conditioner.

   .. code-block:: c++

      ccf_residual[i] = f_i(my_cost_function_residual[i])

   and the Jacobian will be affected appropriately.
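
   A conditioner is itself just a 1x1 cost function, so it can be
   built with :class:`AutoDiffCostFunction`. The functor below is a
   hypothetical sketch of one way to create one of the ``f_i`` above,
   scaling and offsetting a single residual:

   .. code-block:: c++

      // f_i(r) = a * r + b, applied to the i-th residual.
      struct LinearConditioner {
        LinearConditioner(double a, double b) : a_(a), b_(b) {}

        template <typename T>
        bool operator()(const T* const r, T* out) const {
          out[0] = a_ * r[0] + b_;
          return true;
        }

        double a_, b_;
      };

      // A 1x1 cost function (1 parameter, 1 residual).
      CostFunction* f_i =
          new AutoDiffCostFunction<LinearConditioner, 1, 1>(
              new LinearConditioner(2.0, 0.0));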


:class:`GradientChecker`
========================

.. class:: GradientChecker

   This class compares the Jacobians returned by a cost function
   against derivatives estimated using finite differencing. It is
   meant as a tool for unit testing, giving you more fine-grained
   control than the ``check_gradients`` option in the solver options.

   The condition enforced is that

   .. math:: \forall{i,j}: \frac{J_{ij} - J'_{ij}}{\max_{ij}(J_{ij} - J'_{ij})} < r

   where :math:`J_{ij}` is the Jacobian as computed by the supplied
   cost function (by the user) multiplied by the local
   parameterization Jacobian, :math:`J'_{ij}` is the Jacobian as
   computed by finite differences, multiplied by the local
   parameterization Jacobian as well, and :math:`r` is the relative
   precision.

   Usage:

   .. code-block:: c++

      // my_cost_function takes two parameter blocks. The first has a
      // local parameterization associated with it.
      CostFunction* my_cost_function = ...
      LocalParameterization* my_parameterization = ...
      NumericDiffOptions numeric_diff_options;

      std::vector<LocalParameterization*> local_parameterizations;
      local_parameterizations.push_back(my_parameterization);
      local_parameterizations.push_back(nullptr);

      std::vector<double> parameter1;
      std::vector<double> parameter2;
      // Fill parameter 1 & 2 with test data...

      std::vector<double*> parameter_blocks;
      parameter_blocks.push_back(parameter1.data());
      parameter_blocks.push_back(parameter2.data());

      GradientChecker gradient_checker(my_cost_function,
                                       local_parameterizations,
                                       numeric_diff_options);
      GradientChecker::ProbeResults results;
      if (!gradient_checker.Probe(parameter_blocks.data(), 1e-9, &results)) {
        LOG(ERROR) << "An error has occurred:\n" << results.error_log;
      }


:class:`NormalPrior`
====================

.. class:: NormalPrior

   .. code-block:: c++

      class NormalPrior: public CostFunction {
       public:
        // Check that the number of rows in the vector b is the same
        // as the number of columns in the matrix A; crash otherwise.
        NormalPrior(const Matrix& A, const Vector& b);

        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const;
      };

   Implements a cost function of the form

   .. math:: cost(x) = ||A(x - b)||^2

   where, the matrix :math:`A` and the vector :math:`b` are fixed and
   :math:`x` is the variable. In case the user is interested in
   implementing a cost function of the form

   .. math:: cost(x) = (x - \mu)^T S^{-1} (x - \mu)

   where, :math:`\mu` is a vector and :math:`S` is a covariance
   matrix, then, :math:`A = S^{-1/2}`, i.e., the matrix :math:`A` is
   the square root of the inverse of the covariance, also known as the
   stiffness matrix. There are however no restrictions on the shape of
   :math:`A`. It is free to be rectangular, which would be the case if
   the covariance matrix :math:`S` is rank deficient.

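   As a usage sketch, the following adds a Gaussian prior with mean
   ``mu`` and an isotropic covariance :math:`\sigma^2 I` (so that
   :math:`A = \sigma^{-1} I`) on a 3-dimensional parameter block
   ``x``; an existing ``problem`` is assumed, and ``ceres::Matrix``
   and ``ceres::Vector`` are the Eigen typedefs used by Ceres:

   .. code-block:: c++

      double x[3] = {0.0, 0.0, 0.0};

      const double sigma = 0.1;
      ceres::Vector mu(3);
      mu << 1.0, 2.0, 3.0;
      // A = S^{-1/2} for the isotropic covariance S = sigma^2 * I.
      ceres::Matrix A = ceres::Matrix::Identity(3, 3) / sigma;

      problem.AddResidualBlock(new ceres::NormalPrior(A, mu), nullptr, x);
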

.. _`section-loss_function`:

:class:`LossFunction`
=====================

.. class:: LossFunction

   For least squares problems where the minimization may encounter
   input terms that contain outliers, that is, completely bogus
   measurements, it is important to use a loss function that reduces
   their influence.

   Consider a structure from motion problem. The unknowns are 3D
   points and camera parameters, and the measurements are image
   coordinates describing the expected reprojected position for a
   point in a camera. For example, we want to model the geometry of a
   street scene with fire hydrants and cars, observed by a moving
   camera with unknown parameters, and the only 3D points we care
   about are the pointy tippy-tops of the fire hydrants. Our magic
   image processing algorithm, which is responsible for producing the
   measurements that are input to Ceres, has found and matched all
   such tippy-tops in all image frames, except that in one of the
   frames it mistook a car's headlight for a hydrant. If we didn't do
   anything special the residual for the erroneous measurement will
   result in the entire solution getting pulled away from the optimum
   to reduce the large error that would otherwise be attributed to the
   wrong measurement.

   Using a robust loss function, the cost for large residuals is
   reduced. In the example above, this leads to outlier terms getting
   down-weighted so they do not overly influence the final solution.

   .. code-block:: c++

      class LossFunction {
       public:
        virtual void Evaluate(double s, double out[3]) const = 0;
      };


   The key method is :func:`LossFunction::Evaluate`, which given a
   non-negative scalar ``s``, computes

   .. math:: out = \begin{bmatrix}\rho(s), & \rho'(s), & \rho''(s)\end{bmatrix}

   Here the convention is that the contribution of a term to the cost
   function is given by :math:`\frac{1}{2}\rho(s)`, where :math:`s
   =\|f_i\|^2`. Calling the method with a negative value of :math:`s`
   is an error and the implementations are not required to handle that
   case.

   Most sane choices of :math:`\rho` satisfy:

   .. math::

      \rho(0) &= 0\\
      \rho'(0) &= 1\\
      \rho'(s) &< 1 \text{ in the outlier region}\\
      \rho''(s) &< 0 \text{ in the outlier region}

   so that they mimic the squared cost for small residuals.

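   As a sketch of the ``Evaluate`` contract, a hand-rolled version of
   the Huber loss described below might look like this (Ceres already
   ships :class:`HuberLoss`; this is only to illustrate the ``out[3]``
   convention):

   .. code-block:: c++

      class MyHuberLoss : public ceres::LossFunction {
       public:
        void Evaluate(double s, double out[3]) const override {
          if (s <= 1.0) {
            // Inlier region: behave like the squared cost.
            out[0] = s;    // rho(s)
            out[1] = 1.0;  // rho'(s)
            out[2] = 0.0;  // rho''(s)
          } else {
            const double r = std::sqrt(s);
            out[0] = 2.0 * r - 1.0;  // rho(s)
            out[1] = 1.0 / r;        // rho'(s)
            out[2] = -0.5 / (s * r); // rho''(s)
          }
        }
      };
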
   **Scaling**

   Given one robustifier :math:`\rho(s)` one can change the length
   scale at which robustification takes place, by adding a scale
   factor :math:`a > 0` which gives us :math:`\rho(s,a) = a^2 \rho(s /
   a^2)` and the first and second derivatives as :math:`\rho'(s /
   a^2)` and :math:`(1 / a^2) \rho''(s / a^2)` respectively.


   The reason for the appearance of squaring is that :math:`a` is in
   the units of the residual vector norm whereas :math:`s` is a
   squared norm. For applications it is more convenient to specify
   :math:`a` than its square.

Instances
---------

Ceres includes a number of predefined loss functions. For simplicity
we described their unscaled versions. The figure below illustrates
their shape graphically. More details can be found in
``include/ceres/loss_function.h``.

.. figure:: loss.png
   :figwidth: 500px
   :height: 400px
   :align: center

   Shape of the various common loss functions.

.. class:: TrivialLoss

   .. math:: \rho(s) = s

.. class:: HuberLoss

   .. math:: \rho(s) = \begin{cases} s & s \le 1\\ 2 \sqrt{s} - 1 & s > 1 \end{cases}

.. class:: SoftLOneLoss

   .. math:: \rho(s) = 2 (\sqrt{1+s} - 1)

.. class:: CauchyLoss

   .. math:: \rho(s) = \log(1 + s)

.. class:: ArctanLoss

   .. math:: \rho(s) = \arctan(s)

.. class:: TolerantLoss

   .. math:: \rho(s,a,b) = b \log(1 + e^{(s - a) / b}) - b \log(1 + e^{-a / b})

.. class:: ComposedLoss

   Given two loss functions ``f`` and ``g``, implements the loss
   function ``h(s) = f(g(s))``.

   .. code-block:: c++

      class ComposedLoss : public LossFunction {
       public:
        explicit ComposedLoss(const LossFunction* f,
                              Ownership ownership_f,
                              const LossFunction* g,
                              Ownership ownership_g);
      };

.. class:: ScaledLoss

   Sometimes you want to simply scale the output value of the
   robustifier. For example, you might want to weight different error
   terms differently (e.g., weight pixel reprojection errors
   differently from terrain errors).

   Given a loss function :math:`\rho(s)` and a scalar :math:`a`,
   :class:`ScaledLoss` implements the function :math:`a \rho(s)`.

   Since we treat a ``nullptr`` Loss function as the Identity loss
   function, :math:`\rho` = ``nullptr`` is a valid input and will
   result in the input being scaled by :math:`a`. This provides a
   simple way of implementing a scaled ResidualBlock.
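
   For example, a sketch of down-weighting a residual block by a
   factor of 0.25 relative to its Cauchy-robustified cost, assuming an
   existing ``problem``, ``cost_function`` and parameter block ``x``:

   .. code-block:: c++

      LossFunction* loss =
          new ScaledLoss(new CauchyLoss(1.0), 0.25, TAKE_OWNERSHIP);
      problem.AddResidualBlock(cost_function, loss, x);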

.. class:: LossFunctionWrapper

   Sometimes after the optimization problem has been constructed, we
   wish to mutate the scale of the loss function. For example, when
   performing estimation from data which has substantial outliers,
   convergence can be improved by starting out with a large scale,
   optimizing the problem and then reducing the scale. This can have
   better convergence behavior than just using a loss function with a
   small scale.

   This class allows the user to implement a loss function whose
   scale can be mutated after an optimization problem has been
   constructed, e.g,

   .. code-block:: c++

      Problem problem;

      // Add parameter blocks

      CostFunction* cost_function =
          new AutoDiffCostFunction<UW_Camera_Mapper, 2, 9, 3>(
              new UW_Camera_Mapper(feature_x, feature_y));

      LossFunctionWrapper* loss_function =
          new LossFunctionWrapper(new HuberLoss(1.0), TAKE_OWNERSHIP);
      problem.AddResidualBlock(cost_function, loss_function, parameters);

      Solver::Options options;
      Solver::Summary summary;
      Solve(options, &problem, &summary);

      loss_function->Reset(new HuberLoss(1.0), TAKE_OWNERSHIP);
      Solve(options, &problem, &summary);


Theory
------

Let us consider a problem with a single parameter block.

.. math::

   \min_x \frac{1}{2}\rho(f^2(x))


Then, the robustified gradient and the Gauss-Newton Hessian are

.. math::

   g(x) &= \rho'J^\top(x)f(x)\\
   H(x) &= J^\top(x)\left(\rho' + 2 \rho''f(x)f^\top(x)\right)J(x)

where the terms involving the second derivatives of :math:`f(x)` have
been ignored. Note that :math:`H(x)` is indefinite if
:math:`\rho''f(x)^\top f(x) + \frac{1}{2}\rho' < 0`. If this is not
the case, then it is possible to re-weight the residual and the
Jacobian matrix such that the robustified Gauss-Newton step
corresponds to an ordinary linear least squares problem.

Let :math:`\alpha` be a root of

.. math:: \frac{1}{2}\alpha^2 - \alpha - \frac{\rho''}{\rho'}\|f(x)\|^2 = 0.

Then, define the rescaled residual and Jacobian as

.. math::

   \tilde{f}(x) &= \frac{\sqrt{\rho'}}{1 - \alpha} f(x)\\
   \tilde{J}(x) &= \sqrt{\rho'}\left(1 - \alpha
   \frac{f(x)f^\top(x)}{\left\|f(x)\right\|^2} \right)J(x)

In the case :math:`2 \rho''\left\|f(x)\right\|^2 + \rho' \lesssim 0`,
we limit :math:`\alpha \le 1- \epsilon` for some small
:math:`\epsilon`. For more details see [Triggs]_.

With this simple rescaling, one can apply any Jacobian based
non-linear least squares algorithm to robustified non-linear least
squares problems.

| 1180 | |
| 1181 | :class:`LocalParameterization` |
| 1182 | ============================== |
| 1183 | |
| 1184 | .. class:: LocalParameterization |
| 1185 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1186 | In many optimization problems, especially sensor fusion problems, |
| 1187 | one has to model quantities that live in spaces known as `Manifolds |
| 1188 | <https://en.wikipedia.org/wiki/Manifold>`_ , for example the |
| 1189 | rotation/orientation of a sensor that is represented by a |
| 1190 | `Quaternion |
| 1191 | <https://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation>`_. |
| 1192 | |
| 1193 | Manifolds are spaces, which locally look like Euclidean spaces. More |
| 1194 | precisely, at each point on the manifold there is a linear space |
| 1195 | that is tangent to the manifold. It has dimension equal to the |
| 1196 | intrinsic dimension of the manifold itself, which is less than or |
| 1197 | equal to the ambient space in which the manifold is embedded. |
| 1198 | |
| 1199 | For example, the tangent space to a point on a sphere in three |
| 1200 | dimensions is the two dimensional plane that is tangent to the |
| 1201 | sphere at that point. There are two reasons tangent spaces are |
| 1202 | interesting: |
| 1203 | |
| 1204 | 1. They are Euclidean spaces, so the usual vector space operations |
| 1205 | apply there, which makes numerical operations easy. |
| 1206 | |
2. Movements in the tangent space translate into movements along the
   manifold. Movements perpendicular to the tangent space do not
   translate into movements on the manifold.

   Returning to our sphere example, moving in the two dimensional
   plane tangent to the sphere and projecting back onto the sphere
   will move you away from the point you started from, but moving
   along the normal at the same point and then projecting back onto
   the sphere brings you back to that point.
| 1216 | |
| 1217 | Besides the mathematical niceness, modeling manifold valued |
| 1218 | quantities correctly and paying attention to their geometry has |
| 1219 | practical benefits too: |
| 1220 | |
1. It naturally constrains the quantity to the manifold throughout
   the optimization, freeing the user from hacks like *quaternion
   normalization*.
| 1224 | |
| 1225 | 2. It reduces the dimension of the optimization problem to its |
*natural* size. For example, a quantity restricted to a line is a
| 1227 | one dimensional object regardless of the dimension of the ambient |
| 1228 | space in which this line lives. |
| 1229 | |
Working in the tangent space not only reduces the computational
complexity of the optimization algorithm, but also improves its
numerical behaviour.
| 1233 | |
A basic operation one can perform on a manifold is the
:math:`\boxplus` operation that computes the result of moving along
:math:`\Delta` in the tangent space at :math:`x`, and then projecting
back onto the manifold that :math:`x` belongs to. Also known as a
*Retraction*, :math:`\boxplus` is a generalization of vector addition
in Euclidean spaces. Formally, :math:`\boxplus` is a smooth map from a
manifold :math:`\mathcal{M}` and its tangent space
:math:`T_\mathcal{M}` to the manifold :math:`\mathcal{M}` that obeys
the identity
| 1243 | |
| 1244 | .. math:: \boxplus(x, 0) = x,\quad \forall x. |
| 1245 | |
| 1246 | That is, it ensures that the tangent space is *centered* at :math:`x` |
| 1247 | and the zero vector is the identity element. For more see |
| 1248 | [Hertzberg]_ and section A.6.9 of [HartleyZisserman]_. |
| 1249 | |
| 1250 | Let us consider two examples: |
| 1251 | |
| 1252 | The Euclidean space :math:`R^n` is the simplest example of a |
| 1253 | manifold. It has dimension :math:`n` (and so does its tangent space) |
| 1254 | and :math:`\boxplus` is the familiar vector sum operation. |
| 1255 | |
| 1256 | .. math:: \boxplus(x, \Delta) = x + \Delta |
| 1257 | |
| 1258 | A more interesting case is :math:`SO(3)`, the special orthogonal |
| 1259 | group in three dimensions - the space of 3x3 rotation |
| 1260 | matrices. :math:`SO(3)` is a three dimensional manifold embedded in |
| 1261 | :math:`R^9` or :math:`R^{3\times 3}`. |
| 1262 | |
| 1263 | :math:`\boxplus` on :math:`SO(3)` is defined using the *Exponential* |
| 1264 | map, from the tangent space (:math:`R^3`) to the manifold. The |
| 1265 | Exponential map :math:`\operatorname{Exp}` is defined as: |
| 1266 | |
| 1267 | .. math:: |
| 1268 | |
| 1269 | \operatorname{Exp}([p,q,r]) = \left [ \begin{matrix} |
| 1270 | \cos \theta + cp^2 & -sr + cpq & sq + cpr \\ |
| 1271 | sr + cpq & \cos \theta + cq^2& -sp + cqr \\ |
| 1272 | -sq + cpr & sp + cqr & \cos \theta + cr^2 |
| 1273 | \end{matrix} \right ] |
| 1274 | |
| 1275 | where, |
| 1276 | |
| 1277 | .. math:: |
| 1278 | \theta = \sqrt{p^2 + q^2 + r^2}, s = \frac{\sin \theta}{\theta}, |
| 1279 | c = \frac{1 - \cos \theta}{\theta^2}. |
| 1280 | |
| 1281 | Then, |
| 1282 | |
| 1283 | .. math:: |
| 1284 | |
| 1285 | \boxplus(x, \Delta) = x \operatorname{Exp}(\Delta) |
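
As an illustration, a minimal sketch of this :math:`\boxplus` using
Eigen (the function name here is ours, not part of Ceres):

.. code-block:: c++

   #include "Eigen/Geometry"

   // x is a rotation matrix in SO(3) and delta a vector in the three
   // dimensional tangent space. Exp(delta) is the rotation by angle
   // |delta| about the axis delta / |delta|.
   Eigen::Matrix3d BoxPlus(const Eigen::Matrix3d& x,
                           const Eigen::Vector3d& delta) {
     const double theta = delta.norm();
     if (theta == 0.0) return x;
     return x * Eigen::AngleAxisd(theta, delta / theta).toRotationMatrix();
   }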
| 1286 | |
| 1287 | The ``LocalParameterization`` interface allows the user to define |
| 1288 | and associate with parameter blocks the manifold that they belong |
| 1289 | to. It does so by defining the ``Plus`` (:math:`\boxplus`) operation |
| 1290 | and its derivative with respect to :math:`\Delta` at :math:`\Delta = |
| 1291 | 0`. |
| 1292 | |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1293 | .. code-block:: c++ |
| 1294 | |
| 1295 | class LocalParameterization { |
| 1296 | public: |
| 1297 | virtual ~LocalParameterization() {} |
| 1298 | virtual bool Plus(const double* x, |
| 1299 | const double* delta, |
| 1300 | double* x_plus_delta) const = 0; |
| 1301 | virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0; |
| 1302 | virtual bool MultiplyByJacobian(const double* x, |
| 1303 | const int num_rows, |
| 1304 | const double* global_matrix, |
| 1305 | double* local_matrix) const; |
| 1306 | virtual int GlobalSize() const = 0; |
| 1307 | virtual int LocalSize() const = 0; |
| 1308 | }; |
| 1309 | |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1310 | |
| 1311 | .. function:: int LocalParameterization::GlobalSize() |
| 1312 | |
| 1313 | The dimension of the ambient space in which the parameter block |
| 1314 | :math:`x` lives. |
| 1315 | |
| 1316 | .. function:: int LocalParameterization::LocalSize() |
| 1317 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1318 | The size of the tangent space that :math:`\Delta` lives in. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1319 | |
| 1320 | .. function:: bool LocalParameterization::Plus(const double* x, const double* delta, double* x_plus_delta) const |
| 1321 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1322 | :func:`LocalParameterization::Plus` implements :math:`\boxplus(x,\Delta)`. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1323 | |
| 1324 | .. function:: bool LocalParameterization::ComputeJacobian(const double* x, double* jacobian) const |
| 1325 | |
| 1326 | Computes the Jacobian matrix |
| 1327 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1328 | .. math:: J = D_2 \boxplus(x, 0) |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1329 | |
| 1330 | in row major form. |
| 1331 | |
.. function:: bool LocalParameterization::MultiplyByJacobian(const double* x, const int num_rows, const double* global_matrix, double* local_matrix) const
| 1333 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1334 | ``local_matrix = global_matrix * jacobian`` |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1335 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1336 | ``global_matrix`` is a ``num_rows x GlobalSize`` row major matrix. |
| 1337 | ``local_matrix`` is a ``num_rows x LocalSize`` row major matrix. |
| 1338 | ``jacobian`` is the matrix returned by :func:`LocalParameterization::ComputeJacobian` at :math:`x`. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1339 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1340 | This is only used by :class:`GradientProblem`. For most normal |
| 1341 | uses, it is okay to use the default implementation. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1342 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1343 | Ceres Solver ships with a number of commonly used instances of |
| 1344 | :class:`LocalParameterization`. Another great place to find high |
| 1345 | quality implementations of :math:`\boxplus` operations on a variety of |
| 1346 | manifolds is the `Sophus <https://github.com/strasdat/Sophus>`_ |
| 1347 | library developed by Hauke Strasdat and his collaborators. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1348 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1349 | :class:`IdentityParameterization` |
| 1350 | --------------------------------- |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1351 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1352 | A trivial version of :math:`\boxplus` is when :math:`\Delta` is of the |
| 1353 | same size as :math:`x` and |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1354 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1355 | .. math:: \boxplus(x, \Delta) = x + \Delta |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1356 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1357 | This is the same as :math:`x` living in a Euclidean manifold. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1358 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1359 | :class:`QuaternionParameterization` |
| 1360 | ----------------------------------- |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1361 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1362 | Another example that occurs commonly in Structure from Motion problems |
| 1363 | is when camera rotations are parameterized using a quaternion. This is |
| 1364 | a 3-dimensional manifold that lives in 4-dimensional space. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1365 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1366 | .. math:: \boxplus(x, \Delta) = \left[ \cos(|\Delta|), \frac{\sin\left(|\Delta|\right)}{|\Delta|} \Delta \right] * x |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1367 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1368 | The multiplication :math:`*` between the two 4-vectors on the right |
| 1369 | hand side is the standard quaternion product. |
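
A usage sketch (``q`` stores the quaternion with the *real* part
first, i.e. as :math:`[w, x, y, z]`, which is the convention
:class:`QuaternionParameterization` expects):

.. code-block:: c++

   double q[4] = {1.0, 0.0, 0.0, 0.0};  // Identity rotation, w first.
   Problem problem;
   problem.AddParameterBlock(q, 4, new QuaternionParameterization);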
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1370 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1371 | :class:`EigenQuaternionParameterization` |
| 1372 | ---------------------------------------- |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1373 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1374 | `Eigen <http://eigen.tuxfamily.org/index.php?title=Main_Page>`_ uses a |
| 1375 | different internal memory layout for the elements of the quaternion |
| 1376 | than what is commonly used. Specifically, Eigen stores the elements in |
| 1377 | memory as :math:`(x, y, z, w)`, i.e., the *real* part (:math:`w`) is |
| 1378 | stored as the last element. Note, when creating an Eigen quaternion |
| 1379 | through the constructor the elements are accepted in :math:`w, x, y, |
| 1380 | z` order. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1381 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1382 | Since Ceres operates on parameter blocks which are raw ``double`` |
| 1383 | pointers this difference is important and requires a different |
| 1384 | parameterization. :class:`EigenQuaternionParameterization` uses the |
| 1385 | same ``Plus`` operation as :class:`QuaternionParameterization` but |
| 1386 | takes into account Eigen's internal memory element ordering. |
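
For example, a sketch of associating it with an
``Eigen::Quaterniond``:

.. code-block:: c++

   #include "Eigen/Geometry"

   Eigen::Quaterniond q = Eigen::Quaterniond::Identity();
   Problem problem;
   // q.coeffs() is stored as (x, y, z, w), which is exactly the
   // layout EigenQuaternionParameterization expects.
   problem.AddParameterBlock(q.coeffs().data(), 4,
                             new EigenQuaternionParameterization);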
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1387 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1388 | :class:`SubsetParameterization` |
| 1389 | ------------------------------- |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1390 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1391 | Suppose :math:`x` is a two dimensional vector, and the user wishes to |
| 1392 | hold the first coordinate constant. Then, :math:`\Delta` is a scalar |
| 1393 | and :math:`\boxplus` is defined as |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1394 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1395 | .. math:: \boxplus(x, \Delta) = x + \left[ \begin{array}{c} 0 \\ 1 \end{array} \right] \Delta |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1396 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1397 | :class:`SubsetParameterization` generalizes this construction to hold |
| 1398 | any part of a parameter block constant by specifying the set of |
| 1399 | coordinates that are held constant. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1400 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1401 | .. NOTE:: |
| 1402 | It is legal to hold all coordinates of a parameter block to constant |
| 1403 | using a :class:`SubsetParameterization`. It is the same as calling |
| 1404 | :func:`Problem::SetParameterBlockConstant` on that parameter block. |
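
For example, a sketch of holding the first coordinate of a two
dimensional parameter block constant:

.. code-block:: c++

   double x[2] = {1.0, 2.0};
   Problem problem;
   // The block has size 2 and coordinate 0 is held constant.
   problem.AddParameterBlock(x, 2, new SubsetParameterization(2, {0}));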
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1405 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1406 | :class:`HomogeneousVectorParameterization` |
| 1407 | ------------------------------------------ |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1408 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1409 | In computer vision, homogeneous vectors are commonly used to represent |
| 1410 | objects in projective geometry such as points in projective space. One |
| 1411 | example where it is useful to use this over-parameterization is in |
representing points whose triangulation is ill-conditioned. Here it
is advantageous to use homogeneous vectors, instead of Euclidean
vectors, because they can represent points at and near infinity.
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1415 | |
:class:`HomogeneousVectorParameterization` defines a
:class:`LocalParameterization` for an :math:`n-1` dimensional
manifold that is embedded in an :math:`n` dimensional space where the
scale of the vector does not matter, i.e., elements of the
projective space :math:`\mathbb{P}^{n-1}`. It assumes that the last
coordinate of the :math:`n`-vector is the *scalar* component of the
homogeneous vector, i.e., *finite* points in this representation are
those for which the *scalar* component is non-zero.
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1424 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1425 | Further, ``HomogeneousVectorParameterization::Plus`` preserves the |
| 1426 | scale of :math:`x`. |
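
A usage sketch for a point in :math:`\mathbb{P}^3`:

.. code-block:: c++

   double x[4] = {0.0, 0.0, 0.0, 1.0};  // A finite point, scalar last.
   Problem problem;
   problem.AddParameterBlock(
       x, 4, new HomogeneousVectorParameterization(4));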
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1427 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1428 | :class:`LineParameterization` |
| 1429 | ----------------------------- |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1430 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1431 | This class provides a parameterization for lines, where the line is |
| 1432 | defined using an origin point and a direction vector. So the |
| 1433 | parameter vector size needs to be two times the ambient space |
| 1434 | dimension, where the first half is interpreted as the origin point |
| 1435 | and the second half as the direction. This local parameterization is |
a special case of the `Affine Grassmannian manifold
<https://en.wikipedia.org/wiki/Affine_Grassmannian_(manifold)>`_
for the case :math:`\operatorname{Graff}_1(R^n)`.
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1439 | |
Note that this is a parameterization for a line, rather than a point
constrained to lie on a line. It is useful when one wants to optimize
over the space of lines. For example, given :math:`n` distinct points
in 3D (measurements), we want to find the line that minimizes the sum
of squared distances to all the points.
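
A usage sketch for lines in 3D, where the parameter block stores the
origin point followed by the direction, i.e. six values with a four
dimensional tangent space:

.. code-block:: c++

   double line[6] = {0.0, 0.0, 0.0,   // Origin.
                     0.0, 0.0, 1.0};  // Direction.
   Problem problem;
   problem.AddParameterBlock(line, 6, new LineParameterization<3>);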
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1445 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1446 | :class:`ProductParameterization` |
| 1447 | -------------------------------- |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1448 | |
Consider an optimization problem over the space of rigid
transformations :math:`SE(3)`, which is the Cartesian product of
:math:`SO(3)` and :math:`\mathbb{R}^3`. Suppose you are using
Quaternions to represent the rotation. Ceres ships with a local
parameterization for that, and :math:`\mathbb{R}^3` requires no
parameterization, or can use :class:`IdentityParameterization`. So
how do we construct a local parameterization for a parameter block
containing a rigid transformation?
| 1457 | |
In cases where a parameter block is the Cartesian product of a number
of manifolds and you have the local parameterizations of the
individual manifolds available, :class:`ProductParameterization` can
be used to construct a local parameterization of the Cartesian
product. For the case of the rigid transformation, where say you have
a parameter block of size 7, whose first four entries represent the
rotation as a quaternion, a local parameterization can be constructed
as
| 1465 | |
| 1466 | .. code-block:: c++ |
| 1467 | |
| 1468 | ProductParameterization se3_param(new QuaternionParameterization(), |
| 1469 | new IdentityParameterization(3)); |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1470 | |
| 1471 | |
| 1472 | :class:`AutoDiffLocalParameterization` |
| 1473 | ====================================== |
| 1474 | |
| 1475 | .. class:: AutoDiffLocalParameterization |
| 1476 | |
| 1477 | :class:`AutoDiffLocalParameterization` does for |
| 1478 | :class:`LocalParameterization` what :class:`AutoDiffCostFunction` |
| 1479 | does for :class:`CostFunction`. It allows the user to define a |
| 1480 | templated functor that implements the |
| 1481 | :func:`LocalParameterization::Plus` operation and it uses automatic |
| 1482 | differentiation to implement the computation of the Jacobian. |
| 1483 | |
| 1484 | To get an auto differentiated local parameterization, you must |
| 1485 | define a class with a templated operator() (a functor) that computes |
| 1486 | |
| 1487 | .. math:: x' = \boxplus(x, \Delta x), |
| 1488 | |
For example, quaternions have a three dimensional local
parameterization, whose plus operation can be implemented as (taken
| 1491 | from `internal/ceres/autodiff_local_parameterization_test.cc |
| 1492 | <https://ceres-solver.googlesource.com/ceres-solver/+/master/internal/ceres/autodiff_local_parameterization_test.cc>`_ |
| 1493 | ) |
| 1494 | |
| 1495 | .. code-block:: c++ |
| 1496 | |
| 1497 | struct QuaternionPlus { |
| 1498 | template<typename T> |
| 1499 | bool operator()(const T* x, const T* delta, T* x_plus_delta) const { |
| 1500 | const T squared_norm_delta = |
| 1501 | delta[0] * delta[0] + delta[1] * delta[1] + delta[2] * delta[2]; |
| 1502 | |
| 1503 | T q_delta[4]; |
| 1504 | if (squared_norm_delta > 0.0) { |
| 1505 | T norm_delta = sqrt(squared_norm_delta); |
| 1506 | const T sin_delta_by_delta = sin(norm_delta) / norm_delta; |
| 1507 | q_delta[0] = cos(norm_delta); |
| 1508 | q_delta[1] = sin_delta_by_delta * delta[0]; |
| 1509 | q_delta[2] = sin_delta_by_delta * delta[1]; |
| 1510 | q_delta[3] = sin_delta_by_delta * delta[2]; |
| 1511 | } else { |
| 1512 | // We do not just use q_delta = [1,0,0,0] here because that is a |
| 1513 | // constant and when used for automatic differentiation will |
| 1514 | // lead to a zero derivative. Instead we take a first order |
| 1515 | // approximation and evaluate it at zero. |
| 1516 | q_delta[0] = T(1.0); |
| 1517 | q_delta[1] = delta[0]; |
| 1518 | q_delta[2] = delta[1]; |
| 1519 | q_delta[3] = delta[2]; |
| 1520 | } |
| 1521 | |
    QuaternionProduct(q_delta, x, x_plus_delta);
| 1523 | return true; |
| 1524 | } |
| 1525 | }; |
| 1526 | |
| 1527 | Given this struct, the auto differentiated local |
| 1528 | parameterization can now be constructed as |
| 1529 | |
| 1530 | .. code-block:: c++ |
| 1531 | |
   LocalParameterization* local_parameterization =
       new AutoDiffLocalParameterization<QuaternionPlus, 4, 3>;
                                                         |  |
                              Global Size ---------------+  |
                              Local Size -------------------+
| 1537 | |
| 1538 | |
| 1539 | :class:`Problem` |
| 1540 | ================ |
| 1541 | |
| 1542 | .. class:: Problem |
| 1543 | |
| 1544 | :class:`Problem` holds the robustified bounds constrained |
| 1545 | non-linear least squares problem :eq:`ceresproblem_modeling`. To |
| 1546 | create a least squares problem, use the |
:func:`Problem::AddResidualBlock` and
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1548 | :func:`Problem::AddParameterBlock` methods. |
| 1549 | |
For example, a problem containing 3 parameter blocks of sizes 3, 4
and 5 respectively, and two residual blocks of size 2 and 6:
| 1552 | |
| 1553 | .. code-block:: c++ |
| 1554 | |
| 1555 | double x1[] = { 1.0, 2.0, 3.0 }; |
| 1556 | double x2[] = { 1.0, 2.0, 3.0, 5.0 }; |
| 1557 | double x3[] = { 1.0, 2.0, 3.0, 6.0, 7.0 }; |
| 1558 | |
| 1559 | Problem problem; |
    problem.AddResidualBlock(new MyUnaryCostFunction(...), nullptr, x1);
    problem.AddResidualBlock(new MyBinaryCostFunction(...), nullptr, x2, x3);
| 1562 | |
| 1563 | :func:`Problem::AddResidualBlock` as the name implies, adds a |
| 1564 | residual block to the problem. It adds a :class:`CostFunction`, an |
| 1565 | optional :class:`LossFunction` and connects the |
:class:`CostFunction` to a set of parameter blocks.
| 1567 | |
| 1568 | The cost function carries with it information about the sizes of |
| 1569 | the parameter blocks it expects. The function checks that these |
| 1570 | match the sizes of the parameter blocks listed in |
| 1571 | ``parameter_blocks``. The program aborts if a mismatch is |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1572 | detected. ``loss_function`` can be ``nullptr``, in which case the cost |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1573 | of the term is just the squared norm of the residuals. |
| 1574 | |
| 1575 | The user has the option of explicitly adding the parameter blocks |
| 1576 | using :func:`Problem::AddParameterBlock`. This causes additional |
| 1577 | correctness checking; however, :func:`Problem::AddResidualBlock` |
| 1578 | implicitly adds the parameter blocks if they are not present, so |
| 1579 | calling :func:`Problem::AddParameterBlock` explicitly is not |
| 1580 | required. |
| 1581 | |
| 1582 | :func:`Problem::AddParameterBlock` explicitly adds a parameter |
| 1583 | block to the :class:`Problem`. Optionally it allows the user to |
| 1584 | associate a :class:`LocalParameterization` object with the |
| 1585 | parameter block too. Repeated calls with the same arguments are |
| 1586 | ignored. Repeated calls with the same double pointer but a |
| 1587 | different size results in undefined behavior. |
| 1588 | |
| 1589 | You can set any parameter block to be constant using |
| 1590 | :func:`Problem::SetParameterBlockConstant` and undo this using |
| 1591 | :func:`SetParameterBlockVariable`. |
| 1592 | |
| 1593 | In fact you can set any number of parameter blocks to be constant, |
| 1594 | and Ceres is smart enough to figure out what part of the problem |
| 1595 | you have constructed depends on the parameter blocks that are free |
to change and only spends time solving it. For example, if you
constructed a problem with a million parameter blocks and 2 million
residual blocks, but then set all but one parameter block to be
constant, and say only 10 residual blocks depend on this one
non-constant parameter block, then the computational effort Ceres
spends in solving this problem will be the same as if you had defined
a problem with one parameter block and 10 residual blocks.
| 1603 | |
| 1604 | **Ownership** |
| 1605 | |
| 1606 | :class:`Problem` by default takes ownership of the |
| 1607 | ``cost_function``, ``loss_function`` and ``local_parameterization`` |
| 1608 | pointers. These objects remain live for the life of the |
| 1609 | :class:`Problem`. If the user wishes to keep control over the |
| 1610 | destruction of these objects, then they can do this by setting the |
| 1611 | corresponding enums in the :class:`Problem::Options` struct. |
| 1612 | |
| 1613 | Note that even though the Problem takes ownership of ``cost_function`` |
| 1614 | and ``loss_function``, it does not preclude the user from re-using |
| 1615 | them in another residual block. The destructor takes care to call |
| 1616 | delete on each ``cost_function`` or ``loss_function`` pointer only |
| 1617 | once, regardless of how many residual blocks refer to them. |
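
For example, a sketch of constructing a :class:`Problem` that does
not take ownership of these objects (so that, say, cost functions
shared across several problems are not deleted by any of them):

.. code-block:: c++

   Problem::Options options;
   options.cost_function_ownership = DO_NOT_TAKE_OWNERSHIP;
   options.loss_function_ownership = DO_NOT_TAKE_OWNERSHIP;
   options.local_parameterization_ownership = DO_NOT_TAKE_OWNERSHIP;
   Problem problem(options);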
| 1618 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1619 | .. class:: Problem::Options |
| 1620 | |
| 1621 | Options struct that is used to control :class:`Problem`. |
| 1622 | |
| 1623 | .. member:: Ownership Problem::Options::cost_function_ownership |
| 1624 | |
| 1625 | Default: ``TAKE_OWNERSHIP`` |
| 1626 | |
| 1627 | This option controls whether the Problem object owns the cost |
| 1628 | functions. |
| 1629 | |
| 1630 | If set to TAKE_OWNERSHIP, then the problem object will delete the |
| 1631 | cost functions on destruction. The destructor is careful to delete |
| 1632 | the pointers only once, since sharing cost functions is allowed. |
| 1633 | |
| 1634 | .. member:: Ownership Problem::Options::loss_function_ownership |
| 1635 | |
| 1636 | Default: ``TAKE_OWNERSHIP`` |
| 1637 | |
| 1638 | This option controls whether the Problem object owns the loss |
| 1639 | functions. |
| 1640 | |
| 1641 | If set to TAKE_OWNERSHIP, then the problem object will delete the |
| 1642 | loss functions on destruction. The destructor is careful to delete |
| 1643 | the pointers only once, since sharing loss functions is allowed. |
| 1644 | |
| 1645 | .. member:: Ownership Problem::Options::local_parameterization_ownership |
| 1646 | |
| 1647 | Default: ``TAKE_OWNERSHIP`` |
| 1648 | |
| 1649 | This option controls whether the Problem object owns the local |
| 1650 | parameterizations. |
| 1651 | |
| 1652 | If set to TAKE_OWNERSHIP, then the problem object will delete the |
| 1653 | local parameterizations on destruction. The destructor is careful |
| 1654 | to delete the pointers only once, since sharing local |
| 1655 | parameterizations is allowed. |
| 1656 | |
| 1657 | .. member:: bool Problem::Options::enable_fast_removal |
| 1658 | |
| 1659 | Default: ``false`` |
| 1660 | |
| 1661 | If true, trades memory for faster |
| 1662 | :func:`Problem::RemoveResidualBlock` and |
| 1663 | :func:`Problem::RemoveParameterBlock` operations. |
| 1664 | |
| 1665 | By default, :func:`Problem::RemoveParameterBlock` and |
| 1666 | :func:`Problem::RemoveResidualBlock` take time proportional to |
| 1667 | the size of the entire problem. If you only ever remove |
| 1668 | parameters or residuals from the problem occasionally, this might |
| 1669 | be acceptable. However, if you have memory to spare, enable this |
| 1670 | option to make :func:`Problem::RemoveParameterBlock` take time |
| 1671 | proportional to the number of residual blocks that depend on it, |
| 1672 | and :func:`Problem::RemoveResidualBlock` take (on average) |
| 1673 | constant time. |
| 1674 | |
| 1675 | The increase in memory usage is twofold: an additional hash set |
| 1676 | per parameter block containing all the residuals that depend on |
| 1677 | the parameter block; and a hash set in the problem containing all |
| 1678 | residuals. |
| 1679 | |
| 1680 | .. member:: bool Problem::Options::disable_all_safety_checks |
| 1681 | |
| 1682 | Default: `false` |
| 1683 | |
| 1684 | By default, Ceres performs a variety of safety checks when |
| 1685 | constructing the problem. There is a small but measurable |
| 1686 | performance penalty to these checks, typically around 5% of |
| 1687 | construction time. If you are sure your problem construction is |
| 1688 | correct, and 5% of the problem construction time is truly an |
| 1689 | overhead you want to avoid, then you can set |
| 1690 | disable_all_safety_checks to true. |
| 1691 | |
| 1692 | **WARNING** Do not set this to true, unless you are absolutely |
| 1693 | sure of what you are doing. |
| 1694 | |
| 1695 | .. member:: Context* Problem::Options::context |
| 1696 | |
| 1697 | Default: `nullptr` |
| 1698 | |
A Ceres global context to use for solving this problem. This may
help to reduce computation time, as Ceres can reuse objects that are
expensive to create. The context object can be `nullptr`, in which
case Ceres may create one.
| 1703 | |
| 1704 | Ceres does NOT take ownership of the pointer. |
| 1705 | |
| 1706 | .. member:: EvaluationCallback* Problem::Options::evaluation_callback |
| 1707 | |
| 1708 | Default: `nullptr` |
| 1709 | |
| 1710 | Using this callback interface, Ceres will notify you when it is |
| 1711 | about to evaluate the residuals or Jacobians. |
| 1712 | |
| 1713 | If an ``evaluation_callback`` is present, Ceres will update the |
| 1714 | user's parameter blocks to the values that will be used when |
| 1715 | calling :func:`CostFunction::Evaluate` before calling |
| 1716 | :func:`EvaluationCallback::PrepareForEvaluation`. One can then use |
| 1717 | this callback to share (or cache) computation between cost |
| 1718 | functions by doing the shared computation in |
| 1719 | :func:`EvaluationCallback::PrepareForEvaluation` before Ceres |
| 1720 | calls :func:`CostFunction::Evaluate`. |
| 1721 | |
| 1722 | Problem does NOT take ownership of the callback. |
| 1723 | |
| 1724 | .. NOTE:: |
| 1725 | |
| 1726 | Evaluation callbacks are incompatible with inner iterations. So |
| 1727 | calling Solve with |
| 1728 | :member:`Solver::Options::use_inner_iterations` set to `true` |
| 1729 | on a :class:`Problem` with a non-null evaluation callback is an |
| 1730 | error. |
| 1731 | |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1732 | .. function:: ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, const vector<double*> parameter_blocks) |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1733 | |
| 1734 | .. function:: template <typename Ts...> ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, double* x0, Ts... xs) |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1735 | |
| 1736 | Add a residual block to the overall cost function. The cost |
| 1737 | function carries with it information about the sizes of the |
| 1738 | parameter blocks it expects. The function checks that these match |
| 1739 | the sizes of the parameter blocks listed in parameter_blocks. The |
| 1740 | program aborts if a mismatch is detected. loss_function can be |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1741 | `nullptr`, in which case the cost of the term is just the squared |
| 1742 | norm of the residuals. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1743 | |
The parameter blocks may be passed together as a
``vector<double*>``, or as individual ``double*`` pointers.
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1746 | |
| 1747 | The user has the option of explicitly adding the parameter blocks |
| 1748 | using AddParameterBlock. This causes additional correctness |
| 1749 | checking; however, AddResidualBlock implicitly adds the parameter |
| 1750 | blocks if they are not present, so calling AddParameterBlock |
| 1751 | explicitly is not required. |
| 1752 | |
| 1753 | The Problem object by default takes ownership of the |
| 1754 | cost_function and loss_function pointers. These objects remain |
| 1755 | live for the life of the Problem object. If the user wishes to |
| 1756 | keep control over the destruction of these objects, then they can |
| 1757 | do this by setting the corresponding enums in the Options struct. |
| 1758 | |
| 1759 | Note: Even though the Problem takes ownership of cost_function |
| 1760 | and loss_function, it does not preclude the user from re-using |
| 1761 | them in another residual block. The destructor takes care to call |
| 1762 | delete on each cost_function or loss_function pointer only once, |
| 1763 | regardless of how many residual blocks refer to them. |
| 1764 | |
| 1765 | Example usage: |
| 1766 | |
| 1767 | .. code-block:: c++ |
| 1768 | |
| 1769 | double x1[] = {1.0, 2.0, 3.0}; |
| 1770 | double x2[] = {1.0, 2.0, 5.0, 6.0}; |
| 1771 | double x3[] = {3.0, 6.0, 2.0, 5.0, 1.0}; |
| 1772 | vector<double*> v1; |
| 1773 | v1.push_back(x1); |
| 1774 | vector<double*> v2; |
| 1775 | v2.push_back(x2); |
| 1776 | v2.push_back(x1); |
| 1777 | |
| 1778 | Problem problem; |
| 1779 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1780 | problem.AddResidualBlock(new MyUnaryCostFunction(...), nullptr, x1); |
| 1781 | problem.AddResidualBlock(new MyBinaryCostFunction(...), nullptr, x2, x1); |
| 1782 | problem.AddResidualBlock(new MyUnaryCostFunction(...), nullptr, v1); |
| 1783 | problem.AddResidualBlock(new MyBinaryCostFunction(...), nullptr, v2); |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1784 | |
| 1785 | .. function:: void Problem::AddParameterBlock(double* values, int size, LocalParameterization* local_parameterization) |
| 1786 | |
Add a parameter block with appropriate size and parameterization to the problem.
| 1788 | Repeated calls with the same arguments are ignored. Repeated calls |
| 1789 | with the same double pointer but a different size results in |
| 1790 | undefined behavior. |
| 1791 | |
| 1792 | .. function:: void Problem::AddParameterBlock(double* values, int size) |
| 1793 | |
Add a parameter block with appropriate size to the
problem. Repeated calls with the same arguments are
| 1796 | ignored. Repeated calls with the same double pointer but a |
| 1797 | different size results in undefined behavior. |
| 1798 | |
| 1799 | .. function:: void Problem::RemoveResidualBlock(ResidualBlockId residual_block) |
| 1800 | |
| 1801 | Remove a residual block from the problem. Any parameters that the residual |
| 1802 | block depends on are not removed. The cost and loss functions for the |
residual block will not get deleted immediately; that will not happen until the
| 1804 | problem itself is deleted. If Problem::Options::enable_fast_removal is |
| 1805 | true, then the removal is fast (almost constant time). Otherwise, removing a |
| 1806 | residual block will incur a scan of the entire Problem object to verify that |
| 1807 | the residual_block represents a valid residual in the problem. |
| 1808 | |
| 1809 | **WARNING:** Removing a residual or parameter block will destroy |
| 1810 | the implicit ordering, rendering the jacobian or residuals returned |
| 1811 | from the solver uninterpretable. If you depend on the evaluated |
| 1812 | jacobian, do not use remove! This may change in a future release. |
| 1814 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1815 | .. function:: void Problem::RemoveParameterBlock(const double* values) |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1816 | |
| 1817 | Remove a parameter block from the problem. The parameterization of |
| 1818 | the parameter block, if it exists, will persist until the deletion |
| 1819 | of the problem (similar to cost/loss functions in residual block |
| 1820 | removal). Any residual blocks that depend on the parameter are also |
| 1821 | removed, as described above in RemoveResidualBlock(). If |
| 1822 | Problem::Options::enable_fast_removal is true, then |
| 1823 | the removal is fast (almost constant time). Otherwise, removing a |
| 1824 | parameter block will incur a scan of the entire Problem object. |
| 1825 | |
| 1826 | **WARNING:** Removing a residual or parameter block will destroy |
| 1827 | the implicit ordering, rendering the jacobian or residuals returned |
| 1828 | from the solver uninterpretable. If you depend on the evaluated |
| 1829 | jacobian, do not use remove! This may change in a future release. |
| 1830 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1831 | .. function:: void Problem::SetParameterBlockConstant(const double* values) |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1832 | |
| 1833 | Hold the indicated parameter block constant during optimization. |
| 1834 | |
| 1835 | .. function:: void Problem::SetParameterBlockVariable(double* values) |
| 1836 | |
| 1837 | Allow the indicated parameter to vary during optimization. |
| 1838 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1839 | .. function:: bool Problem::IsParameterBlockConstant(const double* values) const |
| 1840 | |
Returns ``true`` if a parameter block is set constant, and false
otherwise. A parameter block may be set constant in two ways: either
by calling ``SetParameterBlockConstant`` or by associating with it a
``LocalParameterization`` with a zero dimensional tangent space.
| 1846 | |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1847 | .. function:: void Problem::SetParameterization(double* values, LocalParameterization* local_parameterization) |
| 1848 | |
| 1849 | Set the local parameterization for one of the parameter blocks. |
| 1850 | The local_parameterization is owned by the Problem by default. It |
| 1851 | is acceptable to set the same parameterization for multiple |
| 1852 | parameters; the destructor is careful to delete local |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1853 | parameterizations only once. Calling `SetParameterization` with |
| 1854 | `nullptr` will clear any previously set parameterization. |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1855 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1856 | .. function:: LocalParameterization* Problem::GetParameterization(const double* values) const |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1857 | |
| 1858 | Get the local parameterization object associated with this |
| 1859 | parameter block. If there is no parameterization object associated |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1860 | then `nullptr` is returned |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1861 | |
| 1862 | .. function:: void Problem::SetParameterLowerBound(double* values, int index, double lower_bound) |
| 1863 | |
| 1864 | Set the lower bound for the parameter at position `index` in the |
| 1865 | parameter block corresponding to `values`. By default the lower |
| 1866 | bound is ``-std::numeric_limits<double>::max()``, which is treated |
| 1867 | by the solver as the same as :math:`-\infty`. |
| 1868 | |
| 1869 | .. function:: void Problem::SetParameterUpperBound(double* values, int index, double upper_bound) |
| 1870 | |
| 1871 | Set the upper bound for the parameter at position `index` in the |
| 1872 | parameter block corresponding to `values`. By default the value is |
| 1873 | ``std::numeric_limits<double>::max()``, which is treated by the |
| 1874 | solver as the same as :math:`\infty`. |
| 1875 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1876 | .. function:: double Problem::GetParameterLowerBound(const double* values, int index) |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1877 | |
| 1878 | Get the lower bound for the parameter with position `index`. If the |
| 1879 | parameter is not bounded by the user, then its lower bound is |
| 1880 | ``-std::numeric_limits<double>::max()``. |
| 1881 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1882 | .. function:: double Problem::GetParameterUpperBound(const double* values, int index) |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1883 | |
| 1884 | Get the upper bound for the parameter with position `index`. If the |
| 1885 | parameter is not bounded by the user, then its upper bound is |
| 1886 | ``std::numeric_limits<double>::max()``. |
| 1887 | |
| 1888 | .. function:: int Problem::NumParameterBlocks() const |
| 1889 | |
| 1890 | Number of parameter blocks in the problem. Always equals |
| 1891 | parameter_blocks().size() and parameter_block_sizes().size(). |
| 1892 | |
| 1893 | .. function:: int Problem::NumParameters() const |
| 1894 | |
| 1895 | The size of the parameter vector obtained by summing over the sizes |
| 1896 | of all the parameter blocks. |
| 1897 | |
| 1898 | .. function:: int Problem::NumResidualBlocks() const |
| 1899 | |
| 1900 | Number of residual blocks in the problem. Always equals |
| 1901 | residual_blocks().size(). |
| 1902 | |
| 1903 | .. function:: int Problem::NumResiduals() const |
| 1904 | |
| 1905 | The size of the residual vector obtained by summing over the sizes |
| 1906 | of all of the residual blocks. |
| 1907 | |
| 1908 | .. function:: int Problem::ParameterBlockSize(const double* values) const |
| 1909 | |
| 1910 | The size of the parameter block. |
| 1911 | |
| 1912 | .. function:: int Problem::ParameterBlockLocalSize(const double* values) const |
| 1913 | |
The size of the local parameterization for the parameter block. If
| 1915 | there is no local parameterization associated with this parameter |
| 1916 | block, then ``ParameterBlockLocalSize`` = ``ParameterBlockSize``. |
| 1917 | |
| 1918 | .. function:: bool Problem::HasParameterBlock(const double* values) const |
| 1919 | |
| 1920 | Is the given parameter block present in the problem or not? |
| 1921 | |
| 1922 | .. function:: void Problem::GetParameterBlocks(vector<double*>* parameter_blocks) const |
| 1923 | |
| 1924 | Fills the passed ``parameter_blocks`` vector with pointers to the |
| 1925 | parameter blocks currently in the problem. After this call, |
``parameter_blocks.size() == NumParameterBlocks()``.
| 1927 | |
| 1928 | .. function:: void Problem::GetResidualBlocks(vector<ResidualBlockId>* residual_blocks) const |
| 1929 | |
| 1930 | Fills the passed `residual_blocks` vector with pointers to the |
| 1931 | residual blocks currently in the problem. After this call, |
``residual_blocks.size() == NumResidualBlocks()``.
| 1933 | |
| 1934 | .. function:: void Problem::GetParameterBlocksForResidualBlock(const ResidualBlockId residual_block, vector<double*>* parameter_blocks) const |
| 1935 | |
| 1936 | Get all the parameter blocks that depend on the given residual |
| 1937 | block. |
| 1938 | |
| 1939 | .. function:: void Problem::GetResidualBlocksForParameterBlock(const double* values, vector<ResidualBlockId>* residual_blocks) const |
| 1940 | |
| 1941 | Get all the residual blocks that depend on the given parameter |
| 1942 | block. |
| 1943 | |
| 1944 | If `Problem::Options::enable_fast_removal` is |
| 1945 | `true`, then getting the residual blocks is fast and depends only |
| 1946 | on the number of residual blocks. Otherwise, getting the residual |
| 1947 | blocks for a parameter block will incur a scan of the entire |
| 1948 | :class:`Problem` object. |
| 1949 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1950 | .. function:: const CostFunction* Problem::GetCostFunctionForResidualBlock(const ResidualBlockId residual_block) const |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1951 | |
| 1952 | Get the :class:`CostFunction` for the given residual block. |
| 1953 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 1954 | .. function:: const LossFunction* Problem::GetLossFunctionForResidualBlock(const ResidualBlockId residual_block) const |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1955 | |
| 1956 | Get the :class:`LossFunction` for the given residual block. |
| 1957 | |
.. function:: bool Problem::EvaluateResidualBlock(ResidualBlockId residual_block_id, bool apply_loss_function, double* cost, double* residuals, double** jacobians) const
| 1959 | |
| 1960 | Evaluates the residual block, storing the scalar cost in ``cost``, the |
| 1961 | residual components in ``residuals``, and the jacobians between the |
| 1962 | parameters and residuals in ``jacobians[i]``, in row-major order. |
| 1963 | |
| 1964 | If ``residuals`` is ``nullptr``, the residuals are not computed. |
| 1965 | |
| 1966 | If ``jacobians`` is ``nullptr``, no Jacobians are computed. If |
| 1967 | ``jacobians[i]`` is ``nullptr``, then the Jacobian for that |
| 1968 | parameter block is not computed. |
| 1969 | |
| 1970 | It is not okay to request the Jacobian w.r.t a parameter block |
| 1971 | that is constant. |
| 1972 | |
| 1973 | The return value indicates the success or failure. Even if the |
| 1974 | function returns false, the caller should expect the output |
| 1975 | memory locations to have been modified. |
| 1976 | |
| 1977 | The returned cost and jacobians have had robustification and local |
| 1978 | parameterizations applied already; for example, the jacobian for a |
| 1979 | 4-dimensional quaternion parameter using the |
| 1980 | :class:`QuaternionParameterization` is ``num_residuals x 3`` |
| 1981 | instead of ``num_residuals x 4``. |
| 1982 | |
``apply_loss_function``, as the name implies, allows the user to
switch the application of the loss function on and off.
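
For example, a sketch that evaluates only the cost and residuals of a
single residual block, where ``residual_block_id`` was returned by an
earlier call to :func:`Problem::AddResidualBlock` for a cost function
with two residuals:

.. code-block:: c++

   double cost = 0.0;
   double residuals[2];
   problem.EvaluateResidualBlock(residual_block_id,
                                 /* apply_loss_function = */ true,
                                 &cost, residuals,
                                 /* jacobians = */ nullptr);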
| 1985 | |
| 1986 | .. NOTE:: If an :class:`EvaluationCallback` is associated with the |
| 1987 | problem, then its |
| 1988 | :func:`EvaluationCallback::PrepareForEvaluation` method will be |
| 1989 | called every time this method is called with `new_point = |
| 1990 | true`. This conservatively assumes that the user may have |
| 1991 | changed the parameter values since the previous call to evaluate |
| 1992 | / solve. For improved efficiency, and only if you know that the |
| 1993 | parameter values have not changed between calls, see |
| 1994 | :func:`Problem::EvaluateResidualBlockAssumingParametersUnchanged`. |
| 1995 | |
| 1996 | |
.. function:: bool Problem::EvaluateResidualBlockAssumingParametersUnchanged(ResidualBlockId residual_block_id, bool apply_loss_function, double* cost, double* residuals, double** jacobians) const
| 1998 | |
| 1999 | Same as :func:`Problem::EvaluateResidualBlock` except that if an |
| 2000 | :class:`EvaluationCallback` is associated with the problem, then |
| 2001 | its :func:`EvaluationCallback::PrepareForEvaluation` method will |
| 2002 | be called every time this method is called with new_point = false. |
| 2003 | |
| 2004 | This means, if an :class:`EvaluationCallback` is associated with |
| 2005 | the problem then it is the user's responsibility to call |
| 2006 | :func:`EvaluationCallback::PrepareForEvaluation` before calling |
| 2007 | this method if necessary, i.e. iff the parameter values have been |
changed since the last call to evaluate / solve.
| 2009 | |
| 2010 | This is because, as the name implies, we assume that the parameter |
| 2011 | blocks did not change since the last time |
| 2012 | :func:`EvaluationCallback::PrepareForEvaluation` was called (via |
| 2013 | :func:`Solve`, :func:`Problem::Evaluate` or |
| 2014 | :func:`Problem::EvaluateResidualBlock`). |
| 2015 | |
| 2016 | |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 2017 | .. function:: bool Problem::Evaluate(const Problem::EvaluateOptions& options, double* cost, vector<double>* residuals, vector<double>* gradient, CRSMatrix* jacobian) |
| 2018 | |
| 2019 | Evaluate a :class:`Problem`. Any of the output pointers can be |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 2020 | `nullptr`. Which residual blocks and parameter blocks are used is |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 2021 | controlled by the :class:`Problem::EvaluateOptions` struct below. |
| 2022 | |
| 2023 | .. NOTE:: |
| 2024 | |
| 2025 | The evaluation will use the values stored in the memory |
| 2026 | locations pointed to by the parameter block pointers used at the |
| 2027 | time of the construction of the problem, for example in the |
| 2028 | following code: |
| 2029 | |
| 2030 | .. code-block:: c++ |
| 2031 | |
| 2032 | Problem problem; |
| 2033 | double x = 1; |
     problem.AddResidualBlock(new MyCostFunction, nullptr, &x);
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 2035 | |
| 2036 | double cost = 0.0; |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 2037 | problem.Evaluate(Problem::EvaluateOptions(), &cost, nullptr, nullptr, nullptr); |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 2038 | |
| 2039 | The cost is evaluated at `x = 1`. If you wish to evaluate the |
| 2040 | problem at `x = 2`, then |
| 2041 | |
| 2042 | .. code-block:: c++ |
| 2043 | |
| 2044 | x = 2; |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 2045 | problem.Evaluate(Problem::EvaluateOptions(), &cost, nullptr, nullptr, nullptr); |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 2046 | |
| 2047 | is the way to do so. |
| 2048 | |
| 2049 | .. NOTE:: |
| 2050 | |
| 2051 | If no local parameterizations are used, then the size of |
| 2052 | the gradient vector is the sum of the sizes of all the parameter |
| 2053 | blocks. If a parameter block has a local parameterization, then |
| 2054 | it contributes "LocalSize" entries to the gradient vector. |
| 2055 | |
| 2056 | .. NOTE:: |
| 2057 | |
| 2058 | This function cannot be called while the problem is being |
| 2059 | solved, for example it cannot be called from an |
| 2060 | :class:`IterationCallback` at the end of an iteration during a |
| 2061 | solve. |
| 2062 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 2063 | .. NOTE:: |
| 2064 | |
| 2065 | If an EvaluationCallback is associated with the problem, then |
its PrepareForEvaluation method will be called every time this
| 2067 | method is called with ``new_point = true``. |
| 2068 | |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 2069 | .. class:: Problem::EvaluateOptions |
| 2070 | |
| 2071 | Options struct that is used to control :func:`Problem::Evaluate`. |
| 2072 | |
| 2073 | .. member:: vector<double*> Problem::EvaluateOptions::parameter_blocks |
| 2074 | |
| 2075 | The set of parameter blocks for which evaluation should be |
| 2076 | performed. This vector determines the order in which parameter |
| 2077 | blocks occur in the gradient vector and in the columns of the |
| 2078 | jacobian matrix. If parameter_blocks is empty, then it is assumed |
| 2079 | to be equal to a vector containing ALL the parameter |
| 2080 | blocks. Generally speaking the ordering of the parameter blocks in |
| 2081 | this case depends on the order in which they were added to the |
| 2082 | problem and whether or not the user removed any parameter blocks. |
| 2083 | |
| 2084 | **NOTE** This vector should contain the same pointers as the ones |
used to add parameter blocks to the Problem. These parameter blocks
| 2086 | should NOT point to new memory locations. Bad things will happen if |
| 2087 | you do. |
| 2088 | |
| 2089 | .. member:: vector<ResidualBlockId> Problem::EvaluateOptions::residual_blocks |
| 2090 | |
| 2091 | The set of residual blocks for which evaluation should be |
| 2092 | performed. This vector determines the order in which the residuals |
| 2093 | occur, and how the rows of the jacobian are ordered. If |
| 2094 | residual_blocks is empty, then it is assumed to be equal to the |
| 2095 | vector containing all the residual blocks. |
| 2096 | |
| 2097 | .. member:: bool Problem::EvaluateOptions::apply_loss_function |
| 2098 | |
| 2099 | Even though the residual blocks in the problem may contain loss |
| 2100 | functions, setting apply_loss_function to false will turn off the |
| 2101 | application of the loss function to the output of the cost |
| 2102 | function. This is of use for example if the user wishes to analyse |
| 2103 | the solution quality by studying the distribution of residuals |
| 2104 | before and after the solve. |
| 2105 | |
| 2106 | .. member:: int Problem::EvaluateOptions::num_threads |
| 2107 | |
| 2108 | Number of threads to use. (Requires OpenMP). |
| 2109 | |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 2110 | |
| 2111 | :class:`EvaluationCallback` |
| 2112 | =========================== |
| 2113 | |
| 2114 | .. class:: EvaluationCallback |
| 2115 | |
| 2116 | Interface for receiving callbacks before Ceres evaluates residuals or |
| 2117 | Jacobians: |
| 2118 | |
| 2119 | .. code-block:: c++ |
| 2120 | |
| 2121 | class EvaluationCallback { |
| 2122 | public: |
| 2123 | virtual ~EvaluationCallback() {} |
      virtual void PrepareForEvaluation(bool evaluate_jacobians,
                                        bool new_evaluation_point) = 0;
| 2126 | }; |
| 2127 | |
| 2128 | .. function:: void EvaluationCallback::PrepareForEvaluation(bool evaluate_jacobians, bool new_evaluation_point) |
| 2129 | |
Ceres will call :func:`EvaluationCallback::PrepareForEvaluation`
once before each time it computes the residuals and/or the
Jacobians.
| 2133 | |
User parameters (the double* values provided by the user) are fixed
| 2135 | until the next call to |
| 2136 | :func:`EvaluationCallback::PrepareForEvaluation`. If |
| 2137 | ``new_evaluation_point == true``, then this is a new point that is |
| 2138 | different from the last evaluated point. Otherwise, it is the same |
| 2139 | point that was evaluated previously (either Jacobian or residual) |
| 2140 | and the user can use cached results from previous evaluations. If |
| 2141 | ``evaluate_jacobians`` is true, then Ceres will request Jacobians |
| 2142 | in the upcoming cost evaluation. |
| 2143 | |
| 2144 | Using this callback interface, Ceres can notify you when it is |
| 2145 | about to evaluate the residuals or Jacobians. With the callback, |
| 2146 | you can share computation between residual blocks by doing the |
| 2147 | shared computation in |
| 2148 | :func:`EvaluationCallback::PrepareForEvaluation` before Ceres calls |
| 2149 | :func:`CostFunction::Evaluate` on all the residuals. It also |
| 2150 | enables caching results between a pure residual evaluation and a |
| 2151 | residual & Jacobian evaluation, via the ``new_evaluation_point`` |
| 2152 | argument. |
| 2153 | |
One use case for this callback is when the cost function computation
is moved to the GPU. In that case, the prepare call does the actual
| 2156 | cost function evaluation, and subsequent calls from Ceres to the |
| 2157 | actual cost functions merely copy the results from the GPU onto the |
| 2158 | corresponding blocks for Ceres to plug into the solver. |
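
For concreteness, a minimal sketch of such a callback (the class and
member names are illustrative, not part of Ceres):

.. code-block:: c++

   class SharedComputation : public ceres::EvaluationCallback {
    public:
     void PrepareForEvaluation(bool evaluate_jacobians,
                               bool new_evaluation_point) override {
       if (new_evaluation_point) {
         // Recompute the expensive intermediate quantities shared by
         // several cost functions; their Evaluate() methods then
         // read the cached results.
       }
     }
   };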
| 2159 | |
| 2160 | **Note**: Ceres provides no mechanism to share data other than the |
| 2161 | notification from the callback. Users must provide access to |
| 2162 | pre-computed shared data to their cost functions behind the scenes; |
| 2163 | this all happens without Ceres knowing. One approach is to put a |
| 2164 | pointer to the shared data in each cost function (recommended) or |
| 2165 | to use a global shared variable (discouraged; bug-prone). As far |
| 2166 | as Ceres is concerned, it is evaluating cost functions like any |
| 2167 | other; it just so happens that behind the scenes the cost functions |
| 2168 | reuse pre-computed data to execute faster. |
| 2169 | |
| 2170 | See ``evaluation_callback_test.cc`` for code that explicitly |
| 2171 | verifies the preconditions between |
| 2172 | :func:`EvaluationCallback::PrepareForEvaluation` and |
| 2173 | :func:`CostFunction::Evaluate`. |
| 2174 | |
| 2175 | |
``rotation.h``
==============

Many applications of Ceres Solver involve optimization problems where
some of the variables correspond to rotations. To ease the pain of
working with the various representations of rotations (angle-axis,
quaternion and matrix) we provide a handy set of templated
functions. These functions are templated so that the user can use them
within Ceres Solver's automatic differentiation framework.

.. function:: template <typename T> void AngleAxisToQuaternion(T const* angle_axis, T* quaternion)

   Convert a value in combined axis-angle representation to a
   quaternion.

   The value ``angle_axis`` is a triple whose norm is an angle in
   radians, and whose direction is aligned with the axis of rotation,
   and ``quaternion`` is a 4-tuple that will contain the resulting
   quaternion.

.. function:: template <typename T> void QuaternionToAngleAxis(T const* quaternion, T* angle_axis)

   Convert a quaternion to the equivalent combined axis-angle
   representation.

   The value ``quaternion`` must be a unit quaternion; it is not
   normalized first. ``angle_axis`` will be filled with a value whose
   norm is the angle of rotation in radians, and whose direction is
   the axis of rotation.
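
   For example, a round trip through both conversions recovers the
   original rotation up to floating point error (values here are
   arbitrary):

   .. code-block:: c++

      const double angle_axis[3] = {0.1, 0.2, 0.3};
      double quaternion[4];
      double round_trip[3];
      AngleAxisToQuaternion(angle_axis, quaternion);
      QuaternionToAngleAxis(quaternion, round_trip);
      // round_trip now matches angle_axis up to floating point error.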

.. function:: template <typename T, int row_stride, int col_stride> void RotationMatrixToAngleAxis(const MatrixAdapter<const T, row_stride, col_stride>& R, T* angle_axis)
.. function:: template <typename T, int row_stride, int col_stride> void AngleAxisToRotationMatrix(T const* angle_axis, const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void RotationMatrixToAngleAxis(T const* R, T* angle_axis)
.. function:: template <typename T> void AngleAxisToRotationMatrix(T const* angle_axis, T* R)

   Conversions between a 3x3 rotation matrix with given row and column
   strides and the axis-angle rotation representation. The functions
   that take a pointer to ``T`` instead of a ``MatrixAdapter`` assume a
   column-major representation with unit row stride and a column
   stride of 3.
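
   For example, using the ``RowMajorAdapter3x3`` helper from
   ``rotation.h``, a plain row-major array can be wrapped in a
   ``MatrixAdapter`` (a sketch with arbitrary values):

   .. code-block:: c++

      const double angle_axis[3] = {0.0, 0.0, 1.5707963};  // ~90 degrees about z.
      double R[9];  // Row-major 3x3 rotation matrix.
      AngleAxisToRotationMatrix(angle_axis, RowMajorAdapter3x3(R));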

.. function:: template <typename T, int row_stride, int col_stride> void EulerAnglesToRotationMatrix(const T* euler, const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void EulerAnglesToRotationMatrix(const T* euler, int row_stride, T* R)

   Conversions between a 3x3 rotation matrix with given row and column
   strides and the Euler angle (in degrees) rotation representation.

   The {pitch,roll,yaw} Euler angles are rotations around the {x,y,z}
   axes, respectively. They are applied in that same order, so the
   total rotation R is Rz * Ry * Rx.

   The function that takes a pointer to T as the rotation matrix
   assumes a row-major representation with unit column stride and a
   row stride of 3. The additional parameter ``row_stride`` is
   required to be 3.
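
   A short sketch (angle values arbitrary):

   .. code-block:: c++

      // {pitch, roll, yaw} in degrees: a 90 degree rotation about z.
      const double euler[3] = {0.0, 0.0, 90.0};
      double R[9];  // Row-major 3x3 matrix with row_stride == 3.
      EulerAnglesToRotationMatrix(euler, 3, R);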

.. function:: template <typename T, int row_stride, int col_stride> void QuaternionToScaledRotation(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void QuaternionToScaledRotation(const T q[4], T R[3 * 3])

   Convert a 4-vector to a 3x3 scaled rotation matrix.

   The choice of rotation is such that the quaternion
   :math:`\begin{bmatrix} 1 &0 &0 &0\end{bmatrix}` goes to an identity
   matrix and for small :math:`a, b, c` the quaternion
   :math:`\begin{bmatrix}1 &a &b &c\end{bmatrix}` goes to the matrix

   .. math::

      I + 2 \begin{bmatrix} 0 & -c & b \\ c & 0 & -a\\ -b & a & 0
      \end{bmatrix} + O(q^2)

   which corresponds to a Rodrigues approximation, the last matrix
   being the cross-product matrix of :math:`\begin{bmatrix} a& b&
   c\end{bmatrix}`. Together with the property that :math:`R(q_1 *
   q_2) = R(q_1) * R(q_2)` this uniquely defines the mapping from
   :math:`q` to :math:`R`.

   In the function that accepts a pointer to T instead of a
   MatrixAdapter, the rotation matrix ``R`` is a row-major matrix with
   unit column stride and a row stride of 3.

   No normalization of the quaternion is performed, i.e.
   :math:`R = \|q\|^2 Q`, where :math:`Q` is an orthonormal matrix
   such that :math:`\det(Q) = 1` and :math:`Q Q' = I`.


.. function:: template <typename T, int row_stride, int col_stride> void QuaternionToRotation(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void QuaternionToRotation(const T q[4], T R[3 * 3])

   Same as above except that the rotation matrix is normalized by the
   Frobenius norm, so that :math:`R R' = I` (and :math:`\det(R) = 1`).

.. function:: template <typename T> void UnitQuaternionRotatePoint(const T q[4], const T pt[3], T result[3])

   Rotates a point ``pt`` by a quaternion ``q``:

   .. math:: \text{result} = R(q)\ \text{pt}

   Assumes the quaternion is unit norm. If you pass in a quaternion
   with :math:`|q|^2 = 2` then you WILL NOT get back 2 times the
   result you get for a unit quaternion.

.. function:: template <typename T> void QuaternionRotatePoint(const T q[4], const T pt[3], T result[3])

   With this function you do not need to assume that :math:`q` has
   unit norm. It does assume that the norm is non-zero.
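
   For example, scaling a quaternion leaves the induced rotation
   unchanged (a sketch):

   .. code-block:: c++

      const double q[4] = {2.0, 0.0, 0.0, 0.0};  // Non-unit, non-zero norm.
      const double pt[3] = {1.0, 2.0, 3.0};
      double result[3];
      QuaternionRotatePoint(q, pt, result);
      // q is a scaled identity quaternion, so result equals pt.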

.. function:: template <typename T> void QuaternionProduct(const T z[4], const T w[4], T zw[4])

   .. math:: zw = z * w

   where :math:`*` is the quaternion product between 4-vectors.
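
   Since :math:`R(z * w) = R(z) * R(w)`, the product composes
   rotations; for example (a sketch with arbitrary angles):

   .. code-block:: c++

      const double aa_z[3] = {0.0, 0.0, 0.1};  // Small rotation about the z axis.
      const double aa_w[3] = {0.2, 0.0, 0.0};  // Small rotation about the x axis.
      double z[4], w[4], zw[4];
      AngleAxisToQuaternion(aa_z, z);
      AngleAxisToQuaternion(aa_w, w);
      QuaternionProduct(z, w, zw);
      // Rotating a point by zw applies w first and then z.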

.. function:: template <typename T> void CrossProduct(const T x[3], const T y[3], T x_cross_y[3])

   .. math:: \text{x_cross_y} = x \times y

.. function:: template <typename T> void AngleAxisRotatePoint(const T angle_axis[3], const T pt[3], T result[3])

   .. math:: \text{result} = R(\text{angle_axis})\ \text{pt}

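   Because these functions are templated, they compose with automatic
   differentiation. Below is a sketch of a hypothetical point
   alignment residual (the functor name and its data are illustrative,
   not part of Ceres):

   .. code-block:: c++

      struct PointAlignmentError {
        PointAlignmentError(const double* p, const double* q)
            : p_(p), q_(q) {}

        template <typename T>
        bool operator()(const T* angle_axis, const T* translation,
                        T* residuals) const {
          const T p[3] = {T(p_[0]), T(p_[1]), T(p_[2])};
          T rotated[3];
          AngleAxisRotatePoint(angle_axis, p, rotated);
          // Residual: mismatch between the transformed point and q.
          for (int i = 0; i < 3; ++i) {
            residuals[i] = rotated[i] + translation[i] - T(q_[i]);
          }
          return true;
        }

        const double* p_;
        const double* q_;
      };

   which could then be wrapped in, e.g.,
   ``AutoDiffCostFunction<PointAlignmentError, 3, 3, 3>``.
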

Cubic Interpolation
===================

Optimization problems often involve functions that are given in the
form of a table of values, for example an image. Evaluating these
functions and their derivatives requires interpolating these
values. Interpolating tabulated functions is a vast area of research
and there are a lot of libraries which implement a variety of
interpolation schemes. However, using them within the automatic
differentiation framework in Ceres is quite painful. To this end,
Ceres provides the ability to interpolate one dimensional and two
dimensional tabular functions.

The one dimensional interpolation is based on the Cubic Hermite
Spline, also known as the Catmull-Rom Spline. This produces a first
order differentiable interpolating function. The two dimensional
interpolation scheme is a generalization of the one dimensional scheme
where the interpolating function is assumed to be separable in the two
dimensions.

More details of the construction can be found in `Linear Methods for
Image Interpolation <http://www.ipol.im/pub/art/2011/g_lmii/>`_ by
Pascal Getreuer.

.. class:: CubicInterpolator

   Given as input an infinite one dimensional grid that provides the
   following interface:

   .. code::

     struct Grid1D {
       enum { DATA_DIMENSION = 2 };
       void GetValue(int n, double* f) const;
     };

   where ``GetValue`` gives the value of a (possibly vector valued)
   function :math:`f` for any integer :math:`n`, and the enum
   ``DATA_DIMENSION`` indicates the dimensionality of the function
   being interpolated. For example if you are interpolating rotations
   in axis-angle format over time, then ``DATA_DIMENSION = 3``.

   :class:`CubicInterpolator` uses Cubic Hermite splines to produce a
   smooth approximation to the function that can be used to evaluate
   :math:`f(x)` and :math:`f'(x)` at any point on the real number
   line. For example, the following code interpolates an array of
   four numbers.

   .. code::

     const double x[] = {1.0, 2.0, 5.0, 6.0};
     Grid1D<double, 1> array(x, 0, 4);
     CubicInterpolator<Grid1D<double, 1>> interpolator(array);
     double f, dfdx;
     interpolator.Evaluate(1.5, &f, &dfdx);

   In the above code we use ``Grid1D``, a templated helper class that
   allows easy interfacing between ``C++`` arrays and
   :class:`CubicInterpolator`.

   ``Grid1D`` supports vector valued functions where the various
   coordinates of the function can be interleaved or stacked. It also
   allows the use of any numeric type as input, as long as it can be
   safely cast to a double.
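
   For example, a two dimensional (``DATA_DIMENSION = 2``) function
   with interleaved coordinates might be set up as follows (a sketch;
   the data values are arbitrary):

   .. code::

     // Interleaved layout: {f0(0), f1(0), f0(1), f1(1), f0(2), f1(2)}.
     const double data[] = {1.0, 5.0, 2.0, 6.0, 3.0, 7.0};
     Grid1D<double, 2, true> grid(data, 0, 3);
     CubicInterpolator<Grid1D<double, 2, true>> interpolator(grid);
     double f[2], dfdx[2];
     interpolator.Evaluate(1.5, f, dfdx);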

.. class:: BiCubicInterpolator

   Given as input an infinite two dimensional grid that provides the
   following interface:

   .. code::

     struct Grid2D {
       enum { DATA_DIMENSION = 2 };
       void GetValue(int row, int col, double* f) const;
     };

   where ``GetValue`` gives the value of a (possibly vector valued)
   function :math:`f` for any pair of integers :code:`row` and
   :code:`col`, and the enum ``DATA_DIMENSION`` indicates the
   dimensionality of the function being interpolated. For example if
   you are interpolating a color image with three channels (Red, Green
   & Blue), then ``DATA_DIMENSION = 3``.

   :class:`BiCubicInterpolator` uses the cubic convolution
   interpolation algorithm of R. Keys [Keys]_ to produce a smooth
   approximation to the function that can be used to evaluate
   :math:`f(r,c)`, :math:`\frac{\partial f(r,c)}{\partial r}` and
   :math:`\frac{\partial f(r,c)}{\partial c}` at any point in the real
   plane.

   For example, the following code interpolates a two dimensional
   array.

   .. code::

     const double data[] = {1.0, 3.0, -1.0, 4.0,
                            3.6, 2.1, 4.2, 2.0,
                            2.0, 1.0, 3.1, 5.2};
     Grid2D<double, 1> array(data, 0, 3, 0, 4);
     BiCubicInterpolator<Grid2D<double, 1>> interpolator(array);
     double f, dfdr, dfdc;
     interpolator.Evaluate(1.2, 2.5, &f, &dfdr, &dfdc);

   In the above code, the templated helper class ``Grid2D`` is used to
   make a ``C++`` array look like a two dimensional table to
   :class:`BiCubicInterpolator`.

   ``Grid2D`` supports row or column major layouts. It also supports
   vector valued functions where the individual coordinates of the
   function may be interleaved or stacked. It also allows the use of
   any numeric type as input, as long as it can be safely cast to
   double.
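
   Because :class:`BiCubicInterpolator` also provides an ``Evaluate``
   overload templated on the scalar type, it can be used inside an
   automatically differentiated cost functor. A sketch (the functor
   and its data are hypothetical, not part of Ceres):

   .. code::

     struct ImageResidual {
       ImageResidual(const BiCubicInterpolator<Grid2D<double, 1>>& image,
                     double observed)
           : image_(image), observed_(observed) {}

       template <typename T>
       bool operator()(const T* position, T* residual) const {
         T value;
         // position[0] is the row and position[1] the column at which
         // the image is interpolated.
         image_.Evaluate(position[0], position[1], &value);
         residual[0] = value - observed_;
         return true;
       }

       const BiCubicInterpolator<Grid2D<double, 1>>& image_;
       const double observed_;
     };

   which could then be wrapped in, e.g.,
   ``AutoDiffCostFunction<ImageResidual, 1, 2>``.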