.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _`chapter-nnls_modeling`:

=================================
Modeling Non-linear Least Squares
=================================

Introduction
============

Ceres solver consists of two distinct parts: a modeling API, which
provides a rich set of tools to construct an optimization problem one
term at a time, and a solver API that controls the minimization
algorithm. This chapter is devoted to the task of modeling
optimization problems using Ceres. :ref:`chapter-nnls_solving` discusses
the various ways in which an optimization problem can be solved using
Ceres.

Ceres solves robustified bounds constrained non-linear least squares
problems of the form:

.. math::
   :label: ceresproblem_modeling

   \min_{\mathbf{x}} &\quad \frac{1}{2}\sum_{i}
   \rho_i\left(\left\|f_i\left(x_{i_1},
   ... ,x_{i_k}\right)\right\|^2\right) \\
   \text{s.t.} &\quad l_j \le x_j \le u_j

In Ceres parlance, the expression
:math:`\rho_i\left(\left\|f_i\left(x_{i_1},...,x_{i_k}\right)\right\|^2\right)`
is known as a **residual block**, where :math:`f_i(\cdot)` is a
:class:`CostFunction` that depends on the **parameter blocks**
:math:`\left\{x_{i_1},... , x_{i_k}\right\}`.
In most optimization problems small groups of scalars occur
together. For example, the three components of a translation vector
and the four components of the quaternion that define the pose of a
camera. We refer to such a group of scalars as a **parameter
block**. Of course a parameter block can be just a single scalar too.

:math:`\rho_i` is a :class:`LossFunction`. A :class:`LossFunction` is
a scalar valued function that is used to reduce the influence of
outliers on the solution of non-linear least squares problems.

:math:`l_j` and :math:`u_j` are lower and upper bounds on the
parameter block :math:`x_j`.

As a special case, when :math:`\rho_i(x) = x`, i.e., the identity
function, and :math:`l_j = -\infty` and :math:`u_j = \infty` we get
the more familiar unconstrained `non-linear least squares problem
<http://en.wikipedia.org/wiki/Non-linear_least_squares>`_.

.. math::
   :label: ceresproblemunconstrained

   \frac{1}{2}\sum_{i} \left\|f_i\left(x_{i_1}, ... ,x_{i_k}\right)\right\|^2.

:class:`CostFunction`
=====================

For each term in the objective function, a :class:`CostFunction` is
responsible for computing a vector of residuals and Jacobian
matrices. Concretely, consider a function
:math:`f\left(x_{1},...,x_{k}\right)` that depends on parameter blocks
:math:`\left[x_{1}, ... , x_{k}\right]`.

Then, given :math:`\left[x_{1}, ... , x_{k}\right]`,
:class:`CostFunction` is responsible for computing the vector
:math:`f\left(x_{1},...,x_{k}\right)` and the Jacobian matrices

.. math:: J_i = \frac{\partial}{\partial x_i} f(x_1, ..., x_k) \quad \forall i \in \{1, \ldots, k\}

.. class:: CostFunction

   .. code-block:: c++

      class CostFunction {
       public:
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const = 0;
        const vector<int32>& parameter_block_sizes();
        int num_residuals() const;

       protected:
        vector<int32>* mutable_parameter_block_sizes();
        void set_num_residuals(int num_residuals);
      };


The signature of the :class:`CostFunction` (number and sizes of input
parameter blocks and number of outputs) is stored in
:member:`CostFunction::parameter_block_sizes_` and
:member:`CostFunction::num_residuals_` respectively. User code
inheriting from this class is expected to set these two members with
the corresponding accessors. This information will be verified by the
:class:`Problem` when added with :func:`Problem::AddResidualBlock`.

.. function:: bool CostFunction::Evaluate(double const* const* parameters, double* residuals, double** jacobians)

   Compute the residual vector and the Jacobian matrices.

   ``parameters`` is an array of arrays of size
   ``CostFunction::parameter_block_sizes_.size()`` and
   ``parameters[i]`` is an array of size ``parameter_block_sizes_[i]``
   that contains the :math:`i^{\text{th}}` parameter block that the
   ``CostFunction`` depends on.

   ``parameters`` is never ``NULL``.

   ``residuals`` is an array of size ``num_residuals_``.

   ``residuals`` is never ``NULL``.

   ``jacobians`` is an array of arrays of size
   ``CostFunction::parameter_block_sizes_.size()``.

   If ``jacobians`` is ``NULL``, the user is only expected to compute
   the residuals.

   ``jacobians[i]`` is a row-major array of size ``num_residuals x
   parameter_block_sizes_[i]``.

   If ``jacobians[i]`` is **not** ``NULL``, the user is required to
   compute the Jacobian of the residual vector with respect to
   ``parameters[i]`` and store it in this array, i.e.

   ``jacobians[i][r * parameter_block_sizes_[i] + c]`` =
   :math:`\frac{\displaystyle \partial \text{residual}[r]}{\displaystyle \partial \text{parameters}[i][c]}`

   If ``jacobians[i]`` is ``NULL``, then this computation can be
   skipped. This is the case when the corresponding parameter block is
   marked constant.

   The return value indicates whether the computation of the residuals
   and/or jacobians was successful or not. This can be used to
   communicate numerical failures in Jacobian computations, for
   instance.

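   For example, here is a minimal sketch of an analytic cost function
   for the scalar residual :math:`f(x) = 10 - x`. The class name and
   residual are illustrative (not part of the Ceres API); the sketch
   also shows the protected accessors being used to declare the
   signature:

   .. code-block:: c++

      // A sketch of an analytic cost function with one residual and a
      // single 1-dimensional parameter block.
      class QuadraticCostFunction : public CostFunction {
       public:
        QuadraticCostFunction() {
          set_num_residuals(1);
          mutable_parameter_block_sizes()->push_back(1);
        }

        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const {
          const double x = parameters[0][0];
          residuals[0] = 10.0 - x;
          // The Jacobian is only filled in when requested.
          if (jacobians != NULL && jacobians[0] != NULL) {
            jacobians[0][0] = -1.0;
          }
          return true;
        }
      };
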
:class:`SizedCostFunction`
==========================

.. class:: SizedCostFunction

   If the sizes of the parameter blocks and the size of the residual
   vector are known at compile time (this is the common case),
   :class:`SizedCostFunction` can be used where these values can be
   specified as template parameters and the user only needs to
   implement :func:`CostFunction::Evaluate`.

   .. code-block:: c++

      template<int kNumResiduals,
               int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
               int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
      class SizedCostFunction : public CostFunction {
       public:
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const = 0;
      };

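   For example, a hypothetical cost function with one residual and a
   single 2-dimensional parameter block (a sketch; the residual
   definition is purely illustrative) only has to implement
   ``Evaluate``:

   .. code-block:: c++

      // A sketch: 1 residual, one parameter block of size 2.
      class MyScalarCostFunction : public SizedCostFunction<1, 2> {
       public:
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const {
          const double* x = parameters[0];
          residuals[0] = 1.0 - x[0] * x[0] - x[1] * x[1];
          if (jacobians != NULL && jacobians[0] != NULL) {
            jacobians[0][0] = -2.0 * x[0];
            jacobians[0][1] = -2.0 * x[1];
          }
          return true;
        }
      };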

:class:`AutoDiffCostFunction`
=============================

.. class:: AutoDiffCostFunction

   Defining a :class:`CostFunction` or a :class:`SizedCostFunction`
   can be tedious and error prone, especially when computing
   derivatives. To this end Ceres provides `automatic differentiation
   <http://en.wikipedia.org/wiki/Automatic_differentiation>`_.

   .. code-block:: c++

      template <typename CostFunctor,
                int kNumResiduals,  // Number of residuals, or ceres::DYNAMIC.
                int N0,       // Number of parameters in block 0.
                int N1 = 0,   // Number of parameters in block 1.
                int N2 = 0,   // Number of parameters in block 2.
                int N3 = 0,   // Number of parameters in block 3.
                int N4 = 0,   // Number of parameters in block 4.
                int N5 = 0,   // Number of parameters in block 5.
                int N6 = 0,   // Number of parameters in block 6.
                int N7 = 0,   // Number of parameters in block 7.
                int N8 = 0,   // Number of parameters in block 8.
                int N9 = 0>   // Number of parameters in block 9.
      class AutoDiffCostFunction : public
      SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
       public:
        explicit AutoDiffCostFunction(CostFunctor* functor);
        // Ignore the template parameter kNumResiduals and use
        // num_residuals instead.
        AutoDiffCostFunction(CostFunctor* functor, int num_residuals);
      };

   To get an auto differentiated cost function, you must define a
   class with a templated ``operator()`` (a functor) that computes the
   cost function in terms of the template parameter ``T``. The
   autodiff framework substitutes appropriate ``Jet`` objects for
   ``T`` in order to compute the derivative when necessary, but this
   is hidden, and you should write the function as if ``T`` were a
   scalar type (e.g. a double-precision floating point number).

   The function must write the computed value in the last argument
   (the only non-``const`` one) and return true to indicate success.

   For example, consider a scalar error :math:`e = k - x^\top y`,
   where both :math:`x` and :math:`y` are two-dimensional vector
   parameters and :math:`k` is a constant. The form of this error,
   which is the difference between a constant and an expression, is a
   common pattern in least squares problems. For example, the value
   :math:`x^\top y` might be the model expectation for a series of
   measurements, where there is an instance of the cost function for
   each measurement :math:`k`.

   The actual cost added to the total problem is :math:`e^2`, or
   :math:`(k - x^\top y)^2`; however, the squaring is implicitly done
   by the optimization framework.

223 To write an auto-differentiable cost function for the above model,
224 first define the object
225
226 .. code-block:: c++
227
228 class MyScalarCostFunctor {
229 MyScalarCostFunctor(double k): k_(k) {}
230
231 template <typename T>
232 bool operator()(const T* const x , const T* const y, T* e) const {
233 e[0] = k_ - x[0] * y[0] - x[1] * y[1];
234 return true;
235 }
236
237 private:
238 double k_;
239 };
240
241
   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to arrays
   of ``T``. If there were three input parameters, then the third input
   parameter would come after ``y``. The output is always the last
   parameter, and is also a pointer to an array. In the example above,
   ``e`` is a scalar, so only ``e[0]`` is set.

   Then given this class definition, the auto differentiated cost
   function for it can be constructed as follows.

   .. code-block:: c++

      CostFunction* cost_function
          = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
              new MyScalarCostFunctor(1.0));              ^  ^  ^
                                                          |  |  |
                          Dimension of residual ----------+  |  |
                          Dimension of x --------------------+  |
                          Dimension of y -----------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``<1, 2, 2>``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   :class:`AutoDiffCostFunction` also supports cost functions with a
   runtime-determined number of residuals. For example:

   .. code-block:: c++

      CostFunction* cost_function
          = new AutoDiffCostFunction<CostFunctorWithDynamicNumResiduals, DYNAMIC, 2, 2>(
              new CostFunctorWithDynamicNumResiduals(1.0),   ^     ^  ^
              runtime_number_of_residuals); <----+           |     |  |
                                                 |           |     |  |
                                                 |           |     |  |
                Actual number of residuals ------+           |     |  |
                Indicate dynamic number of residuals --------+     |  |
                Dimension of x ------------------------------------+  |
                Dimension of y ---------------------------------------+

   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   **WARNING 1** A common beginner's error when first using
   :class:`AutoDiffCostFunction` is to get the sizing wrong. In particular,
   there is a tendency to set the template parameters to (dimension of
   residual, number of parameters) instead of passing a dimension
   parameter for *every parameter block*. In the example above, that
   would be ``<MyScalarCostFunctor, 1, 2>``, which is missing the 2
   as the last template argument.


:class:`DynamicAutoDiffCostFunction`
====================================

.. class:: DynamicAutoDiffCostFunction

   :class:`AutoDiffCostFunction` requires that the number of parameter
   blocks and their sizes be known at compile time. It also has an
   upper limit of 10 parameter blocks. In a number of applications,
   this is not enough, e.g., Bezier curve fitting, neural network
   training, etc.

   .. code-block:: c++

      template <typename CostFunctor, int Stride = 4>
      class DynamicAutoDiffCostFunction : public CostFunction {
      };

   In such cases :class:`DynamicAutoDiffCostFunction` can be
   used. Like :class:`AutoDiffCostFunction` the user must define a
   templated functor, but the signature of the functor differs
   slightly. The expected interface for the cost functors is:

   .. code-block:: c++

      struct MyCostFunctor {
        template <typename T>
        bool operator()(T const* const* parameters, T* residuals) const {
          // Compute the residuals from the parameter blocks here.
          return true;
        }
      };

   Since the sizing of the parameters is done at runtime, you must
   also specify the sizes after creating the dynamic autodiff cost
   function. For example:

   .. code-block:: c++

      DynamicAutoDiffCostFunction<MyCostFunctor, 4>* cost_function =
          new DynamicAutoDiffCostFunction<MyCostFunctor, 4>(
              new MyCostFunctor());
      cost_function->AddParameterBlock(5);
      cost_function->AddParameterBlock(10);
      cost_function->SetNumResiduals(21);

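   A hedged sketch of a functor matching the sizes registered above
   (two parameter blocks of sizes 5 and 10, and 21 residuals; the
   residual definitions are purely illustrative):

   .. code-block:: c++

      struct MyCostFunctor {
        template <typename T>
        bool operator()(T const* const* parameters, T* residuals) const {
          const T* x = parameters[0];  // Size 5, as registered above.
          const T* y = parameters[1];  // Size 10, as registered above.
          for (int i = 0; i < 5; ++i)  residuals[i]      = x[i];          // 0..4
          for (int i = 0; i < 10; ++i) residuals[5 + i]  = y[i];          // 5..14
          for (int i = 0; i < 5; ++i)  residuals[15 + i] = x[i] - y[i];   // 15..19
          residuals[20] = x[0] * y[0];                                    // 20
          return true;
        }
      };
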
   Under the hood, the implementation evaluates the cost function
   multiple times, computing a small set of the derivatives (four by
   default, controlled by the ``Stride`` template parameter) with each
   pass. There is a performance tradeoff with the size of the passes;
   smaller sizes are more cache efficient but result in a larger
   number of passes, and larger stride lengths can destroy
   cache-locality while reducing the number of passes over the cost
   function. The optimal value depends on the number and sizes of the
   various parameter blocks.

   As a rule of thumb, try using :class:`AutoDiffCostFunction` before
   you use :class:`DynamicAutoDiffCostFunction`.

:class:`NumericDiffCostFunction`
================================

.. class:: NumericDiffCostFunction

   In some cases, it is not possible to define a templated cost
   functor, for example when the evaluation of the residual involves a
   call to a library function that you do not have control over. In
   such a situation, `numerical differentiation
   <http://en.wikipedia.org/wiki/Numerical_differentiation>`_ can be
   used.

   .. NOTE ::

     TODO(sameeragarwal): Add documentation for the constructor and for
     NumericDiffOptions. Update DynamicNumericDiffOptions in a similar
     manner.

   .. code-block:: c++

      template <typename CostFunctor,
                NumericDiffMethodType method = CENTRAL,
                int kNumResiduals = 0,  // Number of residuals, or ceres::DYNAMIC.
                int N0 = 0,   // Number of parameters in block 0.
                int N1 = 0,   // Number of parameters in block 1.
                int N2 = 0,   // Number of parameters in block 2.
                int N3 = 0,   // Number of parameters in block 3.
                int N4 = 0,   // Number of parameters in block 4.
                int N5 = 0,   // Number of parameters in block 5.
                int N6 = 0,   // Number of parameters in block 6.
                int N7 = 0,   // Number of parameters in block 7.
                int N8 = 0,   // Number of parameters in block 8.
                int N9 = 0>   // Number of parameters in block 9.
      class NumericDiffCostFunction : public
      SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
      };

   To get a numerically differentiated :class:`CostFunction`, you must
   define a class with an ``operator()`` (a functor) that computes the
   residuals. The functor must write the computed value in the last
   argument (the only non-``const`` one) and return ``true`` to
   indicate success. Please see :class:`CostFunction` for details on
   how the return value may be used to impose simple constraints on
   the parameter block, e.g., an object of the form

   .. code-block:: c++

      struct ScalarFunctor {
       public:
        bool operator()(const double* const x1,
                        const double* const x2,
                        double* residuals) const;
      };

411 :math:`x` and :math:`y` are two-dimensional column vector
412 parameters, the prime sign indicates transposition, and :math:`k` is
413 a constant. The form of this error, which is the difference between
414 a constant and an expression, is a common pattern in least squares
415 problems. For example, the value :math:`x'y` might be the model
416 expectation for a series of measurements, where there is an instance
417 of the cost function for each measurement :math:`k`.
418
419 To write an numerically-differentiable class:`CostFunction` for the
420 above model, first define the object
421
422 .. code-block:: c++
423
424 class MyScalarCostFunctor {
425 MyScalarCostFunctor(double k): k_(k) {}
426
427 bool operator()(const double* const x,
428 const double* const y,
429 double* residuals) const {
430 residuals[0] = k_ - x[0] * y[0] + x[1] * y[1];
431 return true;
432 }
433
434 private:
435 double k_;
436 };
437
   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to
   arrays of ``double`` s. If there were three input parameters, then
   the third input parameter would come after ``y``. The output is
   always the last parameter, and is also a pointer to an array. In the
   example above, the residual is a scalar, so only ``residuals[0]`` is
   set.

   Then given this class definition, the numerically differentiated
   :class:`CostFunction` with central differences used for computing
   the derivative can be constructed as follows.

   .. code-block:: c++

      CostFunction* cost_function
          = new NumericDiffCostFunction<MyScalarCostFunctor, CENTRAL, 1, 2, 2>(
              new MyScalarCostFunctor(1.0));              ^        ^  ^  ^
                                                          |        |  |  |
              Finite Differencing Scheme -----------------+        |  |  |
              Dimension of residual -------------------------------+  |  |
              Dimension of x ------------------------------------------+  |
              Dimension of y ---------------------------------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``1, 2, 2``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   :class:`NumericDiffCostFunction` also supports cost functions with
   a runtime-determined number of residuals. For example:

   .. code-block:: c++

      CostFunction* cost_function
          = new NumericDiffCostFunction<CostFunctorWithDynamicNumResiduals, CENTRAL, DYNAMIC, 2, 2>(
              new CostFunctorWithDynamicNumResiduals(1.0),                     ^        ^  ^
              TAKE_OWNERSHIP,                                                  |        |  |
              runtime_number_of_residuals); <----+                             |        |  |
                                                 |                             |        |  |
                                                 |                             |        |  |
              Actual number of residuals --------+                             |        |  |
              Indicate dynamic number of residuals ----------------------------+        |  |
              Dimension of x ------------------------------------------------------------+  |
              Dimension of y ---------------------------------------------------------------+


   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   There are three available numeric differentiation schemes in
   ceres-solver:

   The ``FORWARD`` difference method, which approximates :math:`f'(x)`
   by computing :math:`\frac{f(x+h)-f(x)}{h}`, computes the cost
   function one additional time at :math:`x+h`. It is the fastest but
   least accurate method.

   The ``CENTRAL`` difference method is more accurate at the cost of
   twice as many function evaluations as forward difference,
   estimating :math:`f'(x)` by computing
   :math:`\frac{f(x+h)-f(x-h)}{2h}`.

   The ``RIDDERS`` difference method [Ridders]_ is an adaptive scheme
   that estimates derivatives by performing multiple central
   differences at varying scales. Specifically, the algorithm starts at
   a certain :math:`h` and as the derivative is estimated, this step
   size decreases. To conserve function evaluations and estimate the
   derivative error, the method performs Richardson extrapolations
   between the tested step sizes. The algorithm exhibits considerably
   higher accuracy, but does so by additional evaluations of the cost
   function.

   Consider using ``CENTRAL`` differences to begin with. Based on the
   results, either try forward difference to improve performance or
   Ridders' method to improve accuracy.

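   For example, switching schemes is just a matter of changing the
   method template argument (a sketch reusing the functor defined
   above):

   .. code-block:: c++

      // Same functor as above, but differentiated with Ridders'
      // adaptive method instead of central differences.
      CostFunction* cost_function
          = new NumericDiffCostFunction<MyScalarCostFunctor, RIDDERS, 1, 2, 2>(
              new MyScalarCostFunctor(1.0));
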
   **WARNING** A common beginner's error when first using
   :class:`NumericDiffCostFunction` is to get the sizing wrong. In
   particular, there is a tendency to set the template parameters to
   (dimension of residual, number of parameters) instead of passing a
   dimension parameter for *every parameter block*. In the example
   above, that would be ``<MyScalarCostFunctor, 1, 2>``, which is
   missing the last ``2`` argument. Please be careful when setting the
   size parameters.


Numeric Differentiation & LocalParameterization
-----------------------------------------------

   If your cost function depends on a parameter block that must lie on
   a manifold and the functor cannot be evaluated for values of that
   parameter block not on the manifold, then you may have problems
   numerically differentiating such functors.

   This is because numeric differentiation in Ceres is performed by
   perturbing the individual coordinates of the parameter blocks that
   a cost functor depends on. In doing so, we assume that the
   parameter blocks live in a Euclidean space and ignore the structure
   of the manifold that they live on. As a result some of the
   perturbations may not lie on the manifold corresponding to the
   parameter block.

   For example consider a four dimensional parameter block that is
   interpreted as a unit Quaternion. Perturbing the coordinates of
   this parameter block will violate the unit norm property of the
   parameter block.

   Fixing this problem requires that :class:`NumericDiffCostFunction`
   be aware of the :class:`LocalParameterization` associated with each
   parameter block and only generate perturbations in the local
   tangent space of each parameter block.

   For now this is not considered to be a serious enough problem to
   warrant changing the :class:`NumericDiffCostFunction` API. Further,
   in most cases it is relatively straightforward to project a point
   off the manifold back onto the manifold before using it in the
   functor. For example in case of the Quaternion, normalizing the
   4-vector before using it does the trick.

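   A hedged sketch of this projection step inside a numerically
   differentiated functor (the functor and its residuals are
   illustrative):

   .. code-block:: c++

      struct MyFunctor {
        bool operator()(const double* const quaternion,
                        double* residuals) const {
          // Project the possibly perturbed 4-vector back onto the unit
          // sphere before interpreting it as a rotation.
          double q[4];
          const double norm = sqrt(quaternion[0] * quaternion[0] +
                                   quaternion[1] * quaternion[1] +
                                   quaternion[2] * quaternion[2] +
                                   quaternion[3] * quaternion[3]);
          for (int i = 0; i < 4; ++i) {
            q[i] = quaternion[i] / norm;
          }
          // ... evaluate the residuals using q instead of quaternion.
          return true;
        }
      };
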
   **Alternate Interface**

   For a variety of reasons, including compatibility with legacy code,
   :class:`NumericDiffCostFunction` can also take
   :class:`CostFunction` objects as input. The following describes
   how.

   To get a numerically differentiated cost function, define a
   subclass of :class:`CostFunction` such that the
   :func:`CostFunction::Evaluate` function ignores the ``jacobians``
   parameter. The numeric differentiation wrapper will fill in the
   jacobian parameter if necessary by repeatedly calling the
   :func:`CostFunction::Evaluate` with small changes to the
   appropriate parameters, and computing the slope. For performance,
   the numeric differentiation wrapper class is templated on the
   concrete cost function, even though it could be implemented only in
   terms of the :class:`CostFunction` interface.

   The numerically differentiated version of a cost function can be
   constructed as follows:

   .. code-block:: c++

      CostFunction* cost_function
          = new NumericDiffCostFunction<MyCostFunction, CENTRAL, 1, 4, 8>(
              new MyCostFunction(...), TAKE_OWNERSHIP);

   where ``MyCostFunction`` has 1 residual and 2 parameter blocks with
   sizes 4 and 8 respectively. Look at the tests for a more detailed
   example.

:class:`DynamicNumericDiffCostFunction`
=======================================

.. class:: DynamicNumericDiffCostFunction

   Like :class:`AutoDiffCostFunction`, :class:`NumericDiffCostFunction`
   requires that the number of parameter blocks and their sizes be
   known at compile time. It also has an upper limit of 10 parameter
   blocks. In a number of applications, this is not enough.

   .. code-block:: c++

      template <typename CostFunctor, NumericDiffMethodType method = CENTRAL>
      class DynamicNumericDiffCostFunction : public CostFunction {
      };

   In such cases when numeric differentiation is desired,
   :class:`DynamicNumericDiffCostFunction` can be used.

   Like :class:`NumericDiffCostFunction` the user must define a
   functor, but the signature of the functor differs slightly. The
   expected interface for the cost functors is:

   .. code-block:: c++

      struct MyCostFunctor {
        bool operator()(double const* const* parameters,
                        double* residuals) const {
          // Compute the residuals from the parameter blocks here.
          return true;
        }
      };

   Since the sizing of the parameters is done at runtime, you must
   also specify the sizes after creating the dynamic numeric diff cost
   function. For example:

   .. code-block:: c++

      DynamicNumericDiffCostFunction<MyCostFunctor>* cost_function =
          new DynamicNumericDiffCostFunction<MyCostFunctor>(new MyCostFunctor);
      cost_function->AddParameterBlock(5);
      cost_function->AddParameterBlock(10);
      cost_function->SetNumResiduals(21);

   As a rule of thumb, try using :class:`NumericDiffCostFunction` before
   you use :class:`DynamicNumericDiffCostFunction`.

   **WARNING** The same caution about mixing local parameterizations
   with numeric differentiation applies as in the case of
   :class:`NumericDiffCostFunction`.

:class:`CostFunctionToFunctor`
==============================

.. class:: CostFunctionToFunctor

   :class:`CostFunctionToFunctor` is an adapter class that allows
   users to use :class:`CostFunction` objects in templated functors
   which are to be used for automatic differentiation. This allows
   the user to seamlessly mix analytic, numeric and automatic
   differentiation.

   For example, let us assume that

   .. code-block:: c++

      class IntrinsicProjection : public SizedCostFunction<2, 5, 3> {
       public:
        IntrinsicProjection(const double* observation);
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const;
      };

   is a :class:`CostFunction` that implements the projection of a
   point in its local coordinate system onto its image plane and
   subtracts it from the observed point projection. It can compute its
   residuals, and its Jacobians either via analytic or numerical
   differentiation.

   Now we would like to compose the action of this
   :class:`CostFunction` with the action of camera extrinsics, i.e.,
   rotation and translation. Say we have a templated function

   .. code-block:: c++

      template <typename T>
      void RotateAndTranslatePoint(const T* rotation,
                                   const T* translation,
                                   const T* point,
                                   T* result);

   Then we can do the following:

   .. code-block:: c++

      struct CameraProjection {
        CameraProjection(double* observation)
            : intrinsic_projection_(new IntrinsicProjection(observation)) {
        }

        template <typename T>
        bool operator()(const T* rotation,
                        const T* translation,
                        const T* intrinsics,
                        const T* point,
                        T* residual) const {
          T transformed_point[3];
          RotateAndTranslatePoint(rotation, translation, point, transformed_point);

          // Note that we call intrinsic_projection_, just like it was
          // any other templated functor.
          return intrinsic_projection_(intrinsics, transformed_point, residual);
        }

       private:
        CostFunctionToFunctor<2, 5, 3> intrinsic_projection_;
      };

   Note that :class:`CostFunctionToFunctor` takes ownership of the
   :class:`CostFunction` that was passed in to the constructor.

   In the above example, we assumed that ``IntrinsicProjection`` is a
   ``CostFunction`` capable of evaluating its value and its
   derivatives. Suppose that were not the case and
   ``IntrinsicProjection`` were defined as follows:

   .. code-block:: c++

      struct IntrinsicProjection {
        IntrinsicProjection(const double* observation) {
          observation_[0] = observation[0];
          observation_[1] = observation[1];
        }

        bool operator()(const double* calibration,
                        const double* point,
                        double* residuals) const {
          double projection[2];
          ThirdPartyProjectionFunction(calibration, point, projection);
          residuals[0] = observation_[0] - projection[0];
          residuals[1] = observation_[1] - projection[1];
          return true;
        }

        double observation_[2];
      };


   Here ``ThirdPartyProjectionFunction`` is some third party library
   function that we have no control over. So this function can compute
   its value and we would like to use numeric differentiation to
   compute its derivatives. In this case we can use a combination of
   ``NumericDiffCostFunction`` and ``CostFunctionToFunctor`` to get the
   job done.

   .. code-block:: c++

      struct CameraProjection {
        CameraProjection(double* observation)
            : intrinsic_projection_(
                  new NumericDiffCostFunction<IntrinsicProjection,
                                              CENTRAL, 2, 5, 3>(
                      new IntrinsicProjection(observation))) {
        }

        template <typename T>
        bool operator()(const T* rotation,
                        const T* translation,
                        const T* intrinsics,
                        const T* point,
                        T* residuals) const {
          T transformed_point[3];
          RotateAndTranslatePoint(rotation, translation, point, transformed_point);
          return intrinsic_projection_(intrinsics, transformed_point, residuals);
        }

       private:
        CostFunctionToFunctor<2, 5, 3> intrinsic_projection_;
      };

:class:`DynamicCostFunctionToFunctor`
=====================================

.. class:: DynamicCostFunctionToFunctor

   :class:`DynamicCostFunctionToFunctor` provides the same functionality as
   :class:`CostFunctionToFunctor` for cases where the number and size of the
   parameter vectors and residuals are not known at compile-time. The API
   provided by :class:`DynamicCostFunctionToFunctor` matches what would be
   expected by :class:`DynamicAutoDiffCostFunction`, i.e. it provides a
   templated functor of this form:

   .. code-block:: c++

      template <typename T>
      bool operator()(T const* const* parameters, T* residuals) const;

   Similar to the example given for :class:`CostFunctionToFunctor`, let us
   assume that

   .. code-block:: c++

      class IntrinsicProjection : public CostFunction {
       public:
        IntrinsicProjection(const double* observation);
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const;
      };

   is a :class:`CostFunction` that projects a point in its local coordinate
   system onto its image plane and subtracts it from the observed point
   projection.

   Using this :class:`CostFunction` in a templated functor would then look like
   this:

   .. code-block:: c++

      struct CameraProjection {
        CameraProjection(double* observation)
            : intrinsic_projection_(new IntrinsicProjection(observation)) {
        }

        template <typename T>
        bool operator()(T const* const* parameters,
                        T* residual) const {
          const T* rotation = parameters[0];
          const T* translation = parameters[1];
          const T* intrinsics = parameters[2];
          const T* point = parameters[3];

          T transformed_point[3];
          RotateAndTranslatePoint(rotation, translation, point, transformed_point);

          const T* projection_parameters[2];
          projection_parameters[0] = intrinsics;
          projection_parameters[1] = transformed_point;
          return intrinsic_projection_(projection_parameters, residual);
        }

       private:
        DynamicCostFunctionToFunctor intrinsic_projection_;
      };

   Like :class:`CostFunctionToFunctor`, :class:`DynamicCostFunctionToFunctor`
   takes ownership of the :class:`CostFunction` that was passed in to the
   constructor.

:class:`ConditionedCostFunction`
================================

.. class:: ConditionedCostFunction

   This class allows you to apply different conditioning to the residual
   values of a wrapped cost function. An example where this is useful is
   where you have an existing cost function that produces N values, but you
   want the total cost to be something other than just the sum of these
   squared values - maybe you want to apply a different scaling to some
   values, to change their contribution to the cost.

   Usage:

   .. code-block:: c++

      // my_cost_function produces N residuals
      CostFunction* my_cost_function = ...
      CHECK_EQ(N, my_cost_function->num_residuals());
      vector<CostFunction*> conditioners;

      // Make N 1x1 cost functions (1 parameter, 1 residual)
      CostFunction* f_1 = ...
      conditioners.push_back(f_1);

      CostFunction* f_N = ...
      conditioners.push_back(f_N);
      ConditionedCostFunction* ccf =
          new ConditionedCostFunction(my_cost_function, conditioners);

   Now ``ccf`` 's ``residual[i]`` (i=0..N-1) will be passed through the
   :math:`i^{\text{th}}` conditioner.

   .. code-block:: c++

      ccf_residual[i] = f_i(my_cost_function_residual[i])

   and the Jacobian will be affected appropriately.


:class:`GradientChecker`
================================

.. class:: GradientChecker

   This class compares the Jacobians returned by a cost function against
   derivatives estimated using finite differencing. It is meant as a tool for
   unit testing, giving you more fine-grained control than the
   ``check_gradients`` option in the solver options.

   The condition enforced is that

   .. math:: \forall{i,j}: \frac{J_{ij} - J'_{ij}}{\max_{ij}(J_{ij} - J'_{ij})} < r

   where :math:`J_{ij}` is the Jacobian as computed by the supplied cost
   function (by the user) multiplied by the local parameterization Jacobian,
   :math:`J'_{ij}` is the Jacobian as computed by finite differences,
   multiplied by the local parameterization Jacobian as well, and :math:`r`
   is the relative precision.

   Usage:

   .. code-block:: c++

      // my_cost_function takes two parameter blocks. The first has a local
      // parameterization associated with it.
      CostFunction* my_cost_function = ...
      LocalParameterization* my_parameterization = ...
      NumericDiffOptions numeric_diff_options;

      std::vector<LocalParameterization*> local_parameterizations;
      local_parameterizations.push_back(my_parameterization);
      local_parameterizations.push_back(NULL);

      std::vector<double> parameter1;
      std::vector<double> parameter2;
      // Fill parameter 1 & 2 with test data...

      std::vector<double*> parameter_blocks;
      parameter_blocks.push_back(parameter1.data());
      parameter_blocks.push_back(parameter2.data());

      GradientChecker gradient_checker(my_cost_function,
                                       local_parameterizations,
                                       numeric_diff_options);
      GradientChecker::ProbeResults results;
      if (!gradient_checker.Probe(parameter_blocks.data(), 1e-9, &results)) {
        LOG(ERROR) << "An error has occurred:\n" << results.error_log;
      }


:class:`NormalPrior`
====================

.. class:: NormalPrior

   .. code-block:: c++

      class NormalPrior: public CostFunction {
       public:
        // Check that the number of rows in the vector b is the same as
        // the number of columns in the matrix A; crash otherwise.
        NormalPrior(const Matrix& A, const Vector& b);

        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const;
      };

   Implements a cost function of the form

   .. math:: cost(x) = \|A(x - b)\|^2

   where the matrix :math:`A` and the vector :math:`b` are fixed and
   :math:`x` is the variable. In case the user is interested in
   implementing a cost function of the form

   .. math:: cost(x) = (x - \mu)^T S^{-1} (x - \mu)

   where :math:`\mu` is a vector and :math:`S` is a covariance matrix,
   then, :math:`A = S^{-1/2}`, i.e., the matrix :math:`A` is the square
   root of the inverse of the covariance, also known as the stiffness
   matrix. There are however no restrictions on the shape of
   :math:`A`. It is free to be rectangular, which would be the case if
   the covariance matrix :math:`S` is rank deficient.

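   For example, a hedged sketch of adding a Gaussian prior :math:`x
   \sim N(\mu, S)` on a parameter block ``x`` (``S``, ``mu``, ``x``
   and ``problem`` are assumed to exist, and :math:`S` is assumed to
   be positive definite):

   .. code-block:: c++

      // One possible S^{-1/2}: if S^{-1} = L L^T, then
      // ||L^T (x - mu)||^2 = (x - mu)^T S^{-1} (x - mu).
      Matrix L = S.inverse().llt().matrixL();
      Matrix A = L.transpose();
      problem.AddResidualBlock(new NormalPrior(A, mu), NULL, x);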

.. _`section-loss_function`:

:class:`LossFunction`
=====================

.. class:: LossFunction

   For least squares problems where the minimization may encounter
   input terms that contain outliers, that is, completely bogus
   measurements, it is important to use a loss function that reduces
   their influence.

   Consider a structure from motion problem. The unknowns are 3D
   points and camera parameters, and the measurements are image
   coordinates describing the expected reprojected position for a
   point in a camera. For example, we want to model the geometry of a
   street scene with fire hydrants and cars, observed by a moving
   camera with unknown parameters, and the only 3D points we care
   about are the pointy tippy-tops of the fire hydrants. Our magic
   image processing algorithm, which is responsible for producing the
   measurements that are input to Ceres, has found and matched all
   such tippy-tops in all image frames, except that in one of the
   frames it mistook a car's headlight for a hydrant. If we did not do
   anything special, the residual for the erroneous measurement would
   result in the entire solution getting pulled away from the optimum
   to reduce the large error that would otherwise be attributed to the
   wrong measurement.

   Using a robust loss function, the cost for large residuals is
   reduced. In the example above, this leads to outlier terms getting
   down-weighted so they do not overly influence the final solution.

   .. code-block:: c++

      class LossFunction {
       public:
        virtual void Evaluate(double s, double out[3]) const = 0;
      };

   The key method is :func:`LossFunction::Evaluate`, which given a
   non-negative scalar ``s``, computes

   .. math:: out = \begin{bmatrix}\rho(s), & \rho'(s), & \rho''(s)\end{bmatrix}

   Here the convention is that the contribution of a term to the cost
   function is given by :math:`\frac{1}{2}\rho(s)`, where :math:`s
   =\|f_i\|^2`. Calling the method with a negative value of :math:`s`
   is an error and the implementations are not required to handle that
   case.

   Most sane choices of :math:`\rho` satisfy:

   .. math::

      \rho(0) &= 0\\
      \rho'(0) &= 1\\
      \rho'(s) &< 1 \text{ in the outlier region}\\
      \rho''(s) &< 0 \text{ in the outlier region}

   so that they mimic the squared cost for small residuals.

   **Scaling**

   Given one robustifier :math:`\rho(s)` one can change the length
   scale at which robustification takes place by adding a scale
   factor :math:`a > 0`, which gives us :math:`\rho(s,a) = a^2 \rho(s /
   a^2)` and the first and second derivatives as :math:`\rho'(s /
   a^2)` and :math:`(1 / a^2) \rho''(s / a^2)` respectively.

   The reason for the appearance of squaring is that :math:`a` is in
   the units of the residual vector norm whereas :math:`s` is a squared
   norm. For applications it is more convenient to specify :math:`a` than
   its square.

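   In code, the scale :math:`a` is the constructor argument of the
   loss function, and the loss is attached per residual block. A usage
   sketch (``cost_function``, ``problem`` and ``x`` are assumed to
   exist):

   .. code-block:: c++

      // Residuals whose norm is much larger than 1.0 are treated as
      // outliers and down-weighted by the Huber loss.
      problem.AddResidualBlock(cost_function, new HuberLoss(1.0), x);
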
Instances
---------

Ceres includes a number of predefined loss functions. For simplicity
we described their unscaled versions. The figure below illustrates
their shape graphically. More details can be found in
``include/ceres/loss_function.h``.

.. figure:: loss.png
   :figwidth: 500px
   :height: 400px
   :align: center

   Shape of the various common loss functions.

.. class:: TrivialLoss

   .. math:: \rho(s) = s

.. class:: HuberLoss

   .. math:: \rho(s) = \begin{cases} s & s \le 1\\ 2 \sqrt{s} - 1 & s > 1 \end{cases}

.. class:: SoftLOneLoss

   .. math:: \rho(s) = 2 (\sqrt{1+s} - 1)

.. class:: CauchyLoss

   .. math:: \rho(s) = \log(1 + s)

.. class:: ArctanLoss

   .. math:: \rho(s) = \arctan(s)

.. class:: TolerantLoss

   .. math:: \rho(s,a,b) = b \log(1 + e^{(s - a) / b}) - b \log(1 + e^{-a / b})

.. class:: ComposedLoss

   Given two loss functions ``f`` and ``g``, implements the loss
   function ``h(s) = f(g(s))``.

   .. code-block:: c++

      class ComposedLoss : public LossFunction {
       public:
        explicit ComposedLoss(const LossFunction* f,
                              Ownership ownership_f,
                              const LossFunction* g,
                              Ownership ownership_g);
      };

.. class:: ScaledLoss

   Sometimes you want to simply scale the output value of the
   robustifier. For example, you might want to weight different error
   terms differently (e.g., weight pixel reprojection errors
   differently from terrain errors).

   Given a loss function :math:`\rho(s)` and a scalar :math:`a`, :class:`ScaledLoss`
   implements the function :math:`a \rho(s)`.

   Since we treat a ``NULL`` Loss function as the Identity loss
   function, :math:`\rho` = ``NULL`` is a valid input and will result
   in the input being scaled by :math:`a`. This provides a simple way
   of implementing a scaled ResidualBlock.

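   A usage sketch (``cost_function``, ``problem`` and ``x`` are
   assumed to exist):

   .. code-block:: c++

      // Halve the contribution of a Cauchy-robustified term.
      LossFunction* loss =
          new ScaledLoss(new CauchyLoss(1.0), 0.5, TAKE_OWNERSHIP);
      problem.AddResidualBlock(cost_function, loss, x);
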
.. class:: LossFunctionWrapper

   Sometimes after the optimization problem has been constructed, we
   wish to mutate the scale of the loss function. For example, when
   performing estimation from data which has substantial outliers,
   convergence can be improved by starting out with a large scale,
   optimizing the problem and then reducing the scale. This can have
   better convergence behavior than just using a loss function with a
   small scale.

   This templated class allows the user to implement a loss function
   whose scale can be mutated after an optimization problem has been
   constructed, e.g.,

   .. code-block:: c++

      Problem problem;

      // Add parameter blocks

      CostFunction* cost_function =
          new AutoDiffCostFunction<UW_Camera_Mapper, 2, 9, 3>(
              new UW_Camera_Mapper(feature_x, feature_y));

      LossFunctionWrapper* loss_function =
          new LossFunctionWrapper(new HuberLoss(1.0), TAKE_OWNERSHIP);
      problem.AddResidualBlock(cost_function, loss_function, parameters);

      Solver::Options options;
      Solver::Summary summary;
      Solve(options, &problem, &summary);

      loss_function->Reset(new HuberLoss(1.0), TAKE_OWNERSHIP);
      Solve(options, &problem, &summary);


Theory
------

Let us consider a problem with a single residual block and a single
parameter block.

.. math::

   \min_x \frac{1}{2}\rho(f^2(x))


Then, the robustified gradient and the Gauss-Newton Hessian are

.. math::

   g(x) &= \rho'J^\top(x)f(x)\\
   H(x) &= J^\top(x)\left(\rho' + 2 \rho''f(x)f^\top(x)\right)J(x)

where the terms involving the second derivatives of :math:`f(x)` have
been ignored. Note that :math:`H(x)` is indefinite if
:math:`\rho''f(x)^\top f(x) + \frac{1}{2}\rho' < 0`. If this is not
the case, then it is possible to re-weight the residual and the
Jacobian matrix such that the robustified Gauss-Newton step
corresponds to an ordinary linear least squares problem.


Let :math:`\alpha` be a root of

.. math:: \frac{1}{2}\alpha^2 - \alpha - \frac{\rho''}{\rho'}\|f(x)\|^2 = 0.

Then, define the rescaled residual and Jacobian as

.. math::

   \tilde{f}(x) &= \frac{\sqrt{\rho'}}{1 - \alpha} f(x)\\
   \tilde{J}(x) &= \sqrt{\rho'}\left(1 - \alpha
   \frac{f(x)f^\top(x)}{\left\|f(x)\right\|^2} \right)J(x)

In the case :math:`2 \rho''\left\|f(x)\right\|^2 + \rho' \lesssim 0`,
we limit :math:`\alpha \le 1- \epsilon` for some small
:math:`\epsilon`. For more details see [Triggs]_.

With this simple rescaling, one can use any Jacobian based non-linear
least squares algorithm to solve robustified non-linear least squares
problems.


:class:`LocalParameterization`
==============================

.. class:: LocalParameterization

   .. code-block:: c++

      class LocalParameterization {
       public:
        virtual ~LocalParameterization() {}
        virtual bool Plus(const double* x,
                          const double* delta,
                          double* x_plus_delta) const = 0;
        virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0;
        virtual bool MultiplyByJacobian(const double* x,
                                        const int num_rows,
                                        const double* global_matrix,
                                        double* local_matrix) const;
        virtual int GlobalSize() const = 0;
        virtual int LocalSize() const = 0;
      };

   Sometimes the parameters :math:`x` can overparameterize a
   problem. In that case it is desirable to choose a parameterization
   to remove the null directions of the cost. More generally, if
   :math:`x` lies on a manifold of a smaller dimension than the
   ambient space that it is embedded in, then it is numerically and
   computationally more effective to optimize it using a
   parameterization that lives in the tangent space of that manifold
   at each point.

   For example, a sphere in three dimensions is a two dimensional
   manifold, embedded in a three dimensional space. At each point on
   the sphere, the plane tangent to it defines a two dimensional
   tangent space. For a cost function defined on this sphere, given a
   point :math:`x`, moving in the direction normal to the sphere at
   that point is not useful. Thus a better way to parameterize a point
   on a sphere is to optimize over a two dimensional vector
   :math:`\Delta x` in the tangent space at that point, and then
   "move" to the point :math:`x + \Delta x`, where the move operation
   involves projecting back onto the sphere. Doing so removes a
   redundant dimension from the optimization, making it numerically
   more robust and efficient.

   More generally we can define a function

   .. math:: x' = \boxplus(x, \Delta x),

   where :math:`x'` has the same size as :math:`x`, and :math:`\Delta
   x` is of size less than or equal to :math:`x`. The function
   :math:`\boxplus`, generalizes the definition of vector
   addition. Thus it satisfies the identity

   .. math:: \boxplus(x, 0) = x,\quad \forall x.

   Instances of :class:`LocalParameterization` implement the
   :math:`\boxplus` operation and its derivative with respect to
   :math:`\Delta x` at :math:`\Delta x = 0`.

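   A local parameterization is associated with a parameter block when
   the block is added to a :class:`Problem`. A usage sketch
   (``problem`` and the 4-vector ``quaternion`` are assumed to exist):

   .. code-block:: c++

      // The quaternion block has 4 ambient coordinates but is updated
      // in its 3-dimensional tangent space.
      problem.AddParameterBlock(quaternion, 4, new QuaternionParameterization());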

.. function:: int LocalParameterization::GlobalSize()

   The dimension of the ambient space in which the parameter block
   :math:`x` lives.

.. function:: int LocalParameterization::LocalSize()

   The size of the tangent space
   that :math:`\Delta x` lives in.

.. function:: bool LocalParameterization::Plus(const double* x, const double* delta, double* x_plus_delta) const

   :func:`LocalParameterization::Plus` implements :math:`\boxplus(x,\Delta x)`.

.. function:: bool LocalParameterization::ComputeJacobian(const double* x, double* jacobian) const

   Computes the Jacobian matrix

   .. math:: J = \left . \frac{\partial }{\partial \Delta x} \boxplus(x,\Delta x)\right|_{\Delta x = 0}

   in row major form.

.. function:: bool MultiplyByJacobian(const double* x, const int num_rows, const double* global_matrix, double* local_matrix) const

   Computes ``local_matrix = global_matrix * jacobian``, where
   ``global_matrix`` is a ``num_rows x GlobalSize`` row major matrix,
   ``local_matrix`` is a ``num_rows x LocalSize`` row major matrix, and
   ``jacobian`` is the matrix returned by
   :func:`LocalParameterization::ComputeJacobian` at :math:`x`.

   This is only used by :class:`GradientProblem`. For most normal
   uses, it is okay to use the default implementation.

Instances
---------

.. class:: IdentityParameterization

   A trivial version of :math:`\boxplus` is when :math:`\Delta x` is
   of the same size as :math:`x` and

   .. math:: \boxplus(x, \Delta x) = x + \Delta x

.. class:: SubsetParameterization

   A more interesting case is when :math:`x` is a two dimensional
   vector, and the user wishes to hold the first coordinate constant.
   Then, :math:`\Delta x` is a scalar and :math:`\boxplus` is defined
   as

   .. math::

      \boxplus(x, \Delta x) = x + \left[ \begin{array}{c} 0 \\ 1
      \end{array} \right] \Delta x

   :class:`SubsetParameterization` generalizes this construction to
   hold any part of a parameter block constant.

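   Construction takes the size of the parameter block and the indices
   of the coordinates to hold constant. A usage sketch (``problem``
   and the 3-vector ``x`` are assumed to exist):

   .. code-block:: c++

      // Hold the first coordinate of the 3-vector x constant.
      std::vector<int> constant_coordinates;
      constant_coordinates.push_back(0);
      problem.AddParameterBlock(
          x, 3, new SubsetParameterization(3, constant_coordinates));
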
.. class:: QuaternionParameterization

   Another example that occurs commonly in Structure from Motion
   problems is when camera rotations are parameterized using a
   quaternion. There, it is useful only to make updates orthogonal to
   that 4-vector defining the quaternion. One way to do this is to let
   :math:`\Delta x` be a 3 dimensional vector and define
   :math:`\boxplus` to be

   .. math:: \boxplus(x, \Delta x) = \left[ \cos(|\Delta x|), \frac{\sin\left(|\Delta x|\right)}{|\Delta x|} \Delta x \right] * x
      :label: quaternion

   The multiplication between the two 4-vectors on the right hand side
   is the standard quaternion
   product. :class:`QuaternionParameterization` is an implementation
   of :eq:`quaternion`.

.. class:: EigenQuaternionParameterization

   Eigen uses a different internal memory layout for the elements of the
   quaternion than what is commonly used. Specifically, Eigen stores the
   elements in memory as [x, y, z, w] where the real part is last
   whereas it is typically stored first. Note, when creating an Eigen
   quaternion through the constructor the elements are accepted in w, x,
   y, z order. Since Ceres operates on parameter blocks which are raw
   double pointers this difference is important and requires a different
   parameterization. :class:`EigenQuaternionParameterization` uses the
   same update as :class:`QuaternionParameterization` but takes into
   account Eigen's internal memory element ordering.

.. class:: HomogeneousVectorParameterization

   In computer vision, homogeneous vectors are commonly used to
   represent entities in projective geometry such as points in
   projective space. One example where it is useful to use this
   over-parameterization is in representing points whose triangulation
   is ill-conditioned. Here it is advantageous to use homogeneous
   vectors, instead of a Euclidean vector, because it can represent
   points at infinity.

   When using homogeneous vectors it is useful to only make updates
   orthogonal to that :math:`n`-vector defining the homogeneous
   vector [HartleyZisserman]_. One way to do this is to let :math:`\Delta x`
   be a :math:`n-1` dimensional vector and define :math:`\boxplus` to be

   .. math:: \boxplus(x, \Delta x) = \left[ \frac{\sin\left(0.5 |\Delta x|\right)}{|\Delta x|} \Delta x, \cos(0.5 |\Delta x|) \right] * x

   The multiplication between the two vectors on the right hand side
   is defined as an operator which applies the update orthogonal to
   :math:`x` to remain on the sphere. Note, it is assumed that the
   last element of :math:`x` is the scalar component of the homogeneous
   vector.

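   A usage sketch (``problem`` and the homogeneous 4-vector ``point``
   are assumed to exist):

   .. code-block:: c++

      // A homogeneous point in projective 3-space: 4 ambient
      // coordinates, updated in a 3-dimensional tangent space.
      problem.AddParameterBlock(point, 4,
                                new HomogeneousVectorParameterization(4));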

.. class:: ProductParameterization

   Consider an optimization problem over the space of rigid
   transformations :math:`SE(3)`, which is the Cartesian product of
   :math:`SO(3)` and :math:`\mathbb{R}^3`. Suppose you are using
   quaternions to represent the rotation; Ceres ships with a local
   parameterization for that, and :math:`\mathbb{R}^3` requires no
   parameterization, or rather :class:`IdentityParameterization`. So
   how do we construct a local parameterization for a parameter block
   holding a rigid transformation?

   In cases where a parameter block is the Cartesian product of a
   number of manifolds and you have the local parameterizations of the
   individual manifolds available, :class:`ProductParameterization`
   can be used to construct a local parameterization of the Cartesian
   product. For the case of the rigid transformation, where say you
   have a parameter block of size 7, where the first four entries
   represent the rotation as a quaternion, a local parameterization
   can be constructed as

   .. code-block:: c++

      ProductParameterization se3_param(new QuaternionParameterization(),
                                        new IdentityParameterization(3));


:class:`AutoDiffLocalParameterization`
======================================

.. class:: AutoDiffLocalParameterization

   :class:`AutoDiffLocalParameterization` does for
   :class:`LocalParameterization` what :class:`AutoDiffCostFunction`
   does for :class:`CostFunction`. It allows the user to define a
   templated functor that implements the
   :func:`LocalParameterization::Plus` operation and it uses automatic
   differentiation to implement the computation of the Jacobian.

   To get an auto differentiated local parameterization, you must
   define a class with a templated ``operator()`` (a functor) that
   computes

   .. math:: x' = \boxplus(x, \Delta x),

   For example, Quaternions have a three dimensional local
   parameterization. Its plus operation can be implemented as (taken
   from `internal/ceres/autodiff_local_parameterization_test.cc
   <https://ceres-solver.googlesource.com/ceres-solver/+/master/internal/ceres/autodiff_local_parameterization_test.cc>`_
   )

   .. code-block:: c++

      struct QuaternionPlus {
        template<typename T>
        bool operator()(const T* x, const T* delta, T* x_plus_delta) const {
          const T squared_norm_delta =
              delta[0] * delta[0] + delta[1] * delta[1] + delta[2] * delta[2];

          T q_delta[4];
          if (squared_norm_delta > 0.0) {
            T norm_delta = sqrt(squared_norm_delta);
            const T sin_delta_by_delta = sin(norm_delta) / norm_delta;
            q_delta[0] = cos(norm_delta);
            q_delta[1] = sin_delta_by_delta * delta[0];
            q_delta[2] = sin_delta_by_delta * delta[1];
            q_delta[3] = sin_delta_by_delta * delta[2];
          } else {
            // We do not just use q_delta = [1,0,0,0] here because that is a
            // constant and when used for automatic differentiation will
            // lead to a zero derivative. Instead we take a first order
            // approximation and evaluate it at zero.
            q_delta[0] = T(1.0);
            q_delta[1] = delta[0];
            q_delta[2] = delta[1];
            q_delta[3] = delta[2];
          }

          QuaternionProduct(q_delta, x, x_plus_delta);
          return true;
        }
      };

   Given this struct, the auto differentiated local
   parameterization can now be constructed as

   .. code-block:: c++

      LocalParameterization* local_parameterization =
          new AutoDiffLocalParameterization<QuaternionPlus, 4, 3>;
                                                            |  |
                                 Global Size ---------------+  |
                                 Local Size -------------------+


:class:`Problem`
================

.. class:: Problem

   :class:`Problem` holds the robustified bounds constrained
   non-linear least squares problem :eq:`ceresproblem_modeling`. To
   create a least squares problem, use the
   :func:`Problem::AddResidualBlock` and
   :func:`Problem::AddParameterBlock` methods.

   For example, a problem containing 3 parameter blocks of sizes 3, 4
   and 5 respectively and two residual blocks of size 2 and 6:

   .. code-block:: c++

      double x1[] = { 1.0, 2.0, 3.0 };
      double x2[] = { 1.0, 2.0, 3.0, 5.0 };
      double x3[] = { 1.0, 2.0, 3.0, 6.0, 7.0 };

      Problem problem;
      problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
      problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x3);

   :func:`Problem::AddResidualBlock` as the name implies, adds a
   residual block to the problem. It adds a :class:`CostFunction`, an
   optional :class:`LossFunction` and connects the
   :class:`CostFunction` to a set of parameter blocks.

   The cost function carries with it information about the sizes of
   the parameter blocks it expects. The function checks that these
   match the sizes of the parameter blocks listed in
   ``parameter_blocks``. The program aborts if a mismatch is
   detected. ``loss_function`` can be ``NULL``, in which case the cost
   of the term is just the squared norm of the residuals.

   The user has the option of explicitly adding the parameter blocks
   using :func:`Problem::AddParameterBlock`. This causes additional
   correctness checking; however, :func:`Problem::AddResidualBlock`
   implicitly adds the parameter blocks if they are not present, so
   calling :func:`Problem::AddParameterBlock` explicitly is not
   required.

1502 :func:`Problem::AddParameterBlock` explicitly adds a parameter
1503 block to the :class:`Problem`. Optionally it allows the user to
1504 associate a :class:`LocalParameterization` object with the
1505 parameter block too. Repeated calls with the same arguments are
1506 ignored. Repeated calls with the same double pointer but a
1507 different size results in undefined behavior.
1508
1509 You can set any parameter block to be constant using
1510 :func:`Problem::SetParameterBlockConstant` and undo this using
1511 :func:`SetParameterBlockVariable`.
1512
1513 In fact you can set any number of parameter blocks to be constant,
1514 and Ceres is smart enough to figure out what part of the problem
1515 you have constructed depends on the parameter blocks that are free
1516 to change and only spends time solving it. So for example if you
1517 constructed a problem with a million parameter blocks and 2 million
1518 residual blocks, but then set all but one parameter blocks to be
1519 constant and say only 10 residual blocks depend on this one
1520 non-constant parameter block. Then the computational effort Ceres
1521 spends in solving this problem will be the same if you had defined
1522 a problem with one parameter block and 10 residual blocks.
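
   For example, continuing the sketch above, holding ``x2`` constant
   for one solve and then freeing it again is just:

   .. code-block:: c++

     problem.SetParameterBlockConstant(x2);  // x2 is held fixed.
     // ... solve ...
     problem.SetParameterBlockVariable(x2);  // x2 is free to change again.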

   **Ownership**

   :class:`Problem` by default takes ownership of the
   ``cost_function``, ``loss_function`` and ``local_parameterization``
   pointers. These objects remain live for the life of the
   :class:`Problem`. If the user wishes to keep control over the
   destruction of these objects, then they can do this by setting the
   corresponding enums in the :class:`Problem::Options` struct.

   Note that even though the Problem takes ownership of ``cost_function``
   and ``loss_function``, it does not preclude the user from re-using
   them in another residual block. The destructor takes care to call
   delete on each ``cost_function`` or ``loss_function`` pointer only
   once, regardless of how many residual blocks refer to them.
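
   If, for instance, the cost and loss function objects are owned
   elsewhere, a minimal sketch of opting out of ownership via the
   :class:`Problem::Options` enums mentioned above is:

   .. code-block:: c++

     Problem::Options options;
     options.cost_function_ownership = ceres::DO_NOT_TAKE_OWNERSHIP;
     options.loss_function_ownership = ceres::DO_NOT_TAKE_OWNERSHIP;
     Problem problem(options);
     // The cost and loss function objects must now outlive the problem.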

.. function:: ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, const vector<double*> parameter_blocks)
.. function:: ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, double *x0, double *x1, ...)

   Add a residual block to the overall cost function. The cost
   function carries with it information about the sizes of the
   parameter blocks it expects. The function checks that these match
   the sizes of the parameter blocks listed in ``parameter_blocks``.
   The program aborts if a mismatch is detected. ``loss_function`` can
   be ``NULL``, in which case the cost of the term is just the squared
   norm of the residuals.

   The parameter blocks may be passed together as a
   ``vector<double*>``, or as up to ten separate ``double*`` pointers.

   The user has the option of explicitly adding the parameter blocks
   using AddParameterBlock. This causes additional correctness
   checking; however, AddResidualBlock implicitly adds the parameter
   blocks if they are not present, so calling AddParameterBlock
   explicitly is not required.

   The Problem object by default takes ownership of the
   cost_function and loss_function pointers. These objects remain
   live for the life of the Problem object. If the user wishes to
   keep control over the destruction of these objects, then they can
   do this by setting the corresponding enums in the Options struct.

   Note: Even though the Problem takes ownership of cost_function
   and loss_function, it does not preclude the user from re-using
   them in another residual block. The destructor takes care to call
   delete on each cost_function or loss_function pointer only once,
   regardless of how many residual blocks refer to them.

   Example usage:

   .. code-block:: c++

     double x1[] = {1.0, 2.0, 3.0};
     double x2[] = {1.0, 2.0, 5.0, 6.0};
     double x3[] = {3.0, 6.0, 2.0, 5.0, 1.0};
     vector<double*> v1;
     v1.push_back(x1);
     vector<double*> v2;
     v2.push_back(x2);
     v2.push_back(x1);

     Problem problem;

     problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
     problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x1);
     problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, v1);
     problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, v2);

.. function:: void Problem::AddParameterBlock(double* values, int size, LocalParameterization* local_parameterization)

   Add a parameter block with appropriate size and parameterization to
   the problem. Repeated calls with the same arguments are ignored.
   Repeated calls with the same double pointer but a different size
   result in undefined behavior.

.. function:: void Problem::AddParameterBlock(double* values, int size)

   Add a parameter block with appropriate size to the problem.
   Repeated calls with the same arguments are ignored. Repeated calls
   with the same double pointer but a different size result in
   undefined behavior.

.. function:: void Problem::RemoveResidualBlock(ResidualBlockId residual_block)

   Remove a residual block from the problem. Any parameters that the
   residual block depends on are not removed. The cost and loss
   functions for the residual block will not get deleted immediately;
   this happens only when the problem itself is deleted. If
   ``Problem::Options::enable_fast_removal`` is true, then the removal
   is fast (almost constant time). Otherwise, removing a residual
   block will incur a scan of the entire Problem object to verify that
   the residual_block represents a valid residual in the problem.

   **WARNING:** Removing a residual or parameter block will destroy
   the implicit ordering, rendering the jacobian or residuals returned
   from the solver uninterpretable. If you depend on the evaluated
   jacobian, do not use remove! This may change in a future release.

.. function:: void Problem::RemoveParameterBlock(double* values)

   Remove a parameter block from the problem. The parameterization of
   the parameter block, if it exists, will persist until the deletion
   of the problem (similar to cost/loss functions in residual block
   removal). Any residual blocks that depend on the parameter are also
   removed, as described above in RemoveResidualBlock(). If
   ``Problem::Options::enable_fast_removal`` is true, then
   the removal is fast (almost constant time). Otherwise, removing a
   parameter block will incur a scan of the entire Problem object.

   **WARNING:** Removing a residual or parameter block will destroy
   the implicit ordering, rendering the jacobian or residuals returned
   from the solver uninterpretable. If you depend on the evaluated
   jacobian, do not use remove! This may change in a future release.
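
   As a sketch (``MyCostFunction`` is hypothetical), enabling fast
   removal and then removing blocks might look like:

   .. code-block:: c++

     Problem::Options options;
     options.enable_fast_removal = true;
     Problem problem(options);

     double x[3] = {1.0, 2.0, 3.0};
     ResidualBlockId id =
         problem.AddResidualBlock(new MyCostFunction(...), NULL, x);

     problem.RemoveResidualBlock(id);  // Almost constant time.
     problem.RemoveParameterBlock(x);  // Also removes residuals on x.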

.. function:: void Problem::SetParameterBlockConstant(double* values)

   Hold the indicated parameter block constant during optimization.

.. function:: void Problem::SetParameterBlockVariable(double* values)

   Allow the indicated parameter block to vary during optimization.

.. function:: void Problem::SetParameterization(double* values, LocalParameterization* local_parameterization)

   Set the local parameterization for one of the parameter blocks.
   The local_parameterization is owned by the Problem by default. It
   is acceptable to set the same parameterization for multiple
   parameter blocks; the destructor is careful to delete local
   parameterizations only once. The local parameterization can only be
   set once per parameter block, and cannot be changed once set.

.. function:: LocalParameterization* Problem::GetParameterization(double* values) const

   Get the local parameterization object associated with this
   parameter block. If there is no parameterization object associated
   then ``NULL`` is returned.

.. function:: void Problem::SetParameterLowerBound(double* values, int index, double lower_bound)

   Set the lower bound for the parameter at position `index` in the
   parameter block corresponding to `values`. By default the lower
   bound is ``-std::numeric_limits<double>::max()``, which is treated
   by the solver as the same as :math:`-\infty`.

.. function:: void Problem::SetParameterUpperBound(double* values, int index, double upper_bound)

   Set the upper bound for the parameter at position `index` in the
   parameter block corresponding to `values`. By default the value is
   ``std::numeric_limits<double>::max()``, which is treated by the
   solver as the same as :math:`\infty`.
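
   A small sketch of bounding individual coordinates (assuming a
   ``problem`` as above; the block ``x`` is hypothetical):

   .. code-block:: c++

     double x[2] = {0.5, -0.5};
     problem.AddParameterBlock(x, 2);
     problem.SetParameterLowerBound(x, 0, 0.0);   // 0 <= x[0]
     problem.SetParameterUpperBound(x, 0, 1.0);   // x[0] <= 1
     problem.SetParameterLowerBound(x, 1, -1.0);  // -1 <= x[1], no upper bound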

.. function:: double Problem::GetParameterLowerBound(double* values, int index)

   Get the lower bound for the parameter at position `index`. If the
   parameter is not bounded by the user, then its lower bound is
   ``-std::numeric_limits<double>::max()``.

.. function:: double Problem::GetParameterUpperBound(double* values, int index)

   Get the upper bound for the parameter at position `index`. If the
   parameter is not bounded by the user, then its upper bound is
   ``std::numeric_limits<double>::max()``.

.. function:: int Problem::NumParameterBlocks() const

   Number of parameter blocks in the problem. Always equals
   ``parameter_blocks().size()`` and ``parameter_block_sizes().size()``.

.. function:: int Problem::NumParameters() const

   The size of the parameter vector obtained by summing over the sizes
   of all the parameter blocks.

.. function:: int Problem::NumResidualBlocks() const

   Number of residual blocks in the problem. Always equals
   ``residual_blocks().size()``.

.. function:: int Problem::NumResiduals() const

   The size of the residual vector obtained by summing over the sizes
   of all of the residual blocks.

.. function:: int Problem::ParameterBlockSize(const double* values) const

   The size of the parameter block.

.. function:: int Problem::ParameterBlockLocalSize(const double* values) const

   The size of the local parameterization for the parameter block. If
   there is no local parameterization associated with this parameter
   block, then ``ParameterBlockLocalSize`` = ``ParameterBlockSize``.

.. function:: bool Problem::HasParameterBlock(const double* values) const

   Is the given parameter block present in the problem or not?

.. function:: void Problem::GetParameterBlocks(vector<double*>* parameter_blocks) const

   Fills the passed ``parameter_blocks`` vector with pointers to the
   parameter blocks currently in the problem. After this call,
   ``parameter_blocks.size() == NumParameterBlocks()``.

.. function:: void Problem::GetResidualBlocks(vector<ResidualBlockId>* residual_blocks) const

   Fills the passed ``residual_blocks`` vector with pointers to the
   residual blocks currently in the problem. After this call,
   ``residual_blocks.size() == NumResidualBlocks()``.

.. function:: void Problem::GetParameterBlocksForResidualBlock(const ResidualBlockId residual_block, vector<double*>* parameter_blocks) const

   Get all the parameter blocks that depend on the given residual
   block.

.. function:: void Problem::GetResidualBlocksForParameterBlock(const double* values, vector<ResidualBlockId>* residual_blocks) const

   Get all the residual blocks that depend on the given parameter
   block.

   If ``Problem::Options::enable_fast_removal`` is
   ``true``, then getting the residual blocks is fast and depends only
   on the number of residual blocks. Otherwise, getting the residual
   blocks for a parameter block will incur a scan of the entire
   :class:`Problem` object.

.. function:: const CostFunction* Problem::GetCostFunctionForResidualBlock(const ResidualBlockId residual_block) const

   Get the :class:`CostFunction` for the given residual block.

.. function:: const LossFunction* Problem::GetLossFunctionForResidualBlock(const ResidualBlockId residual_block) const

   Get the :class:`LossFunction` for the given residual block.

.. function:: bool Problem::Evaluate(const Problem::EvaluateOptions& options, double* cost, vector<double>* residuals, vector<double>* gradient, CRSMatrix* jacobian)

   Evaluate a :class:`Problem`. Any of the output pointers can be
   `NULL`. Which residual blocks and parameter blocks are used is
   controlled by the :class:`Problem::EvaluateOptions` struct below.

   .. NOTE::

      The evaluation will use the values stored in the memory
      locations pointed to by the parameter block pointers used at the
      time of the construction of the problem, for example in the
      following code:

      .. code-block:: c++

        Problem problem;
        double x = 1;
        problem.AddResidualBlock(new MyCostFunction, NULL, &x);

        double cost = 0.0;
        problem.Evaluate(Problem::EvaluateOptions(), &cost, NULL, NULL, NULL);

      The cost is evaluated at `x = 1`. If you wish to evaluate the
      problem at `x = 2`, then

      .. code-block:: c++

        x = 2;
        problem.Evaluate(Problem::EvaluateOptions(), &cost, NULL, NULL, NULL);

      is the way to do so.

   .. NOTE::

      If no local parameterizations are used, then the size of
      the gradient vector is the sum of the sizes of all the parameter
      blocks. If a parameter block has a local parameterization, then
      it contributes "LocalSize" entries to the gradient vector.

   .. NOTE::

      This function cannot be called while the problem is being
      solved, for example it cannot be called from an
      :class:`IterationCallback` at the end of an iteration during a
      solve.

.. class:: Problem::EvaluateOptions

   Options struct that is used to control :func:`Problem::Evaluate`.

.. member:: vector<double*> Problem::EvaluateOptions::parameter_blocks

   The set of parameter blocks for which evaluation should be
   performed. This vector determines the order in which parameter
   blocks occur in the gradient vector and in the columns of the
   jacobian matrix. If ``parameter_blocks`` is empty, then it is
   assumed to be equal to a vector containing ALL the parameter
   blocks. Generally speaking the ordering of the parameter blocks in
   this case depends on the order in which they were added to the
   problem and whether or not the user removed any parameter blocks.

   **NOTE** This vector should contain the same pointers as the ones
   used to add parameter blocks to the Problem. These parameter blocks
   should NOT point to new memory locations. Bad things will happen if
   you do.

.. member:: vector<ResidualBlockId> Problem::EvaluateOptions::residual_blocks

   The set of residual blocks for which evaluation should be
   performed. This vector determines the order in which the residuals
   occur, and how the rows of the jacobian are ordered. If
   ``residual_blocks`` is empty, then it is assumed to be equal to the
   vector containing all the residual blocks.

.. member:: bool Problem::EvaluateOptions::apply_loss_function

   Even though the residual blocks in the problem may contain loss
   functions, setting ``apply_loss_function`` to false will turn off
   the application of the loss function to the output of the cost
   function. This is of use for example if the user wishes to analyse
   the solution quality by studying the distribution of residuals
   before and after the solve.
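
   For example, a minimal sketch of extracting the raw
   (un-robustified) residuals after a solve, assuming a ``problem`` as
   above:

   .. code-block:: c++

     Problem::EvaluateOptions options;
     options.apply_loss_function = false;
     double cost = 0.0;
     std::vector<double> residuals;
     problem.Evaluate(options, &cost, &residuals, NULL, NULL);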

.. member:: int Problem::EvaluateOptions::num_threads

   Number of threads to use. (Requires OpenMP).

``rotation.h``
==============

Many applications of Ceres Solver involve optimization problems where
some of the variables correspond to rotations. To ease the pain of
working with the various representations of rotations (angle-axis,
quaternion and matrix) we provide a handy set of templated
functions. These functions are templated so that the user can use them
within Ceres Solver's automatic differentiation framework.

.. function:: template <typename T> void AngleAxisToQuaternion(T const* angle_axis, T* quaternion)

   Convert a value in combined axis-angle representation to a
   quaternion.

   The value ``angle_axis`` is a triple whose norm is an angle in radians,
   and whose direction is aligned with the axis of rotation, and
   ``quaternion`` is a 4-tuple that will contain the resulting quaternion.

.. function:: template <typename T> void QuaternionToAngleAxis(T const* quaternion, T* angle_axis)

   Convert a quaternion to the equivalent combined axis-angle
   representation.

   The value ``quaternion`` must be a unit quaternion - it is not
   normalized first, and ``angle_axis`` will be filled with a value
   whose norm is the angle of rotation in radians, and whose direction
   is the axis of rotation.
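
   For example, a minimal round trip (the values here are
   illustrative):

   .. code-block:: c++

     // A rotation of pi/2 radians about the z axis.
     double angle_axis[3] = {0.0, 0.0, M_PI / 2};
     double quaternion[4];
     AngleAxisToQuaternion(angle_axis, quaternion);
     // quaternion is now approximately [cos(pi/4), 0, 0, sin(pi/4)].

     double recovered[3];
     QuaternionToAngleAxis(quaternion, recovered);
     // recovered is approximately {0.0, 0.0, M_PI / 2}.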

.. function:: template <typename T, int row_stride, int col_stride> void RotationMatrixToAngleAxis(const MatrixAdapter<const T, row_stride, col_stride>& R, T* angle_axis)
.. function:: template <typename T, int row_stride, int col_stride> void AngleAxisToRotationMatrix(T const* angle_axis, const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void RotationMatrixToAngleAxis(T const* R, T* angle_axis)
.. function:: template <typename T> void AngleAxisToRotationMatrix(T const* angle_axis, T* R)

   Conversions between a 3x3 rotation matrix with given column and row
   strides and axis-angle rotation representations. The functions that
   take a pointer to T instead of a MatrixAdapter assume a column
   major representation with unit row stride and a column stride of 3.

.. function:: template <typename T, int row_stride, int col_stride> void EulerAnglesToRotationMatrix(const T* euler, const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void EulerAnglesToRotationMatrix(const T* euler, int row_stride, T* R)

   Conversions between a 3x3 rotation matrix with given column and row
   strides and Euler angle (in degrees) rotation representations.

   The {pitch,roll,yaw} Euler angles are rotations around the {x,y,z}
   axes, respectively. They are applied in that same order, so the
   total rotation R is Rz * Ry * Rx.

   The function that takes a pointer to T as the rotation matrix assumes a row
   major representation with unit column stride and a row stride of 3.
   The additional parameter row_stride is required to be 3.

.. function:: template <typename T, int row_stride, int col_stride> void QuaternionToScaledRotation(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void QuaternionToScaledRotation(const T q[4], T R[3 * 3])

   Convert a 4-vector to a 3x3 scaled rotation matrix.

   The choice of rotation is such that the quaternion
   :math:`\begin{bmatrix} 1 &0 &0 &0\end{bmatrix}` goes to an identity
   matrix and for small :math:`a, b, c` the quaternion
   :math:`\begin{bmatrix}1 &a &b &c\end{bmatrix}` goes to the matrix

   .. math::

      I + 2 \begin{bmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0
      \end{bmatrix} + O(q^2)

   which corresponds to a Rodrigues approximation, the last matrix
   being the cross-product matrix of :math:`\begin{bmatrix} a& b&
   c\end{bmatrix}`. Together with the property that :math:`R(q_1 * q_2)
   = R(q_1) * R(q_2)` this uniquely defines the mapping from :math:`q` to
   :math:`R`.

   In the function that accepts a pointer to T instead of a MatrixAdapter,
   the rotation matrix ``R`` is a row-major matrix with unit column stride
   and a row stride of 3.

   No normalization of the quaternion is performed, i.e.
   :math:`R = \|q\|^2 Q`, where :math:`Q` is an orthonormal matrix
   such that :math:`\det(Q) = 1` and :math:`Q Q' = I`.

.. function:: template <typename T, int row_stride, int col_stride> void QuaternionToRotation(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void QuaternionToRotation(const T q[4], T R[3 * 3])

   Same as above except that the rotation matrix is normalized by the
   Frobenius norm, so that :math:`R R' = I` (and :math:`\det(R) = 1`).

.. function:: template <typename T> void UnitQuaternionRotatePoint(const T q[4], const T pt[3], T result[3])

   Rotates a point ``pt`` by a quaternion ``q``:

   .. math:: \text{result} = R(q)\ \text{pt}

   Assumes the quaternion is unit norm. If you pass in a quaternion
   with :math:`|q|^2 = 2` then you WILL NOT get back 2 times the
   result you get for a unit quaternion.

.. function:: template <typename T> void QuaternionRotatePoint(const T q[4], const T pt[3], T result[3])

   With this function you do not need to assume that :math:`q` has unit norm.
   It does assume that the norm is non-zero.

.. function:: template <typename T> void QuaternionProduct(const T z[4], const T w[4], T zw[4])

   .. math:: zw = z * w

   where :math:`*` is the Quaternion product between 4-vectors.

.. function:: template <typename T> void CrossProduct(const T x[3], const T y[3], T x_cross_y[3])

   .. math:: \text{x\_cross\_y} = x \times y

.. function:: template <typename T> void AngleAxisRotatePoint(const T angle_axis[3], const T pt[3], T result[3])

   .. math:: \text{result} = R(\text{angle\_axis})\ \text{pt}

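   Because these functions are templated on the scalar type, they can
   be used directly inside cost functors that are automatically
   differentiated. As a sketch (the functor, its data members and the
   residual definition are hypothetical):

   .. code-block:: c++

     struct RotationError {
       RotationError(const double* point, const double* measured)
           : point(point), measured(measured) {}

       template <typename T>
       bool operator()(const T* angle_axis, T* residuals) const {
         const T p[3] = {T(point[0]), T(point[1]), T(point[2])};
         T rotated[3];
         AngleAxisRotatePoint(angle_axis, p, rotated);
         residuals[0] = rotated[0] - T(measured[0]);
         residuals[1] = rotated[1] - T(measured[1]);
         residuals[2] = rotated[2] - T(measured[2]);
         return true;
       }

       const double* point;
       const double* measured;
     };
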

Cubic Interpolation
===================

Optimization problems often involve functions that are given in the
form of a table of values, for example an image. Evaluating these
functions and their derivatives requires interpolating these
values. Interpolating tabulated functions is a vast area of research
and there are a lot of libraries which implement a variety of
interpolation schemes. However, using them within the automatic
differentiation framework in Ceres is quite painful. To this end,
Ceres provides the ability to interpolate one dimensional and two
dimensional tabular functions.

The one dimensional interpolation is based on the Cubic Hermite
Spline, also known as the Catmull-Rom Spline. This produces a first
order differentiable interpolating function. The two dimensional
interpolation scheme is a generalization of the one dimensional scheme
where the interpolating function is assumed to be separable in the two
dimensions.

More details of the construction can be found in `Linear Methods for
Image Interpolation <http://www.ipol.im/pub/art/2011/g_lmii/>`_ by
Pascal Getreuer.

.. class:: CubicInterpolator

Given as input an infinite one dimensional grid, which provides the
following interface:

.. code::

  struct Grid1D {
    enum { DATA_DIMENSION = 2 };
    void GetValue(int n, double* f) const;
  };

where ``GetValue`` gives us the value of a function :math:`f`
(possibly vector valued) for any integer :math:`n`, and the enum
``DATA_DIMENSION`` indicates the dimensionality of the function being
interpolated. For example if you are interpolating rotations in
axis-angle format over time, then ``DATA_DIMENSION = 3``.

:class:`CubicInterpolator` uses Cubic Hermite splines to produce a
smooth approximation to it that can be used to evaluate
:math:`f(x)` and :math:`f'(x)` at any point on the real number
line. For example, the following code interpolates an array of four
numbers.

.. code::

  const double data[] = {1.0, 2.0, 5.0, 6.0};
  Grid1D<double, 1> array(data, 0, 4);
  CubicInterpolator<Grid1D<double, 1> > interpolator(array);
  double f, dfdx;
  interpolator.Evaluate(1.5, &f, &dfdx);

In the above code we use ``Grid1D``, a templated helper class that
allows easy interfacing between ``C++`` arrays and
:class:`CubicInterpolator`.

``Grid1D`` supports vector valued functions where the various
coordinates of the function can be interleaved or stacked. It also
allows the use of any numeric type as input, as long as it can be
safely cast to a double.
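
Because :class:`CubicInterpolator` also provides an ``Evaluate``
overload templated on the scalar type for use with automatic
differentiation, it can be used inside a cost functor. A minimal
sketch (the functor and the target value are hypothetical):

.. code::

  struct InterpolatedCost {
    explicit InterpolatedCost(
        const CubicInterpolator<Grid1D<double, 1> >& interpolator)
        : interpolator(interpolator) {}

    template <typename T>
    bool operator()(const T* x, T* residual) const {
      T f;
      interpolator.Evaluate(x[0], &f);  // f carries derivatives w.r.t. x.
      residual[0] = f - T(10.0);        // Hypothetical target value.
      return true;
    }

    const CubicInterpolator<Grid1D<double, 1> >& interpolator;
  };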

.. class:: BiCubicInterpolator

Given as input an infinite two dimensional grid, which provides the
following interface:

.. code::

  struct Grid2D {
    enum { DATA_DIMENSION = 2 };
    void GetValue(int row, int col, double* f) const;
  };

where ``GetValue`` gives us the value of a function :math:`f`
(possibly vector valued) for any pair of integers :code:`row` and
:code:`col`, and the enum ``DATA_DIMENSION`` indicates the
dimensionality of the function being interpolated. For example if you
are interpolating a color image with three channels (Red, Green &
Blue), then ``DATA_DIMENSION = 3``.

:class:`BiCubicInterpolator` uses the cubic convolution interpolation
algorithm of R. Keys [Keys]_ to produce a smooth approximation to it
that can be used to evaluate :math:`f(r,c)`, :math:`\frac{\partial
f(r,c)}{\partial r}` and :math:`\frac{\partial f(r,c)}{\partial c}` at
any point in the real plane.

For example the following code interpolates a two dimensional array.

.. code::

  const double data[] = {1.0, 3.0, -1.0, 4.0,
                         3.6, 2.1,  4.2, 2.0,
                         2.0, 1.0,  3.1, 5.2};
  Grid2D<double, 1> array(data, 0, 3, 0, 4);
  BiCubicInterpolator<Grid2D<double, 1> > interpolator(array);
  double f, dfdr, dfdc;
  interpolator.Evaluate(1.2, 2.5, &f, &dfdr, &dfdc);

In the above code, the templated helper class ``Grid2D`` is used to
make a ``C++`` array look like a two dimensional table to
:class:`BiCubicInterpolator`.

``Grid2D`` supports row or column major layouts. It also supports
vector valued functions where the individual coordinates of the
function may be interleaved or stacked. It also allows the use of any
numeric type as input, as long as it can be safely cast to double.
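
Like its one dimensional counterpart, :class:`BiCubicInterpolator`
exposes an ``Evaluate`` overload templated on the scalar type for use
with automatic differentiation, so it too can be used inside a cost
functor. A minimal sketch (the functor and the target intensity are
hypothetical):

.. code::

  struct SampleCost {
    SampleCost(const BiCubicInterpolator<Grid2D<double, 1> >& image,
               double target)
        : image(image), target(target) {}

    template <typename T>
    bool operator()(const T* rc, T* residual) const {
      T f;
      // f carries derivatives with respect to rc[0] (row) and rc[1] (col).
      image.Evaluate(rc[0], rc[1], &f);
      residual[0] = f - T(target);
      return true;
    }

    const BiCubicInterpolator<Grid2D<double, 1> >& image;
    const double target;
  };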