.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _`chapter-nnls_modeling`:

=================================
Modeling Non-linear Least Squares
=================================

Introduction
============

Ceres solver consists of two distinct parts: a modeling API, which
provides a rich set of tools to construct an optimization problem one
term at a time, and a solver API that controls the minimization
algorithm. This chapter is devoted to the task of modeling
optimization problems using Ceres. :ref:`chapter-nnls_solving` discusses
the various ways in which an optimization problem can be solved using
Ceres.

Ceres solves robustified bounds constrained non-linear least squares
problems of the form:

.. math::
   :label: ceresproblem_modeling

   \min_{\mathbf{x}} &\quad \frac{1}{2}\sum_{i}
   \rho_i\left(\left\|f_i\left(x_{i_1},
   ... ,x_{i_k}\right)\right\|^2\right) \\
   \text{s.t.} &\quad l_j \le x_j \le u_j

In Ceres parlance, the expression
:math:`\rho_i\left(\left\|f_i\left(x_{i_1},...,x_{i_k}\right)\right\|^2\right)`
is known as a **residual block**, where :math:`f_i(\cdot)` is a
:class:`CostFunction` that depends on the **parameter blocks**
:math:`\left\{x_{i_1},... , x_{i_k}\right\}`.
In most optimization problems small groups of scalars occur
together. For example, the three components of a translation vector
and the four components of the quaternion that define the pose of a
camera. We refer to such a group of scalars as a **parameter
block**. Of course a parameter block can be just a single scalar too.

:math:`\rho_i` is a :class:`LossFunction`. A :class:`LossFunction` is
a scalar valued function that is used to reduce the influence of
outliers on the solution of non-linear least squares problems.

:math:`l_j` and :math:`u_j` are lower and upper bounds on the
parameter block :math:`x_j`.

As a special case, when :math:`\rho_i(x) = x`, i.e., the identity
function, and :math:`l_j = -\infty` and :math:`u_j = \infty` we get
the more familiar unconstrained `non-linear least squares problem
<http://en.wikipedia.org/wiki/Non-linear_least_squares>`_.

.. math::
   :label: ceresproblemunconstrained

   \frac{1}{2}\sum_{i} \left\|f_i\left(x_{i_1}, ... ,x_{i_k}\right)\right\|^2.

:class:`CostFunction`
=====================

For each term in the objective function, a :class:`CostFunction` is
responsible for computing a vector of residuals and Jacobian
matrices. Concretely, consider a function
:math:`f\left(x_{1},...,x_{k}\right)` that depends on parameter blocks
:math:`\left[x_{1}, ... , x_{k}\right]`.

Then, given :math:`\left[x_{1}, ... , x_{k}\right]`,
:class:`CostFunction` is responsible for computing the vector
:math:`f\left(x_{1},...,x_{k}\right)` and the Jacobian matrices

.. math:: J_i = \frac{\partial}{\partial x_i} f(x_1, ..., x_k) \quad \forall i \in \{1, \ldots, k\}

.. class:: CostFunction

   .. code-block:: c++

      class CostFunction {
       public:
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const = 0;
        const vector<int32>& parameter_block_sizes();
        int num_residuals() const;

       protected:
        vector<int32>* mutable_parameter_block_sizes();
        void set_num_residuals(int num_residuals);
      };


The signature of the :class:`CostFunction` (number and sizes of input
parameter blocks and number of outputs) is stored in
:member:`CostFunction::parameter_block_sizes_` and
:member:`CostFunction::num_residuals_` respectively. User code
inheriting from this class is expected to set these two members with
the corresponding accessors. This information will be verified by the
:class:`Problem` when added with :func:`Problem::AddResidualBlock`.

.. function:: bool CostFunction::Evaluate(double const* const* parameters, double* residuals, double** jacobians)

   Compute the residual vector and the Jacobian matrices.

   ``parameters`` is an array of arrays of size
   ``CostFunction::parameter_block_sizes_.size()`` and
   ``parameters[i]`` is an array of size ``parameter_block_sizes_[i]``
   that contains the :math:`i^{\text{th}}` parameter block that the
   ``CostFunction`` depends on.

   ``parameters`` is never ``NULL``.

   ``residuals`` is an array of size ``num_residuals_``.

   ``residuals`` is never ``NULL``.

   ``jacobians`` is an array of arrays of size
   ``CostFunction::parameter_block_sizes_.size()``.

   If ``jacobians`` is ``NULL``, the user is only expected to compute
   the residuals.

   ``jacobians[i]`` is a row-major array of size ``num_residuals x
   parameter_block_sizes_[i]``.

   If ``jacobians[i]`` is **not** ``NULL``, the user is required to
   compute the Jacobian of the residual vector with respect to
   ``parameters[i]`` and store it in this array, i.e.

   ``jacobians[i][r * parameter_block_sizes_[i] + c]`` =
   :math:`\frac{\displaystyle \partial \text{residual}[r]}{\displaystyle \partial \text{parameters}[i][c]}`

   If ``jacobians[i]`` is ``NULL``, then this computation can be
   skipped. This is the case when the corresponding parameter block is
   marked constant.

   The return value indicates whether the computation of the residuals
   and/or jacobians was successful or not. This can be used to
   communicate numerical failures in Jacobian computations, for
   instance.

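   For example, here is a minimal sketch of an analytic cost function
   for the scalar residual :math:`f(x) = 10 - x`. The class name and
   residual are illustrative (not part of the Ceres API); the sketch
   also shows the protected accessors being used to declare the
   signature:

   .. code-block:: c++

      // A sketch of an analytic cost function with one residual and a
      // single 1-dimensional parameter block.
      class QuadraticCostFunction : public CostFunction {
       public:
        QuadraticCostFunction() {
          set_num_residuals(1);
          mutable_parameter_block_sizes()->push_back(1);
        }

        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const {
          const double x = parameters[0][0];
          residuals[0] = 10.0 - x;
          // The Jacobian is only filled in when requested.
          if (jacobians != NULL && jacobians[0] != NULL) {
            jacobians[0][0] = -1.0;
          }
          return true;
        }
      };
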
:class:`SizedCostFunction`
==========================

.. class:: SizedCostFunction

   If the sizes of the parameter blocks and the size of the residual
   vector are known at compile time (this is the common case),
   :class:`SizedCostFunction` can be used where these values can be
   specified as template parameters and the user only needs to
   implement :func:`CostFunction::Evaluate`.

   .. code-block:: c++

      template<int kNumResiduals,
               int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
               int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
      class SizedCostFunction : public CostFunction {
       public:
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const = 0;
      };

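   For example, a hypothetical cost function with one residual and a
   single 2-dimensional parameter block (a sketch; the residual
   definition is purely illustrative) only has to implement
   ``Evaluate``:

   .. code-block:: c++

      // A sketch: 1 residual, one parameter block of size 2.
      class MyScalarCostFunction : public SizedCostFunction<1, 2> {
       public:
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const {
          const double* x = parameters[0];
          residuals[0] = 1.0 - x[0] * x[0] - x[1] * x[1];
          if (jacobians != NULL && jacobians[0] != NULL) {
            jacobians[0][0] = -2.0 * x[0];
            jacobians[0][1] = -2.0 * x[1];
          }
          return true;
        }
      };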

:class:`AutoDiffCostFunction`
=============================

.. class:: AutoDiffCostFunction

   Defining a :class:`CostFunction` or a :class:`SizedCostFunction`
   can be tedious and error prone, especially when computing
   derivatives. To this end Ceres provides `automatic differentiation
   <http://en.wikipedia.org/wiki/Automatic_differentiation>`_.

   .. code-block:: c++

      template <typename CostFunctor,
                int kNumResiduals,  // Number of residuals, or ceres::DYNAMIC.
                int N0,       // Number of parameters in block 0.
                int N1 = 0,   // Number of parameters in block 1.
                int N2 = 0,   // Number of parameters in block 2.
                int N3 = 0,   // Number of parameters in block 3.
                int N4 = 0,   // Number of parameters in block 4.
                int N5 = 0,   // Number of parameters in block 5.
                int N6 = 0,   // Number of parameters in block 6.
                int N7 = 0,   // Number of parameters in block 7.
                int N8 = 0,   // Number of parameters in block 8.
                int N9 = 0>   // Number of parameters in block 9.
      class AutoDiffCostFunction : public
      SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
       public:
        explicit AutoDiffCostFunction(CostFunctor* functor);
        // Ignore the template parameter kNumResiduals and use
        // num_residuals instead.
        AutoDiffCostFunction(CostFunctor* functor, int num_residuals);
      };

   To get an auto differentiated cost function, you must define a
   class with a templated ``operator()`` (a functor) that computes the
   cost function in terms of the template parameter ``T``. The
   autodiff framework substitutes appropriate ``Jet`` objects for
   ``T`` in order to compute the derivative when necessary, but this
   is hidden, and you should write the function as if ``T`` were a
   scalar type (e.g. a double-precision floating point number).

   The function must write the computed value in the last argument
   (the only non-``const`` one) and return true to indicate success.

   For example, consider a scalar error :math:`e = k - x^\top y`,
   where both :math:`x` and :math:`y` are two-dimensional vector
   parameters and :math:`k` is a constant. The form of this error,
   which is the difference between a constant and an expression, is a
   common pattern in least squares problems. For example, the value
   :math:`x^\top y` might be the model expectation for a series of
   measurements, where there is an instance of the cost function for
   each measurement :math:`k`.

   The actual cost added to the total problem is :math:`e^2`, or
   :math:`(k - x^\top y)^2`; however, the squaring is implicitly done
   by the optimization framework.

223 To write an auto-differentiable cost function for the above model,
224 first define the object
225
226 .. code-block:: c++
227
228 class MyScalarCostFunctor {
229 MyScalarCostFunctor(double k): k_(k) {}
230
231 template <typename T>
232 bool operator()(const T* const x , const T* const y, T* e) const {
233 e[0] = k_ - x[0] * y[0] - x[1] * y[1];
234 return true;
235 }
236
237 private:
238 double k_;
239 };
240
241
   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to arrays
   of ``T``. If there were three input parameters, then the third input
   parameter would come after ``y``. The output is always the last
   parameter, and is also a pointer to an array. In the example above,
   ``e`` is a scalar, so only ``e[0]`` is set.

   Then given this class definition, the auto differentiated cost
   function for it can be constructed as follows.

   .. code-block:: c++

      CostFunction* cost_function
          = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
              new MyScalarCostFunctor(1.0));              ^  ^  ^
                                                          |  |  |
                          Dimension of residual ----------+  |  |
                          Dimension of x --------------------+  |
                          Dimension of y -----------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``<1, 2, 2>``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   :class:`AutoDiffCostFunction` also supports cost functions with a
   runtime-determined number of residuals. For example:

   .. code-block:: c++

      CostFunction* cost_function
          = new AutoDiffCostFunction<CostFunctorWithDynamicNumResiduals, DYNAMIC, 2, 2>(
              new CostFunctorWithDynamicNumResiduals(1.0),   ^     ^  ^
              runtime_number_of_residuals); <----+           |     |  |
                                                 |           |     |  |
                                                 |           |     |  |
                Actual number of residuals ------+           |     |  |
                Indicate dynamic number of residuals --------+     |  |
                Dimension of x ------------------------------------+  |
                Dimension of y ---------------------------------------+

   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   **WARNING 1** A common beginner's error when first using
   :class:`AutoDiffCostFunction` is to get the sizing wrong. In particular,
   there is a tendency to set the template parameters to (dimension of
   residual, number of parameters) instead of passing a dimension
   parameter for *every parameter block*. In the example above, that
   would be ``<MyScalarCostFunctor, 1, 2>``, which is missing the 2
   as the last template argument.


:class:`DynamicAutoDiffCostFunction`
====================================

.. class:: DynamicAutoDiffCostFunction

   :class:`AutoDiffCostFunction` requires that the number of parameter
   blocks and their sizes be known at compile time. It also has an
   upper limit of 10 parameter blocks. In a number of applications,
   this is not enough, e.g., Bezier curve fitting, neural network
   training, etc.

   .. code-block:: c++

      template <typename CostFunctor, int Stride = 4>
      class DynamicAutoDiffCostFunction : public CostFunction {
      };

   In such cases :class:`DynamicAutoDiffCostFunction` can be
   used. Like :class:`AutoDiffCostFunction` the user must define a
   templated functor, but the signature of the functor differs
   slightly. The expected interface for the cost functors is:

   .. code-block:: c++

      struct MyCostFunctor {
        template <typename T>
        bool operator()(T const* const* parameters, T* residuals) const {
          // Compute the residuals from the parameter blocks here.
          return true;
        }
      };

   Since the sizing of the parameters is done at runtime, you must
   also specify the sizes after creating the dynamic autodiff cost
   function. For example:

   .. code-block:: c++

      DynamicAutoDiffCostFunction<MyCostFunctor, 4>* cost_function =
          new DynamicAutoDiffCostFunction<MyCostFunctor, 4>(
              new MyCostFunctor());
      cost_function->AddParameterBlock(5);
      cost_function->AddParameterBlock(10);
      cost_function->SetNumResiduals(21);

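   A hedged sketch of a functor matching the sizes registered above
   (two parameter blocks of sizes 5 and 10, and 21 residuals; the
   residual definitions are purely illustrative):

   .. code-block:: c++

      struct MyCostFunctor {
        template <typename T>
        bool operator()(T const* const* parameters, T* residuals) const {
          const T* x = parameters[0];  // Size 5, as registered above.
          const T* y = parameters[1];  // Size 10, as registered above.
          for (int i = 0; i < 5; ++i)  residuals[i]      = x[i];          // 0..4
          for (int i = 0; i < 10; ++i) residuals[5 + i]  = y[i];          // 5..14
          for (int i = 0; i < 5; ++i)  residuals[15 + i] = x[i] - y[i];   // 15..19
          residuals[20] = x[0] * y[0];                                    // 20
          return true;
        }
      };
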
   Under the hood, the implementation evaluates the cost function
   multiple times, computing a small set of the derivatives (four by
   default, controlled by the ``Stride`` template parameter) with each
   pass. There is a performance tradeoff with the size of the passes;
   smaller sizes are more cache efficient but result in a larger
   number of passes, and larger stride lengths can destroy
   cache-locality while reducing the number of passes over the cost
   function. The optimal value depends on the number and sizes of the
   various parameter blocks.

   As a rule of thumb, try using :class:`AutoDiffCostFunction` before
   you use :class:`DynamicAutoDiffCostFunction`.

:class:`NumericDiffCostFunction`
================================

.. class:: NumericDiffCostFunction

   In some cases, it is not possible to define a templated cost
   functor, for example when the evaluation of the residual involves a
   call to a library function that you do not have control over. In
   such a situation, `numerical differentiation
   <http://en.wikipedia.org/wiki/Numerical_differentiation>`_ can be
   used.

   .. NOTE ::

     TODO(sameeragarwal): Add documentation for the constructor and for
     NumericDiffOptions. Update DynamicNumericDiffOptions in a similar
     manner.

   .. code-block:: c++

      template <typename CostFunctor,
                NumericDiffMethodType method = CENTRAL,
                int kNumResiduals = 0,  // Number of residuals, or ceres::DYNAMIC.
                int N0 = 0,   // Number of parameters in block 0.
                int N1 = 0,   // Number of parameters in block 1.
                int N2 = 0,   // Number of parameters in block 2.
                int N3 = 0,   // Number of parameters in block 3.
                int N4 = 0,   // Number of parameters in block 4.
                int N5 = 0,   // Number of parameters in block 5.
                int N6 = 0,   // Number of parameters in block 6.
                int N7 = 0,   // Number of parameters in block 7.
                int N8 = 0,   // Number of parameters in block 8.
                int N9 = 0>   // Number of parameters in block 9.
      class NumericDiffCostFunction : public
      SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
      };

   To get a numerically differentiated :class:`CostFunction`, you must
   define a class with an ``operator()`` (a functor) that computes the
   residuals. The functor must write the computed value in the last
   argument (the only non-``const`` one) and return ``true`` to
   indicate success. Please see :class:`CostFunction` for details on
   how the return value may be used to impose simple constraints on
   the parameter block, e.g., an object of the form

   .. code-block:: c++

      struct ScalarFunctor {
       public:
        bool operator()(const double* const x1,
                        const double* const x2,
                        double* residuals) const;
      };

411 :math:`x` and :math:`y` are two-dimensional column vector
412 parameters, the prime sign indicates transposition, and :math:`k` is
413 a constant. The form of this error, which is the difference between
414 a constant and an expression, is a common pattern in least squares
415 problems. For example, the value :math:`x'y` might be the model
416 expectation for a series of measurements, where there is an instance
417 of the cost function for each measurement :math:`k`.
418
419 To write an numerically-differentiable class:`CostFunction` for the
420 above model, first define the object
421
422 .. code-block:: c++
423
424 class MyScalarCostFunctor {
425 MyScalarCostFunctor(double k): k_(k) {}
426
427 bool operator()(const double* const x,
428 const double* const y,
429 double* residuals) const {
430 residuals[0] = k_ - x[0] * y[0] + x[1] * y[1];
431 return true;
432 }
433
434 private:
435 double k_;
436 };
437
   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to
   arrays of ``double`` s. If there were three input parameters, then
   the third input parameter would come after ``y``. The output is
   always the last parameter, and is also a pointer to an array. In the
   example above, the residual is a scalar, so only ``residuals[0]`` is
   set.

   Then given this class definition, the numerically differentiated
   :class:`CostFunction` with central differences used for computing
   the derivative can be constructed as follows.

   .. code-block:: c++

      CostFunction* cost_function
          = new NumericDiffCostFunction<MyScalarCostFunctor, CENTRAL, 1, 2, 2>(
              new MyScalarCostFunctor(1.0));              ^        ^  ^  ^
                                                          |        |  |  |
              Finite Differencing Scheme -----------------+        |  |  |
              Dimension of residual -------------------------------+  |  |
              Dimension of x ------------------------------------------+  |
              Dimension of y ---------------------------------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``1, 2, 2``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   :class:`NumericDiffCostFunction` also supports cost functions with
   a runtime-determined number of residuals. For example:

   .. code-block:: c++

      CostFunction* cost_function
          = new NumericDiffCostFunction<CostFunctorWithDynamicNumResiduals, CENTRAL, DYNAMIC, 2, 2>(
              new CostFunctorWithDynamicNumResiduals(1.0),                     ^        ^  ^
              TAKE_OWNERSHIP,                                                  |        |  |
              runtime_number_of_residuals); <----+                             |        |  |
                                                 |                             |        |  |
                                                 |                             |        |  |
              Actual number of residuals --------+                             |        |  |
              Indicate dynamic number of residuals ----------------------------+        |  |
              Dimension of x ------------------------------------------------------------+  |
              Dimension of y ---------------------------------------------------------------+


   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   There are three available numeric differentiation schemes in
   ceres-solver:

   The ``FORWARD`` difference method, which approximates :math:`f'(x)`
   by computing :math:`\frac{f(x+h)-f(x)}{h}`, computes the cost
   function one additional time at :math:`x+h`. It is the fastest but
   least accurate method.

   The ``CENTRAL`` difference method is more accurate at the cost of
   twice as many function evaluations as forward difference,
   estimating :math:`f'(x)` by computing
   :math:`\frac{f(x+h)-f(x-h)}{2h}`.

   The ``RIDDERS`` difference method [Ridders]_ is an adaptive scheme
   that estimates derivatives by performing multiple central
   differences at varying scales. Specifically, the algorithm starts at
   a certain :math:`h` and as the derivative is estimated, this step
   size decreases. To conserve function evaluations and estimate the
   derivative error, the method performs Richardson extrapolations
   between the tested step sizes. The algorithm exhibits considerably
   higher accuracy, but does so by additional evaluations of the cost
   function.

   Consider using ``CENTRAL`` differences to begin with. Based on the
   results, either try forward difference to improve performance or
   Ridders' method to improve accuracy.

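   For example, switching schemes is just a matter of changing the
   method template argument (a sketch reusing the functor defined
   above):

   .. code-block:: c++

      // Same functor as above, but differentiated with Ridders'
      // adaptive method instead of central differences.
      CostFunction* cost_function
          = new NumericDiffCostFunction<MyScalarCostFunctor, RIDDERS, 1, 2, 2>(
              new MyScalarCostFunctor(1.0));
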
   **WARNING** A common beginner's error when first using
   :class:`NumericDiffCostFunction` is to get the sizing wrong. In
   particular, there is a tendency to set the template parameters to
   (dimension of residual, number of parameters) instead of passing a
   dimension parameter for *every parameter block*. In the example
   above, that would be ``<MyScalarCostFunctor, 1, 2>``, which is
   missing the last ``2`` argument. Please be careful when setting the
   size parameters.


Numeric Differentiation & LocalParameterization
-----------------------------------------------

   If your cost function depends on a parameter block that must lie on
   a manifold and the functor cannot be evaluated for values of that
   parameter block not on the manifold, then you may have problems
   numerically differentiating such functors.

   This is because numeric differentiation in Ceres is performed by
   perturbing the individual coordinates of the parameter blocks that
   a cost functor depends on. In doing so, we assume that the
   parameter blocks live in a Euclidean space and ignore the structure
   of the manifold that they live on. As a result some of the
   perturbations may not lie on the manifold corresponding to the
   parameter block.

   For example consider a four dimensional parameter block that is
   interpreted as a unit Quaternion. Perturbing the coordinates of
   this parameter block will violate the unit norm property of the
   parameter block.

   Fixing this problem requires that :class:`NumericDiffCostFunction`
   be aware of the :class:`LocalParameterization` associated with each
   parameter block and only generate perturbations in the local
   tangent space of each parameter block.

   For now this is not considered to be a serious enough problem to
   warrant changing the :class:`NumericDiffCostFunction` API. Further,
   in most cases it is relatively straightforward to project a point
   off the manifold back onto the manifold before using it in the
   functor. For example in case of the Quaternion, normalizing the
   4-vector before using it does the trick.

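   A hedged sketch of this projection step inside a numerically
   differentiated functor (the functor and its residuals are
   illustrative):

   .. code-block:: c++

      struct MyFunctor {
        bool operator()(const double* const quaternion,
                        double* residuals) const {
          // Project the possibly perturbed 4-vector back onto the unit
          // sphere before interpreting it as a rotation.
          double q[4];
          const double norm = sqrt(quaternion[0] * quaternion[0] +
                                   quaternion[1] * quaternion[1] +
                                   quaternion[2] * quaternion[2] +
                                   quaternion[3] * quaternion[3]);
          for (int i = 0; i < 4; ++i) {
            q[i] = quaternion[i] / norm;
          }
          // ... evaluate the residuals using q instead of quaternion.
          return true;
        }
      };
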
   **Alternate Interface**

   For a variety of reasons, including compatibility with legacy code,
   :class:`NumericDiffCostFunction` can also take
   :class:`CostFunction` objects as input. The following describes
   how.

   To get a numerically differentiated cost function, define a
   subclass of :class:`CostFunction` such that the
   :func:`CostFunction::Evaluate` function ignores the ``jacobians``
   parameter. The numeric differentiation wrapper will fill in the
   jacobian parameter if necessary by repeatedly calling the
   :func:`CostFunction::Evaluate` with small changes to the
   appropriate parameters, and computing the slope. For performance,
   the numeric differentiation wrapper class is templated on the
   concrete cost function, even though it could be implemented only in
   terms of the :class:`CostFunction` interface.

   The numerically differentiated version of a cost function can be
   constructed as follows:

   .. code-block:: c++

      CostFunction* cost_function
          = new NumericDiffCostFunction<MyCostFunction, CENTRAL, 1, 4, 8>(
              new MyCostFunction(...), TAKE_OWNERSHIP);

   where ``MyCostFunction`` has 1 residual and 2 parameter blocks with
   sizes 4 and 8 respectively. Look at the tests for a more detailed
   example.

:class:`DynamicNumericDiffCostFunction`
=======================================

.. class:: DynamicNumericDiffCostFunction

   Like :class:`AutoDiffCostFunction`, :class:`NumericDiffCostFunction`
   requires that the number of parameter blocks and their sizes be
   known at compile time. It also has an upper limit of 10 parameter
   blocks. In a number of applications, this is not enough.

   .. code-block:: c++

      template <typename CostFunctor, NumericDiffMethodType method = CENTRAL>
      class DynamicNumericDiffCostFunction : public CostFunction {
      };

   In such cases when numeric differentiation is desired,
   :class:`DynamicNumericDiffCostFunction` can be used.

   Like :class:`NumericDiffCostFunction` the user must define a
   functor, but the signature of the functor differs slightly. The
   expected interface for the cost functors is:

   .. code-block:: c++

      struct MyCostFunctor {
        bool operator()(double const* const* parameters,
                        double* residuals) const {
          // Compute the residuals from the parameter blocks here.
          return true;
        }
      };

   Since the sizing of the parameters is done at runtime, you must
   also specify the sizes after creating the dynamic numeric diff cost
   function. For example:

   .. code-block:: c++

      DynamicNumericDiffCostFunction<MyCostFunctor>* cost_function =
          new DynamicNumericDiffCostFunction<MyCostFunctor>(new MyCostFunctor);
      cost_function->AddParameterBlock(5);
      cost_function->AddParameterBlock(10);
      cost_function->SetNumResiduals(21);

   As a rule of thumb, try using :class:`NumericDiffCostFunction` before
   you use :class:`DynamicNumericDiffCostFunction`.

   **WARNING** The same caution about mixing local parameterizations
   with numeric differentiation applies as in the case of
   :class:`NumericDiffCostFunction`.

:class:`CostFunctionToFunctor`
==============================

.. class:: CostFunctionToFunctor

   :class:`CostFunctionToFunctor` is an adapter class that allows
   users to use :class:`CostFunction` objects in templated functors
   which are to be used for automatic differentiation. This allows
   the user to seamlessly mix analytic, numeric and automatic
   differentiation.

   For example, let us assume that

   .. code-block:: c++

      class IntrinsicProjection : public SizedCostFunction<2, 5, 3> {
       public:
        IntrinsicProjection(const double* observation);
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const;
      };

   is a :class:`CostFunction` that implements the projection of a
   point in its local coordinate system onto its image plane and
   subtracts it from the observed point projection. It can compute its
   residuals, and its Jacobians either via analytic or numerical
   differentiation.

   Now we would like to compose the action of this
   :class:`CostFunction` with the action of camera extrinsics, i.e.,
   rotation and translation. Say we have a templated function

   .. code-block:: c++

      template <typename T>
      void RotateAndTranslatePoint(const T* rotation,
                                   const T* translation,
                                   const T* point,
                                   T* result);

   Then we can do the following:

   .. code-block:: c++

      struct CameraProjection {
        CameraProjection(double* observation)
            : intrinsic_projection_(new IntrinsicProjection(observation)) {
        }

        template <typename T>
        bool operator()(const T* rotation,
                        const T* translation,
                        const T* intrinsics,
                        const T* point,
                        T* residual) const {
          T transformed_point[3];
          RotateAndTranslatePoint(rotation, translation, point, transformed_point);

          // Note that we call intrinsic_projection_, just like it was
          // any other templated functor.
          return intrinsic_projection_(intrinsics, transformed_point, residual);
        }

       private:
        CostFunctionToFunctor<2, 5, 3> intrinsic_projection_;
      };

   Note that :class:`CostFunctionToFunctor` takes ownership of the
   :class:`CostFunction` that was passed in to the constructor.

   In the above example, we assumed that ``IntrinsicProjection`` is a
   ``CostFunction`` capable of evaluating its value and its
   derivatives. Suppose that were not the case and
   ``IntrinsicProjection`` were defined as follows:

   .. code-block:: c++

      struct IntrinsicProjection {
        IntrinsicProjection(const double* observation) {
          observation_[0] = observation[0];
          observation_[1] = observation[1];
        }

        bool operator()(const double* calibration,
                        const double* point,
                        double* residuals) const {
          double projection[2];
          ThirdPartyProjectionFunction(calibration, point, projection);
          residuals[0] = observation_[0] - projection[0];
          residuals[1] = observation_[1] - projection[1];
          return true;
        }

        double observation_[2];
      };


   Here ``ThirdPartyProjectionFunction`` is some third party library
   function that we have no control over. So this function can compute
   its value and we would like to use numeric differentiation to
   compute its derivatives. In this case we can use a combination of
   ``NumericDiffCostFunction`` and ``CostFunctionToFunctor`` to get the
   job done.

   .. code-block:: c++

      struct CameraProjection {
        CameraProjection(double* observation)
            : intrinsic_projection_(
                  new NumericDiffCostFunction<IntrinsicProjection,
                                              CENTRAL, 2, 5, 3>(
                      new IntrinsicProjection(observation))) {
        }

        template <typename T>
        bool operator()(const T* rotation,
                        const T* translation,
                        const T* intrinsics,
                        const T* point,
                        T* residuals) const {
          T transformed_point[3];
          RotateAndTranslatePoint(rotation, translation, point, transformed_point);
          return intrinsic_projection_(intrinsics, transformed_point, residuals);
        }

       private:
        CostFunctionToFunctor<2, 5, 3> intrinsic_projection_;
      };

:class:`DynamicCostFunctionToFunctor`
=====================================

.. class:: DynamicCostFunctionToFunctor

   :class:`DynamicCostFunctionToFunctor` provides the same functionality as
   :class:`CostFunctionToFunctor` for cases where the number and size of the
   parameter vectors and residuals are not known at compile-time. The API
   provided by :class:`DynamicCostFunctionToFunctor` matches what would be
   expected by :class:`DynamicAutoDiffCostFunction`, i.e. it provides a
   templated functor of this form:

   .. code-block:: c++

      template <typename T>
      bool operator()(T const* const* parameters, T* residuals) const;

   Similar to the example given for :class:`CostFunctionToFunctor`, let us
   assume that

   .. code-block:: c++

      class IntrinsicProjection : public CostFunction {
       public:
        IntrinsicProjection(const double* observation);
        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const;
      };

   is a :class:`CostFunction` that projects a point in its local coordinate
   system onto its image plane and subtracts it from the observed point
   projection.

   Using this :class:`CostFunction` in a templated functor would then look like
   this:

   .. code-block:: c++

      struct CameraProjection {
        CameraProjection(double* observation)
            : intrinsic_projection_(new IntrinsicProjection(observation)) {
        }

        template <typename T>
        bool operator()(T const* const* parameters,
                        T* residual) const {
          const T* rotation = parameters[0];
          const T* translation = parameters[1];
          const T* intrinsics = parameters[2];
          const T* point = parameters[3];

          T transformed_point[3];
          RotateAndTranslatePoint(rotation, translation, point, transformed_point);

          const T* projection_parameters[2];
          projection_parameters[0] = intrinsics;
          projection_parameters[1] = transformed_point;
          return intrinsic_projection_(projection_parameters, residual);
        }

       private:
        DynamicCostFunctionToFunctor intrinsic_projection_;
      };

   Like :class:`CostFunctionToFunctor`, :class:`DynamicCostFunctionToFunctor`
   takes ownership of the :class:`CostFunction` that was passed in to the
   constructor.

:class:`ConditionedCostFunction`
================================

.. class:: ConditionedCostFunction

   This class allows you to apply different conditioning to the residual
   values of a wrapped cost function. An example where this is useful is
   where you have an existing cost function that produces N values, but you
   want the total cost to be something other than just the sum of these
   squared values - maybe you want to apply a different scaling to some
   values, to change their contribution to the cost.

   Usage:

   .. code-block:: c++

      // my_cost_function produces N residuals
      CostFunction* my_cost_function = ...
      CHECK_EQ(N, my_cost_function->num_residuals());
      vector<CostFunction*> conditioners;

      // Make N 1x1 cost functions (1 parameter, 1 residual)
      CostFunction* f_1 = ...
      conditioners.push_back(f_1);

      CostFunction* f_N = ...
      conditioners.push_back(f_N);
      ConditionedCostFunction* ccf =
          new ConditionedCostFunction(my_cost_function, conditioners);

   Now ``ccf`` 's ``residual[i]`` (i=0..N-1) will be passed through the
   :math:`i^{\text{th}}` conditioner.

   .. code-block:: c++

      ccf_residual[i] = f_i(my_cost_function_residual[i])

   and the Jacobian will be affected appropriately.


:class:`GradientChecker`
================================

.. class:: GradientChecker

   This class compares the Jacobians returned by a cost function against
   derivatives estimated using finite differencing. It is meant as a tool for
   unit testing, giving you more fine-grained control than the
   ``check_gradients`` option in the solver options.

   The condition enforced is that

   .. math:: \forall{i,j}: \frac{J_{ij} - J'_{ij}}{\max_{ij}(J_{ij} - J'_{ij})} < r

   where :math:`J_{ij}` is the Jacobian as computed by the supplied cost
   function (by the user) multiplied by the local parameterization Jacobian,
   :math:`J'_{ij}` is the Jacobian as computed by finite differences,
   multiplied by the local parameterization Jacobian as well, and :math:`r`
   is the relative precision.

   Usage:

   .. code-block:: c++

      // my_cost_function takes two parameter blocks. The first has a local
      // parameterization associated with it.
      CostFunction* my_cost_function = ...
      LocalParameterization* my_parameterization = ...
      NumericDiffOptions numeric_diff_options;

      std::vector<LocalParameterization*> local_parameterizations;
      local_parameterizations.push_back(my_parameterization);
      local_parameterizations.push_back(NULL);

      std::vector<double> parameter1;
      std::vector<double> parameter2;
      // Fill parameter 1 & 2 with test data...

      std::vector<double*> parameter_blocks;
      parameter_blocks.push_back(parameter1.data());
      parameter_blocks.push_back(parameter2.data());

      GradientChecker gradient_checker(my_cost_function,
                                       local_parameterizations,
                                       numeric_diff_options);
      GradientChecker::ProbeResults results;
      if (!gradient_checker.Probe(parameter_blocks.data(), 1e-9, &results)) {
        LOG(ERROR) << "An error has occurred:\n" << results.error_log;
      }


:class:`NormalPrior`
====================

.. class:: NormalPrior

   .. code-block:: c++

      class NormalPrior: public CostFunction {
       public:
        // Check that the number of rows in the vector b is the same as
        // the number of columns in the matrix A; crash otherwise.
        NormalPrior(const Matrix& A, const Vector& b);

        virtual bool Evaluate(double const* const* parameters,
                              double* residuals,
                              double** jacobians) const;
      };

   Implements a cost function of the form

   .. math:: cost(x) = \|A(x - b)\|^2

   where the matrix :math:`A` and the vector :math:`b` are fixed and
   :math:`x` is the variable. In case the user is interested in
   implementing a cost function of the form

   .. math:: cost(x) = (x - \mu)^T S^{-1} (x - \mu)

   where :math:`\mu` is a vector and :math:`S` is a covariance matrix,
   then, :math:`A = S^{-1/2}`, i.e., the matrix :math:`A` is the square
   root of the inverse of the covariance, also known as the stiffness
   matrix. There are however no restrictions on the shape of
   :math:`A`. It is free to be rectangular, which would be the case if
   the covariance matrix :math:`S` is rank deficient.

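   For example, a hedged sketch of adding a Gaussian prior :math:`x
   \sim N(\mu, S)` on a parameter block ``x`` (``S``, ``mu``, ``x``
   and ``problem`` are assumed to exist, and :math:`S` is assumed to
   be positive definite):

   .. code-block:: c++

      // One possible S^{-1/2}: if S^{-1} = L L^T, then
      // ||L^T (x - mu)||^2 = (x - mu)^T S^{-1} (x - mu).
      Matrix L = S.inverse().llt().matrixL();
      Matrix A = L.transpose();
      problem.AddResidualBlock(new NormalPrior(A, mu), NULL, x);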

.. _`section-loss_function`:

:class:`LossFunction`
=====================

.. class:: LossFunction

   For least squares problems where the minimization may encounter
   input terms that contain outliers, that is, completely bogus
   measurements, it is important to use a loss function that reduces
   their influence.

   Consider a structure from motion problem. The unknowns are 3D
   points and camera parameters, and the measurements are image
   coordinates describing the expected reprojected position for a
   point in a camera. For example, we want to model the geometry of a
   street scene with fire hydrants and cars, observed by a moving
   camera with unknown parameters, and the only 3D points we care
   about are the pointy tippy-tops of the fire hydrants. Our magic
   image processing algorithm, which is responsible for producing the
   measurements that are input to Ceres, has found and matched all
   such tippy-tops in all image frames, except that in one of the
   frames it mistook a car's headlight for a hydrant. If we did not do
   anything special, the residual for the erroneous measurement would
   result in the entire solution getting pulled away from the optimum
   to reduce the large error that would otherwise be attributed to the
   wrong measurement.

   Using a robust loss function, the cost for large residuals is
   reduced. In the example above, this leads to outlier terms getting
   down-weighted so they do not overly influence the final solution.

   .. code-block:: c++

      class LossFunction {
       public:
        virtual void Evaluate(double s, double out[3]) const = 0;
      };

   The key method is :func:`LossFunction::Evaluate`, which given a
   non-negative scalar ``s``, computes

   .. math:: out = \begin{bmatrix}\rho(s), & \rho'(s), & \rho''(s)\end{bmatrix}

   Here the convention is that the contribution of a term to the cost
   function is given by :math:`\frac{1}{2}\rho(s)`, where :math:`s
   =\|f_i\|^2`. Calling the method with a negative value of :math:`s`
   is an error and the implementations are not required to handle that
   case.

   Most sane choices of :math:`\rho` satisfy:

   .. math::

      \rho(0) &= 0\\
      \rho'(0) &= 1\\
      \rho'(s) &< 1 \text{ in the outlier region}\\
      \rho''(s) &< 0 \text{ in the outlier region}

   so that they mimic the squared cost for small residuals.

   **Scaling**

   Given one robustifier :math:`\rho(s)` one can change the length
   scale at which robustification takes place by adding a scale
   factor :math:`a > 0`, which gives us :math:`\rho(s,a) = a^2 \rho(s /
   a^2)` and the first and second derivatives as :math:`\rho'(s /
   a^2)` and :math:`(1 / a^2) \rho''(s / a^2)` respectively.

   The reason for the appearance of squaring is that :math:`a` is in
   the units of the residual vector norm whereas :math:`s` is a squared
   norm. For applications it is more convenient to specify :math:`a` than
   its square.

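   In code, the scale :math:`a` is the constructor argument of the
   loss function, and the loss is attached per residual block. A usage
   sketch (``cost_function``, ``problem`` and ``x`` are assumed to
   exist):

   .. code-block:: c++

      // Residuals whose norm is much larger than 1.0 are treated as
      // outliers and down-weighted by the Huber loss.
      problem.AddResidualBlock(cost_function, new HuberLoss(1.0), x);
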
Instances
---------

Ceres includes a number of predefined loss functions. For simplicity
we described their unscaled versions. The figure below illustrates
their shape graphically. More details can be found in
``include/ceres/loss_function.h``.

.. figure:: loss.png
   :figwidth: 500px
   :height: 400px
   :align: center

   Shape of the various common loss functions.

.. class:: TrivialLoss

   .. math:: \rho(s) = s

.. class:: HuberLoss

   .. math:: \rho(s) = \begin{cases} s & s \le 1\\ 2 \sqrt{s} - 1 & s > 1 \end{cases}

.. class:: SoftLOneLoss

   .. math:: \rho(s) = 2 (\sqrt{1+s} - 1)

.. class:: CauchyLoss

   .. math:: \rho(s) = \log(1 + s)

.. class:: ArctanLoss

   .. math:: \rho(s) = \arctan(s)

.. class:: TolerantLoss

   .. math:: \rho(s,a,b) = b \log(1 + e^{(s - a) / b}) - b \log(1 + e^{-a / b})

.. class:: ComposedLoss

   Given two loss functions ``f`` and ``g``, implements the loss
   function ``h(s) = f(g(s))``.

   .. code-block:: c++

      class ComposedLoss : public LossFunction {
       public:
        explicit ComposedLoss(const LossFunction* f,
                              Ownership ownership_f,
                              const LossFunction* g,
                              Ownership ownership_g);
      };

.. class:: ScaledLoss

   Sometimes you want to simply scale the output value of the
   robustifier. For example, you might want to weight different error
   terms differently (e.g., weight pixel reprojection errors
   differently from terrain errors).

   Given a loss function :math:`\rho(s)` and a scalar :math:`a`, :class:`ScaledLoss`
   implements the function :math:`a \rho(s)`.

   Since we treat a ``NULL`` Loss function as the Identity loss
   function, :math:`\rho` = ``NULL`` is a valid input and will result
   in the input being scaled by :math:`a`. This provides a simple way
   of implementing a scaled ResidualBlock.

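   A usage sketch (``cost_function``, ``problem`` and ``x`` are
   assumed to exist):

   .. code-block:: c++

      // Halve the contribution of a Cauchy-robustified term.
      LossFunction* loss =
          new ScaledLoss(new CauchyLoss(1.0), 0.5, TAKE_OWNERSHIP);
      problem.AddResidualBlock(cost_function, loss, x);
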
.. class:: LossFunctionWrapper

   Sometimes after the optimization problem has been constructed, we
   wish to mutate the scale of the loss function. For example, when
   performing estimation from data which has substantial outliers,
   convergence can be improved by starting out with a large scale,
   optimizing the problem and then reducing the scale. This can have
   better convergence behavior than just using a loss function with a
   small scale.

   This templated class allows the user to implement a loss function
   whose scale can be mutated after an optimization problem has been
   constructed, e.g.,

   .. code-block:: c++

      Problem problem;

      // Add parameter blocks

      CostFunction* cost_function =
          new AutoDiffCostFunction<UW_Camera_Mapper, 2, 9, 3>(
              new UW_Camera_Mapper(feature_x, feature_y));

      LossFunctionWrapper* loss_function =
          new LossFunctionWrapper(new HuberLoss(1.0), TAKE_OWNERSHIP);
      problem.AddResidualBlock(cost_function, loss_function, parameters);

      Solver::Options options;
      Solver::Summary summary;
      Solve(options, &problem, &summary);

      loss_function->Reset(new HuberLoss(1.0), TAKE_OWNERSHIP);
      Solve(options, &problem, &summary);


Theory
------

Let us consider a problem with a single residual block and a single
parameter block.

.. math::

   \min_x \frac{1}{2}\rho(f^2(x))


Then, the robustified gradient and the Gauss-Newton Hessian are

.. math::

   g(x) &= \rho'J^\top(x)f(x)\\
   H(x) &= J^\top(x)\left(\rho' + 2 \rho''f(x)f^\top(x)\right)J(x)

where the terms involving the second derivatives of :math:`f(x)` have
been ignored. Note that :math:`H(x)` is indefinite if
:math:`\rho''f(x)^\top f(x) + \frac{1}{2}\rho' < 0`. If this is not
the case, then it is possible to re-weight the residual and the
Jacobian matrix such that the robustified Gauss-Newton step
corresponds to an ordinary linear least squares problem.


Let :math:`\alpha` be a root of

.. math:: \frac{1}{2}\alpha^2 - \alpha - \frac{\rho''}{\rho'}\|f(x)\|^2 = 0.

Then, define the rescaled residual and Jacobian as

.. math::

   \tilde{f}(x) &= \frac{\sqrt{\rho'}}{1 - \alpha} f(x)\\
   \tilde{J}(x) &= \sqrt{\rho'}\left(1 - \alpha
   \frac{f(x)f^\top(x)}{\left\|f(x)\right\|^2} \right)J(x)

In the case :math:`2 \rho''\left\|f(x)\right\|^2 + \rho' \lesssim 0`,
we limit :math:`\alpha \le 1- \epsilon` for some small
:math:`\epsilon`. For more details see [Triggs]_.

With this simple rescaling, one can use any Jacobian based non-linear
least squares algorithm to solve robustified non-linear least squares
problems.


:class:`LocalParameterization`
==============================

.. class:: LocalParameterization

   .. code-block:: c++

      class LocalParameterization {
       public:
        virtual ~LocalParameterization() {}
        virtual bool Plus(const double* x,
                          const double* delta,
                          double* x_plus_delta) const = 0;
        virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0;
        virtual bool MultiplyByJacobian(const double* x,
                                        const int num_rows,
                                        const double* global_matrix,
                                        double* local_matrix) const;
        virtual int GlobalSize() const = 0;
        virtual int LocalSize() const = 0;
      };

   Sometimes the parameters :math:`x` can overparameterize a
   problem. In that case it is desirable to choose a parameterization
   to remove the null directions of the cost. More generally, if
   :math:`x` lies on a manifold of a smaller dimension than the
   ambient space that it is embedded in, then it is numerically and
   computationally more effective to optimize it using a
   parameterization that lives in the tangent space of that manifold
   at each point.

   For example, a sphere in three dimensions is a two dimensional
   manifold, embedded in a three dimensional space. At each point on
   the sphere, the plane tangent to it defines a two dimensional
   tangent space. For a cost function defined on this sphere, given a
   point :math:`x`, moving in the direction normal to the sphere at
   that point is not useful. Thus a better way to parameterize a point
   on a sphere is to optimize over a two dimensional vector
   :math:`\Delta x` in the tangent space at that point, and then
   "move" to the point :math:`x + \Delta x`, where the move operation
   involves projecting back onto the sphere. Doing so removes a
   redundant dimension from the optimization, making it numerically
   more robust and efficient.

   More generally we can define a function

   .. math:: x' = \boxplus(x, \Delta x),

   where :math:`x'` has the same size as :math:`x`, and :math:`\Delta
   x` is of size less than or equal to :math:`x`. The function
   :math:`\boxplus`, generalizes the definition of vector
   addition. Thus it satisfies the identity

   .. math:: \boxplus(x, 0) = x,\quad \forall x.

   Instances of :class:`LocalParameterization` implement the
   :math:`\boxplus` operation and its derivative with respect to
   :math:`\Delta x` at :math:`\Delta x = 0`.

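   A local parameterization is associated with a parameter block when
   the block is added to a :class:`Problem`. A usage sketch
   (``problem`` and the 4-vector ``quaternion`` are assumed to exist):

   .. code-block:: c++

      // The quaternion block has 4 ambient coordinates but is updated
      // in its 3-dimensional tangent space.
      problem.AddParameterBlock(quaternion, 4, new QuaternionParameterization());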

.. function:: int LocalParameterization::GlobalSize()

   The dimension of the ambient space in which the parameter block
   :math:`x` lives.

.. function:: int LocalParameterization::LocalSize()

   The size of the tangent space
   that :math:`\Delta x` lives in.

.. function:: bool LocalParameterization::Plus(const double* x, const double* delta, double* x_plus_delta) const

   :func:`LocalParameterization::Plus` implements :math:`\boxplus(x,\Delta x)`.

.. function:: bool LocalParameterization::ComputeJacobian(const double* x, double* jacobian) const

   Computes the Jacobian matrix

   .. math:: J = \left . \frac{\partial }{\partial \Delta x} \boxplus(x,\Delta x)\right|_{\Delta x = 0}

   in row major form.

.. function:: bool MultiplyByJacobian(const double* x, const int num_rows, const double* global_matrix, double* local_matrix) const

   Computes ``local_matrix = global_matrix * jacobian``, where
   ``global_matrix`` is a ``num_rows x GlobalSize`` row major matrix,
   ``local_matrix`` is a ``num_rows x LocalSize`` row major matrix, and
   ``jacobian`` is the matrix returned by
   :func:`LocalParameterization::ComputeJacobian` at :math:`x`.

   This is only used by :class:`GradientProblem`. For most normal
   uses, it is okay to use the default implementation.

Instances
---------

.. class:: IdentityParameterization

   A trivial version of :math:`\boxplus` is when :math:`\Delta x` is
   of the same size as :math:`x` and

   .. math:: \boxplus(x, \Delta x) = x + \Delta x

.. class:: SubsetParameterization

   A more interesting case is when :math:`x` is a two dimensional
   vector, and the user wishes to hold the first coordinate constant.
   Then, :math:`\Delta x` is a scalar and :math:`\boxplus` is defined
   as

   .. math::

      \boxplus(x, \Delta x) = x + \left[ \begin{array}{c} 0 \\ 1
      \end{array} \right] \Delta x

   :class:`SubsetParameterization` generalizes this construction to
   hold any part of a parameter block constant.

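   Construction takes the size of the parameter block and the indices
   of the coordinates to hold constant. A usage sketch (``problem``
   and the 3-vector ``x`` are assumed to exist):

   .. code-block:: c++

      // Hold the first coordinate of the 3-vector x constant.
      std::vector<int> constant_coordinates;
      constant_coordinates.push_back(0);
      problem.AddParameterBlock(
          x, 3, new SubsetParameterization(3, constant_coordinates));
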
.. class:: QuaternionParameterization

   Another example that occurs commonly in Structure from Motion
   problems is when camera rotations are parameterized using a
   quaternion. There, it is useful only to make updates orthogonal to
   that 4-vector defining the quaternion. One way to do this is to let
   :math:`\Delta x` be a 3 dimensional vector and define
   :math:`\boxplus` to be

   .. math:: \boxplus(x, \Delta x) = \left[ \cos(|\Delta x|), \frac{\sin\left(|\Delta x|\right)}{|\Delta x|} \Delta x \right] * x
      :label: quaternion

   The multiplication between the two 4-vectors on the right hand side
   is the standard quaternion
   product. :class:`QuaternionParameterization` is an implementation
   of :eq:`quaternion`.

.. class:: EigenQuaternionParameterization

   Eigen uses a different internal memory layout for the elements of the
   quaternion than what is commonly used. Specifically, Eigen stores the
   elements in memory as [x, y, z, w] where the real part is last
   whereas it is typically stored first. Note, when creating an Eigen
   quaternion through the constructor the elements are accepted in w, x,
   y, z order. Since Ceres operates on parameter blocks which are raw
   double pointers this difference is important and requires a different
   parameterization. :class:`EigenQuaternionParameterization` uses the
   same update as :class:`QuaternionParameterization` but takes into
   account Eigen's internal memory element ordering.

.. class:: HomogeneousVectorParameterization

   In computer vision, homogeneous vectors are commonly used to
   represent entities in projective geometry such as points in
   projective space. One example where it is useful to use this
   over-parameterization is in representing points whose triangulation
   is ill-conditioned. Here it is advantageous to use homogeneous
   vectors, instead of a Euclidean vector, because it can represent
   points at infinity.

   When using homogeneous vectors it is useful to only make updates
   orthogonal to that :math:`n`-vector defining the homogeneous
   vector [HartleyZisserman]_. One way to do this is to let :math:`\Delta x`
   be a :math:`n-1` dimensional vector and define :math:`\boxplus` to be

   .. math:: \boxplus(x, \Delta x) = \left[ \frac{\sin\left(0.5 |\Delta x|\right)}{|\Delta x|} \Delta x, \cos(0.5 |\Delta x|) \right] * x

   The multiplication between the two vectors on the right hand side
   is defined as an operator which applies the update orthogonal to
   :math:`x` to remain on the sphere. Note, it is assumed that the
   last element of :math:`x` is the scalar component of the homogeneous
   vector.

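   A usage sketch (``problem`` and the homogeneous 4-vector ``point``
   are assumed to exist):

   .. code-block:: c++

      // A homogeneous point in projective 3-space: 4 ambient
      // coordinates, updated in a 3-dimensional tangent space.
      problem.AddParameterBlock(point, 4,
                                new HomogeneousVectorParameterization(4));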

.. class:: ProductParameterization

   Consider an optimization problem over the space of rigid
   transformations :math:`SE(3)`, which is the Cartesian product of
   :math:`SO(3)` and :math:`\mathbb{R}^3`. Suppose you are using
   quaternions to represent the rotation; Ceres ships with a local
   parameterization for that, and :math:`\mathbb{R}^3` requires no
   parameterization, or rather :class:`IdentityParameterization`. So
   how do we construct a local parameterization for a parameter block
   holding a rigid transformation?

   In cases where a parameter block is the Cartesian product of a
   number of manifolds and you have the local parameterizations of the
   individual manifolds available, :class:`ProductParameterization`
   can be used to construct a local parameterization of the Cartesian
   product. For the case of the rigid transformation, where say you
   have a parameter block of size 7, where the first four entries
   represent the rotation as a quaternion, a local parameterization
   can be constructed as

   .. code-block:: c++

      ProductParameterization se3_param(new QuaternionParameterization(),
                                        new IdentityParameterization(3));


:class:`AutoDiffLocalParameterization`
======================================

.. class:: AutoDiffLocalParameterization

   :class:`AutoDiffLocalParameterization` does for
   :class:`LocalParameterization` what :class:`AutoDiffCostFunction`
   does for :class:`CostFunction`. It allows the user to define a
   templated functor that implements the
   :func:`LocalParameterization::Plus` operation and it uses automatic
   differentiation to implement the computation of the Jacobian.

   To get an auto differentiated local parameterization, you must
   define a class with a templated ``operator()`` (a functor) that
   computes

   .. math:: x' = \boxplus(x, \Delta x),

   For example, Quaternions have a three dimensional local
   parameterization. Its plus operation can be implemented as (taken
   from `internal/ceres/autodiff_local_parameterization_test.cc
   <https://ceres-solver.googlesource.com/ceres-solver/+/master/internal/ceres/autodiff_local_parameterization_test.cc>`_
   )

   .. code-block:: c++

      struct QuaternionPlus {
        template<typename T>
        bool operator()(const T* x, const T* delta, T* x_plus_delta) const {
          const T squared_norm_delta =
              delta[0] * delta[0] + delta[1] * delta[1] + delta[2] * delta[2];

          T q_delta[4];
          if (squared_norm_delta > 0.0) {
            T norm_delta = sqrt(squared_norm_delta);
            const T sin_delta_by_delta = sin(norm_delta) / norm_delta;
            q_delta[0] = cos(norm_delta);
            q_delta[1] = sin_delta_by_delta * delta[0];
            q_delta[2] = sin_delta_by_delta * delta[1];
            q_delta[3] = sin_delta_by_delta * delta[2];
          } else {
            // We do not just use q_delta = [1,0,0,0] here because that is a
            // constant and when used for automatic differentiation will
            // lead to a zero derivative. Instead we take a first order
            // approximation and evaluate it at zero.
            q_delta[0] = T(1.0);
            q_delta[1] = delta[0];
            q_delta[2] = delta[1];
            q_delta[3] = delta[2];
          }

          QuaternionProduct(q_delta, x, x_plus_delta);
          return true;
        }
      };

   Given this struct, the auto differentiated local
   parameterization can now be constructed as

   .. code-block:: c++

      LocalParameterization* local_parameterization =
          new AutoDiffLocalParameterization<QuaternionPlus, 4, 3>;
                                                            |  |
                                 Global Size ---------------+  |
                                 Local Size -------------------+


:class:`Problem`
================

.. class:: Problem

   :class:`Problem` holds the robustified bounds constrained
   non-linear least squares problem :eq:`ceresproblem_modeling`. To
   create a least squares problem, use the
   :func:`Problem::AddResidualBlock` and
   :func:`Problem::AddParameterBlock` methods.

   For example, a problem containing 3 parameter blocks of sizes 3, 4
   and 5 respectively and two residual blocks of size 2 and 6:

   .. code-block:: c++

      double x1[] = { 1.0, 2.0, 3.0 };
      double x2[] = { 1.0, 2.0, 3.0, 5.0 };
      double x3[] = { 1.0, 2.0, 3.0, 6.0, 7.0 };

      Problem problem;
      problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
      problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x3);

   :func:`Problem::AddResidualBlock` as the name implies, adds a
   residual block to the problem. It adds a :class:`CostFunction`, an
   optional :class:`LossFunction` and connects the
   :class:`CostFunction` to a set of parameter blocks.

   The cost function carries with it information about the sizes of
   the parameter blocks it expects. The function checks that these
   match the sizes of the parameter blocks listed in
   ``parameter_blocks``. The program aborts if a mismatch is
   detected. ``loss_function`` can be ``NULL``, in which case the cost
   of the term is just the squared norm of the residuals.

   The user has the option of explicitly adding the parameter blocks
   using :func:`Problem::AddParameterBlock`. This causes additional
   correctness checking; however, :func:`Problem::AddResidualBlock`
   implicitly adds the parameter blocks if they are not present, so
   calling :func:`Problem::AddParameterBlock` explicitly is not
   required.

1502 :func:`Problem::AddParameterBlock` explicitly adds a parameter
1503 block to the :class:`Problem`. Optionally it allows the user to
1504 associate a :class:`LocalParameterization` object with the
1505 parameter block too. Repeated calls with the same arguments are
1506 ignored. Repeated calls with the same double pointer but a
1507 different size results in undefined behavior.
1508
1509 You can set any parameter block to be constant using
1510 :func:`Problem::SetParameterBlockConstant` and undo this using
1511 :func:`SetParameterBlockVariable`.
1512
1513 In fact you can set any number of parameter blocks to be constant,
1514 and Ceres is smart enough to figure out what part of the problem
1515 you have constructed depends on the parameter blocks that are free
1516 to change and only spends time solving it. So for example if you
1517 constructed a problem with a million parameter blocks and 2 million
1518 residual blocks, but then set all but one parameter blocks to be
1519 constant and say only 10 residual blocks depend on this one
1520 non-constant parameter block. Then the computational effort Ceres
1521 spends in solving this problem will be the same if you had defined
1522 a problem with one parameter block and 10 residual blocks.
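
   For example, continuing the sketch above, holding ``x2`` constant
   for one solve and then freeing it again is just:

   .. code-block:: c++

     problem.SetParameterBlockConstant(x2);  // x2 is held fixed.
     // ... solve ...
     problem.SetParameterBlockVariable(x2);  // x2 is free to change again.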

   **Ownership**

   :class:`Problem` by default takes ownership of the
   ``cost_function``, ``loss_function`` and ``local_parameterization``
   pointers. These objects remain live for the life of the
   :class:`Problem`. If the user wishes to keep control over the
   destruction of these objects, then they can do this by setting the
   corresponding enums in the :class:`Problem::Options` struct.

   Note that even though the Problem takes ownership of ``cost_function``
   and ``loss_function``, it does not preclude the user from re-using
   them in another residual block. The destructor takes care to call
   delete on each ``cost_function`` or ``loss_function`` pointer only
   once, regardless of how many residual blocks refer to them.
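
   If, for instance, the cost and loss function objects are owned
   elsewhere, a minimal sketch of opting out of ownership via the
   :class:`Problem::Options` enums mentioned above is:

   .. code-block:: c++

     Problem::Options options;
     options.cost_function_ownership = ceres::DO_NOT_TAKE_OWNERSHIP;
     options.loss_function_ownership = ceres::DO_NOT_TAKE_OWNERSHIP;
     Problem problem(options);
     // The cost and loss function objects must now outlive the problem.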

.. function:: ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, const vector<double*> parameter_blocks)
.. function:: ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, double *x0, double *x1, ...)

   Add a residual block to the overall cost function. The cost
   function carries with it information about the sizes of the
   parameter blocks it expects. The function checks that these match
   the sizes of the parameter blocks listed in ``parameter_blocks``.
   The program aborts if a mismatch is detected. ``loss_function`` can
   be ``NULL``, in which case the cost of the term is just the squared
   norm of the residuals.

   The parameter blocks may be passed together as a
   ``vector<double*>``, or as up to ten separate ``double*`` pointers.

   The user has the option of explicitly adding the parameter blocks
   using AddParameterBlock. This causes additional correctness
   checking; however, AddResidualBlock implicitly adds the parameter
   blocks if they are not present, so calling AddParameterBlock
   explicitly is not required.

   The Problem object by default takes ownership of the
   cost_function and loss_function pointers. These objects remain
   live for the life of the Problem object. If the user wishes to
   keep control over the destruction of these objects, then they can
   do this by setting the corresponding enums in the Options struct.

   Note: Even though the Problem takes ownership of cost_function
   and loss_function, it does not preclude the user from re-using
   them in another residual block. The destructor takes care to call
   delete on each cost_function or loss_function pointer only once,
   regardless of how many residual blocks refer to them.

   Example usage:

   .. code-block:: c++

     double x1[] = {1.0, 2.0, 3.0};
     double x2[] = {1.0, 2.0, 5.0, 6.0};
     double x3[] = {3.0, 6.0, 2.0, 5.0, 1.0};
     vector<double*> v1;
     v1.push_back(x1);
     vector<double*> v2;
     v2.push_back(x2);
     v2.push_back(x1);

     Problem problem;

     problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
     problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x1);
     problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, v1);
     problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, v2);

.. function:: void Problem::AddParameterBlock(double* values, int size, LocalParameterization* local_parameterization)

   Add a parameter block with appropriate size and parameterization to
   the problem. Repeated calls with the same arguments are ignored.
   Repeated calls with the same double pointer but a different size
   result in undefined behavior.

.. function:: void Problem::AddParameterBlock(double* values, int size)

   Add a parameter block with appropriate size to the problem.
   Repeated calls with the same arguments are ignored. Repeated calls
   with the same double pointer but a different size result in
   undefined behavior.

.. function:: void Problem::RemoveResidualBlock(ResidualBlockId residual_block)

   Remove a residual block from the problem. Any parameters that the
   residual block depends on are not removed. The cost and loss
   functions for the residual block will not get deleted immediately;
   this happens only when the problem itself is deleted. If
   ``Problem::Options::enable_fast_removal`` is true, then the removal
   is fast (almost constant time). Otherwise, removing a residual
   block will incur a scan of the entire Problem object to verify that
   the residual_block represents a valid residual in the problem.

   **WARNING:** Removing a residual or parameter block will destroy
   the implicit ordering, rendering the jacobian or residuals returned
   from the solver uninterpretable. If you depend on the evaluated
   jacobian, do not use remove! This may change in a future release.

.. function:: void Problem::RemoveParameterBlock(double* values)

   Remove a parameter block from the problem. The parameterization of
   the parameter block, if it exists, will persist until the deletion
   of the problem (similar to cost/loss functions in residual block
   removal). Any residual blocks that depend on the parameter are also
   removed, as described above in RemoveResidualBlock(). If
   ``Problem::Options::enable_fast_removal`` is true, then
   the removal is fast (almost constant time). Otherwise, removing a
   parameter block will incur a scan of the entire Problem object.

   **WARNING:** Removing a residual or parameter block will destroy
   the implicit ordering, rendering the jacobian or residuals returned
   from the solver uninterpretable. If you depend on the evaluated
   jacobian, do not use remove! This may change in a future release.
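
   As a sketch (``MyCostFunction`` is hypothetical), enabling fast
   removal and then removing blocks might look like:

   .. code-block:: c++

     Problem::Options options;
     options.enable_fast_removal = true;
     Problem problem(options);

     double x[3] = {1.0, 2.0, 3.0};
     ResidualBlockId id =
         problem.AddResidualBlock(new MyCostFunction(...), NULL, x);

     problem.RemoveResidualBlock(id);  // Almost constant time.
     problem.RemoveParameterBlock(x);  // Also removes residuals on x.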

.. function:: void Problem::SetParameterBlockConstant(double* values)

   Hold the indicated parameter block constant during optimization.

.. function:: void Problem::SetParameterBlockVariable(double* values)

   Allow the indicated parameter block to vary during optimization.

.. function:: void Problem::SetParameterization(double* values, LocalParameterization* local_parameterization)

   Set the local parameterization for one of the parameter blocks.
   The local_parameterization is owned by the Problem by default. It
   is acceptable to set the same parameterization for multiple
   parameter blocks; the destructor is careful to delete local
   parameterizations only once. The local parameterization can only be
   set once per parameter block, and cannot be changed once set.

.. function:: LocalParameterization* Problem::GetParameterization(double* values) const

   Get the local parameterization object associated with this
   parameter block. If there is no parameterization object associated
   then ``NULL`` is returned.

.. function:: void Problem::SetParameterLowerBound(double* values, int index, double lower_bound)

   Set the lower bound for the parameter at position `index` in the
   parameter block corresponding to `values`. By default the lower
   bound is ``-std::numeric_limits<double>::max()``, which is treated
   by the solver as the same as :math:`-\infty`.

.. function:: void Problem::SetParameterUpperBound(double* values, int index, double upper_bound)

   Set the upper bound for the parameter at position `index` in the
   parameter block corresponding to `values`. By default the value is
   ``std::numeric_limits<double>::max()``, which is treated by the
   solver as the same as :math:`\infty`.
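
   A small sketch of bounding individual coordinates (assuming a
   ``problem`` as above; the block ``x`` is hypothetical):

   .. code-block:: c++

     double x[2] = {0.5, -0.5};
     problem.AddParameterBlock(x, 2);
     problem.SetParameterLowerBound(x, 0, 0.0);   // 0 <= x[0]
     problem.SetParameterUpperBound(x, 0, 1.0);   // x[0] <= 1
     problem.SetParameterLowerBound(x, 1, -1.0);  // -1 <= x[1], no upper bound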

.. function:: double Problem::GetParameterLowerBound(double* values, int index)

   Get the lower bound for the parameter at position `index`. If the
   parameter is not bounded by the user, then its lower bound is
   ``-std::numeric_limits<double>::max()``.

.. function:: double Problem::GetParameterUpperBound(double* values, int index)

   Get the upper bound for the parameter at position `index`. If the
   parameter is not bounded by the user, then its upper bound is
   ``std::numeric_limits<double>::max()``.

.. function:: int Problem::NumParameterBlocks() const

   Number of parameter blocks in the problem. Always equals
   ``parameter_blocks().size()`` and ``parameter_block_sizes().size()``.

.. function:: int Problem::NumParameters() const

   The size of the parameter vector obtained by summing over the sizes
   of all the parameter blocks.

.. function:: int Problem::NumResidualBlocks() const

   Number of residual blocks in the problem. Always equals
   ``residual_blocks().size()``.

.. function:: int Problem::NumResiduals() const

   The size of the residual vector obtained by summing over the sizes
   of all of the residual blocks.

.. function:: int Problem::ParameterBlockSize(const double* values) const

   The size of the parameter block.

.. function:: int Problem::ParameterBlockLocalSize(const double* values) const

   The size of the local parameterization for the parameter block. If
   there is no local parameterization associated with this parameter
   block, then ``ParameterBlockLocalSize`` = ``ParameterBlockSize``.

.. function:: bool Problem::HasParameterBlock(const double* values) const

   Is the given parameter block present in the problem or not?

.. function:: void Problem::GetParameterBlocks(vector<double*>* parameter_blocks) const

   Fills the passed ``parameter_blocks`` vector with pointers to the
   parameter blocks currently in the problem. After this call,
   ``parameter_blocks.size() == NumParameterBlocks()``.

.. function:: void Problem::GetResidualBlocks(vector<ResidualBlockId>* residual_blocks) const

   Fills the passed ``residual_blocks`` vector with pointers to the
   residual blocks currently in the problem. After this call,
   ``residual_blocks.size() == NumResidualBlocks()``.

.. function:: void Problem::GetParameterBlocksForResidualBlock(const ResidualBlockId residual_block, vector<double*>* parameter_blocks) const

   Get all the parameter blocks that depend on the given residual
   block.

.. function:: void Problem::GetResidualBlocksForParameterBlock(const double* values, vector<ResidualBlockId>* residual_blocks) const

   Get all the residual blocks that depend on the given parameter
   block.

   If ``Problem::Options::enable_fast_removal`` is
   ``true``, then getting the residual blocks is fast and depends only
   on the number of residual blocks. Otherwise, getting the residual
   blocks for a parameter block will incur a scan of the entire
   :class:`Problem` object.

.. function:: const CostFunction* Problem::GetCostFunctionForResidualBlock(const ResidualBlockId residual_block) const

   Get the :class:`CostFunction` for the given residual block.

.. function:: const LossFunction* Problem::GetLossFunctionForResidualBlock(const ResidualBlockId residual_block) const

   Get the :class:`LossFunction` for the given residual block.

.. function:: bool Problem::Evaluate(const Problem::EvaluateOptions& options, double* cost, vector<double>* residuals, vector<double>* gradient, CRSMatrix* jacobian)

   Evaluate a :class:`Problem`. Any of the output pointers can be
   `NULL`. Which residual blocks and parameter blocks are used is
   controlled by the :class:`Problem::EvaluateOptions` struct below.

   .. NOTE::

      The evaluation will use the values stored in the memory
      locations pointed to by the parameter block pointers used at the
      time of the construction of the problem, for example in the
      following code:

      .. code-block:: c++

        Problem problem;
        double x = 1;
        problem.AddResidualBlock(new MyCostFunction, NULL, &x);

        double cost = 0.0;
        problem.Evaluate(Problem::EvaluateOptions(), &cost, NULL, NULL, NULL);

      The cost is evaluated at `x = 1`. If you wish to evaluate the
      problem at `x = 2`, then

      .. code-block:: c++

        x = 2;
        problem.Evaluate(Problem::EvaluateOptions(), &cost, NULL, NULL, NULL);

      is the way to do so.

   .. NOTE::

      If no local parameterizations are used, then the size of
      the gradient vector is the sum of the sizes of all the parameter
      blocks. If a parameter block has a local parameterization, then
      it contributes "LocalSize" entries to the gradient vector.

   .. NOTE::

      This function cannot be called while the problem is being
      solved, for example it cannot be called from an
      :class:`IterationCallback` at the end of an iteration during a
      solve.

.. class:: Problem::EvaluateOptions

   Options struct that is used to control :func:`Problem::Evaluate`.

.. member:: vector<double*> Problem::EvaluateOptions::parameter_blocks

   The set of parameter blocks for which evaluation should be
   performed. This vector determines the order in which parameter
   blocks occur in the gradient vector and in the columns of the
   jacobian matrix. If ``parameter_blocks`` is empty, then it is
   assumed to be equal to a vector containing ALL the parameter
   blocks. Generally speaking the ordering of the parameter blocks in
   this case depends on the order in which they were added to the
   problem and whether or not the user removed any parameter blocks.

   **NOTE** This vector should contain the same pointers as the ones
   used to add parameter blocks to the Problem. These parameter blocks
   should NOT point to new memory locations. Bad things will happen if
   you do.

.. member:: vector<ResidualBlockId> Problem::EvaluateOptions::residual_blocks

   The set of residual blocks for which evaluation should be
   performed. This vector determines the order in which the residuals
   occur, and how the rows of the jacobian are ordered. If
   ``residual_blocks`` is empty, then it is assumed to be equal to the
   vector containing all the residual blocks.

.. member:: bool Problem::EvaluateOptions::apply_loss_function

   Even though the residual blocks in the problem may contain loss
   functions, setting ``apply_loss_function`` to false will turn off
   the application of the loss function to the output of the cost
   function. This is of use for example if the user wishes to analyse
   the solution quality by studying the distribution of residuals
   before and after the solve.
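
   For example, a minimal sketch of extracting the raw
   (un-robustified) residuals after a solve, assuming a ``problem`` as
   above:

   .. code-block:: c++

     Problem::EvaluateOptions options;
     options.apply_loss_function = false;
     double cost = 0.0;
     std::vector<double> residuals;
     problem.Evaluate(options, &cost, &residuals, NULL, NULL);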

.. member:: int Problem::EvaluateOptions::num_threads

   Number of threads to use. (Requires OpenMP).

``rotation.h``
==============

Many applications of Ceres Solver involve optimization problems where
some of the variables correspond to rotations. To ease the pain of
working with the various representations of rotations (angle-axis,
quaternion and matrix) we provide a handy set of templated
functions. These functions are templated so that the user can use them
within Ceres Solver's automatic differentiation framework.

.. function:: template <typename T> void AngleAxisToQuaternion(T const* angle_axis, T* quaternion)

   Convert a value in combined axis-angle representation to a
   quaternion.

   The value ``angle_axis`` is a triple whose norm is an angle in radians,
   and whose direction is aligned with the axis of rotation, and
   ``quaternion`` is a 4-tuple that will contain the resulting quaternion.

.. function:: template <typename T> void QuaternionToAngleAxis(T const* quaternion, T* angle_axis)

   Convert a quaternion to the equivalent combined axis-angle
   representation.

   The value ``quaternion`` must be a unit quaternion - it is not
   normalized first, and ``angle_axis`` will be filled with a value
   whose norm is the angle of rotation in radians, and whose direction
   is the axis of rotation.
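
   For example, a minimal round trip (the values here are
   illustrative):

   .. code-block:: c++

     // A rotation of pi/2 radians about the z axis.
     double angle_axis[3] = {0.0, 0.0, M_PI / 2};
     double quaternion[4];
     AngleAxisToQuaternion(angle_axis, quaternion);
     // quaternion is now approximately [cos(pi/4), 0, 0, sin(pi/4)].

     double recovered[3];
     QuaternionToAngleAxis(quaternion, recovered);
     // recovered is approximately {0.0, 0.0, M_PI / 2}.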

.. function:: template <typename T, int row_stride, int col_stride> void RotationMatrixToAngleAxis(const MatrixAdapter<const T, row_stride, col_stride>& R, T* angle_axis)
.. function:: template <typename T, int row_stride, int col_stride> void AngleAxisToRotationMatrix(T const* angle_axis, const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void RotationMatrixToAngleAxis(T const* R, T* angle_axis)
.. function:: template <typename T> void AngleAxisToRotationMatrix(T const* angle_axis, T* R)

   Conversions between a 3x3 rotation matrix with given column and row
   strides and axis-angle rotation representations. The functions that
   take a pointer to T instead of a MatrixAdapter assume a column
   major representation with unit row stride and a column stride of 3.

.. function:: template <typename T, int row_stride, int col_stride> void EulerAnglesToRotationMatrix(const T* euler, const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void EulerAnglesToRotationMatrix(const T* euler, int row_stride, T* R)

   Conversions between a 3x3 rotation matrix with given column and row
   strides and Euler angle (in degrees) rotation representations.

   The {pitch,roll,yaw} Euler angles are rotations around the {x,y,z}
   axes, respectively. They are applied in that same order, so the
   total rotation R is Rz * Ry * Rx.

   The function that takes a pointer to T as the rotation matrix assumes a row
   major representation with unit column stride and a row stride of 3.
   The additional parameter row_stride is required to be 3.

.. function:: template <typename T, int row_stride, int col_stride> void QuaternionToScaledRotation(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void QuaternionToScaledRotation(const T q[4], T R[3 * 3])

   Convert a 4-vector to a 3x3 scaled rotation matrix.

   The choice of rotation is such that the quaternion
   :math:`\begin{bmatrix} 1 &0 &0 &0\end{bmatrix}` goes to an identity
   matrix and for small :math:`a, b, c` the quaternion
   :math:`\begin{bmatrix}1 &a &b &c\end{bmatrix}` goes to the matrix

   .. math::

      I + 2 \begin{bmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0
      \end{bmatrix} + O(q^2)

   which corresponds to a Rodrigues approximation, the last matrix
   being the cross-product matrix of :math:`\begin{bmatrix} a& b&
   c\end{bmatrix}`. Together with the property that :math:`R(q_1 * q_2)
   = R(q_1) * R(q_2)` this uniquely defines the mapping from :math:`q` to
   :math:`R`.

   In the function that accepts a pointer to T instead of a MatrixAdapter,
   the rotation matrix ``R`` is a row-major matrix with unit column stride
   and a row stride of 3.

   No normalization of the quaternion is performed, i.e.
   :math:`R = \|q\|^2 Q`, where :math:`Q` is an orthonormal matrix
   such that :math:`\det(Q) = 1` and :math:`Q Q' = I`.

.. function:: template <typename T, int row_stride, int col_stride> void QuaternionToRotation(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: template <typename T> void QuaternionToRotation(const T q[4], T R[3 * 3])

   Same as above except that the rotation matrix is normalized by the
   Frobenius norm, so that :math:`R R' = I` (and :math:`\det(R) = 1`).

.. function:: template <typename T> void UnitQuaternionRotatePoint(const T q[4], const T pt[3], T result[3])

   Rotates a point ``pt`` by a quaternion ``q``:

   .. math:: \text{result} = R(q)\ \text{pt}

   Assumes the quaternion is unit norm. If you pass in a quaternion
   with :math:`|q|^2 = 2` then you WILL NOT get back 2 times the
   result you get for a unit quaternion.

.. function:: template <typename T> void QuaternionRotatePoint(const T q[4], const T pt[3], T result[3])

   With this function you do not need to assume that :math:`q` has unit norm.
   It does assume that the norm is non-zero.

.. function:: template <typename T> void QuaternionProduct(const T z[4], const T w[4], T zw[4])

   .. math:: zw = z * w

   where :math:`*` is the Quaternion product between 4-vectors.

.. function:: template <typename T> void CrossProduct(const T x[3], const T y[3], T x_cross_y[3])

   .. math:: \text{x\_cross\_y} = x \times y

.. function:: template <typename T> void AngleAxisRotatePoint(const T angle_axis[3], const T pt[3], T result[3])

   .. math:: \text{result} = R(\text{angle\_axis})\ \text{pt}

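   Because these functions are templated on the scalar type, they can
   be used directly inside cost functors that are automatically
   differentiated. As a sketch (the functor, its data members and the
   residual definition are hypothetical):

   .. code-block:: c++

     struct RotationError {
       RotationError(const double* point, const double* measured)
           : point(point), measured(measured) {}

       template <typename T>
       bool operator()(const T* angle_axis, T* residuals) const {
         const T p[3] = {T(point[0]), T(point[1]), T(point[2])};
         T rotated[3];
         AngleAxisRotatePoint(angle_axis, p, rotated);
         residuals[0] = rotated[0] - T(measured[0]);
         residuals[1] = rotated[1] - T(measured[1]);
         residuals[2] = rotated[2] - T(measured[2]);
         return true;
       }

       const double* point;
       const double* measured;
     };
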

Cubic Interpolation
===================

Optimization problems often involve functions that are given in the
form of a table of values, for example an image. Evaluating these
functions and their derivatives requires interpolating these
values. Interpolating tabulated functions is a vast area of research
and there are a lot of libraries which implement a variety of
interpolation schemes. However, using them within the automatic
differentiation framework in Ceres is quite painful. To this end,
Ceres provides the ability to interpolate one dimensional and two
dimensional tabular functions.

The one dimensional interpolation is based on the Cubic Hermite
Spline, also known as the Catmull-Rom Spline. This produces a first
order differentiable interpolating function. The two dimensional
interpolation scheme is a generalization of the one dimensional scheme
where the interpolating function is assumed to be separable in the two
dimensions.

More details of the construction can be found in `Linear Methods for
Image Interpolation <http://www.ipol.im/pub/art/2011/g_lmii/>`_ by
Pascal Getreuer.

.. class:: CubicInterpolator

Given as input an infinite one dimensional grid, which provides the
following interface:

.. code::

  struct Grid1D {
    enum { DATA_DIMENSION = 2 };
    void GetValue(int n, double* f) const;
  };

where ``GetValue`` gives us the value of a function :math:`f`
(possibly vector valued) for any integer :math:`n`, and the enum
``DATA_DIMENSION`` indicates the dimensionality of the function being
interpolated. For example if you are interpolating rotations in
axis-angle format over time, then ``DATA_DIMENSION = 3``.

:class:`CubicInterpolator` uses Cubic Hermite splines to produce a
smooth approximation to it that can be used to evaluate
:math:`f(x)` and :math:`f'(x)` at any point on the real number
line. For example, the following code interpolates an array of four
numbers.

.. code::

  const double data[] = {1.0, 2.0, 5.0, 6.0};
  Grid1D<double, 1> array(data, 0, 4);
  CubicInterpolator<Grid1D<double, 1> > interpolator(array);
  double f, dfdx;
  interpolator.Evaluate(1.5, &f, &dfdx);

In the above code we use ``Grid1D``, a templated helper class that
allows easy interfacing between ``C++`` arrays and
:class:`CubicInterpolator`.

``Grid1D`` supports vector valued functions where the various
coordinates of the function can be interleaved or stacked. It also
allows the use of any numeric type as input, as long as it can be
safely cast to a double.
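
Because :class:`CubicInterpolator` also provides an ``Evaluate``
overload templated on the scalar type for use with automatic
differentiation, it can be used inside a cost functor. A minimal
sketch (the functor and the target value are hypothetical):

.. code::

  struct InterpolatedCost {
    explicit InterpolatedCost(
        const CubicInterpolator<Grid1D<double, 1> >& interpolator)
        : interpolator(interpolator) {}

    template <typename T>
    bool operator()(const T* x, T* residual) const {
      T f;
      interpolator.Evaluate(x[0], &f);  // f carries derivatives w.r.t. x.
      residual[0] = f - T(10.0);        // Hypothetical target value.
      return true;
    }

    const CubicInterpolator<Grid1D<double, 1> >& interpolator;
  };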

.. class:: BiCubicInterpolator

Given as input an infinite two dimensional grid, which provides the
following interface:

.. code::

  struct Grid2D {
    enum { DATA_DIMENSION = 2 };
    void GetValue(int row, int col, double* f) const;
  };

where ``GetValue`` gives us the value of a function :math:`f`
(possibly vector valued) for any pair of integers :code:`row` and
:code:`col`, and the enum ``DATA_DIMENSION`` indicates the
dimensionality of the function being interpolated. For example if you
are interpolating a color image with three channels (Red, Green &
Blue), then ``DATA_DIMENSION = 3``.

:class:`BiCubicInterpolator` uses the cubic convolution interpolation
algorithm of R. Keys [Keys]_ to produce a smooth approximation to it
that can be used to evaluate :math:`f(r,c)`, :math:`\frac{\partial
f(r,c)}{\partial r}` and :math:`\frac{\partial f(r,c)}{\partial c}` at
any point in the real plane.

For example the following code interpolates a two dimensional array.

.. code::

  const double data[] = {1.0, 3.0, -1.0, 4.0,
                         3.6, 2.1,  4.2, 2.0,
                         2.0, 1.0,  3.1, 5.2};
  Grid2D<double, 1> array(data, 0, 3, 0, 4);
  BiCubicInterpolator<Grid2D<double, 1> > interpolator(array);
  double f, dfdr, dfdc;
  interpolator.Evaluate(1.2, 2.5, &f, &dfdr, &dfdc);

In the above code, the templated helper class ``Grid2D`` is used to
make a ``C++`` array look like a two dimensional table to
:class:`BiCubicInterpolator`.

``Grid2D`` supports row or column major layouts. It also supports
vector valued functions where the individual coordinates of the
function may be interleaved or stacked. It also allows the use of any
numeric type as input, as long as it can be safely cast to double.
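
Like its one dimensional counterpart, :class:`BiCubicInterpolator`
exposes an ``Evaluate`` overload templated on the scalar type for use
with automatic differentiation, so it too can be used inside a cost
functor. A minimal sketch (the functor and the target intensity are
hypothetical):

.. code::

  struct SampleCost {
    SampleCost(const BiCubicInterpolator<Grid2D<double, 1> >& image,
               double target)
        : image(image), target(target) {}

    template <typename T>
    bool operator()(const T* rc, T* residual) const {
      T f;
      // f carries derivatives with respect to rc[0] (row) and rc[1] (col).
      image.Evaluate(rc[0], rc[1], &f);
      residual[0] = f - T(target);
      return true;
    }

    const BiCubicInterpolator<Grid2D<double, 1> >& image;
    const double target;
  };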