.. highlight:: c++

.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-nnls_covariance:

=====================
Covariance Estimation
=====================

Introduction
============

One way to assess the quality of the solution returned by a non-linear
least squares solver is to analyze the covariance of the solution.

Let us consider the non-linear regression problem

.. math:: y = f(x) + N(0, I)

i.e., the observation :math:`y` is a random non-linear function of the
independent variable :math:`x` with mean :math:`f(x)` and identity
covariance. Then the maximum likelihood estimate of :math:`x` given
observations :math:`y` is the solution to the non-linear least squares
problem:

.. math:: x^* = \arg \min_x \|f(x) - y\|^2

And the covariance of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{-1}

Here :math:`J(x^*)` is the Jacobian of :math:`f` at :math:`x^*`. The
above formula assumes that :math:`J(x^*)` has full column rank.

If :math:`J(x^*)` is rank deficient, then the covariance matrix :math:`C(x^*)`
is also rank deficient and is given by the Moore-Penrose pseudo inverse.

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{\dagger}

Note that in the above, we assumed that the covariance matrix for
:math:`y` was identity. This is an important assumption. If this is
not the case and we have

.. math:: y = f(x) + N(0, S)

where :math:`S` is a positive semi-definite matrix denoting the
covariance of :math:`y`, then the maximum likelihood problem to be
solved is

.. math:: x^* = \arg \min_x \left(f(x) - y\right)' S^{-1} \left(f(x) - y\right)

and the corresponding covariance estimate of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*) S^{-1} J(x^*)\right)^{-1}

So, if the observations being fitted to have a covariance matrix not
equal to identity, then it is the user's responsibility to ensure that
the corresponding cost functions are correctly scaled, e.g. in the
above case the cost function for this problem should evaluate
:math:`S^{-1/2} f(x)` instead of just :math:`f(x)`, where
:math:`S^{-1/2}` is the inverse square root of the covariance matrix
:math:`S`.
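
This scaling identity can be checked with a few lines of arithmetic.
The following standalone snippet (an illustration of the identity
above, not Ceres code) fits the model :math:`y = a x` with a diagonal
covariance :math:`S` and verifies that whitening the residuals by
:math:`S^{-1/2}` yields the same :math:`\left(J' S^{-1}
J\right)^{-1}` as weighting explicitly:

.. code-block:: c++

  #include <cassert>
  #include <cmath>
  #include <cstdio>

  int main() {
    // Model y = a * x with three observations; observation i has
    // variance s[i], i.e. S = diag(s). The Jacobian of the residual
    // r_i = a * x_i - y_i with respect to a is just x_i.
    const double x[3] = {1.0, 2.0, 3.0};
    const double s[3] = {0.25, 1.0, 4.0};

    // Explicit weighting: J' S^{-1} J.
    double jtj_weighted = 0.0;
    for (int i = 0; i < 3; ++i) jtj_weighted += x[i] * x[i] / s[i];

    // Whitening: scale row i of J by 1 / sqrt(s[i]). This is what
    // evaluating S^{-1/2} f(x) inside the cost function amounts to.
    double jtj_whitened = 0.0;
    for (int i = 0; i < 3; ++i) {
      const double jw = x[i] / std::sqrt(s[i]);
      jtj_whitened += jw * jw;
    }

    // Both routes give the same covariance C = (J' S^{-1} J)^{-1}.
    assert(std::abs(1.0 / jtj_weighted - 1.0 / jtj_whitened) < 1e-15);
    std::printf("covariance of a: %.6f\n", 1.0 / jtj_weighted);
    return 0;
  }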

Gauge Invariance
================

In structure from motion (3D reconstruction) problems, the
reconstruction is ambiguous up to a similarity transform. This is
known as a *Gauge Ambiguity*. Handling Gauges correctly requires the
use of SVD or custom inversion algorithms. For small problems the
user can use the dense algorithm. For more details see the work of
Kanatani & Morris [KanataniMorris]_.


:class:`Covariance`
===================

:class:`Covariance` allows the user to evaluate the covariance for a
non-linear least squares problem and provides random access to its
blocks. The computation assumes that the cost functions compute
residuals such that their covariance is identity.

Since the computation of the covariance matrix requires computing the
inverse of a potentially large matrix, this can involve a rather large
amount of time and memory. However, it is usually the case that the
user is only interested in a small part of the covariance
matrix, quite often just its block diagonal. :class:`Covariance`
allows the user to specify the parts of the covariance matrix that she
is interested in and then uses this information to only compute and
store those parts of the covariance matrix.

Rank of the Jacobian
====================

As we noted above, if the Jacobian is rank deficient, then the inverse
of :math:`J'J` is not defined and instead a pseudo inverse needs to be
computed.

The rank deficiency in :math:`J` can be *structural* -- columns which
are always known to be zero -- or *numerical*, depending on the exact
values in the Jacobian.

Structural rank deficiency occurs when the problem contains parameter
blocks that are constant. This class correctly handles structural rank
deficiency like that.

Numerical rank deficiency, where the rank of the matrix cannot be
predicted by its sparsity structure and requires looking at its
numerical values, is more complicated. Here again there are two
cases.

 a. The rank deficiency arises from overparameterization, e.g., a
    four dimensional quaternion used to parameterize :math:`SO(3)`,
    which is a three dimensional manifold. In cases like this, the
    user should use an appropriate
    :class:`Manifold`. Not only will this lead to better
    numerical behaviour of the Solver, it will also expose the rank
    deficiency to the :class:`Covariance` object so that it can
    handle it correctly.

 b. More general numerical rank deficiency in the Jacobian requires
    the computation of the so called Singular Value Decomposition
    (SVD) of :math:`J'J`. We do not know how to do this for large
    sparse matrices efficiently. For small and moderate sized
    problems this is done using dense linear algebra.


:class:`Covariance::Options`
============================

.. class:: Covariance::Options

.. member:: int Covariance::Options::num_threads

   Default: ``1``

   Number of threads to be used for evaluating the Jacobian and
   estimation of covariance.

.. member:: SparseLinearAlgebraLibraryType Covariance::Options::sparse_linear_algebra_library_type

   Default: ``SUITE_SPARSE`` if Ceres Solver is built with support for
   `SuiteSparse <http://faculty.cse.tamu.edu/davis/suitesparse.html>`_
   and ``EIGEN_SPARSE`` otherwise. Note that ``EIGEN_SPARSE`` is
   always available.

.. member:: CovarianceAlgorithmType Covariance::Options::algorithm_type

   Default: ``SPARSE_QR``

   Ceres supports two different algorithms for covariance estimation,
   which represent different tradeoffs in speed, accuracy and
   reliability.

   1. ``SPARSE_QR`` uses the sparse QR factorization algorithm to
      compute the decomposition

      .. math::

         QR &= J\\
         \left(J^\top J\right)^{-1} &= \left(R^\top R\right)^{-1}

      The speed of this algorithm depends on the sparse linear algebra
      library being used. ``Eigen``'s sparse QR factorization is a
      moderately fast algorithm suitable for small to medium sized
      matrices. For best performance we recommend using
      ``SuiteSparseQR`` which is enabled by setting
      :member:`Covariance::Options::sparse_linear_algebra_library_type`
      to ``SUITE_SPARSE``.

      ``SPARSE_QR`` cannot compute the covariance if the
      Jacobian is rank deficient.

   2. ``DENSE_SVD`` uses ``Eigen``'s ``JacobiSVD`` to perform the
      computations. It computes the singular value decomposition

      .. math:: U D V^\top = J

      and then uses it to compute the pseudo inverse of :math:`J'J` as

      .. math:: (J'J)^{\dagger} = V \left(D^2\right)^{\dagger} V^\top

      It is an accurate but slow method and should only be used for
      small to moderate sized problems. It can handle full-rank as
      well as rank deficient Jacobians.


.. member:: double Covariance::Options::column_pivot_threshold

   Default: :math:`-1`

   During QR factorization, if a column with Euclidean norm less than
   ``column_pivot_threshold`` is encountered it is treated as zero.

   If ``column_pivot_threshold < 0``, then an automatic default value
   of `20*(m+n)*eps*sqrt(max(diag(J'*J)))` is used. Here `m` and `n`
   are the number of rows and columns of the Jacobian (`J`)
   respectively.

   This is an advanced option meant for users who know enough about
   their Jacobian matrices that they can determine a value better
   than the default.


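To get a feel for the magnitude of this default, the formula can be
evaluated directly. The following standalone snippet is a sketch of
the formula only, not of Ceres' internal implementation, and the 3x2
Jacobian is purely illustrative:

.. code-block:: c++

  #include <algorithm>
  #include <cfloat>  // DBL_EPSILON
  #include <cmath>
  #include <cstdio>

  int main() {
    // Illustrative 3x2 Jacobian (m = 3 rows, n = 2 columns).
    const double J[3][2] = {{1.0, 2.0}, {3.0, 4.0}, {5.0, 6.0}};
    const int m = 3, n = 2;

    // diag(J'J)[j] = sum_i J[i][j]^2, i.e. the squared column norms.
    double max_diag = 0.0;
    for (int j = 0; j < n; ++j) {
      double d = 0.0;
      for (int i = 0; i < m; ++i) d += J[i][j] * J[i][j];
      max_diag = std::max(max_diag, d);
    }

    // Automatic default: 20 * (m + n) * eps * sqrt(max(diag(J'J))).
    const double threshold = 20.0 * (m + n) * DBL_EPSILON * std::sqrt(max_diag);
    std::printf("pivot threshold: %.3e\n", threshold);
    return 0;
  }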
.. member:: double Covariance::Options::min_reciprocal_condition_number

   Default: :math:`10^{-14}`

   If the Jacobian matrix is near singular, then inverting :math:`J'J`
   will result in unreliable results, e.g., if

   .. math::

      J = \begin{bmatrix}
          1.0 & 1.0 \\
          1.0 & 1.0000001
          \end{bmatrix}

   which is essentially a rank deficient matrix, we have

   .. math::

      (J'J)^{-1} = \begin{bmatrix}
                   2.0471e+14 & -2.0471e+14 \\
                   -2.0471e+14 & 2.0471e+14
                   \end{bmatrix}

   This is not a useful result. Therefore, by default
   :func:`Covariance::Compute` will return ``false`` if a rank
   deficient Jacobian is encountered. How rank deficiency is detected
   depends on the algorithm being used.

   1. ``DENSE_SVD``

      .. math:: \frac{\sigma_{\text{min}}}{\sigma_{\text{max}}} < \sqrt{\text{min_reciprocal_condition_number}}

      where :math:`\sigma_{\text{min}}` and
      :math:`\sigma_{\text{max}}` are the minimum and maximum
      singular values of :math:`J` respectively.

   2. ``SPARSE_QR``

      .. math:: \operatorname{rank}(J) < \operatorname{num\_col}(J)

      Here :math:`\operatorname{rank}(J)` is the estimate of the rank
      of :math:`J` returned by the sparse QR factorization
      algorithm. It is a fairly reliable indication of rank
      deficiency.

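The ``DENSE_SVD`` test can be illustrated numerically on the Jacobian
above. The snippet below is a standalone sketch, not Ceres' actual
implementation: for a 2x2 matrix :math:`\sigma_{\text{max}}
\sigma_{\text{min}} = |\det(J)|`, and since
:math:`\sigma_{\text{min}}` is tiny here, :math:`\lambda_{\text{max}}`
of :math:`J'J` is well approximated by the squared Frobenius norm,
which gives a cheap estimate of the reciprocal condition number
without a full SVD:

.. code-block:: c++

  #include <cmath>
  #include <cstdio>

  int main() {
    // The near rank deficient Jacobian from the example above.
    const double J[2][2] = {{1.0, 1.0}, {1.0, 1.0000001}};

    // lambda_min * lambda_max = det(J)^2 and lambda_max ~ ||J||_F^2,
    // so lambda_min / lambda_max ~ det(J)^2 / ||J||_F^4.
    const double det = J[0][0] * J[1][1] - J[0][1] * J[1][0];
    const double frob2 = J[0][0] * J[0][0] + J[0][1] * J[0][1] +
                         J[1][0] * J[1][0] + J[1][1] * J[1][1];
    const double rcond = (det * det) / (frob2 * frob2);

    // rcond is ~6e-16, well below the default
    // min_reciprocal_condition_number of 1e-14, so Covariance::Compute()
    // would report failure for this Jacobian.
    std::printf("%s\n", rcond < 1e-14 ? "rank deficient" : "ok");
    return 0;
  }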
.. member:: int Covariance::Options::null_space_rank

   When using ``DENSE_SVD``, the user has more control in dealing
   with singular and near singular covariance matrices.

   As mentioned above, when the covariance matrix is near singular,
   instead of computing the inverse of :math:`J'J`, the Moore-Penrose
   pseudoinverse of :math:`J'J` should be computed.

   If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
   e_i)`, where :math:`\lambda_i` is the :math:`i^\textrm{th}`
   eigenvalue and :math:`e_i` is the corresponding eigenvector, then
   the inverse of :math:`J'J` is

   .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'

   and computing the pseudo inverse involves dropping terms from this
   sum that correspond to small eigenvalues.

   How terms are dropped is controlled by
   `min_reciprocal_condition_number` and `null_space_rank`.

   If `null_space_rank` is non-negative, then the smallest
   `null_space_rank` eigenvalue/eigenvector pairs are dropped
   irrespective of the magnitude of :math:`\lambda_i`. If the ratio
   of the smallest non-zero eigenvalue to the largest eigenvalue in
   the truncated matrix is still below
   `min_reciprocal_condition_number`, then `Covariance::Compute()`
   will fail and return `false`.

   Setting `null_space_rank = -1` drops all terms for which

   .. math:: \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}

   This option has no effect on ``SPARSE_QR``.

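The truncation rule can be sketched in a few lines. The following
standalone example (an illustration of the rule above, not Ceres
code) builds the pseudo inverse of a rank-1 :math:`J'J` by dropping
the eigenvalue that fails the relative-magnitude test:

.. code-block:: c++

  #include <cmath>
  #include <cstdio>

  int main() {
    // A rank-1 normal matrix J'J, e.g. from J = [1 1; 1 1].
    const double A[2][2] = {{2.0, 2.0}, {2.0, 2.0}};
    const double kMinRcond = 1e-14;  // min_reciprocal_condition_number

    // Closed-form eigendecomposition of a symmetric 2x2 matrix.
    const double tr = A[0][0] + A[1][1];
    const double det = A[0][0] * A[1][1] - A[0][1] * A[1][0];
    const double disc = std::sqrt(tr * tr / 4.0 - det);
    const double lambda[2] = {tr / 2.0 + disc, tr / 2.0 - disc};

    double P[2][2] = {{0, 0}, {0, 0}};  // accumulates the pseudo inverse
    for (int k = 0; k < 2; ++k) {
      // Drop terms whose eigenvalue is too small relative to
      // lambda_max -- exactly the truncation described above.
      if (lambda[k] <= lambda[0] * kMinRcond) continue;
      // Eigenvector for lambda[k] is (A01, lambda[k] - A00), normalized.
      double e[2] = {A[0][1], lambda[k] - A[0][0]};
      const double norm = std::sqrt(e[0] * e[0] + e[1] * e[1]);
      e[0] /= norm;
      e[1] /= norm;
      for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
          P[i][j] += e[i] * e[j] / lambda[k];
    }
    std::printf("%.3f %.3f %.3f %.3f\n", P[0][0], P[0][1], P[1][0], P[1][1]);
    return 0;
  }

Only the :math:`\lambda = 4` term survives, so the result is
:math:`\frac{1}{4} e e'` with :math:`e = (1, 1)/\sqrt{2}`.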
.. member:: bool Covariance::Options::apply_loss_function

   Default: ``true``

   Even though the residual blocks in the problem may contain loss
   functions, setting ``apply_loss_function`` to ``false`` will turn
   off the application of the loss function to the output of the cost
   function and in turn its effect on the covariance.

.. class:: Covariance

   :class:`Covariance::Options` as the name implies is used to control
   the covariance estimation algorithm. Covariance estimation is a
   complicated and numerically sensitive procedure. Please read the
   entire documentation for :class:`Covariance::Options` before using
   :class:`Covariance`.

.. function:: bool Covariance::Compute(const std::vector<std::pair<const double*, const double*> >& covariance_blocks, Problem* problem)

   Compute a part of the covariance matrix.

   The vector ``covariance_blocks`` indexes into the covariance
   matrix block-wise using pairs of parameter blocks. This allows the
   covariance estimation algorithm to only compute and store these
   blocks.

   Since the covariance matrix is symmetric, if the user passes
   ``<block1, block2>``, then ``GetCovarianceBlock`` can be called with
   ``block1``, ``block2`` as well as ``block2``, ``block1``.

   ``covariance_blocks`` cannot contain duplicates. Bad things will
   happen if they do.

   Note that the list of ``covariance_blocks`` is only used to
   determine what parts of the covariance matrix are computed. The
   full Jacobian is always used to do the computation, i.e. the list
   has no impact on what part of the Jacobian is used.

   The return value indicates the success or failure of the covariance
   computation. Please see the documentation for
   :class:`Covariance::Options` for more on the conditions under which
   this function returns ``false``.

.. function:: bool GetCovarianceBlock(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the cross-covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlock`` and the pair ``<parameter_block1,
   parameter_block2>`` OR the pair ``<parameter_block2,
   parameter_block1>`` must have been present in the vector
   ``covariance_blocks`` when ``Compute`` was called. Otherwise
   ``GetCovarianceBlock`` will return ``false``.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_size x parameter_block2_size`` matrix. The
   returned covariance will be a row-major matrix.

.. function:: bool GetCovarianceBlockInTangentSpace(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the cross-covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``. Returns
   cross-covariance in the tangent space if a manifold is associated
   with either parameter block; else returns cross-covariance in the
   ambient space.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlock`` and the pair ``<parameter_block1,
   parameter_block2>`` OR the pair ``<parameter_block2,
   parameter_block1>`` must have been present in the vector
   ``covariance_blocks`` when ``Compute`` was called. Otherwise
   ``GetCovarianceBlock`` will return ``false``.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_tangent_size x parameter_block2_tangent_size``
   matrix. The returned covariance will be a row-major matrix.

Example Usage
=============

.. code-block:: c++

  double x[3];
  double y[2];

  Problem problem;
  problem.AddParameterBlock(x, 3);
  problem.AddParameterBlock(y, 2);
  <Build Problem>
  <Solve Problem>

  Covariance::Options options;
  Covariance covariance(options);

  std::vector<std::pair<const double*, const double*> > covariance_blocks;
  covariance_blocks.push_back(std::make_pair(x, x));
  covariance_blocks.push_back(std::make_pair(y, y));
  covariance_blocks.push_back(std::make_pair(x, y));

  CHECK(covariance.Compute(covariance_blocks, &problem));

  double covariance_xx[3 * 3];
  double covariance_yy[2 * 2];
  double covariance_xy[3 * 2];
  covariance.GetCovarianceBlock(x, x, covariance_xx);
  covariance.GetCovarianceBlock(y, y, covariance_yy);
  covariance.GetCovarianceBlock(x, y, covariance_xy);