.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-nnls_covariance:

=====================
Covariance Estimation
=====================

Introduction
============

One way to assess the quality of the solution returned by a non-linear
least squares solver is to analyze the covariance of the solution.

Let us consider the non-linear regression problem

.. math:: y = f(x) + N(0, I)

i.e., the observation :math:`y` is a random non-linear function of the
independent variable :math:`x` with mean :math:`f(x)` and identity
covariance. Then the maximum likelihood estimate of :math:`x` given
observations :math:`y` is the solution to the non-linear least squares
problem:

.. math:: x^* = \arg \min_x \|f(x) - y\|^2

And the covariance of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{-1}

Here :math:`J(x^*)` is the Jacobian of :math:`f` at :math:`x^*`. The
above formula assumes that :math:`J(x^*)` has full column rank.

If :math:`J(x^*)` is rank deficient, then the covariance matrix :math:`C(x^*)`
is also rank deficient and is given by the Moore-Penrose pseudo inverse.

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{\dagger}

Note that in the above, we assumed that the covariance matrix for
:math:`y` was identity. This is an important assumption. If this is
not the case and we have

.. math:: y = f(x) + N(0, S)

where :math:`S` is a positive semi-definite matrix denoting the
covariance of :math:`y`, then the maximum likelihood problem to be
solved is

.. math:: x^* = \arg \min_x (f(x) - y)' S^{-1} (f(x) - y)

and the corresponding covariance estimate of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*) S^{-1} J(x^*)\right)^{-1}

So, if the observations being fitted have a covariance matrix that is
not the identity, then it is the user's responsibility to ensure that
the corresponding cost functions are correctly scaled, e.g. in the
above case the cost function for this problem should evaluate
:math:`S^{-1/2} f(x)` instead of just :math:`f(x)`, where
:math:`S^{-1/2}` is the inverse square root of the covariance matrix
:math:`S`.
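
As a concrete illustration of this scaling, the sketch below applies
:math:`S^{-1/2}` to a residual for the common special case of a
diagonal :math:`S` (plain C++; ``WhitenResiduals`` is a hypothetical
helper, not part of Ceres). With a diagonal covariance,
:math:`S^{-1/2}` simply divides each residual by the standard
deviation of the corresponding observation:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Scale the residual f(x) - y by S^{-1/2}. For a diagonal S whose
// entries s_i are the variances of the observations y_i, S^{-1/2} is
// diagonal with entries 1 / sqrt(s_i).
std::vector<double> WhitenResiduals(const std::vector<double>& residuals,
                                    const std::vector<double>& variances) {
  assert(residuals.size() == variances.size());
  std::vector<double> whitened(residuals.size());
  for (size_t i = 0; i < residuals.size(); ++i) {
    whitened[i] = residuals[i] / std::sqrt(variances[i]);
  }
  return whitened;
}
```

Residuals scaled this way have identity covariance, which is exactly
the assumption made by the covariance estimation machinery described
below.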

Gauge Invariance
================

In structure from motion (3D reconstruction) problems, the
reconstruction is ambiguous up to a similarity transform. This is
known as a *Gauge Ambiguity*. Handling Gauges correctly requires the
use of SVD or custom inversion algorithms. For small problems the
user can use the dense algorithm. For more details see the work of
Kanatani & Morris [KanataniMorris]_.


:class:`Covariance`
===================

:class:`Covariance` allows the user to evaluate the covariance for a
non-linear least squares problem and provides random access to its
blocks. The computation assumes that the cost functions compute
residuals such that their covariance is identity.

Since the computation of the covariance matrix requires computing the
inverse of a potentially large matrix, this can involve a rather large
amount of time and memory. However, it is usually the case that the
user is only interested in a small part of the covariance matrix,
quite often just its block diagonal. :class:`Covariance` allows the
user to specify the parts of the covariance matrix that she is
interested in and then uses this information to only compute and
store those parts of the covariance matrix.

Rank of the Jacobian
====================

As we noted above, if the Jacobian is rank deficient, then the inverse
of :math:`J'J` is not defined and instead a pseudo inverse needs to be
computed.

The rank deficiency in :math:`J` can be *structural* -- columns which
are always known to be zero -- or *numerical*, depending on the exact
values in the Jacobian.

Structural rank deficiency occurs when the problem contains parameter
blocks that are constant. This class correctly handles structural rank
deficiency like that.

Numerical rank deficiency, where the rank of the matrix cannot be
predicted by its sparsity structure and requires looking at its
numerical values, is more complicated. Here again there are two
cases.

   a. The rank deficiency arises from overparameterization, e.g., a
      four dimensional quaternion used to parameterize :math:`SO(3)`,
      which is a three dimensional manifold. In cases like this, the
      user should use an appropriate
      :class:`LocalParameterization`. Not only will this lead to better
      numerical behaviour of the Solver, it will also expose the rank
      deficiency to the :class:`Covariance` object so that it can
      handle it correctly.

   b. More general numerical rank deficiency in the Jacobian requires
      the computation of the so-called Singular Value Decomposition
      (SVD) of :math:`J'J`. We do not know how to do this for large
      sparse matrices efficiently. For small and moderate sized
      problems this is done using dense linear algebra.
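
Case (a) can be made concrete. Assuming the cost function normalizes
the quaternion before using it (as is common), the residual is
unchanged by rescaling :math:`q`, i.e. :math:`f(sq) = f(q)` for all
:math:`s \neq 0`. Differentiating with respect to :math:`s` at
:math:`s = 1` gives

.. math:: J(q)\, q = \left. \frac{d}{ds} f(sq) \right|_{s=1} = 0

so :math:`q` itself spans a null direction of the Jacobian, and
:math:`J'J` is singular by construction.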


:class:`Covariance::Options`
============================

.. class:: Covariance::Options

.. member:: int Covariance::Options::num_threads

   Default: ``1``

   Number of threads to be used for evaluating the Jacobian and
   estimation of covariance.

.. member:: SparseLinearAlgebraLibraryType Covariance::Options::sparse_linear_algebra_library_type

   Default: ``SUITE_SPARSE`` if Ceres Solver is built with support for
   `SuiteSparse <http://faculty.cse.tamu.edu/davis/suitesparse.html>`_
   and ``EIGEN_SPARSE`` otherwise. Note that ``EIGEN_SPARSE`` is
   always available.

.. member:: CovarianceAlgorithmType Covariance::Options::algorithm_type

   Default: ``SPARSE_QR``

   Ceres supports two different algorithms for covariance estimation,
   which represent different tradeoffs in speed, accuracy and
   reliability.

   1. ``SPARSE_QR`` uses the sparse QR factorization algorithm to
      compute the decomposition

      .. math::

         QR &= J\\
         \left(J^\top J\right)^{-1} &= \left(R^\top R\right)^{-1}

      The speed of this algorithm depends on the sparse linear algebra
      library being used. ``Eigen``'s sparse QR factorization is a
      moderately fast algorithm suitable for small to medium sized
      matrices. For best performance we recommend using
      ``SuiteSparseQR`` which is enabled by setting
      :member:`Covariance::Options::sparse_linear_algebra_library_type`
      to ``SUITE_SPARSE``.

      ``SPARSE_QR`` cannot compute the covariance if the
      Jacobian is rank deficient.


   2. ``DENSE_SVD`` uses ``Eigen``'s ``JacobiSVD`` to perform the
      computations. It computes the singular value decomposition

      .. math:: U D V^\top = J

      and then uses it to compute the pseudo inverse of :math:`J'J` as

      .. math:: (J'J)^{\dagger} = V D^{2\dagger} V^\top

      It is an accurate but slow method and should only be used for
      small to moderate sized problems. It can handle full-rank as
      well as rank deficient Jacobians.


.. member:: double Covariance::Options::min_reciprocal_condition_number

   Default: :math:`10^{-14}`

   If the Jacobian matrix is near singular, then inverting :math:`J'J`
   will result in unreliable results, e.g., if

   .. math::

      J = \begin{bmatrix}
          1.0& 1.0 \\
          1.0& 1.0000001
          \end{bmatrix}

   which is essentially a rank deficient matrix, we have

   .. math::

      (J'J)^{-1} = \begin{bmatrix}
                   2.0471e+14& -2.0471e+14 \\
                   -2.0471e+14& 2.0471e+14
                   \end{bmatrix}

   This is not a useful result. Therefore, by default
   :func:`Covariance::Compute` will return ``false`` if a rank
   deficient Jacobian is encountered. How rank deficiency is detected
   depends on the algorithm being used.

   1. ``DENSE_SVD``

      .. math:: \frac{\sigma_{\text{min}}}{\sigma_{\text{max}}} < \sqrt{\text{min_reciprocal_condition_number}}

      where :math:`\sigma_{\text{min}}` and
      :math:`\sigma_{\text{max}}` are the minimum and maximum
      singular values of :math:`J` respectively.

   2. ``SPARSE_QR``

      .. math:: \operatorname{rank}(J) < \operatorname{num\_col}(J)

      Here :math:`\operatorname{rank}(J)` is the estimate of the rank
      of :math:`J` returned by the sparse QR factorization
      algorithm. It is a fairly reliable indication of rank
      deficiency.

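For intuition, the ``DENSE_SVD`` criterion can be checked by hand for
the :math:`2 \times 2` Jacobian shown above. The following is a small
self-contained sketch (plain C++, no Ceres dependency; the singular
values of a :math:`2 \times 2` matrix are obtained in closed form from
the eigenvalues of :math:`J^\top J`):

```cpp
#include <algorithm>
#include <cmath>

// Singular values of a 2x2 matrix [[a, b], [c, d]], computed in closed
// form: sigma^2 are the eigenvalues of J'J, i.e. (T +/- sqrt(T^2 - 4 D^2))/2
// with T = a^2 + b^2 + c^2 + d^2 and D = det(J) = a d - b c.
void SingularValues2x2(double a, double b, double c, double d,
                       double* sigma_max, double* sigma_min) {
  const double T = a * a + b * b + c * c + d * d;
  const double D = a * d - b * c;
  const double disc = std::sqrt(std::max(0.0, T * T - 4.0 * D * D));
  *sigma_max = std::sqrt((T + disc) / 2.0);
  // Guard against cancellation driving the argument slightly negative.
  *sigma_min = std::sqrt(std::max(0.0, (T - disc) / 2.0));
}
```

For the matrix above this gives
:math:`\sigma_{\text{min}}/\sigma_{\text{max}} \approx 2.5 \times
10^{-8}`, which is below :math:`\sqrt{10^{-14}} = 10^{-7}`, so
``DENSE_SVD`` would flag this Jacobian as rank deficient.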
.. member:: int Covariance::Options::null_space_rank

   When using ``DENSE_SVD``, the user has more control in dealing
   with singular and near singular covariance matrices.

   As mentioned above, when the covariance matrix is near singular,
   instead of computing the inverse of :math:`J'J`, the Moore-Penrose
   pseudoinverse of :math:`J'J` should be computed.

   If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
   e_i)`, where :math:`\lambda_i` is the :math:`i^\textrm{th}`
   eigenvalue and :math:`e_i` is the corresponding eigenvector, then
   the inverse of :math:`J'J` is

   .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'

   and computing the pseudo inverse involves dropping terms from this
   sum that correspond to small eigenvalues.

   How terms are dropped is controlled by
   ``min_reciprocal_condition_number`` and ``null_space_rank``.

   If ``null_space_rank`` is non-negative, then the smallest
   ``null_space_rank`` eigenvalue/eigenvector pairs are dropped
   irrespective of the magnitude of :math:`\lambda_i`. If the ratio of
   the smallest non-zero eigenvalue to the largest eigenvalue in the
   truncated matrix is still below
   ``min_reciprocal_condition_number``, then ``Covariance::Compute()``
   will fail and return ``false``.

   Setting ``null_space_rank = -1`` drops all terms for which

   .. math:: \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}

   This option has no effect on ``SPARSE_QR``.

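The interaction of the two options can be illustrated with a small
self-contained sketch (plain C++; a diagonal :math:`J'J` is used so
that the eigenvectors are the coordinate axes -- this is an
illustration of the truncation rules above, not Ceres'
implementation):

```cpp
#include <vector>

// Illustration of the DENSE_SVD truncation rules on a diagonal J'J whose
// eigenvalues are given in decreasing order. Returns the diagonal of the
// pseudo inverse: 1/lambda_i for kept terms, 0 for dropped ones.
//  - null_space_rank >= 0 drops the smallest null_space_rank eigenvalues.
//  - null_space_rank == -1 drops every lambda_i with
//    lambda_i / lambda_max < min_reciprocal_condition_number.
std::vector<double> DiagonalPseudoInverse(
    const std::vector<double>& lambda,
    int null_space_rank,
    double min_reciprocal_condition_number) {
  const double lambda_max = lambda.front();
  const int n = static_cast<int>(lambda.size());
  std::vector<double> inverse(lambda.size(), 0.0);
  for (int i = 0; i < n; ++i) {
    const bool drop =
        (null_space_rank >= 0)
            ? (i + null_space_rank >= n)
            : (lambda[i] / lambda_max < min_reciprocal_condition_number);
    if (!drop) {
      inverse[i] = 1.0 / lambda[i];  // the (1 / lambda_i) e_i e_i' term
    }
  }
  return inverse;
}
```

With eigenvalues :math:`(4, 1, 10^{-20})` and ``null_space_rank =
-1``, the last term is dropped because :math:`10^{-20}/4` is far below
``min_reciprocal_condition_number``; ``null_space_rank = 1`` drops the
same term by count rather than by magnitude.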
.. member:: bool Covariance::Options::apply_loss_function

   Default: ``true``

   Even though the residual blocks in the problem may contain loss
   functions, setting ``apply_loss_function`` to ``false`` will turn
   off the application of the loss function to the output of the cost
   function and in turn its effect on the covariance.

.. class:: Covariance

   :class:`Covariance::Options` as the name implies is used to control
   the covariance estimation algorithm. Covariance estimation is a
   complicated and numerically sensitive procedure. Please read the
   entire documentation for :class:`Covariance::Options` before using
   :class:`Covariance`.

.. function:: bool Covariance::Compute(const vector<pair<const double*, const double*> >& covariance_blocks, Problem* problem)

   Compute a part of the covariance matrix.

   The vector ``covariance_blocks`` indexes into the covariance
   matrix block-wise using pairs of parameter blocks. This allows the
   covariance estimation algorithm to only compute and store these
   blocks.

   Since the covariance matrix is symmetric, if the user passes
   ``<block1, block2>``, then ``GetCovarianceBlock`` can be called with
   ``block1``, ``block2`` as well as ``block2``, ``block1``.

   ``covariance_blocks`` cannot contain duplicates. Bad things will
   happen if they do.

   Note that the list of ``covariance_blocks`` is only used to
   determine what parts of the covariance matrix are computed. The
   full Jacobian is used to do the computation, i.e. they do not have
   an impact on what part of the Jacobian is used for computation.

   The return value indicates the success or failure of the covariance
   computation. Please see the documentation for
   :class:`Covariance::Options` for more on the conditions under which
   this function returns ``false``.

.. function:: bool GetCovarianceBlock(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the cross-covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlock``, and the pair ``<parameter_block1,
   parameter_block2>`` OR the pair ``<parameter_block2,
   parameter_block1>`` must have been present in the vector
   ``covariance_blocks`` when ``Compute`` was called. Otherwise
   ``GetCovarianceBlock`` will return ``false``.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_size x parameter_block2_size`` matrix. The
   returned covariance will be a row-major matrix.

.. function:: bool GetCovarianceBlockInTangentSpace(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the cross-covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``. Returns the
   cross-covariance in the tangent space if a local parameterization
   is associated with either parameter block; else returns the
   cross-covariance in the ambient space.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlockInTangentSpace``, and the pair
   ``<parameter_block1, parameter_block2>`` OR the pair
   ``<parameter_block2, parameter_block1>`` must have been present in
   the vector ``covariance_blocks`` when ``Compute`` was
   called. Otherwise ``GetCovarianceBlockInTangentSpace`` will return
   ``false``.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_local_size x parameter_block2_local_size``
   matrix. The returned covariance will be a row-major matrix.

Example Usage
=============

.. code-block:: c++

  double x[3];
  double y[2];

  Problem problem;
  problem.AddParameterBlock(x, 3);
  problem.AddParameterBlock(y, 2);
  <Build Problem>
  <Solve Problem>

  Covariance::Options options;
  Covariance covariance(options);

  vector<pair<const double*, const double*> > covariance_blocks;
  covariance_blocks.push_back(make_pair(x, x));
  covariance_blocks.push_back(make_pair(y, y));
  covariance_blocks.push_back(make_pair(x, y));

  CHECK(covariance.Compute(covariance_blocks, &problem));

  double covariance_xx[3 * 3];
  double covariance_yy[2 * 2];
  double covariance_xy[3 * 2];
  covariance.GetCovarianceBlock(x, x, covariance_xx);
  covariance.GetCovarianceBlock(y, y, covariance_yy);
  covariance.GetCovarianceBlock(x, y, covariance_xy);