.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-nnls_covariance:

=====================
Covariance Estimation
=====================

Introduction
============

One way to assess the quality of the solution returned by a non-linear
least squares solver is to analyze the covariance of the solution.

Let us consider the non-linear regression problem

.. math:: y = f(x) + N(0, I)

i.e., the observation :math:`y` is a random non-linear function of the
independent variable :math:`x` with mean :math:`f(x)` and identity
covariance. Then the maximum likelihood estimate of :math:`x` given
observations :math:`y` is the solution to the non-linear least squares
problem:

.. math:: x^* = \arg \min_x \|f(x)\|^2

And the covariance of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{-1}

Here :math:`J(x^*)` is the Jacobian of :math:`f` at :math:`x^*`. The
above formula assumes that :math:`J(x^*)` has full column rank.

If :math:`J(x^*)` is rank deficient, then the covariance matrix :math:`C(x^*)`
is also rank deficient and is given by the Moore-Penrose pseudo inverse.

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{\dagger}
Note that in the above, we assumed that the covariance matrix for
:math:`y` was identity. This is an important assumption. If this is
not the case and we have

.. math:: y = f(x) + N(0, S)

where :math:`S` is a positive semi-definite matrix denoting the
covariance of :math:`y`, then the maximum likelihood problem to be
solved is

.. math:: x^* = \arg \min_x f'(x) S^{-1} f(x)

and the corresponding covariance estimate of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*) S^{-1} J(x^*)\right)^{-1}

So, if the observations being fitted have a covariance matrix that is
not the identity, then it is the user's responsibility to ensure that
the corresponding cost functions are correctly scaled, e.g. in the
above case the cost function for this problem should evaluate
:math:`S^{-1/2} f(x)` instead of just :math:`f(x)`, where
:math:`S^{-1/2}` is the inverse square root of the covariance matrix
:math:`S`.
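As an illustration, here is a minimal sketch of how such scaling might
be implemented with automatic differentiation. The model :math:`f`,
the dimensions, and all names here are hypothetical; the only
essential ingredient is that the functor returns the whitened residual
:math:`S^{-1/2}(y - f(x))`, computed via the Cholesky factorization
:math:`S = LL'`.

.. code-block:: c++

   #include "Eigen/Dense"
   #include "ceres/ceres.h"

   // Hypothetical residual functor for observations y with a known,
   // non-identity covariance S. With S = L L' (Cholesky), the
   // whitening matrix W = L^{-1} satisfies W' W = S^{-1}, so the
   // whitened residual W (y - f(x)) has identity covariance, which is
   // what the covariance estimation below assumes.
   struct WhitenedResidual {
     WhitenedResidual(const Eigen::Vector2d& y, const Eigen::Matrix2d& S)
         : y_(y),
           W_(Eigen::LLT<Eigen::Matrix2d>(S).matrixL().solve(
               Eigen::Matrix2d::Identity())) {}

     template <typename T>
     bool operator()(const T* const x, T* residual) const {
       // A stand-in model f(x); any model would do here.
       Eigen::Matrix<T, 2, 1> r;
       r(0) = T(y_(0)) - x[0];
       r(1) = T(y_(1)) - x[0] * x[0];
       // Whitened residual S^{-1/2} (y - f(x)).
       const Eigen::Matrix<T, 2, 1> wr = W_.cast<T>() * r;
       residual[0] = wr(0);
       residual[1] = wr(1);
       return true;
     }

     const Eigen::Vector2d y_;
     const Eigen::Matrix2d W_;  // W' W = S^{-1}
   };

   // Usage sketch:
   //   problem.AddResidualBlock(
   //       new ceres::AutoDiffCostFunction<WhitenedResidual, 2, 1>(
   //           new WhitenedResidual(y, S)), nullptr, &x);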

Gauge Invariance
================

In structure from motion (3D reconstruction) problems, the
reconstruction is ambiguous up to a similarity transform. This is
known as a *Gauge Ambiguity*. Handling Gauges correctly requires the
use of SVD or custom inversion algorithms. For small problems the
user can use the dense algorithm. For more details see the work of
Kanatani & Morris [KanataniMorris]_.

:class:`Covariance`
===================

:class:`Covariance` allows the user to evaluate the covariance for a
non-linear least squares problem and provides random access to its
blocks. The computation assumes that the cost functions compute
residuals such that their covariance is identity.

Since the computation of the covariance matrix requires computing the
inverse of a potentially large matrix, this can involve a rather large
amount of time and memory. However, it is usually the case that the
user is only interested in a small part of the covariance matrix,
quite often just its block diagonal. :class:`Covariance` allows the
user to specify the parts of the covariance matrix that she is
interested in and then uses this information to only compute and store
those parts of the covariance matrix.
Rank of the Jacobian
====================

As we noted above, if the Jacobian is rank deficient, then the inverse
of :math:`J'J` is not defined and instead a pseudo inverse needs to be
computed.

The rank deficiency in :math:`J` can be *structural* -- columns which
are always known to be zero -- or *numerical*, i.e., dependent on the
exact values in the Jacobian.

Structural rank deficiency occurs when the problem contains parameter
blocks that are constant. :class:`Covariance` handles this kind of
structural rank deficiency correctly.

Numerical rank deficiency, where the rank of the matrix cannot be
predicted by its sparsity structure and requires looking at its
numerical values, is more complicated. Here again there are two
cases.
a. The rank deficiency arises from overparameterization, e.g., a
   four dimensional quaternion used to parameterize :math:`SO(3)`,
   which is a three dimensional manifold. In cases like this, the
   user should use an appropriate :class:`LocalParameterization`
   (see the sketch after this list). Not only will this lead to
   better numerical behaviour of the Solver, it will also expose the
   rank deficiency to the :class:`Covariance` object so that it can
   handle it correctly.

b. More general numerical rank deficiency in the Jacobian requires
   the computation of the so-called Singular Value Decomposition
   (SVD) of :math:`J'J`. We do not know how to do this for large
   sparse matrices efficiently. For small and moderate sized
   problems this is done using dense linear algebra.
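For case (a), a minimal sketch of how the quaternion example might
look; ``problem`` and the block ``q`` are hypothetical, while
:class:`QuaternionParameterization` is the local parameterization
Ceres provides for ``w, x, y, z`` quaternions.

.. code-block:: c++

   // A hypothetical quaternion parameter block. Attaching a local
   // parameterization makes the solver -- and Covariance -- work in
   // the 3-dimensional tangent space of SO(3) rather than the ambient
   // R^4, exposing the overparameterization instead of leaving it as
   // a numerical rank deficiency in the Jacobian.
   double q[4] = {1.0, 0.0, 0.0, 0.0};  // w, x, y, z
   problem.AddParameterBlock(q, 4, new ceres::QuaternionParameterization);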

:class:`Covariance::Options`
============================

.. class:: Covariance::Options

.. member:: int Covariance::Options::num_threads

   Default: ``1``

   Number of threads to be used for evaluating the Jacobian and
   estimation of covariance.

.. member:: SparseLinearAlgebraLibraryType Covariance::Options::sparse_linear_algebra_library_type

   Default: ``SUITE_SPARSE``, if Ceres Solver is built with support for
   `SuiteSparse <http://faculty.cse.tamu.edu/davis/suitesparse.html>`_,
   and ``EIGEN_SPARSE`` otherwise. Note that ``EIGEN_SPARSE`` is
   always available.

.. member:: CovarianceAlgorithmType Covariance::Options::algorithm_type

   Default: ``SPARSE_QR``

   Ceres supports two different algorithms for covariance estimation,
   which represent different tradeoffs in speed, accuracy and
   reliability.

   1. ``SPARSE_QR`` uses the sparse QR factorization algorithm to
      compute the decomposition

      .. math::

         QR &= J\\
         \left(J^\top J\right)^{-1} &= \left(R^\top R\right)^{-1}

      The speed of this algorithm depends on the sparse linear algebra
      library being used. ``Eigen``'s sparse QR factorization is a
      moderately fast algorithm suitable for small to medium sized
      matrices. For best performance we recommend using
      ``SuiteSparseQR`` which is enabled by setting
      :member:`Covariance::Options::sparse_linear_algebra_library_type`
      to ``SUITE_SPARSE``.

      ``SPARSE_QR`` cannot compute the covariance if the Jacobian is
      rank deficient.

   2. ``DENSE_SVD`` uses ``Eigen``'s ``JacobiSVD`` to perform the
      computations. It computes the singular value decomposition

      .. math:: U S V^\top = J

      and then uses it to compute the pseudo inverse of :math:`J'J` as

      .. math:: (J'J)^{\dagger} = V S^{\dagger} V^\top

      It is an accurate but slow method and should only be used for
      small to moderate sized problems. It can handle full-rank as
      well as rank deficient Jacobians.
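   Selecting the algorithm is a single option; a minimal sketch, with
   the surrounding problem setup assumed:

   .. code-block:: c++

      ceres::Covariance::Options options;
      // SPARSE_QR (the default) is fast but fails on rank deficient
      // Jacobians; DENSE_SVD is slow but can handle rank deficiency.
      options.algorithm_type = ceres::DENSE_SVD;
      ceres::Covariance covariance(options);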

.. member:: double Covariance::Options::min_reciprocal_condition_number

   Default: :math:`10^{-14}`

   If the Jacobian matrix is near singular, then inverting :math:`J'J`
   will result in unreliable results, e.g., if

   .. math::

      J = \begin{bmatrix}
          1.0 & 1.0 \\
          1.0 & 1.0000001
          \end{bmatrix}

   which is essentially a rank deficient matrix, we have

   .. math::

      (J'J)^{-1} = \begin{bmatrix}
                    2.0471e+14 & -2.0471e+14 \\
                   -2.0471e+14 &  2.0471e+14
                   \end{bmatrix}

   This is not a useful result. Therefore, by default
   :func:`Covariance::Compute` will return ``false`` if a rank
   deficient Jacobian is encountered. How rank deficiency is detected
   depends on the algorithm being used.

   1. ``DENSE_SVD``

      .. math:: \frac{\sigma_{\text{min}}}{\sigma_{\text{max}}} < \sqrt{\text{min_reciprocal_condition_number}}

      where :math:`\sigma_{\text{min}}` and
      :math:`\sigma_{\text{max}}` are the minimum and maximum
      singular values of :math:`J` respectively.

   2. ``SPARSE_QR``

      .. math:: \operatorname{rank}(J) < \operatorname{num\_col}(J)

      Here :math:`\operatorname{rank}(J)` is the estimate of the rank
      of :math:`J` returned by the sparse QR factorization
      algorithm. It is a fairly reliable indication of rank
      deficiency.
.. member:: int Covariance::Options::null_space_rank

   When using ``DENSE_SVD``, the user has more control in dealing
   with singular and near singular covariance matrices.

   As mentioned above, when the covariance matrix is near singular,
   instead of computing the inverse of :math:`J'J`, the Moore-Penrose
   pseudoinverse of :math:`J'J` should be computed.

   If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
   e_i)`, where :math:`\lambda_i` is the :math:`i^\textrm{th}`
   eigenvalue and :math:`e_i` is the corresponding eigenvector, then
   the inverse of :math:`J'J` is

   .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'

   and computing the pseudo inverse involves dropping terms from this
   sum that correspond to small eigenvalues.

   How terms are dropped is controlled by
   `min_reciprocal_condition_number` and `null_space_rank`.

   If `null_space_rank` is non-negative, then the terms corresponding
   to the `null_space_rank` smallest eigenvalues are dropped
   irrespective of the magnitude of :math:`\lambda_i`. If the ratio
   of the smallest non-zero eigenvalue to the largest eigenvalue in
   the truncated matrix is still below
   `min_reciprocal_condition_number`, then `Covariance::Compute()`
   will fail and return ``false``.

   Setting `null_space_rank = -1` drops all terms for which

   .. math:: \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}

   This option has no effect on ``SPARSE_QR``.
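   For instance, a sketch of how one might configure this for a gauge
   ambiguous structure from motion problem; the rank of 7, matching a
   similarity transform gauge, is illustrative:

   .. code-block:: c++

      ceres::Covariance::Options options;
      options.algorithm_type = ceres::DENSE_SVD;
      // Treat the 7 smallest eigenvalues (the similarity transform
      // gauge freedom) as the null space and drop them, or use -1 to
      // drop purely by the reciprocal condition number threshold.
      options.null_space_rank = 7;
      ceres::Covariance covariance(options);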

.. member:: bool Covariance::Options::apply_loss_function

   Default: ``true``

   Even though the residual blocks in the problem may contain loss
   functions, setting ``apply_loss_function`` to false will turn off
   the application of the loss function to the output of the cost
   function and in turn its effect on the covariance.
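   A minimal sketch:

   .. code-block:: c++

      ceres::Covariance::Options options;
      // Estimate the covariance of the raw residuals, ignoring any
      // robust loss functions attached to the residual blocks.
      options.apply_loss_function = false;
      ceres::Covariance covariance(options);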

.. class:: Covariance

   :class:`Covariance::Options`, as the name implies, is used to
   control the covariance estimation algorithm. Covariance estimation
   is a complicated and numerically sensitive procedure. Please read
   the entire documentation for :class:`Covariance::Options` before
   using :class:`Covariance`.

.. function:: bool Covariance::Compute(const vector<pair<const double*, const double*> >& covariance_blocks, Problem* problem)

   Compute a part of the covariance matrix.

   The vector ``covariance_blocks`` indexes into the covariance
   matrix block-wise using pairs of parameter blocks. This allows the
   covariance estimation algorithm to only compute and store these
   blocks.

   Since the covariance matrix is symmetric, if the user passes
   ``<block1, block2>``, then ``GetCovarianceBlock`` can be called with
   ``block1``, ``block2`` as well as ``block2``, ``block1``.

   ``covariance_blocks`` cannot contain duplicates. Bad things will
   happen if they do.

   Note that the list of ``covariance_blocks`` is only used to
   determine what parts of the covariance matrix are computed. The
   full Jacobian is used to do the computation, i.e. it does not have
   an impact on what part of the Jacobian is used for computation.

   The return value indicates the success or failure of the covariance
   computation. Please see the documentation for
   :class:`Covariance::Options` for more on the conditions under which
   this function returns ``false``.
.. function:: bool GetCovarianceBlock(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the cross-covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlock``, and the pair ``<parameter_block1,
   parameter_block2>`` OR the pair ``<parameter_block2,
   parameter_block1>`` must have been present in the vector
   ``covariance_blocks`` when ``Compute`` was called. Otherwise
   ``GetCovarianceBlock`` will return ``false``.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_size x parameter_block2_size`` matrix. The
   returned covariance will be a row-major matrix.
.. function:: bool GetCovarianceBlockInTangentSpace(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the cross-covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``. Returns the
   cross-covariance in the tangent space if a local parameterization
   is associated with either parameter block; else returns the
   cross-covariance in the ambient space.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlockInTangentSpace``, and the pair
   ``<parameter_block1, parameter_block2>`` OR the pair
   ``<parameter_block2, parameter_block1>`` must have been present in
   the vector ``covariance_blocks`` when ``Compute`` was
   called. Otherwise ``GetCovarianceBlockInTangentSpace`` will return
   ``false``.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_local_size x parameter_block2_local_size``
   matrix. The returned covariance will be a row-major matrix.
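   For example, a sketch continuing the quaternion example from above,
   where ``q`` is the hypothetical 4-element quaternion block with a
   local parameterization attached, so its tangent space is
   3-dimensional:

   .. code-block:: c++

      // The tangent space covariance block of a quaternion with a
      // local parameterization is 3x3, not 4x4.
      double covariance_qq[3 * 3];
      covariance.GetCovarianceBlockInTangentSpace(q, q, covariance_qq);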

Example Usage
=============

.. code-block:: c++

   double x[3];
   double y[2];

   Problem problem;
   problem.AddParameterBlock(x, 3);
   problem.AddParameterBlock(y, 2);
   <Build Problem>
   <Solve Problem>

   Covariance::Options options;
   Covariance covariance(options);

   vector<pair<const double*, const double*> > covariance_blocks;
   covariance_blocks.push_back(make_pair(x, x));
   covariance_blocks.push_back(make_pair(y, y));
   covariance_blocks.push_back(make_pair(x, y));

   CHECK(covariance.Compute(covariance_blocks, &problem));

   double covariance_xx[3 * 3];
   double covariance_yy[2 * 2];
   double covariance_xy[3 * 2];
   covariance.GetCovarianceBlock(x, x, covariance_xx);
   covariance.GetCovarianceBlock(y, y, covariance_yy);
   covariance.GetCovarianceBlock(x, y, covariance_xy);
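The returned blocks are plain row-major ``double`` arrays. A small
usage sketch of inspecting them with ``Eigen::Map``; this is a
convenience, not something the API requires:

.. code-block:: c++

   // View the 3x3 covariance block of x without copying. The RowMajor
   // flag must match the row-major layout returned by Covariance.
   Eigen::Map<Eigen::Matrix<double, 3, 3, Eigen::RowMajor> >
       cov_xx(covariance_xx);
   std::cout << "Variance of x[0]: " << cov_xx(0, 0) << "\n";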