.. highlight:: c++

.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-nnls_covariance:

=====================
Covariance Estimation
=====================

Introduction
============

One way to assess the quality of the solution returned by a non-linear
least squares solver is to analyze the covariance of the solution.

Let us consider the non-linear regression problem

.. math:: y = f(x) + N(0, I)

i.e., the observation :math:`y` is a random non-linear function of the
independent variable :math:`x` with mean :math:`f(x)` and identity
covariance. Then the maximum likelihood estimate of :math:`x` given
observations :math:`y` is the solution to the non-linear least squares
problem:

.. math:: x^* = \arg \min_x \|f(x) - y\|^2

And the covariance of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{-1}

Here :math:`J(x^*)` is the Jacobian of :math:`f` at :math:`x^*`. The
above formula assumes that :math:`J(x^*)` has full column rank.

If :math:`J(x^*)` is rank deficient, then the covariance matrix :math:`C(x^*)`
is also rank deficient and is given by the Moore-Penrose pseudo inverse.

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{\dagger}

Note that in the above, we assumed that the covariance matrix for
:math:`y` was identity. This is an important assumption. If this is
not the case and we have

.. math:: y = f(x) + N(0, S)

where :math:`S` is a positive semi-definite matrix denoting the
covariance of :math:`y`, then the maximum likelihood problem to be
solved is

.. math:: x^* = \arg \min_x f'(x) S^{-1} f(x)

and the corresponding covariance estimate of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*) S^{-1} J(x^*)\right)^{-1}

So, if the observations being fitted to have a covariance matrix that
is not identity, then it is the user's responsibility to ensure that
the corresponding cost functions are correctly scaled, e.g., in the
above case the cost function for this problem should evaluate
:math:`S^{-1/2} f(x)` instead of just :math:`f(x)`, where
:math:`S^{-1/2}` is the inverse square root of the covariance matrix
:math:`S`.
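
As a concrete illustration, the following is a minimal sketch of such
a scaled ("whitened") residual. The functor, its members, and the two
dimensional measurement model are all hypothetical and not part of the
Ceres API; the only point is that :math:`S^{-1/2}` (here obtained as
the upper Cholesky factor of :math:`S^{-1}`) is applied to the raw
residual before it is returned to the solver.

.. code-block:: c++

   #include "Eigen/Cholesky"
   #include "Eigen/Dense"

   // Hypothetical residual that whitens the raw error e(x) = y - model(x)
   // by S^{-1/2}, so that the residual seen by the solver has identity
   // covariance, as covariance estimation assumes.
   struct WhitenedResidual {
     WhitenedResidual(const Eigen::Vector2d& y, const Eigen::Matrix2d& S)
         : y_(y),
           // If S^{-1} = L L', then r = L' e has identity covariance:
           // Cov(r) = L' S L = I. matrixU() is exactly L'.
           sqrt_S_inverse_(
               Eigen::LLT<Eigen::Matrix2d>(S.inverse()).matrixU()) {}

     template <typename T>
     bool operator()(const T* const x, T* residual) const {
       // A made-up two dimensional model of a scalar parameter x.
       Eigen::Matrix<T, 2, 1> e;
       e[0] = T(y_[0]) - x[0];
       e[1] = T(y_[1]) - x[0] * x[0];
       // Whiten: residual = S^{-1/2} * e.
       const Eigen::Matrix<T, 2, 1> r = sqrt_S_inverse_.cast<T>() * e;
       residual[0] = r[0];
       residual[1] = r[1];
       return true;
     }

     Eigen::Vector2d y_;
     Eigen::Matrix2d sqrt_S_inverse_;
   };

With this scaling in place, the covariance computed below is directly
the covariance of :math:`x^*`.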

Gauge Invariance
================

In structure from motion (3D reconstruction) problems, the
reconstruction is ambiguous up to a similarity transform. This is
known as a *Gauge Ambiguity*. Handling Gauges correctly requires the
use of SVD or custom inversion algorithms. For small problems the
user can use the dense algorithm. For more details see the work of
Kanatani & Morris [KanataniMorris]_.


:class:`Covariance`
===================

:class:`Covariance` allows the user to evaluate the covariance for a
non-linear least squares problem and provides random access to its
blocks. The computation assumes that the cost functions compute
residuals such that their covariance is identity.

Since the computation of the covariance matrix requires computing the
inverse of a potentially large matrix, this can involve a rather large
amount of time and memory. However, it is usually the case that the
user is only interested in a small part of the covariance matrix,
quite often just the block diagonal. :class:`Covariance` allows the
user to specify the parts of the covariance matrix that she is
interested in and then uses this information to only compute and
store those parts of the covariance matrix.

Rank of the Jacobian
====================

As we noted above, if the Jacobian is rank deficient, then the inverse
of :math:`J'J` is not defined and instead a pseudo inverse needs to be
computed.

The rank deficiency in :math:`J` can be *structural* -- columns
which are always known to be zero -- or *numerical* -- depending on
the exact values in the Jacobian.

Structural rank deficiency occurs when the problem contains parameter
blocks that are constant. This class correctly handles structural rank
deficiency like that.

Numerical rank deficiency, where the rank of the matrix cannot be
predicted by its sparsity structure and requires looking at its
numerical values, is more complicated. Here again there are two
cases.

a. The rank deficiency arises from overparameterization, e.g., a
   four dimensional quaternion used to parameterize :math:`SO(3)`,
   which is a three dimensional manifold. In cases like this, the
   user should use an appropriate :class:`Manifold` (see the sketch
   after this list). Not only will this lead to better numerical
   behaviour of the Solver, it will also expose the rank deficiency
   to the :class:`Covariance` object so that it can handle it
   correctly.

b. More general numerical rank deficiency in the Jacobian requires
   the computation of the so called Singular Value Decomposition
   (SVD) of :math:`J'J`. We do not know how to do this for large
   sparse matrices efficiently. For small and moderate sized
   problems this is done using dense linear algebra.
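
For case (a), a minimal sketch of attaching a manifold to an
overparameterized block might look as follows. It assumes a Ceres
version that provides :class:`QuaternionManifold`; the variable names
are illustrative.

.. code-block:: c++

   // q is a unit quaternion (w, x, y, z): 4 parameters, but only a
   // 3 dimensional manifold.
   double q[4] = {1.0, 0.0, 0.0, 0.0};

   ceres::Problem problem;
   problem.AddParameterBlock(q, 4, new ceres::QuaternionManifold);
   // ... add residual blocks involving q and solve ...

With the manifold in place, :class:`Covariance` can report the
covariance of ``q`` in its three dimensional tangent space via
:func:`GetCovarianceBlockInTangentSpace`.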

:class:`Covariance::Options`
============================

.. class:: Covariance::Options

.. member:: int Covariance::Options::num_threads

   Default: ``1``

   Number of threads to be used for evaluating the Jacobian and
   estimation of covariance.

.. member:: SparseLinearAlgebraLibraryType Covariance::Options::sparse_linear_algebra_library_type

   Default: ``SUITE_SPARSE`` if Ceres Solver is built with support for
   `SuiteSparse <http://faculty.cse.tamu.edu/davis/suitesparse.html>`_,
   and ``EIGEN_SPARSE`` otherwise. Note that ``EIGEN_SPARSE`` is
   always available.

.. member:: CovarianceAlgorithmType Covariance::Options::algorithm_type

   Default: ``SPARSE_QR``

   Ceres supports two different algorithms for covariance estimation,
   which represent different tradeoffs in speed, accuracy and
   reliability.

   1. ``SPARSE_QR`` uses the sparse QR factorization algorithm to
      compute the decomposition

      .. math::

         QR &= J\\
         \left(J^\top J\right)^{-1} &= \left(R^\top R\right)^{-1}

      The speed of this algorithm depends on the sparse linear algebra
      library being used. ``Eigen``'s sparse QR factorization is a
      moderately fast algorithm suitable for small to medium sized
      matrices. For best performance we recommend using
      ``SuiteSparseQR`` which is enabled by setting
      :member:`Covariance::Options::sparse_linear_algebra_library_type`
      to ``SUITE_SPARSE``.

      ``SPARSE_QR`` cannot compute the covariance if the Jacobian is
      rank deficient.

   2. ``DENSE_SVD`` uses ``Eigen``'s ``JacobiSVD`` to perform the
      computations. It computes the singular value decomposition

      .. math:: U D V^\top = J

      and then uses it to compute the pseudo inverse of :math:`J'J` as

      .. math:: (J'J)^{\dagger} = V D^{2\dagger} V^\top

      It is an accurate but slow method and should only be used for
      small to moderate sized problems. It can handle full-rank as
      well as rank deficient Jacobians.
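
   As a sketch, choosing between the two algorithms might look as
   follows; whether ``SUITE_SPARSE`` is usable depends on how Ceres
   was built.

   .. code-block:: c++

      ceres::Covariance::Options options;

      // Fast, but fails if the Jacobian is rank deficient.
      options.algorithm_type = ceres::SPARSE_QR;
      options.sparse_linear_algebra_library_type = ceres::SUITE_SPARSE;

      // Alternatively, slow but able to handle rank deficiency:
      //   options.algorithm_type = ceres::DENSE_SVD;

      ceres::Covariance covariance(options);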

.. member:: double Covariance::Options::column_pivot_threshold

   Default: :math:`-1`

   During QR factorization, if a column with Euclidean norm less than
   ``column_pivot_threshold`` is encountered, it is treated as zero.

   If ``column_pivot_threshold < 0``, then an automatic default value
   of `20*(m+n)*eps*sqrt(max(diag(J'*J)))` is used. Here `m` and `n`
   are the number of rows and columns of the Jacobian (`J`)
   respectively.

   This is an advanced option meant for users who know enough about
   their Jacobian matrices that they can determine a value better
   than the default.

.. member:: double Covariance::Options::min_reciprocal_condition_number

   Default: :math:`10^{-14}`

   If the Jacobian matrix is near singular, then inverting :math:`J'J`
   will result in unreliable results, e.g., if

   .. math::

      J = \begin{bmatrix}
          1.0 & 1.0 \\
          1.0 & 1.0000001
          \end{bmatrix}

   which is essentially a rank deficient matrix, we have

   .. math::

      (J'J)^{-1} = \begin{bmatrix}
                    2.0471e+14 & -2.0471e+14 \\
                   -2.0471e+14 &  2.0471e+14
                   \end{bmatrix}

   This is not a useful result. Therefore, by default
   :func:`Covariance::Compute` will return ``false`` if a rank
   deficient Jacobian is encountered. How rank deficiency is detected
   depends on the algorithm being used.

   1. ``DENSE_SVD``

      .. math:: \frac{\sigma_{\text{min}}}{\sigma_{\text{max}}} < \sqrt{\text{min_reciprocal_condition_number}}

      where :math:`\sigma_{\text{min}}` and
      :math:`\sigma_{\text{max}}` are the minimum and maximum
      singular values of :math:`J` respectively.

   2. ``SPARSE_QR``

      .. math:: \operatorname{rank}(J) < \operatorname{num\_col}(J)

      Here :math:`\operatorname{rank}(J)` is the estimate of the rank
      of :math:`J` returned by the sparse QR factorization
      algorithm. It is a fairly reliable indication of rank
      deficiency.
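
   A common pattern, sketched below, is to try the fast ``SPARSE_QR``
   algorithm first and fall back to ``DENSE_SVD`` when ``Compute``
   reports a failure, which is practical only if the problem is small
   enough for a dense SVD. ``covariance_blocks`` and ``problem`` are
   assumed to be set up as in the example at the end of this chapter.

   .. code-block:: c++

      ceres::Covariance::Options options;
      options.algorithm_type = ceres::SPARSE_QR;
      ceres::Covariance sparse_covariance(options);
      if (!sparse_covariance.Compute(covariance_blocks, &problem)) {
        // Possibly rank deficient; retry with the slower but more
        // tolerant dense SVD based algorithm.
        options.algorithm_type = ceres::DENSE_SVD;
        options.null_space_rank = -1;  // Drop all near null space terms.
        ceres::Covariance dense_covariance(options);
        CHECK(dense_covariance.Compute(covariance_blocks, &problem));
      }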

.. member:: int Covariance::Options::null_space_rank

   When using ``DENSE_SVD``, the user has more control in dealing
   with singular and near singular covariance matrices.

   As mentioned above, when the covariance matrix is near singular,
   instead of computing the inverse of :math:`J'J`, the Moore-Penrose
   pseudoinverse of :math:`J'J` should be computed.

   If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
   e_i)`, where :math:`\lambda_i` is the :math:`i^\textrm{th}`
   eigenvalue and :math:`e_i` is the corresponding eigenvector, then
   the inverse of :math:`J'J` is

   .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'

   and computing the pseudo inverse involves dropping terms from this
   sum that correspond to small eigenvalues.

   How terms are dropped is controlled by
   ``min_reciprocal_condition_number`` and ``null_space_rank``.

   If ``null_space_rank`` is non-negative, then the smallest
   ``null_space_rank`` eigenvalue/eigenvector pairs are dropped
   irrespective of the magnitude of :math:`\lambda_i`. If the ratio
   of the smallest non-zero eigenvalue to the largest eigenvalue in
   the truncated matrix is still below
   ``min_reciprocal_condition_number``, then ``Covariance::Compute()``
   will fail and return ``false``.

   Setting ``null_space_rank = -1`` drops all terms for which

   .. math:: \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}

   This option has no effect on ``SPARSE_QR``.
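
   The truncation semantics can be made concrete with a small,
   self-contained sketch of the pseudo inverse computation. This is
   just the formula above in code, not the implementation Ceres uses.

   .. code-block:: c++

      #include "Eigen/Dense"

      // Truncated pseudo inverse of J'J from its eigen decomposition,
      // mimicking null_space_rank = k: the k smallest eigenvalue terms
      // are dropped from the sum.
      Eigen::MatrixXd TruncatedPseudoInverse(const Eigen::MatrixXd& JtJ,
                                             int null_space_rank) {
        Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eig(JtJ);
        // Eigenvalues are returned in increasing order.
        const Eigen::VectorXd& lambda = eig.eigenvalues();
        const Eigen::MatrixXd& e = eig.eigenvectors();
        Eigen::MatrixXd pinv =
            Eigen::MatrixXd::Zero(JtJ.rows(), JtJ.cols());
        // Skip the null_space_rank smallest eigenvalue terms.
        for (int i = null_space_rank; i < lambda.size(); ++i) {
          pinv += e.col(i) * e.col(i).transpose() / lambda(i);
        }
        return pinv;
      }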

.. member:: bool Covariance::Options::apply_loss_function

   Default: ``true``

   Even though the residual blocks in the problem may contain loss
   functions, setting ``apply_loss_function`` to false will turn off
   the application of the loss function to the output of the cost
   function and in turn its effect on the covariance.

.. class:: Covariance

   :class:`Covariance::Options`, as the name implies, is used to
   control the covariance estimation algorithm. Covariance estimation
   is a complicated and numerically sensitive procedure. Please read
   the entire documentation for :class:`Covariance::Options` before
   using :class:`Covariance`.

.. function:: bool Covariance::Compute(const std::vector<std::pair<const double*, const double*> >& covariance_blocks, Problem* problem)

   Compute a part of the covariance matrix.

   The vector ``covariance_blocks`` indexes into the covariance
   matrix block-wise using pairs of parameter blocks. This allows the
   covariance estimation algorithm to only compute and store these
   blocks.

   Since the covariance matrix is symmetric, if the user passes
   ``<block1, block2>``, then ``GetCovarianceBlock`` can be called with
   ``block1``, ``block2`` as well as ``block2``, ``block1``.

   ``covariance_blocks`` cannot contain duplicates. Bad things will
   happen if they do.

   Note that the list of ``covariance_blocks`` is only used to
   determine what parts of the covariance matrix are computed. The
   full Jacobian is used to do the computation, i.e. the blocks
   requested do not have an impact on what part of the Jacobian is
   used for computation.

   The return value indicates the success or failure of the covariance
   computation. Please see the documentation for
   :class:`Covariance::Options` for more on the conditions under which
   this function returns ``false``.

.. function:: bool GetCovarianceBlock(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the cross-covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlock``, and the pair ``<parameter_block1,
   parameter_block2>`` OR the pair ``<parameter_block2,
   parameter_block1>`` must have been present in the vector
   ``covariance_blocks`` when ``Compute`` was called. Otherwise
   ``GetCovarianceBlock`` will return ``false``.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_size x parameter_block2_size`` matrix. The
   returned covariance will be a row-major matrix.
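
   Since the returned block is row-major, one convenient way to
   consume it is to map the raw storage into an Eigen matrix. The
   mapping below is a sketch (Eigen is not required by this API) and
   assumes ``x`` is a 3-dimensional parameter block from a computed
   covariance:

   .. code-block:: c++

      double covariance_xx[3 * 3];
      covariance.GetCovarianceBlock(x, x, covariance_xx);

      // View the raw row-major storage as a 3x3 matrix, no copy made.
      Eigen::Map<const Eigen::Matrix<double, 3, 3, Eigen::RowMajor>>
          cov_xx(covariance_xx);
      double std_dev_x0 = std::sqrt(cov_xx(0, 0));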

.. function:: bool GetCovarianceBlockInTangentSpace(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the cross-covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``. Returns the
   cross-covariance in the tangent space if a manifold is associated
   with either parameter block; else returns the cross-covariance in
   the ambient space.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlockInTangentSpace``, and the pair
   ``<parameter_block1, parameter_block2>`` OR the pair
   ``<parameter_block2, parameter_block1>`` must have been present in
   the vector ``covariance_blocks`` when ``Compute`` was called.
   Otherwise ``GetCovarianceBlockInTangentSpace`` will return
   ``false``.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_tangent_size x parameter_block2_tangent_size``
   matrix. The returned covariance will be a row-major matrix.

Example Usage
=============

.. code-block:: c++

   double x[3];
   double y[2];

   Problem problem;
   problem.AddParameterBlock(x, 3);
   problem.AddParameterBlock(y, 2);
   <Build Problem>
   <Solve Problem>

   Covariance::Options options;
   Covariance covariance(options);

   std::vector<std::pair<const double*, const double*> > covariance_blocks;
   covariance_blocks.push_back(std::make_pair(x, x));
   covariance_blocks.push_back(std::make_pair(y, y));
   covariance_blocks.push_back(std::make_pair(x, y));

   CHECK(covariance.Compute(covariance_blocks, &problem));

   double covariance_xx[3 * 3];
   double covariance_yy[2 * 2];
   double covariance_xy[3 * 2];
   covariance.GetCovarianceBlock(x, x, covariance_xx);
   covariance.GetCovarianceBlock(y, y, covariance_yy);
   covariance.GetCovarianceBlock(x, y, covariance_xy);
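
If a parameter block has an associated manifold, the analogous tangent
space query sizes its output by the tangent space dimensions. A
hypothetical continuation of the example, assuming ``q`` is a
4-parameter quaternion block with a ``QuaternionManifold`` attached
and that ``make_pair(q, q)`` was also included in
``covariance_blocks``:

.. code-block:: c++

   // The tangent space of a quaternion manifold is 3 dimensional,
   // so the covariance block is 3x3 rather than 4x4.
   double covariance_qq[3 * 3];
   covariance.GetCovarianceBlockInTangentSpace(q, q, covariance_qq);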