Blame - docs/source/interfacing_with_autodiff.rst - RealtimeRoboticsGroup/test

blob: 02f58b2d1c3438b78e2800416f3ee4f2c301e456 [file] [log] [blame]

Austin Schuh	70cc955	2019-01-21 19:46:48 -0800	[diff] [blame]	1	.. default-domain:: cpp
				2
				3	.. cpp:namespace:: ceres
				4
				5	.. _chapter-interfacing_with_automatic_differentiation:
				6
				7	Interfacing with Automatic Differentiation
				8	==========================================
				9
				10	Automatic differentiation is straightforward to use in cases where an
				11	explicit expression for the cost function is available. But this is
				12	not always possible. Often one has to interface with external routines
				13	or data. In this chapter we will consider a number of different ways
				14	of doing so.
				15
				16	To do this, we will consider the problem of finding parameters
				17	:math:`\theta` and :math:`t` that solve an optimization problem of the
				18	form:
				19
				20	.. math::
				21	\min & \quad \sum_i \left \\|y_i - f\left (\\|q_{i}\\|^2\right) q_i
				22	\right \\|^2\\
				23	\text{such that} & \quad q_i = R(\theta) x_i + t
				24
				25	Here, :math:`R` is a two dimensional rotation matrix parameterized
				26	using the angle :math:`\theta` and :math:`t` is a two dimensional
				27	vector. :math:`f` is an external distortion function.
				28
				29	We begin by considering the case, where we have a templated function
				30	:code:`TemplatedComputeDistortion` that can compute the function
				31	:math:`f`. Then the implementation of the corresponding residual
				32	functor is straightforward and will look as follows:
				33
				34	.. code-block:: c++
				35	:emphasize-lines: 21
				36
				37	template <typename T> T TemplatedComputeDistortion(const T r2) {
				38	const double k1 = 0.0082;
				39	const double k2 = 0.000023;
				40	return 1.0 + k1 * y2 + k2 * r2 * r2;
				41	}
				42
				43	struct Affine2DWithDistortion {
				44	Affine2DWithDistortion(const double x_in[2], const double y_in[2]) {
				45	x[0] = x_in[0];
				46	x[1] = x_in[1];
				47	y[0] = y_in[0];
				48	y[1] = y_in[1];
				49	}
				50
				51	template <typename T>
				52	bool operator()(const T* theta,
				53	const T* t,
				54	T* residuals) const {
				55	const T q_0 = cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
				56	const T q_1 = sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
				57	const T f = TemplatedComputeDistortion(q_0 * q_0 + q_1 * q_1);
				58	residuals[0] = y[0] - f * q_0;
				59	residuals[1] = y[1] - f * q_1;
				60	return true;
				61	}
				62
				63	double x[2];
				64	double y[2];
				65	};
				66
				67	So far so good, but let us now consider three ways of defining
				68	:math:`f` which are not directly amenable to being used with automatic
				69	differentiation:
				70
				71	#. A non-templated function that evaluates its value.
				72	#. A function that evaluates its value and derivative.
				73	#. A function that is defined as a table of values to be interpolated.
				74
				75	We will consider them in turn below.
				76
				77	A function that returns its value
				78	----------------------------------
				79
				80	Suppose we were given a function :code:`ComputeDistortionValue` with
				81	the following signature
				82
				83	.. code-block:: c++
				84
				85	double ComputeDistortionValue(double r2);
				86
				87	that computes the value of :math:`f`. The actual implementation of the
				88	function does not matter. Interfacing this function with
				89	:code:`Affine2DWithDistortion` is a three step process:
				90
				91	1. Wrap :code:`ComputeDistortionValue` into a functor
				92	:code:`ComputeDistortionValueFunctor`.
				93	2. Numerically differentiate :code:`ComputeDistortionValueFunctor`
				94	using :class:`NumericDiffCostFunction` to create a
				95	:class:`CostFunction`.
				96	3. Wrap the resulting :class:`CostFunction` object using
				97	:class:`CostFunctionToFunctor`. The resulting object is a functor
				98	with a templated :code:`operator()` method, which pipes the
				99	Jacobian computed by :class:`NumericDiffCostFunction` into the
				100	appropriate :code:`Jet` objects.
				101
				102	An implementation of the above three steps looks as follows:
				103
				104	.. code-block:: c++
				105	:emphasize-lines: 15,16,17,18,19,20, 29
				106
				107	struct ComputeDistortionValueFunctor {
				108	bool operator()(const double* r2, double* value) const {
				109	*value = ComputeDistortionValue(r2[0]);
				110	return true;
				111	}
				112	};
				113
				114	struct Affine2DWithDistortion {
				115	Affine2DWithDistortion(const double x_in[2], const double y_in[2]) {
				116	x[0] = x_in[0];
				117	x[1] = x_in[1];
				118	y[0] = y_in[0];
				119	y[1] = y_in[1];
				120
				121	compute_distortion.reset(new ceres::CostFunctionToFunctor<1, 1>(
				122	new ceres::NumericDiffCostFunction<ComputeDistortionValueFunctor,
				123	ceres::CENTRAL,
				124	1,
				125	1>(
				126	new ComputeDistortionValueFunctor)));
				127	}
				128
				129	template <typename T>
				130	bool operator()(const T* theta, const T* t, T* residuals) const {
				131	const T q_0 = cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
				132	const T q_1 = sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
				133	const T r2 = q_0 * q_0 + q_1 * q_1;
				134	T f;
				135	(*compute_distortion)(&r2, &f);
				136	residuals[0] = y[0] - f * q_0;
				137	residuals[1] = y[1] - f * q_1;
				138	return true;
				139	}
				140
				141	double x[2];
				142	double y[2];
				143	std::unique_ptr<ceres::CostFunctionToFunctor<1, 1> > compute_distortion;
				144	};
				145
				146
				147	A function that returns its value and derivative
				148	------------------------------------------------
				149
				150	Now suppose we are given a function :code:`ComputeDistortionValue`
				151	thatis able to compute its value and optionally its Jacobian on demand
				152	and has the following signature:
				153
				154	.. code-block:: c++
				155
				156	void ComputeDistortionValueAndJacobian(double r2,
				157	double* value,
				158	double* jacobian);
				159
				160	Again, the actual implementation of the function does not
				161	matter. Interfacing this function with :code:`Affine2DWithDistortion`
				162	is a two step process:
				163
				164	1. Wrap :code:`ComputeDistortionValueAndJacobian` into a
				165	:class:`CostFunction` object which we call
				166	:code:`ComputeDistortionFunction`.
				167	2. Wrap the resulting :class:`ComputeDistortionFunction` object using
				168	:class:`CostFunctionToFunctor`. The resulting object is a functor
				169	with a templated :code:`operator()` method, which pipes the
				170	Jacobian computed by :class:`NumericDiffCostFunction` into the
				171	appropriate :code:`Jet` objects.
				172
				173	The resulting code will look as follows:
				174
				175	.. code-block:: c++
				176	:emphasize-lines: 21,22, 33
				177
				178	class ComputeDistortionFunction : public ceres::SizedCostFunction<1, 1> {
				179	public:
				180	virtual bool Evaluate(double const* const* parameters,
				181	double* residuals,
				182	double** jacobians) const {
				183	if (!jacobians) {
Austin Schuh	1d1e6ea	2020-12-23 21:56:30 -0800	[diff] [blame^]	184	ComputeDistortionValueAndJacobian(parameters[0][0], residuals, nullptr);
Austin Schuh	70cc955	2019-01-21 19:46:48 -0800	[diff] [blame]	185	} else {
				186	ComputeDistortionValueAndJacobian(parameters[0][0], residuals, jacobians[0]);
				187	}
				188	return true;
				189	}
				190	};
				191
				192	struct Affine2DWithDistortion {
				193	Affine2DWithDistortion(const double x_in[2], const double y_in[2]) {
				194	x[0] = x_in[0];
				195	x[1] = x_in[1];
				196	y[0] = y_in[0];
				197	y[1] = y_in[1];
				198	compute_distortion.reset(
				199	new ceres::CostFunctionToFunctor<1, 1>(new ComputeDistortionFunction));
				200	}
				201
				202	template <typename T>
				203	bool operator()(const T* theta,
				204	const T* t,
				205	T* residuals) const {
				206	const T q_0 = cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
				207	const T q_1 = sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
				208	const T r2 = q_0 * q_0 + q_1 * q_1;
				209	T f;
				210	(*compute_distortion)(&r2, &f);
				211	residuals[0] = y[0] - f * q_0;
				212	residuals[1] = y[1] - f * q_1;
				213	return true;
				214	}
				215
				216	double x[2];
				217	double y[2];
				218	std::unique_ptr<ceres::CostFunctionToFunctor<1, 1> > compute_distortion;
				219	};
				220
				221
				222	A function that is defined as a table of values
				223	-----------------------------------------------
				224
				225	The third and final case we will consider is where the function
				226	:math:`f` is defined as a table of values on the interval :math:`[0,
				227	100)`, with a value for each integer.
				228
				229	.. code-block:: c++
				230
				231	vector<double> distortion_values;
				232
				233	There are many ways of interpolating a table of values. Perhaps the
				234	simplest and most common method is linear interpolation. But it is not
				235	a great idea to use linear interpolation because the interpolating
				236	function is not differentiable at the sample points.
				237
				238	A simple (well behaved) differentiable interpolation is the `Cubic
				239	Hermite Spline
				240	<http://en.wikipedia.org/wiki/Cubic_Hermite_spline>`_. Ceres Solver
				241	ships with routines to perform Cubic & Bi-Cubic interpolation that is
				242	automatic differentiation friendly.
				243
				244	Using Cubic interpolation requires first constructing a
				245	:class:`Grid1D` object to wrap the table of values and then
				246	constructing a :class:`CubicInterpolator` object using it.
				247
				248	The resulting code will look as follows:
				249
				250	.. code-block:: c++
				251	:emphasize-lines: 10,11,12,13, 24, 32,33
				252
				253	struct Affine2DWithDistortion {
				254	Affine2DWithDistortion(const double x_in[2],
				255	const double y_in[2],
				256	const std::vector<double>& distortion_values) {
				257	x[0] = x_in[0];
				258	x[1] = x_in[1];
				259	y[0] = y_in[0];
				260	y[1] = y_in[1];
				261
				262	grid.reset(new ceres::Grid1D<double, 1>(
				263	&distortion_values[0], 0, distortion_values.size()));
				264	compute_distortion.reset(
				265	new ceres::CubicInterpolator<ceres::Grid1D<double, 1> >(*grid));
				266	}
				267
				268	template <typename T>
				269	bool operator()(const T* theta,
				270	const T* t,
				271	T* residuals) const {
				272	const T q_0 = cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
				273	const T q_1 = sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
				274	const T r2 = q_0 * q_0 + q_1 * q_1;
				275	T f;
				276	compute_distortion->Evaluate(r2, &f);
				277	residuals[0] = y[0] - f * q_0;
				278	residuals[1] = y[1] - f * q_1;
				279	return true;
				280	}
				281
				282	double x[2];
				283	double y[2];
				284	std::unique_ptr<ceres::Grid1D<double, 1> > grid;
				285	std::unique_ptr<ceres::CubicInterpolator<ceres::Grid1D<double, 1> > > compute_distortion;
				286	};
				287
				288	In the above example we used :class:`Grid1D` and
				289	:class:`CubicInterpolator` to interpolate a one dimensional table of
				290	values. :class:`Grid2D` combined with :class:`CubicInterpolator` lets
				291	the user to interpolate two dimensional tables of values. Note that
				292	neither :class:`Grid1D` or :class:`Grid2D` are limited to scalar
				293	valued functions, they also work with vector valued functions.