Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 1 | .. default-domain:: cpp |
| 2 | |
| 3 | .. cpp:namespace:: ceres |
| 4 | |
| 5 | .. _chapter-interfacing_with_automatic_differentiation: |
| 6 | |
| 7 | Interfacing with Automatic Differentiation |
| 8 | ========================================== |
| 9 | |
| 10 | Automatic differentiation is straightforward to use in cases where an |
| 11 | explicit expression for the cost function is available. But this is |
| 12 | not always possible. Often one has to interface with external routines |
| 13 | or data. In this chapter we will consider a number of different ways |
| 14 | of doing so. |
| 15 | |
| 16 | To do this, we will consider the problem of finding parameters |
| 17 | :math:`\theta` and :math:`t` that solve an optimization problem of the |
| 18 | form: |
| 19 | |
| 20 | .. math:: |
| 21 | \min & \quad \sum_i \left \|y_i - f\left (\|q_{i}\|^2\right) q_i |
| 22 | \right \|^2\\ |
| 23 | \text{such that} & \quad q_i = R(\theta) x_i + t |
| 24 | |
| 25 | Here, :math:`R` is a two dimensional rotation matrix parameterized |
| 26 | using the angle :math:`\theta` and :math:`t` is a two dimensional |
| 27 | vector. :math:`f` is an external distortion function. |
| 28 | |
| 29 | We begin by considering the case, where we have a templated function |
| 30 | :code:`TemplatedComputeDistortion` that can compute the function |
| 31 | :math:`f`. Then the implementation of the corresponding residual |
| 32 | functor is straightforward and will look as follows: |
| 33 | |
| 34 | .. code-block:: c++ |
| 35 | :emphasize-lines: 21 |
| 36 | |
| 37 | template <typename T> T TemplatedComputeDistortion(const T r2) { |
| 38 | const double k1 = 0.0082; |
| 39 | const double k2 = 0.000023; |
| 40 | return 1.0 + k1 * y2 + k2 * r2 * r2; |
| 41 | } |
| 42 | |
| 43 | struct Affine2DWithDistortion { |
| 44 | Affine2DWithDistortion(const double x_in[2], const double y_in[2]) { |
| 45 | x[0] = x_in[0]; |
| 46 | x[1] = x_in[1]; |
| 47 | y[0] = y_in[0]; |
| 48 | y[1] = y_in[1]; |
| 49 | } |
| 50 | |
| 51 | template <typename T> |
| 52 | bool operator()(const T* theta, |
| 53 | const T* t, |
| 54 | T* residuals) const { |
| 55 | const T q_0 = cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0]; |
| 56 | const T q_1 = sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1]; |
| 57 | const T f = TemplatedComputeDistortion(q_0 * q_0 + q_1 * q_1); |
| 58 | residuals[0] = y[0] - f * q_0; |
| 59 | residuals[1] = y[1] - f * q_1; |
| 60 | return true; |
| 61 | } |
| 62 | |
| 63 | double x[2]; |
| 64 | double y[2]; |
| 65 | }; |
| 66 | |
| 67 | So far so good, but let us now consider three ways of defining |
| 68 | :math:`f` which are not directly amenable to being used with automatic |
| 69 | differentiation: |
| 70 | |
| 71 | #. A non-templated function that evaluates its value. |
| 72 | #. A function that evaluates its value and derivative. |
| 73 | #. A function that is defined as a table of values to be interpolated. |
| 74 | |
| 75 | We will consider them in turn below. |
| 76 | |
| 77 | A function that returns its value |
| 78 | ---------------------------------- |
| 79 | |
| 80 | Suppose we were given a function :code:`ComputeDistortionValue` with |
| 81 | the following signature |
| 82 | |
| 83 | .. code-block:: c++ |
| 84 | |
| 85 | double ComputeDistortionValue(double r2); |
| 86 | |
| 87 | that computes the value of :math:`f`. The actual implementation of the |
| 88 | function does not matter. Interfacing this function with |
| 89 | :code:`Affine2DWithDistortion` is a three step process: |
| 90 | |
| 91 | 1. Wrap :code:`ComputeDistortionValue` into a functor |
| 92 | :code:`ComputeDistortionValueFunctor`. |
| 93 | 2. Numerically differentiate :code:`ComputeDistortionValueFunctor` |
| 94 | using :class:`NumericDiffCostFunction` to create a |
| 95 | :class:`CostFunction`. |
| 96 | 3. Wrap the resulting :class:`CostFunction` object using |
| 97 | :class:`CostFunctionToFunctor`. The resulting object is a functor |
| 98 | with a templated :code:`operator()` method, which pipes the |
| 99 | Jacobian computed by :class:`NumericDiffCostFunction` into the |
| 100 | appropriate :code:`Jet` objects. |
| 101 | |
| 102 | An implementation of the above three steps looks as follows: |
| 103 | |
| 104 | .. code-block:: c++ |
| 105 | :emphasize-lines: 15,16,17,18,19,20, 29 |
| 106 | |
| 107 | struct ComputeDistortionValueFunctor { |
| 108 | bool operator()(const double* r2, double* value) const { |
| 109 | *value = ComputeDistortionValue(r2[0]); |
| 110 | return true; |
| 111 | } |
| 112 | }; |
| 113 | |
| 114 | struct Affine2DWithDistortion { |
| 115 | Affine2DWithDistortion(const double x_in[2], const double y_in[2]) { |
| 116 | x[0] = x_in[0]; |
| 117 | x[1] = x_in[1]; |
| 118 | y[0] = y_in[0]; |
| 119 | y[1] = y_in[1]; |
| 120 | |
| 121 | compute_distortion.reset(new ceres::CostFunctionToFunctor<1, 1>( |
| 122 | new ceres::NumericDiffCostFunction<ComputeDistortionValueFunctor, |
| 123 | ceres::CENTRAL, |
| 124 | 1, |
| 125 | 1>( |
| 126 | new ComputeDistortionValueFunctor))); |
| 127 | } |
| 128 | |
| 129 | template <typename T> |
| 130 | bool operator()(const T* theta, const T* t, T* residuals) const { |
| 131 | const T q_0 = cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0]; |
| 132 | const T q_1 = sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1]; |
| 133 | const T r2 = q_0 * q_0 + q_1 * q_1; |
| 134 | T f; |
| 135 | (*compute_distortion)(&r2, &f); |
| 136 | residuals[0] = y[0] - f * q_0; |
| 137 | residuals[1] = y[1] - f * q_1; |
| 138 | return true; |
| 139 | } |
| 140 | |
| 141 | double x[2]; |
| 142 | double y[2]; |
| 143 | std::unique_ptr<ceres::CostFunctionToFunctor<1, 1> > compute_distortion; |
| 144 | }; |
| 145 | |
| 146 | |
| 147 | A function that returns its value and derivative |
| 148 | ------------------------------------------------ |
| 149 | |
| 150 | Now suppose we are given a function :code:`ComputeDistortionValue` |
| 151 | thatis able to compute its value and optionally its Jacobian on demand |
| 152 | and has the following signature: |
| 153 | |
| 154 | .. code-block:: c++ |
| 155 | |
| 156 | void ComputeDistortionValueAndJacobian(double r2, |
| 157 | double* value, |
| 158 | double* jacobian); |
| 159 | |
| 160 | Again, the actual implementation of the function does not |
| 161 | matter. Interfacing this function with :code:`Affine2DWithDistortion` |
| 162 | is a two step process: |
| 163 | |
| 164 | 1. Wrap :code:`ComputeDistortionValueAndJacobian` into a |
| 165 | :class:`CostFunction` object which we call |
| 166 | :code:`ComputeDistortionFunction`. |
| 167 | 2. Wrap the resulting :class:`ComputeDistortionFunction` object using |
| 168 | :class:`CostFunctionToFunctor`. The resulting object is a functor |
| 169 | with a templated :code:`operator()` method, which pipes the |
| 170 | Jacobian computed by :class:`NumericDiffCostFunction` into the |
| 171 | appropriate :code:`Jet` objects. |
| 172 | |
| 173 | The resulting code will look as follows: |
| 174 | |
| 175 | .. code-block:: c++ |
| 176 | :emphasize-lines: 21,22, 33 |
| 177 | |
| 178 | class ComputeDistortionFunction : public ceres::SizedCostFunction<1, 1> { |
| 179 | public: |
| 180 | virtual bool Evaluate(double const* const* parameters, |
| 181 | double* residuals, |
| 182 | double** jacobians) const { |
| 183 | if (!jacobians) { |
Austin Schuh | 1d1e6ea | 2020-12-23 21:56:30 -0800 | [diff] [blame^] | 184 | ComputeDistortionValueAndJacobian(parameters[0][0], residuals, nullptr); |
Austin Schuh | 70cc955 | 2019-01-21 19:46:48 -0800 | [diff] [blame] | 185 | } else { |
| 186 | ComputeDistortionValueAndJacobian(parameters[0][0], residuals, jacobians[0]); |
| 187 | } |
| 188 | return true; |
| 189 | } |
| 190 | }; |
| 191 | |
| 192 | struct Affine2DWithDistortion { |
| 193 | Affine2DWithDistortion(const double x_in[2], const double y_in[2]) { |
| 194 | x[0] = x_in[0]; |
| 195 | x[1] = x_in[1]; |
| 196 | y[0] = y_in[0]; |
| 197 | y[1] = y_in[1]; |
| 198 | compute_distortion.reset( |
| 199 | new ceres::CostFunctionToFunctor<1, 1>(new ComputeDistortionFunction)); |
| 200 | } |
| 201 | |
| 202 | template <typename T> |
| 203 | bool operator()(const T* theta, |
| 204 | const T* t, |
| 205 | T* residuals) const { |
| 206 | const T q_0 = cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0]; |
| 207 | const T q_1 = sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1]; |
| 208 | const T r2 = q_0 * q_0 + q_1 * q_1; |
| 209 | T f; |
| 210 | (*compute_distortion)(&r2, &f); |
| 211 | residuals[0] = y[0] - f * q_0; |
| 212 | residuals[1] = y[1] - f * q_1; |
| 213 | return true; |
| 214 | } |
| 215 | |
| 216 | double x[2]; |
| 217 | double y[2]; |
| 218 | std::unique_ptr<ceres::CostFunctionToFunctor<1, 1> > compute_distortion; |
| 219 | }; |
| 220 | |
| 221 | |
| 222 | A function that is defined as a table of values |
| 223 | ----------------------------------------------- |
| 224 | |
| 225 | The third and final case we will consider is where the function |
| 226 | :math:`f` is defined as a table of values on the interval :math:`[0, |
| 227 | 100)`, with a value for each integer. |
| 228 | |
| 229 | .. code-block:: c++ |
| 230 | |
| 231 | vector<double> distortion_values; |
| 232 | |
| 233 | There are many ways of interpolating a table of values. Perhaps the |
| 234 | simplest and most common method is linear interpolation. But it is not |
| 235 | a great idea to use linear interpolation because the interpolating |
| 236 | function is not differentiable at the sample points. |
| 237 | |
| 238 | A simple (well behaved) differentiable interpolation is the `Cubic |
| 239 | Hermite Spline |
| 240 | <http://en.wikipedia.org/wiki/Cubic_Hermite_spline>`_. Ceres Solver |
| 241 | ships with routines to perform Cubic & Bi-Cubic interpolation that is |
| 242 | automatic differentiation friendly. |
| 243 | |
| 244 | Using Cubic interpolation requires first constructing a |
| 245 | :class:`Grid1D` object to wrap the table of values and then |
| 246 | constructing a :class:`CubicInterpolator` object using it. |
| 247 | |
| 248 | The resulting code will look as follows: |
| 249 | |
| 250 | .. code-block:: c++ |
| 251 | :emphasize-lines: 10,11,12,13, 24, 32,33 |
| 252 | |
| 253 | struct Affine2DWithDistortion { |
| 254 | Affine2DWithDistortion(const double x_in[2], |
| 255 | const double y_in[2], |
| 256 | const std::vector<double>& distortion_values) { |
| 257 | x[0] = x_in[0]; |
| 258 | x[1] = x_in[1]; |
| 259 | y[0] = y_in[0]; |
| 260 | y[1] = y_in[1]; |
| 261 | |
| 262 | grid.reset(new ceres::Grid1D<double, 1>( |
| 263 | &distortion_values[0], 0, distortion_values.size())); |
| 264 | compute_distortion.reset( |
| 265 | new ceres::CubicInterpolator<ceres::Grid1D<double, 1> >(*grid)); |
| 266 | } |
| 267 | |
| 268 | template <typename T> |
| 269 | bool operator()(const T* theta, |
| 270 | const T* t, |
| 271 | T* residuals) const { |
| 272 | const T q_0 = cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0]; |
| 273 | const T q_1 = sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1]; |
| 274 | const T r2 = q_0 * q_0 + q_1 * q_1; |
| 275 | T f; |
| 276 | compute_distortion->Evaluate(r2, &f); |
| 277 | residuals[0] = y[0] - f * q_0; |
| 278 | residuals[1] = y[1] - f * q_1; |
| 279 | return true; |
| 280 | } |
| 281 | |
| 282 | double x[2]; |
| 283 | double y[2]; |
| 284 | std::unique_ptr<ceres::Grid1D<double, 1> > grid; |
| 285 | std::unique_ptr<ceres::CubicInterpolator<ceres::Grid1D<double, 1> > > compute_distortion; |
| 286 | }; |
| 287 | |
| 288 | In the above example we used :class:`Grid1D` and |
| 289 | :class:`CubicInterpolator` to interpolate a one dimensional table of |
| 290 | values. :class:`Grid2D` combined with :class:`CubicInterpolator` lets |
| 291 | the user to interpolate two dimensional tables of values. Note that |
| 292 | neither :class:`Grid1D` or :class:`Grid2D` are limited to scalar |
| 293 | valued functions, they also work with vector valued functions. |