diff --git a/docs/source/nnls_solving.rst b/docs/source/nnls_solving.rst
index 713d54d..285df3a 100644
--- a/docs/source/nnls_solving.rst
+++ b/docs/source/nnls_solving.rst
@@ -58,8 +58,8 @@
 algorithms can be divided into two major categories [NocedalWright]_.
 
 1. **Trust Region** The trust region approach approximates the
-   objective function using using a model function (often a quadratic)
-   over a subset of the search space known as the trust region. If the
+   objective function using a model function (often a quadratic) over
+   a subset of the search space known as the trust region. If the
    model function succeeds in minimizing the true objective function
    the trust region is expanded; conversely, otherwise it is
    contracted and the model optimization problem is solved again.
@@ -166,10 +166,10 @@
 will assume that the matrix :math:`\frac{1}{\sqrt{\mu}} D` has been concatenated
 at the bottom of the matrix :math:`J` and similarly a vector of zeros
 has been added to the bottom of the vector :math:`f` and the rest of
-our discussion will be in terms of :math:`J` and :math:`f`, i.e, the
+our discussion will be in terms of :math:`J` and :math:`F`, i.e, the
 linear least squares problem.
 
-.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + f(x)\|^2 .
+.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + F(x)\|^2 .
    :label: simple
 
 For all but the smallest problems the solution of :eq:`simple` in
@@ -648,11 +648,11 @@
      access to :math:`S` via its product with a vector, one way to
      evaluate :math:`Sx` is to observe that
 
-     .. math::  x_1 &= E^\top x
-     .. math::  x_2 &= C^{-1} x_1
-     .. math::  x_3 &= Ex_2\\
-     .. math::  x_4 &= Bx\\
-     .. math::   Sx &= x_4 - x_3
+     .. math::  x_1 &= E^\top x\\
+                x_2 &= C^{-1} x_1\\
+                x_3 &= Ex_2\\
+                x_4 &= Bx\\
+                Sx &= x_4 - x_3
         :label: schurtrick1
 
      Thus, we can run PCG on :math:`S` with the same computational
@@ -693,7 +693,7 @@
 .. _section-preconditioner:
 
 Preconditioner
---------------
+==============
 
 The convergence rate of Conjugate Gradients for
 solving :eq:`normal` depends on the distribution of eigenvalues
@@ -726,34 +726,96 @@
 based preconditioners have much better convergence behavior than the
 Jacobi preconditioner, but are also much more expensive.
 
+For a survey of the state of the art in preconditioning linear least
+squares problems with general sparsity structure see [GouldScott]_.
+
+Ceres Solver comes with an number of preconditioners suited for
+problems with general sparsity as well as the special sparsity
+structure encountered in bundle adjustment problems.
+
+``JACOBI``
+----------
+
 The simplest of all preconditioners is the diagonal or Jacobi
 preconditioner, i.e., :math:`M=\operatorname{diag}(A)`, which for
 block structured matrices like :math:`H` can be generalized to the
-block Jacobi preconditioner. Ceres implements the block Jacobi
-preconditioner and refers to it as ``JACOBI``. When used with
-:ref:`section-cgnr` it refers to the block diagonal of :math:`H` and
-when used with :ref:`section-iterative_schur` it refers to the block
-diagonal of :math:`B` [Mandel]_.
+block Jacobi preconditioner. The ``JACOBI`` preconditioner in Ceres
+when used with :ref:`section-cgnr` refers to the block diagonal of
+:math:`H` and when used with :ref:`section-iterative_schur` refers to
+the block diagonal of :math:`B` [Mandel]_. For detailed performance
+data about the performance of ``JACOBI`` on bundle adjustment problems
+see [Agarwal]_.
+
+
+``SCHUR_JACOBI``
+----------------
 
 Another obvious choice for :ref:`section-iterative_schur` is the block
 diagonal of the Schur complement matrix :math:`S`, i.e, the block
-Jacobi preconditioner for :math:`S`. Ceres implements it and refers to
-is as the ``SCHUR_JACOBI`` preconditioner.
+Jacobi preconditioner for :math:`S`. In Ceres we refer to it as the
+``SCHUR_JACOBI`` preconditioner.  For detailed performance data about
+the performance of ``SCHUR_JACOBI`` on bundle adjustment problems see
+[Agarwal]_.
+
+
+``CLUSTER_JACOBI`` and ``CLUSTER_TRIDIAGONAL``
+----------------------------------------------
 
 For bundle adjustment problems arising in reconstruction from
 community photo collections, more effective preconditioners can be
 constructed by analyzing and exploiting the camera-point visibility
-structure of the scene [KushalAgarwal]_. Ceres implements the two
-visibility based preconditioners described by Kushal & Agarwal as
-``CLUSTER_JACOBI`` and ``CLUSTER_TRIDIAGONAL``. These are fairly new
-preconditioners and Ceres' implementation of them is in its early
-stages and is not as mature as the other preconditioners described
-above.
+structure of the scene.
+
+The key idea is to cluster the cameras based on the visibility
+structure of the scene. The similarity between a pair of cameras
+:math:`i` and :math:`j` is given by:
+
+  .. math:: S_{ij} = \frac{|V_i \cap V_j|}{|V_i| |V_j|}
+
+Here :math:`V_i` is the set of scene points visible in camera
+:math:`i`. This idea was first exploited by [KushalAgarwal]_ to create
+the ``CLUSTER_JACOBI`` and the ``CLUSTER_TRIDIAGONAL`` preconditioners
+which Ceres implements.
+
+The performance of these two preconditioners depends on the speed and
+clustering quality of the clustering algorithm used when building the
+preconditioner. In the original paper, [KushalAgarwal]_ used the
+Canonical Views algorithm [Simon]_, which while producing high quality
+clusterings can be quite expensive for large graphs. So, Ceres
+supports two visibility clustering algorithms - ``CANONICAL_VIEWS``
+and ``SINGLE_LINKAGE``. The former is as the name implies Canonical
+Views algorithm of [Simon]_. The latter is the the classic `Single
+Linkage Clustering
+<https://en.wikipedia.org/wiki/Single-linkage_clustering>`_
+algorithm. The choice of clustering algorithm is controlled by
+:member:`Solver::Options::visibility_clustering_type`.
+
+``SUBSET``
+----------
+
+This is a  preconditioner for problems with general  sparsity. Given a
+subset  of residual  blocks of  a problem,  it uses  the corresponding
+subset  of the  rows of  the  Jacobian to  construct a  preconditioner
+[Dellaert]_.
+
+Suppose the Jacobian :math:`J` has been horizontally partitioned as
+
+  .. math:: J = \begin{bmatrix} P \\ Q \end{bmatrix}
+
+Where, :math:`Q` is the set of rows corresponding to the residual
+blocks in
+:member:`Solver::Options::residual_blocks_for_subset_preconditioner`. The
+preconditioner is the matrix :math:`(Q^\top Q)^{-1}`.
+
+The efficacy of the preconditioner depends on how well the matrix
+:math:`Q` approximates :math:`J^\top J`, or how well the chosen
+residual blocks approximate the full problem.
+
 
 .. _section-ordering:
 
 Ordering
---------
+========
 
 The order in which variables are eliminated in a linear solver can
 have a significant of impact on the efficiency and accuracy of the
@@ -992,6 +1054,11 @@
    search, if a step size satisfying the search conditions cannot be
    found within this number of trials, the line search will stop.
 
+   The minimum allowed value is 0 for trust region minimizer and 1
+   otherwise. If 0 is specified for the trust region minimizer, then
+   line search will not be used when solving constrained optimization
+   problems.
+
    As this is an 'artificial' constraint (one imposed by the user, not
    the underlying math), if ``WOLFE`` line search is being used, *and*
    points satisfying the Armijo sufficient (function) decrease
@@ -1125,7 +1192,7 @@
 
 .. member:: double Solver::Options::min_lm_diagonal
 
-   Default: ``1e6``
+   Default: ``1e-6``
 
    The ``LEVENBERG_MARQUARDT`` strategy, uses a diagonal matrix to
    regularize the trust region step. This is the lower bound on
@@ -1229,6 +1296,29 @@
    recommend that you try ``CANONICAL_VIEWS`` first and if it is too
    expensive try ``SINGLE_LINKAGE``.
 
+.. member:: std::unordered_set<ResidualBlockId> residual_blocks_for_subset_preconditioner
+
+   ``SUBSET`` preconditioner is a preconditioner for problems with
+   general sparsity. Given a subset of residual blocks of a problem,
+   it uses the corresponding subset of the rows of the Jacobian to
+   construct a preconditioner.
+
+   Suppose the Jacobian :math:`J` has been horizontally partitioned as
+
+       .. math:: J = \begin{bmatrix} P \\ Q \end{bmatrix}
+
+   Where, :math:`Q` is the set of rows corresponding to the residual
+   blocks in
+   :member:`Solver::Options::residual_blocks_for_subset_preconditioner`. The
+   preconditioner is the matrix :math:`(Q^\top Q)^{-1}`.
+
+   The efficacy of the preconditioner depends on how well the matrix
+   :math:`Q` approximates :math:`J^\top J`, or how well the chosen
+   residual blocks approximate the full problem.
+
+   If ``Solver::Options::preconditioner_type == SUBSET``, then
+   ``residual_blocks_for_subset_preconditioner`` must be non-empty.
+
 .. member:: DenseLinearAlgebraLibrary Solver::Options::dense_linear_algebra_library_type
 
    Default:``EIGEN``
@@ -1408,6 +1498,10 @@
    on each Newton/Trust region step using a coordinate descent
    algorithm.  For more details, see :ref:`section-inner-iterations`.
 
+   **Note** Inner iterations cannot be used with :class:`Problem`
+   objects that have an :class:`EvaluationCallback` associated with
+   them.
+
 .. member:: double Solver::Options::inner_iteration_tolerance
 
    Default: ``1e-3``
@@ -1559,7 +1653,7 @@
 
 .. member:: double Solver::Options::gradient_check_relative_precision
 
-   Default: ``1e08``
+   Default: ``1e-8``
 
    Precision to check for in the gradient checker. If the relative
    difference between an element in a Jacobian exceeds this number,
@@ -1598,63 +1692,40 @@
    which break this finite difference heuristic, but they do not come
    up often in practice.
 
-.. member:: vector<IterationCallback> Solver::Options::callbacks
-
-   Callbacks that are executed at the end of each iteration of the
-   :class:`Minimizer`. They are executed in the order that they are
-   specified in this vector. By default, parameter blocks are updated
-   only at the end of the optimization, i.e., when the
-   :class:`Minimizer` terminates. This behavior is controlled by
-   :member:`Solver::Options::update_state_every_iteration`. If the user
-   wishes to have access to the updated parameter blocks when his/her
-   callbacks are executed, then set
-   :member:`Solver::Options::update_state_every_iteration` to true.
-
-   The solver does NOT take ownership of these pointers.
-
 .. member:: bool Solver::Options::update_state_every_iteration
 
    Default: ``false``
 
-   If true, the user's parameter blocks are updated at the end of
-   every Minimizer iteration, otherwise they are updated when the
-   Minimizer terminates. This is useful if, for example, the user
-   wishes to visualize the state of the optimization every iteration
-   (in combination with an IterationCallback).
+   If ``update_state_every_iteration`` is ``true``, then Ceres Solver
+   will guarantee that at the end of every iteration and before any
+   user :class:`IterationCallback` is called, the parameter blocks are
+   updated to the current best solution found by the solver. Thus the
+   IterationCallback can inspect the values of the parameter blocks
+   for purposes of computation, visualization or termination.
 
-   **Note**: If :member:`Solver::Options::evaluation_callback` is set,
-   then the behaviour of this flag is slightly different in each case:
+   If ``update_state_every_iteration`` is ``false`` then there is no
+   such guarantee, and user provided :class:`IterationCallback` s
+   should not expect to look at the parameter blocks and interpret
+   their values.
 
-   1. If :member:`Solver::Options::update_state_every_iteration` is
-      false, then the user's state is changed at every residual and/or
-      jacobian evaluation. Any user provided IterationCallbacks should
-      **not** inspect and depend on the user visible state while the
-      solver is running, since they it have undefined contents.
+.. member:: vector<IterationCallback> Solver::Options::callbacks
 
-   2. If :member:`Solver::Options::update_state_every_iteration` is
-      false, then the user's state is changed at every residual and/or
-      jacobian evaluation, BUT the solver will ensure that before the
-      user provided `IterationCallbacks` are called, the user visible
-      state will be updated to the current best point found by the
-      solver.
+   Callbacks that are executed at the end of each iteration of the
+   :class:`Minimizer`. They are executed in the order that they are
+   specified in this vector.
 
-.. member:: bool Solver::Options::evaluation_callback
+   By default, parameter blocks are updated only at the end of the
+   optimization, i.e., when the :class:`Minimizer` terminates. This
+   means that by default, if an :class:`IterationCallback` inspects
+   the parameter blocks, they will not see them changing in the course
+   of the optimization.
 
-   Default: ``NULL``
+   To tell Ceres to update the parameter blocks at the end of each
+   iteration and before calling the user's callback, set
+   :member:`Solver::Options::update_state_every_iteration` to
+   ``true``.
 
-   If non-``NULL``, gets notified when Ceres is about to evaluate the
-   residuals and/or Jacobians. This enables sharing computation between
-   residuals, which in some cases is important for efficient cost
-   evaluation. See :class:`EvaluationCallback` for details.
-
-   **Note**: Evaluation callbacks are incompatible with inner
-   iterations.
-
-   **Warning**: This interacts with
-   :member:`Solver::Options::update_state_every_iteration`. See the
-   documentation for that option for more details.
-
-   The solver does `not`  take ownership of the pointer.
+   The solver does NOT take ownership of these pointers.
 
 :class:`ParameterBlockOrdering`
 ===============================
@@ -1715,62 +1786,6 @@
 
    Number of groups with one or more elements.
 
-:class:`EvaluationCallback`
-===========================
-
-.. class:: EvaluationCallback
-
-   Interface for receiving callbacks before Ceres evaluates residuals or
-   Jacobians:
-
-   .. code-block:: c++
-
-      class EvaluationCallback {
-       public:
-        virtual ~EvaluationCallback() {}
-        virtual void PrepareForEvaluation()(bool evaluate_jacobians
-                                            bool new_evaluation_point) = 0;
-      };
-
-   ``PrepareForEvaluation()`` is called before Ceres requests residuals
-   or jacobians for a given setting of the parameters. User parameters
-   (the double* values provided to the cost functions) are fixed until
-   the next call to ``PrepareForEvaluation()``. If
-   ``new_evaluation_point == true``, then this is a new point that is
-   different from the last evaluated point. Otherwise, it is the same
-   point that was evaluated previously (either jacobian or residual) and
-   the user can use cached results from previous evaluations. If
-   ``evaluate_jacobians`` is true, then Ceres will request jacobians in
-   the upcoming cost evaluation.
-
-   Using this callback interface, Ceres can notify you when it is about
-   to evaluate the residuals or jacobians. With the callback, you can
-   share computation between residual blocks by doing the shared
-   computation in PrepareForEvaluation() before Ceres calls
-   CostFunction::Evaluate() on all the residuals. It also enables
-   caching results between a pure residual evaluation and a residual &
-   jacobian evaluation, via the new_evaluation_point argument.
-
-   One use case for this callback is if the cost function compute is
-   moved to the GPU. In that case, the prepare call does the actual cost
-   function evaluation, and subsequent calls from Ceres to the actual
-   cost functions merely copy the results from the GPU onto the
-   corresponding blocks for Ceres to plug into the solver.
-
-   **Note**: Ceres provides no mechanism to share data other than the
-   notification from the callback. Users must provide access to
-   pre-computed shared data to their cost functions behind the scenes;
-   this all happens without Ceres knowing. One approach is to put a
-   pointer to the shared data in each cost function (recommended) or to
-   use a global shared variable (discouraged; bug-prone).  As far as
-   Ceres is concerned, it is evaluating cost functions like any other;
-   it just so happens that behind the scenes the cost functions reuse
-   pre-computed data to execute faster.
-
-   See ``evaluation_callback_test.cc`` for code that explicitly verifies
-   the preconditions between ``PrepareForEvaluation()`` and
-   ``CostFunction::Evaluate()``.
-
 :class:`IterationCallback`
 ==========================
 
@@ -1779,7 +1794,7 @@
    :class:`IterationSummary` describes the state of the minimizer at
    the end of each iteration.
 
-.. member:: int32 IterationSummary::iteration
+.. member:: int IterationSummary::iteration
 
    Current iteration number.
 
@@ -2211,7 +2226,7 @@
    Number of threads actually used by the solver for Jacobian and
    residual evaluation. This number is not equal to
    :member:`Solver::Summary::num_threads_given` if none of `OpenMP`
-   or `CXX11_THREADS` is available.
+   or `CXX_THREADS` is available.
 
 .. member:: LinearSolverType Solver::Summary::linear_solver_type_given
 
