The solver
==========

Problem statement
-----------------

OSQP solves convex quadratic programs (QPs) of the form

.. math::
    \begin{array}{ll}
      \mbox{minimize} & \frac{1}{2} x^T P x + q^T x \\
      \mbox{subject to} & l \leq A x \leq u
    \end{array}

where :math:`x\in\mathbf{R}^{n}` is the optimization variable.
The objective function is defined by a positive semidefinite matrix
:math:`P \in \mathbf{S}^{n}_{+}` and vector :math:`q\in \mathbf{R}^{n}`.
The linear constraints are defined by matrix :math:`A\in\mathbf{R}^{m \times n}`
and vectors :math:`l` and :math:`u` such that :math:`l_i \in \mathbf{R} \cup \{-\infty\}` and :math:`u_i \in \mathbf{R} \cup \{+\infty\}` for all :math:`i \in \{1,\ldots,m\}`.
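
As an illustration, a QP in this form can be passed to the Python interface by providing :math:`P` and :math:`A` as sparse matrices in CSC format together with the vectors :math:`q`, :math:`l` and :math:`u`. The problem data below are arbitrary placeholders chosen only to show the call sequence.

.. code-block:: python

    import numpy as np
    import scipy.sparse as sparse
    import osqp

    # Objective: (1/2) x'Px + q'x with P positive semidefinite
    P = sparse.csc_matrix([[4.0, 1.0], [1.0, 2.0]])
    q = np.array([1.0, 1.0])

    # Constraints: l <= Ax <= u (use np.inf for one-sided bounds)
    A = sparse.csc_matrix([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
    l = np.array([1.0, 0.0, 0.0])
    u = np.array([1.0, 0.7, 0.7])

    prob = osqp.OSQP()
    prob.setup(P, q, A, l, u)
    res = prob.solve()
    print(res.x)   # primal solution
    print(res.y)   # dual solution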


Algorithm
---------

The solver runs the following `ADMM algorithm <http://web.stanford.edu/~boyd/papers/admm_distr_stats.html>`_ (for more details see the related papers in the :ref:`citing` section):


.. math::

    (x^{k+1}, \nu^{k+1}) & \gets \text{solve linear system}\\
    \tilde{z}^{k+1} & \gets z^{k} + \rho^{-1}(\nu^{k+1} - y^{k})\\
    z^{k+1} &\gets \Pi(\tilde{z}^{k+1} + \rho^{-1}y^{k})\\
    y^{k+1} &\gets y^{k} + \rho (\tilde{z}^{k+1} - z^{k+1})

where :math:`\Pi` is the projection onto the hyperbox :math:`[l,u]` and :math:`\rho` is the ADMM step-size.
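
The following is a deliberately simplified dense NumPy sketch of one iteration, written only to make the update steps concrete. It omits the scaling, relaxation and adaptive :math:`\rho` used by the actual solver, and it solves the linear system naively with a dense factorization (see the next subsection for the systems OSQP actually solves).

.. code-block:: python

    import numpy as np

    def admm_iteration(P, q, A, l, u, x, z, y, rho=0.1, sigma=1e-6):
        n, m = P.shape[0], A.shape[0]

        # (x^{k+1}, nu^{k+1}) <- solve the KKT linear system
        KKT = np.block([[P + sigma * np.eye(n), A.T],
                        [A, -np.eye(m) / rho]])
        rhs = np.concatenate([sigma * x - q, z - y / rho])
        sol = np.linalg.solve(KKT, rhs)
        x_next, nu_next = sol[:n], sol[n:]

        # z~^{k+1} <- z^k + rho^{-1} (nu^{k+1} - y^k)
        z_tilde = z + (nu_next - y) / rho

        # z^{k+1} <- projection of z~^{k+1} + rho^{-1} y^k onto [l, u]
        z_next = np.clip(z_tilde + y / rho, l, u)

        # y^{k+1} <- y^k + rho (z~^{k+1} - z^{k+1})
        y_next = y + rho * (z_tilde - z_next)

        return x_next, z_next, y_next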


Linear system solution
^^^^^^^^^^^^^^^^^^^^^^
Solving the linear system is the core part of the algorithm; it can be done with either a **direct** or an **indirect** method.

With a **direct linear system solver** we solve the following linear system with a quasi-definite matrix

.. math::

    \begin{bmatrix} P + \sigma I & A^T \\ A & -\rho^{-1}I \end{bmatrix} \begin{bmatrix} x^{k+1} \\ \nu^{k+1} \end{bmatrix}= \begin{bmatrix}\sigma x^{k} - q \\ z^{k} - \rho^{-1} y^k \end{bmatrix}.
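
With a direct method the KKT matrix depends on the problem data only through :math:`\rho` and :math:`\sigma`, so it can be factored once and the factorization reused at every iteration until :math:`\rho` changes. The sketch below uses SciPy's sparse LU purely for illustration; OSQP's default direct solver instead computes a sparse :math:`LDL^T` factorization (QDLDL) that exploits the quasi-definite structure.

.. code-block:: python

    import numpy as np
    import scipy.sparse as sparse
    import scipy.sparse.linalg as sla

    def build_kkt_solver(P, A, rho, sigma):
        n, m = P.shape[0], A.shape[0]
        KKT = sparse.bmat([[P + sigma * sparse.identity(n), A.T],
                           [A, -sparse.identity(m) / rho]], format='csc')
        lu = sla.splu(KKT)           # factor once ...

        def solve(x, z, y, q):       # ... then only back-substitute per iteration
            rhs = np.concatenate([sigma * x - q, z - y / rho])
            sol = lu.solve(rhs)
            return sol[:n], sol[n:]  # x^{k+1}, nu^{k+1}

        return solve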

With an **indirect linear system solver** we solve the following linear system with a positive definite matrix


.. math::

    \left(P + \sigma I + \rho A^T A \right)x^{k+1} = \sigma x^{k} - q + A^T (\rho z^{k} - y^{k}).
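
Since the reduced matrix is positive definite, this system can be solved without forming it explicitly, e.g. with the conjugate gradient method using only matrix-vector products. The sketch below uses SciPy's CG purely for illustration; the multiplier :math:`\nu^{k+1}` is recovered from the second block row of the KKT system above.

.. code-block:: python

    import numpy as np
    import scipy.sparse.linalg as sla

    def indirect_solve(P, A, q, x, z, y, rho, sigma):
        n = P.shape[0]

        # Matrix-free application of (P + sigma*I + rho*A'A)
        def matvec(v):
            return P @ v + sigma * v + rho * (A.T @ (A @ v))

        op = sla.LinearOperator((n, n), matvec=matvec)
        rhs = sigma * x - q + A.T @ (rho * z - y)

        x_next, info = sla.cg(op, rhs, x0=x)   # warm-start from x^k

        # nu^{k+1} = rho * (A x^{k+1} - z^k) + y^k  (second block row of the KKT system)
        nu_next = rho * (A @ x_next - z) + y
        return x_next, nu_next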


OSQP core is designed to support different linear system solvers.
For their installation see :ref:`this section <linear_system_solvers_installation>`.
To specify your favorite linear system solver see :ref:`this section <linear_system_solvers_setting>`.

Convergence
^^^^^^^^^^^
At each iteration :math:`k` OSQP produces a tuple :math:`(x^{k}, z^{k}, y^{k})` with :math:`x^{k} \in \mathbf{R}^{n}` and :math:`z^{k}, y^{k} \in \mathbf{R}^{m}`.

The primal and dual residuals associated with :math:`(x^{k}, z^{k}, y^{k})` are

.. math::

    \begin{aligned}
    r_{\rm prim}^{k} &= Ax^{k} - z^{k}\\
    r_{\rm dual}^{k} &= Px^{k} + q + A^{T} y^{k}.
    \end{aligned}

Complementary slackness is satisfied by construction at machine precision. If the problem is feasible, the residuals converge to zero as :math:`k\to\infty`. The algorithm stops when the norms of :math:`r_{\rm prim}^{k}` and :math:`r_{\rm dual}^{k}` are within the specified tolerance levels :math:`\epsilon_{\rm prim}>0` and :math:`\epsilon_{\rm dual}>0`

.. math::

    \| r_{\rm prim}^{k} \|_{\infty} \le \epsilon_{\rm prim},
    \quad
    \| r_{\rm dual}^{k} \|_{\infty} \le \epsilon_{\rm dual}.

We set the tolerance levels as

.. math::

    \epsilon_{\rm prim} &= \epsilon_{\rm abs} + \epsilon_{\rm rel} \max\lbrace \|Ax^{k}\|_{\infty}, \| z^{k} \|_{\infty} \rbrace \\
    \epsilon_{\rm dual} &= \epsilon_{\rm abs} + \epsilon_{\rm rel} \max\lbrace \| P x^{k} \|_{\infty}, \| A^T y^{k} \|_{\infty}, \| q \|_{\infty} \rbrace.
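
In a NumPy-style sketch, with :code:`eps_abs` and :code:`eps_rel` playing the role of :math:`\epsilon_{\rm abs}` and :math:`\epsilon_{\rm rel}`, the termination check amounts to:

.. code-block:: python

    import numpy as np

    def has_converged(P, q, A, x, z, y, eps_abs=1e-3, eps_rel=1e-3):
        # Primal and dual residuals
        r_prim = A @ x - z
        r_dual = P @ x + q + A.T @ y

        # Relative tolerance scaling
        eps_prim = eps_abs + eps_rel * max(np.linalg.norm(A @ x, np.inf),
                                           np.linalg.norm(z, np.inf))
        eps_dual = eps_abs + eps_rel * max(np.linalg.norm(P @ x, np.inf),
                                           np.linalg.norm(A.T @ y, np.inf),
                                           np.linalg.norm(q, np.inf))

        return (np.linalg.norm(r_prim, np.inf) <= eps_prim and
                np.linalg.norm(r_dual, np.inf) <= eps_dual)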

.. _rho_step_size:

:math:`\rho` step-size
^^^^^^^^^^^^^^^^^^^^^^
To ensure quick convergence of the algorithm we adapt :math:`\rho` by balancing the primal and dual residuals.
In default mode, the interval (*i.e.*, the number of iterations) at which we update :math:`\rho` is determined by a time measurement.
When the time spent on iterations exceeds a certain fraction of the setup time, given by :code:`adaptive_rho_fraction`, we set the current number of iterations as the interval at which :math:`\rho` is updated.
The update happens as follows

.. math::

    \rho^{k+1} \gets \rho^{k} \sqrt{\frac{\|r_{\rm prim}\|_{\infty}}{\|r_{\rm dual}\|_{\infty}}}.

Note that :math:`\rho` is updated only if the new value is sufficiently different from the current one, *i.e.*, if it is more than :code:`adaptive_rho_tolerance` times larger or smaller than the current value.
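
A sketch of this rule, with :code:`adaptive_rho_tolerance` passed in as a plain argument:

.. code-block:: python

    import numpy as np

    def update_rho(rho, r_prim_norm, r_dual_norm, adaptive_rho_tolerance=5.0):
        # Candidate value obtained by balancing the residual norms
        rho_new = rho * np.sqrt(r_prim_norm / r_dual_norm)

        # Only accept it if it differs enough from the current value
        if rho_new > adaptive_rho_tolerance * rho or rho_new < rho / adaptive_rho_tolerance:
            return rho_new   # triggers a new KKT factorization in a direct solver
        return rho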



Infeasible problems
-------------------

OSQP is able to detect if the problem is primal or dual infeasible.


Primal infeasibility
^^^^^^^^^^^^^^^^^^^^

When the problem is primal infeasible, the algorithm generates a certificate of infeasibility, *i.e.*, a vector :math:`v\in\mathbf{R}^{m}` such that

.. math::

    A^T v = 0, \quad u^T v_{+} + l^T v_{-} < 0,

where :math:`v_+=\max(v,0)` and :math:`v_-=\min(v,0)`.

The primal infeasibility check is then

.. math::

    \left\|A^T v \right\|_{\infty} \le \epsilon_{\rm prim\_inf}, \quad u^T v_{+} + l^T v_{-} \le \epsilon_{\rm prim\_inf}.
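
In OSQP the candidate :math:`v` is derived from the dual iterates; given any candidate vector, the check itself can be sketched as:

.. code-block:: python

    import numpy as np

    def is_primal_infeasibility_certificate(A, l, u, v, eps_prim_inf=1e-4):
        # v_+ = max(v, 0), v_- = min(v, 0)
        v_plus = np.maximum(v, 0.0)
        v_minus = np.minimum(v, 0.0)

        # Cap infinite bounds at large finite values so inf * 0 does not give NaN
        u_f = np.minimum(u, 1e30)
        l_f = np.maximum(l, -1e30)

        return (np.linalg.norm(A.T @ v, np.inf) <= eps_prim_inf and
                u_f @ v_plus + l_f @ v_minus <= eps_prim_inf)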



Dual infeasibility
^^^^^^^^^^^^^^^^^^

When the problem is dual infeasible, OSQP generates a vector :math:`s\in\mathbf{R}^{n}` that is a certificate of dual infeasibility, *i.e.*,

.. math::

    P s = 0, \quad q^T s < 0, \quad (As)_i = \begin{cases} 0 & l_i \in \mathbf{R}, u_i\in\mathbf{R} \\ \ge 0 & l_i\in\mathbf{R}, u_i=+\infty \\ \le 0 & u_i\in\mathbf{R}, l_i=-\infty. \end{cases}


The dual infeasibility check is then

.. math::

    \| P s \|_{\infty} \le \epsilon_{\rm dual\_inf}, \quad
    q^T s \le \epsilon_{\rm dual\_inf}, \\
    (A s)_i \begin{cases} \in \left[-\epsilon_{\rm dual\_inf}, \epsilon_{\rm dual\_inf}\right] & u_i, l_i \in \mathbf{R}\\
    \ge -\epsilon_{\rm dual\_inf} & u_i = +\infty\\
    \le \epsilon_{\rm dual\_inf} & l_i = -\infty.\end{cases}
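
The corresponding check on a candidate vector :math:`s`, with the three cases distinguished by the bounds of each constraint, can be sketched as:

.. code-block:: python

    import numpy as np

    def is_dual_infeasibility_certificate(P, q, A, l, u, s, eps_dual_inf=1e-4):
        As = A @ s

        # Case analysis on the bounds of each constraint
        both_finite = np.isfinite(l) & np.isfinite(u)
        upper_inf = np.isinf(u)
        lower_inf = np.isinf(l)

        return (np.linalg.norm(P @ s, np.inf) <= eps_dual_inf and
                q @ s <= eps_dual_inf and
                np.all(np.abs(As[both_finite]) <= eps_dual_inf) and
                np.all(As[upper_inf] >= -eps_dual_inf) and
                np.all(As[lower_inf] <= eps_dual_inf))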




Polishing
^^^^^^^^^

Polishing is an additional algorithm step where OSQP tries to compute a high-accuracy solution.
It can be enabled by setting :code:`polish` to 1.

Polishing works by guessing the active constraints at the optimum and solving an additional linear system.
If the guess is correct, OSQP returns a high-accuracy solution.
Otherwise OSQP returns the ADMM solution.
The outcome of the polishing phase is reported in the :code:`status_polish` field of the solver information.

Note that polishing requires the solution of an additional linear system and, therefore, an additional factorization if the linear system solver is direct.
However, this linear system is usually much smaller than the one solved during the ADMM iterations.

The chances of successful polishing increase when the tolerances :code:`eps_abs` and :code:`eps_rel` are small.
However, low tolerances might require a very large number of iterations.
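
A simplified sketch of the idea follows: the lower- and upper-active constraints are guessed from the sign of the dual variable, as described in the OSQP paper, and the reduced KKT system is solved once with a dense factorization. The small regularization :code:`delta` is an assumption here; the actual implementation additionally applies iterative refinement to compensate for it.

.. code-block:: python

    import numpy as np

    def polish(P, q, A, l, u, x, y, delta=1e-6):
        lower = y < 0                       # guessed active at the lower bound
        upper = y > 0                       # guessed active at the upper bound
        A_act = np.vstack([A[lower], A[upper]])
        b_act = np.concatenate([l[lower], u[upper]])
        n, k = P.shape[0], A_act.shape[0]

        # Reduced KKT system for the guessed active set (regularized by delta)
        KKT = np.block([[P + delta * np.eye(n), A_act.T],
                        [A_act, -delta * np.eye(k)]])
        rhs = np.concatenate([-q, b_act])
        sol = np.linalg.solve(KKT, rhs)

        x_pol = sol[:n]
        n_low = int(lower.sum())
        y_pol = np.zeros_like(y)            # duals of inactive constraints stay zero
        y_pol[lower] = sol[n:n + n_low]
        y_pol[upper] = sol[n + n_low:]
        return x_pol, y_pol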