13.3 Conic Optimization - Interior-point optimizer

For conic optimization problems, only an interior-point type optimizer is available. The same optimizer is used for quadratic optimization problems, which are internally reformulated to conic form.

13.3.1 The homogeneous primal-dual problem

The interior-point optimizer is an implementation of the so-called homogeneous and self-dual algorithm. For a detailed description of the algorithm, please see [ART03]. In order to keep our discussion simple we will assume that MOSEK solves a conic optimization problem of the form:

(13.6)\[\begin{split}\begin{array}{lcll} \mbox{minimize} & c^T x & & \\ \mbox{subject to} & A x & = & b, \\ & x \in \K & & \end{array}\end{split}\]

where \(\K\) is a convex cone. The corresponding dual problem is

(13.7)\[\begin{split}\begin{array}{lcll} \mbox{maximize} & b^T y & & \\ \mbox{subject to} & A^T y + s & = & c, \\ & s \in \K^* & & \end{array}\end{split}\]

where \(\K^*\) is the dual cone of \(\K\). See Sec. 12.2 (Conic Optimization) for definitions.

Since it is not known beforehand whether problem (13.6) has an optimal solution, is primal infeasible or is dual infeasible, the optimization algorithm must deal with all three situations. This is the reason that MOSEK solves the so-called homogeneous model

(13.8)\[\begin{split}\begin{array}{rcl} A x - b \tau & = & 0, \\ A^T y + s - c \tau & = & 0, \\ -c^T x + b^T y - \kappa & = & 0, \\ x & \in & \K, \\ s & \in & \K^*, \\ \tau ,\kappa & \geq & 0, \end{array}\end{split}\]

where \(y\) and \(s\) correspond to the dual variables of (13.6) as defined in (13.7), and \(\tau\) and \(\kappa\) are two additional scalar variables. Note that the homogeneous model (13.8) always has a solution since

\[(x,y,s,\tau ,\kappa ) = (0,0,0,0,0)\]

is a solution, although not a very interesting one. Any solution

\[(x^*,y^*,s^*,\tau^*,\kappa^*)\]

to the homogeneous model (13.8) satisfies

\[(x^*)^T s^* + \tau^* \kappa^* = 0\]

i.e. complementarity. Observe that \(x^* \in \K\) and \(s^* \in \K^*\) imply

\[(x^*)^T s^* \geq 0\]

and \(\tau^* \kappa^* \geq 0\) since \(\tau^*, \kappa^* \geq 0\). Two nonnegative terms can sum to zero only if both vanish, hence

\[\tau^* \kappa^* = 0\]

and at least one of \(\tau^*\) and \(\kappa^*\) is zero.
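
The complementarity identity itself follows directly from the equations of the homogeneous model (13.8): multiplying the second equation by \((x^*)^T\), using \(A x^* = b \tau^*\) from the first equation and \(b^T y^* - c^T x^* = \kappa^*\) from the third gives

\[(x^*)^T s^* = \tau^* c^T x^* - \tau^* b^T y^* = -\tau^* \kappa^*.\]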

First, assume that \(\tau^*>0\) and hence \(\kappa^*=0\). It follows that

\[\begin{split}\begin{array}{rcl} A \frac{x^*}{\tau^*} & = & b, \\ A^T \frac{y^*}{\tau^*} + \frac{s^*}{\tau^*} & = & c, \\ - c^T \frac{x^*}{\tau^*} + b^T\frac{y^*}{\tau^*} & = & 0, \\ x^*/\tau^* & \in & \K, \\ s^*/\tau^* & \in & \K^*. \end{array}\end{split}\]

This shows that \(\frac{x^*}{\tau^*}\) is a primal optimal solution and \((\frac{y^*}{\tau^*},\frac{s^*}{\tau^*})\) is a dual optimal solution. Hence,

\[(x,y,s) = \left ( \frac{x^*}{\tau^*},\frac{y^*}{\tau^*},\frac{s^*}{\tau^*} \right )\]

is a primal-dual optimal solution and is reported as the optimal interior-point solution.

On the other hand, if \(\kappa^* >0\) then

\[\begin{split}\begin{array}{rcl} A x^* & = & 0, \\ A^T y^* + s^* & = & 0, \\ -c^T x^* + b^T y^* & = & \kappa^*, \\ x^* & \in & \K, \\ s^* & \in & \K^*. \end{array}\end{split}\]

Since \(b^T y^* - c^T x^* = \kappa^* > 0\), at least one of

(13.9)\[c^T x^* < 0\]

or

(13.10)\[b^T y^* > 0\]

holds. If (13.9) is satisfied, then \(x^*\) is a certificate of dual infeasibility, whereas if (13.10) holds then \(y^*\) is a certificate of primal infeasibility.

In summary, by computing an appropriate solution to the homogeneous model, all information required for a solution to the original problem is obtained. A solution to the homogeneous model can be computed using a primal-dual interior-point algorithm [And09].
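
As a small illustration of this case analysis (not MOSEK code; the function name, the tolerance handling and the label for the degenerate case \(\tau^*=\kappa^*=0\) are chosen here purely for exposition), a NumPy sketch that interprets an approximate solution of the homogeneous model could look as follows:

import numpy as np

def interpret_homogeneous_solution(A, b, c, x, y, s, tau, kappa, tol=1e-10):
    """Interpret a solution (x, y, s, tau, kappa) of the homogeneous model (13.8)."""
    if tau > tol:
        # tau > 0 (and hence kappa = 0): scaling by tau gives a primal-dual optimal pair.
        return "optimal", (x / tau, y / tau, s / tau)
    if kappa > tol:
        # kappa > 0: b'y - c'x = kappa > 0, so at least one certificate below exists.
        if np.dot(c, x) < 0:
            return "dual infeasible", x          # (13.9): x certifies dual infeasibility
        if np.dot(b, y) > 0:
            return "primal infeasible", (y, s)   # (13.10): y certifies primal infeasibility
    # Neither tau nor kappa is (numerically) positive: no conclusion can be drawn.
    return "undetermined", None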

13.3.2 Interior-point Termination Criterion

Since computations are performed in finite precision, and for efficiency reasons, it is not possible to solve the homogeneous model exactly in general. Hence, an exact optimal solution or an exact infeasibility certificate cannot be computed and a reasonable termination criterion has to be employed.

In every iteration \(k\) of the interior-point algorithm a trial solution

\[(x^k,y^k,s^k,\tau^k,\kappa^k)\]

to the homogeneous model is generated, where

\[x^k \in \K, s^k \in \K^*, \tau^k, \kappa^k > 0.\]

Therefore, it is possible to compute the values:

\[\begin{split}\begin{array}{rcl} \rho_p^k & = & \mbox{arg min}_{\rho} \left \{\rho \mid \, \left\| A \frac{x^k}{\tau^k} - b \right\|_\infty \leq \rho \varepsilon_p (1+\left\| b \right\|_\infty ) \right \}, \\ \rho_d^k & = & \mbox{arg min}_{\rho} \left \{\rho \mid \, \left\| A^T \frac{y^k}{\tau^k} + \frac{s^k}{\tau^k}- c \right\|_\infty \leq \rho \varepsilon_d (1+\left\| c \right\|_\infty ) \right \}, \\ \rho_g^k & = & \mbox{arg min}_{\rho} \left \{\rho \mid \, \left( \frac{(x^k)^T s^k}{ (\tau^k)^2 }, \vert \frac{c^T x^k}{\tau^k} - \frac{ b^T y^k}{\tau^k} \vert \right) \leq \rho \varepsilon_g \max \left( 1, \frac{ \min\left( \left\vert c^T x^k \right\vert, \left\vert b^T y^k \right\vert \right) }{\tau^k} \right) \right \}, \\ \rho_{pi}^k & = & \mbox{arg min}_{\rho} \left \{\rho \mid \, \left\| A^T y^k + s^k \right \|_\infty \leq \rho \varepsilon_i b^T y^k,\, b^T y^k > 0 \right \} \mbox{ and} \\ \rho_{di}^k & = & \mbox{arg min}_{\rho} \left \{\rho \mid \, \left\| A x^k \right \|_\infty \leq - \rho \varepsilon_i c^T x^k,\, c^T x^k < 0 \right \}. \\ \end{array}\end{split}\]

Note that \(\varepsilon_p, \varepsilon_d, \varepsilon_g \mbox{ and } \varepsilon_i\) are nonnegative, user-specified tolerances.
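
For concreteness, the five measures can be evaluated directly from a trial point. The NumPy sketch below is illustrative only (all names are chosen here and the tolerances are assumed to be positive); it simply mirrors the definitions above:

import numpy as np

def termination_measures(A, b, c, x, y, s, tau, eps_p, eps_d, eps_g, eps_i):
    """Evaluate rho_p, rho_d, rho_g, rho_pi and rho_di for a trial point of the
    homogeneous model; the termination tests below check whether these are <= 1."""
    ninf = lambda v: np.linalg.norm(v, np.inf)
    ctx, bty = np.dot(c, x), np.dot(b, y)

    rho_p = ninf(A @ (x / tau) - b) / (eps_p * (1.0 + ninf(b)))
    rho_d = ninf(A.T @ (y / tau) + s / tau - c) / (eps_d * (1.0 + ninf(c)))

    gap_scale = eps_g * max(1.0, min(abs(ctx), abs(bty)) / tau)
    rho_g = max(np.dot(x, s) / tau**2, abs(ctx - bty) / tau) / gap_scale

    # The infeasibility measures apply only when the respective sign condition holds.
    rho_pi = ninf(A.T @ y + s) / (eps_i * bty) if bty > 0 else np.inf
    rho_di = ninf(A @ x) / (-eps_i * ctx) if ctx < 0 else np.inf

    return rho_p, rho_d, rho_g, rho_pi, rho_di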

Optimal Case

Observe that \(\rho_p^k\) measures how far \(x^k/\tau^k\) is from being a good approximate primal feasible solution. Indeed, if \(\rho_p^k \leq 1\), then

(13.11)\[\left\| A \frac{x^k}{\tau^k} - b \right\|_\infty \leq \varepsilon_p (1+\left\| b \right\|_\infty ).\]

This shows that the violation of the primal equality constraints by the solution \(x^k/\tau^k\) is small compared to the size of \(b\), provided \(\varepsilon_p\) is small.

Similarly, if \(\rho_d^k \leq 1\), then \((y^k,s^k)/\tau^k\) is an approximate dual feasible solution. If in addition \(\rho_g^k \leq 1\), then the solution \((x^k,y^k,s^k)/\tau^k\) is approximately optimal because the associated primal and dual objective values are almost identical.

In other words, if \(\max(\rho_p^k,\rho_d^k,\rho_g^k) \leq 1\), then

\[\frac{(x^k,y^k,s^k)}{\tau^k}\]

is an approximate optimal solution.

Dual Infeasibility Certificate

Next assume that \(\rho_{di}^k \leq 1\) and hence

\[\left\| A x^k \right \|_\infty \leq - \varepsilon_i c^T x^k \mbox{ and } -c^T x^k > 0\]

holds. In this case the problem is declared dual infeasible and \(x^k\) is reported as a certificate of dual infeasibility. The motivation for this stopping criterion is as follows. Let

\[\bar{x} := \frac{x^k} {- c^T x^k }\]

Then it is easy to verify that

\[\left\| A \bar x \right\|_\infty \leq \varepsilon_i \mbox{ and } c^T \bar{x} = -1\]

which shows \(\bar{x}\) is an approximate certificate of dual infeasibility, where \(\varepsilon_{i}\) controls the quality of the approximation.

Primal Infeasibility Certificate

Next assume that \(\rho_{pi}^k \leq 1\) and hence

\[\left\| A^T y^k + s^k \right \|_\infty \leq \varepsilon_i b^T y^k \mbox{ and } b^T y^k > 0\]

holds. In this case the problem is declared primal infeasible and \((y^k,s^k)\) is reported as a certificate of primal infeasibility. The motivation for this stopping criterion is as follows. Let

\[\bar{y} := \frac{y^k} { b^T y^k } \mbox{ and } \bar{s} := \frac{s^k} { b^T y^k }\]

Then it is easy to verify that

\[\left\| A^T \bar y + \bar s \right \|_\infty \leq \varepsilon_i \mbox{ and } b^T \bar y = 1\]

which shows \((\bar{y},\bar{s})\) is an approximate certificate of primal infeasibility, where \(\varepsilon_{i}\) controls the quality of the approximation.
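
Both normalizations are straightforward to reproduce. In the NumPy sketch below (illustrative only; the function names are chosen here) the returned points are exactly \(\bar{x}\) and \((\bar{y},\bar{s})\) as defined above:

import numpy as np

def dual_infeasibility_certificate(x, c):
    """Return xbar = x / (-c'x), so that c' xbar = -1 (requires c'x < 0)."""
    return x / (-np.dot(c, x))

def primal_infeasibility_certificate(y, s, b):
    """Return (ybar, sbar) = (y, s) / (b'y), so that b' ybar = 1 (requires b'y > 0)."""
    return y / np.dot(b, y), s / np.dot(b, y)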

13.3.3 Adjusting optimality criteria

It is possible to adjust the tolerances \(\varepsilon_{p}\), \(\varepsilon_{d}\), \(\varepsilon_{g}\) and \(\varepsilon_{i}\) using parameters; see the next table for details. Note that although this section discusses the conic optimizer, if the problem was originally input as a quadratic or quadratically constrained optimization problem then the parameter names that apply are those from the third column (with infix QO instead of CO).

Table 13.2 Parameters employed in termination criterion

Tolerance | Parameter name (for conic problems) | Parameter name (for quadratic problems)
\(\varepsilon_{p}\) | Dparam::INTPNT_CO_TOL_PFEAS | Dparam::INTPNT_QO_TOL_PFEAS
\(\varepsilon_{d}\) | Dparam::INTPNT_CO_TOL_DFEAS | Dparam::INTPNT_QO_TOL_DFEAS
\(\varepsilon_{g}\) | Dparam::INTPNT_CO_TOL_REL_GAP | Dparam::INTPNT_QO_TOL_REL_GAP
\(\varepsilon_{i}\) | Dparam::INTPNT_CO_TOL_INFEAS | Dparam::INTPNT_QO_TOL_INFEAS

The default values of the termination tolerances are chosen such that for a majority of problems appearing in practice it is not possible to achieve much better accuracy. Therefore, tightening the tolerances usually is not worthwhile. However, an inspection of (13.11) reveals that the quality of the solution depends on \(\left\| b \right\|_{\infty}\) and \(\left\| c \right\|_{\infty}\); the smaller the norms are, the better the solution accuracy.

The interior-point method as implemented by MOSEK will converge toward optimality and primal and dual feasibility at the same rate [And09]. This means that if the optimizer is stopped prematurely then it is very unlikely that either the primal or dual solution is feasible. Another consequence is that in most cases all the tolerances, \(\varepsilon_p\), \(\varepsilon_d\), \(\varepsilon_g\) and \(\varepsilon_i\), have to be relaxed together to achieve an effect.

If the optimizer terminates without locating a solution that satisfies the termination criteria, for example because of a stall or other numerical issues, then it will check if the solution found up to that point satisfies the same criteria with all tolerances multiplied by the value of Dparam::INTPNT_CO_TOL_NEAR_REL. If this is the case, the solution is still declared as optimal.
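
As an example, in the Python Optimizer API the tolerances from Table 13.2 are set with Task.putdouparam, using the lowercase counterparts of the Dparam:: names above. The sketch below only indicates the mechanism; the concrete values are arbitrary and the parameter names should be checked against the API reference for the version in use.

import mosek

with mosek.Env() as env, env.Task(0, 0) as task:
    # ... input the conic problem into the task ...

    # Relax all interior-point termination tolerances together (values are illustrative).
    task.putdouparam(mosek.dparam.intpnt_co_tol_pfeas,   1.0e-6)
    task.putdouparam(mosek.dparam.intpnt_co_tol_dfeas,   1.0e-6)
    task.putdouparam(mosek.dparam.intpnt_co_tol_rel_gap, 1.0e-6)
    task.putdouparam(mosek.dparam.intpnt_co_tol_infeas,  1.0e-10)

    task.optimize()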

To conclude the discussion in this section, relaxing the termination criterion is usually not worthwhile.

13.3.4 The Interior-point Log

Below is a typical log output from the interior-point optimizer:

Optimizer  - threads                : 20
Optimizer  - solved problem         : the primal
Optimizer  - Constraints            : 1
Optimizer  - Cones                  : 2
Optimizer  - Scalar variables       : 6                 conic                  : 6
Optimizer  - Semi-definite variables: 0                 scalarized             : 0
Factor     - setup time             : 0.00              dense det. time        : 0.00
Factor     - ML order time          : 0.00              GP order time          : 0.00
Factor     - nonzeros before factor : 1                 after factor           : 1
Factor     - dense dim.             : 0                 flops                  : 1.70e+01
ITE PFEAS    DFEAS    GFEAS    PRSTATUS   POBJ              DOBJ              MU       TIME
0   1.0e+00  2.9e-01  3.4e+00  0.00e+00   2.414213562e+00   0.000000000e+00   1.0e+00  0.01
1   2.7e-01  7.9e-02  2.2e+00  8.83e-01   6.969257574e-01   -9.685901771e-03  2.7e-01  0.01
2   6.5e-02  1.9e-02  1.2e+00  1.16e+00   7.606090061e-01   6.046141322e-01   6.5e-02  0.01
3   1.7e-03  5.0e-04  2.2e-01  1.12e+00   7.084385672e-01   7.045122560e-01   1.7e-03  0.01
4   1.4e-08  4.2e-09  4.9e-08  1.00e+00   7.071067941e-01   7.071067599e-01   1.4e-08  0.01

The first line displays the number of threads used by the optimizer and the second line indicates if the optimizer chose to solve the primal or dual problem (see Iparam::INTPNT_SOLVE_FORM). The next lines display the problem dimensions as seen by the optimizer, and the Factor... lines show various statistics. This is followed by the iteration log.

Using the same notation as in Sec. 13.3.1 (The homogeneous primal-dual problem) the columns of the iteration log have the following meaning:

  • ITE: Iteration index \(k\).

  • PFEAS: \(\left\| Ax^k - b \tau^k \right\|_{\infty}\). The numbers in this column should converge monotonically towards zero but may stall at a low level due to rounding errors.

  • DFEAS: \(\left\| A^T y^{k} + s^k - c \tau^k \right\|_{\infty}\). The numbers in this column should converge monotonically towards zero but may stall at a low level due to rounding errors.

  • GFEAS: \(| - c^T x^k + b^T y^{k} - \kappa^k|\). The numbers in this column should converge monotonically towards zero but may stall at a low level due to rounding errors.

  • PRSTATUS: This number converges to \(1\) if the problem has an optimal solution whereas it converges to \(-1\) if that is not the case.

  • POBJ: \(c^T x^k / \tau^k\). An estimate for the primal objective value.

  • DOBJ: \(b^T y^k / \tau^k\). An estimate for the dual objective value.

  • MU: \(\frac{ (x^k)^T s^k + \tau^k \kappa^k }{n+1}\). The numbers in this column should always converge to zero.

  • TIME: Time spent since the optimization started (in seconds).
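
A log of this kind is produced once a log stream handler is attached to the task. The following sketch uses the Python Optimizer API (the callback name is arbitrary; check the call names against the API reference for the version in use):

import sys
import mosek

def stream_printer(text):
    # Forward MOSEK log output, including the interior-point iteration log, to stdout.
    sys.stdout.write(text)

with mosek.Env() as env, env.Task(0, 0) as task:
    task.set_Stream(mosek.streamtype.log, stream_printer)

    # ... input the conic problem into the task ...

    task.optimize()
    task.solutionsummary(mosek.streamtype.log)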