7. Practical optimization

7.1. Introduction

In this chapter we discuss different practical aspects of creating optimization models.

7.2. The quality of a solution

In this section we will discuss how to validate an obtained solution. Assume we have a conic model with continuous variables only and that the optimization software has reported an optimal primal and dual solution. Given such a solution, we might ask how can we verify that the reported solution is indeed optimal?

To that end, consider a simple model

(1)\[\begin{split}\begin{array}{rccl} \mbox{minimize}& -x_2 & \\ \mbox{subject to}& x_1 + x_2 & \leq & 2, \\ & x_2 & \leq & \sqrt{2}, \\ & x_1, x_2 & \geq & 0, \end{array}\end{split}\]

where MOSEK might approximate the solution as

\[x_1 = 0.0000000000000000 \mbox{ and } x_2 = 1.4142135623730951.\]

and therefore the approximate optimal objective value is

\[- 1.4142135623730951.\]

The true objective value is \(-\sqrt{2}\), so the approximate objective value is wrong by the amount

\[1.4142135623730951 - \sqrt{2}.\]

Most likely this difference is irrelevant for all practical purposes. Nevertheless, in general a solution obtained using floating point arithmetic is only an approximation. Most (if not all) commercial optimization software uses double precision floating point arithmetic, implying that about 16 digits are correct in the computations performed internally by the software. This also means that irrational numbers such as \(\sqrt{2}\) and \(\pi\) can only be stored accurately within 16 digits.

A good practice after solving an optimization problem is to evaluate whether the reported solution is indeed an optimal solution; at the very least this process should be carried out during the initial phase of building a model, or if the reported solution is unexpected in some way. The first step in that process is to check that is it feasible; in case of the small example (1) this amounts to verifying that

\[\begin{split}\begin{array}{rcrcl} x_1 + x_2 & = & 0.0000000000000000 + 1.4142135623730951 & \leq & 2, \\ x_2 & = & 1.4142135623730951 & \leq & \sqrt{2}, \\ x_1 & = & 0.0000000000000000 & \geq & 0, \\ x_2 & = & 1.4142135623730951 & \geq & 0, \\ \end{array}\end{split}\]

which demonstrates that the solution is feasible. However, in general the constraints might be violated due to computations in finite precision. It can be difficult to access the significance of a specific violation; for example a violation of one unit in

\[x_1 + x_2 \leq 1\]

may or may not be more significant than a violation of one unit in

\[x_1 + x_2 \leq 10^9.\]

The right-hand side of \(10^9\) may itself be the result of a computation in finite precision, and my only be known with, say 3 digits of accuracy. Therefore, a violation of 1 unit is not significant since the true right-hand side could just as well be \(1.001 \cdot 10^9\).

Another question is how to verify that a feasible approximate solution is actually optimal, which is answered by duality theory. From the discussion of duality in Sec. 2.3, the dual problem to (1) is

\[\begin{split}\begin{array}{rccl} \mbox{maximize}& 2 y_1 + \sqrt{2}y_2 & \\ \mbox{subject to}& y_1 & \leq & 0, \\ & y_1 + y_2 & \leq & -1. \\ \end{array}\end{split}\]

and weak duality mandates that the primal objective value is greater than dual objective value for any dual feasible solution, i.e.,

\[-x_2 \geq 2 y_1 + \sqrt{2}y_2.\]

Furthermore, if the bound is tight (i.e., if the inequality holds as an equality), then the primal solution must be optimal. In other words, if there exists a dual feasible solution such that the dual objective value is identical to \(-\sqrt{2}\), then we know (from duality) that the primal solution is optimal. For example, the optimization software may report

\[y_1 = 0 \mbox{ and } y_2 = -1,\]

which is easily verified to be a feasible solution with dual objective value \(-\sqrt{2}\), proving optimality of both the primal and solution.

7.3. Distance to a cone

Assume we want to solve the conic optimization problem

\[\begin{split}\begin{array}{lccl} \mbox{minimize}& c^T x & & \\ \mbox{subject to}& A x & = & b, \\ & x \in \K& & \\ \end{array}\end{split}\]

where \(\K\) is a convex cone. Let \(P(x,\K)\) denotes the projection of \(x\) onto \(\K\). The distance measure

(2)\[d(x,\K) := \min_{y\in \K} \| x - y \|_2 = \| x - P(x,\K) \|_2\]

then gives the distance from \(x\) to the nearest feasible point in \(\K\), and obviously

\[x \in \K\quad \Leftrightarrow \quad d(x,\K) = 0.\]

Thus \(d(x,\K)\) is natural measure of the cone-feasibility of a point \(x\); if \(d(x,\K)>0\) then \(x\not \in \K\) (i.e., \(x\) is infeasible) and the smaller the value, the closer \(x\) is to \(\K\). For \(x\in \R\), let

\[[ x ]_+ := \max\{x, 0\}.\]

The following lemmas then give the projections and distance measure for the different cones.

Lemma 10 Distance to linear cones

The projection is

\[P(x,\R_+) = [ x ]_+\]

and the distance is

\[d(x,\R_+) = [ -x ]_+.\]
Lemma 11 Distance to quadratic cones

Let \(x:=(x_1, x_2)\in \R\times \R^{n-1}\). Then

\[\begin{split}P(x,\mathcal{Q}^n) = \left \{ \begin{array}{cl} x, & x_1 \geq \|x_2 \|, \\ \frac{[ x_1+\|x_2\| ]_+}{2\|x_2\|} ( \|x_2\|, x_2 ), & x_1 < \| x_2 \|, \end{array} \right.\end{split}\]

see [FLT02] prop 3.3, and

\[\begin{split}d(x,\mathcal{Q}^n) = \left \{ \begin{array}{cl} \frac{[ \|x_2\|-x_1 ]_+}{\sqrt{2}}, & x_1 \geq -\|x_{2:n}\|,\\ \|x\|, & x_1 < -\|x_{2:n} \|. \\ \end{array} \right .\end{split}\]
Lemma 12 Distance to rotated quadratic cones

Let \(x:=(x_1, x_2, x_3)\in \R\times \R\times \R^{n-2}\). Then

\[P(x,\mathcal{Q}_r^n) = T_n P(T_n x,\mathcal{Q}^n)\]

and

\[d(x,\mathcal{Q}_r^n) = d(T_n x,\mathcal{Q}^n),\]

where \(T_n\) is the rotation operator defining the rotated cone.

Lemma 13 Distance to semidefinite cones

Let \(X=\sum_{i=1}^n \lambda_i q_i q_i^T\in\mathcal{S}^n\) be an eigenvalue decomposition. Then

\[P(X,\mathcal{S}_+^n) = \sum_{i=1}^n [\lambda_i]_+ q_i q_i^T,\]

and

\[d(X,\mathcal{S}_+^n) = \max_i [ -\lambda_i ]_+.\]