3. Conic quadratic optimization

3.1. Introduction

This chapter extends the notion of linear optimization with quadratic cones; conic quadratic optimization is a straightforward generalization of linear optimization, in the sense that we optimize a linear function with linear (in)equalities with variables belonging to one or more (rotated) quadratic cones. In general we also allow some of the variables to be linear variables as long as some of the variables belong to a quadratic cone.

We discuss the basic concept of quadratic cones, and demonstrate the surprisingly large flexibility of conic quadratic modeling with several examples of (non-trivial) convex functions or sets that be represented using quadratic cones. These convex sets can then be combined arbitrarily to form different conic quadratic optimization problems.

We finally extend the duality theory and infeasibility analysis from linear to conic optimization, and discuss infeasibility of conic quadratic optimization problems.

3.1.1. Quadratic cones

We define an \(n\)-dimensional quadratic cone as

(1)\[\mathcal{Q}^n = \left \{ x\in \R^n \mid x_1 \geq \sqrt{x_2^2 + x_3^2 + \cdots + x_n^2} \right \}.\]

The geometric interpretation of a quadratic (or second-order) cone is shown in Fig. 3.1 for a cone with three variables, and illustrates how the boundary of the cone resembles an ice-cream cone.

The 1-dimensional quadratic cone simply implies standard nonnegativity \(x_1\geq 0\).

A set \(S\) is called a convex cone if for any \(x\in S\) we have \(\alpha x \in S,\: \forall \alpha \geq 0\). From the definition (1) it is clear that if \(x\in \mathcal{Q}^n\) then obviously \(\alpha x \in \mathcal{Q}^n, \: \forall \alpha \geq 0\), which justifies the notion quadratic cone.

_images/fig-qcone.png

Fig. 3.1 A quadratic or second-order cone satisfying \(x_1\geq \sqrt{x_2^2 + x_3^2}\).

3.1.2. Rotated quadratic cones

An \(n-\)dimensional rotated quadratic cone is defined as

(2)\[\mathcal{Q}_r^n = \left \{ x\in \R^n \mid 2x_1 x_2 \geq x_3^2 + \cdots + x_n^2, \: x_1, x_2 \geq 0 \right \}.\]

As the name indicates, there is simple relationship between quadratic and rotated quadratic cones. Define an orthogonal transformation

(3)\[\begin{split}T_n := \left[ \begin{array}{ccc} 1/\sqrt{2} & \hphantom{-}1/\sqrt{2} & \hphantom{-}0 \\ 1/\sqrt{2} & -1/\sqrt{2} & \hphantom{-}0 \\ 0 & \hphantom{-}0 & I_{n-2} \end{array} \right].\end{split}\]

Then it is easy to verify that

\[x \in \mathcal{Q}^n \quad \Longleftrightarrow \quad (T_n x) \in \mathcal{Q}^n_r,\]

and since \(T\) is orthogonal we call \(\mathcal{Q}_r\) a rotated cone; the transformation corresponds to a rotation of \(\pi/4\) of the \((x_1,x_2)\) axis. For example if \(x\in \mathcal{Q}^3\) and

\[\begin{split}\left[ \begin{array}{c}z_1 \\ z_2 \\ z_3 \end{array} \right] = \left[ \begin{array}{ccc} \frac{1}{\sqrt{2}} & \hphantom{-}\frac{1}{\sqrt{2}} & \hphantom{-}0 \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & \hphantom{-}0 \\ 0 & \hphantom{-}0 & 1 \end{array} \right] \left[ \begin{array}{c}x_1 \\ x_2 \\ x_3 \end{array} \right] = \left[ \begin{array}{c}\frac{1}{\sqrt{2}}(x_1 + x_2)\\\frac{1}{\sqrt{2}}(x_1 - x_2)\\x_3 \end{array} \right]\end{split}\]

then

\[2 z_1 z_2 \geq z_3^2, \: z_1,z_2 \geq 0 \quad \Longrightarrow \quad (x_1^2 - x_2^2) \geq x_3^2, \: x_1 \geq 0,\]

and similarly (by interchanging roles of \(x\) and \(z\)) we see that

\[x_1^2 \geq x_2^2 + x_3^2, \: x_1\geq 0 \quad \Longrightarrow \quad 2 z_1 z_2 \geq z_3^2, \: z_1, z_2 \geq 0.\]

Thus, one could argue that we only need quadratic cones, but there are many examples of functions where using an explicit rotated quadratic conic formulation is more natural; in Sec. 3.2 we discuss many examples involving both quadratic cones and rotated quadratic cones.

3.2. Conic quadratic modeling

In the following we describe several convex sets that can be modeled using conic quadratic formulations. We describe the convex sets in their simplest embodiment, which we interpret as conic quadratic building blocks; it is then straightforward to combine those in arbitrary intersections and affine mappings to model complex sets. For example we will show that the set

\[\frac{1}{x} \leq t, \quad x \geq 0\]

can be represented using quadratic cones. Sets that can be represented using quadratic cones are called conic quadratic representable sets.

3.2.1. Absolute values

In Sec. 2.2.2 we saw how to model \(|x|\leq t\) using two linear inequalities, but in fact the epigraph of the absolute value is just the definition of a two-dimensional quadratic cone, i.e.,

\[|x|\leq t \quad \Longleftrightarrow \quad (t, x) \in \mathcal{Q}^2.\]

3.2.2. Euclidean norms

The Euclidean norm of \(x\in \R^n\),

\[\| x \|_2 = \sqrt { x_1^2 + x_2^2 + \cdots + x_n^2 }\]

essentially defines the quadratic cone, i.e.,

\[\| x \|_2 \leq t \quad \Longleftrightarrow \quad (t, x)\in \mathcal{Q}^{n+1}.\]

3.2.3. Squared Euclidean norms

The epigraph of the squared Euclidean norm can be described as the intersection of a rotated quadratic cone with an affine hyperplane,

\[\| x \|_2^2 \leq t \quad \Longleftrightarrow \quad (1/2, t, x)\in \mathcal{Q}_r^{n+2}.\]

3.2.4. Convex quadratic sets

Assume \(Q \in \R^{n \times n}\) is a symmetric positive semidefinite matrix. The convex inequality

\[(1/2)x^T Q x + c^T x + r \leq 0\]

may be rewritten as

(4)\[\begin{split} \begin{array}{rcl} t + c^T x + r & = & 0, \\ x^T Q x & \leq & 2 t. \\ \end{array}\end{split}\]

Since \(Q\) is symmetric positive semidefinite the epigraph

(5)\[x^T Q x \leq 2 t\]

is a convex set and there exists a matrix \(F\in \R^{k\times n}\) such that

(6)\[Q = F^T F,\]

(see Chap. 4 for properties of semidefinite matrices); for instance \(F\) could be the Cholesky factorization of \(Q\). We then have an equivalent characterization of (5) as

\[\begin{split}\begin{array}{rcl} F x - y & = & 0, \\ s & = & 1, \\ \| y \|^2 & \leq & 2 s t, \end{array}\end{split}\]

or more succinctly: if \(Q=F^TF\) then

\[(1/2)x^T Q x \leq t \quad \Longleftrightarrow \quad (t,1,F x) \in \mathcal{Q}_r^{2+k}.\]

Frequently \(Q\) has the structure

\[Q = I + F^T F\]

where \(I\) is the identity matrix, so

\[x^T Q x = x^T x + x^T F^T F x\]

and hence we can write

\[\begin{split}\begin{array}{rclll} F x - y & = & 0, \\ f+h & = & t \\ \| x \|^2 & \leq & 2 e f, & e=1, & \\ \| y \|^2 & \leq & 2 g h, & g=1. & \\ \end{array}\end{split}\]

In other words,

\[(f,1,x ) \in \mathcal{Q}_r^{2+n}, \qquad (h,1,F x ) \in \mathcal{Q}_r^{2+k}, \qquad f+h=t\]

is a conic quadratic representation of (5) in this case.

3.2.5. Second-order cones

A second-order cone is occasionally specified as

(7)\[\| Ax + b \|_2 \leq c^T x + d\]

where \(A \in \R^{m\times n}\) and \(c \in \R^n\). The formulation (7)) is equivalent to

\[(c^T x + d, Ax + b) \in \mathcal{Q}^{m+1}\]

or equivalently

(8)\[\begin{split}\begin{array}{rcl} s & = & A x + b, \\ t & = & c^T x + d, \\ (t,s) & \in & \mathcal{Q}^{m+1}. \end{array}\end{split}\]

As will be explained later, we refer to (7) as the dual form and (8) as the primal form, respectively. An alternative characterization of (7) is

(9)\[\|Ax + b\|^2_2 - (c^T x + d)^2 \leq 0, \qquad c^T x+ d \geq 0\]

which shows that certain quadratic inequalities are conic quadratic representable. The next section shows such an example.

3.2.6. Simple sets involving power functions

Next we show how to represent some frequently occurring convex sets represented by power like functions. The result is stated in the following lemma:

Lemma 1
The following propositions are true.
  1. \(\left \{(t,x)\mid t \leq \sqrt{x}, \, x \geq 0 \right \} = \left \{(t,x) \mid (x,1/2,t) \in \mathcal{Q}_r^3 \right \}\).
  2. \(\left \{(t,x)\mid t \geq \frac{1}{x}, \, x \geq 0 \right \} = \left \{(t,x)\mid (x,t,\sqrt{2}) \in \mathcal{Q}_r^3 \right \}\).
  3. \(\left \{(t,x)\mid t \geq x^{3/2} ,\, x \geq 0 \right \} = \left \{(t,x)\mid (s,t,x), (x,1/8,s) \in \mathcal{Q}_r^3 \right \}\).
  4. \(\left \{(t,x)\mid t \geq x^{5/3} ,\, x \geq 0 \right \} = \left \{(t,x)\mid (s,t,x), (1/8,z,s),(s,x,z) \in \mathcal{Q}_r^3 \right \}\).
  5. \(\left \{(t,x)\mid t \geq \frac{1}{x^2}, \, x \geq 0 \right \} = \left \{(t,x) \mid (t,1/2,s), (x,s,\sqrt{2}) \in \mathcal{Q}_r^3 \right \}\).
  6. \(\left \{(t,x,y)\mid t \geq \frac{|x|^3}{y^2}, y\geq 0 \right \} = \left \{(t,x,y)\mid (z,x) \in \mathcal{Q}^2, (\frac{y}{2},s,z), (\frac{t}{2},z,s) \in \mathcal{Q}_r^3 \right \}\).

Proof. Proposition (i) follows from

\[\begin{split}\begin{array}{rl} & (x,\frac{1}{2},t) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & x \geq t^2, \, x \geq 0 \\ \Leftrightarrow & \sqrt{x} \geq t,\, x \geq 0. \end{array}\end{split}\]

Proposition (ii) follows from

\[\begin{split}\begin{array}{rl} & (x,t,\sqrt{2}) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & 2 x t \geq 2,\, x,t\geq 0 \\ \Leftrightarrow & t \geq \frac{1}{x},\, x\geq 0. \\ \end{array}\end{split}\]

Proposition (iii) follows from

\[\begin{split}\begin{array}{rl} & (s,t,x), (x,\frac{1}{8},s) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & 2st \geq x^2,\, \frac{1}{4} x \geq s^2,\, s,t,x \geq 0\\ \Leftrightarrow & \sqrt{x} t \geq x^2,\, t,x \geq 0 \\ \Leftrightarrow & t \geq x^{3/2}, \, x\geq 0. \ \end{array}\end{split}\]

Proposition (iv) follows from

\[\begin{split}\begin{array}{rl} & (s,t,x), (1/8,z,s),(s,x,z) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & \frac{1}{4} z \geq s^2,\, 2 s x \geq z^2,\, 2st \geq x^2,\, s,t,x \geq 0\\ \Leftrightarrow & 2 s x \geq (4s^2)^2,\, 2st \geq x^2,\, s,t,x \geq 0 \\ \Leftrightarrow & x \geq 8s^3,\, 2st \geq x^2,\, x,s \geq 0 \\ \Leftrightarrow & x^{1/3} t \geq x^2,\, x \geq 0 \\ \Leftrightarrow & t \geq x^{5/3},\, x \geq 0. \\ \end{array}\end{split}\]

Proposition (v) follows from

\[\begin{split}\begin{array}{rl} & (t,\frac{1}{2},s), (x,s,\sqrt{2}) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & t \geq s^2,\, 2 x s \geq 2,\, t,x \geq 0 \\ \Leftrightarrow & t \geq \frac{1}{x^2}.\\ \end{array}\end{split}\]

Finally, proposition (vi) follows from

\[\begin{split}\begin{array}{rl} & (z,x) \in \mathcal{Q}^2, (\frac{y}{2},s,z), (\frac{t}{2},z,s) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & z \geq |x|,\, ys \geq z^2,\, zt \geq s^2,\, z,y,s,t \geq 0\\ \Leftrightarrow & z \geq |x|,\, zt \geq \frac{z^4}{y^2},\, z,y,t \geq 0 \\ \Leftrightarrow & t \geq \frac{|x|^3}{y^2}, \, y\geq 0. \ \end{array}\end{split}\]

In the above proposition the terms \(x^{3/2}\) and \(x^{5/3}\) appeared. Those terms are special cases of

(10)\[t \geq x^\frac{2k-1}{k},\, x \geq 0\]

defined for \(k=2,3,4,5,...\) (10) is identical to

\[t x^{1/k} \geq x^2,\, x \geq 0\]

which implies

\[\begin{split}\begin{array}{rcll} 2 t z & \geq & x^2, & x,z \geq 0, \\ x^{1/k} & \geq & 2 z. \end{array}\end{split}\]

The latter can be rewritten as

\[\begin{split}\begin{array}{rcll} 2 t z & \geq & x^2, & x,z \geq 0,\\ x & \geq & 2^k z^k & \end{array}\end{split}\]

for which it is easy to build a conic quadratic representation. Therefore we will leave it as an exercise for reader.

3.2.7. Geometric mean

Closely related is the hypograph of the geometric mean of nonnegative variables \(x_1\) and \(x_2\),

\[K_\text{gm}^2 = \{ (x_1,x_2,t) \mid \sqrt{x_1 x_2} \geq t, \: x_1, x_2 \geq 0 \}.\]

This corresponds to a scaled rotated quadratic cone

\[K_\text{gm}^2 = \{ (x_1,x_2,t) \mid (x_1,x_2,\sqrt{2}t) \in \mathcal{Q}_r^3 \}.\]

More generally, we can obtain a conic quadratic representation of the hypograph

\[K_\text{gm}^n = \left \{ (x,t)\in \R^{n+1} \mid \left (x_1 x_2 \cdots x_n \right )^{1/n} \geq t, x \geq 0 \right \}.\]

Initially let us assume that \(n=2^l\) for an integer \(l\geq 1\). In particular, let us assume \(l=3\), i.e., we seek a conic quadratic representation of

(11)\[t \leq (x_1 x_2 \cdots x_8)^{1/8}, \quad x \geq 0.\]

If we introduce the cone relationships

(12)\[(x_1,x_2,y_1), (x_3,x_4,y_2),(x_5,x_6,y_3),(x_7,x_8,y_4) \in \mathcal{Q}_r^3,\]

then

\[2 x_1 x_2 \geq y_1^2\]

and so forth, which is obviously equivalent to

\[x_1 x_2 \geq (1/2)y_1^2.\]

This leads to a characterization

\[((1/2)y_1^2 (1/2)y_2^2 (1/2)y_3^2 (1/2)y_4^2)^{1/8} \leq (x_1 x_2 \cdots x_8)^{1/8},\]

or equivalently

\[\frac{1}{\sqrt{2}}( y_1 y_2 y_3 y_4)^{1/4} \leq (x_1 x_2 \cdots x_8)^{1/8}.\]

Hence, by introducing 4 3-dimensional rotated quadratic cones, we have obtained a simpler problem with 4 \(y\) variables instead of 8 \(x\) variables. Clearly, we can apply the same idea to the reduced problem. To that end, introduce the inequalities

\[(y_1,y_2,z_1),(y_3,y_4,z_2) \in \mathcal{Q}_r^3,\]

implying that

\[\frac{1}{\sqrt{2}} (z_1 z_2)^{1/2} \leq (y_1 y_2 y_3 y_4)^{1/4}.\]

Finally, if we introduce the conic relationship

\[(z_1,z_2,w_1) \in \mathcal{Q}_r^3\]

we get

\[w_1 \leq \sqrt{2}(z_1 z_2)^{1/2} \leq \sqrt{4} (y_1 y_2 y_3 y_4)^{1/4} \leq \sqrt{8} (x_1 x_2 \cdots x_8)^{1/8}.\]

Therefore

(13)\[\begin{split}\begin{array}{rcl} (x_1,x_2,y_1),(x_3,x_4,y_2),(x_5,x_6,y_3),(x_7,x_8,y_4) & \in & \mathcal{Q}_r^3, \\ (y_1,y_2,z_1),(y_3,y_4,z_2) & \in & \mathcal{Q}_r^3, \\ (z_1,z_2,w_1) & \in & \mathcal{Q}_r^3, \\ w_1 & = & \sqrt{8} t \\ \end{array}\end{split}\]

is a representation of the set (11), which is by construction conic quadratic representable. The relaxation using \(l\) levels of auxiliary variables and rotated quadratic cones can represented using a tree-structure as shown shown in Fig. 3.2.

_images/gm-tree-crop.png

Fig. 3.2 Tree-structure of the relaxations as in (13): on the edges we report the additional variables, on the nodes the additional constraints.

Next let us assume that \(n\) is not a power of two. Let, for example, \(n=6\). We then wish to characterize the set

\[t \leq (x_1 x_2 x_3 x_4 x_5 x_6 )^{1/6}, \quad x \geq 0.\]

which is equivalent to

\[t \leq (x_1 x_2 x_3 x_4 x_5 x_6 x_7 x_8 )^{1/8}, \quad x_7=x_8=t, \quad x \geq 0.\]

In other words, we can reuse the result (11) if we add simple equality constraints

\[x_7 = x_8 = t.\]

Thus, if \(n\) is not a power of two, we take \(l=\lceil \log_2 n\rceil\) and build the conic quadratic quadratic representation for that set, and we add \(2^l-n\) simple equality constraints.

3.2.8. Harmonic mean

Consider next the hypograph of the harmonic mean,

\[\left (\frac{1}{n} \sum_{i=1}^n x_i^{-1} \right )^{-1} \geq t, \quad x\geq 0.\]

This is a convex inequality, but not conic quadratic representable. However, the reciprocal inequality

(14)\[\frac{1}{n} \sum_{i=1}^n x_i^{-1} \leq y, \quad x\geq 0.\]

with \(y=1/t\) can be characterized as

\[x_i z_i \geq 1, \quad \sum_{i=1}^n z_i = n y.\]

In other words, the set (14) corresponding to the epigraph of the reciprocal harmonic mean of \(x\) can be described using an intersection of rotated quadratic cones an affine hyperplanes

\[2 x_i z_i \geq w_i^2, \quad x_i, z_i\geq 0, \quad w_i=\sqrt{2}, \quad \sum_{i=1}^n z_i = n y,\]

or \((x_i, z_i, \sqrt{2}) \in \mathcal{Q}_r^3\), \(e^T z =ny\).

3.2.9. Convex increasing rational powers

We can extend the formulations in Sec. 3.2.7 to describe the epigraph

\[K^{p/q} = \{ (x, t) \mid x^{p/q}\leq t, \: x \geq 0 \}\]

for any rational convex power \(p/q\geq 1\), \(p, q\in \mathbb{Z}_+\). For example, consider

\[x^{5/3} \leq t, \quad x\geq 0.\]

We rewrite this as

\[x^5 \leq t^3, \quad x\geq 0,\]

which can be described (using the approach in Sec. 3.2.7) as

\[0\leq x \leq (y_1 y_2 y_3 y_4 y_5)^{1/5}, \quad y_1=y_2=y_3=t, \quad y_4=y_5=1,\]

where we note that

\[y_1 y_2 y_3 y_4 y_5 = t^3.\]

The inequality

\[0\leq x \leq (y_1 y_2 y_3 y_4 y_5)^{1/5}\]

is a geometric mean inequality discussed in Sec. 3.2.7. If we let \(e_k\in\R^k\) denote the vector of all ones, then for general \(p/q \geq 1\), \(p,q\in \mathbb{Z}_+\) we have

\[K^{p/q} = \{ (x, t) \mid (e_q t, e_{p-q}, x)\in K^p_\text{gm} \}.\]

3.2.10. Convex decreasing rational powers

In a similar way we can describe the epigraph

\[K^{-p/q} = \{ (x,t) \mid x^{-p/q}\leq t, \: x \geq 0 \}\]

for any \(p, q\in \mathbb{Z}_+\). For example, consider

\[x^{-5/2} \leq t, \quad x\geq 0,\]

which we rewrite as

\[1 \leq x^5 t^2, \quad x\geq 0,\]

or equivalently as

\[1 \leq (y_1 y_2 \dots y_7)^{1/7}, \quad y_1=\dots=y_5=x, \quad y_6=y_7=t, \quad y,t\geq 0.\]

Let \(e_k\in \R^k\) denote the vector of all ones. For general \(p,q\in \mathbb{Z}_+\) we then have

\[K^{-p/q} = \{ (x,t) \mid (e_p x, e_q t, 1)\in K^{p+q}_\text{gm} \}.\]

3.2.11. Power cones

The \((n+1)\)-dimensional power cone is defined by

(15)\[K^{n+1}_\alpha = \left \{ (x,y) \in \R_+^n \times \R\mid |y| \leq \prod_{j=1}^n x_j^{\alpha_j} \right \}\]

where \(\alpha > 0\) and

\[\sum_{j=1}^n \alpha_j = 1.\]

Of particular interest is the three-dimensional power cone given by

(16)\[K^3_\alpha = \left \{ (x,y) \in \R_+^2 \times \R\mid |y| \leq x_1^{\alpha_1} x_2^{1-\alpha_1} \right \} .\]

For example, the intersection of the three-dimensional power cone with an affine hyperplane

\[|y| \leq x_1^{\alpha_1} x_2^{1-\alpha_1}, \qquad x_2 = 1\]

is equivalent to

\[|y|^{1/\alpha_1} \leq x_1,\]

i.e., we can model the epigraph of convex increasing terms (\(1/\alpha_1\geq 1\)) using a three-dimensional power cone. As another example, the cone

\[|y| \leq x_1^{\frac{2}{3}} x_2^{1-\frac{2}{3}}\]

is equivalent to

\[\frac{|y|^3}{x_1^2} \leq x_2.\]

Assume next that \(\alpha_1 = p/q\) where \(p\) and \(q\) are positive integers and \(p \leq q\). Then

(17)\[K^3_{(\frac{p}{q},1-\frac{p}{q})} = \left \{ (x,y) \in \R_+^2 \times \R\mid |y| \leq x_1^{\frac{p}{q}} x_2^{(1-\frac{p}{q})} \right \}.\]

Now

\[|y| \leq x_1^{\frac{p}{q}} x_2^{(1-\frac{p}{q})}\]

is equivalent to

(18)\[|y| \leq (x_1^p x_2^{q-p})^{\frac{1}{q}},\]

so by introducing additional \(z\) variables and the constraints

\[z_1 = z_2 = \ldots = z_{q-p} = x_1, \qquad z_{q-p+1} = z_{q-p+2} = \ldots = z_q = x_2\]

we can rewrite (18) as

(19)\[|y| \leq (z_1 z_2 \ldots z_q)^{\frac{1}{p}}\]

which essentially is the geometric mean inequality discussed above.

We next consider the \(n+1\) dimensional power cone with rational powers,

(20)\[K^{n+1}_\alpha = \left \{ (x,y) \in \R_+^n \times \R\mid |y| \leq \prod_{j=1}^n x_j^{p_j/q_j} \right \}\]

where \(p_j,q_j\) are integers satisfying \(0 < p_j \leq q_j\) and \(\sum_{j=1}^n p_j/q_j = 1\). To motivate the general procedure, we consider a simple example.

Consider the 4 dimensional power cone

\[K^4_{(\frac{1}{4},\frac{3}{8},\frac{3}{8})} = \left \{ (x,y)\in\R_+^3 \times \R\mid | y | \leq (x_1^\frac{1}{4} x_2^\frac{3}{8} x_3^\frac{3}{8}) \right \}.\]

An equivalent representation is given by the geometric mean inequality

\[| y | \leq (z_1 z_2 \dots z_8)^{1/8}\]

with

\[z_1 = z_2 = x_1, \qquad z_3=z_4=z_5=x_2, \qquad z_6=z_7=z_8=x_3.\]

In the general case, let

\[\beta = \mbox{lcm}(q_1,q_2,\ldots,q_n)\]

denote the least common multiple of \(\{ q_i \}\). Then

\[K^{n+1}_\alpha = \left \{ (x,y) \in \R_+^n \times \R\mid |y| \leq \left (\prod_{j=1}^n x_j^{\frac{\beta p_j}{q_j}} \right )^\frac{1}{\beta} \right \},\]

where we note that \(\frac{\beta p_j}{q_j}\) are integers and \(\sum_{j=1}^n \frac{\beta p_j}{q_j} = \beta\). If we define

\[s_j = \sum_{k=1}^{j}\frac{\beta p_k}{q_k}, \quad j=1,\ldots,n-1,\]

we can then reformulate (20) as

\[\begin{split}\begin{array}{c} |y| \leq (z_1 z_2 \ldots z_\beta)^{\frac{1}{q}} \\ z_1 = \ldots = z_{s_1} = x_1, \quad z_{s_1+1}= \ldots = z_{s_2}=x_2, \quad \ldots \quad z_{s_{n-1}+1} = \ldots = z_\beta = x_n. \end{array}\end{split}\]

3.2.12. Quadratic forms with one negative eigenvalue

Assume that \(A \in \R^{n \times n}\) is a symmetric matrix with exactly one negative eigenvalue, i.e., \(A\) has a spectral factorization (i.e., eigenvalue decomposition)

\[A = Q \Lambda Q^T = -\alpha_1 q_1 q_1^T + \sum_{i=2}^n \alpha_i q_i q_i^T\]

where \(Q^T Q = I\), \(\Lambda=\mathbf{diag}(-\alpha_1, \alpha_2, \dots, \alpha_n)\), \(\alpha_i \geq 0, \: \forall i\). Then

\[x^T A x \leq 0\]

is equivalent to

(21)\[\sum_{j=2}^n \alpha_j (q_j^T x)^2 \leq \alpha_1 (q_1^T x)^2.\]

Suppose \(q_1^T x \geq 0\). We can characterize (21) in different ways, e.g., as

(22)\[(\sqrt{\alpha_1/2}q_1^Tx, \sqrt{\alpha_1/2}q_1^Tx, \sqrt{\alpha_2}q_2^Tx, \dots, \sqrt{\alpha_n}q_n^Tx ) \in \mathcal{Q}_r^{n+1}\]

or as

(23)\[(\sqrt{\alpha_1}q_1^Tx, \sqrt{\alpha_2}q_2^Tx, \dots, \sqrt{\alpha_n}q_n^Tx ) \in \mathcal{Q}^{n}.\]

The latter equivalence follows directly by using the orthogonal transformation \(T_{n+1}\) on (22), i.e.,

\[\begin{split}\left[ \begin{array}{ccccc} 1/\sqrt{2} & \hphantom{-}1/\sqrt{2} & \\ 1/\sqrt{2} & -1/\sqrt{2} & \\ & & \hphantom{-}1 \\ & & & \hphantom{-}\ddots \\ & & & & \hphantom{-}1 \end{array} \right] \left[ \begin{array}{c} \sqrt{\alpha_1/2} q_1^T x \\ \sqrt{\alpha_1/2} q_1^T x \\ \sqrt{\alpha_2} q_2^T x \\ \vdots \\ \sqrt{\alpha_n} q_n^T x \end{array} \right] = \left[ \begin{array}{c} \sqrt{\alpha_1} q_1^T x \\ 0 \\ \sqrt{\alpha_2} q_2^T x \\ \vdots \\ \sqrt{\alpha_n} q_n^T x \end{array} \right].\end{split}\]

3.2.13. Ellipsoidal sets

The set

\[{\cal E} = \{ x\in \R^n \mid \| P(x-c) \|_2 \leq 1 \}\]

describes an ellipsoid centred at \(c\). It has a natural conic quadratic representation, i.e., \(x\in {\cal E}\) if and only if

\[y = P(x-c), \quad (t, y) \in \mathcal{Q}^{n+1}, \quad t=1.\]

Suppose \(P\) is nonsingular then

\[{\cal E} = \{ x \in \R^n \mid x = P^{-1} y + c, \: \| y \|_2 \leq 1 \}.\]

is an alternatively characterization.

Depending on the context one characterization may be more useful than the other.

3.3. Conic quadratic optimization examples

In this section we give different instructive examples of conic quadratic optimization problems formed using the formulations from Sec. 3.2.

3.3.1. Quadratically constrained quadratic optimization

A general convex quadratically constrained quadratic optimization problem can be written as

(24)\[\begin{split}\begin{array}{ll} \mbox{minimize} & (1/2)x^T Q_0 x + c_0^T x + r_0\\ \mbox{subject to}& (1/2)x^T Q_i x + c_i^T x + r_i \leq 0, \quad i=1,\dots,p, \end{array}\end{split}\]

where all \(Q_i\) are symmetric positive. Let

\[Q_i = F_i^T F_i, \quad i=0,\dots,p,\]

where \(F_i \in \R^{k_i \times n}\). Using the formulations in Sec. 3.2.4 we then get an equivalent conic quadratic problem

(25)\[\begin{split}\begin{array}{lll} \mbox{minimize} & t_0 + c_0^T x + r_0 & \\ \mbox{subject to} & t_i + c_i^T x + r_i = 0, & i=1,\dots,p, \\ & (t_i,1,F_i x) \in \mathcal{Q}_r^{k_i+2},& i=0,\dots,p. \\ \end{array}\end{split}\]

Assume next that \(Q_i\) is a rank 1 matrix, i.e., \(F_i\) has 1 row and \(n\) columns. Storing \(Q_i\) requires about \(n^2/2\) space whereas storing \(F_i\) then only requires \(n\) space. Moreover, the amount of work required to evaluate

\[x^T Q_i x\]

is proportional to \(n^2\) whereas the work required to evaluate

\[x^T F_i^T F_ i x = \| F_i x \|^2\]

is proportional to \(n\) only. In other words, if \(Q_i\) have low rank, then (25) will require much less space to solve than (24). This fact usually also translates into much faster solution times.

3.3.2. Robust optimization with ellipsoidal uncertainties

Often in robust optimization some of the parameters in the model are assumed to be unknown, but we assume that the unknown parameters belong to a simple set describing the uncertainty. For example, for a standard linear optimization problem we may wish to find a robust solution for all vectors \(c\) in an ellipsoid

\[{\cal E} = \{ c\in \R^n \mid c = Fy + g, \: \| y \|_2 \leq 1 \}.\]

A common approach is then to optimize for the worst-case realization of \(c\), i.e., we get a robust version

(26)\[\begin{split}\begin{array}{ll} \mbox{minimize}& \sup_{c\in {\cal E}} c^T x\\ \mbox{subject to}& Ax = b\\ & x\geq 0. \end{array}\end{split}\]

The worst-case objective can be evaluated as

\[\sup_{c\in {\cal E}} c^T x = g^T x + \sup_{\| y \|_2 \leq 1} y^T F^T x = g^T x + \| F^T x\|_2\]

where we used that \(\sup_{\| u \|_2\leq 1} v^T u = (v^T v)/\|v\|_2=\|v\|_2\). Thus the robust problem (26) is equivalent to

\[\begin{split}\begin{array}{ll} \mbox{minimize}& g^T x + \|F^T x \|_2\\ \mbox{subject to}& Ax = b\\ & x\geq 0, \end{array}\end{split}\]

which can be posed as a conic quadratic problem

(27)\[\begin{split}\begin{array}{ll} \mbox{minimize}& g^T x + t\\ \mbox{subject to}& Ax = b\\ & (t,F^T x) \in \mathcal{Q}^{n+1}\\ & x \geq 0. \end{array}\end{split}\]

3.3.3. Markowitz portfolio optimization

In classical Markowitz portfolio optimization we consider investment in \(n\) stocks or assets held over a period of time. Let \(x_i\) denote the amount we invest in asset \(i\), and assume a stochastic model where the return of the assets is a random variable \(r\) with known mean

\[\mu = \mathbf{E}r\]

and covariance

\[\Sigma = \mathbf{E}(r-\mu)(r-\mu)^T.\]

The return of our investment is also a random variable \(y = r^Tx\) with mean (or expected return)

\[\mathbf{E}y = \mu^T x\]

and variance (or risk)

\[(y - \mathbf{E}y)^2 = x^T \Sigma x.\]

We then wish to rebalance our portfolio to achieve a compromise between risk and expected return, e.g., we can maximize the expected return given an upper bound \(\gamma\) on the tolerable risk and a constraint that our total investment is fixed,

\[\begin{split}\begin{array}{ll} \mbox{maximize}& \mu^T x\\ \mbox{subject to}& x^T \Sigma x \leq \gamma\\ & e^T x = 1\\ & x \geq 0. \end{array}\end{split}\]

Suppose we factor \(\Sigma = GG^T\) (e.g., using a Cholesky or a eigenvalue decomposition). We then get a conic formulation

\[\begin{split}\begin{array}{ll} \mbox{maximize}& \mu^T x\\ \mbox{subject to}& (\sqrt{\gamma}, G^Tx) \in \mathcal{Q}^{n+1}\\ & e^T x = 1\\ & x \geq 0. \end{array}\end{split}\]

In practice both the average return and covariance are estimated using historical data. A recent trend is then to formulate a robust version of the portfolio optimization problem to combat the inherent uncertainty in those estimates, e.g., we can constrain \(\mu\) to an ellipsoidal uncertainty set as in Sec. 3.3.2.

3.4. Duality in conic quadratic optimization

To discuss conic quadratic duality, we consider a primal problem

(28)\[\begin{split}\begin{array}{ll} \mbox{minimize} & \langle c^l, x^l\rangle + \sum_{j=1}^{n_q} \langle c^q_j, x^q_j\rangle \\ \mbox{subject to} & A^l x^l + \sum_{j=1}^{n_q}A^q_j x^q_j = b\\ & x^l \in \R^{n_l}_+, \: x^q_j \in \mathcal{Q}^{n_j}, \end{array}\end{split}\]

where \(c^l \in \R^{n_l}, c^q_j\in\R^{n_j}, A^l\in \R^{m\times n^l}, A^q_j\in \R^{m \times n_j}\) and \(b\in \R^m\), i.e., a problem with both a linear cone and \(n_q\) quadratic cones. We can also write (28) more compactly as

(29)\[\begin{split}\begin{array}{ll} \mbox{minimize} & c^T x\\ \mbox{subject to}& Ax = b\\ & x \in {\cal C}, \end{array}\end{split}\]

with

\[\begin{split}x = \left(\begin{array}{c}x^l \\ x^q_1 \\ \vdots \\ x^q_{n_q}\end{array}\right), \quad c = \left(\begin{array}{c}c^l \\ c^q_1 \\ \vdots \\ c^q_{n_q}\end{array}\right), \quad A = \left(\begin{array}{cccc} A^l & A^q_1 & \dots & A^q_{n_q}\end{array}\right),\end{split}\]

for the cone \({\cal C} = \R^l_+ \times \mathcal{Q}^{n_1} \times \cdots \times \mathcal{Q}^{n_q}\). We note that problem (29) resembles a standard linear optimization, except for a more general cone. The Lagrangian function is

\[L(x,y,s) = c^Tx - y^T(Ax-b) - x^Ts.\]

For linear optimization the choice of nonnegative Lagrange multipliers \(s\geq 0\) ensures that \(x^T s \geq 0, \: \forall x\geq 0\) so the Lagrangian function provides a lower bound on the primal problem. For conic quadratic optimization the choice of Lagrange multipliers is more involved, and we need the notion of a dual cone

\[(\mathcal{Q}^n)^* = \{ v\in \R^n \mid u^T v \geq 0, \: \forall u \in \mathcal{Q}^n \}.\]

Then for \(s\in (\mathcal{Q}^n)^*\) we similarly have \(x^Ts\geq 0, \forall x\in \mathcal{Q}^n\), so the Lagrange function becomes a global lower bound.

Lemma 2
The quadratic cone is self-dual, \(({\mathcal{Q}^n})^*=\mathcal{Q}^n\).

Proof. Assume first that \(u_1\geq \|u_{2:n}\|\) and \(v_1 \geq \|v_{2:n}\|\). Then

\[u^T v = u_1v_1 + u_{2:n}^T v_{2:n} \geq u_1v_1 - \|u_{2:n}\| \cdot \|v_{2:n}\| \geq 0.\]

Conversely, assume that \(v_1< \| v_{2:n} \|\). Then \(u:=(1,-v_{2:n}/\|v_{2:n}\|)\in Q^n\) satisfies

\[u^T v = v_1 - \| v_{2:n} \| < 0,\]

i.e., \(v\notin (\mathcal{Q}^n)^*\).

The notion of a dual cone allows us to treat (29) as a linear optimization problem, where the dual variables belong to the dual cone, i.e., we have a dual problem

(30)\[\begin{split}\begin{array}{ll} \mbox{maximize} & b^T y \\ \mbox{subject to}& c- A^T y = s\\ & s \in {\cal C}^*, \end{array}\end{split}\]

with a dual cone

\[{\cal C}^* = {\cal C} = \R^l_+ \times \mathcal{Q}^{n_1} \times \cdots \times \mathcal{Q}^{n_q}.\]

Weak duality follows directly by taking inner product of \(x\) and the dual constraint,

\[c^T x - x^T A^T y = c^T x - b^T y = x^T s \geq 0.\]

3.4.1. Primal versus dual form

Let us consider the dual formulation (30) without linear terms. If we eliminate the dual variable \(s\), we can rewrite (30) as

\[\begin{split}\begin{array}{lll} \mbox{maximize} & b^T y & \\ \mbox{subject to}& (c^q_j - (A^q_j)^T y) \in \mathcal{Q}^{n_j}, & j=1,\dots, n_q, \end{array}\end{split}\]

or equivalently

\[\begin{split}\begin{array}{lll} \mbox{maximize} & b^T y & \\ \mbox{subject to}& \| (c^q_j)_{2:n_j} - ( (A^q_j)_{2:n_j,:} )^T y \|_2 \leq (c^q_j)_1 - ( (A^q_j)_{1,:} )^T y, & j=1,\dots, n_j, \end{array}\end{split}\]

where the dual constraints

\[\| (c^q_j)_{2:n_j} - ( (A^q_j)_{2:n_j,:} )^T y \|_2 \leq (c^q_j)_1 - ( (A^q_j)_{1,:} )^T y\]

correspond to the so-called dual form conic constraints in Sec. 3.2.5. If the matrix

\[A = \left(\begin{array}{ccc}A^q_1 & \dots & A^q_{n_q}\end{array}\right),\]

has more columns than rows then usually it is more efficient to solve the primal problem (29), and if the opposite is the case, then it is usually more efficient to solve the dual problem (30).

3.4.2. Examples of bad duality states

The issue of strong duality is more complicated than for linear programming, for example the primal (or dual) optimal values may be finite but unattained, as illustrated in the following example.

Example 1 Positive duality gap

The following example is from [ART02]: consider the problem

(31)\[\begin{split}\begin{array}{ll} \mbox{minimize}& x_1-x_2\\ \mbox{subject to}& x_3 = 1\\ & x_1 \geq \sqrt{x_2^2 + x_3^2}, \end{array}\end{split}\]

with feasible set \(\{x_1\geq \sqrt{1 + x_2^2}, \: x_3=1 \}\). The optimal objective value is \(p^\star=0\) for \((x_1,x_2)\rightarrow \infty\), and we say that the objective is unattained. The dual problem is

(32)\[\begin{split}\begin{array}{ll} \mbox{maximize}& y\\ \mbox{subject to}& 1 \geq \sqrt{1 + y^2}, \end{array}\end{split}\]

with feasible set \(\{y=0\}\) and optimal value \(d^\star=0\), so a positive duality gap exists for all bounded \((x,y)\).

It is also possible to have a positive duality gap even when both the primal and dual optimal objectives are attained.

Example 2 Duality gap

We consider the problem

(33)\[\begin{split}\begin{array}{lrcl} \mbox{minimize}& x_3\\ \mbox{subject to}& x_1 + x_2 + u_1 + u_2 & = & 0\\ & x_3 + u_2 & = & 1 \\ & \sqrt{x_2^2 + x_3^2} & \leq & x_1\\ & \sqrt{u_2^2} & \leq & u_1,\\ \end{array}\end{split}\]

with two quadratic cones \(x\in\mathcal{Q}^3, \: u\in \mathcal{Q}^2\). It follows from the inequality constraints that

\[x_1 \geq | x_2 |, \qquad u_1 \geq |u_2|\]

and therefore \(x_1 + x_2 \geq 0\) and \(u_1 + u_2 \geq 0\). But then \(x_1 + x_2 = 0\) and \(u_1 + u_2 = 0\), which combined with \(x_3=1-u_2\) gives a simplified problem

\[\begin{split}\begin{array}{lrcl} \mbox{minimize}& 1-u_2 & & \\ & \sqrt{x_1^2 + (1-u_2)^2} & \leq & x_1, \end{array}\end{split}\]

with feasible set \(\{ u_2=1, \: x_1\geq 0\}\) and optimal value \(p^\star=0\). The dual problem of (33) is

(34)\[\begin{split}\begin{array}{ll} \mbox{maximize} & y_2 \\ \mbox{subject to}& y_1 \geq \sqrt{ y_1^2 + (1-y_2)^2}\\ & y_1 \geq \sqrt{ (y_1 - y_2)^2 }, \end{array}\end{split}\]

with feasible set \(\{y_2=1, \: y_1\geq \frac{1}{2} \}\) and the optimal value is \(d^\star=-1\). For this example, both the primal and dual optimal values are attained, but we have positive duality gap \(p^\star-d^\star=1\).

To ensure strong duality for conic quadratic optimization, we need an additional regularity assumption. Consider the primal feasible set for (29),

\[{\cal F}_p = \{ x=(x^l,x^q_1,\dots,x^q_{n_q}) \mid Ax = b, \: x^l \geq 0, \: x^q_j \in \mathcal{Q}^{n_j} \}.\]

and dual feasible set for (30),

\[{\cal F}_d = \{ y\in \R^m \mid c-A^Ty=(s^l,s^q_1,\dots,s^q_{n_q}), \: s^l \geq 0, \: s^q_j \in \mathcal{Q}^{n_j} \},\]

respectively. If

\[\exists x\in {\cal F}_p \: : \: (x^q_j)_1 > \| (x^q_j)_{2:n_j} \|, \: j=1,\dots,n_q,\]

or

\[\exists y\in {\cal F}_p \: : \: c-A^Ty = (s^l,s^q_1,\dots,s^q_{n_q}), \: (s^q_j)_1 > \| (s^q_j)_{2:n_j} \|, \: j=1,\dots,n_q,\]

then strong duality between the problems (29) and (30) holds. The additional regularity assumptions are called a Slater constraint qualification. In other words, strong duality holds if the conic inequalities in the primal or dual problem are strictly feasible (the proofs in App. 9.5 can be adapted to handle this case).

Thus the deficiency of positive duality gap in Example 2 is caused by the fact that neither the primal or dual problem is strictly feasible; the inequality in the first problem

\[u_1 \geq \sqrt{u_2^2}\]

is not strict for any \(u\) in the primal feasible set, and similarly the inequality

\[y_1 \geq \sqrt{y_1^2 + (1+y_2)^2}\]

is not strict for any \(y\) in the dual feasible set.

3.5. Infeasibility in conic quadratic optimization

Since duality theory for linear and conic quadratic optimization is almost identical, the same is true for infeasibility concepts. In particular, consider a pair of primal and dual conic optimization problems (29) and (30). We then have generalized versions of Farkas Lemma.

Lemma 3 Generalized Farkas

Given \(A\) and \(b\), exactly one of the two statements are true:

  1. There exists an \(x\in {\cal C}\) such that \(Ax = b\).
  2. There exists a \(y\) such that \(-A^T y \in {\cal C}\) and \(b^T y > 0\).

Farkas’ lemma tells us that either the primal problem (29) is feasible or there exists a \(y\) such that \(-A^T y \in {\cal C}\) and \(b^T y > 0\). In other words, any \(y\) satisfying

\[-A^T y \in {\cal C}, \quad b^T y > 0\]

is a certificate of primal infeasibility. Similarly, the dual variant of Farkas’ lemma states that either the dual problem (30) is feasible or there exists an \(x\in {\cal C}\) such that \(Ax=0\) and \(c^T x <0\). More precisely

Lemma 4 Generalized Farkas dual variant.

Given \(A\) and \(c\), exactly one of the two statements are true:

  1. There exists an \(x\in {\cal C}\) such that \(Ax = 0\) and \(c^Tx < 0\).
  2. There exists a \(y\) such \((c-A^Ty) \in {\cal C}\).

In other words, any \(x\in {\cal C}\) satisfying \(Ax=0\) and \(c^T x < 0\) is a certificate of dual infeasibility.

Consider the conic quadratic problem

\[\begin{split}\begin{array}{lrcl} \mbox{minimize} & x_1 &\\ \mbox{subject to}& 2 x_1 - x_2 &= & 0\\ & x_1 - s &= & 1\\ & \sqrt{x_2^2} & \leq & x_1\\ & s & \geq & 0, \end{array}\end{split}\]

which is infeasible since \(x_1 \geq 2|x_1|\) and \(x_1 \geq 1\). The dual problem is

\[\begin{split}\begin{array}{lc} \mbox{maximize} & y_2\\ \mbox{subject to}& \left(\begin{array}{c} 1 \\ 0 \\ 0 \end{array} \right) + \left(\begin{array}{c} 2y_1 + y_2 \\ -y_1 \\ -y_2 \end{array} \right)\in \mathcal{Q}^3. \end{array}\end{split}\]

Any point \(y=(0,t),\: t\geq 0\) is a certificate of infeasibility since \((t,0,-t)\in \mathcal{Q}^3\); in fact \((0,t)\) is a dual feasible direction with unbounded objective.

3.6. Bibliography

The material in this chapter is based on the paper [LVBL98] and the books [BenTalN01], [BV04]. The papers [AG03], [ART03] contain additional theoretical and algorithmic aspects.