3.1 Introduction¶

This chapter extends the notion of linear optimization with quadratic cones; conic quadratic optimization is a straightforward generalization of linear optimization, in the sense that we optimize a linear function with linear (in)equalities with variables belonging to one or more (rotated) quadratic cones. In general we also allow some of the variables to be linear variables as long as some of the variables belong to a quadratic cone.

We discuss the basic concept of quadratic cones, and demonstrate the surprisingly large flexibility of conic quadratic modeling with several examples of (non-trivial) convex functions or sets that be represented using quadratic cones. These convex sets can then be combined arbitrarily to form different conic quadratic optimization problems.

We finally extend the duality theory and infeasibility analysis from linear to conic optimization, and discuss infeasibility of conic quadratic optimization problems.

We define an $$n$$-dimensional quadratic cone as

(1)$\mathcal{Q}^n = \left \{ x\in \R^n \mid x_1 \geq \sqrt{x_2^2 + x_3^2 + \cdots + x_n^2} \right \}.$

The geometric interpretation of a quadratic (or second-order) cone is shown in Fig. 7 for a cone with three variables, and illustrates how the boundary of the cone resembles an ice-cream cone.

The 1-dimensional quadratic cone simply implies standard nonnegativity $$x_1\geq 0$$.

A set $$S$$ is called a convex cone if for any $$x\in S$$ we have $$\alpha x \in S,\: \forall \alpha \geq 0$$. From the definition (1) it is clear that if $$x\in \mathcal{Q}^n$$ then obviously $$\alpha x \in \mathcal{Q}^n, \: \forall \alpha \geq 0$$, which justifies the notion quadratic cone.

Fig. 7 A quadratic or second-order cone satisfying $$x_1\geq \sqrt{x_2^2 + x_3^2}$$.

An $$n-$$dimensional rotated quadratic cone is defined as

(2)$\mathcal{Q}_r^n = \left \{ x\in \R^n \mid 2x_1 x_2 \geq x_3^2 + \cdots + x_n^2, \: x_1, x_2 \geq 0 \right \}.$

As the name indicates, there is simple relationship between quadratic and rotated quadratic cones. Define an orthogonal transformation

(3)$\begin{split}T_n := \left[ \begin{array}{ccc} 1/\sqrt{2} & \hphantom{-}1/\sqrt{2} & \hphantom{-}0 \\ 1/\sqrt{2} & -1/\sqrt{2} & \hphantom{-}0 \\ 0 & \hphantom{-}0 & I_{n-2} \end{array} \right].\end{split}$

Then it is easy to verify that

$x \in \mathcal{Q}^n \quad \Longleftrightarrow \quad (T_n x) \in \mathcal{Q}^n_r,$

and since $$T$$ is orthogonal we call $$\mathcal{Q}_r$$ a rotated cone; the transformation corresponds to a rotation of $$\pi/4$$ of the $$(x_1,x_2)$$ axis. For example if $$x\in \mathcal{Q}^3$$ and

$\begin{split}\left[ \begin{array}{c}z_1 \\ z_2 \\ z_3 \end{array} \right] = \left[ \begin{array}{ccc} \frac{1}{\sqrt{2}} & \hphantom{-}\frac{1}{\sqrt{2}} & \hphantom{-}0 \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & \hphantom{-}0 \\ 0 & \hphantom{-}0 & 1 \end{array} \right] \left[ \begin{array}{c}x_1 \\ x_2 \\ x_3 \end{array} \right] = \left[ \begin{array}{c}\frac{1}{\sqrt{2}}(x_1 + x_2)\\\frac{1}{\sqrt{2}}(x_1 - x_2)\\x_3 \end{array} \right]\end{split}$

then

$2 z_1 z_2 \geq z_3^2, \: z_1,z_2 \geq 0 \quad \Longrightarrow \quad (x_1^2 - x_2^2) \geq x_3^2, \: x_1 \geq 0,$

and similarly (by interchanging roles of $$x$$ and $$z$$) we see that

$x_1^2 \geq x_2^2 + x_3^2, \: x_1\geq 0 \quad \Longrightarrow \quad 2 z_1 z_2 \geq z_3^2, \: z_1, z_2 \geq 0.$

Thus, one could argue that we only need quadratic cones, but there are many examples of functions where using an explicit rotated quadratic conic formulation is more natural; in Sec. 3.2 (Conic quadratic modeling) we discuss many examples involving both quadratic cones and rotated quadratic cones.

In the following we describe several convex sets that can be modeled using conic quadratic formulations. We describe the convex sets in their simplest embodiment, which we interpret as conic quadratic building blocks; it is then straightforward to combine those in arbitrary intersections and affine mappings to model complex sets. For example we will show that the set

$\frac{1}{x} \leq t, \quad x \geq 0$

can be represented using quadratic cones. Sets that can be represented using quadratic cones are called conic quadratic representable sets.

3.2.1 Absolute values¶

In Sec. 2.2.2 (Absolute values) we saw how to model $$|x|\leq t$$ using two linear inequalities, but in fact the epigraph of the absolute value is just the definition of a two-dimensional quadratic cone, i.e.,

$|x|\leq t \quad \Longleftrightarrow \quad (t, x) \in \mathcal{Q}^2.$

3.2.2 Euclidean norms¶

The Euclidean norm of $$x\in \R^n$$,

$\| x \|_2 = \sqrt { x_1^2 + x_2^2 + \cdots + x_n^2 }$

essentially defines the quadratic cone, i.e.,

$\| x \|_2 \leq t \quad \Longleftrightarrow \quad (t, x)\in \mathcal{Q}^{n+1}.$

3.2.3 Squared Euclidean norms¶

The epigraph of the squared Euclidean norm can be described as the intersection of a rotated quadratic cone with an affine hyperplane,

$\| x \|_2^2 \leq t \quad \Longleftrightarrow \quad (1/2, t, x)\in \mathcal{Q}_r^{n+2}.$

Assume $$Q \in \R^{n \times n}$$ is a symmetric positive semidefinite matrix. The convex inequality

$(1/2)x^T Q x + c^T x + r \leq 0$

may be rewritten as

(4)$\begin{split} \begin{array}{rcl} t + c^T x + r & = & 0, \\ x^T Q x & \leq & 2 t. \\ \end{array}\end{split}$

Since $$Q$$ is symmetric positive semidefinite the epigraph

(5)$x^T Q x \leq 2 t$

is a convex set and there exists a matrix $$F\in \R^{k\times n}$$ such that

(6)$Q = F^T F,$

(see Sec. 4 (Semidefinite optimization) for properties of semidefinite matrices); for instance $$F$$ could be the Cholesky factorization of $$Q$$. We then have an equivalent characterization of (5) as

$\begin{split}\begin{array}{rcl} F x - y & = & 0, \\ s & = & 1, \\ \| y \|^2 & \leq & 2 s t, \end{array}\end{split}$

or more succinctly: if $$Q=F^TF$$ then

$(1/2)x^T Q x \leq t \quad \Longleftrightarrow \quad (t,1,F x) \in \mathcal{Q}_r^{2+k}.$

Frequently $$Q$$ has the structure

$Q = I + F^T F$

where $$I$$ is the identity matrix, so

$x^T Q x = x^T x + x^T F^T F x$

and hence we can write

$\begin{split}\begin{array}{rclll} F x - y & = & 0, \\ f+h & = & t \\ \| x \|^2 & \leq & 2 e f, & e=1, & \\ \| y \|^2 & \leq & 2 g h, & g=1. & \\ \end{array}\end{split}$

In other words,

$(f,1,x ) \in \mathcal{Q}_r^{2+n}, \qquad (h,1,F x ) \in \mathcal{Q}_r^{2+k}, \qquad f+h=t$

is a conic quadratic representation of (5) in this case.

3.2.5 Second-order cones¶

A second-order cone is occasionally specified as

(7)$\| Ax + b \|_2 \leq c^T x + d$

where $$A \in \R^{m\times n}$$ and $$c \in \R^n$$. The formulation (7)) is equivalent to

$(c^T x + d, Ax + b) \in \mathcal{Q}^{m+1}$

or equivalently

(8)$\begin{split}\begin{array}{rcl} s & = & A x + b, \\ t & = & c^T x + d, \\ (t,s) & \in & \mathcal{Q}^{m+1}. \end{array}\end{split}$

As will be explained later, we refer to (7) as the dual form and (8) as the primal form, respectively. An alternative characterization of (7) is

(9)$\|Ax + b\|^2_2 - (c^T x + d)^2 \leq 0, \qquad c^T x+ d \geq 0$

which shows that certain quadratic inequalities are conic quadratic representable. The next section shows such an example.

3.2.6 Simple sets involving power functions¶

Next we show how to represent some frequently occurring convex sets represented by power like functions. The result is stated in the following lemma:

Lemma 3
The following propositions are true.
1. $$\left \{(t,x)\mid t \leq \sqrt{x}, \, x \geq 0 \right \} = \left \{(t,x) \mid (x,1/2,t) \in \mathcal{Q}_r^3 \right \}$$.
2. $$\left \{(t,x)\mid t \geq \frac{1}{x}, \, x \geq 0 \right \} = \left \{(t,x)\mid (x,t,\sqrt{2}) \in \mathcal{Q}_r^3 \right \}$$.
3. $$\left \{(t,x)\mid t \geq x^{3/2} ,\, x \geq 0 \right \} = \left \{(t,x)\mid (s,t,x), (x,1/8,s) \in \mathcal{Q}_r^3 \right \}$$.
4. $$\left \{(t,x)\mid t \geq x^{5/3} ,\, x \geq 0 \right \} = \left \{(t,x)\mid (s,t,x), (1/8,z,s),(s,x,z) \in \mathcal{Q}_r^3 \right \}$$.
5. $$\left \{(t,x)\mid t \geq \frac{1}{x^2}, \, x \geq 0 \right \} = \left \{(t,x) \mid (t,1/2,s), (x,s,\sqrt{2}) \in \mathcal{Q}_r^3 \right \}$$.
6. $$\left \{(t,x,y)\mid t \geq \frac{|x|^3}{y^2}, y\geq 0 \right \} = \left \{(t,x,y)\mid (z,x) \in \mathcal{Q}^2, (\frac{y}{2},s,z), (\frac{t}{2},z,s) \in \mathcal{Q}_r^3 \right \}$$.
Proof

Proposition (i) follows from

$\begin{split}\begin{array}{rl} & (x,\frac{1}{2},t) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & x \geq t^2, \, x \geq 0 \\ \Leftrightarrow & \sqrt{x} \geq t,\, x \geq 0. \end{array}\end{split}$

Proposition (ii) follows from

$\begin{split}\begin{array}{rl} & (x,t,\sqrt{2}) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & 2 x t \geq 2,\, x,t\geq 0 \\ \Leftrightarrow & t \geq \frac{1}{x},\, x\geq 0. \\ \end{array}\end{split}$

Proposition (iii) follows from

$\begin{split}\begin{array}{rl} & (s,t,x), (x,\frac{1}{8},s) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & 2st \geq x^2,\, \frac{1}{4} x \geq s^2,\, s,t,x \geq 0\\ \Leftrightarrow & \sqrt{x} t \geq x^2,\, t,x \geq 0 \\ \Leftrightarrow & t \geq x^{3/2}, \, x\geq 0. \ \end{array}\end{split}$

Proposition (iv) follows from

$\begin{split}\begin{array}{rl} & (s,t,x), (1/8,z,s),(s,x,z) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & \frac{1}{4} z \geq s^2,\, 2 s x \geq z^2,\, 2st \geq x^2,\, s,t,x \geq 0\\ \Leftrightarrow & 2 s x \geq (4s^2)^2,\, 2st \geq x^2,\, s,t,x \geq 0 \\ \Leftrightarrow & x \geq 8s^3,\, 2st \geq x^2,\, x,s \geq 0 \\ \Leftrightarrow & x^{1/3} t \geq x^2,\, x \geq 0 \\ \Leftrightarrow & t \geq x^{5/3},\, x \geq 0. \\ \end{array}\end{split}$

Proposition (v) follows from

$\begin{split}\begin{array}{rl} & (t,\frac{1}{2},s), (x,s,\sqrt{2}) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & t \geq s^2,\, 2 x s \geq 2,\, t,x \geq 0 \\ \Leftrightarrow & t \geq \frac{1}{x^2}.\\ \end{array}\end{split}$

Finally, proposition (vi) follows from

$\begin{split}\begin{array}{rl} & (z,x) \in \mathcal{Q}^2, (\frac{y}{2},s,z), (\frac{t}{2},z,s) \in \mathcal{Q}_r^3 \\ \Leftrightarrow & z \geq |x|,\, ys \geq z^2,\, zt \geq s^2,\, z,y,s,t \geq 0\\ \Leftrightarrow & z \geq |x|,\, zt \geq \frac{z^4}{y^2},\, z,y,t \geq 0 \\ \Leftrightarrow & t \geq \frac{|x|^3}{y^2}, \, y\geq 0. \ \end{array}\end{split}$

In the above proposition the terms $$x^{3/2}$$ and $$x^{5/3}$$ appeared. Those terms are special cases of

(10)$t \geq x^\frac{2k-1}{k},\, x \geq 0$

defined for $$k=2,3,4,5,...$$ (10) is identical to

$t x^{1/k} \geq x^2,\, x \geq 0$

which implies

$\begin{split}\begin{array}{rcll} 2 t z & \geq & x^2, & x,z \geq 0, \\ x^{1/k} & \geq & 2 z. \end{array}\end{split}$

The latter can be rewritten as

$\begin{split}\begin{array}{rcll} 2 t z & \geq & x^2, & x,z \geq 0,\\ x & \geq & 2^k z^k & \end{array}\end{split}$

for which it is easy to build a conic quadratic representation. Therefore we will leave it as an exercise for reader.

3.2.7 Geometric mean¶

Closely related is the hypograph of the geometric mean of nonnegative variables $$x_1$$ and $$x_2$$,

$K_\text{gm}^2 = \{ (x_1,x_2,t) \mid \sqrt{x_1 x_2} \geq t, \: x_1, x_2 \geq 0 \}.$

This corresponds to a scaled rotated quadratic cone

$K_\text{gm}^2 = \{ (x_1,x_2,t) \mid (x_1,x_2,\sqrt{2}t) \in \mathcal{Q}_r^3 \}.$

More generally, we can obtain a conic quadratic representation of the hypograph

$K_\text{gm}^n = \left \{ (x,t)\in \R^{n+1} \mid \left (x_1 x_2 \cdots x_n \right )^{1/n} \geq t, x \geq 0 \right \}.$

Initially let us assume that $$n=2^l$$ for an integer $$l\geq 1$$. In particular, let us assume $$l=3$$, i.e., we seek a conic quadratic representation of

(11)$t \leq (x_1 x_2 \cdots x_8)^{1/8}, \quad x \geq 0.$

If we introduce the cone relationships

(12)$(x_1,x_2,y_1), (x_3,x_4,y_2),(x_5,x_6,y_3),(x_7,x_8,y_4) \in \mathcal{Q}_r^3,$

then

$2 x_1 x_2 \geq y_1^2$

and so forth, which is obviously equivalent to

$x_1 x_2 \geq (1/2)y_1^2.$

$((1/2)y_1^2 (1/2)y_2^2 (1/2)y_3^2 (1/2)y_4^2)^{1/8} \leq (x_1 x_2 \cdots x_8)^{1/8},$

or equivalently

$\frac{1}{\sqrt{2}}( y_1 y_2 y_3 y_4)^{1/4} \leq (x_1 x_2 \cdots x_8)^{1/8}.$

Hence, by introducing 4 3-dimensional rotated quadratic cones, we have obtained a simpler problem with 4 $$y$$ variables instead of 8 $$x$$ variables. Clearly, we can apply the same idea to the reduced problem. To that end, introduce the inequalities

$(y_1,y_2,z_1),(y_3,y_4,z_2) \in \mathcal{Q}_r^3,$

implying that

$\frac{1}{\sqrt{2}} (z_1 z_2)^{1/2} \leq (y_1 y_2 y_3 y_4)^{1/4}.$

Finally, if we introduce the conic relationship

$(z_1,z_2,w_1) \in \mathcal{Q}_r^3$

we get

$w_1 \leq \sqrt{2}(z_1 z_2)^{1/2} \leq \sqrt{4} (y_1 y_2 y_3 y_4)^{1/4} \leq \sqrt{8} (x_1 x_2 \cdots x_8)^{1/8}.$

Therefore

(13)$\begin{split}\begin{array}{rcl} (x_1,x_2,y_1),(x_3,x_4,y_2),(x_5,x_6,y_3),(x_7,x_8,y_4) & \in & \mathcal{Q}_r^3, \\ (y_1,y_2,z_1),(y_3,y_4,z_2) & \in & \mathcal{Q}_r^3, \\ (z_1,z_2,w_1) & \in & \mathcal{Q}_r^3, \\ w_1 & = & \sqrt{8} t \\ \end{array}\end{split}$

is a representation of the set (11), which is by construction conic quadratic representable. The relaxation using $$l$$ levels of auxiliary variables and rotated quadratic cones can represented using a tree-structure as shown shown in Fig. 8.

Fig. 8 Tree-structure of the relaxations as in (13): on the edges we report the additional variables, on the nodes the additional constraints.

Next let us assume that $$n$$ is not a power of two. Let, for example, $$n=6$$. We then wish to characterize the set

$t \leq (x_1 x_2 x_3 x_4 x_5 x_6 )^{1/6}, \quad x \geq 0.$

which is equivalent to

$t \leq (x_1 x_2 x_3 x_4 x_5 x_6 x_7 x_8 )^{1/8}, \quad x_7=x_8=t, \quad x \geq 0.$

In other words, we can reuse the result (11) if we add simple equality constraints

$x_7 = x_8 = t.$

Thus, if $$n$$ is not a power of two, we take $$l=\lceil \log_2 n\rceil$$ and build the conic quadratic quadratic representation for that set, and we add $$2^l-n$$ simple equality constraints.

3.2.8 Harmonic mean¶

Consider next the hypograph of the harmonic mean,

$\left (\frac{1}{n} \sum_{i=1}^n x_i^{-1} \right )^{-1} \geq t, \quad x\geq 0.$

This is a convex inequality, but not conic quadratic representable. However, the reciprocal inequality

(14)$\frac{1}{n} \sum_{i=1}^n x_i^{-1} \leq y, \quad x\geq 0.$

with $$y=1/t$$ can be characterized as

$x_i z_i \geq 1, \quad \sum_{i=1}^n z_i = n y.$

In other words, the set (14) corresponding to the epigraph of the reciprocal harmonic mean of $$x$$ can be described using an intersection of rotated quadratic cones an affine hyperplanes

$2 x_i z_i \geq w_i^2, \quad x_i, z_i\geq 0, \quad w_i=\sqrt{2}, \quad \sum_{i=1}^n z_i = n y,$

or $$(x_i, z_i, \sqrt{2}) \in \mathcal{Q}_r^3$$, $$e^T z =ny$$.

3.2.9 Convex increasing rational powers¶

We can extend the formulations in Sec. 3.2.7 (Geometric mean) to describe the epigraph

$K^{p/q} = \{ (x, t) \mid x^{p/q}\leq t, \: x \geq 0 \}$

for any rational convex power $$p/q\geq 1$$, $$p, q\in \mathbb{Z}_+$$. For example, consider

$x^{5/3} \leq t, \quad x\geq 0.$

We rewrite this as

$x^5 \leq t^3, \quad x\geq 0,$

which can be described (using the approach in Sec. 3.2.7 (Geometric mean)) as

$0\leq x \leq (y_1 y_2 y_3 y_4 y_5)^{1/5}, \quad y_1=y_2=y_3=t, \quad y_4=y_5=1,$

where we note that

$y_1 y_2 y_3 y_4 y_5 = t^3.$

The inequality

$0\leq x \leq (y_1 y_2 y_3 y_4 y_5)^{1/5}$

is a geometric mean inequality discussed in Sec. 3.2.7 (Geometric mean). If we let $$e_k\in\R^k$$ denote the vector of all ones, then for general $$p/q \geq 1$$, $$p,q\in \mathbb{Z}_+$$ we have

$K^{p/q} = \{ (x, t) \mid (e_q t, e_{p-q}, x)\in K^p_\text{gm} \}.$

3.2.10 Convex decreasing rational powers¶

In a similar way we can describe the epigraph

$K^{-p/q} = \{ (x,t) \mid x^{-p/q}\leq t, \: x \geq 0 \}$

for any $$p, q\in \mathbb{Z}_+$$. For example, consider

$x^{-5/2} \leq t, \quad x\geq 0,$

which we rewrite as

$1 \leq x^5 t^2, \quad x\geq 0,$

or equivalently as

$1 \leq (y_1 y_2 \dots y_7)^{1/7}, \quad y_1=\dots=y_5=x, \quad y_6=y_7=t, \quad y,t\geq 0.$

Let $$e_k\in \R^k$$ denote the vector of all ones. For general $$p,q\in \mathbb{Z}_+$$ we then have

$K^{-p/q} = \{ (x,t) \mid (e_p x, e_q t, 1)\in K^{p+q}_\text{gm} \}.$

3.2.11 Power cones¶

The $$(n+1)$$-dimensional power cone is defined by

(15)$K^{n+1}_\alpha = \left \{ (x,y) \in \R_+^n \times \R\mid |y| \leq \prod_{j=1}^n x_j^{\alpha_j} \right \}$

where $$\alpha > 0$$ and

$\sum_{j=1}^n \alpha_j = 1.$

Of particular interest is the three-dimensional power cone given by

(16)$K^3_\alpha = \left \{ (x,y) \in \R_+^2 \times \R\mid |y| \leq x_1^{\alpha_1} x_2^{1-\alpha_1} \right \} .$

For example, the intersection of the three-dimensional power cone with an affine hyperplane

$|y| \leq x_1^{\alpha_1} x_2^{1-\alpha_1}, \qquad x_2 = 1$

is equivalent to

$|y|^{1/\alpha_1} \leq x_1,$

i.e., we can model the epigraph of convex increasing terms ($$1/\alpha_1\geq 1$$) using a three-dimensional power cone. As another example, the cone

$|y| \leq x_1^{\frac{2}{3}} x_2^{1-\frac{2}{3}}$

is equivalent to

$\frac{|y|^3}{x_1^2} \leq x_2.$

Assume next that $$\alpha_1 = p/q$$ where $$p$$ and $$q$$ are positive integers and $$p \leq q$$. Then

(17)$K^3_{(\frac{p}{q},1-\frac{p}{q})} = \left \{ (x,y) \in \R_+^2 \times \R\mid |y| \leq x_1^{\frac{p}{q}} x_2^{(1-\frac{p}{q})} \right \}.$

Now

$|y| \leq x_1^{\frac{p}{q}} x_2^{(1-\frac{p}{q})}$

is equivalent to

(18)$|y| \leq (x_1^p x_2^{q-p})^{\frac{1}{q}},$

so by introducing additional $$z$$ variables and the constraints

$z_1 = z_2 = \ldots = z_{q-p} = x_1, \qquad z_{q-p+1} = z_{q-p+2} = \ldots = z_q = x_2$

we can rewrite (18) as

(19)$|y| \leq (z_1 z_2 \ldots z_q)^{\frac{1}{p}}$

which essentially is the geometric mean inequality discussed above.

We next consider the $$n+1$$ dimensional power cone with rational powers,

(20)$K^{n+1}_\alpha = \left \{ (x,y) \in \R_+^n \times \R\mid |y| \leq \prod_{j=1}^n x_j^{p_j/q_j} \right \}$

where $$p_j,q_j$$ are integers satisfying $$0 < p_j \leq q_j$$ and $$\sum_{j=1}^n p_j/q_j = 1$$. To motivate the general procedure, we consider a simple example.

Consider the 4 dimensional power cone

$K^4_{(\frac{1}{4},\frac{3}{8},\frac{3}{8})} = \left \{ (x,y)\in\R_+^3 \times \R\mid | y | \leq (x_1^\frac{1}{4} x_2^\frac{3}{8} x_3^\frac{3}{8}) \right \}.$

An equivalent representation is given by the geometric mean inequality

$| y | \leq (z_1 z_2 \dots z_8)^{1/8}$

with

$z_1 = z_2 = x_1, \qquad z_3=z_4=z_5=x_2, \qquad z_6=z_7=z_8=x_3.$

In the general case, let

$\beta = \mbox{lcm}(q_1,q_2,\ldots,q_n)$

denote the least common multiple of $$\{ q_i \}$$. Then

$K^{n+1}_\alpha = \left \{ (x,y) \in \R_+^n \times \R\mid |y| \leq \left (\prod_{j=1}^n x_j^{\frac{\beta p_j}{q_j}} \right )^\frac{1}{\beta} \right \},$

where we note that $$\frac{\beta p_j}{q_j}$$ are integers and $$\sum_{j=1}^n \frac{\beta p_j}{q_j} = \beta$$. If we define

$s_j = \sum_{k=1}^{j}\frac{\beta p_k}{q_k}, \quad j=1,\ldots,n-1,$

we can then reformulate (20) as

$\begin{split}\begin{array}{c} |y| \leq (z_1 z_2 \ldots z_\beta)^{\frac{1}{q}} \\ z_1 = \ldots = z_{s_1} = x_1, \quad z_{s_1+1}= \ldots = z_{s_2}=x_2, \quad \ldots \quad z_{s_{n-1}+1} = \ldots = z_\beta = x_n. \end{array}\end{split}$

3.2.12 Quadratic forms with one negative eigenvalue¶

Assume that $$A \in \R^{n \times n}$$ is a symmetric matrix with exactly one negative eigenvalue, i.e., $$A$$ has a spectral factorization (i.e., eigenvalue decomposition)

$A = Q \Lambda Q^T = -\alpha_1 q_1 q_1^T + \sum_{i=2}^n \alpha_i q_i q_i^T$

where $$Q^T Q = I$$, $$\Lambda=\mathbf{diag}(-\alpha_1, \alpha_2, \dots, \alpha_n)$$, $$\alpha_i \geq 0, \: \forall i$$. Then

$x^T A x \leq 0$

is equivalent to

(21)$\sum_{j=2}^n \alpha_j (q_j^T x)^2 \leq \alpha_1 (q_1^T x)^2.$

Suppose $$q_1^T x \geq 0$$. We can characterize (21) in different ways, e.g., as

(22)$(\sqrt{\alpha_1/2}q_1^Tx, \sqrt{\alpha_1/2}q_1^Tx, \sqrt{\alpha_2}q_2^Tx, \dots, \sqrt{\alpha_n}q_n^Tx ) \in \mathcal{Q}_r^{n+1}$

or as

(23)$(\sqrt{\alpha_1}q_1^Tx, \sqrt{\alpha_2}q_2^Tx, \dots, \sqrt{\alpha_n}q_n^Tx ) \in \mathcal{Q}^{n}.$

The latter equivalence follows directly by using the orthogonal transformation $$T_{n+1}$$ on (22), i.e.,

$\begin{split}\left[ \begin{array}{ccccc} 1/\sqrt{2} & \hphantom{-}1/\sqrt{2} & \\ 1/\sqrt{2} & -1/\sqrt{2} & \\ & & \hphantom{-}1 \\ & & & \hphantom{-}\ddots \\ & & & & \hphantom{-}1 \end{array} \right] \left[ \begin{array}{c} \sqrt{\alpha_1/2} q_1^T x \\ \sqrt{\alpha_1/2} q_1^T x \\ \sqrt{\alpha_2} q_2^T x \\ \vdots \\ \sqrt{\alpha_n} q_n^T x \end{array} \right] = \left[ \begin{array}{c} \sqrt{\alpha_1} q_1^T x \\ 0 \\ \sqrt{\alpha_2} q_2^T x \\ \vdots \\ \sqrt{\alpha_n} q_n^T x \end{array} \right].\end{split}$

3.2.13 Ellipsoidal sets¶

The set

${\cal E} = \{ x\in \R^n \mid \| P(x-c) \|_2 \leq 1 \}$

describes an ellipsoid centred at $$c$$. It has a natural conic quadratic representation, i.e., $$x\in {\cal E}$$ if and only if

$y = P(x-c), \quad (t, y) \in \mathcal{Q}^{n+1}, \quad t=1.$

Suppose $$P$$ is nonsingular then

${\cal E} = \{ x \in \R^n \mid x = P^{-1} y + c, \: \| y \|_2 \leq 1 \}.$

is an alternatively characterization.

Depending on the context one characterization may be more useful than the other.

In this section we give different instructive examples of conic quadratic optimization problems formed using the formulations from Sec. 3.2 (Conic quadratic modeling).

A general convex quadratically constrained quadratic optimization problem can be written as

(24)$\begin{split}\begin{array}{ll} \mbox{minimize} & (1/2)x^T Q_0 x + c_0^T x + r_0\\ \mbox{subject to}& (1/2)x^T Q_i x + c_i^T x + r_i \leq 0, \quad i=1,\dots,p, \end{array}\end{split}$

where all $$Q_i$$ are symmetric positive. Let

$Q_i = F_i^T F_i, \quad i=0,\dots,p,$

where $$F_i \in \R^{k_i \times n}$$. Using the formulations in Sec. 3.2.4 (Convex quadratic sets) we then get an equivalent conic quadratic problem

(25)$\begin{split}\begin{array}{lll} \mbox{minimize} & t_0 + c_0^T x + r_0 & \\ \mbox{subject to} & t_i + c_i^T x + r_i = 0, & i=1,\dots,p, \\ & (t_i,1,F_i x) \in \mathcal{Q}_r^{k_i+2},& i=0,\dots,p. \\ \end{array}\end{split}$

Assume next that $$Q_i$$ is a rank 1 matrix, i.e., $$F_i$$ has 1 row and $$n$$ columns. Storing $$Q_i$$ requires about $$n^2/2$$ space whereas storing $$F_i$$ then only requires $$n$$ space. Moreover, the amount of work required to evaluate

$x^T Q_i x$

is proportional to $$n^2$$ whereas the work required to evaluate

$x^T F_i^T F_ i x = \| F_i x \|^2$

is proportional to $$n$$ only. In other words, if $$Q_i$$ have low rank, then (25) will require much less space to solve than (24). This fact usually also translates into much faster solution times.

3.3.2 Robust optimization with ellipsoidal uncertainties¶

Often in robust optimization some of the parameters in the model are assumed to be unknown, but we assume that the unknown parameters belong to a simple set describing the uncertainty. For example, for a standard linear optimization problem we may wish to find a robust solution for all vectors $$c$$ in an ellipsoid

${\cal E} = \{ c\in \R^n \mid c = Fy + g, \: \| y \|_2 \leq 1 \}.$

A common approach is then to optimize for the worst-case realization of $$c$$, i.e., we get a robust version

(26)$\begin{split}\begin{array}{ll} \mbox{minimize}& \sup_{c\in {\cal E}} c^T x\\ \mbox{subject to}& Ax = b\\ & x\geq 0. \end{array}\end{split}$

The worst-case objective can be evaluated as

$\sup_{c\in {\cal E}} c^T x = g^T x + \sup_{\| y \|_2 \leq 1} y^T F^T x = g^T x + \| F^T x\|_2$

where we used that $$\sup_{\| u \|_2\leq 1} v^T u = (v^T v)/\|v\|_2=\|v\|_2$$. Thus the robust problem (26) is equivalent to

$\begin{split}\begin{array}{ll} \mbox{minimize}& g^T x + \|F^T x \|_2\\ \mbox{subject to}& Ax = b\\ & x\geq 0, \end{array}\end{split}$

which can be posed as a conic quadratic problem

(27)$\begin{split}\begin{array}{ll} \mbox{minimize}& g^T x + t\\ \mbox{subject to}& Ax = b\\ & (t,F^T x) \in \mathcal{Q}^{n+1}\\ & x \geq 0. \end{array}\end{split}$

3.3.3 Markowitz portfolio optimization¶

In classical Markowitz portfolio optimization we consider investment in $$n$$ stocks or assets held over a period of time. Let $$x_i$$ denote the amount we invest in asset $$i$$, and assume a stochastic model where the return of the assets is a random variable $$r$$ with known mean

$\mu = \mathbf{E}r$

and covariance

$\Sigma = \mathbf{E}(r-\mu)(r-\mu)^T.$

The return of our investment is also a random variable $$y = r^Tx$$ with mean (or expected return)

$\mathbf{E}y = \mu^T x$

and variance (or risk)

$(y - \mathbf{E}y)^2 = x^T \Sigma x.$

We then wish to rebalance our portfolio to achieve a compromise between risk and expected return, e.g., we can maximize the expected return given an upper bound $$\gamma$$ on the tolerable risk and a constraint that our total investment is fixed,

$\begin{split}\begin{array}{ll} \mbox{maximize}& \mu^T x\\ \mbox{subject to}& x^T \Sigma x \leq \gamma\\ & e^T x = 1\\ & x \geq 0. \end{array}\end{split}$

Suppose we factor $$\Sigma = GG^T$$ (e.g., using a Cholesky or a eigenvalue decomposition). We then get a conic formulation

$\begin{split}\begin{array}{ll} \mbox{maximize}& \mu^T x\\ \mbox{subject to}& (\sqrt{\gamma}, G^Tx) \in \mathcal{Q}^{n+1}\\ & e^T x = 1\\ & x \geq 0. \end{array}\end{split}$

In practice both the average return and covariance are estimated using historical data. A recent trend is then to formulate a robust version of the portfolio optimization problem to combat the inherent uncertainty in those estimates, e.g., we can constrain $$\mu$$ to an ellipsoidal uncertainty set as in Sec. 3.3.2 (Robust optimization with ellipsoidal uncertainties).

3.4 Duality in conic quadratic optimization¶

To discuss conic quadratic duality, we consider a primal problem

(28)$\begin{split}\begin{array}{ll} \mbox{minimize} & \langle c^l, x^l\rangle + \sum_{j=1}^{n_q} \langle c^q_j, x^q_j\rangle \\ \mbox{subject to} & A^l x^l + \sum_{j=1}^{n_q}A^q_j x^q_j = b\\ & x^l \in \R^{n_l}_+, \: x^q_j \in \mathcal{Q}^{n_j}, \end{array}\end{split}$

where $$c^l \in \R^{n_l}, c^q_j\in\R^{n_j}, A^l\in \R^{m\times n^l}, A^q_j\in \R^{m \times n_j}$$ and $$b\in \R^m$$, i.e., a problem with both a linear cone and $$n_q$$ quadratic cones. We can also write (28) more compactly as

(29)$\begin{split}\begin{array}{ll} \mbox{minimize} & c^T x\\ \mbox{subject to}& Ax = b\\ & x \in {\cal C}, \end{array}\end{split}$

with

$\begin{split}x = \left(\begin{array}{c}x^l \\ x^q_1 \\ \vdots \\ x^q_{n_q}\end{array}\right), \quad c = \left(\begin{array}{c}c^l \\ c^q_1 \\ \vdots \\ c^q_{n_q}\end{array}\right), \quad A = \left(\begin{array}{cccc} A^l & A^q_1 & \dots & A^q_{n_q}\end{array}\right),\end{split}$

for the cone $${\cal C} = \R^l_+ \times \mathcal{Q}^{n_1} \times \cdots \times \mathcal{Q}^{n_q}$$. We note that problem (29) resembles a standard linear optimization, except for a more general cone. The Lagrangian function is

$L(x,y,s) = c^Tx - y^T(Ax-b) - x^Ts.$

For linear optimization the choice of nonnegative Lagrange multipliers $$s\geq 0$$ ensures that $$x^T s \geq 0, \: \forall x\geq 0$$ so the Lagrangian function provides a lower bound on the primal problem. For conic quadratic optimization the choice of Lagrange multipliers is more involved, and we need the notion of a dual cone

$(\mathcal{Q}^n)^* = \{ v\in \R^n \mid u^T v \geq 0, \: \forall u \in \mathcal{Q}^n \}.$

Then for $$s\in (\mathcal{Q}^n)^*$$ we similarly have $$x^Ts\geq 0, \forall x\in \mathcal{Q}^n$$, so the Lagrange function becomes a global lower bound.

Lemma 4
The quadratic cone is self-dual, $$({\mathcal{Q}^n})^*=\mathcal{Q}^n$$.
Proof

Assume first that $$u_1\geq \|u_{2:n}\|$$ and $$v_1 \geq \|v_{2:n}\|$$. Then

$u^T v = u_1v_1 + u_{2:n}^T v_{2:n} \geq u_1v_1 - \|u_{2:n}\| \cdot \|v_{2:n}\| \geq 0.$

Conversely, assume that $$v_1< \| v_{2:n} \|$$. Then $$u:=(1,-v_{2:n}/\|v_{2:n}\|)\in Q^n$$ satisfies

$u^T v = v_1 - \| v_{2:n} \| < 0,$

i.e., $$v\notin (\mathcal{Q}^n)^*$$.

The notion of a dual cone allows us to treat (29) as a linear optimization problem, where the dual variables belong to the dual cone, i.e., we have a dual problem

(30)$\begin{split}\begin{array}{ll} \mbox{maximize} & b^T y \\ \mbox{subject to}& c- A^T y = s\\ & s \in {\cal C}^*, \end{array}\end{split}$

with a dual cone

${\cal C}^* = {\cal C} = \R^l_+ \times \mathcal{Q}^{n_1} \times \cdots \times \mathcal{Q}^{n_q}.$

Weak duality follows directly by taking inner product of $$x$$ and the dual constraint,

$c^T x - x^T A^T y = c^T x - b^T y = x^T s \geq 0.$

3.4.1 Primal versus dual form¶

Let us consider the dual formulation (30) without linear terms. If we eliminate the dual variable $$s$$, we can rewrite (30) as

$\begin{split}\begin{array}{lll} \mbox{maximize} & b^T y & \\ \mbox{subject to}& (c^q_j - (A^q_j)^T y) \in \mathcal{Q}^{n_j}, & j=1,\dots, n_q, \end{array}\end{split}$

or equivalently

$\begin{split}\begin{array}{lll} \mbox{maximize} & b^T y & \\ \mbox{subject to}& \| (c^q_j)_{2:n_j} - ( (A^q_j)_{2:n_j,:} )^T y \|_2 \leq (c^q_j)_1 - ( (A^q_j)_{1,:} )^T y, & j=1,\dots, n_j, \end{array}\end{split}$

where the dual constraints

$\| (c^q_j)_{2:n_j} - ( (A^q_j)_{2:n_j,:} )^T y \|_2 \leq (c^q_j)_1 - ( (A^q_j)_{1,:} )^T y$

correspond to the so-called dual form conic constraints in Sec. 3.2.5 (Second-order cones). If the matrix

$A = \left(\begin{array}{ccc}A^q_1 & \dots & A^q_{n_q}\end{array}\right),$

has more columns than rows then usually it is more efficient to solve the primal problem (29), and if the opposite is the case, then it is usually more efficient to solve the dual problem (30).

3.4.2 Examples of bad duality states¶

The issue of strong duality is more complicated than for linear programming since the advent of nonlinear feasible sets enables one to construct a number of troublesome problems as shown in Fig. 9.

Fig. 9 Troublesome problems in the xy-plane. (a) Minimizing $$y \geq 1 / x$$ on the nonnegative orthant is unattained approaching zero for $$x \rightarrow \infty$$. (b) The only feasible point is $$(0,0)$$, but objectives maximizing $$x$$ are not blocked by the purely vertical normal vectors of active constraints at this point, falsely suggesting local progress could be made. (c) The intersection of the two subsets is empty, but the distance between them is zero. (d) Minimizing $$x$$ on $$y \geq x^2$$ is unbounded at minus infinity for $$y \rightarrow \infty$$, but there is no improving ray to follow.

Although seemingly unrelated, the four cases of Fig. 9 are actually primal-dual pairs; (a) with (b) and (c) with (d). In fact, the missing normal vector property in (b)—desired to certify optimality—can be attributed to (a) not attaining the best among objective values at distance zero to feasibility, and the missing positive distance property in (c)—desired to certify infeasibility—is because (d) has no improving ray. Optimization problems such as (b) and (c) are denoted illposed because, as seen in Fig. 9, even the tiniest problem perturbation may change their weak feasibility status. This can make illposed problems incredibly difficult to solve numerically. We formalize both pairs as examples.

Example 4 Unattainment; i.e., weakly feasible dual

Consider the problem

(31)$\begin{split}\begin{array}{ll} \mbox{minimize}& (0.5) x_2\\ \mbox{subject to}& 2 x_1 x_2 \geq (\sqrt{2})^2, \end{array}\end{split}$

with feasible set shown by Fig. 9 (a) in the $$(x_1,x_2)$$-plane. The optimal objective value is the limit $$p^\star=0$$ for $$x_1 \rightarrow \infty$$ and said to be unattained. The dual problem is

(32)$\begin{split}\begin{array}{ll} \mbox{maximize}& (\sqrt{2}) s_3\\ \mbox{subject to}& s_1 = 0,\\ & 2 s_1 (0.5) \geq s_3^2, \end{array}\end{split}$

with feasible set shown by Fig. 9 (b) in the $$(s_3, s_1)$$-plane, and the optimal objective value $$d^\star=0$$.

Example 5 Unboundedness without improving ray; i.e., weakly infeasible dual

The problem

(33)$\begin{split}\begin{array}{ll} \mbox{minimize}& (\sqrt{2}) x_3\\ \mbox{subject to}& 2 x_1 (0.5) \geq x_3^2,\\ \end{array}\end{split}$

has a feasible set shown by Fig. 9 (d) in the $$(x_3, x_1)$$-plane. The optimal objective value is the limit $$p^\star=-\infty$$ for $$x_1 \rightarrow \infty$$ and said to be unbounded. Nevertheless, the set of rays $$\{x_1 \geq 0, x_3=0\}$$, found by changing the constant $$0.5$$ in (33) to zero, shows that there is no improving ray to follow. The dual problem is

(34)$\begin{split}\begin{array}{ll} \mbox{maximize}& (0.5) s_1\\ \mbox{subject to}& s_2 = 0,\\ & 2 s_1 s_2 \geq (\sqrt{2})^2, \\ \end{array}\end{split}$

with empty feasible set shown by Fig. 9 (c) in the $$(s_1,s_2)$$-plane, and the optimal objective value $$d^\star=-\infty$$.

Only the dual problems were illposed in examples above. If the primal and dual problems are both illposed, strong duality may be lost with a positive gap between optimal objective values.

Example 6 Finite but nonzero duality gap

The problem

(35)$\begin{split}\begin{array}{ll} \mbox{minimize}& x_3\\ \mbox{subject to}& x_3 \geq -1,\\ & x_2 \geq x_1,\\ & x_1 \geq \sqrt{x_2^2 + x_3^2},\\ \end{array}\end{split}$

has a feasible set $$\{ x_1 = x_2 \geq 0, x_3 = 0 \}$$ because the conic constraint implies $$x_1 \geq x_2$$. The optimal objective value is thus $$p^\star=0$$. The dual problem is

(36)$\begin{split}\begin{array}{ll} \mbox{maximize}& - y_1\\ \mbox{subject to}& y_2 \geq \sqrt{(-y_2)^2 + (1-y_1)^2}, \\ & y_1,y_2 \geq 0, \\ \end{array}\end{split}$

with feasible set $$\{ y_1 = 1, y_2 \geq 0 \}$$, and hence optimal objective value $$d^\star=-1$$.

To ensure strong duality for conic quadratic optimization, we need an additional regularity assumption. Consider the primal feasible set for (29),

${\cal F}_p = \{ x=(x^l,x^q_1,\dots,x^q_{n_q}) \mid Ax = b, \: x^l \geq 0, \: x^q_j \in \mathcal{Q}^{n_j} \}.$

and dual feasible set for (30),

${\cal F}_d = \{ y\in \R^m, s \in (s^l,s^q_1,\dots,s^q_{n_q}) \mid c-A^Ty=s, \: s^l \geq 0, \: s^q_j \in \mathcal{Q}^{n_j} \},$

respectively. A sufficient condition for strong duality between the problems (29) and (30), with attainment of the optimal objective value in each, is

$\exists x\in {\cal F}_p \: : \: (x^q_j)_1 > \| (x^q_j)_{2:n_j} \|, \: j=1,\dots,n_q,$

and

$\exists (y,s) \in {\cal F}_d \: : \: (s^q_j)_1 > \| (s^q_j)_{2:n_j} \|, \: j=1,\dots,n_q.$

In other words, strong duality with attainment holds if the primal and dual problem have a feasible point satisfying all non-polyhedral conic inequalities strictly (the proofs in Sec. 9.5 (Strong duality) can be adapted to handle this case). This additional regularity assumption is called a Slater constraint qualification.

3.5 Infeasibility in conic quadratic optimization¶

Since duality theory for linear and conic quadratic optimization is almost identical, the same is true for infeasibility concepts. In particular, consider a pair of primal and dual conic optimization problems (29) and (30). We then have generalized versions of Farkas Lemma.

Lemma 5 Generalized Farkas

Given $$A$$ and $$b$$, at most one of the two statements are true:

1. There exists an $$x\in {\cal C}$$ such that $$Ax = b$$.
2. There exists a $$y$$ such that $$-A^T y \in {\cal C}$$ and $$b^T y > 0$$.

Farkas’ lemma tells us that a feasible point in the primal problem (29) and an improving ray in the dual problem (30) exclude each other. In other words, any $$y$$ satisfying

$-A^T y \in {\cal C}, \quad b^T y > 0$

is a certificate of primal infeasibility. Similarly the dual variant of Farkas’ lemma states that a feasible point in the dual problem (30) and an improving ray in the primal problem (29) exclude each other. More precisely

Lemma 6 Generalized Farkas dual variant

Given $$A$$ and $$c$$, at most one of the two statements are true:

1. There exists an $$x\in {\cal C}$$ such that $$Ax = 0$$ and $$c^Tx < 0$$.
2. There exists a $$y$$ such that $$(c-A^Ty) \in {\cal C}$$.

In other words, any $$x\in {\cal C}$$ satisfying $$Ax=0$$ and $$c^T x < 0$$ is a certificate of dual infeasibility. The only reason Lemma 5 and Lemma 6 cannot be strengthened to saying that exactly one of the two statements are true, is because of the illposed special cases of Sec. 3.4.2 (Examples of bad duality states).

$\begin{split}\begin{array}{lrcl} \mbox{minimize} & x_1 &\\ \mbox{subject to}& 2 x_1 - x_2 &= & 0\\ & x_1 - s &= & 1\\ & \sqrt{x_2^2} & \leq & x_1\\ & s & \geq & 0, \end{array}\end{split}$
which is infeasible since $$x_1 \geq 2|x_1|$$ and $$x_1 \geq 1$$. The dual problem is
$\begin{split}\begin{array}{lc} \mbox{maximize} & y_2\\ \mbox{subject to}& \left(\begin{array}{c} 1 \\ 0 \\ 0 \end{array} \right) + \left(\begin{array}{c} 2y_1 + y_2 \\ -y_1 \\ -y_2 \end{array} \right)\in \mathcal{Q}^3. \end{array}\end{split}$
Any point $$y=(0,t),\: t\geq 0$$ is a certificate of infeasibility since $$(t,0,-t)\in \mathcal{Q}^3$$; in fact $$(0,t)$$ is a dual feasible direction with unbounded objective.