# 3. Conic quadratic optimization¶

## 3.1. Introduction¶

This chapter extends the notion of linear optimization with *quadratic cones*; conic quadratic optimization is a straightforward generalization of linear optimization, in the sense that we optimize a linear function with linear (in)equalities with variables belonging to one or more (rotated) quadratic cones. In general we also allow some of the variables to be linear variables as long as some of the variables belong to a quadratic cone.

We discuss the basic concept of quadratic cones, and demonstrate the surprisingly large flexibility of conic quadratic modeling with several examples of (non-trivial) convex functions or sets that be represented using quadratic cones. These convex sets can then be combined arbitrarily to form different conic quadratic optimization problems.

We finally extend the duality theory and infeasibility analysis from linear to conic optimization, and discuss infeasibility of conic quadratic optimization problems.

### 3.1.1. Quadratic cones¶

We define an \(n\)-dimensional quadratic cone as

The geometric interpretation of a quadratic (or second-order) cone is shown in Fig. 3.1 for a cone with three variables, and illustrates how the boundary of the cone resembles an ice-cream cone.

The 1-dimensional quadratic cone simply implies standard nonnegativity \(x_1\geq 0\).

A set \(S\) is called a *convex cone* if for any \(x\in S\) we have \(\alpha x \in S,\: \forall \alpha \geq 0\). From the definition (1) it is clear that if \(x\in \mathcal{Q}^n\) then obviously \(\alpha x \in \mathcal{Q}^n, \: \forall \alpha \geq 0\), which justifies the notion *quadratic cone*.

### 3.1.2. Rotated quadratic cones¶

An \(n-\)dimensional *rotated quadratic cone* is defined as

As the name indicates, there is simple relationship between quadratic and rotated quadratic cones. Define an orthogonal transformation

Then it is easy to verify that

and since \(T\) is orthogonal we call \(\mathcal{Q}_r\) a rotated cone; the transformation corresponds to a rotation of \(\pi/4\) of the \((x_1,x_2)\) axis. For example if \(x\in \mathcal{Q}^3\) and

then

and similarly (by interchanging roles of \(x\) and \(z\)) we see that

Thus, one could argue that we only need quadratic cones, but there are many examples of functions where using an explicit rotated quadratic conic formulation is more natural; in Sec. 3.2 we discuss many examples involving both quadratic cones and rotated quadratic cones.

## 3.2. Conic quadratic modeling¶

In the following we describe several convex sets that can be modeled using conic quadratic formulations. We describe the convex sets in their simplest embodiment, which we interpret as *conic quadratic building blocks*; it is then straightforward to combine those in arbitrary intersections and affine mappings to model complex sets. For example we will show that the set

can be represented using quadratic cones. Sets that can be represented using quadratic cones are called *conic quadratic representable* sets.

### 3.2.1. Absolute values¶

In Sec. 2.2.2 we saw how to model \(|x|\leq t\) using two linear inequalities, but in fact the epigraph of the absolute value is just the definition of a two-dimensional quadratic cone, i.e.,

### 3.2.2. Euclidean norms¶

The Euclidean norm of \(x\in \R^n\),

essentially defines the quadratic cone, i.e.,

### 3.2.3. Squared Euclidean norms¶

The epigraph of the squared Euclidean norm can be described as the intersection of a rotated quadratic cone with an affine hyperplane,

### 3.2.4. Convex quadratic sets¶

Assume \(Q \in \R^{n \times n}\) is a symmetric positive semidefinite matrix. The convex inequality

may be rewritten as

Since \(Q\) is symmetric positive semidefinite the epigraph

is a convex set and there exists a matrix \(F\in \R^{k\times n}\) such that

(see Chap. 4 for properties of semidefinite matrices); for instance \(F\) could be the Cholesky factorization of \(Q\). We then have an equivalent characterization of (5) as

or more succinctly: if \(Q=F^TF\) then

Frequently \(Q\) has the structure

where \(I\) is the identity matrix, so

and hence we can write

In other words,

is a conic quadratic representation of (5) in this case.

### 3.2.5. Second-order cones¶

A second-order cone is occasionally specified as

where \(A \in \R^{m\times n}\) and \(c \in \R^n\). The formulation (7)) is equivalent to

or equivalently

As will be explained later, we refer to (7) as the dual form and (8) as the primal form, respectively. An alternative characterization of (7) is

which shows that certain quadratic inequalities are conic quadratic representable. The next section shows such an example.

### 3.2.6. Simple sets involving power functions¶

Next we show how to represent some frequently occurring convex sets represented by power like functions. The result is stated in the following lemma:

The following propositions are true.

- \(\left \{(t,x)\mid t \leq \sqrt{x}, \, x \geq 0 \right \} = \left \{(t,x) \mid (x,1/2,t) \in \mathcal{Q}_r^3 \right \}\).
- \(\left \{(t,x)\mid t \geq \frac{1}{x}, \, x \geq 0 \right \} = \left \{(t,x)\mid (x,t,\sqrt{2}) \in \mathcal{Q}_r^3 \right \}\).
- \(\left \{(t,x)\mid t \geq x^{3/2} ,\, x \geq 0 \right \} = \left \{(t,x)\mid (s,t,x), (x,1/8,s) \in \mathcal{Q}_r^3 \right \}\).
- \(\left \{(t,x)\mid t \geq x^{5/3} ,\, x \geq 0 \right \} = \left \{(t,x)\mid (s,t,x), (1/8,z,s),(s,x,z) \in \mathcal{Q}_r^3 \right \}\).
- \(\left \{(t,x)\mid t \geq \frac{1}{x^2}, \, x \geq 0 \right \} = \left \{(t,x) \mid (t,1/2,s), (x,s,\sqrt{2}) \in \mathcal{Q}_r^3 \right \}\).
- \(\left \{(t,x,y)\mid t \geq \frac{|x|^3}{y^2}, y\geq 0 \right \} = \left \{(t,x,y)\mid (z,x) \in \mathcal{Q}^2, (\frac{y}{2},s,z), (\frac{t}{2},z,s) \in \mathcal{Q}_r^3 \right \}\).

*Proof.* Proposition *(i)* follows from

Proposition *(ii)* follows from

Proposition *(iii)* follows from

Proposition *(iv)* follows from

Proposition *(v)* follows from

Finally, proposition *(vi)* follows from

In the above proposition the terms \(x^{3/2}\) and \(x^{5/3}\) appeared. Those terms are special cases of

defined for \(k=2,3,4,5,...\) (10) is identical to

which implies

The latter can be rewritten as

for which it is easy to build a conic quadratic representation. Therefore we will leave it as an exercise for reader.

### 3.2.7. Geometric mean¶

Closely related is the hypograph of the geometric mean of nonnegative variables \(x_1\) and \(x_2\),

This corresponds to a scaled rotated quadratic cone

More generally, we can obtain a conic quadratic representation of the hypograph

Initially let us assume that \(n=2^l\) for an integer \(l\geq 1\). In particular, let us assume \(l=3\), i.e., we seek a conic quadratic representation of

If we introduce the cone relationships

then

and so forth, which is obviously equivalent to

This leads to a characterization

or equivalently

Hence, by introducing 4 3-dimensional rotated quadratic cones, we have obtained a simpler problem with 4 \(y\) variables instead of 8 \(x\) variables. Clearly, we can apply the same idea to the reduced problem. To that end, introduce the inequalities

implying that

Finally, if we introduce the conic relationship

we get

Therefore

is a representation of the set (11), which is by construction conic quadratic representable. The relaxation using \(l\) levels of auxiliary variables and rotated quadratic cones can represented using a tree-structure as shown shown in Fig. 3.2.

Next let us assume that \(n\) is not a power of two. Let, for example, \(n=6\). We then wish to characterize the set

which is equivalent to

In other words, we can reuse the result (11) if we add simple equality constraints

Thus, if \(n\) is not a power of two, we take \(l=\lceil \log_2 n\rceil\) and build the conic quadratic quadratic representation for that set, and we add \(2^l-n\) simple equality constraints.

### 3.2.8. Harmonic mean¶

Consider next the hypograph of the harmonic mean,

This is a convex inequality, but not conic quadratic representable. However, the reciprocal inequality

with \(y=1/t\) can be characterized as

In other words, the set (14) corresponding to the epigraph of the reciprocal harmonic mean of \(x\) can be described using an intersection of rotated quadratic cones an affine hyperplanes

or \((x_i, z_i, \sqrt{2}) \in \mathcal{Q}_r^3\), \(e^T z =ny\).

### 3.2.9. Convex increasing rational powers¶

We can extend the formulations in Sec. 3.2.7 to describe the epigraph

for any rational convex power \(p/q\geq 1\), \(p, q\in \mathbb{Z}_+\). For example, consider

We rewrite this as

which can be described (using the approach in Sec. 3.2.7) as

where we note that

The inequality

is a geometric mean inequality discussed in Sec. 3.2.7. If we let \(e_k\in\R^k\) denote the vector of all ones, then for general \(p/q \geq 1\), \(p,q\in \mathbb{Z}_+\) we have

### 3.2.10. Convex decreasing rational powers¶

In a similar way we can describe the epigraph

for any \(p, q\in \mathbb{Z}_+\). For example, consider

which we rewrite as

or equivalently as

Let \(e_k\in \R^k\) denote the vector of all ones. For general \(p,q\in \mathbb{Z}_+\) we then have

### 3.2.11. Power cones¶

The \((n+1)\)-dimensional power cone is defined by

where \(\alpha > 0\) and

Of particular interest is the three-dimensional power cone given by

For example, the intersection of the three-dimensional power cone with an affine hyperplane

is equivalent to

i.e., we can model the epigraph of convex increasing terms (\(1/\alpha_1\geq 1\)) using a three-dimensional power cone. As another example, the cone

is equivalent to

Assume next that \(\alpha_1 = p/q\) where \(p\) and \(q\) are positive integers and \(p \leq q\). Then

Now

is equivalent to

so by introducing additional \(z\) variables and the constraints

we can rewrite (18) as

which essentially is the geometric mean inequality discussed above.

We next consider the \(n+1\) dimensional power cone with rational powers,

where \(p_j,q_j\) are integers satisfying \(0 < p_j \leq q_j\) and \(\sum_{j=1}^n p_j/q_j = 1\). To motivate the general procedure, we consider a simple example.

Consider the 4 dimensional power cone

An equivalent representation is given by the geometric mean inequality

with

In the general case, let

denote the least common multiple of \(\{ q_i \}\). Then

where we note that \(\frac{\beta p_j}{q_j}\) are integers and \(\sum_{j=1}^n \frac{\beta p_j}{q_j} = \beta\). If we define

we can then reformulate (20) as

### 3.2.12. Quadratic forms with one negative eigenvalue¶

Assume that \(A \in \R^{n \times n}\) is a symmetric matrix with exactly one negative eigenvalue, i.e., \(A\) has a spectral factorization (i.e., eigenvalue decomposition)

where \(Q^T Q = I\), \(\Lambda=\mathbf{diag}(-\alpha_1, \alpha_2, \dots, \alpha_n)\), \(\alpha_i \geq 0, \: \forall i\). Then

is equivalent to

Suppose \(q_1^T x \geq 0\). We can characterize (21) in different ways, e.g., as

or as

The latter equivalence follows directly by using the orthogonal transformation \(T_{n+1}\) on (22), i.e.,

### 3.2.13. Ellipsoidal sets¶

The set

describes an ellipsoid centred at \(c\). It has a natural conic quadratic representation, i.e., \(x\in {\cal E}\) if and only if

Suppose \(P\) is nonsingular then

is an alternatively characterization.

Depending on the context one characterization may be more useful than the other.

## 3.3. Conic quadratic optimization examples¶

In this section we give different instructive examples of conic quadratic optimization problems formed using the formulations from Sec. 3.2.

### 3.3.1. Quadratically constrained quadratic optimization¶

A general convex quadratically constrained quadratic optimization problem can be written as

where all \(Q_i\) are symmetric positive. Let

where \(F_i \in \R^{k_i \times n}\). Using the formulations in Sec. 3.2.4 we then get an equivalent conic quadratic problem

Assume next that \(Q_i\) is a rank 1 matrix, i.e., \(F_i\) has 1 row and \(n\) columns. Storing \(Q_i\) requires about \(n^2/2\) space whereas storing \(F_i\) then only requires \(n\) space. Moreover, the amount of work required to evaluate

is proportional to \(n^2\) whereas the work required to evaluate

is proportional to \(n\) only. In other words, if \(Q_i\) have low rank, then (25) will require much less space to solve than (24). This fact usually also translates into much faster solution times.

### 3.3.2. Robust optimization with ellipsoidal uncertainties¶

Often in robust optimization some of the parameters in the model are assumed to be unknown, but we assume that the unknown parameters belong to a simple set describing the uncertainty. For example, for a standard linear optimization problem we may wish to find a robust solution for all vectors \(c\) in an ellipsoid

A common approach is then to optimize for the *worst-case* realization of \(c\), i.e., we get a robust version

The worst-case objective can be evaluated as

where we used that \(\sup_{\| u \|_2\leq 1} v^T u = (v^T v)/\|v\|_2=\|v\|_2\). Thus the robust problem (26) is equivalent to

which can be posed as a conic quadratic problem

### 3.3.3. Markowitz portfolio optimization¶

In classical Markowitz portfolio optimization we consider investment in \(n\) stocks or assets held over a period of time. Let \(x_i\) denote the amount we invest in asset \(i\), and assume a stochastic model where the return of the assets is a random variable \(r\) with known mean

and covariance

The return of our investment is also a random variable \(y = r^Tx\) with mean (or expected return)

and variance (or risk)

We then wish to rebalance our portfolio to achieve a compromise between risk and expected return, e.g., we can maximize the expected return given an upper bound \(\gamma\) on the tolerable risk and a constraint that our total investment is fixed,

Suppose we factor \(\Sigma = GG^T\) (e.g., using a Cholesky or a eigenvalue decomposition). We then get a conic formulation

In practice both the average return and covariance are estimated using historical data. A recent trend is then to formulate a robust version of the portfolio optimization problem to combat the inherent uncertainty in those estimates, e.g., we can constrain \(\mu\) to an ellipsoidal uncertainty set as in Sec. 3.3.2.

## 3.4. Duality in conic quadratic optimization¶

To discuss conic quadratic duality, we consider a primal problem

where \(c^l \in \R^{n_l}, c^q_j\in\R^{n_j}, A^l\in \R^{m\times n^l}, A^q_j\in \R^{m \times n_j}\) and \(b\in \R^m\), i.e., a problem with both a linear cone and \(n_q\) quadratic cones. We can also write (28) more compactly as

with

for the cone \({\cal C} = \R^l_+ \times \mathcal{Q}^{n_1} \times \cdots \times \mathcal{Q}^{n_q}\). We note that problem (29) resembles a standard linear optimization, except for a more general cone. The Lagrangian function is

For linear optimization the choice of nonnegative Lagrange multipliers \(s\geq 0\) ensures that \(x^T s \geq 0, \: \forall x\geq 0\) so the Lagrangian function provides a lower bound on the primal problem. For conic quadratic optimization the choice of Lagrange multipliers is more involved, and we need the notion of a *dual cone*

Then for \(s\in (\mathcal{Q}^n)^*\) we similarly have \(x^Ts\geq 0, \forall x\in \mathcal{Q}^n\), so the Lagrange function becomes a global lower bound.

*Proof.* Assume first that \(u_1\geq \|u_{2:n}\|\) and \(v_1 \geq \|v_{2:n}\|\). Then

Conversely, assume that \(v_1< \| v_{2:n} \|\). Then \(u:=(1,-v_{2:n}/\|v_{2:n}\|)\in Q^n\) satisfies

i.e., \(v\notin (\mathcal{Q}^n)^*\).

The notion of a dual cone allows us to treat (29) as a linear optimization problem, where the dual variables belong to the dual cone, i.e., we have a dual problem

with a dual cone

Weak duality follows directly by taking inner product of \(x\) and the dual constraint,

### 3.4.1. Primal versus dual form¶

Let us consider the dual formulation (30) without linear terms. If we eliminate the dual variable \(s\), we can rewrite (30) as

or equivalently

where the dual constraints

correspond to the so-called *dual form* conic constraints in Sec. 3.2.5. If the matrix

has more columns than rows then usually it is more efficient to solve the primal problem (29), and if the opposite is the case, then it is usually more efficient to solve the dual problem (30).

### 3.4.2. Examples of bad duality states¶

The issue of strong duality is more complicated than for linear programming, for example the primal (or dual) optimal values may be finite but unattained, as illustrated in the following example.

The following example is from [ART02]: consider the problem

with feasible set \(\{x_1\geq \sqrt{1 + x_2^2}, \: x_3=1 \}\). The optimal objective value is \(p^\star=0\) for \((x_1,x_2)\rightarrow \infty\), and we say that the objective is unattained. The dual problem is

with feasible set \(\{y=0\}\) and optimal value \(d^\star=0\), so a positive duality gap exists for all bounded \((x,y)\).

It is also possible to have a positive duality gap even when both the primal and dual optimal objectives are attained.

We consider the problem

with two quadratic cones \(x\in\mathcal{Q}^3, \: u\in \mathcal{Q}^2\). It follows from the inequality constraints that

and therefore \(x_1 + x_2 \geq 0\) and \(u_1 + u_2 \geq 0\). But then \(x_1 + x_2 = 0\) and \(u_1 + u_2 = 0\), which combined with \(x_3=1-u_2\) gives a simplified problem

with feasible set \(\{ u_2=1, \: x_1\geq 0\}\) and optimal value \(p^\star=0\). The dual problem of (33) is

with feasible set \(\{y_2=1, \: y_1\geq \frac{1}{2} \}\) and the optimal value is \(d^\star=-1\). For this example, both the primal and dual optimal values are attained, but we have positive duality gap \(p^\star-d^\star=1\).

To ensure strong duality for conic quadratic optimization, we need an additional regularity assumption. Consider the primal feasible set for (29),

and dual feasible set for (30),

respectively. If

or

then strong duality between the problems (29) and (30) holds. The additional regularity assumptions are called a *Slater constraint qualification*. In other words, strong duality holds if the *conic inequalities* in the primal or dual problem are *strictly feasible* (the proofs in App. 9.5 can be adapted to handle this case).

Thus the deficiency of positive duality gap in Example 2 is caused by the fact that neither the primal or dual problem is strictly feasible; the inequality in the first problem

is not strict for any \(u\) in the primal feasible set, and similarly the inequality

is not strict for any \(y\) in the dual feasible set.

## 3.5. Infeasibility in conic quadratic optimization¶

Since duality theory for linear and conic quadratic optimization is almost identical, the same is true for infeasibility concepts. In particular, consider a pair of primal and dual conic optimization problems (29) and (30). We then have generalized versions of Farkas Lemma.

Given \(A\) and \(b\), exactly one of the two statements are true:

- There exists an \(x\in {\cal C}\) such that \(Ax = b\).
- There exists a \(y\) such that \(-A^T y \in {\cal C}\) and \(b^T y > 0\).

Farkas’ lemma tells us that either the primal problem (29) is feasible or there exists a \(y\) such that \(-A^T y \in {\cal C}\) and \(b^T y > 0\). In other words, any \(y\) satisfying

is a *certificate of primal infeasibility*. Similarly, the dual variant of Farkas’ lemma states that either the dual problem (30) is feasible or there exists an \(x\in {\cal C}\) such that \(Ax=0\) and \(c^T x <0\). More precisely

Given \(A\) and \(c\), exactly one of the two statements are true:

- There exists an \(x\in {\cal C}\) such that \(Ax = 0\) and \(c^Tx < 0\).
- There exists a \(y\) such \((c-A^Ty) \in {\cal C}\).

In other words, any \(x\in {\cal C}\) satisfying \(Ax=0\) and \(c^T x < 0\) is a *certificate of dual infeasibility*.

Consider the conic quadratic problem

which is infeasible since \(x_1 \geq 2|x_1|\) and \(x_1 \geq 1\). The dual problem is

Any point \(y=(0,t),\: t\geq 0\) is a certificate of infeasibility since \((t,0,-t)\in \mathcal{Q}^3\); in fact \((0,t)\) is a dual feasible direction with unbounded objective.