# 3 Conic quadratic optimization¶

## 3.1 Introduction¶

This chapter extends the notion of linear optimization with *quadratic cones*; conic quadratic optimization is a straightforward generalization of linear optimization, in the sense that we optimize a linear function under linear (in)equalities with variables belonging to one or more (rotated) quadratic cones. We also allow some of the variables to remain linear, as long as at least some of the variables belong to a quadratic cone.

We discuss the basic concept of quadratic cones, and demonstrate the surprisingly large flexibility of conic quadratic modeling with several examples of (non-trivial) convex functions or sets that can be represented using quadratic cones. These convex sets can then be combined arbitrarily to form different conic quadratic optimization problems.

We finally extend the duality theory and infeasibility analysis from linear to conic optimization, and discuss infeasibility of conic quadratic optimization problems.

### 3.1.1 Quadratic cones¶

We define an \(n\)-dimensional quadratic cone as

\[
\mathcal{Q}^n = \left\{ x \in \R^n \mid x_1 \geq \sqrt{x_2^2 + x_3^2 + \cdots + x_n^2} \right\}. \qquad (1)
\]

The geometric interpretation of a quadratic (or second-order) cone is shown in Fig. 7 for a cone with three variables, and illustrates how the boundary of the cone resembles an ice-cream cone.

The 1-dimensional quadratic cone simply implies standard nonnegativity \(x_1\geq 0\).

A set \(S\) is called a *convex cone* if for any \(x\in S\) we have \(\alpha x \in S,\: \forall \alpha \geq 0\). From the definition (1) it is clear that if \(x\in \mathcal{Q}^n\) then obviously \(\alpha x \in \mathcal{Q}^n, \: \forall \alpha \geq 0\), which justifies the notion *quadratic cone*.
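Both the membership condition and the cone property are easy to verify numerically. The following sketch (plain Python, using the definition \(x_1 \geq \|x_{2:n}\|_2\)) checks that membership in \(\mathcal{Q}^n\) is preserved under nonnegative scaling.

```python
import math

def in_quadratic_cone(x, tol=1e-12):
    """Membership test for Q^n: x1 >= sqrt(x2^2 + ... + xn^2)."""
    return x[0] >= math.sqrt(sum(v * v for v in x[1:])) - tol

x = [5.0, 3.0, 4.0]            # 5 = sqrt(3^2 + 4^2): a boundary point of Q^3
assert in_quadratic_cone(x)
assert not in_quadratic_cone([1.0, 1.0, 1.0])

# Cone property: alpha * x stays in the cone for every alpha >= 0.
for alpha in [0.0, 0.5, 2.0, 10.0]:
    assert in_quadratic_cone([alpha * v for v in x])
```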

### 3.1.2 Rotated quadratic cones¶

An \(n\)-dimensional *rotated quadratic cone* is defined as

\[
\mathcal{Q}_r^n = \left\{ x \in \R^n \mid 2 x_1 x_2 \geq x_3^2 + \cdots + x_n^2, \; x_1, x_2 \geq 0 \right\}.
\]

As the name indicates, there is a simple relationship between quadratic and rotated quadratic cones. Define an orthogonal transformation

\[
T_n := \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & I_{n-2} \end{bmatrix}.
\]

Then it is easy to verify that

\[
x \in \mathcal{Q}^n \quad \Longleftrightarrow \quad T_n x \in \mathcal{Q}_r^n,
\]

and since \(T\) is orthogonal we call \(\mathcal{Q}_r\) a rotated cone; the transformation corresponds to a rotation of \(\pi/4\) in the \((x_1,x_2)\) plane. For example, if \(x\in \mathcal{Q}^3\) and

\[
z_1 = \frac{x_1 + x_2}{\sqrt{2}}, \qquad z_2 = \frac{x_1 - x_2}{\sqrt{2}}, \qquad z_3 = x_3,
\]

then

\[
2 z_1 z_2 = x_1^2 - x_2^2 \geq x_3^2 = z_3^2, \qquad z_1, z_2 \geq 0,
\]

i.e., \(z \in \mathcal{Q}_r^3\), and similarly (by interchanging the roles of \(x\) and \(z\)) we see that \(z \in \mathcal{Q}_r^3\) implies \(x \in \mathcal{Q}^3\).

Thus, one could argue that we only need quadratic cones, but there are many examples of functions where using an explicit rotated quadratic conic formulation is more natural; in Sec. 3.2 (Conic quadratic modeling) we discuss many examples involving both quadratic cones and rotated quadratic cones.
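The rotation can also be confirmed numerically. The sketch below (plain Python, with the cones ordered as in the definitions above) applies the \(\pi/4\) rotation in the \((x_1,x_2)\) plane to random points of \(\mathcal{Q}^3\) and checks that it lands in \(\mathcal{Q}_r^3\) and is its own inverse.

```python
import math, random

def in_q3(x, tol=1e-9):
    return x[0] >= math.hypot(x[1], x[2]) - tol

def in_qr3(x, tol=1e-9):
    return x[0] >= -tol and x[1] >= -tol and 2 * x[0] * x[1] >= x[2] ** 2 - tol

def rotate(x):
    """pi/4 rotation of the (x1, x2) coordinates; an involution (T^2 = I)."""
    s = math.sqrt(2.0)
    return [(x[0] + x[1]) / s, (x[0] - x[1]) / s, x[2]]

random.seed(0)
for _ in range(1000):
    x2, x3 = random.uniform(-1, 1), random.uniform(-1, 1)
    x = [math.hypot(x2, x3) + random.random(), x2, x3]   # a random point of Q^3
    z = rotate(x)
    assert in_qr3(z)                                     # T maps Q^3 into Q_r^3
    assert all(abs(a - b) < 1e-9 for a, b in zip(rotate(z), x))
```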

## 3.2 Conic quadratic modeling¶

In the following we describe several convex sets that can be modeled using conic quadratic formulations. We describe the convex sets in their simplest embodiment, which we interpret as *conic quadratic building blocks*; it is then straightforward to combine those in arbitrary intersections and affine mappings to model complex sets. For example we will show that the set

can be represented using quadratic cones. Sets that can be represented using quadratic cones are called *conic quadratic representable* sets.

### 3.2.1 Absolute values¶

In Sec. 2.2.2 (Absolute values) we saw how to model \(|x|\leq t\) using two linear inequalities, but in fact the epigraph of the absolute value is just the definition of a two-dimensional quadratic cone, i.e.,

\[
|x| \leq t \quad \Longleftrightarrow \quad (t,x) \in \mathcal{Q}^2.
\]

### 3.2.2 Euclidean norms¶

The Euclidean norm of \(x\in \R^n\),

\[
\|x\|_2 = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2},
\]

essentially defines the quadratic cone, i.e.,

\[
t \geq \|x\|_2 \quad \Longleftrightarrow \quad (t,x) \in \mathcal{Q}^{n+1}.
\]

### 3.2.3 Squared Euclidean norms¶

The epigraph of the squared Euclidean norm can be described as the intersection of a rotated quadratic cone with an affine hyperplane,

\[
t \geq \|x\|_2^2 \quad \Longleftrightarrow \quad (t, \tfrac{1}{2}, x) \in \mathcal{Q}_r^{n+2}.
\]
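As a quick numerical check of this representation (a sketch assuming the ordering \((t, \frac12, x) \in \mathcal{Q}_r^{n+2}\), i.e., \(2 \cdot t \cdot \frac12 \geq \|x\|_2^2\)):

```python
import random

def in_qr(x, tol=1e-9):
    """Membership test for Q_r^n: 2*x1*x2 >= x3^2 + ... + xn^2, x1, x2 >= 0."""
    return (x[0] >= -tol and x[1] >= -tol
            and 2 * x[0] * x[1] >= sum(v * v for v in x[2:]) - tol)

random.seed(1)
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(4)]
    t = random.uniform(0, 3)
    # t >= ||x||^2  <=>  (t, 1/2, x) in Q_r^{n+2}
    assert (t >= sum(v * v for v in x)) == in_qr([t, 0.5, *x])
```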

### 3.2.4 Convex quadratic sets¶

Assume \(Q \in \R^{n \times n}\) is a symmetric positive semidefinite matrix. The convex inequality

may be rewritten as

Since \(Q\) is symmetric positive semidefinite the epigraph

is a convex set and there exists a matrix \(F\in \R^{k\times n}\) such that

(see Sec. 4 (Semidefinite optimization) for properties of semidefinite matrices); for instance \(F\) could be the Cholesky factorization of \(Q\). We then have an equivalent characterization of (5) as

or more succinctly: if \(Q=F^TF\) then

Frequently \(Q\) has the structure

where \(I\) is the identity matrix, so

and hence we can write

In other words,

is a conic quadratic representation of (5) in this case.
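The factorization argument is easy to illustrate numerically; the sketch below (with a randomly generated \(Q\)) confirms that \(x^TQx = \|Fx\|^2\) for any factor \(Q = F^TF\), e.g. a Cholesky factor.

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.standard_normal((5, 5))
Q = F.T @ F                       # symmetric positive semidefinite by construction
x = rng.standard_normal(5)

# x^T Q x = ||F x||^2, so x^T Q x <= 2t is the rotated-cone
# constraint (t, 1, F x) in Q_r^{k+2}.
assert np.isclose(x @ Q @ x, np.linalg.norm(F @ x) ** 2)

# The Cholesky factor L of Q (Q = L L^T) gives the same quadratic form.
L = np.linalg.cholesky(Q)
assert np.isclose(x @ Q @ x, np.linalg.norm(L.T @ x) ** 2)
```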

### 3.2.5 Second-order cones¶

A second-order cone is occasionally specified as

where \(A \in \R^{m\times n}\) and \(c \in \R^n\). The formulation (7) is equivalent to

or equivalently

As will be explained later, we refer to (7) as the dual form and (8) as the primal form, respectively. An alternative characterization of (7) is

which shows that certain quadratic inequalities are conic quadratic representable. The next section shows such an example.

### 3.2.6 Simple sets involving power functions¶

Next we show how to represent some frequently occurring convex sets defined by power-like functions. The result is stated in the following lemma:

The following propositions are true.

- \(\left \{(t,x)\mid t \leq \sqrt{x}, \, x \geq 0 \right \} = \left \{(t,x) \mid (x,1/2,t) \in \mathcal{Q}_r^3 \right \}\).
- \(\left \{(t,x)\mid t \geq \frac{1}{x}, \, x \geq 0 \right \} = \left \{(t,x)\mid (x,t,\sqrt{2}) \in \mathcal{Q}_r^3 \right \}\).
- \(\left \{(t,x)\mid t \geq x^{3/2} ,\, x \geq 0 \right \} = \left \{(t,x)\mid (s,t,x), (x,1/8,s) \in \mathcal{Q}_r^3 \right \}\).
- \(\left \{(t,x)\mid t \geq x^{5/3} ,\, x \geq 0 \right \} = \left \{(t,x)\mid (s,t,x), (1/8,z,s),(s,x,z) \in \mathcal{Q}_r^3 \right \}\).
- \(\left \{(t,x)\mid t \geq \frac{1}{x^2}, \, x \geq 0 \right \} = \left \{(t,x) \mid (t,1/2,s), (x,s,\sqrt{2}) \in \mathcal{Q}_r^3 \right \}\).
- \(\left \{(t,x,y)\mid t \geq \frac{|x|^3}{y^2}, y\geq 0 \right \} = \left \{(t,x,y)\mid (z,x) \in \mathcal{Q}^2, (\frac{y}{2},s,z), (\frac{t}{2},z,s) \in \mathcal{Q}_r^3 \right \}\).
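Propositions *(i)* and *(ii)* can be sanity-checked numerically with the rotated-cone membership test \(2x_1x_2 \geq x_3^2\), \(x_1,x_2\geq 0\); a sketch over nonnegative \(t\) and positive \(x\):

```python
import math, random

def in_qr3(x, tol=1e-9):
    return x[0] >= -tol and x[1] >= -tol and 2 * x[0] * x[1] >= x[2] ** 2 - tol

random.seed(2)
for _ in range(1000):
    t = random.uniform(0, 2)
    x = random.uniform(1e-3, 2)
    # (i):  t <= sqrt(x)  <=>  (x, 1/2, t) in Q_r^3, i.e. x >= t^2
    assert (t <= math.sqrt(x)) == in_qr3([x, 0.5, t])
    # (ii): t >= 1/x      <=>  (x, t, sqrt(2)) in Q_r^3, i.e. x*t >= 1
    assert (t >= 1.0 / x) == in_qr3([x, t, math.sqrt(2)])
```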

Proposition *(i)* follows from

Proposition *(ii)* follows from

Proposition *(iii)* follows from

Proposition *(iv)* follows from

Proposition *(v)* follows from

Finally, proposition *(vi)* follows from

In the above proposition the terms \(x^{3/2}\) and \(x^{5/3}\) appeared. Those terms are special cases of

defined for \(k=2,3,4,5,\ldots\). Observe that (10) is identical to

which implies

The latter can be rewritten as

for which it is easy to build a conic quadratic representation; we leave it as an exercise for the reader.

### 3.2.7 Geometric mean¶

Closely related is the hypograph of the geometric mean of nonnegative variables \(x_1\) and \(x_2\),

This corresponds to a scaled rotated quadratic cone

More generally, we can obtain a conic quadratic representation of the hypograph

Initially let us assume that \(n=2^l\) for an integer \(l\geq 1\). In particular, let us assume \(l=3\), i.e., we seek a conic quadratic representation of

If we introduce the cone relationships

then

and so forth, which is obviously equivalent to

This leads to a characterization

or equivalently

Hence, by introducing four 3-dimensional rotated quadratic cones, we have obtained a simpler problem with four \(y\) variables instead of eight \(x\) variables. Clearly, we can apply the same idea to the reduced problem. To that end, introduce the inequalities

implying that

Finally, if we introduce the conic relationship

we get

Therefore

is a representation of the set (11), which is by construction conic quadratic representable. The construction using \(l\) levels of auxiliary variables and rotated quadratic cones can be represented using a tree structure, as shown in Fig. 8.

Next let us assume that \(n\) is not a power of two. Let, for example, \(n=6\). We then wish to characterize the set

which is equivalent to

In other words, we can reuse the result (11) if we add simple equality constraints

Thus, if \(n\) is not a power of two, we take \(l=\lceil \log_2 n\rceil\), build the conic quadratic representation for \(2^l\) variables, and add \(2^l-n\) simple equality constraints.
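The tower-plus-padding construction can be emulated numerically. The sketch below (plain Python; `geomean_tower` and `t_below_geomean` are hypothetical helper names) pads the variables with copies of \(t\) and reduces pairwise, which reproduces exactly the bound \(t \leq (x_1\cdots x_n)^{1/n}\).

```python
import math, random

def geomean_tower(values):
    """Pairwise reduction y = sqrt(a*b): one level of Q_r^3 cones per pass."""
    while len(values) > 1:
        values = [math.sqrt(a * b) for a, b in zip(values[0::2], values[1::2])]
    return values[0]

def t_below_geomean(t, x):
    """t <= (x1*...*xn)^(1/n), padding x with copies of t up to 2^l entries."""
    l = max(1, math.ceil(math.log2(len(x))))
    padded = list(x) + [t] * (2 ** l - len(x))
    return t <= geomean_tower(padded) + 1e-9

random.seed(3)
for _ in range(200):
    x = [random.uniform(0.1, 3.0) for _ in range(6)]   # n = 6 is not a power of two
    g = math.prod(x) ** (1.0 / 6.0)
    assert t_below_geomean(0.9 * g, x)
    assert not t_below_geomean(1.1 * g, x)
```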

### 3.2.8 Harmonic mean¶

Consider next the hypograph of the harmonic mean,

This is a convex inequality, but not conic quadratic representable. However, the reciprocal inequality

with \(y=1/t\) can be characterized as

In other words, the set (14) corresponding to the epigraph of the reciprocal harmonic mean of \(x\) can be described using an intersection of rotated quadratic cones and affine hyperplanes,

or \((x_i, z_i, \sqrt{2}) \in \mathcal{Q}_r^3\), \(e^T z =ny\).
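A numerical sanity check of this representation (a sketch; `reciprocal_harmonic_feasible` is a hypothetical helper name): since each cone only forces \(x_i z_i \geq 1\) and the \(z_i\) may otherwise grow, the representation is feasible exactly when \(\sum_i 1/x_i \leq n y\).

```python
import random

def reciprocal_harmonic_feasible(y, x):
    """Feasibility of (x_i, z_i, sqrt(2)) in Q_r^3 with e^T z = n*y:
    the smallest admissible z_i is 1/x_i, so check sum(1/x_i) <= n*y."""
    return sum(1.0 / xi for xi in x) <= len(x) * y + 1e-9

random.seed(4)
for _ in range(500):
    x = [random.uniform(0.1, 3.0) for _ in range(5)]
    h = sum(1.0 / xi for xi in x) / len(x)   # reciprocal harmonic mean of x
    assert reciprocal_harmonic_feasible(1.01 * h, x)
    assert not reciprocal_harmonic_feasible(0.99 * h, x)
```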

### 3.2.9 Convex increasing rational powers¶

We can extend the formulations in Sec. 3.2.7 (Geometric mean) to describe the epigraph

for any rational convex power \(p/q\geq 1\), \(p, q\in \mathbb{Z}_+\). For example, consider

We rewrite this as

which can be described (using the approach in Sec. 3.2.7 (Geometric mean)) as

where we note that

The inequality

is a geometric mean inequality discussed in Sec. 3.2.7 (Geometric mean). If we let \(e_k\in\R^k\) denote the vector of all ones, then for general \(p/q \geq 1\), \(p,q\in \mathbb{Z}_+\) we have

### 3.2.10 Convex decreasing rational powers¶

In a similar way we can describe the epigraph

for any \(p, q\in \mathbb{Z}_+\). For example, consider

which we rewrite as

or equivalently as

Let \(e_k\in \R^k\) denote the vector of all ones. For general \(p,q\in \mathbb{Z}_+\) we then have

### 3.2.11 Power cones¶

The \((n+1)\)-dimensional power cone is defined by

where \(\alpha > 0\) and

Of particular interest is the three-dimensional power cone given by

For example, the intersection of the three-dimensional power cone with an affine hyperplane

is equivalent to

i.e., we can model the epigraph of convex increasing terms (\(1/\alpha_1\geq 1\)) using a three-dimensional power cone. As another example, the cone

is equivalent to

Assume next that \(\alpha_1 = p/q\) where \(p\) and \(q\) are positive integers and \(p \leq q\). Then

Now

is equivalent to

so by introducing additional \(z\) variables and the constraints

we can rewrite (18) as

which essentially is the geometric mean inequality discussed above.
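The identity behind this rewriting is easy to check numerically: for \(\alpha_1 = p/q\), the term \(x_1^{p/q} x_2^{(q-p)/q}\) is the geometric mean of \(p\) copies of \(x_1\) and \(q-p\) copies of \(x_2\). A sketch with hypothetical values \(p/q = 2/5\):

```python
import math

# alpha1 = p/q = 2/5, so alpha2 = 3/5 (hypothetical example values)
p, q = 2, 5
x1, x2 = 3.0, 7.0

lhs = x1 ** (p / q) * x2 ** ((q - p) / q)
# Geometric mean of q terms: p copies of x1, q-p copies of x2.
rhs = math.prod([x1] * p + [x2] * (q - p)) ** (1 / q)
assert abs(lhs - rhs) < 1e-12
```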

We next consider the \((n+1)\)-dimensional power cone with rational powers,

where \(p_j,q_j\) are integers satisfying \(0 < p_j \leq q_j\) and \(\sum_{j=1}^n p_j/q_j = 1\). To motivate the general procedure, we consider a simple example.

Consider the 4-dimensional power cone

An equivalent representation is given by the geometric mean inequality

with

In the general case, let

denote the least common multiple of \(\{ q_i \}\). Then

where we note that \(\frac{\beta p_j}{q_j}\) are integers and \(\sum_{j=1}^n \frac{\beta p_j}{q_j} = \beta\). If we define

we can then reformulate (20) as

### 3.2.12 Quadratic forms with one negative eigenvalue¶

Assume that \(A \in \R^{n \times n}\) is a symmetric matrix with exactly one negative eigenvalue, i.e., \(A\) has a spectral factorization (i.e., eigenvalue decomposition)

where \(Q^T Q = I\), \(\Lambda=\mathbf{diag}(-\alpha_1, \alpha_2, \dots, \alpha_n)\), \(\alpha_i \geq 0, \: \forall i\). Then

is equivalent to

Suppose \(q_1^T x \geq 0\). We can characterize (21) in different ways, e.g., as

or as

The latter equivalence follows directly by using the orthogonal transformation \(T_{n+1}\) on (22), i.e.,

### 3.2.13 Ellipsoidal sets¶

The set

describes an ellipsoid centred at \(c\). It has a natural conic quadratic representation, i.e., \(x\in {\cal E}\) if and only if

If \(P\) is nonsingular, then

\[
\| P^{-1}(x - c) \|_2 \leq 1
\]

is an alternative characterization.

Depending on the context one characterization may be more useful than the other.

## 3.3 Conic quadratic optimization examples¶

In this section we give different instructive examples of conic quadratic optimization problems formed using the formulations from Sec. 3.2 (Conic quadratic modeling).

### 3.3.1 Quadratically constrained quadratic optimization¶

A general convex quadratically constrained quadratic optimization problem can be written as

where all \(Q_i\) are symmetric positive semidefinite. Let

where \(F_i \in \R^{k_i \times n}\). Using the formulations in Sec. 3.2.4 (Convex quadratic sets) we then get an equivalent conic quadratic problem

Assume next that \(Q_i\) is a rank-1 matrix, i.e., \(F_i\) has one row and \(n\) columns. Storing \(Q_i\) requires about \(n^2/2\) space, whereas storing \(F_i\) only requires \(n\) space. Moreover, the amount of work required to evaluate

is proportional to \(n^2\) whereas the work required to evaluate

is proportional to \(n\) only. In other words, if the \(Q_i\) have low rank, then (25) will require much less space to solve than (24). This fact usually also translates into much faster solution times.
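The cost difference is visible already in plain linear algebra; a sketch with numpy (the sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
f = rng.standard_normal(n)     # a rank-1 factor: F_i is a single row
Q = np.outer(f, f)             # full n x n storage, O(n^2) work per product
x = rng.standard_normal(n)

# Same quadratic form, but the factored evaluation (f @ x)**2 is O(n).
assert np.isclose(x @ Q @ x, (f @ x) ** 2)
```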

### 3.3.2 Robust optimization with ellipsoidal uncertainties¶

In robust optimization some of the parameters in the model are unknown, but they are assumed to belong to a simple set describing the uncertainty. For example, for a standard linear optimization problem we may wish to find a robust solution for all vectors \(c\) in an ellipsoid

A common approach is then to optimize for the *worst-case* realization of \(c\), i.e., we get a robust version

The worst-case objective can be evaluated as

where we used that \(\sup_{\| u \|_2\leq 1} v^T u = (v^T v)/\|v\|_2=\|v\|_2\). Thus the robust problem (26) is equivalent to

which can be posed as a conic quadratic problem

### 3.3.3 Markowitz portfolio optimization¶

In classical Markowitz portfolio optimization we consider investment in \(n\) stocks or assets held over a period of time. Let \(x_i\) denote the amount we invest in asset \(i\), and assume a stochastic model where the return of the assets is a random variable \(r\) with known mean

and covariance

The return of our investment is also a random variable \(y = r^Tx\) with mean (or expected return)

and variance (or risk)

We then wish to rebalance our portfolio to achieve a compromise between risk and expected return, e.g., we can maximize the expected return given an upper bound \(\gamma\) on the tolerable risk and a constraint that our total investment is fixed,

Suppose we factor \(\Sigma = GG^T\) (e.g., using a Cholesky or an eigenvalue decomposition). We then get a conic formulation

In practice both the average return and covariance are estimated using historical data. A recent trend is then to formulate a robust version of the portfolio optimization problem to combat the inherent uncertainty in those estimates, e.g., we can constrain \(\mu\) to an ellipsoidal uncertainty set as in Sec. 3.3.2 (Robust optimization with ellipsoidal uncertainties).
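A quick check of the factorization used in the conic formulation, with a synthetic covariance and \(G\) taken as the Cholesky factor:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5
A = rng.standard_normal((n, n))
Sigma = A @ A.T + 1e-6 * np.eye(n)   # a synthetic positive definite covariance
G = np.linalg.cholesky(Sigma)        # Sigma = G G^T

x = rng.random(n)
x /= x.sum()                         # fully invested portfolio: e^T x = 1

# Risk x^T Sigma x <= gamma^2 is the cone constraint (gamma, G^T x) in Q^{n+1}.
assert np.isclose(x @ Sigma @ x, np.linalg.norm(G.T @ x) ** 2)
```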

## 3.4 Duality in conic quadratic optimization¶

To discuss conic quadratic duality, we consider a primal problem

where \(c^l \in \R^{n_l}, c^q_j\in\R^{n_j}, A^l\in \R^{m\times n_l}, A^q_j\in \R^{m \times n_j}\) and \(b\in \R^m\), i.e., a problem with a linear cone and \(q\) quadratic cones. We can also write (28) more compactly as

with

for the cone \({\cal C} = \R^{n_l}_+ \times \mathcal{Q}^{n_1} \times \cdots \times \mathcal{Q}^{n_q}\). We note that problem (29) resembles a standard linear optimization problem, except for a more general cone. The Lagrangian function is

For linear optimization the choice of nonnegative Lagrange multipliers \(s\geq 0\) ensures that \(x^T s \geq 0, \: \forall x\geq 0\), so the Lagrangian function provides a lower bound on the primal problem. For conic quadratic optimization the choice of Lagrange multipliers is more involved, and we need the notion of a *dual cone*

\[
(\mathcal{Q}^n)^* := \left\{ s \in \R^n \mid s^T x \geq 0, \ \forall x \in \mathcal{Q}^n \right\}.
\]

Then for \(s\in (\mathcal{Q}^n)^*\) we similarly have \(x^Ts\geq 0, \: \forall x\in \mathcal{Q}^n\), so the Lagrangian function becomes a global lower bound.

Assume first that \(u_1\geq \|u_{2:n}\|\) and \(v_1 \geq \|v_{2:n}\|\). Then

\[
u^T v = u_1 v_1 + u_{2:n}^T v_{2:n} \geq u_1 v_1 - \|u_{2:n}\| \, \|v_{2:n}\| \geq 0.
\]

Conversely, assume that \(v_1< \| v_{2:n} \|\). Then \(u:=(1,-v_{2:n}/\|v_{2:n}\|)\in \mathcal{Q}^n\) satisfies

\[
u^T v = v_1 - \|v_{2:n}\| < 0,
\]

i.e., \(v\notin (\mathcal{Q}^n)^*\).
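Both directions of this self-duality argument can be checked numerically; a plain-Python sketch:

```python
import math, random

def in_q(x, tol=1e-9):
    return x[0] >= math.sqrt(sum(v * v for v in x[1:])) - tol

def sample_q3():
    tail = [random.uniform(-1, 1) for _ in range(3)]
    return [math.sqrt(sum(t * t for t in tail)) + random.random()] + tail

random.seed(8)
# u, v in Q^n  =>  u^T v >= 0 (the Cauchy-Schwarz argument above).
for _ in range(1000):
    u, v = sample_q3(), sample_q3()
    assert sum(a * b for a, b in zip(u, v)) >= -1e-9

# If v1 < ||v_{2:n}||, then u = (1, -v_{2:n}/||v_{2:n}||) lies in Q^n
# and exhibits u^T v = v1 - ||v_{2:n}|| < 0.
v = [0.5, 0.6, 0.8]                  # ||v_{2:n}|| = 1 > v1 = 0.5
u = [1.0, -0.6, -0.8]
assert in_q(u)
assert math.isclose(sum(a * b for a, b in zip(u, v)), -0.5)
```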

The notion of a dual cone allows us to treat (29) as a linear optimization problem, where the dual variables belong to the dual cone, i.e., we have a dual problem

with a dual cone

Weak duality follows directly by taking the inner product of \(x\) with the dual constraint,

### 3.4.1 Primal versus dual form¶

Let us consider the dual formulation (30) without linear terms. If we eliminate the dual variable \(s\), we can rewrite (30) as

or equivalently

where the dual constraints

correspond to the so-called *dual form* conic constraints in Sec. 3.2.5 (Second-order cones). If the matrix

has more columns than rows then usually it is more efficient to solve the primal problem (29), and if the opposite is the case, then it is usually more efficient to solve the dual problem (30).

### 3.4.2 Examples of bad duality states¶

The issue of strong duality is more complicated than for linear optimization, since nonlinear feasible sets make it possible to construct a number of troublesome problems, as shown in Fig. 9.

Although seemingly unrelated, the four cases of Fig. 9 are actually primal-dual pairs; (a) with (b), and (c) with (d). In fact, the missing normal vector property in (b), desired to certify optimality, can be attributed to (a) not attaining the best among objective values at distance zero to feasibility, and the missing positive distance property in (c), desired to certify infeasibility, is because (d) has no improving ray. Optimization problems such as (b) and (c) are called *ill-posed* because, as seen in Fig. 9, even the tiniest problem perturbation may change their *weak* feasibility status. This can make ill-posed problems extremely difficult to solve numerically. We formalize both pairs as examples.

Consider the problem

with feasible set shown by Fig. 9 (a) in the \((x_1,x_2)\)-plane. The optimal objective value is the limit \(p^\star=0\) for \(x_1 \rightarrow \infty\), and it is said to be unattained. The dual problem is

with feasible set shown by Fig. 9 (b) in the \((s_3, s_1)\)-plane, and the optimal objective value \(d^\star=0\).

The problem

has a feasible set shown by Fig. 9 (d) in the \((x_3, x_1)\)-plane. The optimal objective value is the limit \(p^\star=-\infty\) for \(x_1 \rightarrow \infty\), and the problem is said to be unbounded. Nevertheless, the set of rays \(\{x_1 \geq 0, x_3=0\}\), found by changing the constant \(0.5\) in (33) to zero, shows that there is no improving ray to follow. The dual problem is

with empty feasible set shown by Fig. 9 (c) in the \((s_1,s_2)\)-plane, and the optimal objective value \(d^\star=-\infty\).

Only the dual problems were ill-posed in the examples above. If the primal and dual problems are both ill-posed, strong duality may be lost, with a positive gap between the optimal objective values.

The problem

has a feasible set \(\{ x_1 = x_2 \geq 0, x_3 = 0 \}\) because the conic constraint implies \(x_1 \geq x_2\). The optimal objective value is thus \(p^\star=0\). The dual problem is

with feasible set \(\{ y_1 = 1, y_2 \geq 0 \}\), and hence optimal objective value \(d^\star=-1\).

To ensure strong duality for conic quadratic optimization, we need an additional regularity assumption. Consider the primal feasible set for (29),

and dual feasible set for (30),

respectively. A sufficient condition for strong duality between the problems (29) and (30), with attainment of the optimal objective value in each, is

and

In other words, strong duality with attainment holds if the primal and dual problem have a feasible point satisfying all *non-polyhedral* conic inequalities strictly (the proofs in Sec. 9.5 (Strong duality) can be adapted to handle this case). This additional regularity assumption is called a *Slater constraint qualification*.

## 3.5 Infeasibility in conic quadratic optimization¶

Since duality theory for linear and conic quadratic optimization is almost identical, the same is true for infeasibility concepts. In particular, consider a pair of primal and dual conic optimization problems (29) and (30). We then have generalized versions of Farkas' lemma.

Given \(A\) and \(b\), at most one of the two statements is true:

- There exists an \(x\in {\cal C}\) such that \(Ax = b\).
- There exists a \(y\) such that \(-A^T y \in {\cal C}\) and \(b^T y > 0\).

Farkas’ lemma tells us that a feasible point in the primal problem (29) and an improving ray in the dual problem (30) exclude each other. In other words, any \(y\) satisfying

is a *certificate of primal infeasibility*. Similarly the dual variant of Farkas’ lemma states that a feasible point in the dual problem (30) and an improving ray in the primal problem (29) exclude each other. More precisely

Given \(A\) and \(c\), at most one of the two statements is true:

- There exists an \(x\in {\cal C}\) such that \(Ax = 0\) and \(c^Tx < 0\).
- There exists a \(y\) such that \((c-A^Ty) \in {\cal C}\).

In other words, any \(x\in {\cal C}\) satisfying \(Ax=0\) and \(c^T x < 0\) is a *certificate of dual infeasibility*. The only reason Lemma 5 and Lemma 6 cannot be strengthened to say that *exactly one* of the two statements is true is the ill-posed special cases of Sec. 3.4.2 (Examples of bad duality states).

Consider the conic quadratic problem

which is infeasible since \(x_1 \geq 2|x_1|\) and \(x_1 \geq 1\). The dual problem is

Any point \(y=(0,t),\: t\geq 0\) is a certificate of infeasibility since \((t,0,-t)\in \mathcal{Q}^3\); in fact \((0,t)\) is a dual feasible direction with unbounded objective.