3 Conic quadratic optimization

This chapter extends the notion of linear optimization with quadratic cones. Conic quadratic optimization, also known as second-order cone optimization, is a straightforward generalization of linear optimization, in the sense that we optimize a linear function under linear (in)equalities with some variables belonging to one or more (rotated) quadratic cones. We discuss the basic concept of quadratic cones, and demonstrate the surprisingly large flexibility of conic quadratic modeling.

3.1 Cones

Since this is the first place where we introduce a non-linear cone, it seems suitable to make our most important definition:

A set KRn is called a convex cone if

  • for every x,yK we have x+yK,

  • for every xK and α0 we have αxK.

For example a linear subspace of Rn, the positive orthant R0n or any ray (half-line) starting at the origin are examples of convex cones. We leave it for the reader to check that the intersection of convex cones is a convex cone; this property enables us to assemble complicated optimization models from individual conic bricks.

3.1.1 Quadratic cones

We define the n-dimensional quadratic cone as

(3.1)Qn={xRnx1x22+x32++xn2}.

The geometric interpretation of a quadratic (or second-order) cone is shown in Fig. 3.1 for a cone with three variables, and illustrates how the boundary of the cone resembles an ice-cream cone. The 1-dimensional quadratic cone simply states nonnegativity x10.

_images/qcone.png

Fig. 3.1 Boundary of quadratic cone x1x22+x32 and rotated quadratic cone 2x1x2x32, x1,x20.

3.1.2 Rotated quadratic cones

An ndimensional rotated quadratic cone is defined as

(3.2)Qrn={xRn2x1x2x32++xn2,x1,x20}.

As the name indicates, there is a simple relationship between quadratic and rotated quadratic cones. Define an orthogonal transformation

(3.3)Tn:=[1/21/201/21/2000In2].

Then it is easy to verify that

xQnTnxQrn,

and since T is orthogonal we call Qrn a rotated cone; the transformation corresponds to a rotation of π/4 in the (x1,x2) plane. For example if xQ3 and

[z1z2z3]=[1212012120001][x1x2x3]=[12(x1+x2)12(x1x2)x3]

then

2z1z2z32,z1,z20(x12x22)x32,x10,

and similarly we see that

x12x22+x32,x102z1z2z32,z1,z20.

Thus, one could argue that we only need quadratic cones Qn, but there are many examples where using an explicit rotated quadratic cone Qrn is more natural, as we will see next.

3.2 Conic quadratic modeling

In the following we describe several convex sets that can be modeled using conic quadratic formulations or, as we call them, are conic quadratic representable.

3.2.1 Absolute values

In Sec. 2.2.2 (Absolute value) we saw how to model |x|t using two linear inequalities, but in fact the epigraph of the absolute value is just the definition of a two-dimensional quadratic cone, i.e.,

|x|t(t,x)Q2.

3.2.2 Euclidean norms

The Euclidean norm of xRn,

x2=x12+x22++xn2

essentially defines the quadratic cone, i.e.,

x2t(t,x)Qn+1.

The epigraph of the squared Euclidean norm can be described as the intersection of a rotated quadratic cone with an affine hyperplane,

x12++xn2=x22t(1/2,t,x)Qrn+2.

3.2.3 Convex quadratic sets

Assume QRn×n is a symmetric positive semidefinite matrix. The convex inequality

(1/2)xTQx+cTx+r0

may be rewritten as

(3.4)t+cTx+r=0,xTQx2t.

Since Q is symmetric positive semidefinite the epigraph

(3.5)xTQx2t

is a convex set and there exists a matrix FRk×n such that

(3.6)Q=FTF

(see Sec. 6 (Semidefinite optimization) for properties of semidefinite matrices). For instance F could be the Cholesky factorization of Q. Then

xTQx=xTFTFx=Fx22

and we have an equivalent characterization of (3.5) as

(1/2)xTQxt(t,1,Fx)Qr2+k.

Frequently Q has the structure

Q=I+FTF

where I is the identity matrix, so

xTQx=xTx+xTFTFx=x22+Fx22

and hence

(t,1,[IF]x)Qr2+n+k

is a conic quadratic representation of (3.5) in this case.

3.2.4 Second-order cones

A second-order cone is occasionally specified as

(3.7)Ax+b2cTx+d

where ARm×n and cRn. The formulation (3.7) is simply

(3.8)(cTx+d,Ax+b)Qm+1

or equivalently

(3.9)s=Ax+b,t=cTx+d,(t,s)Qm+1.

As will be explained in Sec. 8 (Duality in conic optimization), we refer to (3.8) as the dual form and (3.9) as the primal form. An alternative characterization of (3.7) is

(3.10)Ax+b22(cTx+d)20,cTx+d0

which shows that certain quadratic inequalities are conic quadratic representable.

3.2.5 Simple sets involving power functions

Some power-like inequalities are conic quadratic representable, even though it need not be obvious at first glance. For example, we have

|t|x,x0(x,1/2,t)Qr3,

or in a similar fashion

t1x,x0(x,t,2)Qr3.

For a more complicated example, consider the constraint

tx3/2,x0.

This is equivalent to a statement involving two cones and an extra variable

(s,t,x),(x,1/8,s)Qr3

because

2stx2, 218xs2,4s2t214xx4s2tx3/2.

In practice power-like inequalities representable with similar tricks can often be expressed much more naturally using the power cone (see Sec. 4 (The power cones)), so we will not dwell on these examples much longer.

3.2.6 Rational functions

A general non-degenerate rational function of one variable f(x)=ax+bgx+h can always be written in the form f(x)=p+qgx+h, and when q>0 it is convex on the set where gx+h>0. The conic quadratic model of

tp+qgx+h,gx+h>0,q>0

with variables t,x can be written as:

(tp,gx+h,2q)Qr3.

We can generalize it to a quadratic-over-linear function f(x)=ax2+bx+cgx+h. By performing long polynomial division we can write it as f(x)=rx+s+qgx+h, and as before it is convex when q>0 on the set where gx+h>0. In particular

trx+s+qgx+h,gx+h>0,q>0

can be written as

(trxs,gx+h,2q)Qr3.

In both cases the argument boils down to the observation that the target function is a sum of an affine expression and the inverse of a positive affine expression, see Sec. 3.2.5 (Simple sets involving power functions).

3.2.7 Harmonic mean

Consider next the hypograph of the harmonic mean,

(1ni=1nxi1)1t0,x0.

It is not obvious that the inequality defines a convex set, nor that it is conic quadratic representable. However, we can rewrite it in the form

i=1nt2xint,

which suggests the conic representation:

(3.11)2xizit2,xi,zi0,2i=1nzi=nt.

Alternatively, in case n=2, the hypograph of the harmonic mean can be represented by a single quadratic cone:

(3.12)(x1+x2t/2,x1,x2,t/2)Q4.

As this representation generalizes the harmonic mean to all points x1+x20, as implied by (3.12), the variable bounds x0 and t0 have to be added seperately if so desired.

3.2.8 Quadratic forms with one negative eigenvalue

Assume that ARn×n is a symmetric matrix with exactly one negative eigenvalue, i.e., A has a spectral factorization (i.e., eigenvalue decomposition)

A=QΛQT=α1q1q1T+i=2nαiqiqiT,

where QTQ=I, Λ=Diag(α1,α2,,αn), αi0. Then

xTAx0

is equivalent to

(3.13)j=2nαj(qjTx)2α1(q1Tx)2.

This shows xTAx0 to be a union of two convex subsets, each representable by a quadratic cone. For example, assuming q1Tx0, we can characterize (3.13) as

(3.14)(α1q1Tx,α2q2Tx,,αnqnTx)Qn.

3.2.9 Ellipsoidal sets

The set

E={xRnP(xc)21}

describes an ellipsoid centred at c. It has a natural conic quadratic representation, i.e., xE if and only if

xE(1,P(xc))Qn+1.

3.3 Conic quadratic case studies

3.3.1 Quadratically constrained quadratic optimization

A general convex quadratically constrained quadratic optimization problem can be written as

(3.15)minimize(1/2)xTQ0x+c0Tx+r0subject to(1/2)xTQix+ciTx+ri0,i=1,,p,

where all QiRn×n are symmetric positive semidefinite. Let

Qi=FiTFi,i=0,,p,

where FiRki×n. Using the formulations in Sec. 3.2.3 (Convex quadratic sets) we then get an equivalent conic quadratic problem

(3.16)minimizet0+c0Tx+r0subject toti+ciTx+ri=0,i=1,,p,(ti,1,Fix)Qrki+2,i=0,,p.

Assume next that ki, the number of rows in Fi, is small compared to n. Storing Qi requires about n2/2 space whereas storing Fi then only requires nki space. Moreover, the amount of work required to evaluate xTQix is proportional to n2 whereas the work required to evaluate xTFiTFix=Fix2 is proportional to nki only. In other words, if Qi have low rank, then (3.16) will require much less space and time to solve than (3.15). We will study the reformulation (3.16) in much more detail in Sec. 10 (Quadratic optimization).

3.3.2 Robust optimization with ellipsoidal uncertainties

Often in robust optimization some of the parameters in the model are assumed to be unknown exactly, but there is a simple set describing the uncertainty. For example, for a standard linear optimization problem we may wish to find a robust solution for all objective vectors c in an ellipsoid

E={cRnc=Fy+g,y21}.

A common approach is then to optimize for the worst-case scenario for c, so we get a robust version

(3.17)minimizesupcEcTxsubject toAx=b,x0.

The worst-case objective can be evaluated as

supcEcTx=gTx+supy21yTFTx=gTx+FTx2

where we used that supu21vTu=(vTv)/v2=v2. Thus the robust problem (3.17) is equivalent to

minimizegTx+FTx2subject toAx=b,x0,

which can be posed as a conic quadratic problem

(3.18)minimizegTx+tsubject toAx=b,(t,FTx)Qn+1,x0.

3.3.3 Markowitz portfolio optimization

In classical Markowitz portfolio optimization we consider investment in n stocks or assets held over a period of time. Let xi denote the amount we invest in asset i, and assume a stochastic model where the return of the assets is a random variable r with known mean

μ=Er

and covariance

Σ=E(rμ)(rμ)T.

The return of our investment is also a random variable y=rTx with mean (or expected return)

Ey=μTx

and variance (or risk)

(yEy)2=xTΣx.

We then wish to rebalance our portfolio to achieve a compromise between risk and expected return, e.g., we can maximize the expected return given an upper bound γ on the tolerable risk and a constraint that our total investment is fixed,

(3.19)maximizeμTxsubject toxTΣxγeTx=1x0.

Suppose we factor Σ=GGT (e.g., using a Cholesky or a eigenvalue decomposition). We then get a conic formulation

(3.20)maximizeμTxsubject to(γ,GTx)Qn+1eTx=1x0.

In practice both the average return and covariance are estimated using historical data. A recent trend is then to formulate a robust version of the portfolio optimization problem to combat the inherent uncertainty in those estimates, e.g., we can constrain μ to an ellipsoidal uncertainty set as in Sec. 3.3.2 (Robust optimization with ellipsoidal uncertainties).

It is also common that the data for a portfolio optimization problem is already given in the form of a factor model Σ=FTF of Σ=I+FTF and a conic quadratic formulation as in Sec. 3.3.1 (Quadratically constrained quadratic optimization) is most natural. For more details see Sec. 10.3 (Example: Factor model).

3.3.4 Maximizing the Sharpe ratio

Continuing the previous example, the Sharpe ratio defines an efficiency metric of a portfolio as the expected return per unit risk, i.e.,

S(x)=μTxrf(xTΣx)1/2,

where rf denotes the return of a risk-free asset. We assume that there is a portfolio with μTx>rf, so maximizing the Sharpe ratio is equivalent to minimizing 1/S(x). In other words, we have the following problem

minimizeGTxμTxrfsubject toeTx=1,x0.

The objective has the same nature as a quotient of two affine functions we studied in Sec. 2.2.5 (Homogenization). We reformulate the problem in a similar way, introducing a scalar variable z0 and a variable transformation

y=zx.

Since a positive z can be chosen arbitrarily and (μrfe)Tx>0, we can without loss of generality assume that

(μrfe)Ty=1.

Thus, we obtain the following conic problem for maximizing the Sharpe ratio,

minimizetsubject to(t,GTy)Qk+1,eTy=z,(μrfe)Ty=1,y,z0,

and we recover x=y/z.

3.3.5 A resource constrained production and inventory problem

The resource constrained production and inventory problem [Zie82] can be formulated as follows:

(3.21)minimizej=1n(djxj+ej/xj)subject toj=1nrjxjb,xj0,j=1,,n,

where n denotes the number of items to be produced, b denotes the amount of common resource, and rj is the consumption of the limited resource to produce one unit of item j. The objective function represents inventory and ordering costs. Let cjp denote the holding cost per unit of product j and cjr denote the rate of holding costs, respectively. Further, let

dj=cjpcjr2

so that

djxj

is the average holding costs for product j. If Dj denotes the total demand for product j and cjo the ordering cost per order of product j then let

ej=cjoDj

and hence

ejxj=cjoDjxj

is the average ordering costs for product j. In summary, the problem finds the optimal batch size such that the inventory and ordering cost are minimized while satisfying the constraints on the common resource. Given dj,ej0 problem (3.21) is equivalent to the conic quadratic problem

minimizej=1n(djxj+ejtj)subject toj=1nrjxjb,(tj,xj,2)Qr3,j=1,,n.

It is not always possible to produce a fractional number of items. In such case xj should be constrained to be integers. See Sec. 9 (Mixed integer optimization).