5. Quadratic optimization

5.1. Introduction

In this chapter we discuss convex quadratic optimization. Our discussion is fairly brief compared to the previous chapters, for three reasons: (i) quadratic optimization is a special case of conic quadratic optimization, (ii) most problems are solved more efficiently by the solver when posed in conic form, and (iii) duality theory (including infeasibility certificates) is much simpler for conic quadratic optimization. Therefore, we generally recommend a conic quadratic formulation.

5.2. Convex quadratic optimization

A standard (convex) quadratic optimization problem

(1)\[\begin{split}\begin{array}{ll} \mbox{minimize} & \half x^T Q x + c^T x\\ \mbox{subject to} & Ax = b\\ & x \geq 0, \end{array}\end{split}\]

with \(Q\in\PSD^n\) is conceptually a simple extension of a standard linear optimization problem with a quadratic term \(x^T Q x\). Note the important requirement that \(Q\) is symmetric positive semidefinite; otherwise the objective function is not convex.
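The positive semidefiniteness requirement is easy to check numerically, e.g., via an eigenvalue decomposition. A minimal sketch in Python (the helper `is_psd` and the test matrices are illustrative, not part of any solver API):

```python
import numpy as np

def is_psd(Q, tol=1e-10):
    """Check that Q is symmetric positive semidefinite."""
    if not np.allclose(Q, Q.T):  # Q must be symmetric
        return False
    # eigvalsh returns the (real) eigenvalues of a symmetric matrix;
    # PSD means none of them is significantly negative.
    return bool(np.min(np.linalg.eigvalsh(Q)) >= -tol)

Q_psd = np.array([[1.0, 0.5], [0.5, 1.0]])    # eigenvalues 0.5 and 1.5
Q_indef = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues -1 and 3
```

For large sparse matrices, attempting a Cholesky factorization is usually cheaper than a full eigendecomposition.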

5.2.1. Geometry of quadratic optimization

Quadratic optimization has a simple geometric interpretation: we minimize a convex quadratic function over a polyhedron. In Fig. 5.1 we illustrate this interpretation for the same example as in Fig. 2.5, extended with an additional quadratic term defined by

\[\begin{split}Q = \left[\begin{array}{cc}1 & \half\\ \half & 1\end{array}\right].\end{split}\]

It is intuitively clear that the following different cases can occur:

  • The optimal solution \(x^\star\) is at the boundary of the polyhedron (as shown in Fig. 5.1). At \(x^\star\) one of the hyperplanes is tangential to an ellipsoidal level curve.
  • The optimal solution is inside the polyhedron; this occurs if the unconstrained minimizer \(\arg\min_x \half x^T Q x + c^T x=-Q^\dagger c\) (i.e., the center of the ellipsoidal level curves) is inside the polyhedron.
  • If the polyhedron is unbounded in the opposite direction of \(c\), and if the ellipsoidal level curves are degenerate in that direction (i.e., \(Qc=0\)), then the problem is unbounded. If \(Q\in \mathcal{S}_{++}^n\), then the problem cannot be unbounded.
  • The problem is infeasible, i.e., \(\{ x \mid Ax=b, \: x\geq 0 \}=\emptyset\).
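For the second case, the unconstrained minimizer can be computed directly with a pseudoinverse. A small sketch using the \(Q\) from Fig. 5.1 and a hypothetical linear term \(c\) (not taken from the figure):

```python
import numpy as np

Q = np.array([[1.0, 0.5], [0.5, 1.0]])  # the Q from Fig. 5.1
c = np.array([-1.0, -1.0])              # hypothetical linear term

# The unconstrained minimizer of (1/2) x^T Q x + c^T x is
# x* = -pinv(Q) c; it satisfies the stationarity condition Q x* + c = 0.
x_star = -np.linalg.pinv(Q) @ c
```

If this point happens to lie inside the polyhedron, it is also the constrained optimum.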

Fig. 5.1 Geometric interpretation of quadratic optimization. At the optimal point \(x^\star\) the hyperplane \(\{x\mid a_1^T x=b\}\) is tangential to an ellipsoidal level curve.

Possibly because of its simple geometric interpretation and similarity to linear optimization, quadratic optimization has been more widely adopted by optimization practitioners than conic quadratic optimization.

5.2.2. Duality in quadratic optimization

The Lagrangian function for (1) is

(2)\[ L(x, y, s) = \half x^T Q x + x^T(c - A^Ty - s) + b^T y\]

with Lagrange multipliers \(y\in\R^m\) and \(s\in \R^n_+\), and from \(\nabla_x L(x,y,s)=0\) we get the necessary first-order optimality condition

\[Qx = A^Ty + s - c,\]

i.e., \((A^Ty+s-c)\in \mathcal{R}(Q)\). Then

\[\arg \min_x L(x,y,s) = Q^\dagger(A^Ty+s-c),\]

which can be substituted into (2) to give a dual function

\[\begin{split}g(y,s) = \left\{ \begin{array}{ll} b^T y - \half(A^T y + s - c)^T Q^\dagger (A^T y + s -c), & (A^T y + s - c) \in \mathcal{R}(Q)\\ -\infty, & \text{otherwise}. \end{array} \right.\end{split}\]

Thus we get a dual problem

(3)\[\begin{split}\begin{array}{ll} \mbox{maximize} & b^T y - \half(A^T y + s - c)^T Q^\dagger (A^T y + s - c)\\ \mbox{subject to} & (A^T y + s - c) \in \mathcal{R}(Q)\\ & s \geq 0. \end{array}\end{split}\]

Alternatively, we can characterize the constraint \((A^T y + s - c) \in \mathcal{R}(Q)\) as \(Qw = A^T y + s - c\) with an auxiliary variable \(w\) to get an equivalent dual problem

(4)\[\begin{split}\begin{array}{ll} \mbox{maximize} & b^T y - \half w^T Q w\\ \mbox{subject to} & Qw = A^T y - c + s\\ & s \geq 0. \end{array}\end{split}\]

Note from the optimality conditions that \(w=x\), so (4) is an unusual dual problem in the sense that it involves both primal and dual variables.

Weak duality follows from the inner product between \(x\) and the dual equality constraint,

\[x^T Q x = b^T y -c^T x + x^T s,\]

which shows that

\[\half x^T Q x + c^T x - \left(b^Ty - \half x^T Q x\right) = x^T s \geq 0.\]

Furthermore, strong duality holds if a Slater constraint qualification is satisfied, i.e., if either the primal or dual problem is strictly feasible (as in the previous chapters).
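The weak-duality identity above can be verified numerically on a small instance. In the sketch below (hypothetical data), \(x\) is primal feasible and \((y,s)\) with \(w=x\) is dual feasible by construction:

```python
import numpy as np

# A small instance of (1) with hypothetical data, chosen so that x is
# primal feasible and (y, s) with w = x is dual feasible by construction.
Q = np.array([[2.0, 0.0], [0.0, 2.0]])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 1.0])

x = np.array([0.5, 0.5])     # Ax = b, x >= 0
y = np.array([0.0])
s = Q @ x - A.T @ y + c      # then Qw = A^T y - c + s holds with w = x
assert np.all(s >= 0)        # dual feasibility

primal = 0.5 * x @ Q @ x + c @ x
dual = b @ y - 0.5 * x @ Q @ x   # objective of (4) with w = x
gap = primal - dual              # equals x^T s, hence nonnegative
```

At an optimal primal-dual pair the gap \(x^T s\) would be zero.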

5.2.3. Infeasibility in quadratic optimization

Feasibility of the primal problem (1) is described by Lemma 6: either problem (1) is feasible or there exists a certificate \(y\) of primal infeasibility satisfying

\[A^T y \leq 0, \qquad b^T y > 0.\]
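Such a certificate is cheap to verify. A small sketch for a trivially infeasible instance (hypothetical data), where \(x_1+x_2=-1\) clearly has no nonnegative solution:

```python
import numpy as np

# A trivially infeasible instance of {x : Ax = b, x >= 0}:
# x1 + x2 = -1 has no nonnegative solution.
A = np.array([[1.0, 1.0]])
b = np.array([-1.0])

y = np.array([-1.0])         # certificate of primal infeasibility
assert np.all(A.T @ y <= 0)  # A^T y <= 0
assert b @ y > 0             # b^T y > 0
```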

In order to characterize feasibility of the dual problem, we need another pair of alternatives, which follows directly from Lemma 7 by considering

\[\begin{split}A' : = \left(\begin{array}{c}A\\-Q\end{array}\right).\end{split}\]

Either there exists an \(x\geq 0\) satisfying

\[Ax = 0, \qquad Qx = 0, \qquad c^T x < 0\]

or there exists \((y,w)\) satisfying

\[Qw - A^Ty + c \geq 0.\]

5.3. Quadratically constrained optimization

A general convex quadratically constrained quadratic optimization problem is

(5)\[\begin{split}\begin{array}{ll} \mbox{minimize} & \half x^T Q_0 x + c_0^T x + r_0\\ \mbox{subject to} & \half x^T Q_i x + c_i^T x + r_i \leq 0, \quad i=1,\dots,m, \end{array}\end{split}\]

where \(Q_i\in \mathcal{S}_+^n, \: \forall i\). This corresponds to minimizing a convex quadratic function over an intersection of ellipsoids (or affine halfspaces if some of \(Q_i\) are zero). Note the important requirement \(Q_i\succeq 0,\: \forall i\), so that the objective function is convex and the constraints

\[\half x^T Q_i x + c_i^T x + r_i \leq 0\]

characterize convex sets. It is important to note that neither quadratic equalities

\[\half x^T Q_i x + c_i^T x + r_i = 0\]

nor inequalities of the form

\[\half x^T Q_i x + c_i^T x + r_i \geq 0\]

characterize convex sets (unless \(Q_i=0\)), and therefore cannot be included.

5.3.1. Duality in quadratically constrained optimization

The Lagrangian function for (5) is

\[L(x,\lambda) = \half x^T Q_0 x + c_0^Tx + r_0 + \sum_{i=1}^m \lambda_i \left( \half x^T Q_i x + c_i^Tx + r_i \right) = \half x^T Q(\lambda) x + c(\lambda)^T x + r(\lambda),\]

where

\[Q(\lambda)=Q_0 + \sum_{i=1}^m \lambda_i Q_i, \quad c(\lambda)=c_0 + \sum_{i=1}^m \lambda_i c_i, \quad r(\lambda)=r_0 + \sum_{i=1}^m \lambda_i r_i.\]

From the Lagrangian we get the first-order optimality conditions

(6)\[Q(\lambda) x = -c(\lambda),\]

and similar to the case of quadratic optimization we get a dual problem

(7)\[\begin{split}\begin{array}{ll} \mbox{maximize} & -\half c(\lambda)^T Q(\lambda)^\dagger c(\lambda) + r(\lambda)\\ \mbox{subject to} & c(\lambda) \in \mathcal{R}\left( Q(\lambda) \right)\\ & \lambda \geq 0, \end{array}\end{split}\]

or equivalently

(8)\[\begin{split}\begin{array}{ll} \mbox{maximize} & -\half w^T Q(\lambda) w + r(\lambda)\\ \mbox{subject to} & Q(\lambda) w = -c(\lambda)\\ & \lambda \geq 0. \end{array}\end{split}\]

Weak duality is easily verified: from (6) we have

\[x^T Q(\lambda) x + c(\lambda)^T x = 0\]

which implies

\[\half x^T Q_0 x + c_0^T x + r_0 -\left(-\half x^T Q(\lambda) x + r(\lambda) \right) = -\sum_{i=1}^m \lambda_i\left( \half x^T Q_i x + c_i^T x + r_i\right) \geq 0,\]

and strong duality holds provided the quadratic inequalities are strictly feasible,

\[\half x^T Q_i x + c_i^T x + r_i < 0, \quad i=1,\dots,m\]

for some \(x\in\R^n\). Using a generalized version of the Schur complement lemma for singular matrices, we can also write (7) as an equivalent semidefinite optimization problem,

(9)\[\begin{split}\begin{array}{ll} \mbox{maximize} & t\\ \mbox{subject to} & \left [ \begin{array}{cc}2(r(\lambda)-t) & c(\lambda)^T\\ c(\lambda) & Q(\lambda)\end{array} \right]\succeq 0\\ & \lambda \geq 0. \end{array}\end{split}\]
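The Schur complement equivalence can be checked numerically for a fixed \(\lambda\). In the sketch below (hypothetical values, with \(Q(\lambda)\succ 0\) so that \(Q(\lambda)^\dagger=Q(\lambda)^{-1}\)), the block matrix of (9) is positive semidefinite precisely when \(t\) does not exceed the dual objective value of (7):

```python
import numpy as np

# Fixed hypothetical values of Q(lambda), c(lambda), r(lambda),
# with Q(lambda) positive definite so that Q^dagger = Q^{-1}.
Qlam = np.array([[2.0, 0.0], [0.0, 1.0]])
clam = np.array([1.0, -1.0])
rlam = 1.0

# Dual objective value of (7) at this lambda.
t_max = rlam - 0.5 * clam @ np.linalg.inv(Qlam) @ clam

def schur_block(t):
    """The block matrix of (9) for a given t."""
    return np.block([[np.array([[2.0 * (rlam - t)]]), clam[None, :]],
                     [clam[:, None], Qlam]])

def is_psd(M, tol=1e-10):
    return np.min(np.linalg.eigvalsh(M)) >= -tol
```

Here `is_psd(schur_block(t))` holds exactly for \(t\leq t_{\max}\), mirroring the maximization of \(t\) in (9).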

5.3.2. Infeasibility in quadratically constrained optimization

Feasibility of (5) is characterized by the following lemma.

Given \(Q_i\succeq 0\), \(i=1,\dots,m\), either there exists an \(x\in \R^n\) satisfying

(10)\[\half x^T Q_i x + c_i^T x + r_i < 0, \quad i=1,\dots,m\]

or there exists \(\lambda \geq 0, \: \lambda \neq 0\) satisfying

(11)\[\begin{split}\sum_{i=1}^m \lambda_i \left[ \begin{array}{cc} 2r_i & c_i^T \\ c_i & Q_i \end{array} \right] \succeq 0.\end{split}\]

Suppose (10) and (11) are both satisfied. Then

\[\begin{split}\sum_{i=1}^m \lambda_i \left[\begin{array}{c}1\\ x\end{array}\right]^T \left[ \begin{array}{cc} 2r_i & c_i^T \\ c_i & Q_i \end{array} \right] \left[\begin{array}{c}1\\ x\end{array}\right] = 2\sum_{i=1}^m\lambda_i \left(\half x^T Q_i x + c_i^T x + r_i \right) \geq 0,\end{split}\]

which is a contradiction. Conversely, assume (11) is not satisfied, i.e.,

\[\begin{split}\sum_{i=1}^m\lambda_i \left[ \begin{array}{cc} 2r_i & c_i^T \\ c_i & Q_i \end{array} \right] \not\succeq 0, \quad \forall \lambda \geq 0, \: \lambda \neq 0.\end{split}\]

But then \(X=\left( \begin{smallmatrix}1 & x^T\\ x & xx^T\end{smallmatrix} \right)\) satisfies

\[\begin{split}\left \langle \sum_{i=1}^m \lambda_i \left[ \begin{array}{cc} 2r_i & c_i^T \\ c_i & Q_i \end{array} \right] , \left[\begin{array}{c}1\\ x\end{array}\right] \left[\begin{array}{c}1\\ x\end{array}\right]^T \right \rangle < 0, \quad \forall \lambda \geq 0, \: \lambda \neq 0\end{split}\]

or equivalently

\[2\sum_{i=1}^m\lambda_i \left(\half x^T Q_i x + c_i^T x + r_i \right) < 0, \quad \forall \lambda \geq 0, \: \lambda \neq 0,\]

which in turn implies (10).

Dual infeasibility is characterized by another pair of alternatives: either problem (8) is feasible or there exists a certificate \(x\) of dual infeasibility satisfying (13).

Given \(Q_i\succeq 0\), \(i=0,\dots,m\), either there exists \((\lambda,w)\) satisfying

(12)\[Q(\lambda) w + c(\lambda) = 0, \qquad \lambda \geq 0,\]

or there exists \(x\in \R^n\) satisfying

(13)\[Q_0 x = 0, \qquad c_0^T x < 0, \qquad Q_ix = 0, \qquad c_i^T x = 0, \quad i=1,\dots, m.\]

Both cannot be true, because then

\[c_0^T x = -x^T(Q_0 + \sum_{i=1}^m \lambda_i Q_i) w - \sum_{i=1}^m \lambda_i c_i^T x = 0,\]

which is a contradiction. On the other hand, suppose

\[Q(\lambda) w + c(\lambda) \neq 0, \quad \forall w, \forall \lambda \geq 0,\]

i.e., \(c(\lambda)\notin\mathcal{R}(Q(\lambda))\), so \(Q(\lambda)\) is singular for all \(\lambda\geq 0\). In particular \(Q_0+\sum_{i=1}^m Q_i\) is singular, and since all \(Q_i\succeq 0\), any nonzero \(x\) in its nullspace satisfies \(Q_ix=0, \: i=0,\dots,m\).
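The certificate (13) can be illustrated on a small hypothetical instance in which \(c(\lambda)\) always has a component outside \(\mathcal{R}(Q(\lambda))\):

```python
import numpy as np

# Hypothetical QCQP data (m = 1): the second component of c(lambda)
# is always -1, while range(Q(lambda)) is spanned by the first
# coordinate axis, so (12) has no solution.
Q0, c0 = np.diag([1.0, 0.0]), np.array([0.0, -1.0])
Q1, c1 = np.diag([1.0, 0.0]), np.array([1.0, 0.0])

x = np.array([0.0, 1.0])  # candidate certificate for (13)
assert np.allclose(Q0 @ x, 0) and np.allclose(Q1 @ x, 0)
assert c0 @ x < 0 and np.isclose(c1 @ x, 0)

# Spot-check that Q(lambda) w = -c(lambda) is unsolvable for a few lambda:
for lam in [0.0, 1.0, 10.0]:
    Qlam, clam = Q0 + lam * Q1, c0 + lam * c1
    w = np.linalg.lstsq(Qlam, -clam, rcond=None)[0]
    assert not np.allclose(Qlam @ w + clam, 0)
```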

5.4. Separable quadratic optimization

Often a factorization \(Q=F^T F\) is either known or readily available, in which case we get an alternative formulation of (1) as

(14)\[\begin{split}\begin{array}{ll} \mbox{minimize} & \half z^T z + c^T x\\ \mbox{subject to} & Ax = b\\ & Fx = z\\ & x \geq 0. \end{array}\end{split}\]

Formulation (14) has several advantages: convexity of the problem is obvious (which can occasionally be difficult to detect in finite-precision arithmetic), and the structure and sparsity of the separable reformulation are typically exploited more efficiently by an optimization solver. In the following example we consider a related reformulation, which can result in a significant reduction of complexity and solution time (see also Sec. 3.3.3).

Consider a standard quadratic optimization problem (1) where

\[Q = \mathbf{diag}(w) + F^T F, \quad w > 0.\]

We then get a separable problem

\[\begin{split}\begin{array}{ll} \mbox{minimize} & \half x^T \mathbf{diag}(w) x + \half z^T z + c^T x\\ \mbox{subject to} & Ax = b\\ & Fx = z\\ & x \geq 0. \end{array}\end{split}\]

In general we can reformulate (5) as

(15)\[\begin{split}\begin{array}{lll} \mbox{minimize} & \half z_0^T z_0 + c_0^T x + r_0\\ \mbox{subject to} & \half z_i^T z_i + c_i^T x + r_i \leq 0, & i=1,\dots,m\\ & F_i x = z_i, & i=0,\dots,m,\\ \end{array}\end{split}\]

which is very similar to the conic formulation found in Sec. 3.3.1. If the formulation (15) is sparse compared to (5) (i.e., if it is described using fewer non-zeros), then it generally results in a reduction of computation time. On the other hand, if we form (15), we might as well use the conic formulation instead.