6 Semidefinite optimization
In this chapter we extend the conic optimization framework introduced before with symmetric positive semidefinite matrix variables.
6.1 Introduction to semidefinite matrices

6.1.1 Semidefinite matrices and cones

A symmetric matrix $X \in \mathcal{S}^n$ is called symmetric positive semidefinite if

$$z^T X z \geq 0, \quad \forall z \in \mathbb{R}^n.$$

We then define the cone of symmetric positive semidefinite matrices as

$$\mathcal{S}^n_+ = \{ X \in \mathcal{S}^n \mid z^T X z \geq 0, \; \forall z \in \mathbb{R}^n \}. \qquad (6.1)$$

For brevity we will often use the shorter notion semidefinite instead of symmetric positive semidefinite, and we will write $X \succeq Y$ (or $X \preceq Y$) as shorthand notation for $(X - Y) \succeq 0$ (or $(X - Y) \preceq 0$).

It is easy to see that (6.1) indeed specifies a convex cone; it is pointed (with origin $X = 0$), and $X, Y \in \mathcal{S}^n_+$ implies that $(\alpha X + \beta Y) \in \mathcal{S}^n_+$ for all $\alpha, \beta \geq 0$. Every symmetric matrix $A$ has a spectral factorization

$$A = \sum_{i=1}^n \lambda_i q_i q_i^T,$$

where $q_i \in \mathbb{R}^n$ are the (orthonormal) eigenvectors of $A$ and $\lambda_i$ are the corresponding eigenvalues. Using the spectral factorization we have

$$x^T A x = \sum_{i=1}^n \lambda_i (q_i^T x)^2,$$

which shows that $A \succeq 0$ if and only if all eigenvalues are nonnegative, i.e.,

$$\mathcal{S}^n_+ = \{ X \in \mathcal{S}^n \mid \lambda_i(X) \geq 0, \; i = 1, \dots, n \}. \qquad (6.2)$$

Another useful characterization is that $A \succeq 0$ if and only if it is a Grammian matrix, i.e., $A = V^T V$ for some matrix $V$. Using the Grammian representation we have

$$x^T A x = x^T V^T V x = \| V x \|_2^2,$$

i.e., if $A = V^T V$ then $A \succeq 0$; conversely, the spectral factorization provides such a factor, $V = \Lambda^{1/2} Q^T$, so

$$\mathcal{S}^n_+ = \{ X \in \mathcal{S}^n \mid X = V^T V \text{ for some } V \}. \qquad (6.3)$$

In a completely analogous way we define the cone of symmetric positive definite matrices as

$$\mathcal{S}^n_{++} = \{ X \in \mathcal{S}^n \mid z^T X z > 0, \; \forall z \neq 0 \}, \qquad (6.4)$$

and we write $X \succ Y$ (or $X \prec Y$) as shorthand notation for $(X - Y) \succ 0$ (or $(X - Y) \prec 0$).
The one-dimensional cone $\mathcal{S}^1_+$ simply corresponds to $\mathbb{R}_+$. Similarly consider the matrix

$$X = \begin{bmatrix} x_1 & x_3 \\ x_3 & x_2 \end{bmatrix}$$

with determinant $\det(X) = x_1 x_2 - x_3^2$. Then $X \succeq 0$ if and only if

$$x_1 \geq 0, \quad x_2 \geq 0, \quad x_1 x_2 \geq x_3^2,$$

which characterizes a three-dimensional scaled rotated cone, i.e.,

$$\begin{bmatrix} x_1 & x_3 \\ x_3 & x_2 \end{bmatrix} \succeq 0 \iff (x_1, x_2, x_3 \sqrt{2}) \in \mathcal{Q}^3_r.$$

More generally we have

$$x \in \mathbb{R}^n_+ \iff \operatorname{Diag}(x) \in \mathcal{S}^n_+$$

and

$$(t, x) \in \mathcal{Q}^{n+1} \iff \begin{bmatrix} t & x^T \\ x & t I \end{bmatrix} \in \mathcal{S}^{n+1}_+,$$

where the latter equivalence follows immediately from Lemma 6.1. Thus both the linear and quadratic cones are embedded in the semidefinite cone. In practice, however, linear and quadratic cones should never be described using semidefinite constraints, as that would incur a large performance penalty from squaring the number of variables.
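To make the eigenvalue characterization (6.2) and the quadratic-cone embedding above concrete, here is a minimal numerical sketch using numpy; the helper name is_psd and the test data are illustrative choices, not part of the text.

```python
import numpy as np

def is_psd(X, tol=1e-9):
    # X is symmetric PSD iff all eigenvalues are (numerically) nonnegative, cf. (6.2).
    return bool(np.all(np.linalg.eigvalsh(X) >= -tol))

x = np.array([1.0, 2.0, -2.0])
t = 3.5                                  # any t >= ||x||_2 = 3 should work
M = np.block([[np.array([[t]]), x[None, :]],
              [x[:, None],      t * np.eye(3)]])
# membership of (t, x) in Q^4 agrees with semidefiniteness of the arrow matrix
print(is_psd(M), t >= np.linalg.norm(x))  # True True
```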
As a more interesting example, consider the symmetric matrix

$$A(x, y, z) = \begin{bmatrix} 1 & x & y \\ x & 1 & z \\ y & z & 1 \end{bmatrix}$$

parametrized by $(x, y, z)$. The set

$$S = \{ (x, y, z) \in \mathbb{R}^3 \mid A(x, y, z) \succeq 0 \}$$

(shown in Fig. 6.1) is called a spectrahedron and is perhaps the simplest bounded semidefinite representable set, which cannot be represented using (finitely many) linear or quadratic cones. To gain a geometric intuition of $S$, we note that

$$\det(A(x, y, z)) = 1 - x^2 - y^2 - z^2 + 2xyz,$$

so the boundary of $S$ can be characterized as

$$x^2 + y^2 + z^2 - 2xyz = 1,$$

or equivalently as

$$\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 1 & z \\ z & 1 \end{bmatrix}^{-1} \begin{bmatrix} x \\ y \end{bmatrix} = 1.$$

For fixed $z \in (-1, 1)$ this describes an ellipse in $(x, y)$; for $z = \pm 1$ the cross-section degenerates to the line segment $x = \pm y$.

Fig. 6.1 Plot of the spectrahedron $S$.
6.1.2 Properties of semidefinite matrices
Many useful properties of (semi)definite matrices follow directly from the definitions (6.1)-(6.3) and their definite counterparts.
- The diagonal elements of $A \in \mathcal{S}^n_+$ are nonnegative. Let $e_i$ denote the $i$th standard basis vector (i.e., $[e_i]_j = 0$ for $j \neq i$, $[e_i]_i = 1$). Then $A_{ii} = e_i^T A e_i$, so (6.1) implies that $A_{ii} \geq 0$.

- A block-diagonal matrix $A = \operatorname{Diag}(A_1, \dots, A_p)$ is (semi)definite if and only if each diagonal block $A_i$ is (semi)definite.

- Given a quadratic transformation $M := B^T A B$, we have $M \succ 0$ if and only if $A \succ 0$ and $B$ has full rank. This follows directly from the Grammian characterization $M = (VB)^T (VB)$. For $M \succeq 0$ we only require that $A \succeq 0$. As an example, if $A$ is (semi)definite then so is any permutation $P^T A P$.

- Any principal submatrix of $A \in \mathcal{S}^n_+$ ($A$ restricted to the same set of rows as columns) is positive semidefinite; this follows by restricting the Grammian characterization $A = V^T V$ to a submatrix of $V$.

- The inner product of positive (semi)definite matrices is positive (nonnegative). For any $A, B \in \mathcal{S}^n_{++}$ let $A = U^T U$ and $B = V^T V$, where $U$ and $V$ have full rank. Then

  $$\langle A, B \rangle = \operatorname{tr}(U^T U V^T V) = \| U V^T \|_F^2 > 0,$$

  where strict positivity follows from the assumption that $U$ has full column-rank, i.e., $U V^T \neq 0$.

- The inverse of a positive definite matrix is positive definite. This follows from the positive spectral factorization $A = Q \Lambda Q^T$, which gives us

  $$A^{-1} = Q \Lambda^{-1} Q^T,$$

  where $\Lambda_{ii} > 0$. If $A$ is semidefinite then the pseudo-inverse $A^\dagger$ of $A$ is semidefinite.

- Consider a matrix $X \in \mathcal{S}^n$ partitioned as

  $$X = \begin{bmatrix} A & B^T \\ B & C \end{bmatrix}.$$

  Let us find necessary and sufficient conditions for $X \succ 0$. We know that $A \succ 0$ and $C \succ 0$ (since any principal submatrix must be positive definite). Furthermore, we can simplify the analysis using a nonsingular transformation

  $$L = \begin{bmatrix} I & 0 \\ F & I \end{bmatrix}$$

  to diagonalize $X$ as $L X L^T = D$, where $D$ is block-diagonal. Note that $\det(L) = 1$, so $L$ is indeed nonsingular. Then $X \succ 0$ if and only if $D \succ 0$. Expanding $L X L^T = D$, we get

  $$\begin{bmatrix} A & A F^T + B^T \\ F A + B & F A F^T + F B^T + B F^T + C \end{bmatrix} = \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix}.$$

  Since $A \succ 0$ (and hence nonsingular) we see that $F = -B A^{-1}$, and direct substitution gives us

  $$D_1 = A, \qquad D_2 = C - B A^{-1} B^T.$$
In the last part we have thus established the following useful result.
Lemma 6.1 (Schur complement). A symmetric matrix

$$X = \begin{bmatrix} A & B^T \\ B & C \end{bmatrix}$$

is positive definite if and only if

$$A \succ 0 \quad \text{and} \quad C - B A^{-1} B^T \succ 0.$$
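Lemma 6.1 is easy to check numerically; the following numpy sketch (with an arbitrary block partitioning of a random positive definite matrix, all choices ours) verifies that both sides of the equivalence agree.

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.standard_normal((5, 5))
X = V.T @ V + 0.1 * np.eye(5)            # positive definite by construction
A, B, C = X[:2, :2], X[2:, :2], X[2:, 2:]  # partition as in Lemma 6.1

lhs = bool(np.all(np.linalg.eigvalsh(X) > 0))
S = C - B @ np.linalg.inv(A) @ B.T        # the Schur complement of A in X
rhs = bool(np.all(np.linalg.eigvalsh(A) > 0)) and bool(np.all(np.linalg.eigvalsh(S) > 0))
print(lhs, rhs)                           # True True
```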
6.2 Semidefinite modeling
Having discussed different characterizations and properties of semidefinite matrices, we next turn to different functions and sets that can be modeled using semidefinite cones and variables. Most of those representations involve semidefinite matrix-valued affine functions, which we discuss next.
6.2.1 Linear matrix inequalities
Consider an affine matrix-valued mapping $A : \mathbb{R}^n \to \mathcal{S}^m$,

$$A(x) = A_0 + x_1 A_1 + \dots + x_n A_n.$$

A linear matrix inequality (LMI) is a constraint of the form

$$A_0 + x_1 A_1 + \dots + x_n A_n \succeq 0 \qquad (6.5)$$

in the variable $x \in \mathbb{R}^n$ with symmetric coefficient matrices $A_i \in \mathcal{S}^m$, $i = 0, \dots, n$. For example, the spectrahedron of Sec. 6.1.1 (Semidefinite matrices and cones) is described by the LMI $A(x, y, z) = A_0 + x A_1 + y A_2 + z A_3 \succeq 0$ with $A_0 = I$ and each $A_i = e_j e_k^T + e_k e_j^T$ for the appropriate index pair $(j, k)$.

Alternatively, we can describe the linear matrix inequality $A(x, y, z) \succeq 0$ as

$$X \in \mathcal{S}^3_+, \qquad X_{11} = X_{22} = X_{33} = 1,$$

i.e., as a semidefinite variable with fixed diagonal; these two alternative formulations illustrate the difference between primal and dual form of semidefinite problems, see Sec. 8.6 (Semidefinite duality and LMIs).
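As an illustration, the LMI form (6.5) can be passed almost verbatim to a modeling package. The sketch below assumes the CVXPY package and an SDP-capable solver (e.g. SCS) are installed; the objective $x + y + z$ is our arbitrary choice, which pushes the solution to the boundary point $(1, 1, 1)$ of the spectrahedron.

```python
import cvxpy as cp
import numpy as np

A0 = np.eye(3)
A1 = np.zeros((3, 3)); A1[0, 1] = A1[1, 0] = 1.0   # multiplies x
A2 = np.zeros((3, 3)); A2[0, 2] = A2[2, 0] = 1.0   # multiplies y
A3 = np.zeros((3, 3)); A3[1, 2] = A3[2, 1] = 1.0   # multiplies z

x = cp.Variable(3)
lmi = A0 + x[0] * A1 + x[1] * A2 + x[2] * A3 >> 0  # the LMI (6.5)
prob = cp.Problem(cp.Maximize(cp.sum(x)), [lmi])
prob.solve()
print(np.round(x.value, 3))                         # approx. (1, 1, 1)
```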
6.2.2 Eigenvalue optimization
Consider a symmetric matrix $A \in \mathcal{S}^m$ and let its eigenvalues be denoted by

$$\lambda_1(A) \geq \lambda_2(A) \geq \dots \geq \lambda_m(A).$$

A number of different functions of the eigenvalues $\lambda_i$ can then be described using a mix of linear and semidefinite constraints.

Sum of eigenvalues

The sum of the eigenvalues corresponds to the trace,

$$\sum_{i=1}^m \lambda_i(A) = \operatorname{tr}(A).$$
Largest eigenvalue

The largest eigenvalue can be characterized in epigraph form $\lambda_1(A) \leq t$ as

$$t I - A \succeq 0. \qquad (6.10)$$

To verify this, suppose we have a spectral factorization $A = Q \Lambda Q^T$, where $Q$ is orthogonal and $\Lambda$ is diagonal. Then (6.10) holds if and only if

$$Q^T (t I - A) Q = t I - \Lambda \succeq 0,$$

i.e., if and only if $t \geq \lambda_1(A)$. Thus we can minimize the largest eigenvalue of $A$ by minimizing $t$ subject to (6.10).
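A minimal sketch of largest-eigenvalue minimization via (6.10), assuming CVXPY; the data matrices A0 and A1 below are made up.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A0 = (B + B.T) / 2                       # symmetric data matrix
A1 = np.diag([1.0, -1.0, 2.0, 0.0])      # symmetric perturbation direction

x, t = cp.Variable(), cp.Variable()
A = A0 + x * A1
prob = cp.Problem(cp.Minimize(t), [t * np.eye(4) - A >> 0])  # the LMI (6.10)
prob.solve()
# at the optimum, t equals the largest eigenvalue of A(x):
print(t.value, np.linalg.eigvalsh(A0 + x.value * A1).max())
```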
Smallest eigenvalue

The smallest eigenvalue can be described in hypograph form $\lambda_m(A) \geq t$ as

$$A \succeq t I, \qquad (6.11)$$

i.e., we can maximize the smallest eigenvalue of $A$ by maximizing $t$ subject to (6.11).
Eigenvalue spread

The eigenvalue spread can be modeled in epigraph form

$$\lambda_1(A) - \lambda_m(A) \leq t \qquad (6.12)$$

by combining the two linear matrix inequalities in (6.10) and (6.11), i.e.,

$$z I \preceq A \preceq s I, \qquad s - z \leq t.$$
Spectral radius

The spectral radius $\rho(A) := \max_i |\lambda_i(A)|$ can be modeled in epigraph form $\rho(A) \leq t$ using two linear matrix inequalities,

$$-t I \preceq A \preceq t I.$$
Condition number of a positive definite matrix

Suppose now that $A \in \mathcal{S}^m_{++}$. The condition number $\kappa(A) = \lambda_1(A) / \lambda_m(A)$ of a positive definite matrix can be minimized by noting that $\kappa(A) \leq t$ if and only if there exists a $\mu > 0$ such that

$$\mu I \preceq A \preceq \mu t I,$$

or equivalently if and only if $I \preceq \mu^{-1} A \preceq t I$. If $A = A(x)$ is represented as in (6.5), then a change of variables $z := x / \mu$, $\nu := 1 / \mu$ leads to the problem

$$\begin{array}{ll} \text{minimize} & t \\ \text{subject to} & I \preceq \nu A_0 + z_1 A_1 + \dots + z_n A_n \preceq t I \end{array}$$

in the variables $(t, z, \nu)$, from which we recover the solution $x = z / \nu$.
6.2.3 Log-determinant
Consider again a symmetric positive-definite matrix $X \in \mathcal{S}^n_{++}$. Its determinant

$$\det(X) = \prod_{i=1}^n \lambda_i(X)$$

is neither convex nor concave, but $\log \det(X)$ is concave, and we can model the hypograph

$$t \leq \log \det(X)$$

in the form of the following problem:

$$\begin{bmatrix} X & Z \\ Z^T & \operatorname{Diag}(Z) \end{bmatrix} \succeq 0, \qquad t \leq \sum_{i=1}^n \log Z_{ii}, \qquad (6.13)$$

where $Z$ is an additional lower-triangular matrix variable. The equivalence of the two problems follows from Lemma 6.1 and monotonicity of the determinant on the semidefinite cone. That is: for any feasible $(X, Z)$ in (6.13), Lemma 6.1 gives $X \succeq Z \operatorname{Diag}(Z)^{-1} Z^T$, and therefore

$$\det(X) \geq \det(Z \operatorname{Diag}(Z)^{-1} Z^T) = \prod_i Z_{ii} \geq e^t.$$

On the other hand the optimal value $t = \log \det(X)$ is attained: given a Cholesky factorization $X = L L^T$, choose $Z = L \operatorname{Diag}(L)$, so that $Z \operatorname{Diag}(Z)^{-1} Z^T = L L^T = X$ and $\prod_i Z_{ii} = \prod_i L_{ii}^2 = \det(X)$.

The last inequality in problem (6.13) can of course be modeled using the exponential cone as in Sec. 5.2 (Modeling with the exponential cone). Note that we can replace that bound with the geometric-mean constraint $t \leq (Z_{11} Z_{22} \cdots Z_{nn})^{1/n}$, in which case (6.13) models the concave function $\det(X)^{1/n}$ instead.
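CVXPY exposes the concave function $\log \det$ directly, and a conic solver handles it with a model along the lines of (6.13). A small sketch with made-up data; by the arithmetic-geometric mean inequality the optimum below is the identity matrix.

```python
import cvxpy as cp
import numpy as np

X = cp.Variable((3, 3), symmetric=True)
# log_det implicitly constrains X to the positive definite cone
prob = cp.Problem(cp.Maximize(cp.log_det(X)), [cp.trace(X) <= 3])
prob.solve()
print(np.round(X.value, 3))   # approx. the 3x3 identity
```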
6.2.4 Singular value optimization
We next consider a non-square matrix $A \in \mathbb{R}^{m \times p}$. Assume $p \leq m$ and denote the singular values of $A$ by

$$\sigma_1(A) \geq \sigma_2(A) \geq \dots \geq \sigma_p(A) \geq 0.$$

The singular values are connected to the eigenvalues of $A^T A$ via

$$\sigma_i(A) = \sqrt{\lambda_i(A^T A)}, \qquad (6.14)$$

and if $A$ is symmetric then $\sigma_i(A) = |\lambda_i(A)|$ (with the eigenvalues sorted by absolute value).
Largest singular value

The epigraph $\sigma_1(A) \leq t$ can be characterized using (6.14) as

$$A^T A \preceq t^2 I,$$

which from Schur's lemma is equivalent to

$$\begin{bmatrix} t I & A \\ A^T & t I \end{bmatrix} \succeq 0. \qquad (6.15)$$

The largest singular value $\sigma_1(A)$ is also called the spectral norm or the $\ell_2$-norm of $A$, $\| A \|_2 := \sigma_1(A)$.
Sum of singular values

The trace norm or the nuclear norm of $A$ is the dual of the $\ell_2$-norm:

$$\| A \|_* := \sup \{ \operatorname{tr}(A^T Z) \mid \| Z \|_2 \leq 1 \}. \qquad (6.16)$$

It turns out that the nuclear norm corresponds to the sum of the singular values,

$$\| A \|_* = \sum_{i=1}^p \sigma_i(A) = \sum_{i=1}^p \sqrt{\lambda_i(A^T A)}, \qquad (6.17)$$

which is easy to verify using the singular value decomposition $A = U \Sigma V^T$: for any $\| Z \|_2 \leq 1$ we have $\operatorname{tr}(A^T Z) = \operatorname{tr}(\Sigma \, U^T Z V) \leq \sum_i \sigma_i(A)$, with equality for $Z = U V^T$, which shows (6.17). Alternatively, we can express (6.16) as the solution to

$$\begin{array}{ll} \text{maximize} & \operatorname{tr}(A^T Z) \\ \text{subject to} & \begin{bmatrix} I & Z^T \\ Z & I \end{bmatrix} \succeq 0, \end{array} \qquad (6.18)$$

with the dual problem (see Example 8.8)

$$\begin{array}{ll} \text{minimize} & \operatorname{tr}(U + V)/2 \\ \text{subject to} & \begin{bmatrix} U & A^T \\ A & V \end{bmatrix} \succeq 0. \end{array} \qquad (6.19)$$

In other words, using strong duality we can characterize the epigraph $\| A \|_* \leq t$ with the constraints

$$\begin{bmatrix} U & A^T \\ A & V \end{bmatrix} \succeq 0, \qquad \operatorname{tr}(U + V)/2 \leq t. \qquad (6.20)$$

For a symmetric matrix the nuclear norm corresponds to the sum of absolute values of eigenvalues, and for a semidefinite matrix it simply corresponds to the trace of the matrix.
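The following sketch, assuming CVXPY, computes the nuclear norm of a random matrix both from its singular values and from the semidefinite characterization (6.20); the two values agree up to solver tolerance.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))
print(np.linalg.svd(A, compute_uv=False).sum())    # sum of singular values

t = cp.Variable()
U = cp.Variable((3, 3), symmetric=True)
V = cp.Variable((4, 4), symmetric=True)
M = cp.bmat([[U, A.T], [A, V]])                    # the block matrix in (6.20)
prob = cp.Problem(cp.Minimize(t),
                  [M >> 0, cp.trace(U) + cp.trace(V) <= 2 * t])
prob.solve()
print(t.value, cp.normNuc(A).value)                # both match the SVD value
```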
6.2.5 Matrix inequalities from Schur's Lemma
Several quadratic or quadratic-over-linear matrix inequalities follow immediately from Schur's lemma. For example, suppose $A \in \mathbb{R}^{m \times p}$ and $B \in \mathcal{S}^m_{++}$. Then the quadratic-over-linear matrix inequality

$$A^T B^{-1} A \preceq C$$

holds if and only if

$$\begin{bmatrix} C & A^T \\ A & B \end{bmatrix} \succeq 0.$$
6.2.6 Nonnegative polynomials
We next consider characterizations of polynomials constrained to be nonnegative on the real axis. To that end, consider a polynomial basis function of degree $n$,

$$v_n(t) := (1, t, \dots, t^n).$$

It is then well-known that a polynomial $f : \mathbb{R} \to \mathbb{R}$ of even degree $2n$ is nonnegative on the entire real axis,

$$f(t) := c_0 + c_1 t + \dots + c_{2n} t^{2n} = \langle c, v_{2n}(t) \rangle \geq 0, \quad \forall t, \qquad (6.21)$$

if and only if it can be written as a sum of squared polynomials of degree $n$ (or less), i.e., for some $q_1, q_2 \in \mathbb{R}^{n+1}$,

$$f(t) = \langle q_1, v_n(t) \rangle^2 + \langle q_2, v_n(t) \rangle^2. \qquad (6.22)$$

It turns out that an equivalent characterization of $\{ c \mid f(t) \geq 0, \; \forall t \}$ can be given in terms of a semidefinite variable $X$,

$$c_i = \langle X, H_i \rangle, \quad i = 0, \dots, 2n, \qquad X \in \mathcal{S}^{n+1}_+, \qquad (6.23)$$

where $H_i^n \in \mathbb{R}^{(n+1) \times (n+1)}$ are Hankel matrices,

$$[H_i^n]_{kl} = \begin{cases} 1, & k + l = i, \\ 0, & \text{otherwise}. \end{cases}$$

When there is no ambiguity, we drop the superscript on $H_i$. For example, for $n = 1$ we have

$$H_0 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad H_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad H_2 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.$$

To verify that (6.21) and (6.23) are equivalent, we first note that

$$v_n(t) v_n(t)^T = \sum_{i=0}^{2n} H_i t^i,$$

i.e., the conditions $c_i = \langle X, H_i \rangle$ simply state that $f(t) = \langle X, v_n(t) v_n(t)^T \rangle$.

Assume next that (6.23) holds. Then, since $X \succeq 0$,

$$f(t) = \langle X, v_n(t) v_n(t)^T \rangle = v_n(t)^T X v_n(t) \geq 0.$$

Conversely, assume $f(t) \geq 0$ with a sum-of-squares representation (6.22). Then

$$f(t) = \langle q_1, v_n(t) \rangle^2 + \langle q_2, v_n(t) \rangle^2 = \langle q_1 q_1^T + q_2 q_2^T, v_n(t) v_n(t)^T \rangle,$$

i.e., we have found $X = q_1 q_1^T + q_2 q_2^T \succeq 0$ satisfying (6.23),

since $\langle q q^T, v v^T \rangle = (q^T v)^2$.
Checking nonnegativity of a univariate polynomial thus corresponds to a semidefinite feasibility problem.
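For instance, the sketch below (assuming CVXPY) certifies nonnegativity of $f(t) = 1 - 2t + 2t^2 = (1-t)^2 + t^2$ by solving the feasibility problem (6.23) with $n = 1$; the Hankel matrices are built directly from their definition.

```python
import cvxpy as cp
import numpy as np

c = np.array([1.0, -2.0, 2.0])           # coefficients of 1 - 2t + 2t^2
n = 1                                     # f has degree 2n = 2
X = cp.Variable((n + 1, n + 1), PSD=True)
constraints = []
for i in range(2 * n + 1):
    H = np.zeros((n + 1, n + 1))          # Hankel matrix: ones where k + l = i
    for k in range(n + 1):
        l = i - k
        if 0 <= l <= n:
            H[k, l] = 1.0
    constraints.append(cp.trace(H @ X) == c[i])   # c_i = <X, H_i>
prob = cp.Problem(cp.Minimize(0), constraints)
prob.solve()
print(prob.status)    # 'optimal' certifies f(t) >= 0 for all real t
```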
6.2.6.1 Nonnegativity on a finite interval
As an extension we consider a basis function of degree $n$,

$$v_n(t) = (1, t, \dots, t^n).$$

A polynomial $f(t) = \langle c, v_n(t) \rangle$ is then nonnegative on a finite interval $[a, b]$ if and only if it can be written as a weighted sum of squared polynomials,

$$f(t) = w_1(t) \cdot \langle X_1, v(t) v(t)^T \rangle + w_2(t) \cdot \langle X_2, v(t) v(t)^T \rangle, \qquad X_1, X_2 \succeq 0,$$

where the weights $w_1, w_2$ are nonnegative on $[a, b]$ and the degrees are chosen to match the degree of $f$; we distinguish between polynomials of odd and even degree.

Even degree. Let $n = 2m$ and denote

$$w_1(t) := 1, \qquad w_2(t) := (t - a)(b - t).$$

We choose these two weights and note that $w_2(t) \geq 0$ on $[a, b]$. Then $f(t) \geq 0$ on $[a, b]$ if and only if

$$f(t) = \langle X_1, v_m(t) v_m(t)^T \rangle + (t - a)(b - t) \cdot \langle X_2, v_{m-1}(t) v_{m-1}(t)^T \rangle$$

for some $X_1 \in \mathcal{S}^{m+1}_+$, $X_2 \in \mathcal{S}^m_+$, and an equivalent semidefinite characterization can be found by matching coefficients as in Sec. 6.2.6 (Nonnegative polynomials),

$$K^{a,b}_n = \{ c \in \mathbb{R}^{n+1} \mid c_i = \langle X_1, H_i^m \rangle + \langle X_2, (a + b) H_{i-1}^{m-1} - a b \, H_i^{m-1} - H_{i-2}^{m-1} \rangle, \; i = 0, \dots, n, \; X_1 \in \mathcal{S}^{m+1}_+, \; X_2 \in \mathcal{S}^m_+ \}, \qquad (6.25)$$

with the convention $H_i = 0$ for $i < 0$.

Odd degree. Let $n = 2m + 1$ and denote

$$w_1(t) := (t - a), \qquad w_2(t) := (b - t),$$

both of which are nonnegative on $[a, b]$. Then $f(t) \geq 0$ on $[a, b]$ if and only if

$$f(t) = (t - a) \cdot \langle X_1, v_m(t) v_m(t)^T \rangle + (b - t) \cdot \langle X_2, v_m(t) v_m(t)^T \rangle$$

for some $X_1, X_2 \in \mathcal{S}^{m+1}_+$, and an equivalent semidefinite characterization can be found as

$$K^{a,b}_n = \{ c \in \mathbb{R}^{n+1} \mid c_i = \langle X_1, H_{i-1}^m - a H_i^m \rangle + \langle X_2, b H_i^m - H_{i-1}^m \rangle, \; i = 0, \dots, n, \; X_1, X_2 \in \mathcal{S}^{m+1}_+ \}. \qquad (6.26)$$
6.2.7 Hermitian matrices
Semidefinite optimization can be extended to complex-valued matrices. To that end, let $\mathcal{H}^n$ denote the set of Hermitian matrices of order $n$, i.e., $X \in \mathbb{C}^{n \times n}$ with

$$X^H = X,$$

where the superscript '$H$' denotes Hermitian (conjugate) transposition. A Hermitian matrix is positive semidefinite, $X \in \mathcal{H}^n_+$, if $z^H X z \geq 0$ for all $z \in \mathbb{C}^n$. Writing $z = u + jv$, a direct computation shows that

$$z^H X z = \begin{bmatrix} u \\ v \end{bmatrix}^T \begin{bmatrix} \Re X & -\Im X \\ \Im X & \Re X \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}.$$

In other words,

$$X \in \mathcal{H}^n_+ \iff \begin{bmatrix} \Re X & -\Im X \\ \Im X & \Re X \end{bmatrix} \in \mathcal{S}^{2n}_+. \qquad (6.27)$$

Note that (6.27) implies skew-symmetry of $\Im X$, i.e., $\Im X = -(\Im X)^T$.
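The embedding (6.27) is easy to verify numerically; in the numpy sketch below (with a made-up Hermitian matrix) the eigenvalues of the real embedding are those of the complex matrix, each repeated twice.

```python
import numpy as np

X = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])                      # Hermitian, eigenvalues 1 and 4
R = np.block([[X.real, -X.imag],
              [X.imag,  X.real]])                  # the real embedding (6.27)
print(np.linalg.eigvalsh(X))                       # [1. 4.]
print(np.linalg.eigvalsh(R))                       # [1. 1. 4. 4.]
```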
6.2.8 Nonnegative trigonometric polynomials
As a complex-valued variation of the sum-of-squares representation we consider trigonometric polynomials; optimization over cones of nonnegative trigonometric polynomials has several important engineering applications. Consider a trigonometric polynomial evaluated on the complex unit-circle,

$$f(z) = x_0 + 2 \Re \left( \sum_{i=1}^n x_i z^{-i} \right), \qquad |z| = 1, \qquad (6.28)$$

parametrized by $x = (x_0, x_1, \dots, x_n) \in \mathbb{R} \times \mathbb{C}^n$.

Consider a complex-valued basis function

$$v_n(z) = (1, z, \dots, z^n). \qquad (6.29)$$

The Riesz-Fejer Theorem states that a trigonometric polynomial $f(z)$ in (6.28) is nonnegative on the unit circle if and only if, for some $q \in \mathbb{C}^{n+1}$, it can be written as a squared magnitude,

$$f(z) = | \langle q, v_n(z) \rangle |^2. \qquad (6.30)$$

Analogously to Sec. 6.2.6 (Nonnegative polynomials) we have a semidefinite characterization of the sum-of-squares representation, i.e., $f(z) \geq 0$ for $|z| = 1$ if and only if

$$x_i = \langle X, T_i \rangle, \quad i = 0, \dots, n, \qquad X \in \mathcal{H}^{n+1}_+, \qquad (6.31)$$

where $T_i^n \in \mathbb{R}^{(n+1) \times (n+1)}$ are Toeplitz matrices,

$$[T_i^n]_{kl} = \begin{cases} 1, & k - l = i, \\ 0, & \text{otherwise}, \end{cases} \qquad i = 0, \dots, n.$$

When there is no ambiguity, we drop the superscript on $T_i$. For example, for $n = 2$ we have

$$T_0 = I, \qquad T_1 = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \qquad T_2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}.$$

To prove correctness of the semidefinite characterization, we first note that on the unit circle the entries of $v_n(z) v_n(z)^H$ are $z^{k-l}$, so $v_n(z) v_n(z)^H$ decomposes in terms of the matrices $T_i$,

$$\langle X, v_n(z) v_n(z)^H \rangle = x_0 + 2 \Re \left( \sum_{i=1}^n x_i z^{-i} \right) = f(z)$$

whenever $x_i = \langle X, T_i \rangle$, i.e., if (6.31) is satisfied with $X \succeq 0$ then $f(z) = v_n(z)^H X v_n(z) \geq 0$.

Next assume that (6.30) is satisfied. Then

$$f(z) = | \langle q, v_n(z) \rangle |^2 = \langle q q^H, v_n(z) v_n(z)^H \rangle,$$

with $X = q q^H \succeq 0$ satisfying $x_i = \langle X, T_i \rangle$.

In other words, we have shown that $f(z) \geq 0$ on the unit circle if and only if the semidefinite characterization (6.31) is feasible.
6.2.8.1 Nonnegativity on a subinterval
We next sketch a few useful extensions. An extension of the Riesz-Fejer Theorem states that a trigonometric polynomial $f(z)$ of degree $n$ is nonnegative on an arc $I(a, b) = \{ z = e^{jt} \mid t \in [a, b] \}$ if and only if it can be written as a weighted sum of squared magnitudes,

$$f(z) = | \langle q_1, v_n(z) \rangle |^2 + g_{a,b}(z) \cdot | \langle q_2, v_{n-1}(z) \rangle |^2,$$

where $q_1 \in \mathbb{C}^{n+1}$, $q_2 \in \mathbb{C}^n$, and $g_{a,b}$ is a trigonometric polynomial of degree one,

$$g_{a,b}(e^{jt}) = \cos\left(t - \frac{a+b}{2}\right) - \cos\left(\frac{b-a}{2}\right),$$

which is nonnegative precisely for $t \in [a, b]$. Similarly to (6.25) and (6.26), matching coefficients in this representation gives an equivalent semidefinite characterization of $\{ x \mid f(z) \geq 0, \; \forall z \in I(a, b) \}$ in terms of two semidefinite variables $X_1 \in \mathcal{H}^{n+1}_+$ and $X_2 \in \mathcal{H}^n_+$,

i.e., verifying nonnegativity of a trigonometric polynomial on an arc is again a semidefinite feasibility problem.
6.3 Semidefinite optimization case studies

6.3.1 Nearest correlation matrix
We consider the set of correlation matrices, i.e., semidefinite matrices with unit diagonal,

$$C = \{ X \in \mathcal{S}^n_+ \mid X_{ii} = 1, \; i = 1, \dots, n \}$$

(shown in Fig. 6.1 for $n = 3$). For $A \in \mathcal{S}^n$ the nearest correlation matrix is

$$X^\star = \arg \min_{X \in C} \| A - X \|_F,$$

i.e., the projection of $A$ onto the set $C$. To pose this as a conic optimization problem, we note the following relationship between the Frobenius norm of a symmetric matrix and the Euclidean norm of its scaled half-vectorization,

$$\| A \|_F = \| \operatorname{vecs}(A) \|_2, \qquad \operatorname{vecs}(A) := (a_{11}, \sqrt{2} a_{21}, \dots, \sqrt{2} a_{n1}, a_{22}, \sqrt{2} a_{32}, \dots, a_{nn}),$$

which extracts and scales the lower-triangular part of $A$. We can then pose the nearest correlation problem as

$$\begin{array}{ll} \text{minimize} & t \\ \text{subject to} & (t, \operatorname{vecs}(A - X)) \in \mathcal{Q}^{n(n+1)/2 + 1}, \\ & \operatorname{diag}(X) = e, \quad X \succeq 0. \end{array}$$

This is an example of a problem with both conic quadratic and semidefinite constraints in primal form. We can add different constraints to the problem, for example a bound $\gamma$ on the smallest eigenvalue by replacing the constraint $X \succeq 0$ with $X \succeq \gamma I$.
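A compact version of the same projection, assuming CVXPY (which accepts the Frobenius norm objective directly, so the explicit vecs rewriting is not needed at the modeling level); the input matrix below is made up and has one negative eigenvalue, so it is not itself a correlation matrix.

```python
import cvxpy as cp
import numpy as np

A = np.array([[1.0, 0.9, 0.6],
              [0.9, 1.0, 0.9],
              [0.6, 0.9, 1.0]])         # symmetric but indefinite
X = cp.Variable((3, 3), symmetric=True)
prob = cp.Problem(cp.Minimize(cp.norm(A - X, 'fro')),
                  [X >> 0, cp.diag(X) == 1])
prob.solve()
print(np.round(X.value, 3))             # the nearest correlation matrix
```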
6.3.2 Extremal ellipsoids
Given a polytope we can find the largest ellipsoid contained in the polytope, or the smallest ellipsoid containing the polytope (for certain representations of the polytope).
6.3.2.1 Maximal inscribed ellipsoid
Consider a polytope

$$P = \{ x \in \mathbb{R}^n \mid a_i^T x \leq b_i, \; i = 1, \dots, m \}.$$

The ellipsoid

$$\mathcal{E} := \{ x \mid x = C u + d, \; \| u \|_2 \leq 1 \}$$

is contained in $P$ if and only if

$$\max_{\| u \|_2 \leq 1} a_i^T (C u + d) \leq b_i, \quad i = 1, \dots, m,$$

or equivalently, if and only if

$$\| C a_i \|_2 + a_i^T d \leq b_i, \quad i = 1, \dots, m.$$

Since $\operatorname{vol}(\mathcal{E}) \approx \det(C)^{1/n}$, the maximal inscribed ellipsoid is the solution to

$$\begin{array}{ll} \text{maximize} & \det(C) \\ \text{subject to} & \| C a_i \|_2 + a_i^T d \leq b_i, \quad i = 1, \dots, m, \\ & C \succeq 0. \end{array}$$

In Sec. 6.2.3 (Log-determinant) we showed how to maximize the determinant of a positive definite matrix.
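A sketch of the inscribed-ellipsoid problem for the unit square, assuming CVXPY; since the optimal inscribed ellipsoid of the square is the unit disk, the solution has $C \approx I$ and $d \approx 0$.

```python
import cvxpy as cp
import numpy as np

# the unit square {x : -1 <= x_1, x_2 <= 1} as a_i' x <= b_i
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.ones(4)

C = cp.Variable((2, 2), PSD=True)
d = cp.Variable(2)
cons = [cp.norm(C @ A[i]) + A[i] @ d <= b[i] for i in range(4)]
prob = cp.Problem(cp.Maximize(cp.log_det(C)), cons)
prob.solve()
print(np.round(C.value, 3), np.round(d.value, 3))
```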
6.3.2.2 Minimal enclosing ellipsoid
Next consider a polytope given as the convex hull of a set of points,

$$P' = \operatorname{conv}\{ x_1, \dots, x_m \}, \qquad x_i \in \mathbb{R}^n.$$

The ellipsoid

$$\mathcal{E}' := \{ x \mid \| P x - c \|_2 \leq 1 \}$$

has $\operatorname{vol}(\mathcal{E}') \approx \det(P)^{-1/n}$, so the minimal enclosing ellipsoid is the solution to

$$\begin{array}{ll} \text{maximize} & \det(P) \\ \text{subject to} & \| P x_i - c \|_2 \leq 1, \quad i = 1, \dots, m, \\ & P \succeq 0. \end{array}$$

Fig. 6.2 Example of inner and outer ellipsoidal approximations of a pentagon in $\mathbb{R}^2$.
6.3.3 Polynomial curve-fitting
Consider a univariate polynomial of degree $n$,

$$f(t) = c_0 + c_1 t + \dots + c_n t^n.$$

Often we wish to fit such a polynomial to a given set of measurements or control points

$$\{ (t_1, y_1), (t_2, y_2), \dots, (t_m, y_m) \},$$

i.e., we wish to determine coefficients $c = (c_0, c_1, \dots, c_n)$ such that $f(t_j) \approx y_j$ for $j = 1, \dots, m$. To that end, define the Vandermonde matrix

$$A = \begin{bmatrix} 1 & t_1 & \cdots & t_1^n \\ 1 & t_2 & \cdots & t_2^n \\ \vdots & \vdots & & \vdots \\ 1 & t_m & \cdots & t_m^n \end{bmatrix}.$$

We can then express the desired curve-fit compactly as

$$A c \approx y,$$

i.e., as a linear expression in the coefficients $c$. When $n = m - 1$ and the points $t_j$ are distinct, $A$ is square and nonsingular, so to find a polynomial that passes through all the control points $(t_j, y_j)$ we solve the linear system

$$A c = y,$$

which (assuming again there are no duplicate rows) has the unique solution $c = A^{-1} y$. On the other hand, if $m > n + 1$ the system is overdetermined, and we may instead seek the least-squares fit

$$\min_c \| A c - y \|_2,$$

which is given by the formula (see Sec. 8.5.1 (Linear regression and the normal equation))

$$c = (A^T A)^{-1} A^T y.$$

In the following we discuss how the semidefinite characterizations of nonnegative polynomials (see Sec. 6.2.6 (Nonnegative polynomials)) lead to more advanced and useful polynomial curve-fitting constraints.
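Before any shape constraints enter, the plain least-squares fit is a few lines of numpy; the control points below are made up.

```python
import numpy as np

t = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
y = np.array([1.0, 0.3, 0.0, 0.2, 1.0])
A = np.vander(t, 3, increasing=True)       # columns 1, t, t^2 (fit a quadratic)
c, *_ = np.linalg.lstsq(A, y, rcond=None)  # solves min ||Ac - y||_2
print(c)                                   # coefficients (c0, c1, c2)
```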
- Nonnegativity. One possible constraint is nonnegativity on an interval,

  $$f(t) = c_0 + c_1 t + \dots + c_n t^n \geq 0, \quad \forall t \in [a, b],$$

  with a semidefinite characterization embedded in $c \in K^{a,b}_n$, see (6.25).

- Monotonicity. We can ensure monotonicity of $f(t)$ on $[a, b]$ by requiring that $f'(t) \geq 0$ (or $f'(t) \leq 0$), i.e.,

  $$(c_1, 2 c_2, \dots, n c_n) \in K^{a,b}_{n-1}$$

  or

  $$-(c_1, 2 c_2, \dots, n c_n) \in K^{a,b}_{n-1},$$

  respectively.

- Convexity or concavity. Convexity (or concavity) of $f(t)$ on $[a, b]$ corresponds to $f''(t) \geq 0$ (or $f''(t) \leq 0$), i.e.,

  $$(2 c_2, 6 c_3, \dots, (n-1) n c_n) \in K^{a,b}_{n-2}$$

  or

  $$-(2 c_2, 6 c_3, \dots, (n-1) n c_n) \in K^{a,b}_{n-2},$$

  respectively.
As an example, we consider fitting a smooth polynomial

$$f(t) = c_0 + c_1 t + \dots + c_n t^n$$

to the points $\{ (-1, 1), (0, 0), (1, 1) \}$, where smoothness is implied by bounding $|f'(t)|$ on $[-1, 1]$. More specifically, we wish to solve the problem

$$\begin{array}{ll} \text{minimize} & z \\ \text{subject to} & |f'(t)| \leq z, \quad \forall t \in [-1, 1], \\ & f(-1) = 1, \quad f(0) = 0, \quad f(1) = 1, \end{array}$$

or equivalently

$$\begin{array}{ll} \text{minimize} & z \\ \text{subject to} & -z \leq f'(t) \leq z, \quad \forall t \in [-1, 1], \\ & f(-1) = 1, \quad f(0) = 0, \quad f(1) = 1. \end{array}$$

Finally, we use the characterizations $K^{a,b}_n$ to rewrite the two semi-infinite derivative bounds as conic constraints on the coefficients,

$$(z - c_1, -2 c_2, \dots, -n c_n) \in K^{-1,1}_{n-1}, \qquad (z + c_1, 2 c_2, \dots, n c_n) \in K^{-1,1}_{n-1}.$$

Fig. 6.3 Graph of univariate polynomials of degree 2, 4, and 8, respectively, passing through $\{(-1, 1), (0, 0), (1, 1)\}$ with a minimal bound on $|f'(t)|$.

In Fig. 6.3 we show the graphs of the resulting polynomials of degree 2, 4 and 8, respectively. The second degree polynomial is uniquely determined by the three interpolation constraints, i.e., $f(t) = t^2$ with largest derivative $|f'(\pm 1)| = 2$. The computed fourth degree polynomial is

$$f(t) = \frac{3 t^2 - t^4}{2}$$

after rounding coefficients to rational numbers. Furthermore, the largest derivative is given by

$$f'(1/\sqrt{2}) = \sqrt{2},$$
and the eighth degree polynomial reduces the largest derivative further still. We can also impose shape constraints from the list above, for example monotonicity of $f(t)$ on $[0, 1]$ (and, by symmetry, on $[-1, 0]$). In this case, we get a more restricted feasible set, and the largest derivative increases compared to the unconstrained fit, since adding constraints can only shrink the feasible set.

Fig. 6.4 Graph of univariate polynomials of degree 2, 4 and 8, respectively, for the fitting problem with additional shape constraints.
6.3.4 Filter design problems
Filter design is an important application of optimization over trigonometric polynomials in signal processing. We consider a trigonometric polynomial

$$H(\omega) = x_0 + 2 \Re \left( \sum_{i=1}^n x_i e^{-j \omega i} \right) = a_0 + 2 \sum_{i=1}^n \left( a_i \cos(\omega i) + b_i \sin(\omega i) \right),$$

where $x_i := a_i + j b_i$, i.e., $H(\omega) = f(e^{j \omega})$ with $f$ as in (6.28), representing the transfer function of a filter with coefficients $x$.

We often wish a transfer function where $H(\omega) \approx 1$ in a passband $[0, \omega_p]$ and $|H(\omega)|$ is small in a stopband $[\omega_s, \pi]$. A typical lowpass design problem is then

$$\begin{array}{ll} \text{minimize} & t \\ \text{subject to} & 1/\delta \leq H(\omega) \leq \delta, \quad \forall \omega \in [0, \omega_p], \\ & -t \leq H(\omega) \leq t, \quad \forall \omega \in [\omega_s, \pi], \end{array}$$

which corresponds to minimizing the maximum of $|H(\omega)|$ in the stopband for a given passband ripple $\delta$. Each of the semi-infinite constraints can be expressed through nonnegativity of a trigonometric polynomial on a subinterval of the unit circle, as in Sec. 6.2.8.1 (Nonnegativity on a subinterval), which is a semidefinite optimization problem. In Fig. 6.5 we show the transfer function $H(\omega)$ of an optimized lowpass filter.

Fig. 6.5 Plot of lowpass filter transfer function.
6.3.5 Relaxations of binary optimization
Semidefinite optimization is also useful for computing bounds on difficult non-convex or combinatorial optimization problems. For example, consider the binary linear optimization problem

$$\begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & A x = b, \\ & x \in \{0, 1\}^n. \end{array} \qquad (6.37)$$

In general, problem (6.37) is a very difficult non-convex problem, where in the worst case we have to explore all $2^n$ binary vectors. Alternatively, we can use semidefinite optimization to get a lower bound on the optimal value with polynomial complexity. We first note that

$$x_i \in \{0, 1\} \iff x_i^2 = x_i,$$

which is, in fact, equivalent to a rank constraint on a semidefinite variable,

$$X = x x^T, \qquad \operatorname{diag}(X) = x.$$

By relaxing the rank-1 constraint $X = x x^T$ to $X \succeq x x^T$ we get a semidefinite relaxation of (6.37),

$$\begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & A x = b, \\ & \operatorname{diag}(X) = x, \\ & X \succeq x x^T, \end{array} \qquad (6.38)$$

where we note that $X \succeq x x^T$ is equivalent, by Schur's lemma, to the linear matrix inequality

$$\begin{bmatrix} X & x \\ x^T & 1 \end{bmatrix} \succeq 0.$$

Since (6.38) is a semidefinite optimization problem, it can be solved very efficiently. Suppose $x^\star$ is an optimal solution for (6.37); then $(x^\star, x^\star (x^\star)^T)$ is also feasible for (6.38). The feasible set of (6.38) is thus larger than that of (6.37), so the optimal value of (6.38) is a lower bound on the optimal value of the original problem.
We can tighten (or improve) the relaxation (6.38) by adding other constraints that cut away parts of the feasible set, without excluding rank-1 solutions. By tightening the relaxation, we reduce the gap between the optimal values of the original problem and the relaxation. For example, we can add the constraints

$$0 \leq X_{ij} \leq 1, \qquad X_{ii} \geq X_{ij},$$

and so on. This will usually have a dramatic impact on solution times and memory requirements, however. Already constraining a semidefinite matrix to be doubly nonnegative ($X \succeq 0$ and $X_{ij} \geq 0$) adds $n(n+1)/2$ linear inequality constraints and results in a much harder problem in practice.
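A sketch of the basic relaxation (6.38) on a tiny made-up instance, assuming CVXPY; the Schur-complement form of $X \succeq x x^T$ is written with cp.bmat.

```python
import cvxpy as cp
import numpy as np

c = np.array([1.0, -2.0, 3.0])
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([2.0])                 # pick exactly two of the three variables

x = cp.Variable(3)
X = cp.Variable((3, 3), symmetric=True)
M = cp.bmat([[X, cp.reshape(x, (3, 1))],
             [cp.reshape(x, (1, 3)), np.ones((1, 1))]])
cons = [A @ x == b, cp.diag(X) == x, M >> 0]
prob = cp.Problem(cp.Minimize(c @ x), cons)
prob.solve()
print(prob.value)   # a lower bound on the binary optimum c'x = -1
```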
6.3.6 Relaxations of boolean optimization
Similarly to Sec. 6.3.5 (Relaxations of binary optimization) we can use semidefinite relaxations of boolean constraints $x \in \{-1, +1\}^n$. To that end, we note that

$$x \in \{-1, +1\}^n \iff X = x x^T, \quad \operatorname{diag}(X) = e,$$

with a semidefinite relaxation

$$\begin{bmatrix} X & x \\ x^T & 1 \end{bmatrix} \succeq 0, \qquad \operatorname{diag}(X) = e.$$
As a (standard) example of a combinatorial problem with boolean constraints, we consider an undirected graph $G = (V, E)$ with vertices $V = \{ v_1, \dots, v_n \}$ and edges $E$, and the problem of finding a cut of maximum capacity. A cut partitions the nodes $V$ into two disjoint sets $S$ and $T = V \setminus S$, and the cut-set is the set of edges with one node in $S$ and the other in $T$,

$$C = \{ (u, v) \in E \mid u \in S, \; v \in T \}.$$

The capacity of a cut is then defined as the number of edges in the cut-set, $|C|$.

Fig. 6.6 Undirected graph; the illustrated cut partitions the nodes into two sets, and its capacity is the number of edges crossing the partition.
To maximize the capacity of a cut, we define the symmetric adjacency matrix $A \in \mathcal{S}^n$,

$$[A]_{ij} = \begin{cases} 1, & (v_i, v_j) \in E, \\ 0, & \text{otherwise}, \end{cases}$$

where we assign a boolean label to each node, $x_i = +1$ if $v_i \in S$ and $x_i = -1$ if $v_i \in T$. Since $1 - x_i x_j = 2$ exactly for the edges in the cut-set (each counted twice in the sum over $A$), the capacity is $\frac{1}{4} \sum_{ij} A_{ij} (1 - x_i x_j)$, and the max-cut problem is

$$\begin{array}{ll} \text{maximize} & \frac{1}{4} \sum_{ij} A_{ij} (1 - x_i x_j) \\ \text{subject to} & x \in \{-1, +1\}^n. \end{array}$$

Suppose $X = x x^T$. Then $\sum_{ij} A_{ij} (1 - x_i x_j) = e^T A e - \langle A, X \rangle$, and discarding the constant term $e^T A e$ we get the equivalent problem

$$\begin{array}{ll} \text{minimize} & \operatorname{tr}(A X) \\ \text{subject to} & \operatorname{diag}(X) = e, \\ & X = x x^T, \end{array}$$

with a semidefinite relaxation

$$\begin{array}{ll} \text{minimize} & \operatorname{tr}(A X) \\ \text{subject to} & \operatorname{diag}(X) = e, \\ & X \succeq 0. \end{array}$$
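As a sketch (assuming CVXPY), the relaxation for the 5-cycle gives the well-known bound of about 4.52, while the true maximum cut capacity is 4; here the objective is written directly in the capacity form rather than the equivalent trace form.

```python
import cvxpy as cp
import numpy as np

n = 5
Adj = np.zeros((n, n))                     # adjacency matrix of the 5-cycle
for i in range(n):
    Adj[i, (i + 1) % n] = Adj[(i + 1) % n, i] = 1.0

X = cp.Variable((n, n), symmetric=True)
# capacity = (1/4) sum_ij A_ij (1 - X_ij), with X = x x^T relaxed to X >= 0
obj = cp.Maximize(0.25 * cp.sum(cp.multiply(Adj, 1 - X)))
prob = cp.Problem(obj, [cp.diag(X) == 1, X >> 0])
prob.solve()
print(prob.value)   # approx. 4.52, an upper bound on the max-cut value 4
```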