# 9. Duality in conic optimization

## 9.1. Introduction

Duality theory is a rich and powerful area of convex optimization, and central to understanding sensitivity analysis and infeasibility issues in linear (and convex) optimization. Furthermore, it provides a simple and systematic way of obtaining non-trivial lower bounds on the optimal value for many difficult non-convex problems.

## 9.2. The Lagrangian

Duality theory is inherently connected with a (primal) optimization problem. Initially we consider an optimization problem in a general form

\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f(x) \leq 0, \\
& h(x) = 0,
\end{array}
\tag{1}
\]

where \(f_0:\R^n \mapsto \R\) is the objective function, \(f:\R^n \mapsto \R^m\) encodes inequality constraints, and \(h: \R^n \mapsto \R^p\) encodes equality constraints. We will denote (1) the *primal problem*, and its optimal value by \(p^\star\). A general problem such as (1) is well-suited for our initial discussion of duality (especially weak duality); for our discussion of strong duality in Section 9.5 we only consider a particular class of linear optimization problems.

The *Lagrangian* for (1) is a function \(L:\R^n\times \R^p \times \R^m \mapsto \R\) that augments the objective with a weighted combination of all the constraints,

\[
L(x, y, s) = f_0(x) + y^T h(x) + s^T f(x).
\]

The variables \(y\in \R^p\) and \(s\in \R_+^m\) are called *Lagrange multipliers* or *dual variables*. Readers familiar with the method of penalization for enforcing constraints may recognize the Lagrangian as something similar; if we fix \(y\) and choose \(s\) large enough, then the minimizer of \(L(x,y,s)\) over \(x\) may approximately satisfy \(h(x)=0\) and \(f(x)\leq 0\). That is just a (very) simplified interpretation, however. The main property of the Lagrangian is that it provides a lower bound on the objective at feasible solutions to (1); for a feasible \(x\) we have \(y^T h(x)=0\) and \(s^T f(x)\leq 0\) (since \(s\geq 0\)), so obviously \(L(x,y,s) \leq f_0(x)\).
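This lower-bound property is easy to check numerically. The following sketch uses a toy problem of our own (not from the text) and the convention \(L(x,y,s) = f_0(x) + y^T h(x) + s^T f(x)\) consistent with the deduction above:

```python
# Toy problem (our own example, not from the text):
#   minimize    f0(x) = x1^2 + x2^2
#   subject to  f(x)  = 1 - x1 - x2 <= 0   (inequality)
#               h(x)  = x1 - x2     = 0    (equality)

def f0(x): return x[0]**2 + x[1]**2
def f(x):  return 1 - x[0] - x[1]
def h(x):  return x[0] - x[1]

def lagrangian(x, y, s):
    # L(x, y, s) = f0(x) + y * h(x) + s * f(x), with s >= 0
    return f0(x) + y * h(x) + s * f(x)

x_feas = (1.0, 1.0)           # feasible: f = -1 <= 0, h = 0
for y in (-2.0, 0.0, 3.0):    # y is free
    for s in (0.0, 1.0, 5.0): # s must be nonnegative
        assert lagrangian(x_feas, y, s) <= f0(x_feas)
```

At the feasible point the equality term vanishes and the inequality term is nonpositive, so the bound holds for every admissible multiplier pair.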

## 9.3. The dual function and problem

The Lagrange dual function is defined as the minimum of the Lagrangian over \(x\),

\[
g(y, s) = \min_x L(x, y, s) = \min_x \left( f_0(x) + y^T h(x) + s^T f(x) \right),
\]

and where the Lagrangian is unbounded below in \(x\), we assign the dual function the value \(-\infty\). For fixed \(x\) the Lagrangian is an affine function of \((y,s)\), so we can think of \(g(y,s)\) as the pointwise infimum of infinitely many such affine functions (recall our discussion in Section 2.2.1 of convex piecewise-linear functions defined as the maximum of a set of affine functions); thus \(g(y,s)\) is a concave function, even when the problem (1) is not convex.

The Lagrange dual problem is then obtained by maximizing \(g(y, s)\), i.e., it is defined as

\[
\begin{array}{ll}
\mbox{maximize} & g(y, s) \\
\mbox{subject to} & s \geq 0,
\end{array}
\]

which is always a concave maximization problem (and hence a convex optimization problem), even when (1) is not convex.
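As a concrete illustration (our own toy example), take the problem of minimizing \(x^2\) subject to \(1 - x \leq 0\). The Lagrangian is \(L(x,s) = x^2 + s(1-x)\); setting \(dL/dx = 2x - s = 0\) gives \(x = s/2\) and hence the closed-form dual function \(g(s) = s - s^2/4\), whose concavity and maximum we can check numerically:

```python
# Toy convex problem (our own example):
#   minimize x^2  subject to  1 - x <= 0
# Lagrangian: L(x, s) = x^2 + s*(1 - x); minimizing over x
# (set dL/dx = 2x - s = 0, so x = s/2) gives the dual function
#   g(s) = s - s^2/4.

def g(s):
    return s - s**2 / 4.0

# g is concave: the midpoint value dominates the chord for sampled pairs
samples = [0.0, 0.5, 1.0, 2.0, 3.0, 5.0]
for a in samples:
    for b in samples:
        assert g((a + b) / 2.0) >= (g(a) + g(b)) / 2.0 - 1e-12

# Dual problem: maximize g(s) over s >= 0; the maximum is at s = 2
s_grid = [i / 1000.0 for i in range(10001)]  # s in [0, 10]
d_star = max(g(s) for s in s_grid)
assert abs(d_star - 1.0) < 1e-6              # equals p* = 1 for this problem
```

Since this toy problem is convex, the dual optimal value coincides with the primal optimal value \(p^\star = 1\) attained at \(x^\star = 1\).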

## 9.4. Weak duality

We already established that

\[
L(x, y, s) \leq f_0(x),
\]

which holds for every feasible \(x\) and every \((y, s)\) with \(s \geq 0\). The Lagrange dual

\[
g(y, s) = \min_x L(x, y, s) \leq f_0(x^\star) = p^\star
\]

provides a lower bound on the optimal value \(p^\star=f_0(x^\star)\) by minimizing over \(x\). The best such lower bound is given by the dual problem (i.e., \(d^\star=\sup_{y,\, s\geq 0} g(y,s)\)), and the relationship

\[
d^\star \leq p^\star
\]

is called the *weak duality* property, and holds for any kind of optimization problem with a well-defined Lagrangian function (including, e.g., nonconvex problems, but excluding general integer problems).
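For nonconvex problems the inequality \(d^\star \leq p^\star\) can be strict; the gap can even be infinite. A small sketch with a toy nonconvex problem of our own: minimizing \(-x^2\) over \(x \in [-1, 1]\) has \(p^\star = -1\), but the Lagrangian is unbounded below in \(x\) for every multiplier, so \(g \equiv -\infty\):

```python
# Nonconvex toy problem (our own example):
#   minimize -x^2  subject to  x - 1 <= 0  and  -x - 1 <= 0   (x in [-1, 1])
# p* = -1 (attained at x = +-1), but the Lagrangian
#   L(x, s1, s2) = -x^2 + s1*(x - 1) + s2*(-x - 1)
# is unbounded below in x for every (s1, s2) >= 0, so g(s1, s2) = -inf
# and d* = -inf: weak duality d* <= p* still holds, with an infinite gap.

def L(x, s1, s2):
    return -x**2 + s1 * (x - 1) + s2 * (-x - 1)

p_star = -1.0
for s1, s2 in [(0.0, 0.0), (1.0, 2.0), (10.0, 10.0)]:
    # minimizing over a wide grid drives L far below any fixed bound
    inner_min = min(L(x / 10.0, s1, s2) for x in range(-10000, 10001))
    assert inner_min <= -1e5     # far below p*; g -> -inf as the grid widens
    assert inner_min <= p_star   # in particular the bound g <= p* holds
```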

## 9.5. Strong duality

*Strong duality* is one of the most important concepts in convex optimization. The problem (1) is generally not convex; for convexity we require that the objective function \(f_0(x)\) and the inequality constraint functions \(f_i(x), \: i=1,\dots,m\) are convex functions, and that the equality constraints are affine functions. A well-known sufficient condition for strong duality is then *Slater's constraint qualification*, which requires the existence of a strictly feasible point for either the primal or the dual problem.

In this section we only show strong duality for the linear optimization problem

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax = b, \quad x \geq 0,
\end{array}
\]

whose Lagrange dual problem is

\[
\begin{array}{ll}
\mbox{maximize} & b^T y \\
\mbox{subject to} & c - A^T y \geq 0.
\end{array}
\]

Since any linear optimization problem can be rewritten in this form, this incurs no loss of generality. For linear optimization strong duality is traditionally proved using ideas from the simplex algorithm; our proof is based on a more recent approach using separating hyperplanes and Farkas’ lemma, which are also important for the discussion of infeasibility in Section 2.4.
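For the standard-form pair (primal: minimize \(c^T x\) subject to \(Ax=b\), \(x\geq 0\); dual: maximize \(b^T y\) subject to \(c - A^T y \geq 0\)), a feasible primal/dual pair with equal objective values certifies optimality of both. A minimal numerical sketch with a tiny instance of our own:

```python
# A tiny instance of the standard primal/dual LP pair (our own numbers):
#   primal: minimize c^T x  s.t.  Ax = b, x >= 0
#   dual:   maximize b^T y  s.t.  c - A^T y >= 0
c = [1.0, 2.0]
A = [[1.0, 1.0]]   # one equality constraint: x1 + x2 = 1
b = [1.0]

x_star = [1.0, 0.0]  # primal candidate
y_star = [1.0]       # dual candidate

# primal feasibility: Ax = b, x >= 0
assert all(abs(sum(A[i][j] * x_star[j] for j in range(2)) - b[i]) < 1e-12
           for i in range(1))
assert all(xj >= 0 for xj in x_star)
# dual feasibility: s = c - A^T y >= 0
s_star = [c[j] - sum(A[i][j] * y_star[i] for i in range(1)) for j in range(2)]
assert all(sj >= 0 for sj in s_star)

p_val = sum(cj * xj for cj, xj in zip(c, x_star))   # primal objective
d_val = sum(bi * yi for bi, yi in zip(b, y_star))   # dual objective
# equal objectives certify that both candidates are optimal (strong duality)
assert abs(p_val - d_val) < 1e-12
```

By weak duality no dual feasible point can exceed \(p^\star\) and no primal feasible point can fall below \(d^\star\), so equality pins both down.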

To prove strong duality we need two intermediate results. The first (which we state without proof) is a special case of a more general separating hyperplane theorem.

Let \(S\) be a closed convex set, and \(b\notin S\). Then there exists a strictly separating hyperplane \(a\) such that

\[
a^T b > a^T x, \quad \forall x \in S.
\]

The geometric interpretation of the *Separating hyperplane theorem* is given in Fig. 9.1, and indicates that a constructive proof follows from projecting \(b\) onto \(S\).
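The projection idea behind this constructive proof is easy to demonstrate. A sketch with a set of our own choosing (the unit box, which is closed and convex): project \(b\) onto \(S\) and take \(a = b - \mathrm{proj}_S(b)\):

```python
# Strict separation by projection (our own example): S is the unit box
# [0,1]^2, and b lies outside S.  Projecting b onto S and taking
# a = b - proj(b) yields a hyperplane with a^T b > a^T x for all x in S.
def project_onto_box(p):
    # Euclidean projection onto [0,1]^2 is coordinate-wise clamping
    return [min(max(pi, 0.0), 1.0) for pi in p]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

b = [2.0, 0.5]
proj = project_onto_box(b)                 # the closest point of S to b
a = [bi - pi for bi, pi in zip(b, proj)]   # normal of the separating plane

# sample points of S on a grid and check strict separation
for i in range(11):
    for j in range(11):
        x = [i / 10.0, j / 10.0]
        assert dot(a, b) > dot(a, x)
```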

We next show an important result, which makes use of the separating hyperplane theorem.

Given \(A\) and \(b\), exactly one of the two propositions is true:

- \(\exists x \: : \: Ax = b, \quad x\geq 0\),
- \(\exists y \: : \: A^Ty \leq 0, \quad b^T y > 0\).

*Proof.* Both cannot be true, because then

\[
0 < b^T y = (Ax)^T y = x^T (A^T y) \leq 0,
\]

which is a contradiction. Next assume that 1. is not true, i.e., \(b\notin S = \{Ax \: | \: x\geq 0 \}.\) The set \(S\) is closed and convex, so there exists a separating hyperplane \(y\) (separating \(b\) from \(S\)) such that

\[
y^T b > y^T (Ax), \quad \forall x \geq 0.
\]

But this implies that \(b^T y > 0\) (take \(x = 0\)) and \(A^Ty\leq 0\) (if \((A^Ty)_i > 0\) for some \(i\), then \(y^T(Ax)\) would be unbounded above along \(x = t e_i\), \(t \to \infty\)).
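Farkas’ lemma can be verified on a small instance (numbers of our own choosing): exhibit a certificate \(y\) for the second alternative and confirm the first system really has no solution:

```python
# Farkas' lemma on a tiny instance (our own numbers): with
#   A = [[1, 1]], b = [-1],
# the system Ax = b, x >= 0 is infeasible (x1 + x2 >= 0 whenever x >= 0),
# and y = -1 certifies the second alternative.
A = [[1.0, 1.0]]
b = [-1.0]
y = [-1.0]

# A^T y <= 0
ATy = [sum(A[i][j] * y[i] for i in range(1)) for j in range(2)]
assert all(v <= 0 for v in ATy)
# b^T y > 0
assert sum(bi * yi for bi, yi in zip(b, y)) > 0

# and indeed no nonnegative x on a coarse grid satisfies Ax = b
feasible = any(abs(x1 + x2 - b[0]) < 1e-9
               for x1 in [i / 10.0 for i in range(11)]
               for x2 in [j / 10.0 for j in range(11)])
assert not feasible
```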

We can now prove strong duality for the linear optimization problem: if either the primal or the dual optimal value is attained, then \(p^\star = d^\star\).

*Proof.* Assume that the dual optimal value is attained (i.e., \(d^\star := b^T y^\star < \infty\)). Since \(s^\star = c - A^T y^\star\geq 0\) there can then be no \((z, z_0)\) satisfying

\[
A^T z + z_0 c \leq 0, \quad z_0 \leq 0, \quad b^T z + z_0 d^\star > 0, \tag{4}
\]

because otherwise either \(z_0 = 0\), in which case \(y^\star + z\) is dual feasible with \(b^T(y^\star + z) > d^\star\), or \(z_0 < 0\), in which case \(z/(-z_0)\) is dual feasible with \(b^T(z/(-z_0)) > d^\star\), contradicting the optimality assumption on \(y^\star\). From Farkas’ lemma (applied with the extended matrix \(\begin{bmatrix} A & 0 \\ c^T & 1 \end{bmatrix}\) and right-hand side \((b, d^\star)\)), (4) is infeasible if and only if there exist \(x \geq 0\) and \(t \geq 0\) such that

\[
Ax = b, \quad c^T x + t = d^\star. \tag{5}
\]

For that particular choice of \(x\) (which is primal feasible) we have

\[
d^\star \geq c^T x \geq p^\star \geq d^\star,
\]

i.e., \(p^\star=d^\star\). On the other hand, assume that \(-\infty < p^\star < \infty\). From weak duality we then have \(d^\star < \infty\), and from Farkas’ lemma \(d^\star > -\infty\) (the dual problem is feasible), i.e., the dual optimal value is attained. In other words, if either the primal or dual optimal values are attained, we have \(p^\star = d^\star\).

Farkas’ lemma is an example of a *theorem of alternatives*, where exactly one of two systems has a solution, and we can also formulate a dual variant.

Given \(A\) and \(c\), exactly one of the two propositions is true:

- \(\exists x \: : \: Ax = 0, \quad x\geq 0, \quad c^Tx < 0\),
- \(\exists y \: : \: c-A^Ty \geq 0\).

*Proof.* Both cannot be true, because then

\[
0 > c^T x \geq (A^T y)^T x = y^T (Ax) = 0,
\]

using \(c \geq A^T y\) and \(x \geq 0\), which is a contradiction. Next assume that 2. is not true, i.e., the system \(A^T y \leq c\) has no solution. Writing \(y = u - v\) with \(u, v \geq 0\) and adding a slack variable \(w \geq 0\), this system is equivalent to

\[
\begin{bmatrix} A^T & -A^T & I \end{bmatrix}
\begin{bmatrix} u \\ v \\ w \end{bmatrix} = c, \quad (u, v, w) \geq 0,
\]

so from Farkas’ lemma there exists \(z\) satisfying

\[
Az \leq 0, \quad -Az \leq 0, \quad z \leq 0, \quad c^T z > 0.
\]

Then \(x = -z\) satisfies \(Ax = 0\), \(x \geq 0\) and \(c^Tx < 0\), i.e., proposition 1. is true.
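The dual variant can likewise be checked on a small instance (numbers of our own choosing) where the second system is infeasible and an explicit certificate for the first one exists:

```python
# The dual Farkas variant on a tiny instance (our own numbers): with
#   A = [[1, -1]], c = [-1, -1],
# no y satisfies c - A^T y >= 0 (it would require y <= -1 and y >= 1),
# so the first alternative must hold; x = (1, 1) is a certificate.
A = [[1.0, -1.0]]
c = [-1.0, -1.0]
x = [1.0, 1.0]

# Ax = 0, x >= 0, c^T x < 0
assert all(abs(sum(A[i][j] * x[j] for j in range(2))) < 1e-12 for i in range(1))
assert all(xj >= 0 for xj in x)
assert sum(cj * xj for cj, xj in zip(c, x)) < 0

# and c - A^T y >= 0 fails for every y on a coarse grid
for k in range(-50, 51):
    yk = k / 10.0
    ATy = [A[0][j] * yk for j in range(2)]
    assert not all(c[j] - ATy[j] >= 0 for j in range(2))
```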