6.2 From Linear to Conic Optimization

In Sec. 6.1 (Linear Optimization) we demonstrated setting up the linear part of an optimization problem, namely the objective, linear bounds and linear (in)equalities. In this tutorial we show how to define conic constraints.

A single conic constraint in MOSEK is constructed in the following form

(6.2)\[F_i x+g_i \in \D_i\]


  • \(x\in\real^n`\) is the optimization variable vector of length \(n\),

  • \(F_i\in\real^{d\times n}\) is a \(d \times n\) matrix of coefficients (problem data), where \(d\) is the number of affine expressions (AFEs) in the conic constraint,

  • \(g_i\in \real^d\) is a vector of constants (problem data). Thus, the affine combination \(F_i x + g_i\) results in a \(d\)-vector where each element is a scalar-valued AFE,

  • \(\D_i\subseteq \real^d\) is a conic domain of dimension \(d\), representing one of the cone types supported by MOSEK.

Constraints of this form are called affine conic constraints, or ACC for short. Therefore, in this section we show how to set up a problem of the form

(6.3)\[\begin{split}\begin{array}{lccccl} \mbox{minimize} & & & c^T x + c^f & & \\ \mbox{subject to} & l^c & \leq & A x & \leq & u^c, \\ & l^x & \leq & x & \leq & u^x, \\ & & & Fx+g \in \D_1\times\cdots\times \D_l, & & \end{array}\end{split}\]

where \(F\in\real^{k\times n}\), \(g\in\real^k\), \(k = \sum_{i=1}^{l} d_i\) and \(d_i=\text{dim}(\D_i)\). The problem in (6.3) consists of \(l\) affine conic constraints. The first ACC is made by restricting the first \(d_1\) affine expressions (out of the total \(k\)) to the \(\D_1\) domain. The \(d_2\) AFEs thereafter belong to the \(\D_2\) domain, forming the second ACC, and so on. The complete ACC data of a problem is therefore obtained by stacking together the descriptions of \(l\) ACCs.

Generalization of linear constraints

Conic constraints are a natural generalization of linear constraints to the general nonlinear case. For example, a typical linear constraint of the form

\[Ax+b\geq 0\]

can also be written as membership in the cone of nonnegative real numbers (also called the positive orthant cone):

\[Ax+b \in \real_{\geq 0}^d,\]

and that naturally generalizes to

\[Fx+g\in \D\]

for more complicated domains \(\D\) from Sec. 13.6 (Supported domains).

6.2.1 Example AFFCO1

Consider the following simple optimization problem:

(6.4)\[\begin{split}\begin{array} {lrcl} \mbox{maximize} & x_1^{1/3} + (x_1+x_2+0.1)^{1/4} & & \\ \mbox{subject to} & (x_1-0.5)^2+(x_2-0.6)^2 & \leq & 1, \\ & x_1 - x_2 & \leq & 1. \end{array}\end{split}\]

Adding auxiliary variables we convert this problem into an equivalent conic form:

(6.5)\[\begin{split}\begin{array} {lrcl} \mbox{maximize} & t_1+t_2 & & \\ \mbox{subject to} & (1,x_1-0.5,x_2-0.6) & \in & \Q^3, \\ & (x_1,1,t_1) & \in & \POW_3^{(1/3,2/3)}, \\ & (x_1+x_2+0.1,1,t_2) & \in & \POW_3^{(1/4,3/4)}, \\ & x_1-x_2 & \leq & 1. \end{array}\end{split}\]

Note that each of the vectors constrained to a cone is in a natural way an affine combination of the problem variables.

We first set up the linear part of the problem, including the number of variables, objective and all bounds precisely as in Sec. 6.1 (Linear Optimization). Affine conic constraints will be defined using the field cones in the problem structure. We construct the matrices \(F,g\) for each of the three cones. For example, the constraint \((1,x_1-0.5,x_2-0.6)\in \Q^3\) is written in matrix form as

\[\begin{split}\left[\begin{array}{cccc}0&0&0&0\\1&0&0&0\\0&1&0&0\end{array}\right] \left[\begin{array}{c}x_1\\x_2\\t_1\\t_2\end{array}\right] + \left[\begin{array}{c}1\\-0.5\\-0.6\end{array}\right] \in \Q^3.\end{split}\]

Below we set up the matrices and define the cone type as a "MSK_DOMAIN_QUADRATIC_CONE" of length \(3\). The conic domain for each affine conic constraint is specified as a column in a matrix with rows corresponding to each associated detail of the domain. The last row for the quadratic cone is set to NULL because it is a non-parametric cone (unlike the power cones).

    # The quadratic cone
    FQ <- rbind(c(0,0,0,0), c(1,0,0,0), c(0,1,0,0))
    gQ <- c(1, -0.5, -0.6)
    cQ <- matrix(list("QUAD", 3, NULL), nrow=3, ncol=1)

Next we demonstrate how to do the same for the second of the power cone constraints. Its affine representation is:

\[\begin{split}\left[\begin{array}{cccc}1&1&0&0\\0&0&0&0\\0&0&0&1\end{array}\right] \left[\begin{array}{c}x_1\\x_2\\t_1\\t_2\end{array}\right] + \left[\begin{array}{c}0.1\\1\\0\end{array}\right] \in \POW^{1/4,3/4}_3.\end{split}\]

The power cone is defined by its type, length, and the additional parameter which is a vector of exponents \(\alpha,1-\alpha\) appearing in the cone definition. In fact any pair of positive real numbers proportional to \((\alpha,1-\alpha)\) may be used. They will be normalized to add up to 1:

    # The power cone for (x_1+x_2+0.1, 1, t_2) \in POW3^(1/4,3/4)
    FP2 <- rbind(c(1,1,0,0), c(0,0,0,0), c(0,0,0,1))
    gP2 <- c(0.1, 1, 0)
    cP2 <- matrix(list("PPOW", 3, c(1.0, 3.0)), nrow=3, ncol=1)

Once affine conic descriptions of all cones are ready it remains to stack them vertically into the matrix \(F\) and vector \(g\) and concatenate the cone descriptions in one matrix, one column per cone. Below is the full code for problem (6.5).

Listing 6.2 Script implementing conic version of problem (6.4). Click here to download.

affco1 <- function()

    prob <- list(sense="max")

    # Variables [x1; x2; t1; t2]
    prob$c <- c(0, 0, 1, 1)

    # Linear inequality x_1 - x_2 <= 1
    prob$A <- Matrix(c(1, -1, 0, 0), nrow=1, sparse=TRUE)
    prob$bc <- rbind(blc=-Inf, buc=1)
    prob$bx <- rbind(blx=rep(-Inf,4), bux=rep(Inf,4))

    # The quadratic cone
    FQ <- rbind(c(0,0,0,0), c(1,0,0,0), c(0,1,0,0))
    gQ <- c(1, -0.5, -0.6)
    cQ <- matrix(list("QUAD", 3, NULL), nrow=3, ncol=1)

    # The power cone for (x_1, 1, t_1) \in POW3^(1/3,2/3)
    FP1 <- rbind(c(1,0,0,0), c(0,0,0,0), c(0,0,1,0))
    gP1 <- c(0, 1, 0)
    cP1 <- matrix(list("PPOW", 3, c(1/3, 2/3)), nrow=3, ncol=1)

    # The power cone for (x_1+x_2+0.1, 1, t_2) \in POW3^(1/4,3/4)
    FP2 <- rbind(c(1,1,0,0), c(0,0,0,0), c(0,0,0,1))
    gP2 <- c(0.1, 1, 0)
    cP2 <- matrix(list("PPOW", 3, c(1.0, 3.0)), nrow=3, ncol=1)

    # All cones
    prob$F <- rbind(FQ, FP1, FP2)
    prob$g <- cbind(gQ, gP1, gP2)
    prob$cones <- cbind(cQ, cP1, cP2)
    rownames(prob$cones) <- c("type","dim","conepar")

    r <- mosek(prob, list(soldetail=1))
    stopifnot(identical(r$response$code, 0))


6.2.2 Example AFFCO2

Consider the following simple linear dynamical system. A point in \(\real^n\) moves along a trajectory given by \(z(t) = z(0)\exp(At)\), where \(z(0)\) is the starting position and \(A=\Diag(a_1,\ldots,a_n)\) is a diagonal matrix with \(a_i<0\). Find the time after which \(z(t)\) is within euclidean distance \(d\) from the origin. Denoting the coordinates of the starting point by \(z(0)=(z_1,\ldots,z_n)\) we can write this as an optimization problem in one variable \(t\):

\[\begin{split}\begin{array}{lrcl} \minimize & t & & \\ \st & \sqrt{\sum_i \left(z_i\exp(a_it)\right)^2} & \leq & d, \end{array}\end{split}\]

which can be cast into conic form as:

(6.6)\[\begin{split}\begin{array}{lrcl} \minimize & t & & \\ \st & (d,z_1y_1,\ldots,z_ny_n) & \in & \Q^{n+1},\\ & (y_i, 1, a_it) & \in & \EXP, \ i=1,\ldots,n, \end{array}\end{split}\]

with variable vector \(x=[t,y_1,\ldots,y_n]^T\).

We assemble all conic constraints in the form

\[Fx+g\in \Q^{n+1}\times(\EXP)^n.\]

For the conic quadratic constraint this representation is

\[\begin{split}\left[\begin{array}{cc}0&0_n^T\\0_n& \Diag(z_1,\ldots,z_n)\end{array}\right] \left[\begin{array}{c}t\\y\end{array}\right] + \left[\begin{array}{c}d\\0_n\end{array}\right] \in \Q^{n+1}.\end{split}\]

For the \(i\)-th exponential cone we have

\[\begin{split}\left[\begin{array}{cc}0&e_i^T\\0&0_n\\a_i&0_n\end{array}\right] \left[\begin{array}{c}t\\y\end{array}\right] + \left[\begin{array}{c}0\\1\\0\end{array}\right] \in \EXP,\end{split}\]

where \(e_i\) denotes a vector of length \(n\) with a single \(1\) in position \(i\).

Listing 6.3 Script implementing problem (6.6). Click here to download.
firstHittingTime <- function(n, z, a, d)
    prob <- list(sense="min")
    # Variables [t, y1, ..., yn]
    prob$A <- Matrix(nrow=0, ncol=n+1)
    prob$bx<- rbind(rep(-Inf,n+1), rep(Inf,n+1))
    prob$c <- c(1, rep(0, n))

    # Quadratic cone
    FQ <- Diagonal(n+1, c(0, z))
    gQ <- c(d, rep(0, n))

    # All exponential cones
    FE <- sparseMatrix(  c(seq(1,3*n,by=3), seq(3,3*n,by=3)),
                         c(2:(n+1),         rep(1,n)),
                       x=c(rep(1,n),        t(a)))
    gE <- rep(c(0, 1, 0), n)

    # Assemble input data
    prob$F <- rbind(FQ, FE)
    prob$g <- c(gQ, gE)
    prob$cones <- cbind(matrix(list("QUAD", 1+n, NULL), nrow=3, ncol=1),
                        matrix(list("PEXP", 3, NULL), nrow=3, ncol=n))
    rownames(prob$cones) <- c("type","dim","conepar")

    # Solve
    r <- mosek(prob, list(soldetail=1))
    stopifnot(identical(r$response$code, 0))