13 Appendix¶
13.1 Conic optimization refresher¶
In this section we give a short summary about conic optimization. Read more about the topic from a mathematical and modeling point of view in our other publication, the MOSEK Modeling Cookbook [MOSEKApS24].
Conic optimization is a class of convex optimization problems, which contains and generalizes in an unified way many specific and well known convex models. These models include linear optimization (LO), quadratic optimization (QO), quadratically constrained quadratic optimization (QCQO), second-order cone optimization (SOCO), and semidefinite optimization (SDO). The general form of a conic optimization problem is
where
Linear cone:
Linear cones model all traditionally LO problems.
Quadratic cone and rotated quadratic cone:
The quadratic cone is the set
The rotated quadratic cone is the set
Modeling with these two cones cover the class of SOCO problems, which include all traditionally QO and QCQO problems as well. See in Sec. 13.1.2 (Traditional quadratic models) for more details.
Primal power cone:
or its dual cone.
Primal exponential cone:
or its dual cone.
Semidefinite cone:
Semidefinite cones model all traditionally SDO problems.
Each of these cones allow formulating different types of convex constraints.
13.1.1 Selection of conic constraints¶
In the following we will list the constraints appearing in financial context in this book and show how can we convert them to conic form.
13.1.1.1 Maximum function¶
We can model the maximum constraint
For instance, we could write a series of constraints
where
13.1.1.2 Positive and negative part¶
A special case of modeling the maximum function is to model the positive part
We can model them explicitly based on Sec. 13.1.1.1 (Maximum function) by relaxing the definitions to inequalities
Note however, that in either case a freedom remains in the magnitude of
We can do workarounds to ensure that equalities
13.1.1.3 Absolute value¶
We can model the absolute value constraint
Another possibility is to model it using the quadratic cone:
13.1.1.4 Sum of largest elements¶
The sum of the
Here
Problem (13.8) is actually the same as
13.1.1.5 Linear combination of largest elements¶
We can slightly extend the problem in Sec. 13.1.1.4 (Sum of largest elements) such that
This has the optimal objective
If we take the dual of problem (13.9), we get:
which is the same as
13.1.1.6 Manhattan norm (1-norm)¶
Let
where
13.1.1.7 Euclidean norm (2-norm)¶
Let
13.1.1.8 Hyperbolic constraint¶
Let
13.1.1.9 Squared Euclidean norm¶
We can model the squared Euclidean norm or sum-of-squares constraint
13.1.1.10 Quadratic form¶
Let
Most common ways to compute the matrix
Cholesky decomposition:
, where is a lower triangular matrix with nonnegative entries in the diagonal. From this decomposition we have .Eigenvalue decomposition:
, where the diagonal matrix contains the (nonnegative) eigenvalues of and the unitary matrix contains the corresponding eigenvectors in its columns. From this decomposition we have .Matrix square root:
, where is the symmetric positive semidefinite “square root” matrix of . From this decomposition we have .Factor model: If
is a covariance matrix of some data, then we can approximate the data series with the combination of common factors. Then we have the decomposition , where is the covariance of the factors, is the exposure of the data series to each factor, and is diagonal. From this, by computing the Cholesky decomposition we have . The advantage of factor models is that is very sparse and the factors have a direct financial interpretation (see Sec. 5 (Factor models) for details).
After obtaining the matrix
We can choose to model this using the rotated quadratic cone as
or we can choose to model its square root using the quadratic cone as
Whether to use the quadratic cone or the rotated quadratic cone for modeling can be decided based on which is more natural. Typically the quadratic cone is used to model 2-norm constraints, while the rotated quadratic cone is more natural for modeling of quadratic functions. There can be exceptions, however; see for example in Sec. 2.3 (Conic formulation).
13.1.1.11 Power¶
Let
13.1.1.12 Exponential¶
Let
13.1.1.13 Log-sum-exp¶
Let
13.1.1.14 Perspective of function¶
The perspective of a function
13.1.1.15 Perspective of log-sum-exp¶
Let
13.1.2 Traditional quadratic models¶
Optimization problems involving quadratic functions often appear in practice. In this section we show how to convert the traditionally QO or QCQO problems into conic optimization form, because most of the time the solution of these conic models is computationally more efficient.
13.1.2.1 Quadratic optimization¶
The standard form of a quadratic optimization (QO) problem is the following:
The matrix
Assuming the factorization
13.1.2.2 Quadratically constrained quadratic optimization¶
Consider the quadratically constrained quadratic optimization (QCQO) problem of the form
The matrices
Assuming the factorization
13.1.2.3 Practical benefits of conic models¶
The key step in the model conversion is to transform quadratic terms
The storage requirement
of can be much lower than the storage requirement of .The amount of work to evaluate
is proportional to whereas the work to evaluate is proportional to only.No need to numerically validate positive semidefiniteness of the matrix
. This is otherwise difficult owing to the presence of rounding errors that can make indefinite.Duality theory is much simpler for conic quadratic optimization.
In summary, the conic equivalents are not only as easy to solve as the original QP or QCQP problems, but in most cases also need less space and solution time.
13.2 Mixed-integer models¶
Mixed integer optimization (MIO) is an extension of convex optimization, which introduces variables taking values only in the set of integers or in some subset of integers. This allows for solving a much wider range of optimization problems in addition to the ones convex optimization already covers. However, these problems are no longer convex. Handling of integer variables make the optimization NP-hard and needs specific algorithms, thus the solution of MIO problems is much slower. In many practical cases we cannot expect to find the optimal solution in reasonable time, and only near-optimal solutions will be available.
A general mixed integer optimization problem has the following form:
where
13.2.1 Selection of integer constraints¶
In the following we will list the integer constraints appearing in financial context in this book and show how to model them.
13.2.1.1 Switch¶
In some practical cases we might wish to impose conditions on parts of our optimization model. For example, allowing nonzero value for a variable only in presence of a condition. We can model such situations using binary variables (or indicator variables), which can only take
The number
If we have a vector variable
where
13.2.1.2 Semi-continuous variable¶
A slight extension of Sec. 13.2.1.1 (Switch) is to model semi-continuous variables, i. e. when
13.2.1.3 Cardinality¶
We might need to limit the number of nonzeros in a vector
13.2.1.4 Positive and negative part¶
We introduced the positive part
Here the binary variable
Sometimes we need to handle separately the case when both the positive and the negative part is fixed to be zero. Then we need to introduce two binary variables
13.3 Quadratic cones and riskless solution¶
In Sec. 6 (Transaction costs) we consider portfolio optimization problems with risk-free security. In this case there is a difference in the computation of the efficient frontier, if we model using the quadratic cone instead of the rotated quadratic cone. Namely, the optimization problem will not have solutions that are a mix of the risk-free security and risky securities. Instead, it will only have either the 100% risk-free solution, or a portfolio of only risky securities. Along the derivation below we will see that the same behavior applies to problems not having a risk-free security but having
Consider the following simple model:
where
We can transform this problem using
The feasible region in this form is a probability simplex. The solution
Let us denote the objective function value by
This does not depend on the norm of
Thus the objective is linear along the 1-dimensional slice between
Furthermore, along every 1-dimensional slice represented by a direction
This is the Sharpe ratio of all portfolios of the form
In general, the larger we set
13.4 Monte Carlo Optimization Selection (MCOS)¶
We can use the following procedure to compare the accuracy of different portfolio optimization methods.
Inputs of the procedure are the estimated mean return
and estimated covariance . Treat this pair of inputs as true values.Compute the optimal asset allocation
for the original input pair .Repeat
times:Do parametric bootstrap resampling, i. e., draw a new
return data sample and derive a simulated pair .De-noise the covariance matrix
using the method in Sec. 4.2.2 (Covariance de-noising).Compute the optimal portfolio
using the optimization method(s) that we wish to analyze.
To estimate the error for each security in the optimal portfolio, compute the standard deviation of the difference vectors
. Then by taking the mean we get a single number estimating the error of the optimization method.
Regarding the final step, we can also measure the average decay in performance by any of the following:
the mean difference in expected outcomes:
the mean difference in variance:
the mean difference in Sharpe ratio or other metric computed from the above statistics.