# 10.1 Portfolio Optimization¶

In this section the Markowitz portfolio optimization problem and variants are implemented using Rmosek package.

## 10.1.1 The Basic Model¶

The classical Markowitz portfolio optimization problem considers investing in \(n\) stocks or assets held over a period of time. Let \(x_j\) denote the amount invested in asset \(j\), and assume a stochastic model where the return of the assets is a random variable \(r\) with known mean

and covariance

The return of the investment is also a random variable \(y = r^Tx\) with mean (or expected return)

and variance

The standard deviation

is usually associated with risk.

The problem facing the investor is to rebalance the portfolio to achieve a good compromise between risk and expected return, e.g., maximize the expected return subject to a budget constraint and an upper bound (denoted \(\gamma\)) on the tolerable risk. This leads to the optimization problem

The variables \(x\) denote the investment i.e. \(x_j\) is the amount invested in asset \(j\) and \(x_j^0\) is the initial holding of asset \(j\). Finally, \(w\) is the initial amount of cash available.

A popular choice is \(x^0=0\) and \(w=1\) because then \(x_j\) may be interpreted as the relative amount of the total portfolio that is invested in asset \(j\).

Since \(e\) is the vector of all ones then

is the total investment. Clearly, the total amount invested must be equal to the initial wealth, which is

\[w + e^T x^0.\]

This leads to the first constraint

\[e^T x = w + e^T x^0.\]

The second constraint

\[x^T \Sigma x \leq \gamma^2\]

ensures that the variance, is bounded by the parameter \(\gamma^2\). Therefore, \(\gamma\) specifies an upper bound of the standard deviation (risk) the investor is willing to undertake. Finally, the constraint

\[x_j \geq 0\]

excludes the possibility of short-selling. This constraint can of course be excluded if short-selling is allowed.

The covariance matrix \(\Sigma\) is positive semidefinite by definition and therefore there exist a matrix \(G\) such that

In general the choice of \(G\) is **not** unique and one possible choice of \(G\) is the Cholesky factorization of \(\Sigma\). However, in many cases another choice is better for efficiency reasons as discussed in Sec. 10.1.3 (Factor model and efficiency). For a given \(G\) we have that

Hence, we may write the risk constraint as

or equivalently

where \(\Q^{n+1}\) is the \((n+1)\)-dimensional quadratic cone. Therefore, problem (10.1) can be written as

which is a conic quadratic optimization problem that can easily be formulated and solved with Rmosek package. Subsequently we will use the example data

and

This implies

Why a Conic Formulation?

Problem (10.1) is a convex quadratically constrained optimization problem that can be solved directly using **MOSEK**. Why then reformulate it as a conic quadratic optimization problem (10.3)? The main reason for choosing a conic model is that it is more robust and usually solves faster and more reliably. For instance it is not always easy to numerically validate that the matrix \(\Sigma\) in (10.1) is positive semidefinite due to the presence of rounding errors. It is also very easy to make a mistake so \(\Sigma\) becomes indefinite. These problems are completely eliminated in the conic formulation.

Moreover, observe the constraint

more numerically robust than

for very small and very large values of \(\gamma\). Indeed, if say \(\gamma \approx 10^4\) then \(\gamma^2\approx 10^8\), which introduces a scaling issue in the model. Hence, using conic formulation we work with the standard deviation instead of variance, which usually gives rise to a better scaled model.

Example code

Listing 10.1 demonstrates how the basic Markowitz model (10.3) is implemented.

```
BasicMarkowitz <- function(
n, # Number of assets
mu, # An n-dimmensional vector of expected returns
GT, # A matrix with n columns so (GT')*GT = covariance matrix
x0, # Initial holdings
w, # Initial cash holding
gamma) # Maximum risk (=std. dev) accepted
{
prob <- list(sense="max")
prob$c <- mu
prob$A <- Matrix(1.0, ncol=n)
prob$bc <- rbind(blc=w+sum(x0),
buc=w+sum(x0))
prob$bx <- rbind(blx=rep(0.0,n),
bux=rep(Inf,n))
# Specify the affine conic constraints.
NUMCONES <- 1
prob$F <- rbind(
Matrix(0.0,ncol=n),
GT
)
prob$g <- c(gamma,rep(0,n))
prob$cones <- matrix(list(), nrow=3, ncol=NUMCONES)
rownames(prob$cones) <- c("type","dim","conepar")
prob$cones[-3,1] <- list("QUAD", n+1)
# Solve the problem
r <- mosek(prob,list(verbose=1))
stopifnot(identical(r$response$code, 0))
# Return the solution
x <- r$sol$itr$xx
list(expret=drop(mu %*% x), stddev=gamma, x=x)
}
```

The source code should be self-explanatory except perhaps for

```
NUMCONES <- 1
prob$F <- rbind(
Matrix(0.0,ncol=n),
GT
)
prob$g <- c(gamma,rep(0,n))
prob$cones <- matrix(list(), nrow=3, ncol=NUMCONES)
rownames(prob$cones) <- c("type","dim","conepar")
prob$cones[-3,1] <- list("QUAD", n+1)
```

where the constraint

is created as an *affine conic constraint format* of the form \(Fx+g\in\K\), in this specific case:

## 10.1.2 The Efficient Frontier¶

The portfolio computed by the Markowitz model is efficient in the sense that there is no other portfolio giving a strictly higher return for the same amount of risk. An efficient portfolio is also sometimes called a Pareto optimal portfolio. Clearly, an investor should only invest in efficient portfolios and therefore it may be relevant to present the investor with all efficient portfolios so the investor can choose the portfolio that has the desired tradeoff between return and risk.

Given a nonnegative \(\alpha\) the problem

is one standard way to trade the expected return against penalizing variance. Note that, in contrast to the previous example, we explicitly use the variance (\(\|G^Tx\|_2^2\)) rather than standard deviation (\(\|G^Tx\|_2\)), therefore the conic model includes a rotated quadratic cone:

The parameter \(\alpha\) specifies the tradeoff between expected return and variance. Ideally the problem (10.4) should be solved for all values \(\alpha \geq 0\) but in practice it is impossible. Using the example data from Sec. 10.1.1 (The Basic Model), the optimal values of return and variance for several values of \(\alpha\) are shown in the figure.

Example code

Listing 10.2 demonstrates how to compute the efficient portfolios for several values of \(\alpha\).

```
EfficientFrontier <- function(
n, # Number of assets
mu, # An n-dimmensional vector of expected returns
GT, # A matrix with n columns so (GT')*GT = covariance matrix
x0, # Initial holdings
w, # Initial cash holding
alphas) # List of risk penalties (we maximize expected return - alpha * variance)
{
prob <- list(sense="max")
prob$A <- cbind(Matrix(1.0, ncol=n), 0.0)
prob$bc <- rbind(blc=w+sum(x0),
buc=w+sum(x0))
prob$bx <- rbind(blx=c(rep(0.0,n), -Inf),
bux=rep(Inf,n+1))
# Specify the affine conic constraints.
NUMCONES <- 1
prob$F <- rbind(
cbind(Matrix(0.0,ncol=n), 1.0),
rep(0, n+1),
cbind(GT , 0.0)
)
prob$g <- c(0, 0.5, rep(0, n))
prob$cones <- matrix(list(), nrow=3, ncol=NUMCONES)
rownames(prob$cones) <- c("type","dim","conepar")
prob$cones[-3,1] <- list("RQUAD", n+2)
frontier <- matrix(NaN, ncol=3, nrow=length(alphas))
colnames(frontier) <- c("alpha","exp.ret.","variance")
for (i in seq_along(alphas))
{
prob$c <- c(mu, -alphas[i])
r <- mosek(prob, list(verbose=1))
stopifnot(identical(r$response$code, 0))
x <- r$sol$itr$xx[1:n]
gamma <- r$sol$itr$xx[n+1]
frontier[i,] <- c(alphas[i], drop(mu%*%x), gamma)
}
frontier
}
```

## 10.1.3 Factor model and efficiency¶

In practice it is often important to solve the portfolio problem very quickly. Therefore, in this section we discuss how to improve computational efficiency at the modeling stage.

The computational cost is of course to some extent dependent on the number of constraints and variables in the optimization problem. However, in practice a more important factor is the sparsity: the number of nonzeros used to represent the problem. Indeed it is often better to focus on the number of nonzeros in \(G\) see (10.2) and try to reduce that number by for instance changing the choice of \(G\).

In other words if the computational efficiency should be improved then it is always good idea to start with focusing at the covariance matrix. As an example assume that

where \(D\) is a positive definite diagonal matrix. Moreover, \(V\) is a matrix with \(n\) rows and \(p\) columns. Such a model for the covariance matrix is called a factor model and usually \(p\) is much smaller than \(n\). In practice \(p\) tends to be a small number independent of \(n\), say less than 100.

One possible choice for \(G\) is the Cholesky factorization of \(\Sigma\) which requires storage proportional to \(n(n+1)/2\). However, another choice is

because then

This choice requires storage proportional to \(n+pn\) which is much less than for the Cholesky choice of \(G\). Indeed assuming \(p\) is a constant storage requirements are reduced by a factor of \(n\).

The example above exploits the so-called factor structure and demonstrates that an alternative choice of \(G\) may lead to a significant reduction in the amount of storage used to represent the problem. This will in most cases also lead to a significant reduction in the solution time.

The lesson to be learned is that it is important to investigate how the covariance matrix is formed. Given this knowledge it might be possible to make a special choice for \(G\) that helps reducing the storage requirements and enhance the computational efficiency. More details about this process can be found in [And13].

## 10.1.4 Slippage Cost¶

The basic Markowitz model assumes that there are no costs associated with trading the assets and that the returns of the assets are independent of the amount traded. Neither of those assumptions is usually valid in practice. Therefore, a more realistic model is

Here \(\Delta x_j\) is the change in the holding of asset \(j\) i.e.

and \(T_j(\Delta x_j)\) specifies the transaction costs when the holding of asset \(j\) is changed from its initial value. In the next two sections we show two different variants of this problem with two nonlinear cost functions \(T\).

## 10.1.5 Market Impact Costs¶

If the initial wealth is fairly small and no short selling is allowed, then the holdings will be small and the traded amount of each asset must also be small. Therefore, it is reasonable to assume that the prices of the assets are independent of the amount traded. However, if a large volume of an asset is sold or purchased, the price, and hence return, can be expected to change. This effect is called market impact costs. It is common to assume that the market impact cost for asset \(j\) can be modeled by

where \(m_j\) is a constant that is estimated in some way by the trader. See [GK00] [p. 452] for details. From the Modeling Cookbook we know that \(t \geq |z|^{3/2}\) can be modeled directly using the power cone \(\POW_3^{2/3,1/3}\):

Hence, it follows that \(\sum_{j=1}^n T_j(\Delta x_j)=\sum_{j=1}^n m_j|x_j-x_j^0|^{3/2}\) can be modeled by \(\sum_{j=1}^n m_jt_j\) under the constraints

Unfortunately this set of constraints is nonconvex due to the constraint

but in many cases the constraint may be replaced by the relaxed constraint

For instance if the universe of assets contains a risk free asset then

cannot hold for an optimal solution.

If the optimal solution has the property (10.9) then the market impact cost within the model is larger than the true market impact cost and hence money are essentially considered garbage and removed by generating transaction costs. This may happen if a portfolio with very small risk is requested because the only way to obtain a small risk is to get rid of some of the assets by generating transaction costs. We generally assume that this is not the case and hence the models (10.7) and (10.8) are equivalent.

The above observations lead to

The revised budget constraint

specifies that the initial wealth covers the investment and the transaction costs. It should be mentioned that transaction costs of the form

where \(p>1\) is a real number can be modeled with the power cone as

See the Modeling Cookbook for details.

Example code

Listing 10.3 demonstrates how to compute an optimal portfolio when market impact cost are included.

```
MarkowitzWithMarketImpact <- function(
n, # Number of assets
mu, # An n-dimmensional vector of expected returns
GT, # A matrix with n columns so (GT')*GT = covariance matrix
x0, # Initial holdings
w, # Initial cash holding
gamma, # Maximum risk (=std. dev) accepted
m) # Market impacts (we use m_j|x_j-x0_j|^3/2 for j'th asset)
{
prob <- list(sense="max")
prob$c <- c(mu, rep(0,n))
prob$A <- cbind(Matrix(1.0,ncol=n), t(m))
prob$bc <- rbind(blc=w+sum(x0),
buc=w+sum(x0))
prob$bx <- rbind(blx=rep(0.0,2*n),
bux=rep(Inf,2*n))
# Specify the affine conic constraints.
# 1) Risk
Fr <- rbind(
Matrix(0.0,nrow=1,ncol=2*n),
cbind(GT, Matrix(0.0,nrow=n,ncol=n))
)
gr <- c(gamma,rep(0,n))
Kr <- matrix(list("QUAD", 1+n, NULL), nrow=3, ncol=1)
# 2) Market impact (t_j >= |x_j-x0_j|^3/2)
# [ t_j ]
# [ 1 ] \in PPOW(2,1)
# [ x_j - x0_j ]
Fm <- sparseMatrix(
i=c(seq(from=1,by=3,len=n), seq(from=3,by=3,len=n)),
j=c(seq(from=n+1,len=n), seq(from=1,len=n)),
x=c(rep(1.0,n), rep(1.0,n)),
dims=c(3*n, 2*n))
gm <- rep(c(0,1,0), n)
gm[seq(from=3,by=3,len=n)] <- -x0
Km <- matrix(list("PPOW", 3, c(2,1)), nrow=3, ncol=n)
prob$F <- rbind(Fr, Fm)
prob$g <- c(gr, gm)
prob$cones <- cbind(Kr, Km)
rownames(prob$cones) <- c("type","dim","conepar")
# Solve the problem
r <- mosek(prob,list(verbose=1))
stopifnot(identical(r$response$code, 0))
# Return the solution
x <- r$sol$itr$xx[1:n]
tx <- r$sol$itr$xx[(n+1):(2*n)]
list(expret=drop(mu %*% x), stddev=gamma, cost=drop(m %*% tx), x=x)
}
```

In the last part of the code we extend the affine conic constraint with triples of the form \((t_k, 1, x_k-x_k^0)\). Such a triple is constructed as an affine conic constraint with:

where \(e_{j}\) denotes the vector of length \(2n\) with \(1\) at position \(j\) and \(0\) otherwise. Membership of a sequence of triples in power cones \(\POW_3^{2/3,1/3}\) is specified with the syntax:

```
Km <- matrix(list("PPOW", 3, c(2,1)), nrow=3, ncol=n)
```

Note that the construction `list("PPOW", d, c(a,b))`

creates a power done of dimension \(d\) with exponents

## 10.1.6 Transaction Costs¶

Now assume there is a cost associated with trading asset \(j\) given by

Hence, whenever asset \(j\) is traded we pay a fixed setup cost \(f_j\) and a variable cost of \(g_j\) per unit traded. Given the assumptions about transaction costs in this section problem (10.6) may be formulated as

(10.11)¶\[\begin{split}\begin{array}{lrcll} \mbox{maximize} & \mu^T x & & &\\ \mbox{subject to} & e^T x + f^Ty + g^T z & = & w + e^T x^0, &\\ & (\gamma,G^T x) & \in & \Q^{n+1}, & \\ & z_j & \geq & x_j - x_j^0, & j=1,\ldots,n,\\ & z_j & \geq & x_j^0 - x_j, & j=1,\ldots,n,\\ & z_j & \leq & U_j y_j, & j=1,\ldots,n,\\ & y_j & \in & \{0,1\}, & j=1,\ldots,n, \\ & x & \geq & 0. & \end{array}\end{split}\]

First observe that

We choose \(U_j\) as some a priori upper bound on the amount of trading in asset \(j\) and therefore if \(z_j>0\) then \(y_j = 1\) has to be the case. This implies that the transaction cost for asset \(j\) is given by

Example code

The following example code demonstrates how to compute an optimal portfolio when transaction costs are included.

```
MarkowitzWithTransactionCosts <- function(
n, # Number of assets
mu, # An n-dimmensional vector of expected returns
GT, # A matrix with n columns so (GT')*GT = covariance matrix
x0, # Initial holdings
w, # Initial cash holding
gamma, # Maximum risk (=std. dev) accepted
f, # Fixed transaction cost
g) # Linear part of transaction cost
{
# Upper bound on the traded amount
u <- w+sum(x0)
prob <- list(sense="max")
prob$c <- c(mu, rep(0,2*n))
# Specify linear constraints
# [ e' g' f' ] [ x ] = w + e'*x0
# [ I -I 0 ] * [ z ] <= x0
# [ I I 0 ] [ y ] >= x0
# [ 0 I -U ] <= 0
prob$A <- rbind(cbind(Matrix(1.0,ncol=n), t(g), t(f)),
cbind(Diagonal(n, 1.0), -Diagonal(n, 1.0), Matrix(0,n,n)),
cbind(Diagonal(n, 1.0), Diagonal(n, 1.0), Matrix(0,n,n)),
cbind(Matrix(0,n,n), Diagonal(n, 1.0), Diagonal(n, -u)))
prob$bc <- rbind(blc=c(w+sum(x0), rep(-Inf,n), x0, rep(-Inf,n)),
buc=c(w+sum(x0), x0, rep(Inf,n), rep(0.0,n)))
# No shortselling and the linear bound 0 <= y <= 1
prob$bx <- rbind(blx=c(rep(0.0,n), rep(-Inf,n), rep(0.0,n)),
bux=c(rep(Inf,n), rep(Inf, n), rep(1.0,n)))
# Specify the affine conic constraints for risk
prob$F <- rbind(
Matrix(0.0,nrow=1,ncol=3*n),
cbind(GT, Matrix(0.0,nrow=n,ncol=2*n))
)
prob$g <- c(gamma,rep(0,n))
prob$cones <- matrix(list("QUAD", 1+n, NULL), nrow=3, ncol=1)
rownames(prob$cones) <- c("type","dim","conepar")
# Demand y to be integer (hence binary)
prob$intsub <- (2*n+1):(3*n);
# Solve the problem
r <- mosek(prob,list(verbose=1))
stopifnot(identical(r$response$code, 0))
# Return the solution
x <- r$sol$int$xx[1:n]
z <- r$sol$int$xx[(n+1):(2*n)]
y <- r$sol$int$xx[(2*n+1):(3*n)]
list(expret=drop(mu %*% x), stddev=gamma, cost = drop(f %*% y)+drop(g %*% z), x=x)
}
```

## 10.1.7 Cardinality constraints¶

Another method to reduce costs involved with processing transactions is to only change positions in a small number of assets. In other words, at most \(k\) of the differences \(|\Delta x_j|=|x_j - x_j^0|\) are allowed to be non-zero, where \(k\) is (much) smaller than the total number of assets \(n\).

This type of constraint can be again modeled by introducing a binary variable \(y_j\) which indicates if \(\Delta x_j\neq 0\) and bounding the sum of \(y_j\). The basic Markowitz model then gets updated as follows:

(10.12)¶\[\begin{split}\begin{array}{lrcll} \mbox{maximize} & \mu^T x & & &\\ \mbox{subject to} & e^T x & = & w + e^T x^0, &\\ & (\gamma,G^T x) & \in & \Q^{n+1}, & \\ & z_j & \geq & x_j - x_j^0, & j=1,\ldots,n,\\ & z_j & \geq & x_j^0 - x_j, & j=1,\ldots,n,\\ & z_j & \leq & U_j y_j, & j=1,\ldots,n,\\ & y_j & \in & \{0,1\}, & j=1,\ldots,n, \\ & e^T y & \leq & k, & \\ & x & \geq & 0, & \end{array}\end{split}\]

were \(U_j\) is some a priori chosen upper bound on the amount of trading in asset \(j\).

Example code

The following example code demonstrates how to compute an optimal portfolio with cardinality bounds.

```
MarkowitzWithCardinality <- function(
n, # Number of assets
mu, # An n-dimmensional vector of expected returns
GT, # A matrix with n columns so (GT')*GT = covariance matrix
x0, # Initial holdings
w, # Initial cash holding
gamma, # Maximum risk (=std. dev) accepted
k) # Cardinality bound
{
# Upper bound on the traded amount
u <- w+sum(x0)
prob <- list(sense="max")
prob$c <- c(mu, rep(0,2*n))
# Specify linear constraints
# [ e' 0 0 ] = w + e'*x0
# [ I -I 0 ] [ x ] <= x0
# [ I I 0 ] * [ z ] >= x0
# [ 0 I -U ] [ y ] <= 0
# [ 0 0 e' ] <= k
prob$A <- rbind(cbind(Matrix(1.0,ncol=n), Matrix(0.0,ncol=2*n)),
cbind(Diagonal(n, 1.0), -Diagonal(n, 1.0), Matrix(0,n,n)),
cbind(Diagonal(n, 1.0), Diagonal(n, 1.0), Matrix(0,n,n)),
cbind(Matrix(0,n,n), Diagonal(n, 1.0), Diagonal(n, -u)),
cbind(Matrix(0.0,ncol=2*n), Matrix(1.0,ncol=n)))
prob$bc <- rbind(blc=c(w+sum(x0), rep(-Inf,n), x0, rep(-Inf,n), 0.0),
buc=c(w+sum(x0), x0, rep(Inf,n), rep(0.0,n), k))
# No shortselling and the linear bound 0 <= y <= 1
prob$bx <- rbind(blx=c(rep(0.0,n), rep(-Inf,n), rep(0.0,n)),
bux=c(rep(Inf,n), rep(Inf, n), rep(1.0,n)))
# Specify the affine conic constraints for risk
prob$F <- rbind(
Matrix(0.0,nrow=1,ncol=3*n),
cbind(GT, Matrix(0.0,nrow=n,ncol=2*n))
)
prob$g <- c(gamma,rep(0,n))
prob$cones <- matrix(list("QUAD", 1+n, NULL), nrow=3, ncol=1)
rownames(prob$cones) <- c("type","dim","conepar")
# Demand y to be integer (hence binary)
prob$intsub <- (2*n+1):(3*n);
# Solve the problem
r <- mosek(prob,list(verbose=1))
stopifnot(identical(r$response$code, 0))
# Return the solution
x <- r$sol$int$xx[1:n]
list(card=k, expret=drop(mu %*% x), x=x)
}
```

If we solve our running example with \(k=1,2,3\) then we get the following solutions, with increasing expected returns:

```
Bound: 1 Expected return: 0,0627 Solution: 0,0000 0,0000 1,0000
Bound: 2 Expected return: 0,0669 Solution: 0,0939 0,0000 0,9061
Bound: 3 Expected return: 0,0685 Solution: 0,1010 0,1156 0,7834
```