12 Regression and regularization
Regression is a statistical tool used in many disciplines, in particular in finance and investing. It is used to determine the type and strength of the functional relationship between statistical variables. When data is scarce or noisy, regularization is also useful. In this chapter we introduce regression and regularization in the context of conic optimization, along the lines of [SHAD13], and show how to apply these methods to portfolio optimization.
12.1 Linear regression
The most basic regression problem assumes a linear relationship

\[
y = Xw + \varepsilon
\]

between the response variable $y \in \mathbb{R}^T$, the matrix of explanatory variables (observations) $X \in \mathbb{R}^{T \times N}$, and the vector of error terms $\varepsilon \in \mathbb{R}^T$. The coefficient vector $w \in \mathbb{R}^N$ is found by the method of ordinary least squares (OLS), which minimizes the 2-norm of the residual:

\[
\min_w\ \|Xw - y\|_2 \qquad (12.1)
\]

The same problem can also be written with squared norm:

\[
\min_w\ \|Xw - y\|_2^2 \qquad (12.2)
\]

The geometric interpretation is that we are looking for the vector in the column space of $X$ that is closest to $y$, i.e., the orthogonal projection of $y$ onto that subspace.
12.1.1 Assumptions
Let $\varepsilon = y - Xw$ denote the vector of error terms. The OLS method relies on the following assumptions:

- Exogeneity: $\mathbb{E}(\varepsilon\,|\,X) = 0$, meaning that the error term is orthogonal to the explanatory variables, so there are no endogenous drivers for $y$ in the model. This also implies $\mathbb{E}(\varepsilon) = 0$.
- No autocorrelation: The error terms are uncorrelated between observations. This implies that the off-diagonal of $\mathrm{Cov}(\varepsilon)$ is zero.
- Homoscedasticity: The error term has the same variance $\sigma^2$ for all observations, i.e., for any values of the explanatory variables. Together with the previous assumption this implies $\mathrm{Cov}(\varepsilon) = \sigma^2 I$.
- No linear dependence: The observation matrix $X$ must have full column rank.
We also assume that the number of observations is at least the number of explanatory variables, i.e., $T \geq N$, which is necessary for $X$ to have full column rank.
Some of the above assumptions can be relaxed by using specific extensions of the OLS method. However, these extensions might be more complex and might have a greater data requirement in order to produce an equally precise model.
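As an illustration, some of these assumptions can be checked empirically on fitted residuals. The following minimal sketch assumes the statsmodels package is available and uses hypothetical data; it is not part of the optimization workflow:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Hypothetical data satisfying the OLS assumptions
rng = np.random.default_rng(3)
T, N = 300, 4
X = rng.normal(size=(T, N))
y = X @ np.array([1.0, -0.5, 0.2, 0.0]) + rng.normal(scale=0.1, size=T)

resid = sm.OLS(y, X).fit().resid
print("Mean of residuals:", resid.mean())        # exogeneity: close to 0
print("Durbin-Watson:", durbin_watson(resid))    # no autocorrelation: close to 2
print("Rank of X:", np.linalg.matrix_rank(X))    # no linear dependence: equals N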
12.1.2 Solution
12.1.2.1 Normal equations
If problem (12.1) is unconstrained, we can also derive its explicit solution, given by the normal equations:

\[
w = (X^\mathsf{T} X)^{-1} X^\mathsf{T} y \qquad (12.3)
\]
Note, however, that this should be used only as a theoretical result. In practice it can be numerically unstable, because of the matrix inversion step when solving for $w$; solvers based on QR or singular value decomposition are preferred.
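As a quick illustration, the following minimal NumPy sketch (with hypothetical data) compares the normal equations with a numerically stable least squares solver based on the singular value decomposition:

import numpy as np

# Hypothetical data: T observations of N explanatory variables
rng = np.random.default_rng(42)
T, N = 200, 5
X = rng.normal(size=(T, N))
y = X @ rng.normal(size=N) + 0.1 * rng.normal(size=T)

# Normal equations (12.3): simple, but potentially numerically unstable
w_normal = np.linalg.solve(X.T @ X, X.T @ y)

# SVD-based solver: the numerically preferred route
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(w_normal, w_lstsq))   # True on well-conditioned data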
12.1.2.2 Conic optimization
If we convert the problem (12.2) into a conic optimization problem, we can not only solve it in a more efficient and numerically stable way, but will also be able to extend the problem with constraints. Here we state the conic equivalent of (12.1):

\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & (t,\ Xw - y) \in \mathcal{Q}^{T+1}
\end{array}
\qquad (12.4)
\]

where the quadratic cone membership models the constraint $t \geq \|Xw - y\|_2$. For (12.2) we use the rotated quadratic cone:

\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & (t,\ 1/2,\ Xw - y) \in \mathcal{Q}_\mathrm{r}^{T+2}
\end{array}
\qquad (12.5)
\]

which models the constraint $t \geq \|Xw - y\|_2^2$. Let us now extend (12.5) with constraints, for instance linear inequalities on the coefficients:

\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & (t,\ 1/2,\ Xw - y) \in \mathcal{Q}_\mathrm{r}^{T+2}, \\
& Aw \leq b
\end{array}
\qquad (12.6)
\]
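The following minimal Fusion sketch implements (12.5) on hypothetical data; the names and data are assumptions for illustration only:

import numpy as np
from mosek.fusion import Model, Domain, Expr, ObjectiveSense

# Hypothetical data
rng = np.random.default_rng(0)
T, N = 100, 4
X = rng.normal(size=(T, N))
y = rng.normal(size=T)

with Model("conic-ols") as M:
    w = M.variable("w", N, Domain.unbounded())
    t = M.variable("t", 1, Domain.unbounded())
    # (t, 1/2, Xw - y) in a rotated quadratic cone models t >= ||Xw - y||^2
    M.constraint(Expr.vstack(t, 0.5, X @ w - y), Domain.inRotatedQCone())
    M.objective(ObjectiveSense.Minimize, t)
    M.solve()
    print(w.level())

Constraints such as $Aw \leq b$ in (12.6) can then be added with further M.constraint calls without changing the rest of the model.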
12.1.3 Portfolio optimization as regression problem
Portfolio optimization problems can also take the form of (12.2) if, instead of working with the covariance matrix $\Sigma$ of securities, we work directly with the matrix of return observations. Let $R \in \mathbb{R}^{N \times T}$ hold $T$ return observations of the $N$ securities in its rows, and let $R_\mathrm{c}$ denote its centered version, obtained by subtracting the mean of each row. Then the sample covariance matrix is

\[
\Sigma = \frac{1}{T-1} R_\mathrm{c} R_\mathrm{c}^\mathsf{T} \qquad (12.7)
\]

Equation (12.7) gives us a way to define the matrix $X = \frac{1}{\sqrt{T-1}} R_\mathrm{c}^\mathsf{T}$, so that the portfolio variance can be written as $x^\mathsf{T} \Sigma x = \|Xx\|_2^2$.
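A short NumPy check (on hypothetical data) confirms that this scaled, centered observation matrix reproduces the sample covariance matrix:

import numpy as np

# Hypothetical return data: T observations (rows) of N securities
rng = np.random.default_rng(0)
T, N = 500, 8
returns = rng.normal(0.0, 0.01, (T, N))

# Center the observations and scale by 1/sqrt(T-1) as in (12.7)
X = (returns - returns.mean(axis=0)) / np.sqrt(T - 1)

# X^T X equals the sample covariance matrix of the securities
print(np.allclose(X.T @ X, np.cov(returns, rowvar=False)))   # True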
Then we can write a simple benchmark relative optimization problem that minimizes tracking error as

\[
\begin{array}{ll}
\mbox{minimize} & \|R_\mathrm{c}^\mathsf{T} x - r_\mathrm{c}^\mathrm{bm}\|_2^2 \\
\mbox{subject to} & \mathbf{1}^\mathsf{T} x = 1, \\
& x \geq 0
\end{array}
\qquad (12.8)
\]
where $r_\mathrm{c}^\mathrm{bm}$ is the centered return series of the benchmark, and where we omitted the factor $\frac{1}{T-1}$ from the objective, because it does not change the optimal solution.
The conic form of (12.8) becomes

\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & (t,\ 1/2,\ R_\mathrm{c}^\mathsf{T} x - r_\mathrm{c}^\mathrm{bm}) \in \mathcal{Q}_\mathrm{r}^{T+2}, \\
& \mathbf{1}^\mathsf{T} x = 1, \\
& x \geq 0
\end{array}
\qquad (12.9)
\]
12.2 Regularization
In some cases the data matrix $X$ is ill-conditioned, for example because the explanatory variables are nearly linearly dependent or because there are too few observations relative to the number of variables. The OLS solution is then highly sensitive to perturbations of the data, and the model tends to overfit. Regularization typically means that we add a penalty term to our objective function, which helps to direct the solution procedure towards optimal solutions with desirable properties, such as smaller norm or greater sparsity, at the cost of a small bias.
12.2.1 Example penalty terms
Here we discuss three examples of regularization terms, two of which are commonly applied. Let $\lambda \geq 0$ denote the regularization parameter, which controls the tradeoff between the least squares objective and the penalty term.
12.2.1.1 Ridge regularization
Ridge regularization adds a quadratic penalty term to the objective function:

\[
\min_w\ \|Xw - y\|_2^2 + \lambda \|w\|_2^2 \qquad (12.10)
\]
If we increase $\lambda$, the norm of the optimal $w$ decreases, i.e., the penalty shrinks the coefficients towards zero.
We can derive the normal equations also in this case, and we can observe that there is a relation between ridge regularization and covariance shrinkage:

\[
w = (X^\mathsf{T} X + \lambda I)^{-1} X^\mathsf{T} y \qquad (12.11)
\]

Adding $\lambda I$ to $X^\mathsf{T} X$ amounts to shrinking the (scaled) sample covariance matrix towards a multiple of the identity matrix, and it also ensures that the matrix to be inverted is nonsingular.
The conic equivalent of (12.10) will introduce an extra rotated quadratic cone constraint for the regularization term:

\[
\begin{array}{ll}
\mbox{minimize} & t + \lambda s \\
\mbox{subject to} & (t,\ 1/2,\ Xw - y) \in \mathcal{Q}_\mathrm{r}^{T+2}, \\
& (s,\ 1/2,\ w) \in \mathcal{Q}_\mathrm{r}^{N+2}
\end{array}
\qquad (12.12)
\]
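A minimal NumPy sketch of the ridge normal equations (12.11), on hypothetical near-collinear data, shows that adding $\lambda I$ keeps the system well-posed:

import numpy as np

# Hypothetical data with strongly correlated columns
rng = np.random.default_rng(1)
T, N = 100, 5
base = rng.normal(size=(T, 1))
X = base + 0.01 * rng.normal(size=(T, N))   # near-collinear columns
y = rng.normal(size=T)

lam = 0.1
# Ridge normal equations (12.11): (X^T X + lam*I) w = X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(N), X.T @ y)
print(w_ridge)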
12.2.1.2 LASSO regularization
LASSO regularization adds a linear penalty term to the objective function:

\[
\min_w\ \|Xw - y\|_2^2 + \lambda \|w\|_1 \qquad (12.13)
\]
It is also called L1 regularization or sparse regularization, because it gives preference to sparse solutions, i.e., ones with few nonzeros.
The conic equivalent of (12.13) will model the 1-norm constraint for the regularization term, as described in Sec. 13.1.1.6 (Manhattan norm (1-norm)):

\[
\begin{array}{ll}
\mbox{minimize} & t + \lambda \mathbf{1}^\mathsf{T} z \\
\mbox{subject to} & (t,\ 1/2,\ Xw - y) \in \mathcal{Q}_\mathrm{r}^{T+2}, \\
& -z \leq w \leq z
\end{array}
\qquad (12.14)
\]
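As a sanity check, the solution of (12.13) can be compared against scikit-learn's Lasso. This sketch assumes scikit-learn is available; note that its objective scales the least squares term by 1/(2T), so its alpha corresponds to lambda/(2T) in our notation:

import numpy as np
from sklearn.linear_model import Lasso

# Hypothetical data with a sparse ground truth
rng = np.random.default_rng(2)
T, N = 200, 10
X = rng.normal(size=(T, N))
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=T)

lam = 1.0
# sklearn minimizes 1/(2T)*||Xw - y||^2 + alpha*||w||_1,
# so alpha = lam/(2T) matches the scale of (12.13)
model = Lasso(alpha=lam / (2 * T), fit_intercept=False)
model.fit(X, y)
print(model.coef_)   # sparse: most coefficients are exactly zero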
12.2.1.3 The 3/2 regularization
The penalty term discussed here is not common, but it has applications in modeling market impact costs (see Sec. 6.3 (Market impact costs)). We add a subquadratic penalty term to the objective function:

\[
\min_w\ \|Xw - y\|_2^2 + \lambda \sum_{i=1}^{N} |w_i|^{\beta} \qquad (12.15)
\]

where $1 < \beta < 2$; in the market impact context the typical choice is $\beta = 3/2$.
The conic equivalent of (12.15) uses the power cone for the regularization term, modeling $v_i \geq |w_i|^\beta$ as $(v_i,\ 1,\ w_i) \in \mathcal{P}_3^{1/\beta,\,1-1/\beta}$:

\[
\begin{array}{ll}
\mbox{minimize} & t + \lambda \mathbf{1}^\mathsf{T} v \\
\mbox{subject to} & (t,\ 1/2,\ Xw - y) \in \mathcal{Q}_\mathrm{r}^{T+2}, \\
& (v_i,\ 1,\ w_i) \in \mathcal{P}_3^{1/\beta,\,1-1/\beta}, \quad i = 1, \dots, N
\end{array}
\qquad (12.16)
\]
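A minimal Fusion sketch (with an assumed test value w0) verifies the power cone model of the penalty: minimizing $v$ subject to $(v,\ 1,\ w_0) \in \mathcal{P}_3^{1/\beta,\,1-1/\beta}$ returns $|w_0|^\beta$:

from mosek.fusion import Model, Domain, Expr, ObjectiveSense

beta, w0 = 1.5, 0.7
with Model("powercone-check") as M:
    v = M.variable(1, Domain.unbounded())
    # (v, 1, w0) in the power cone models v >= |w0|^beta
    M.constraint(Expr.vstack(v, 1.0, w0), Domain.inPPowerCone(1.0 / beta))
    M.objective(ObjectiveSense.Minimize, v)
    M.solve()
    print(v.level()[0], abs(w0) ** beta)   # both approximately 0.5857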
12.2.2 Regularization in portfolio optimization
In the context of portfolio optimization, examples of regularization terms appear when we penalize trading costs or do robust optimization. Here we show an example of a trading cost penalty. For robust optimization, see Sec. 10 (Robust optimization).
Let $x_0$ denote the initial portfolio, and let $\tilde{x} = x - x_0$ be the amount traded in each security. Suppose we have linear trading costs, expressed by the penalty term $\lambda_1 \|\tilde{x}\|_1$, and market impact costs, expressed by $\lambda_2 \sum_i |\tilde{x}_i|^{3/2}$. Extending problem (12.8) with these penalties gives

\[
\begin{array}{ll}
\mbox{minimize} & \|R_\mathrm{c}^\mathsf{T} x - r_\mathrm{c}^\mathrm{bm}\|_2^2 + \lambda_1 \|\tilde{x}\|_1 + \lambda_2 \sum_{i=1}^{N} |\tilde{x}_i|^{3/2} \\
\mbox{subject to} & \mathbf{1}^\mathsf{T} x = 1, \\
& x \geq 0
\end{array}
\qquad (12.17)
\]
The conic form of (12.17) becomes

\[
\begin{array}{ll}
\mbox{minimize} & t + \lambda_1 u + \lambda_2 \mathbf{1}^\mathsf{T} v \\
\mbox{subject to} & (t,\ 1/2,\ R_\mathrm{c}^\mathsf{T} x - r_\mathrm{c}^\mathrm{bm}) \in \mathcal{Q}_\mathrm{r}^{T+2}, \\
& \|\tilde{x}\|_1 \leq u, \\
& (v_i,\ 1,\ \tilde{x}_i) \in \mathcal{P}_3^{2/3,\,1/3}, \quad i = 1, \dots, N, \\
& \mathbf{1}^\mathsf{T} x = 1, \\
& x \geq 0
\end{array}
\qquad (12.18)
\]

where the 1-norm constraint is modeled as described in Sec. 13.1.1.6 (Manhattan norm (1-norm)).
12.3 Example
In this section we present the example (12.18) as MOSEK Fusion code. We will use yearly linear return scenario data for the same eight stocks used in other examples. The benchmark will be the return series of the SPY ETF.
12.3.1 Data preparation
We generate the data the same way as in Sec. 3.4 (Example), up to the point where we have the expected yearly logarithmic return vector m_log and covariance matrix S_log of the eight securities and the SPY benchmark.
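For a self-contained run, hypothetical placeholders can stand in for the Sec. 3.4 estimates (these are assumed values, not the cookbook's data):

import numpy as np

# Hypothetical stand-ins for the Sec. 3.4 estimates
T = 1000                  # number of return scenarios
m_log = np.zeros(9)       # expected yearly log-returns: 8 stocks + SPY
S_log = 0.05 * np.eye(9)  # covariance of the yearly log-returns

With these in place, we generate the scenarios: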
# Generate logarithmic return observations assuming normal distribution
scenarios_log = \
    np.random.default_rng().multivariate_normal(m_log, S_log, T)
# Convert logarithmic return observations to linear return observations
scenarios_lin = np.exp(scenarios_log) - 1
Next, we center the data and separate the security data from the benchmark data. Note that we also scale the returns by the factor $\frac{1}{\sqrt{T-1}}$ appearing in (12.7):
# Center the return data
centered_return = scenarios_lin - scenarios_lin.mean(axis=0)

# Security return scenarios (scaled)
security_return = centered_return[:, :N] / np.sqrt(T - 1)

# Benchmark return scenarios (scaled)
benchmark_return = centered_return[:, -1] / np.sqrt(T - 1)
12.3.2 Optimization model
We start by defining the variables. The variable x denotes the fraction of holdings in each security, the expression xt = x - x0 is the amount traded, t models the tracking error term, and u and v model the two transaction cost penalty terms:
# Variables
# The variable x is the fraction of holdings in each security.
# It is restricted to be positive - no short-selling.
x = M.variable("x", N, Domain.greaterThan(0.0))
xt = x - x0
# The variable t models the OLS objective function term (tracking error).
t = M.variable("t", 1, Domain.unbounded())
# The variables u and v model the regularization terms
# (transaction cost penalties).
u = M.variable("u", 1, Domain.unbounded())
v = M.variable("v", N, Domain.unbounded())
# Budget constraint
M.constraint('budget', Expr.sum(x) == 1.0)
The objective function is then the sum of the tracking error term and the weighted penalty terms:
# Objective
penalty_lin = lambda_1 * u
penalty_32 = lambda_2 * Expr.sum(v)
M.objective('obj', ObjectiveSense.Minimize,
            t + penalty_lin + penalty_32)
The constraints corresponding to the penalty terms are modeled by the following lines:
# Constraints for the penalties
norm1(M, xt, u)
M.constraint('market_impact',
             Expr.hstack(v, Expr.constTerm(N, 1.0), xt),
             Domain.inPPowerCone(1.0 / beta))
The definition of the custom function norm1 is given below, where the full code of the Fusion model is presented.
Finally, we implement the tracking error constraint:
# Constraint for the tracking error
residual = R.T @ x - r_bm
M.constraint('tracking_error',
             Expr.vstack(t, 0.5, residual),
             Domain.inRotatedQCone())
The complete Fusion model will then be the following code:
import numpy as np
import pandas as pd
from mosek.fusion import Model, Domain, Expr, ObjectiveSense


def absval(M, x, z):
    # Models z >= |x| elementwise using two linear constraints.
    M.constraint(z >= x)
    M.constraint(z >= -x)


def norm1(M, x, t):
    # Models t = ||x||_1 by summing elementwise absolute values.
    z = M.variable(x.getSize(), Domain.greaterThan(0.0))
    absval(M, x, z)
    M.constraint(Expr.sum(z) == t)


def MinTrackingError(N, R, r_bm, x0, lambda_1, lambda_2, beta=1.5):
    with Model("Regression") as M:
        # Variables
        # The variable x is the fraction of holdings in each security.
        # It is restricted to be positive - no short-selling.
        x = M.variable("x", N, Domain.greaterThan(0.0))
        xt = x - x0

        # The variable t models the OLS objective function term
        # (tracking error).
        t = M.variable("t", 1, Domain.unbounded())

        # The variables u and v model the regularization terms
        # (transaction cost penalties).
        u = M.variable("u", 1, Domain.unbounded())
        v = M.variable("v", N, Domain.unbounded())

        # Budget constraint
        M.constraint('budget', Expr.sum(x) == 1.0)

        # Objective
        penalty_lin = lambda_1 * u
        penalty_32 = lambda_2 * Expr.sum(v)
        M.objective('obj', ObjectiveSense.Minimize,
                    t + penalty_lin + penalty_32)

        # Constraints for the penalties
        norm1(M, xt, u)
        M.constraint('market_impact',
                     Expr.hstack(v, Expr.constTerm(N, 1.0), xt),
                     Domain.inPPowerCone(1.0 / beta))

        # Constraint for the tracking error
        residual = R.T @ x - r_bm
        M.constraint('tracking_error',
                     Expr.vstack(t, 0.5, residual),
                     Domain.inRotatedQCone())

        # Create DataFrame to store the results.
        # Last security name (the SPY) is removed.
        # df_prices comes from the data preparation in Sec. 3.4.
        columns = ["track_err", "lin_tcost", "mkt_tcost"] + \
                  df_prices.columns[:N].tolist()
        df_result = pd.DataFrame(columns=columns)

        # Solve optimization
        M.solve()

        # Save results
        tracking_error = t.level()[0]
        linear_tcost = u.level()[0]
        market_impact_tcost = np.sum(v.level())
        row = pd.Series(
            [tracking_error, linear_tcost, market_impact_tcost] +
            list(x.level()), index=columns)
        df_result = pd.concat([df_result, pd.DataFrame([row])],
                              ignore_index=True)

        return df_result
12.3.3 Results
Here we show how to run the optimization model and present the results. First, we set the number of stocks, the penalty coefficients lambda_1 and lambda_2, and the initial holdings x0:
N = 8
lambda_1 = 0.0001
lambda_2 = 0.0001
x0 = np.ones(N) / N
Then we run the optimization model:
df_result = MinTrackingError(N, security_return.T,
                             benchmark_return, x0, lambda_1, lambda_2)
The optimal portfolio composition, together with the tracking error and the two transaction cost terms, can then be read from the returned DataFrame df_result.