# 10.5 Nearest Correlation Matrix Problem¶

A correlation matrix is a symmetric positive semidefinite matrix with unit diagonal. The term comes from statistics: the matrix whose entries are the correlation coefficients of a sequence of random variables has exactly these properties.

In this section we study variants of the problem of approximating a given symmetric matrix $$A$$ with correlation matrices:

• find the correlation matrix $$X$$ nearest to $$A$$ in the Frobenius norm,
• find an approximation of the form $$D+X$$, where $$D$$ is a diagonal matrix with positive diagonal and $$X$$ is a positive semidefinite matrix of low rank, using a combination of the Frobenius and nuclear norms.

Both problems are related to portfolio optimization, where one can often have a matrix $$A$$ that only approximates the correlations of stocks. For subsequent optimizations one would like to approximate $$A$$ with a correlation matrix or, in the factor model, with $$D+VV^T$$ with $$VV^T$$ of small rank.

## 10.5.1 Nearest correlation with the Frobenius norm¶

The Frobenius norm of a real matrix $$M$$ is defined as

$\|M\|_F = \left(\sum_{i,j}M_{i,j}^2\right)^{1/2}$

and with respect to this norm our optimization problem can be expressed simply as:

(1)$\begin{split}\begin{array}{ll} \minimize & \|A-X\|_F\\ \st & \mathbf{diag}(X) = e,\\ & X \succeq 0.\\ \end{array}\end{split}$

We can exploit the symmetry of $$A$$ and $$X$$ to get a compact vector representation. To this end we make use of the following mapping from a symmetric matrix to a flattened vector containing the (scaled) lower triangular part of the matrix:

(2)$\begin{split}\begin{array}{ll} \mbox{vec}: & \real^{n\times n} \rightarrow \real^{n(n+1)/2} \\ \mbox{vec}(M) = & (\alpha_{11}M_{11},\alpha_{21}M_{21},\alpha_{22}M_{22},\ldots,\alpha_{n1}M_{n1},\ldots,\alpha_{nn}M_{nn}) \\ \alpha_{ij}=&\begin{cases}1 & j=i\\ \sqrt{2} & j<i\end{cases} \end{array}\end{split}$

Note that $$\|M\|_F=\|\mbox{vec}(M)\|_2$$. The Fusion implementation of $$\mbox{vec}$$ is as follows:

Listing 26 Implementation of function $$\mbox{vec}$$ in (2).
Expression::t vec(Expression::t e)
{
  int N = e->getShape()->dim(0);
  int dim = N * (N + 1) / 2;

  // Build a sparse matrix S that selects the lower triangle of the
  // flattened input, scaling off-diagonal entries by sqrt(2).
  auto msubi = new_array_ptr<int, 1>(dim);
  auto msubj = new_array_ptr<int, 1>(dim);
  auto mcof  = new_array_ptr<double, 1>(dim);

  for (int i = 0, k = 0; i < N; ++i)
    for (int j = 0; j < i + 1; ++j, ++k)
    {
      (*msubi)[k] = k;
      (*msubj)[k] = i * N + j;
      (*mcof) [k] = (i == j) ? 1.0 : std::sqrt(2.0);
    }

  Matrix::t S = Matrix::sparse(N * (N + 1) / 2, N * N, msubi, msubj, mcof);
  return Expr::mul(S, Expr::reshape(e, N * N));
}


That leads to an optimization problem with both conic quadratic and semidefinite constraints:

(3)$\begin{split}\begin{array}{ll} \minimize & t\\ \st & (t, \mbox{vec} (A-X)) \in \Q,\\ & \mathbf{diag}(X) = e,\\ & X \succeq 0.\\ \end{array}\end{split}$

Code example

Listing 27 Implementation of problem (3).
void nearestcorr(std::shared_ptr<ndarray<double, 2>> A)
{
  int N = A->size(0);

  // Create a model
  Model::t M = new Model("NearestCorrelation");
  auto _M = finally([&]() { M->dispose(); });

  // Setting up the variables
  Variable::t X = M->variable("X", Domain::inPSDCone(N));
  Variable::t t = M->variable("t", 1, Domain::unbounded());

  // (t, vec(A-X)) \in Q
  M->constraint(Expr::vstack(t, vec(Expr::sub(A, X))), Domain::inQCone());

  // diag(X) = e
  M->constraint(X->diag(), Domain::equalsTo(1.0));

  // Objective: minimize t
  M->objective(ObjectiveSense::Minimize, t);

  // Solve the problem
  M->solve();

  // Get the solution values
  std::cout << "X = \n"; print_mat(std::cout, X->level());
  std::cout << "t = " << *(t->level()->begin()) << std::endl;
}


We use the following input:

Listing 28 Input for the nearest correlation problem.
int N = 5;
auto A = new_array_ptr<double, 2>(
  { {  0.0,   0.5,  -0.1,  -0.2,   0.5 },
    {  0.5,   1.25, -0.05, -0.1,   0.25},
    { -0.1,  -0.05,  0.51,  0.02, -0.05},
    { -0.2,  -0.1,   0.02,  0.54, -0.1 },
    {  0.5,   0.25, -0.05, -0.1,   1.25} });


The expected output is the following (small differences may apply):

X =
[ 1 0.500019 -0.0999999 -0.200001 0.500019]
[ 0.500019 1 -0.0499955 -0.0999915 0.249991]
[ -0.0999999 -0.0499955 1 0.0199975 -0.0499955]
[ -0.200001 -0.0999915 0.0199975 1 -0.0999915]
[ 0.500019 0.249991 -0.0499955 -0.0999915 1]


## 10.5.2 Nearest Correlation with Nuclear-norm Penalty¶

Next, we consider approximations of $$A$$ of the form $$D+X$$, where $$D=\diag(w)$$ with $$w\geq 0$$ and $$X\succeq 0$$. We also aim to minimize the rank of $$X$$. Since rank minimization is hard, we instead penalize the trace $$\trace(X)$$ with a linear objective term, a standard convex relaxation: for positive semidefinite $$X$$ the trace equals the sum of the eigenvalues, which is the nuclear norm of $$X$$.

The combination of these constraints leads to the problem:

$\begin{split}\begin{array}{ll} \minimize & \left\|X+\diag(w)-A\right\|_F + \gamma \trace(X),\\ \st & X \succeq 0, w \geq 0, \end{array}\end{split}$

where the parameter $$\gamma$$ controls the tradeoff between the quality of approximation and the rank of $$X$$.

Exploiting the mapping $$\mbox{vec}$$ defined in (2), we can express this problem as:

(4)$\begin{split}\begin{array}{ll} \minimize & t + \gamma\trace(X) \\ \st & (t, \mbox{vec} (X + \diag(w) - A) ) \in \Q, \\ & X \succeq 0 , w \geq 0. \end{array}\end{split}$

Code example

Listing 29 Implementation of problem (4).
void nearestcorr_nn(std::shared_ptr<ndarray<double, 2>> A,
                    const std::vector<double>&          gammas,
                    std::vector<double>&                res,
                    std::vector<double>&                rank)
{
  int N = A->size(0);

  Model::t M = new Model("NucNorm");
  auto M_ = monty::finally([&]() { M->dispose(); });

  // Setup variables
  Variable::t t = M->variable("t", 1, Domain::unbounded());
  Variable::t X = M->variable("X", Domain::inPSDCone(N));
  Variable::t w = M->variable("w", N, Domain::greaterThan(0.0));

  // (t, vec(X + diag(w) - A)) in Q
  Expression::t D = Expr::mulElm(Matrix::eye(N), Var::repeat(w, 1, N));
  M->constraint(Expr::vstack(t, vec(Expr::sub(Expr::add(X, D), A))), Domain::inQCone());

  // Trace(X)
  auto TrX = Expr::sum(X->diag());

  for (std::size_t k = 0; k < gammas.size(); ++k)
  {
    // Objective: minimize t + gamma*Tr(X)
    M->objective(ObjectiveSense::Minimize, Expr::add(t, Expr::mul(gammas[k], TrX)));
    M->solve();

    // Find the eigenvalues of X and approximate its rank
    auto d = new_array_ptr<double, 1>(N);
    mosek::LinAlg::syeig(MSK_UPLO_LO, N, X->level(), d);
    int rnk = 0;
    for (int i = 0; i < N; ++i)
      if ((*d)[i] > 1e-6) ++rnk;

    res[k]  = (*(t->level()))[0];
    rank[k] = rnk;
  }
}


We feed MOSEK with the same input as in Sec. 10.5.1 (Nearest correlation with the Frobenius norm). The problem is solved for a range of values of $$\gamma$$ to demonstrate how the penalty term helps achieve a low-rank solution. To this end we report both the rank of $$X$$ and the residual norm $$\left\|X+\diag(w)-A\right\|_F$$.

gamma = 0.00, rank = 4.00, res = 0.31
gamma = 0.10, rank = 2.00, res = 0.43
gamma = 0.20, rank = 1.00, res = 0.51
gamma = 0.30, rank = 1.00, res = 0.53
gamma = 0.40, rank = 1.00, res = 0.56
gamma = 0.50, rank = 1.00, res = 0.60
gamma = 0.60, rank = 1.00, res = 0.68
gamma = 0.70, rank = 1.00, res = 0.80
gamma = 0.80, rank = 1.00, res = 1.06
gamma = 0.90, rank = 0.00, res = 1.13
gamma = 1.00, rank = 0.00, res = 1.13