11.9 Nearest Correlation Matrix Problem¶

A correlation matrix is a symmetric positive definite matrix with unit diagonal. This term has origins in statistics, since the matrix whose entries are the correlation coefficients of a sequence of random variables has all these properties.

In this section we study variants of the problem of approximating a given symmetric matrix $A$ with correlation matrices:

find the correlation matrix $X$ nearest to $A$ in the Frobenius norm,
find an approximation of the form $D + X$ where $D$ is a diagonal matrix with positive diagonal and $X$ is a positive semidefinite matrix of low rank, using the combination of Frobenius and nuclear norm.

Both problems are related to portfolio optimization, where one can often have a matrix $A$ that only approximates the correlations of stocks. For subsequent optimizations one would like to approximate $A$ with a correlation matrix or, in the factor model, with $D + V V^{T}$ with $V V^{T}$ of small rank.

11.9.1 Nearest correlation with the Frobenius norm¶

The Frobenius norm of a real matrix $M$ is defined as

‖ M ‖_{F} = {(\sum_{i, j} M_{i, j}^{2})}^{1 / 2}

and with respect to this norm our optimization problem can be expressed simply as:

(11.36)¶

\begin{array}{r} \begin{array}{ll} minimize & ‖ A - X ‖_{F} \\ subject to & diag (X) = e, \\ X ⪰ 0. \end{array} \end{array}

We can exploit the symmetry of $A$ and $X$ to get a compact vector representation. To this end we make use of the following mapping from a symmetric matrix to a flattened vector containing the (scaled) lower triangular part of the matrix:

(11.37)¶

\begin{array}{r} \begin{array}{ll} vec : & R^{n \times n} \to R^{n (n + 1) / 2} \\ vec (M) = & (α_{11} M_{11}, α_{21} M_{21}, α_{22} M_{22}, \dots, α_{n 1} M_{n 1}, \dots, α_{n n} M_{n n}) \\ α_{i j} = & {\begin{cases} 1 & j = i \\ \sqrt{2} & j < i \end{cases} \end{array} \end{array}

Note that $‖ M ‖_{F} = ‖ vec (M) ‖_{2}$ . The Fusion implementation of $vec$ is as follows:

Listing 11.18 Implementation of function

v e c

in (11.37). Click here to download.¶

  /** Assuming that e is an NxN expression, return the lower triangular part as a vector.
   */
  public static Expression vec(Expression e) {
    int N         = e.getShape()[0];
    int[] msubi   = new int[N * (N + 1) / 2];
    int[] msubj   = new int[N * (N + 1) / 2];
    double[] mcof = new double[N * (N + 1) / 2];

    for (int i = 0, k = 0; i < N; ++i)
      for (int j = 0; j < i + 1; ++j, ++k) {
        msubi[k] = k;
        msubj[k] = i * N + j;
        if (i == j) mcof[k] = 1.0;
        else        mcof[k] = Math.sqrt(2);
      }

    Matrix S = Matrix.sparse(N * (N + 1) / 2, N * N, msubi, msubj, mcof);
    return Expr.mul(S, Expr.flatten(e));
  }

That leads to an optimization problem with both conic quadratic and semidefinite constraints:

(11.38)¶

\begin{array}{r} \begin{array}{ll} minimize & t \\ subject to & (t, vec (A - X)) \in Q, \\ diag (X) = e, \\ X ⪰ 0. \end{array} \end{array}

Code example

Listing 11.19 Implementation of problem (11.38). Click here to download.¶

  private static void nearestcorr(Matrix A)
  throws SolutionError {
    int N = A.numRows();

    Model M = new Model("NearestCorrelation");
    try {
      // Setting up the variables
      Variable X = M.variable("X", Domain.inPSDCone(N));
      Variable t = M.variable("t", 1, Domain.unbounded());

      // (t, vec (A-X)) \in Q
      M.constraint( Expr.vstack(t, vec(Expr.sub( A, X))), Domain.inQCone() );

      // diag(X) = e
      M.constraint(X.diag(), Domain.equalsTo(1.0));

      // Objective: Minimize t
      M.objective(ObjectiveSense.Minimize, t);

      // Solve the problem
      M.solve();

      // Get the solution values
      System.out.println("X = \n" + mattostr(X.level(), N));
      System.out.println("t = \n" + mattostr(t.level(), N));

    } finally {
      M.dispose();
    }
  }

We use the following input

Listing 11.20 Input for the nearest correlation problem.¶

    int N = 5;
    Matrix A = Matrix.dense(N, N,
                            new double[] { 0.0,  0.5,  -0.1,  -0.2,   0.5,
                                           0.5,  1.25, -0.05, -0.1,   0.25,
                                           -0.1, -0.05,  0.51,  0.02, -0.05,
                                           -0.2, -0.1,   0.02,  0.54, -0.1,
                                           0.5,  0.25, -0.05, -0.1,   1.25
                                         });

The expected output is the following (small differences may apply):

X =
[[ 1.          0.50001941 -0.09999994 -0.20000084  0.50001941]
 [ 0.50001941  1.         -0.04999551 -0.09999154  0.24999101]
 [-0.09999994 -0.04999551  1.          0.01999746 -0.04999551]
 [-0.20000084 -0.09999154  0.01999746  1.         -0.09999154]
 [ 0.50001941  0.24999101 -0.04999551 -0.09999154  1.        ]]

11.9.2 Nearest Correlation with Nuclear-norm Penalty¶

Next, we consider the approximation of $A$ of the form $D + X$ where $D = diag (w), w \geq 0$ and $X ⪰ 0$ . We will also aim at minimizing the rank of $X$ . This can be approximated by a relaxed linear objective penalizing the trace $Tr (X)$ (which in this case is the nuclear norm of $X$ and happens to be the sum of its eigenvalues).

The combination of these constraints leads to a problem:

\begin{array}{r} \begin{array}{ll} minimize & {‖ X + diag (w) - A ‖}_{F} + γ Tr (X), \\ subject to & X ⪰ 0, w \geq 0, \end{array} \end{array}

where the parameter $γ$ controls the tradeoff between the quality of approximation and the rank of $X$ .

Exploit the mapping $vec$ defined in (11.37) we can express this problem as:

(11.39)¶

\begin{array}{r} \begin{array}{ll} minimize & t + γ Tr (X) \\ subject to & (t, vec (X + diag (w) - A)) \in Q, \\ X ⪰ 0, w \geq 0. \end{array} \end{array}

Code example

Listing 11.21 Implementation of problem (11.39). Click here to download.¶

  /* Nearest correlation with nuclear norm penalty */
  private static void nearestcorr_nn(Matrix A, double[] gammas, double[] res, int[] rank)
  throws SolutionError {
    int N = A.numRows();
    Model M = new Model("NucNorm");
    try {
      // Setup variables
      Variable t = M.variable("t", 1, Domain.unbounded());
      Variable X = M.variable("X", Domain.inPSDCone(N));
      Variable w = M.variable("w", N, Domain.greaterThan(0.0));

      // (t, vec (X + diag(w) - A)) in Q
      Expression D = Expr.mulElm( Matrix.eye(N), Var.repeat(w, N, 1) );
      M.constraint( Expr.vstack( t, vec(Expr.sub(Expr.add(X, D), A)) ), Domain.inQCone() );

      // Trace(X)
      Expression TX = Expr.sum(X.diag());

      for (int k = 0; k < gammas.length; ++k) {
        // Objective: Minimize t + gamma*Tr(X)
        M.objective(ObjectiveSense.Minimize, Expr.add(t, Expr.mul(gammas[k], TX)));
        M.solve();

        // Get the eigenvalues of X and approximate its rank
        double[] d = new double[N];
        LinAlg.syeig(mosek.uplo.lo, N, X.level(), d);
        int rnk = 0; for (int i = 0; i < d.length; ++i) if (d[i] > 1e-6) ++rnk;

        res[k] = t.level()[0];
        rank[k] = rnk;
      }
    } finally {
      M.dispose();
    }
  }

We feed MOSEK with the same input as in Sec. 11.9.1 (Nearest correlation with the Frobenius norm). The problem is solved for a range of values $γ$ values, to demonstrate how the penalty term helps achieve a low rank solution. To this extent we report both the rank of $X$ and the residual norm ${‖ X + diag (w) - A ‖}_{F}$ .

--- Nearest Correlation with Nuclear Norm---
gamma=0.000000, res=3.076163e-01, rank=4
gamma=0.100000, res=4.251692e-01, rank=2
gamma=0.200000, res=5.112082e-01, rank=1
gamma=0.300000, res=5.298432e-01, rank=1
gamma=0.400000, res=5.592686e-01, rank=1
gamma=0.500000, res=6.045702e-01, rank=1
gamma=0.600000, res=6.764402e-01, rank=1
gamma=0.700000, res=8.009913e-01, rank=1
gamma=0.800000, res=1.062385e+00, rank=1
gamma=0.900000, res=1.129513e+00, rank=0
gamma=1.000000, res=1.129513e+00, rank=0

11.9 Nearest Correlation Matrix Problem¶

11.9.1 Nearest correlation with the Frobenius norm¶

11.9.2 Nearest Correlation with Nuclear-norm Penalty¶

Table of Contents

Download PDF

Modeling Cookbook

Cheatsheet