# 9.3 Computing a Sparse Cholesky Factorization¶

Given a positive semidefinite symmetric (PSD) matrix

$A \in \mathbb{R}^{n\times n}$

it is well known that there exists a matrix $$L$$ such that

$A = LL^T.$

If the matrix $$L$$ is lower triangular, then the factorization is called a Cholesky factorization. If $$A$$ is positive definite (and hence nonsingular), then $$L$$ is also nonsingular. A Cholesky factorization is useful for many reasons:

• A system of linear equations $$Ax=b$$ can be solved by first solving the lower triangular system $$L y = b$$ followed by the upper triangular system $$L^T x = y$$.

• A quadratic term $$x^TAx$$ in a constraint or objective can be replaced with $$y^Ty$$ for $$y=L^Tx$$, potentially leading to a more robust formulation (see [And13]).

Therefore, MOSEK provides a function that can compute a Cholesky factorization of a PSD matrix. In addition a function for solving linear systems with a nonsingular lower or upper triangular matrix is available.

In practice $$A$$ may be very large, with $$n$$ in the range of millions. Fortunately, such an $$A$$ is typically sparse, meaning that most of its elements are zero, and the sparsity can be exploited to reduce the cost of computing the Cholesky factorization. The computational savings depend on the positions of the zeros in $$A$$. For example, below a matrix $$A$$ is given together with its Cholesky factor, shown to 5 digits of accuracy:

(9.6)$\begin{split}A = \left[ \begin{array}{cccc} 4 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{array} \right ], \quad L = \left[ \begin{array}{cccc} 2.0000 & 0 & 0 & 0 \\ 0.5000 & 0.8660 & 0 & 0 \\ 0.5000 & -0.2887 & 0.8165 & 0 \\ 0.5000 & -0.2887 & -0.4082 & 0.7071 \end{array} \right ].\end{split}$

However, if we symmetrically permute the rows and columns of $$A$$ using a permutation matrix $$P$$

$\begin{split}P = \left [ \begin{array}{cccc} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ \end{array} \right ],\quad A' = PAP^T = \left [ \begin{array}{cccc} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 4 \\ \end{array} \right ],\end{split}$

then the Cholesky factorization of $$A'=L'L'^T$$ is

$\begin{split}L' = \left [ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ \end{array} \right ]\end{split}$

which is sparser than $$L$$.

Computing a permutation matrix that leads to the sparsest Cholesky factorization, or to the minimal amount of work, is NP-hard. Good permutations can be chosen using heuristics, such as the minimum degree heuristic and its variants. The function MSK_computesparsecholesky provided by MOSEK for computing a Cholesky factorization has a built-in reordering (permutation) heuristic. The following code illustrates the use of MSK_computesparsecholesky and MSK_sparsetriangularsolvedense.

Listing 9.3 How to use the sparse Cholesky factorization routine available in MOSEK.
    r = MSK_computesparsecholesky(env,
                                  0,        /* MOSEK chooses number of threads */
                                  1,        /* Apply a reordering heuristic */
                                  1.0e-14,  /* Singularity tolerance */
                                  n, anzc, aptrc, asubc, avalc,
                                  &perm, &diag, &lnzc, &lptrc, &lensubnval, &lsubc, &lvalc);

    if (r == MSK_RES_OK)
    {
      MSKint32t i, j;
      MSKrealt  *x;

      printsparse(n, perm, diag, lnzc, lptrc, lensubnval, lsubc, lvalc);

      x = MSK_callocenv(env, n, sizeof(MSKrealt));
      if (x)
      {
        /* Permuted b is stored as x. */
        for (i = 0; i < n; ++i) x[i] = b[perm[i]];

        /* Compute inv(L)*x. */
        r = MSK_sparsetriangularsolvedense(env, MSK_TRANSPOSE_NO, n,
                                           lnzc, lptrc, lensubnval, lsubc, lvalc, x);

        if (r == MSK_RES_OK)
        {
          /* Compute inv(L^T)*x. */
          r = MSK_sparsetriangularsolvedense(env, MSK_TRANSPOSE_YES, n,
                                             lnzc, lptrc, lensubnval, lsubc, lvalc, x);

          printf("\nSolution A x = b, x = [ ");
          for (i = 0; i < n; i++)
            for (j = 0; j < n; j++)
              if (perm[j] == i) printf("%.2f ", x[j]);
          printf("]\n");
        }

        MSK_freeenv(env, x);
      }
      else
        printf("Out of space while creating x.\n");
    }
    else
      printf("Cholesky computation failed: %d\n", (int) r);


We can set up the data to recreate the matrix $$A$$ from (9.6):

    const MSKint32t n       = 4;      // Data from the example in the text
    // Observe that anzc, aptrc, asubc and avalc only specify the lower triangular part.
    const MSKint32t anzc[]  = {4, 1, 1, 1},
                    asubc[] = {0, 1, 2, 3, 1, 2, 3};
    const MSKint64t aptrc[] = {0, 4, 5, 6};
    const MSKrealt  avalc[] = {4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0},
                    b[]     = {13.0, 3.0, 4.0, 5.0};
    MSKint32t       *perm = NULL, *lnzc = NULL, *lsubc = NULL;
    MSKint64t       lensubnval, *lptrc = NULL;
    MSKrealt        *diag = NULL, *lvalc = NULL;


Running the program, we obtain the following output:

    Example with positive definite A.
    P = [ 3 2 0 1 ]
    diag(D) = [ 0.00 0.00 0.00 0.00 ]
    L=
    1.00 0.00 0.00 0.00
    0.00 1.00 0.00 0.00
    1.00 1.00 1.41 0.00
    0.00 0.00 0.71 0.71

    Solution A x = b, x = [ 1.00 2.00 3.00 4.00 ]


The output indicates that with the permutation matrix

$\begin{split}P = \left [ \begin{array}{cccc} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ \end{array} \right ]\end{split}$

there is a Cholesky factorization $$PAP^T=LL^T$$, where

$\begin{split}L = \left [ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 1.4142 & 0 \\ 0 & 0 & 0.7071 & 0.7071 \\ \end{array} \right ]\end{split}$

The remaining part of the code solves the linear system $$Ax=b$$ for $$b=[13,3,4,5]^T$$. The solution is reported to be $$x = [1, 2, 3, 4]^T$$, which is correct.

The second example shows what happens when we compute a sparse Cholesky factorization of a singular matrix. In this example $$A$$ is the rank-1 matrix

(9.7)$\begin{split}A = \left[ \begin{array}{cccc} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{array} \right ] = \left[ \begin{array}{cccc} 1 \\ 1 \\ 1 \end{array} \right ] \left[ \begin{array}{cccc} 1 \\ 1 \\ 1 \end{array} \right ]^T\end{split}$
    const MSKint32t n       = 3;
    const MSKint32t anzc[]  = {3, 2, 1},
                    asubc[] = {0, 1, 2, 1, 2, 2};
    const MSKint64t aptrc[] = {0, 3, 5};
    const MSKrealt  avalc[] = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
    MSKint32t       *perm = NULL, *lnzc = NULL, *lsubc = NULL;
    MSKint64t       lensubnval, *lptrc = NULL;
    MSKrealt        *diag = NULL, *lvalc = NULL;


Now we get the output

    P = [ 0 2 1 ]
    diag(D) = [ 0.00e+00 1.00e-14 1.00e-14 ]
    L=
    1.00e+00 0.00e+00 0.00e+00
    1.00e+00 1.00e-07 0.00e+00
    1.00e+00 0.00e+00 1.00e-07


which indicates the decomposition

$P A P^T = LL^T - D$

where

$\begin{split}P = \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{array} \right ],\quad L = \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 1 & 10^{-7} & 0 \\ 1 & 0 & 10^{-7} \end{array} \right ],\quad D = \left[ \begin{array}{ccc} 0 & 0 & 0 \\ 0 & 10^{-14} & 0 \\ 0 & 0 & 10^{-14} \end{array} \right ].\end{split}$

Since $$A$$ is only positive semidefinite, and not of full rank, some of the diagonal elements of $$A$$ are boosted to make it truly positive definite. The amount of boosting is passed as an argument to MSK_computesparsecholesky, in this case $$10^{-14}$$. Note that

$P A P^T = LL^T - D$

where $$D$$ is a small matrix, so the computed Cholesky factorization is the exact factorization of a slightly perturbed $$A$$. In general this is the best we can hope for in finite precision when $$A$$ is singular or close to being singular.

We will end this section with a word of caution. Computing a Cholesky factorization of a matrix that is not of full rank and not sufficiently well conditioned may lead to incorrect results, i.e. a matrix that is indefinite may be declared positive semidefinite, and vice versa.