# 9.3 Computing a Sparse Cholesky Factorization¶

Given a positive semidefinite symmetric (PSD) matrix

it is well known there exists a matrix \(L\) such that

If the matrix \(L\) is lower triangular then it is called a *Cholesky factorization*. Given \(A\) is positive definite (nonsingular) then \(L\) is also nonsingular. A Cholesky factorization is useful for many reasons:

A system of linear equations \(Ax=b\) can be solved by first solving the lower triangular system \(L y = b\) followed by the upper triangular system \(L^T x = y\).

A quadratic term \(x^TAx\) in a constraint or objective can be replaced with \(y^Ty\) for \(y=L^Tx\), potentially leading to a more robust formulation (see [And13]).

Therefore, **MOSEK** provides a function that can compute a Cholesky factorization of a PSD matrix. In addition a function for solving linear systems with a nonsingular lower or upper triangular matrix is available.

In practice \(A\) may be very large with \(n\) is in the range of millions. However, then \(A\) is typically sparse which means that most of the elements in \(A\) are zero, and sparsity can be exploited to reduce the cost of computing the Cholesky factorization. The computational savings depend on the positions of zeros in \(A\). For example, below a matrix \(A\) is given together with a Cholesky factor up to 5 digits of accuracy:

However, if we symmetrically permute the rows and columns of \(A\) using a permutation matrix \(P\)

then the Cholesky factorization of \(A'=L'L'^T\) is

which is sparser than \(L\).

Computing a permutation matrix that leads to the sparsest Cholesky factorization or the minimal amount of work is NP-hard. Good permutations can be chosen by using heuristics, such as the minimum degree heuristic and variants. The function `Env.computesparsecholesky`

provided by **MOSEK** for computing a Cholesky factorization has a build in permutation aka. reordering heuristic. The following code illustrates the use of `Env.computesparsecholesky`

and `Env.sparsetriangularsolvedense`

.

```
try:
perm, diag, lnzc, lptrc, lensubnval, lsubc, lvalc = env.computesparsecholesky(
0, # Mosek chooses number of threads
1, # Use reordering heuristic
1.0e-14,# Singularity tolerance
anzc, aptrc, asubc, avalc)
printsparse(n, perm, diag, lnzc, lptrc, lensubnval, lsubc, lvalc)
x = [b[p] for p in perm] # Permuted b is stored as x.
# Compute inv(L)*x.
env.sparsetriangularsolvedense(mosek.transpose.no,
lnzc, lptrc, lsubc, lvalc, x)
# Compute inv(L^T)*x.
env.sparsetriangularsolvedense(mosek.transpose.yes,
lnzc, lptrc, lsubc, lvalc, x)
print("\nSolution Ax=b: x = ", numpy.array(
[x[j] for i in range(n) for j in range(n) if perm[j] == i]), "\n")
except:
raise
```

We can set up the data to recreate the matrix \(A\) from (9.6):

```
# Observe that anzc, aptrc, asubc and avalc only specify the lower
# triangular part.
n = 4
anzc = [4, 1, 1, 1]
asubc = [0, 1, 2, 3, 1, 2, 3]
aptrc = [0, 4, 5, 6]
avalc = [4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
b = [13.0, 3.0, 4.0, 5.0]
```

and we obtain the following output:

```
Example with positive definite A.
P = [ 3 2 0 1 ]
diag(D) = [ 0.00 0.00 0.00 0.00 ]
L=
1.00 0.00 0.00 0.00
0.00 1.00 0.00 0.00
1.00 1.00 1.41 0.00
0.00 0.00 0.71 0.71
Solution A x = b, x = [ 1.00 2.00 3.00 4.00 ]
```

The output indicates that with the permutation matrix

there is a Cholesky factorization \(PAP^T=LL^T\), where

The remaining part of the code solvers the linear system \(Ax=b\) for \(b=[13,3,4,5]^T\). The solution is reported to be \(x = [1, 2, 3, 4]^T\), which is correct.

The second example shows what happens when we compute a sparse Cholesky factorization of a singular matrix. In this example \(A\) is a rank 1 matrix

```
#Example 2 - singular A
n = 3
anzc = [3, 2, 1]
asubc = [0, 1, 2, 1, 2, 2]
aptrc = [0, 3, 5]
avalc = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```

Now we get the output

```
P = [ 0 2 1 ]
diag(D) = [ 0.00e+00 1.00e-14 1.00e-14 ]
L=
1.00e+00 0.00e+00 0.00e+00
1.00e+00 1.00e-07 0.00e+00
1.00e+00 0.00e+00 1.00e-07
```

which indicates the decomposition

where

Since \(A\) is only positive semdefinite, but not of full rank, some of diagonal elements of \(A\) are boosted to make it truely positive definite. The amount of boosting is passed as an argument to `Env.computesparsecholesky`

, in this case \(10^{-14}\). Note that

where \(D\) is a small matrix so the computed Cholesky factorization is exact of slightly perturbed \(A\). In general this is the best we can hope for in finite precision and when \(A\) is singular or close to being singular.

We will end this section by a word of caution. Computing a Cholesky factorization of a matrix that is not of full rank and that is not suffciently well conditioned may lead to incorrect results i.e. a matrix that is indefinite may declared positive semidefinite and vice versa.