# 9.3 Computing a Sparse Cholesky Factorization¶

Given a positive semidefinite symmetric (PSD) matrix

$A \in \real^{n\times n}$

it is well known there exists a matrix $$L$$ such that

$A = LL^T.$

If the matrix $$L$$ is lower triangular then it is called a Cholesky factorization. Given $$A$$ is positive definite (nonsingular) then $$L$$ is also nonsingular. A Cholesky factorization is useful for many reasons:

• A system of linear equations $$Ax=b$$ can be solved by first solving the lower triangular system $$L y = b$$ followed by the upper triangular system $$L^T x = y$$.

• A quadratic term $$x^TAx$$ in a constraint or objective can be replaced with $$y^Ty$$ for $$y=L^Tx$$, potentially leading to a more robust formulation (see [And13]).

Therefore, MOSEK provides a function that can compute a Cholesky factorization of a PSD matrix. In addition a function for solving linear systems with a nonsingular lower or upper triangular matrix is available.

In practice $$A$$ may be very large with $$n$$ is in the range of millions. However, then $$A$$ is typically sparse which means that most of the elements in $$A$$ are zero, and sparsity can be exploited to reduce the cost of computing the Cholesky factorization. The computational savings depend on the positions of zeros in $$A$$. For example, below a matrix $$A$$ is given together with a Cholesky factor up to 5 digits of accuracy:

(9.6)$\begin{split}A = \left[ \begin{array}{cccc} 4 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{array} \right ], \quad L = \left[ \begin{array}{cccc} 2.0000 & 0 & 0 & 0 \\ 0.5000 & 0.8660 & 0 & 0 \\ 0.5000 & -0.2887 & 0.8165 & 0 \\ 0.5000 & -0.2887 & -0.4082 & 0.7071 \end{array} \right ].\end{split}$

However, if we symmetrically permute the rows and columns of $$A$$ using a permutation matrix $$P$$

$\begin{split}P = \left [ \begin{array}{cccc} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ \end{array} \right ],\quad A' = PAP^T = \left [ \begin{array}{cccc} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 4 \\ \end{array} \right ],\end{split}$

then the Cholesky factorization of $$A'=L'L'^T$$ is

$\begin{split}L' = \left [ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ \end{array} \right ]\end{split}$

which is sparser than $$L$$.

Computing a permutation matrix that leads to the sparsest Cholesky factorization or the minimal amount of work is NP-hard. Good permutations can be chosen by using heuristics, such as the minimum degree heuristic and variants. The function Env.computesparsecholesky provided by MOSEK for computing a Cholesky factorization has a build in permutation aka. reordering heuristic. The following code illustrates the use of Env.computesparsecholesky and Env.sparsetriangularsolvedense.

try:
perm, diag, lnzc, lptrc, lensubnval, lsubc, lvalc = env.computesparsecholesky(
0,      # Mosek chooses number of threads
1,      # Use reordering heuristic
1.0e-14,# Singularity tolerance
anzc, aptrc, asubc, avalc)

printsparse(n, perm, diag, lnzc, lptrc, lensubnval, lsubc, lvalc)

x = [b[p] for p in perm]   # Permuted b is stored as x.

# Compute inv(L)*x.
env.sparsetriangularsolvedense(mosek.transpose.no,
lnzc, lptrc, lsubc, lvalc, x)

# Compute inv(L^T)*x.
env.sparsetriangularsolvedense(mosek.transpose.yes,
lnzc, lptrc, lsubc, lvalc, x)

print("\nSolution Ax=b: x = ", numpy.array(
[x[j] for i in range(n) for j in range(n) if perm[j] == i]), "\n")
except:
raise

We can set up the data to recreate the matrix $$A$$ from (9.6):

# Observe that anzc, aptrc, asubc and avalc only specify the lower
# triangular part.
n = 4
anzc = [4, 1, 1, 1]
asubc = [0, 1, 2, 3, 1, 2, 3]
aptrc = [0, 4, 5, 6]
avalc = [4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
b = [13.0, 3.0, 4.0, 5.0]

and we obtain the following output:

Example with positive definite A.
P = [ 3 2 0 1 ]
diag(D) = [ 0.00 0.00 0.00 0.00 ]
L=
1.00 0.00 0.00 0.00
0.00 1.00 0.00 0.00
1.00 1.00 1.41 0.00
0.00 0.00 0.71 0.71

Solution A x = b, x = [ 1.00 2.00 3.00 4.00 ]

The output indicates that with the permutation matrix

$\begin{split}P = \left [ \begin{array}{cccc} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ \end{array} \right ]\end{split}$

there is a Cholesky factorization $$PAP^T=LL^T$$, where

$\begin{split}L = \left [ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 1.4142 & 0 \\ 0 & 0 & 0.7071 & 0.7071 \\ \end{array} \right ]\end{split}$

The remaining part of the code solvers the linear system $$Ax=b$$ for $$b=[13,3,4,5]^T$$. The solution is reported to be $$x = [1, 2, 3, 4]^T$$, which is correct.

The second example shows what happens when we compute a sparse Cholesky factorization of a singular matrix. In this example $$A$$ is a rank 1 matrix

(9.7)$\begin{split}A = \left[ \begin{array}{cccc} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{array} \right ] = \left[ \begin{array}{cccc} 1 \\ 1 \\ 1 \end{array} \right ] \left[ \begin{array}{cccc} 1 \\ 1 \\ 1 \end{array} \right ]^T\end{split}$
#Example 2 - singular A
n = 3
anzc = [3, 2, 1]
asubc = [0, 1, 2, 1, 2, 2]
aptrc = [0, 3, 5]
avalc = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

Now we get the output

P = [ 0 2 1 ]
diag(D) = [ 0.00e+00 1.00e-14 1.00e-14 ]
L=
1.00e+00 0.00e+00 0.00e+00
1.00e+00 1.00e-07 0.00e+00
1.00e+00 0.00e+00 1.00e-07

which indicates the decomposition

$P A P^T = LL^T - D$

where

$\begin{split}P = \left[ \begin{array}{cccc} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{array} \right ],\quad L = \left[ \begin{array}{cccc} 1 & 0 & 0 \\ 1 & 10^{-7} & 0 \\ 1 & 0 & 10^{-7} \end{array} \right ],\quad D = \left[ \begin{array}{cccc} 1 & 0 & 0 \\ 0 & 10^{-14} & 0 \\ 0 & 0 & 10^{-14} \end{array} \right ].\end{split}$

Since $$A$$ is only positive semdefinite, but not of full rank, some of diagonal elements of $$A$$ are boosted to make it truely positive definite. The amount of boosting is passed as an argument to Env.computesparsecholesky, in this case $$10^{-14}$$. Note that

$P A P^T = LL^T - D$

where $$D$$ is a small matrix so the computed Cholesky factorization is exact of slightly perturbed $$A$$. In general this is the best we can hope for in finite precision and when $$A$$ is singular or close to being singular.

We will end this section by a word of caution. Computing a Cholesky factorization of a matrix that is not of full rank and that is not suffciently well conditioned may lead to incorrect results i.e. a matrix that is indefinite may declared positive semidefinite and vice versa.