9.3 Computing a Sparse Cholesky Factorization

Given a positive semidefinite symmetric (PSD) matrix $A \in \mathbb{R}^{n \times n}$, it is well known that there exists a matrix $L$ such that

$$A = LL^T.$$

If the matrix L is lower triangular, then the factorization $A = LL^T$ is called a Cholesky factorization. If A is positive definite (and hence nonsingular), then L is also nonsingular. A Cholesky factorization is useful for many reasons:

  • A system of linear equations $Ax = b$ can be solved by first solving the lower triangular system $Ly = b$ followed by the upper triangular system $L^T x = y$.

  • A quadratic term $x^T A x$ in a constraint or objective can be replaced with $y^T y$ for $y = L^T x$, potentially leading to a more robust formulation (see [And13]); the identity shown right after this list explains why.
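
The second substitution relies only on the factorization itself: for any x,

$$x^T A x = x^T L L^T x = (L^T x)^T (L^T x) = y^T y.$$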

Therefore, MOSEK provides a function that can compute a Cholesky factorization of a PSD matrix. In addition, a function for solving linear systems with a nonsingular lower or upper triangular matrix is available.

In practice A may be very large, with n in the range of millions. However, A is then typically sparse, which means that most of the elements in A are zero, and the sparsity can be exploited to reduce the cost of computing the Cholesky factorization. The computational savings depend on the positions of the zeros in A. For example, below a matrix A is given together with a Cholesky factor computed to 5 digits of accuracy:

$$
A=\begin{bmatrix}
4 & 1 & 1 & 1\\
1 & 1 & 0 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 0 & 1
\end{bmatrix},\qquad
L=\begin{bmatrix}
2.0000 & 0 & 0 & 0\\
0.5000 & 0.8660 & 0 & 0\\
0.5000 & -0.2887 & 0.8165 & 0\\
0.5000 & -0.2887 & -0.4082 & 0.7071
\end{bmatrix}.
\tag{9.6}$$

However, if we symmetrically permute the rows and columns of A using a permutation matrix P

$$
P=\begin{bmatrix}
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
1 & 0 & 0 & 0
\end{bmatrix},\qquad
\bar{A}=PAP^T=\begin{bmatrix}
1 & 0 & 0 & 1\\
0 & 1 & 0 & 1\\
0 & 0 & 1 & 1\\
1 & 1 & 1 & 4
\end{bmatrix},$$

then the Cholesky factorization of $\bar{A}=\bar{L}\bar{L}^T$ is

$$
\bar{L}=\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
1 & 1 & 1 & 1
\end{bmatrix}$$

which is sparser than L.
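
It is easy to verify that this is indeed a factorization of the permuted matrix:

$$\bar{L}\bar{L}^T=\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 1\\
0 & 1 & 0 & 1\\
0 & 0 & 1 & 1\\
0 & 0 & 0 & 1
\end{bmatrix}=\begin{bmatrix}
1 & 0 & 0 & 1\\
0 & 1 & 0 & 1\\
0 & 0 & 1 & 1\\
1 & 1 & 1 & 4
\end{bmatrix}=\bar{A}.$$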

Computing a permutation matrix that leads to the sparsest Cholesky factorization, or to the minimal amount of work, is NP-hard. Good permutations can be chosen using heuristics, such as the minimum degree heuristic and its variants. The function mosek::compute_sparse_cholesky provided by MOSEK for computing a Cholesky factorization has a built-in permutation (i.e. reordering) heuristic. The following code illustrates the use of mosek::compute_sparse_cholesky and mosek::sparse_triangular_solve_dense.

Listing 9.3 How to use the sparse Cholesky factorization routine available in MOSEK.
    mosek::compute_sparse_cholesky(0,        //Mosek chooses number of threads
                                   1,        //Apply reordering heuristic
                                   1.0e-14,  //Singularity tolerance
                                   &anzc, &aptrc, &asubc, &avalc,  // Lower triangle of A in sparse column format
                                   & mut perm, & mut diag,         // Output: permutation and diag(D)
                                   & mut lnzc, & mut lptrc, & mut lensubnval, & mut lsubc, & mut lvalc)?;  // Output: the factor L in sparse column format
    print_sparse(n, perm.as_slice(), diag.as_slice(), lnzc.as_slice(), lptrc.as_slice(), lsubc.as_slice(), lvalc.as_slice());

    /* Permuted b is stored as x. */
    let mut x : Vec<f64> = perm.iter().map(|&i| b[i as usize]).collect();

    /*Compute  inv(L)*x.*/
    mosek::sparse_triangular_solve_dense(Transpose::NO,  lnzc.as_slice(), lptrc.as_slice(), lsubc.as_slice(), lvalc.as_slice(), x.as_mut_slice())?;
    /*Compute  inv(L^T)*x.*/
    mosek::sparse_triangular_solve_dense(Transpose::YES, lnzc.as_slice(), lptrc.as_slice(), lsubc.as_slice(), lvalc.as_slice(), x.as_mut_slice())?;

    print!("\nSolution A x = b, x = [ {:?} ]",
           iproduct!(0..n,izip!(perm.iter(),x.iter()))
           .filter_map(|(i,(&pj,&xj))|if pj as usize == i { Some(xj) } else { None })
           .collect::<Vec<f64>>());
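
Because the factorization is computed for the symmetrically permuted matrix, the right-hand side b is permuted before the two triangular solves, and the permuted solution must be scattered back to the original ordering at the end; the iproduct!/filter_map expression above does exactly that. An equivalent, more direct sketch (reusing perm and x from the listing) could look like this:

    // Scatter the permuted solution back to the original ordering:
    // x[i] is the solution component of the variable with original index perm[i].
    let mut sol = vec![0.0f64; x.len()];
    for (i, &pi) in perm.iter().enumerate() {
        sol[pi as usize] = x[i];
    }
    println!("\nSolution A x = b, x = {:?}", sol);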

We can set up the data to recreate the matrix A from (9.6):

    //Observe that anzc, aptrc, asubc and avalc only specify the lower triangular part.
    let n     = 4;
    let anzc  = [4, 1, 1, 1];
    let asubc = [0, 1, 2, 3, 1, 2, 3];
    let aptrc = [0, 4, 5, 6];
    let avalc = [4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0];
    let b     = [13.0, 3.0, 4.0, 5.0];
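
Here anzc, aptrc, asubc and avalc describe the lower triangle of A column by column: anzc holds the number of nonzeros in each column, aptrc the position of each column's first nonzero within asubc and avalc, asubc the row indices and avalc the values. The following standalone sketch (it does not call the MOSEK API, and the index types are simplified to usize for brevity) expands these arrays into a dense symmetric matrix, which is a convenient way to double-check that the data really encodes A from (9.6):

    // Expand the lower-triangular, columnwise sparse data into a dense
    // symmetric matrix and print it row by row.
    fn main() {
        let n     = 4usize;
        let anzc  = [4usize, 1, 1, 1];                   // nonzeros per column
        let aptrc = [0usize, 4, 5, 6];                   // first nonzero of each column
        let asubc = [0usize, 1, 2, 3, 1, 2, 3];          // row indices
        let avalc = [4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]; // values

        let mut a = vec![vec![0.0f64; n]; n];
        for j in 0..n {
            for k in aptrc[j]..aptrc[j] + anzc[j] {
                a[asubc[k]][j] = avalc[k]; // lower-triangular entry
                a[j][asubc[k]] = avalc[k]; // mirror into the upper triangle
            }
        }
        for row in &a {
            println!("{:?}", row); // prints the rows of A from (9.6)
        }
    }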

Running the code in Listing 9.3 on this data we obtain the following output:

Example with positive definite A.
P = [ 3 2 0 1 ]
diag(D) = [ 0.00 0.00 0.00 0.00 ]
L=
1.00 0.00 0.00 0.00
0.00 1.00 0.00 0.00
1.00 1.00 1.41 0.00
0.00 0.00 0.71 0.71

Solution A x = b, x = [ 1.00 2.00 3.00 4.00 ]

The output indicates that with the permutation matrix

$$
P=\begin{bmatrix}
0 & 0 & 0 & 1\\
0 & 0 & 1 & 0\\
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0
\end{bmatrix}$$

there is a Cholesky factorization $PAP^T = LL^T$, where

$$
L=\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
1 & 1 & 1.4142 & 0\\
0 & 0 & 0.7071 & 0.7071
\end{bmatrix}.$$

The remaining part of the code solves the linear system $Ax = b$ for $b = [13, 3, 4, 5]^T$. The solution is reported to be $x = [1, 2, 3, 4]^T$, which is correct.
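
Indeed, a quick check against A from (9.6) confirms the reported solution:

$$Ax=\begin{bmatrix}
4 & 1 & 1 & 1\\
1 & 1 & 0 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}1\\2\\3\\4\end{bmatrix}
=\begin{bmatrix}13\\3\\4\\5\end{bmatrix}=b.$$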

The second example shows what happens when we compute a sparse Cholesky factorization of a singular matrix. In this example A is a rank 1 matrix

$$
A=\begin{bmatrix}
1 & 1 & 1\\
1 & 1 & 1\\
1 & 1 & 1
\end{bmatrix}
=\begin{bmatrix}1\\1\\1\end{bmatrix}
\begin{bmatrix}1\\1\\1\end{bmatrix}^T
\tag{9.7}$$

    let n     = 3;
    let anzc  = [3, 2, 1];
    let asubc = [0, 1, 2, 1, 2, 2];
    let aptrc = [0, 3, 5, ];
    let avalc = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0];

Now we get the output

P = [ 0 2 1 ]
diag(D) = [ 0.00e+00 1.00e-14 1.00e-14 ]
L=
1.00e+00 0.00e+00 0.00e+00
1.00e+00 1.00e-07 0.00e+00
1.00e+00 0.00e+00 1.00e-07

which indicates the decomposition

$$PAP^T = LL^T - D$$

where

$$
P=\begin{bmatrix}
1 & 0 & 0\\
0 & 0 & 1\\
0 & 1 & 0
\end{bmatrix},\qquad
L=\begin{bmatrix}
1 & 0 & 0\\
1 & 10^{-7} & 0\\
1 & 0 & 10^{-7}
\end{bmatrix},\qquad
D=\begin{bmatrix}
0 & 0 & 0\\
0 & 10^{-14} & 0\\
0 & 0 & 10^{-14}
\end{bmatrix}.$$

Since A is only positive semidefinite, but not of full rank, some of the diagonal elements of A are boosted to make it truly positive definite. The amount of boosting is passed as an argument to mosek::compute_sparse_cholesky, in this case $10^{-14}$. Note that

$$PAP^T = LL^T - D$$

where D is a small matrix, so the computed Cholesky factorization is exact for a slightly perturbed A. In general this is the best we can hope for in finite precision, in particular when A is singular or close to being singular.
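
Indeed, multiplying out the reported factor shows exactly which entries were perturbed:

$$LL^T=\begin{bmatrix}
1 & 1 & 1\\
1 & 1+10^{-14} & 1\\
1 & 1 & 1+10^{-14}
\end{bmatrix}=PAP^T+D.$$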

We end this section with a word of caution. Computing a Cholesky factorization of a matrix that is not of full rank and not sufficiently well conditioned may lead to incorrect results, i.e. a matrix that is indefinite may be declared positive semidefinite and vice versa.