9.3 Computing a Sparse Cholesky Factorization

Given a positive semidefinite symmetric (PSD) matrix $A \in \mathbb{R}^{n \times n}$, it is well known that there exists a matrix $L$ such that

$$A = LL^T.$$

If the matrix L is lower triangular, then the factorization $A = LL^T$ is called a Cholesky factorization. If A is positive definite (and hence nonsingular), then L is also nonsingular. A Cholesky factorization is useful for many reasons:

  • A system of linear equations $Ax = b$ can be solved by first solving the lower triangular system $Ly = b$ followed by the upper triangular system $L^T x = y$.

  • A quadratic term $x^T A x$ in a constraint or objective can be replaced with $y^T y$ for $y = L^T x$, potentially leading to a more robust formulation (see [And13]); the identity shown right after this list explains why.
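
The second substitution relies only on the factorization itself: for any x,

$$x^T A x = x^T L L^T x = (L^T x)^T (L^T x) = y^T y.$$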

Therefore, MOSEK provides a function that can compute a Cholesky factorization of a PSD matrix. In addition, a function for solving linear systems with a nonsingular lower or upper triangular matrix is available.

In practice A may be very large, with n in the range of millions. However, A is then typically sparse, which means that most of the elements in A are zero, and the sparsity can be exploited to reduce the cost of computing the Cholesky factorization. The computational savings depend on the positions of the zeros in A. For example, below a matrix A is given together with a Cholesky factor computed to 5 digits of accuracy:

$$
A=\begin{bmatrix}
4 & 1 & 1 & 1\\
1 & 1 & 0 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 0 & 1
\end{bmatrix},\qquad
L=\begin{bmatrix}
2.0000 & 0 & 0 & 0\\
0.5000 & 0.8660 & 0 & 0\\
0.5000 & -0.2887 & 0.8165 & 0\\
0.5000 & -0.2887 & -0.4082 & 0.7071
\end{bmatrix}.
\tag{9.6}$$

However, if we symmetrically permute the rows and columns of A using a permutation matrix P

$$
P=\begin{bmatrix}
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
1 & 0 & 0 & 0
\end{bmatrix},\qquad
\bar{A}=PAP^T=\begin{bmatrix}
1 & 0 & 0 & 1\\
0 & 1 & 0 & 1\\
0 & 0 & 1 & 1\\
1 & 1 & 1 & 4
\end{bmatrix},$$

then the Cholesky factorization of $\bar{A}=\bar{L}\bar{L}^T$ is

$$
\bar{L}=\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
1 & 1 & 1 & 1
\end{bmatrix}$$

which is sparser than L.
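
It is easy to verify that this is indeed a factorization of the permuted matrix:

$$\bar{L}\bar{L}^T=\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 1\\
0 & 1 & 0 & 1\\
0 & 0 & 1 & 1\\
0 & 0 & 0 & 1
\end{bmatrix}=\begin{bmatrix}
1 & 0 & 0 & 1\\
0 & 1 & 0 & 1\\
0 & 0 & 1 & 1\\
1 & 1 & 1 & 4
\end{bmatrix}=\bar{A}.$$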

Computing a permutation matrix that leads to the sparsest Cholesky factorization, or to the minimal amount of work, is NP-hard. Good permutations can be chosen using heuristics, such as the minimum degree heuristic and its variants. The function mosek::compute_sparse_cholesky provided by MOSEK for computing a Cholesky factorization has a built-in permutation (i.e. reordering) heuristic. The following code illustrates the use of mosek::compute_sparse_cholesky and mosek::sparse_triangular_solve_dense.

Listing 9.3 How to use the sparse Cholesky factorization routine available in MOSEK.
    mosek::compute_sparse_cholesky(0,        //Mosek chooses number of threads
                                   1,        //Apply reordering heuristic
                                   1.0e-14,  //Singularity tolerance
                                   &anzc, &aptrc, &asubc, &avalc,  // Lower triangle of A in sparse column format
                                   & mut perm, & mut diag,         // Output: permutation and diag(D)
                                   & mut lnzc, & mut lptrc, & mut lensubnval, & mut lsubc, & mut lvalc)?;  // Output: the factor L in sparse column format
    print_sparse(n, perm.as_slice(), diag.as_slice(), lnzc.as_slice(), lptrc.as_slice(), lsubc.as_slice(), lvalc.as_slice());

    /* Permuted b is stored as x. */
    let mut x : Vec<f64> = perm.iter().map(|&i| b[i as usize]).collect();

    /*Compute  inv(L)*x.*/
    mosek::sparse_triangular_solve_dense(Transpose::NO,  lnzc.as_slice(), lptrc.as_slice(), lsubc.as_slice(), lvalc.as_slice(), x.as_mut_slice())?;
    /*Compute  inv(L^T)*x.*/
    mosek::sparse_triangular_solve_dense(Transpose::YES, lnzc.as_slice(), lptrc.as_slice(), lsubc.as_slice(), lvalc.as_slice(), x.as_mut_slice())?;

    print!("\nSolution A x = b, x = [ {:?} ]",
           iproduct!(0..n,izip!(perm.iter(),x.iter()))
           .filter_map(|(i,(&pj,&xj))|if pj as usize == i { Some(xj) } else { None })
           .collect::<Vec<f64>>());
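
Because the factorization is computed for the symmetrically permuted matrix, the right-hand side b is permuted before the two triangular solves, and the permuted solution must be scattered back to the original ordering at the end; the iproduct!/filter_map expression above does exactly that. An equivalent, more direct sketch (reusing perm and x from the listing) could look like this:

    // Scatter the permuted solution back to the original ordering:
    // x[i] is the solution component of the variable with original index perm[i].
    let mut sol = vec![0.0f64; x.len()];
    for (i, &pi) in perm.iter().enumerate() {
        sol[pi as usize] = x[i];
    }
    println!("\nSolution A x = b, x = {:?}", sol);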

We can set up the data to recreate the matrix A from (9.6):

    //Observe that anzc, aptrc, asubc and avalc only specify the lower triangular part.
    let n     = 4;
    let anzc  = [4, 1, 1, 1];
    let asubc = [0, 1, 2, 3, 1, 2, 3];
    let aptrc = [0, 4, 5, 6];
    let avalc = [4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0];
    let b     = [13.0, 3.0, 4.0, 5.0];
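
Here anzc, aptrc, asubc and avalc describe the lower triangle of A column by column: anzc holds the number of nonzeros in each column, aptrc the position of each column's first nonzero within asubc and avalc, asubc the row indices and avalc the values. The following standalone sketch (it does not call the MOSEK API, and the index types are simplified to usize for brevity) expands these arrays into a dense symmetric matrix, which is a convenient way to double-check that the data really encodes A from (9.6):

    // Expand the lower-triangular, columnwise sparse data into a dense
    // symmetric matrix and print it row by row.
    fn main() {
        let n     = 4usize;
        let anzc  = [4usize, 1, 1, 1];                   // nonzeros per column
        let aptrc = [0usize, 4, 5, 6];                   // first nonzero of each column
        let asubc = [0usize, 1, 2, 3, 1, 2, 3];          // row indices
        let avalc = [4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]; // values

        let mut a = vec![vec![0.0f64; n]; n];
        for j in 0..n {
            for k in aptrc[j]..aptrc[j] + anzc[j] {
                a[asubc[k]][j] = avalc[k]; // lower-triangular entry
                a[j][asubc[k]] = avalc[k]; // mirror into the upper triangle
            }
        }
        for row in &a {
            println!("{:?}", row); // prints the rows of A from (9.6)
        }
    }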

Running the code in Listing 9.3 on this data we obtain the following output:

Example with positive definite A.
P = [ 3 2 0 1 ]
diag(D) = [ 0.00 0.00 0.00 0.00 ]
L=
1.00 0.00 0.00 0.00
0.00 1.00 0.00 0.00
1.00 1.00 1.41 0.00
0.00 0.00 0.71 0.71

Solution A x = b, x = [ 1.00 2.00 3.00 4.00 ]

The output indicates that with the permutation matrix

$$
P=\begin{bmatrix}
0 & 0 & 0 & 1\\
0 & 0 & 1 & 0\\
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0
\end{bmatrix}$$

there is a Cholesky factorization $PAP^T = LL^T$, where

$$
L=\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
1 & 1 & 1.4142 & 0\\
0 & 0 & 0.7071 & 0.7071
\end{bmatrix}.$$

The remaining part of the code solves the linear system $Ax = b$ for $b = [13, 3, 4, 5]^T$. The solution is reported to be $x = [1, 2, 3, 4]^T$, which is correct.
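
Indeed, a quick check against A from (9.6) confirms the reported solution:

$$Ax=\begin{bmatrix}
4 & 1 & 1 & 1\\
1 & 1 & 0 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}1\\2\\3\\4\end{bmatrix}
=\begin{bmatrix}13\\3\\4\\5\end{bmatrix}=b.$$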

The second example shows what happens when we compute a sparse Cholesky factorization of a singular matrix. In this example A is a rank 1 matrix

$$
A=\begin{bmatrix}
1 & 1 & 1\\
1 & 1 & 1\\
1 & 1 & 1
\end{bmatrix}
=\begin{bmatrix}1\\1\\1\end{bmatrix}
\begin{bmatrix}1\\1\\1\end{bmatrix}^T
\tag{9.7}$$

    let n     = 3;
    let anzc  = [3, 2, 1];
    let asubc = [0, 1, 2, 1, 2, 2];
    let aptrc = [0, 3, 5, ];
    let avalc = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0];

Now we get the output

P = [ 0 2 1 ]
diag(D) = [ 0.00e+00 1.00e-14 1.00e-14 ]
L=
1.00e+00 0.00e+00 0.00e+00
1.00e+00 1.00e-07 0.00e+00
1.00e+00 0.00e+00 1.00e-07

which indicates the decomposition

$$PAP^T = LL^T - D$$

where

$$
P=\begin{bmatrix}
1 & 0 & 0\\
0 & 0 & 1\\
0 & 1 & 0
\end{bmatrix},\qquad
L=\begin{bmatrix}
1 & 0 & 0\\
1 & 10^{-7} & 0\\
1 & 0 & 10^{-7}
\end{bmatrix},\qquad
D=\begin{bmatrix}
0 & 0 & 0\\
0 & 10^{-14} & 0\\
0 & 0 & 10^{-14}
\end{bmatrix}.$$

Since A is only positive semidefinite, but not of full rank, some of the diagonal elements of A are boosted to make it truly positive definite. The amount of boosting is passed as an argument to mosek::compute_sparse_cholesky, in this case $10^{-14}$. Note that

$$PAP^T = LL^T - D$$

where D is a small matrix, so the computed Cholesky factorization is exact for a slightly perturbed A. In general this is the best we can hope for in finite precision, in particular when A is singular or close to being singular.
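
Indeed, multiplying out the reported factor shows exactly which entries were perturbed:

$$LL^T=\begin{bmatrix}
1 & 1 & 1\\
1 & 1+10^{-14} & 1\\
1 & 1 & 1+10^{-14}
\end{bmatrix}=PAP^T+D.$$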

We end this section with a word of caution. Computing a Cholesky factorization of a matrix that is not of full rank and not sufficiently well conditioned may lead to incorrect results, i.e. a matrix that is indefinite may be declared positive semidefinite and vice versa.