Introduction to Fill-In

The Graph of a Matrix

Central to the ideas described in these pages is the concept of the graph of a symmetric matrix. That graph has one node for each unknown and an edge connecting nodes $ i$ and $ j$ if $ a_{ij}$ is non-zero. When eliminating the $ i$-th variable (or node) the graph changes in a simple way: All pairs of neighbors of node $ i$ are connected by an edge, and the node $ i$ itself is removed. Each new edge corresponds to an entry of $ A$ that started out zero, but becomes non-zero during the elimination.Fill-In is the total number of new edges arising in that process. By labeling the nodes of the graph in various ways fill-in can be substantially increased or decreased.

Suppose we are solving the Poisson Equation

$\displaystyle \bigtrianglup u =
u_{xx}+u_{yy} = f$

on the unit square. Discretizing suitably gives rise to a linear system

$\displaystyle Ax=b$

where the graph of $ A$ is a rectangular $ m\times n$ grid as indicated in this Figure (for m=n=6):

The labels of of the nodes thus form a rectangular $ m\times n$ array of the numbers from 0 to $ N-1$ where $ N=m*n$. We start labeling with 0 instead of 1 because that is the convention in the Java language used for the applet on this page. We illustrate the various labeling techniques described below with the special choice

$\displaystyle m=n=10$

Thus the matrix $ A$ is a 100 by 100 square matrix which has at most 5 non-zero entries in each row. Typical sparse systems are much larger.

Shown below for each technique there is the square array giving the labeling, and a plot of the resulting matrix. Clicking on the plot shows an enlarged image. The green squares in the plot indicate the originally non-zero entries of the matrix. Red Squares indicate the elements that become non-zero in the elimination process. Technically we are constructing a triangular matrix, but the fill-in is shown in both halves of th matrix.

Lexicographical Ordering

The nodes are simply ordered as we would read them;

\begin{displaymath}\begin{array}{cccccccccc}
0& 1& 2& 3& 4& 5& 6& 7& 8& 9 \cr
...
... 89 \cr
90& 91& 92& 93& 94& 95& 96& 97& 98& 99 \cr
\end{array}\end{displaymath}

This ordering gives rise to the following filled-in matrix:

The Figure clearly illustrates the familiar block-tridiagonal structure of the resulting matrix. All the fill-in occurs in a band of width 21 along the diagonal. You can generate Figures like this after clicking on the above Applet. Detailed instructions are in the User's Guide. The writing at the bottom of the Figure gives details of the fill-in. (You can see it clearly by clicking on the Figure and examining larger version.) The actual fill-in for this ordering is 729 above (or below) the diagonal.

The Cuthill-McKee Algorithm

Fill-In in each row of $ A$ occurs only between the diagonal and the non-zero element of the row that is farthest from the diagonal. The idea of the Cuthill-McKee algorithm is to reduce those maximum distances. In terms of the graph labels at neighboring points should be close in numerical value. For a general graph, the Cuthill-McKee algorithm proceeds as follows:

  1. Pick a first point. Label it 0.
  2. Until all points are labeled: Find the point with the lowest index that still has unlabeled neighbors. Label its neighbors sequentially starting with smallest still available label..

For our special example this process creates the labels

\begin{displaymath}\begin{array}{cccccccccc}
0& 1& 3& 6& 10& 15& 21& 28& 36& 45...
... 97 \cr
54& 63& 71& 78& 84& 89& 93& 96& 98& 99 \cr
\end{array}\end{displaymath}

We obtain:

This is clearly an improvement over the lexicographical ordering. The bandwidth of the matrix is still 21, but there is less fill-in at the beginning and the end of the diagonal. Indeed, the fill-in is only 525 compared to 729.

In general, it turns out to be advantageous if the labeling is then reversed, from highest to lowest, which gives rise to the reverse Cuthill-McKee Algorithm. For the particular case of the Poisson Equation reversing the labeling creates an equivalent labeling.

Nested Disection

This idea was introduced by Alan George. Suppose we label last a row (or column) R of points that divides the remaining points into two set S and T that have approximately the same size. Elimination in S will not introduce fill-in in T, and vice versa. Thus the problem is effectively divided into two parts, each of which has a matrix that has roughly one fourth (one half squared) the size of the original matrix. The two sets will be connected when eliminating in R, but that will cause fill-in only in the last few columns of A. This idea is then applied recursively to the sets S and T, and their subsets. In our example we obtain the array

\begin{displaymath}\begin{array}{cccccccccc}
74& 77& 84& 64& 67& 99& 26& 29& 31...
... 13& 3 \cr
48& 49& 56& 40& 41& 90& 4& 5& 12& 0 \cr
\end{array}\end{displaymath}

which warrants close examination. Note how the process is started in the sixth column of the graph.

This time we obtain quite a different picture:

There is no band structure in this matrix. Neighboring nodes may have labels that are far apart. Ignoring the last 10 rows and columns we see a block diagonal matrix, as expected. Look closely and you'll be able to discern the recursive structure of the ordering.

The Greedy Algorithm

The basic idea is at each stage to eliminate one of the variables that currently has the smallest number of neighbors. This is properly called the minimum degree algorithm, but the term greedy algorithm is more descriptive and more popular. There may be several nodes with the minimal degree, which gives rise to some discretion in the process. Picking each time the first available vertex with minimum degree leads to this graph:

\begin{displaymath}\begin{array}{cccccccccc}
0& 7& 72& 48& 4& 98& 5& 68& 6& 1 \...
...4& 45 \cr
2& 16& 65& 17& 94& 18& 76& 19& 47& 3 \cr
\end{array}\end{displaymath}

Note how the vertices of the square (each of which has only three neighbors) are labeled first.

We obtain

Red-Black Ordering

Think of the graph as a checker board and label first the black points and then the red ones. There is no fill-in when eliminating among the black points. This can be thought of as a variant of the greedy algorithm: we begin with half the points not introducing any fill-in at all. However, a price must be paid when moving on to the remaining points.

\begin{displaymath}\begin{array}{cccccccccc}
0& 50& 1& 51& 2& 52& 3& 53& 4& 54 ...
... 94 \cr
45& 95& 46& 96& 47& 97& 48& 98& 49& 99 \cr
\end{array}\end{displaymath}

We obtain

The simple idea of red/black ordering can be surprisingly effective!

Maximizing Fill-In

While of course one wants to minimize fill-in it may be intriguing to ask about the maximum possible fill-in. The following labeling is obtained by perverting the greedy algorithm and having it pick a vertex of maximum degree at each step:

The resulting matrix is:

Summary

The following table shows the fill-in for the various labeling techniques, ordered by increasing fill-in. The percentage given is that of twice the fill-in with reference to the entire matrix, reflecting the fact that we would ordinarily only keep one triangle of the matrix.

Method Fill-In Percent
greedy algorithm 351 7.02%
Red/Black 433 8.66%
Nested Disection 500 10.0%
Cuthill-McKee 525 10.5%
lexicographical 729 14.6%
maximum greed 2891 57.8%

You can see many more such statistics. However, the main purpose of these pages is to let you explore fill-in yourself using the applet on this page. It's pretty straightforward, but there is also a detailed User's Guide. Come up with your own labeling strategy and beat the greedy algorithm!