A simple implementation code based on a line compression graph in C

  • 2020-04-01 21:39:13
  • OfStack

First of all, let's talk a little bit about what a row compression graph is, which is strictly a row compression matrix. Normally, a matrix is simply stored in a two-dimensional array, but a sparse matrix with a lot of zeros is a waste of space. So there are various ways to save space, triples storage is one of them.

What is a triple? A triplet is (row,col,value), so that all non-zero values form a vector. This saves a lot of space over two dimensional arrays, and of course you can save even more because row or col is stored repeatedly in triples, one row or column at a time.

So what exactly is row compression storage? Row compression is stored, all is not zero value in line access order form a vector, then each line save column subscript values are not zero, the size of the two vectors and sparse matrix in does not have the same number of 0 is worth, of course to achieve access to the row compression matrix, and the column of each row of the 0 subscript position at the beginning of the second vector save, someone call this pointer. With these three vectors you can have efficient row access to the matrix. Row compression storage is better than triples not only for space compression, but also for efficient row access. Triples can be binary searched to access a row if they are ordered, but the time complexity is constant when row compression storage is accessed by row. You can refer to the following row compression matrix diagram:

< img Alt = "" border = 0 SRC =" / / files.jb51.net/file_images/article/201305/2013050417172623.png ">

< img Alt = "" border = 0 SRC =" / / files.jb51.net/file_images/article/201305/2013050417172624.png ">

Now, you might be wondering, well, how did you get so many row compression matrices out of your row compression diagram? In fact, the graph and the matrix are equivalent, the row of the matrix can be regarded as the outgoing edge of a node of the graph, and the column of the matrix can be regarded as the incoming edge of a node of the graph. Of course, there are two conditions that must be met: the first is that the number of nodes in the graph must be continuous from 0 or 1 (which can be solved by mapping the nodes of the graph once), and the second is that the graph must be at least weakly connected (non-connected graphs can be split into connected pictures). The efficient storage access of sparse matrix is realized.

So let's talk about my implementation. My implementation differs from the classic row compression matrix in two ways: the first is that the classic row compression matrix does not take into account the case where all rows are zero, which I do (not because I'm bored, of course, but because of the need). The second is that the classic row compression graph is slow to access by column (relative to the speed of accessing by row, of course), and the time complexity is linear to access by column on the row compression graph. I also dealt with this situation.

Here is my idea briefly:

The first problem is that I set all Pointers to -1 by default, which means that the row may be all zeros, and only the correct pointer if there is a non-zero value. Of course, the access should also do the corresponding processing.

Here's how I solved the second problem. I've stored a copy of the row index for each column that is not 0, and the starting position of the row index for each column that is not 0. So I have two more vectors in my implementation, which wastes storage, but improves the efficiency of column access.

Ok, talk is cheap, show me the code. Here is my code (there may be errors, I just did a simple test)

The edge vector is used to construct the compression graph



    private void buildGraph(Vector<Edge> edges) {
        int edgeSize = edges.size();
        weight = new Vector<Float>(edgeSize);
        rowIndex = new Vector<Integer>(edgeSize);
        rowPtr = new Vector<Integer>(nodeCount + 1);
        colIndex = new Vector<Integer>(edgeSize);
        colPtr = new Vector<Integer>(nodeCount + 1);
        // set default value as -1
        for (int i = 0; i < nodeCount; ++i) {
            rowPtr.add(-1);
            colPtr.add(-1);
        }
        rowPtr.add(edges.size());
        colPtr.add(edges.size());
        // sort the edge based on first node
        EdgeBasedOnFirstNodeComparator cmp = new EdgeBasedOnFirstNodeComparator();
        Collections.sort(edges, cmp);
        // build row index and pointer
        int curNode = edges.elementAt(0).getFirstNode();
        int curPtr = 0;
        for (int i = 0; i < edgeSize; ++i) {
            Edge e = edges.elementAt(i);
            // System.out.println("curNode" + curNode + "firstNode: "
            // + e.getFirstNode());
            weight.add(e.getWeight());
            rowIndex.add(e.getSecondNode());
            if (curNode != e.getFirstNode()) {
                rowPtr.set(curNode, curPtr);
                curNode = e.getFirstNode();
                curPtr = i;
            }
        }
        rowPtr.set(curNode, curPtr);
        // sort the edge based on second node
        EdgeBasedOnSecondNodeComparator cmp2 = new EdgeBasedOnSecondNodeComparator();
        Collections.sort(edges, cmp2);
        // build column index and pointer
        curNode = edges.elementAt(0).getSecondNode();
        curPtr = 0;
        for (int i = 0; i < edgeSize; ++i) {
            Edge e = edges.elementAt(i);
            colIndex.add(e.getFirstNode());
            if (curNode != e.getSecondNode()) {
                colPtr.set(curNode, curPtr);
                curNode = e.getSecondNode();
                curPtr = i;
            }
        }
        colPtr.set(curNode, curPtr);
    }


 Gets the outgoing edge of a node 

@Override
public Vector<Edge> getOutEdges(int node) {
    Vector<Edge> res = new Vector<Edge>();
    int startIndex = getStartIndex(node, true);
    if (startIndex == -1) {
        //There's no edge
        return null;
    }
    int endIndex = getEndIndex(node, true);
    float value;
    Edge e;
    int outNode;
    for (int index = startIndex; index < endIndex; ++index) {
        value = weight.elementAt(index);
        outNode = rowIndex.elementAt(index);
        e = new Edge(node, outNode, value);
        res.add(e);
    }
    return res;
}
 Gets the input edge of a node 
?

@Override
public Vector<Edge> getInEdges(int node) {
    Vector<Edge> res = new Vector<Edge>();
    int startIndex = getStartIndex(node, false);
    //There are no edges
    if (startIndex == -1) {
        return null;
    }
    int endIndex = getEndIndex(node, false);
    float value;
    Edge e;
    int inNode;
    for (int index = startIndex; index < endIndex; ++index) {
        inNode = colIndex.elementAt(index);
        value = getWeight(inNode, node);
        e = new Edge(inNode, node, value);
        res.add(e);
    }
    return res;
}

So this is not the same as row access, so when you do row access, you just read the value of the weight vector, but not here, so the weight vector is stored in row access order. My solution is to get the incoming node and then access the entire node pair by row to get the corresponding value. This is a little bit of a wrap, but for sparse graphs, it's basically constant. Here is the code for getWeight


    private float getWeight(int row, int col) {
        int startIndex = getStartIndex(row, true);
        if(startIndex==-1)
            return 0;
        int endIndex = getEndIndex(row, true);
        for (int i = startIndex; i < endIndex; ++i) {
            if (rowIndex.elementAt(i) == col)
                return weight.elementAt(i);
        }
        return 0;
    }

And then finally, the special treatment for rows or columns that are all zeroes, in this case, is the function that gets the start and end positions from the pointer vector


    private int getStartIndex(int node, boolean direction) {
        // true : out edge
        if (direction)
            return rowPtr.elementAt(node);
        else
            return colPtr.elementAt(node);
    }
 
?

    private int getEndIndex(int node, boolean direction) {
        // true:out edge
        if (direction) {
            int i = 1;
            while ((node + i) < nodeCount) {
                if (rowPtr.elementAt(node + i) != -1)
                    return rowPtr.elementAt(node + i);
                else
                    ++i;
            }
            return rowPtr.elementAt(nodeCount);
        } else {
            int i = 1;
            while ((node + i) < nodeCount) {
                if (colPtr.elementAt(node + i) != -1)
                    return colPtr.elementAt(node + i);
                else
                    ++i;
            }
            return colPtr.elementAt(nodeCount);
        }
    }

I've only implemented two of the simplest functions here, getting the in and out edges. On the one hand, these two functions are enough for what I'm doing, and on the other hand, for a graph, with these two functions, all the other functions can be implemented accordingly.


Related articles: