← Back to team overview

dolfin team mailing list archive

Assembly timings

 

Hey everyone.

Andy (Terrel) has been trying out some new ways to improve the speed
of assembly in DOLFIN. We just sat down to make some timings and here
are the results/conclusions. Comments and suggestions are welcome.

The basic idea is to store the sparse matrix as a std::vector of
std::maps (implemented as a binary tree):

    std::vector<std::map<int, real> > rows(M);

With this data structure, we replace the line

    A.add(block, test_dofs, m, trial_dofs, n);

which just calls PETSc MatSetValues with the following code
for insertion into the sparse matrix

for(uint i=0; i<m; i++)
{
    std::map<int, real>& row = rows[test_dofs[i]];
    for(uint j = 0; j < n; j++)
    {
        const int J = trial_dofs[j];
        iter = row.find(J);
        const real val = block[i*n+j];
        if ( iter == row.end() )
            row.insert(std::map<int,real>::value_type(J,val));
	else 
	    (*iter).second += val;
    }
}

This is done once for each element. At the end of assembly, we copy
the values into a PETSc matrix.

The result: PETSc MatSetValues is about twice as fast as the STL.

Here are the numbers, broken into timings for the different stages of
assembly (times in seconds for assembly of Poisson on a 128x128
triangular mesh):

    - Iteration over mesh + misc stuff    0.04
    - Mapping dofs, updating affine map   0.23 (this can be faster)
    - Computing element matrix            0.01  :-)
    - Inserting into STL binary tree      0.58 
    - PETSc assemble begin/end            0.05
    - Copying data to PETSc matrix        0.18 (this can be faster)

As a comparison, just calling PETSc MatSetValues takes 0.24s, which
should be compared to the sum 0.58s + 0.18s.

Perhaps inserting into the STL binary tree can be improved if one
could reserve the size of each map before assembly. (Anyone knows if
this is possible? Something like std::vector::reserve()?)

Again, PETSc wins. The results from previous benchmarks against PETSc
seem to hold: PETSc is about twice as a simple sparse data
structure.

This seems to hold both when the the sparse data structure is

    double** values;
    int** columns;

or when it is

    std::vector<std::map<int, real> > rows(M);    

/Anders



Follow ups