
dolfin team mailing list archive

Re: Assembly benchmark

 



Anders Logg wrote:
I've run the old and new assembly through valgrind --tool=callgrind
and here's a summary of the results:

Old assembly, below assembleCommonOld():

    assembleElementOld()         86.74%
    other                        13.26%

New assembly, below assembleCommon():

    assembleElementTensor()      42.73%
    DofMap::numNonZeroesRow()    39.19%
    DofMap::~DofMap()            12.09%
    other                         5.99%

So in summary the DofMap class adds about 100% in extra work.

One problem is that DofMap is called on every assemble, whereas it should only be called on the first.
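
A minimal sketch of the "only on the first assemble" idea, assuming the dof map can simply be cached in the object that drives the assembly. DofMapSketch and AssemblerSketch are hypothetical stand-ins for illustration, not the actual DOLFIN classes:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a dof map that is expensive to build.
struct DofMapSketch
{
  explicit DofMapSketch(std::size_t n_dofs) : nnz_per_row(n_dofs, 0) {}
  std::vector<std::size_t> nnz_per_row;
};

// Hypothetical assembler that builds the dof map lazily and keeps it.
class AssemblerSketch
{
public:
  explicit AssemblerSketch(std::size_t n_dofs) : n_dofs_(n_dofs) {}

  void assemble()
  {
    // Build the dof map on the first call only; later calls reuse it.
    if (!dof_map_)
    {
      dof_map_ = std::make_unique<DofMapSketch>(n_dofs_);
      ++dof_map_builds_;
    }
    // ... loop over cells and add element tensors using *dof_map_ ...
  }

  // Number of times the dof map was constructed (for checking the cache).
  int dof_map_builds() const { return dof_map_builds_; }

private:
  std::size_t n_dofs_;
  std::unique_ptr<DofMapSketch> dof_map_;
  int dof_map_builds_ = 0;
};
```

With this layout, assembling 100 times (as the benchmark does) would construct and destroy the dof map once rather than 100 times.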

Are you using uBLAS or PETSc matrices?

Garth

(The overall benchmark slows down by only about 60% because it also
runs some local benchmarks for the element tensor, which are not
affected by the DofMap.)

/Anders


On Tue, Jan 16, 2007 at 10:04:30AM +0100, Anders Logg wrote:
I have added an old version of the assembly to FEM. Running the
benchmark in src/bench/fem/assembly/ with the old assembly confirms
the result that the new assembly is about 60% slower.

To change between the two (old and new), edit lines 81-82 in
src/bench/fem/assembly/main.cpp.

/Anders



On Tue, Jan 16, 2007 at 12:06:10AM +0100, Anders Logg wrote:
On Mon, Jan 15, 2007 at 10:54:18PM +0100, Garth N. Wells wrote:
The extra time could be in the mesh initialisation. DofMap is pretty fast.
I'm not sure. The benchmark initializes the mesh connectivity before
assembling and then assembles 100 times, so at least the last 99 times
the mesh initialization should not do anything. (Calling init() on the
mesh twice does not do anything the second time.)

What needs to be done is to use the DofMap to initialise the sparse matrix pattern (at least for uBLAS). What happens now is that everything is assembled into a uBLAS vector-of-compressed-vectors (fast assembly), which is then copied to a CSR matrix (for fast matrix-vector multiplication or for passing to UMFPACK). Using the DofMap, the CSR matrix could be initialised directly. This should make things faster and use less memory.
ok, let's do that.
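
The proposal above can be sketched roughly as follows: compute the nonzero pattern from a cell-to-dof table, allocate the CSR arrays once, and then add element tensors straight into them. This is only an illustration under stated assumptions. CsrMatrix, init_pattern and add_element are hypothetical names, and none of this uses the uBLAS or DOLFIN APIs:

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical CSR container; not a DOLFIN or uBLAS type.
struct CsrMatrix
{
  std::vector<std::size_t> row_ptr; // size n_rows + 1
  std::vector<std::size_t> cols;    // sorted column indices, row by row
  std::vector<double> vals;         // one value per stored entry
};

// Build the sparsity pattern: every pair of dofs that share a cell
// gives one potentially nonzero entry.
CsrMatrix init_pattern(const std::vector<std::vector<std::size_t>>& cell_dofs,
                       std::size_t n_dofs)
{
  std::vector<std::set<std::size_t>> rows(n_dofs);
  for (const auto& dofs : cell_dofs)
    for (std::size_t i : dofs)
      rows[i].insert(dofs.begin(), dofs.end());

  CsrMatrix A;
  A.row_ptr.push_back(0);
  for (const auto& row : rows)
  {
    A.cols.insert(A.cols.end(), row.begin(), row.end());
    A.row_ptr.push_back(A.cols.size());
  }
  A.vals.assign(A.cols.size(), 0.0);
  return A;
}

// Add a dense element tensor (row-major, n x n) into the preallocated
// matrix. Assumes every (dofs[i], dofs[j]) pair is in the pattern.
void add_element(CsrMatrix& A, const std::vector<std::size_t>& dofs,
                 const std::vector<double>& Ae)
{
  const std::size_t n = dofs.size();
  for (std::size_t i = 0; i < n; ++i)
  {
    const auto first = A.cols.begin() + A.row_ptr[dofs[i]];
    const auto last = A.cols.begin() + A.row_ptr[dofs[i] + 1];
    for (std::size_t j = 0; j < n; ++j)
    {
      // Binary search for the column within this row's sorted indices.
      const auto it = std::lower_bound(first, last, dofs[j]);
      A.vals[static_cast<std::size_t>(it - A.cols.begin())] += Ae[i*n + j];
    }
  }
}
```

The std::set per row is used here only to keep the sketch short; the point is that the pattern is computed once, so no intermediate matrix has to be copied to CSR afterwards.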

Also, which problem are you assembling? The scalar Poisson equation can be misleading, as it tends to fill sparse matrices quite fast even if they are not well initialised. Poor initialisation tends to show up with vector equations.
Advection operator for cubic Lagrange in 3D:

    a = v*c[i]*U.dx(i)*dx

/Anders


Garth

Anders Logg wrote:
There is a benchmark for assembly in src/bench/fem/assembly/.
We should use this to track the speed of assembly during the
planned changes to the Assembly and DofMap classes.

The assembly is now 59% slower than it was before the DofMap (and
maybe other changes).

  0.63-dev (changeset 1d485299d95)
  CPU time: 87.8056

  0.6.4 (changeset 49bf8f3ca3d9)
  CPU time: 139.686

Note that the benchmarks are run with --enable-optimization.

Let's try to get back to the speed we had before or at least find out
exactly what takes time. It might be that computing the nonzero
pattern is what takes more time, but that this is necessary to reduce
the memory usage.

/Anders
_______________________________________________
DOLFIN-dev mailing list
DOLFIN-dev@xxxxxxxxxx
http://www.fenics.org/mailman/listinfo/dolfin-dev




