Dag Lindbo wrote:
kent-and@xxxxxxxxx wrote:
> Sounds amazing! I'd like to see that code, although I cannot promise you
> too much response during my holiday, which starts tomorrow. Have you
> compared matrix-vector products with vector products using uBlas or PETSc?

Will do. More good news: after some discussion about insertion operations on the MTL list, Peter Gottschling (the lead developer of MTL4) implemented some optimizations based on some other benchmarks I ran. For insertion into very sparse matrices (like Poisson) I got a further 30% speedup. I could put the MTL4 experimental stuff in the sandbox. Does that sound good? I'm going on vacation too.
Sounds good. Send an hg bundle and any special instructions for building.

Garth
/Dag

Kent

Hello!

In light of the long and interesting discussion we had a while ago about assembler performance, I decided to try to squeeze more out of the uBlas backend. This was not very successful. However, I've been following the development of MTL4 (http://www.osl.iu.edu/research/mtl/mtl4/) with a keen eye on the interesting insertion scheme it provides. I implemented a backend -- without sparsity pattern computation -- for the dolfin assembler, and here are some first benchmark results:

Incompressible Navier-Stokes on 50x50x50 unit cube

MTL
--------------------------------------------------------
assembly time:        8.510000
reassembly time:      6.750000
vector assembly time: 6.070000
memory:               230 mb

UBLAS
------------------------------------------------------
assembly time:        23.030000
reassembly time:      12.140000
vector assembly time: 6.030000
memory:               642 mb

Poisson on 2000x2000 unit square

MTL
--------------------------------------------------------
assembly time:        9.520000
reassembly time:      6.650000
vector assembly time: 4.730000
linear solve:         0.000000
memory:               452 mb

UBLAS
------------------------------------------------------
assembly time:        15.400000
reassembly time:      7.520000
vector assembly time: 5.020000
memory:               1169 mb

Conclusions? MTL is more than twice as fast and allocates less than half the memory (since there is no sparsity pattern computation) across the set of forms I've tested. The code is not perfectly done yet, but I'd still be happy to share it with whoever wants to mess around with it.

Cheers!
/Dag

_______________________________________________
DOLFIN-dev mailing list
DOLFIN-dev@xxxxxxxxxx
http://www.fenics.org/mailman/listinfo/dolfin-dev