dolfin team mailing list archive

Re: MTL4 backend: Significant performance results

To: dolfin-dev@xxxxxxxxxx
From: Anders Logg <logg@xxxxxxxxx>
Date: Wed, 16 Jul 2008 22:27:43 +0200
Delivered-to: dolfin-dev@xxxxxxxxxx
In-reply-to: <33628.80.217.175.108.1216159085.squirrel@webmail.csc.kth.se>
Mail-followup-to: dolfin-dev@xxxxxxxxxx
User-agent: Mutt/1.5.17+20080114 (2008-01-14)

Very nice!

Some comments:

1. Beating uBLAS by a factor of 3 is not that hard. Didem Unat (PhD
student at UCSD/Simula) and Ilmar have been looking at the assembly in
DOLFIN recently. We've done some initial benchmarks and have started
investigating how to speed up the assembly. Take a look at what happens
when we assemble into uBLAS:

  (i)   Compute sparsity pattern
  (ii)  Reset tensor
  (iii) Assemble

For uBLAS, each of these steps costs approximately as much as a full
assembly pass. I don't remember the exact numbers, but by just using an
std::vector<std::map<int, double> > instead of a uBLAS matrix, one may
skip (i) and (ii) and get a speedup.
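
To make the idea concrete, here is a minimal sketch of assembling into a
std::vector<std::map<int, double> >. It is not the DOLFIN assembler: the
names MapMatrix, add_local and assemble_laplace_1d are made up for the
illustration, and the 1D Laplace element matrix is just a stand-in for a
real form. The point is that entries are created on first insertion, so
no sparsity pattern has to be computed and no tensor has to be reset
beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical row-wise sparse accumulator (illustration only, not a DOLFIN class).
// Each row is a map from column index to value; entries appear on first insertion.
class MapMatrix {
public:
    explicit MapMatrix(std::size_t num_rows) : rows_(num_rows) {}

    // Add a dense n x n element matrix (row-major) at the given global dof indices.
    void add_local(const std::vector<double>& block, const std::vector<int>& dofs) {
        const std::size_t n = dofs.size();
        assert(block.size() == n * n);
        for (std::size_t i = 0; i < n; ++i) {
            assert(dofs[i] >= 0 && static_cast<std::size_t>(dofs[i]) < rows_.size());
            std::map<int, double>& row = rows_[static_cast<std::size_t>(dofs[i])];
            for (std::size_t j = 0; j < n; ++j)
                row[dofs[j]] += block[i * n + j];  // operator[] creates a 0.0 entry if missing
        }
    }

    // Read an entry; entries that were never inserted are zero.
    double get(int i, int j) const {
        const std::map<int, double>& row = rows_[static_cast<std::size_t>(i)];
        const std::map<int, double>::const_iterator it = row.find(j);
        return it == row.end() ? 0.0 : it->second;
    }

    // Number of stored entries over all rows.
    std::size_t nonzeros() const {
        std::size_t count = 0;
        for (std::size_t i = 0; i < rows_.size(); ++i)
            count += rows_[i].size();
        return count;
    }

private:
    std::vector<std::map<int, double> > rows_;
};

// Stand-in "form": P1 stiffness matrix of -u'' on n_cells uniform cells of [0, 1].
MapMatrix assemble_laplace_1d(int n_cells) {
    MapMatrix A(static_cast<std::size_t>(n_cells) + 1);
    const double inv_h = static_cast<double>(n_cells);  // 1/h with h = 1/n_cells
    const std::vector<double> element = {inv_h, -inv_h, -inv_h, inv_h};
    for (int c = 0; c < n_cells; ++c) {
        const std::vector<int> dofs = {c, c + 1};
        A.add_local(element, dofs);  // step (iii) only: no pattern, no reset
    }
    return A;
}
```

Calling assemble_laplace_1d(4) yields a 5x5 tridiagonal matrix. The
trade-off is that a std::map per row is poorly suited to matrix-vector
products, so in practice one would probably copy the result into a
compressed format before solving.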

We've just started and don't have anything to present yet.

2. I've also looked at MTL before. We even considered using it as the
main LA backend a (long) while back.

3. With the new LA interfaces in place, I wouldn't mind having MTL as
an optional backend.

-- 
Anders


On Tue, Jul 15, 2008 at 11:58:05PM +0200, Dag Lindbo wrote:
> Hello!
> 
> In light of the long and interesting discussion we had a while ago about
> assembler performance, I decided to try to squeeze more out of the uBLAS
> backend. This was not very successful.
> 
> However, I've been following the development of MTL4
> (http://www.osl.iu.edu/research/mtl/mtl4/) with a keen eye on the
> interesting insertion scheme they provide. I implemented a backend --
> without sparsity pattern computation -- for the dolfin assembler and here
> are some first benchmark results:
> 
> Incompressible Navier-Stokes on 50x50x50 unit cube
> 
> MTL --------------------------------------------------------
> assembly time: 8.510000
> reassembly time: 6.750000
> vector assembly time: 6.070000
> 
> memory: 230 MB
> 
> UBLAS ------------------------------------------------------
> assembly time: 23.030000
> reassembly time: 12.140000
> vector assembly time: 6.030000
> 
> memory: 642 MB
> 
> Poisson on 2000x2000 unit square
> 
> MTL --------------------------------------------------------
> assembly time: 9.520000
> reassembly time: 6.650000
> vector assembly time: 4.730000
> linear solve: 0.000000
> 
> memory: 452 MB
> 
> UBLAS ------------------------------------------------------
> assembly time: 15.400000
> reassembly time: 7.520000
> vector assembly time: 5.020000
> 
> memory: 1169 MB
> 
> Conclusions? MTL is more than twice as fast and allocates less than half
> the memory (since there is no sparsity pattern computation) across a set
> of forms I've tested.
> 
> The code is not perfectly done yet, but I'd still be happy to share it
> with whoever wants to mess around with it.
> 
> Cheers!
> 
> /Dag
> 
> _______________________________________________
> DOLFIN-dev mailing list
> DOLFIN-dev@xxxxxxxxxx
> http://www.fenics.org/mailman/listinfo/dolfin-dev


Follow ups

Re: MTL4 backend: Significant performance results
From: Garth N. Wells, 2008-07-16

References

MTL4 backend: Significant performance results
From: Dag Lindbo, 2008-07-15