dolfin team mailing list archive

Re: Fwd: Assembly benchmark

To: dolfin-dev@xxxxxxxxxx
From: Dag Lindbo <dag@xxxxxxxxxx>
Date: Mon, 11 Aug 2008 18:23:31 +0200
Delivered-to: dolfin-dev@xxxxxxxxxx
In-reply-to: <20080811160025.GD2438@bunjil.simula.no>
User-agent: Thunderbird 2.0.0.16 (X11/20080724)

Anders Logg wrote:
> On Mon, Aug 11, 2008 at 10:32:23AM +0200, Dag Lindbo wrote:
>> Hello! (I've been out of e-mail range for a week)
>>
>> I've posted some results on the wiki (and I actually have MTL4 :-) )
>> where I just show total assembly times with and without optimization
>> turned on. MTL4 responds quite well to -O3 (unlike uBLAS).
>>
>> As far as the MTL4 backend goes, I have implemented and tested almost
>> all the operators and member functions. We are running into some quirks
>> with namespaces in MTL4 which Peter G is going to sort out. Also, there
>> is a bug somewhere in the version of the backend currently in place --
>> so please don't use it without manually initializing matrix dimensions.
>>
>> I will sort this out in a day or two and then have a bundle ready that
>> finalizes the MTL4 backend. The performance results have been
>> encouraging enough to keep me (and Garth) motivated to get this backend
>> ready for "real" use.
>>
>> /Dag
> 
> Good. I've updated my results (this time with MTL4 actually installed)
> and MTL4 does very well.
> 
> The problem is it's cheating. Other backends would also do much better
> if they were allowed to guess the number of nonzeros and guess right.
> 

Of course, PETSc should be just as fast as MTL4. I don't think uBLAS
could be. As for Epetra, I don't understand its insertion procedure.
The STL backend is cheating as well, since the matrix is not (as far
as I know) left in a state that is suitable for a linear solve.

In my opinion it really is necessary to give the user the option of
specifying the number of nonzeros per row. Many forms get assembled a
_lot_ (e.g. the ICNS-3D form used by Hoffman et al. has not changed in
years) and it seems absurd that DOLFIN should insist on recomputing the
information that is already known by the user.
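
A rough sketch of what a user-supplied nonzero count buys, in plain
Python and independent of DOLFIN, MTL4 or PETSc. The class and function
names are hypothetical, and the 1D problem is chosen only because its
row counts (at most 3) are obvious. Storage is sized once from the
per-row counts, so assembly never has to discover the sparsity pattern
or reallocate.

```python
class PreallocatedCSR:
    """Toy CSR matrix whose storage is sized once from per-row counts."""

    def __init__(self, nnz_per_row):
        # Row i owns the slots row_ptr[i] .. row_ptr[i + 1] - 1.
        self.row_ptr = [0]
        for n in nnz_per_row:
            self.row_ptr.append(self.row_ptr[-1] + n)
        total = self.row_ptr[-1]
        self.cols = [-1] * total       # -1 marks a slot not yet used
        self.vals = [0.0] * total

    def add(self, i, j, value):
        """Add value to entry (i, j), claiming a free slot in row i if needed."""
        for k in range(self.row_ptr[i], self.row_ptr[i + 1]):
            if self.cols[k] == j:      # entry already present: accumulate
                self.vals[k] += value
                return
            if self.cols[k] == -1:     # first free slot: claim it
                self.cols[k] = j
                self.vals[k] = value
                return
        raise ValueError("row %d is full: the nonzero count was too small" % i)


def assemble_1d_laplacian(n_cells):
    """Assemble the P1 stiffness matrix on n_cells unit-length intervals."""
    A = PreallocatedCSR([3] * (n_cells + 1))   # user-supplied row counts
    element = [[1.0, -1.0], [-1.0, 1.0]]       # element matrix for h = 1
    for cell in range(n_cells):
        dofs = (cell, cell + 1)
        for a in range(2):
            for b in range(2):
                A.add(dofs[a], dofs[b], element[a][b])
    return A
```

If the supplied count is too small, this sketch raises an error rather
than growing the row. A real backend could instead fall back to
reallocation, but that choice is outside what is discussed here.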

There has been a lot of discussion of assembler performance on the list
and so I made the MTL4 backend as a demonstration of how fast the core
DOLFIN/FFC/UFC assembler design could be. I intend to keep trying to
squeeze more performance out of DOLFIN and you may then decide if my
suggestions are worthwhile or "cheating".

/Dag
