
dolfin team mailing list archive

Re: Fwd: Assembly benchmark

 



Anders Logg wrote:
On Mon, Aug 11, 2008 at 05:49:22PM +0100, Garth N. Wells wrote:

Dag Lindbo wrote:
Anders Logg wrote:
On Mon, Aug 11, 2008 at 10:32:23AM +0200, Dag Lindbo wrote:
Hello! (I've been out of e-mail range for a week)

I've posted some results on the wiki (and I actually have MTL4 :-) )
where I just show total assembly times with and without optimization
turned on. MTL4 responds quite well to -O3 (unlike uBLAS).

As far as the MTL4 backend goes, I have implemented and tested almost
all the operators and member functions. We are running into some quirks
with namespaces in MTL4 which Peter G is going to sort out. Also, there
is a bug somewhere in the version of the backend currently in place --
so please don't use it without manually initializing matrix dimensions.

I will sort this out in a day or two and then have a bundle ready that
finalizes the MTL4 backend. The performance results have been
encouraging enough to keep me (and Garth) motivated to get this backend
ready for "real" use.

/Dag
Good. I've updated my results (this time with MTL4 actually installed)
and MTL4 does very well.

The problem is it's cheating. Other backends would also do much better
if they were allowed to guess the number of nonzeros and guess right.

Of course, PETSc should be just as fast as MTL4. I don't think uBLAS
could be. As far as Epetra goes, I have no understanding of the insertion
procedure. The STL backend is cheating as well, since the matrix is not
(as far as I know) in a state which is suitable for linear solve.

In my opinion it really is necessary to give the user the option of
specifying the number of nonzeros per row. Many forms get assembled a
_lot_ (e.g. the ICNS-3D form used by Hoffman et al. has not changed in
years) and it seems absurd that DOLFIN should insist on recomputing the
information that is already known by the user.

The same form is not always assembled on the same mesh, so it is not possible to specify the nonzeros based on the form alone. The best approach is probably to let FFC generate a function that computes the maximum number of nonzeros in a row from mesh information.
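
As a rough illustration of why the bound depends on the mesh and not only on the form, here is a minimal sketch in plain Python. It assumes one degree of freedom per vertex (a P1-like element) and a mesh given as a list of vertex-index tuples. The function name max_nonzeros_bound is hypothetical and is not part of FFC or DOLFIN.

```python
from collections import Counter


def max_nonzeros_bound(cells):
    """Cheap upper bound on the nonzeros in any row, from mesh data only.

    Assumes one dof per vertex and that every cell has the same number
    of vertices, each appearing once per cell.
    """
    # Number of cells that touch each vertex.
    cells_per_vertex = Counter(v for cell in cells for v in cell)
    dofs_per_cell = len(cells[0])
    # Each incident cell adds at most (dofs_per_cell - 1) off-diagonal
    # columns to the row; the extra 1 accounts for the diagonal entry.
    return max(cells_per_vertex.values()) * (dofs_per_cell - 1) + 1


# Two triangulations of the same square: split along one diagonal,
# or split into four triangles around a centre vertex (index 4).
two_triangles = [(0, 1, 2), (1, 3, 2)]
four_triangles = [(0, 1, 4), (1, 3, 4), (3, 2, 4), (2, 0, 4)]

print(max_nonzeros_bound(two_triangles))
print(max_nonzeros_bound(four_triangles))
```

The same element gives a different bound on each mesh, and the bound is deliberately loose: it only needs the number of cells per vertex, so it is cheap to compute but can overestimate the true row length.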

And remove SparsityPatternBuilder?



Maybe, or it could look different. Perhaps it should be able to compute either a detailed sparsity pattern or just some key features of it (nonzeros per row, maximum nonzeros in a row), depending on what's required.
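
To make the distinction concrete, a small sketch in plain Python of the two levels of detail: the full pattern (the set of column indices in each row) and the summary features derived from it. It assumes one dof per vertex and a cell list of vertex indices. The names sparsity_pattern and nonzeros_per_row are hypothetical and do not reflect the actual SparsityPatternBuilder interface.

```python
def sparsity_pattern(cells, num_dofs):
    """Detailed pattern: for each row, the set of column indices that
    receive a contribution during assembly."""
    rows = [set() for _ in range(num_dofs)]
    for cell in cells:
        # Every dof in a cell couples to every dof in the same cell.
        for i in cell:
            rows[i].update(cell)
    return rows


def nonzeros_per_row(rows):
    """Key feature: only the number of nonzeros in each row."""
    return [len(columns) for columns in rows]


# A square split into four triangles around a centre vertex (index 4).
cells = [(0, 1, 4), (1, 3, 4), (3, 2, 4), (2, 0, 4)]

pattern = sparsity_pattern(cells, num_dofs=5)
counts = nonzeros_per_row(pattern)

print(counts)       # per-row counts, enough for row-wise preallocation
print(max(counts))  # a single number, enough for a uniform preallocation
```

A backend that only needs a row-length hint could stop at the counts, while one that wants the exact layout would consume the full pattern.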

Garth



