ffc team mailing list archive

Re: Preliminary benchmark results for FFC

 

Anders, in my estimation, you can only do so much to the quadrature
approach.  There are basically two things you can do without resorting
to really bizarre compiler tricks:
- get a better quadrature rule
- get a better loop nest.

The savings of the first one can be significant if you are doing Gauss
quadrature mapped to a tet in 3D versus using the "optimal" points.

For the second, the savings are about a factor of two at best.
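
As a rough illustration of what a better loop nest can mean, here is a
small sketch.  It is an assumption about the kind of rearrangement
intended, not code from FFC or from the benchmarks.  It computes the P1
mass matrix on a triangle by quadrature twice: once with a
straightforward loop nest, and once with the basis tabulated up front,
the loop invariants hoisted, and only the upper triangle computed.  The
function names and the choice of the edge-midpoint rule are hypothetical.

```python
# Illustrative sketch only: names and the choice of form are assumptions.
# 3-point edge-midpoint rule on the reference triangle (exact for degree 2).
POINTS = [(0.5, 0.0), (0.5, 0.5), (0.0, 0.5)]
WEIGHTS = [1.0 / 6.0] * 3


def basis(x, y):
    """P1 basis functions evaluated at a reference point."""
    return (1.0 - x - y, x, y)


def mass_naive(det):
    """Straightforward loop nest: every (i, j) entry at every point."""
    A = [[0.0] * 3 for _ in range(3)]
    for (x, y), w in zip(POINTS, WEIGHTS):
        for i in range(3):
            for j in range(3):
                # The basis is re-evaluated inside the innermost loop.
                A[i][j] += w * det * basis(x, y)[i] * basis(x, y)[j]
    return A


def mass_reordered(det):
    """Tabulate the basis once, hoist invariants, exploit symmetry."""
    table = [basis(x, y) for (x, y) in POINTS]  # mesh independent
    A = [[0.0] * 3 for _ in range(3)]
    for phi, w in zip(table, WEIGHTS):
        scale = w * det                  # invariant in i and j
        for i in range(3):
            s = scale * phi[i]           # invariant in j
            for j in range(i, 3):        # upper triangle only
                A[i][j] += s * phi[j]
    # Mirror the upper triangle into the lower one.
    for i in range(3):
        for j in range(i):
            A[i][j] = A[j][i]
    return A
```

Both functions return the same matrix.  The second one does roughly half
of the inner multiply-adds because it uses symmetry, which is in line
with a gain of about a factor of two.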

This holds if you do one element at a time.  You can make a space/time
tradeoff and do lots of elements in a batch by vectorizing and/or using
Level 3 BLAS.  This can speed things up, and it applies to both FFC-type
contractions and quadrature.  Incidentally, this is how Kevin can afford
to do interpretation on his tree at run-time.
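
To make the batching idea concrete, here is a minimal sketch for the
FFC-type contraction, assuming P1 Poisson on triangles.  The element
matrix is the contraction of a mesh-independent reference tensor with a
per-element geometry tensor, so a whole batch of elements becomes one
(9 x 4) times (4 x n_elem) matrix-matrix product.  The names are
hypothetical, and the plain-Python matmul below only stands in for a
Level 3 BLAS GEMM call; this is not FFC's generated code.

```python
# Illustrative sketch only: names are assumptions, not FFC's API.
# Gradients of the three P1 basis functions on the reference triangle.
GRADS = [(-1.0, -1.0), (1.0, 0.0), (0.0, 1.0)]
REF_AREA = 0.5


def reference_tensor():
    """Rows: flattened (i, j); columns: flattened (a, b).  Shape 9 x 4."""
    return [[REF_AREA * gi[a] * gj[b] for a in range(2) for b in range(2)]
            for gi in GRADS for gj in GRADS]


def geometry_tensor(tri):
    """Flattened 2 x 2 geometry tensor for one triangle."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    j00, j01 = x1 - x0, x2 - x0
    j10, j11 = y1 - y0, y2 - y0
    det = j00 * j11 - j01 * j10
    # Inverse Jacobian dX/dx of the affine map.
    inv = [[j11 / det, -j01 / det], [-j10 / det, j00 / det]]
    return [abs(det) * sum(inv[a][c] * inv[b][c] for c in range(2))
            for a in range(2) for b in range(2)]


def matmul(A, B):
    """Plain triple loop; stands in for a Level 3 BLAS GEMM call."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def batch_element_matrices(triangles):
    """Compute all element stiffness matrices with one matrix product."""
    A0 = reference_tensor()                      # 9 x 4, computed once
    G = [geometry_tensor(t) for t in triangles]  # n_elem x 4
    Gt = [list(col) for col in zip(*G)]          # 4 x n_elem
    flat = matmul(A0, Gt)                        # 9 x n_elem
    # Column e of the product holds the 3 x 3 matrix of element e.
    return [[[flat[3 * i + j][e] for j in range(3)] for i in range(3)]
            for e in range(len(triangles))]
```

The space cost is the n_elem x 4 array of geometry tensors plus the
9 x n_elem result, both held in memory at once.  That is the space/time
tradeoff: the larger the batch, the more memory it takes and the more
work goes into a single matrix-matrix product.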

Part of our system should use some kind of heuristic to test what the
right balance between space and time is and generate code that will do
elements in batch.  This is an optimization, but an important one to
play with.

Rob

Robert Kirby
Assistant Professor
Department of Computer Science
The University of Chicago
http://people.cs.uchicago.edu/~kirby

On Tue, 22 Mar 2005, Anders Logg wrote:

> I'm working on some benchmarks comparing FFC with the standard
> quadrature approach and the results look pretty good. The typical
> speedup is a factor 10-100.
>
> I've run tests for Lagrange elements with q = 1,2,3 for a simple mass
> matrix, Poisson, the nonlinear term of Navier-Stokes and the
> strain-strain term of linear elasticity. Higher order is on its way,
> but it takes a long time for FIAT to evaluate the basis functions... :-).
>
> See the attached file for some preliminary results. The times
> reported are for computing the element matrix (local stiffness matrix)
> 10,000 times.
>
> Note that this is without any FErari optimizations. (On the other
> hand, the quadrature-based code can probably also be optimized.)
>
> /Anders
>
