dolfin team mailing list archive

Re: ufc ordering in parallel

To: dolfin-dev@xxxxxxxxxx
From: Jed Brown <jed@xxxxxxxx>
Date: Wed, 29 Apr 2009 18:38:43 +0200
Delivered-to: dolfin-dev@xxxxxxxxxx
In-reply-to: <a9f269830904290531x66e9b7f9wa1e57ecb60baa2cf@mail.gmail.com>
Sender: Jed Brown <five9a2@xxxxxxxxx>
User-agent: Thunderbird 2.0.0.21 (X11/20090319)

Matthew Knepley wrote:

> Yep, it is geometric, not topological.

Orientations are topological, but maybe that particular function isn't
used.

> None of which is necessary for this crap.

We have different requirements here.  You are satisfied for
restrictClosure to return the dofs in basically any order, the idea
being that the application doesn't care as long as the basis functions
behave correctly.  This pushes some complexity of different element
topologies into a "geometric" issue, but doesn't completely circumvent
it.

I actually use tensor product structure (for element topologies that
have it) to evaluate basis functions efficiently and to assemble much
sparser spectrally equivalent Jacobians (huge performance and memory
benefits compared to quadratic elements).  So I would need
restrictClosure to produce dofs in a different ordering.  In principle
this could be done with another level of indirection which is
specialized on topology, but restrictClosure does a lot of work already,
and this code executes on every element during function evaluation,
matrix-free Jacobian application, and assembly.
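
To illustrate why the dof ordering matters for the tensor product
evaluation described above, here is a minimal sketch in plain Python.
It is not code from any of the libraries discussed; the function names,
the sample points, and the coefficient values are all hypothetical. It
assumes the element dofs arrive in lexicographic order, so they can be
viewed as a p-by-p array u[a][b] on a quadrilateral, and compares the
naive double sum with the same evaluation applied one direction at a
time.

```python
def matmul(A, B):
    """Dense product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def eval_naive(B, u):
    """value[i][j] = sum over a, b of B[i][a] * B[j][b] * u[a][b].

    Costs O(q^2 p^2) for q points and p basis functions per direction.
    """
    q, p = len(B), len(B[0])
    return [[sum(B[i][a] * B[j][b] * u[a][b]
                 for a in range(p) for b in range(p))
             for j in range(q)]
            for i in range(q)]


def eval_sum_factorized(B, u):
    """Same values, computed as B * u * B^T, one direction at a time.

    Costs O(q p^2 + q^2 p), but only works if the dofs are ordered so
    that they can be indexed as u[a][b].
    """
    return matmul(matmul(B, u), transpose(B))


# Hypothetical data: 1D linear Lagrange basis on [0, 1] tabulated at
# two sample points, and four nodal values in lexicographic order.
points = [0.25, 0.75]
B = [[1.0 - x, x] for x in points]
u = [[1.0, 2.0],
     [3.0, 5.0]]

naive = eval_naive(B, u)
fast = eval_sum_factorized(B, u)
```

Both routines return the same 2-by-2 table of values. If the closure
returned the four dofs in some other order, the second routine would
first need a permutation, which is the extra level of indirection
mentioned above.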

There is a fair amount of complexity in orientedClosure; it's not
completely clear to me what is happening for large values of 'o'.  Is
the "orientation" arrow section basically just an encoding for the
permutation of vertices (negative means the object is flipped, the
magnitude is the index of the vertex in position 0)?  If not, how do you
create it?  If so, how would you encode the orientation of a 3D element?
How are the dofs ordered for a high-order n-gon?
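
To make the reading proposed in this question concrete, here is a small
sketch of one hypothetical encoding for the vertices of a polygon. It
is not taken from the Sieve source. The offset -(o + 1) for negative
values is an assumption added here so that "flipped, starting at vertex
0" can be told apart from "not flipped, starting at vertex 0"; the
function names are likewise made up for illustration.

```python
def apply_orientation(vertices, o):
    """Return the vertices as seen under orientation o.

    o >= 0: start at vertex o and walk forward.
    o <  0: start at vertex -(o + 1) and walk backward (flipped).
    """
    n = len(vertices)
    if o >= 0:
        return [vertices[(o + k) % n] for k in range(n)]
    start = -(o + 1)
    return [vertices[(start - k) % n] for k in range(n)]


def encode_orientation(reference, observed):
    """Find an o in [-n, n) that maps reference onto observed."""
    n = len(reference)
    for o in range(-n, n):
        if apply_orientation(reference, o) == observed:
            return o
    raise ValueError("observed is not a rotation or flip of reference")


quad = [10, 11, 12, 13]
rotated = apply_orientation(quad, 1)
flipped = apply_orientation(quad, -1)
recovered = encode_orientation(quad, [12, 11, 10, 13])
```

For a polygon with n >= 3 vertices the 2n values of o enumerate the
rotations and reflections exactly once. The sketch covers only a single
face; it does not answer how a 3D element would be encoded.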

The way I deal with this part of the FEM mechanics is to use a MAIJ
matrix in place of restrictClosure.  When setting up this matrix, I do
switch on topology because there is a special ordering that I want for
each supported topology, otherwise it could be avoided in the same way
Sieve does it.  If the matrix were actually expensive, which it is not
unless you have a terribly non-conforming mesh and very high order, I
could replace it with a matrix-free version.  The cost of this is
significantly more code to execute on each element, but the design would
end up being very similar to Sieve.  I'm curious if you find L1I misses
in function evaluation to be at all limiting.  I see quite high
throughput with almost no time spent outside of the essential kernels.
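
As a rough illustration of using a matrix in place of a closure query,
the sketch below stores one sparse row per element-local node and
applies the same scalar row to every component of an interlaced vector,
which is the idea behind a MAIJ matrix. It is plain Python rather than
PETSc, and the row layout, the node numbers, and the function name are
assumptions made for the example. The local ordering is whatever order
the rows were built in, so a topology-specific ordering is fixed once
at setup time.

```python
def apply_restriction(rows, u_global, bs=1):
    """Compute element-local values from global values.

    rows[i] is a list of (global_node, weight) pairs for local node i.
    u_global is interlaced by component with block size bs.
    """
    u_local = [0.0] * (len(rows) * bs)
    for i, row in enumerate(rows):
        for node, weight in row:
            for c in range(bs):
                u_local[i * bs + c] += weight * u_global[node * bs + c]
    return u_local


# Hypothetical element with four local nodes.  The first three map
# directly to global nodes; the last is a constrained (hanging) node
# that averages global nodes 2 and 5.
rows = [
    [(0, 1.0)],
    [(1, 1.0)],
    [(4, 1.0)],
    [(2, 0.5), (5, 0.5)],
]

# Six global nodes with two components each, interlaced.
u_global = [float(k) for k in range(12)]
u_local = apply_restriction(rows, u_global, bs=2)
```

A conforming mesh gives one unit entry per row, so the matrix stays
cheap; rows with several entries appear only where dofs are linear
combinations of primal dofs.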

> The real way to solve this is to construct the intermediates on the
> fly, but it takes some real thinking to make this efficient, and I am
> only one person.

I'm skeptical of building them on the fly for a nonconforming mesh (even
just p-nonconforming).  The issue is that these dofs may be a linear
combination of primal dofs and inferring the number of primal dofs
either requires full ghosting or communication.  Even then, the query
for a particular edge is really expensive if you don't have upward
adjacencies.

Anders Logg wrote:
> I care. It sure has a certain entertainment value. ;-)

Entertaining, yes.  Perhaps less than the petsc-dev compiler rant. ;-)

Jed

