
dolfin team mailing list archive

Re: profiling an assembly

 

On Mon, May 19, 2008 at 09:31:16AM +0200, Martin Sandve Alnæs wrote:
> 2008/5/19 Jed Brown <jed@xxxxxxxx>:
> > On Mon 2008-05-19 08:41, Johan Hoffman wrote:
> >> > 2008/5/19 Johan Hoffman <jhoffman@xxxxxxxxxx>:
> >> >>> On Sun 2008-05-18 22:55, Johan Hoffman wrote:
> >> >>>> > On Sun 2008-05-18 21:54, Johan Hoffman wrote:
> >> >>>> >> > On Sat, May 17, 2008 at 04:40:48PM +0200, Johan Hoffman wrote:
> >> >>>> >> >
> >> >>>> >> > 1. Solve time may dominate assembly anyway, so that's where we
> >> >>>> >> > should optimize.
> >> >>>> >>
> >> >>>> >> Yes, there may be such cases, in particular for simple forms (Laplace
> >> >>>> >> equation etc.). For more complex forms with more terms and coefficients,
> >> >>>> >> assembly typically dominates, from what I have seen. This is the case for
> >> >>>> >> the flow problems of Murtazo, for example.
> >> >>>> >
> >> >>>> > This probably depends on whether you are using a projection method.  If
> >> >>>> > you are solving the saddle point problem, you can forget about assembly
> >> >>>> > time.
> >> >>>>
> >> >>>> Well, this is not what we see. I agree that this is what you would like,
> >> >>>> but this is not the case now. That is why we are now focusing on the
> >> >>>> assembly bottleneck.
> >> >>>
> >> >>> This just occurred to me.  If you have a Newtonian fluid, then the momentum
> >> >>> equations are block diagonal, but this is not reflected in the matrix
> >> >>> structure.  Sure enough, run the stokes demo with
> >> >>> -mat_view_draw -draw_pause -1 and note that the off-diagonal blocks of the
> >> >>> momentum equations are cyan, which means they are set but have value zero.
> >> >>> This almost doubles the number of insertions into the global matrix.
> >> >>
> >> >> Good. You are right, this piece of information is not used.
> >> >>
> >> >> I guess the most general thing is to have ffc delete zero matrix entries
> >> >> in computing the sparsity pattern. I do not think this is done today?
> >> >
> >> > We could add it with an appropriate
> >> > ufc::dof_map::tabulate_sparsity_pattern(...),
> >> > since the form compiler can figure out which entries are always zero.
> >> > Currently, this information is hidden from dolfin, and therefore it must
> >> > simply use the local blocks.
> >>
> >> Yes, this would be very useful to add.
> >>
> >> > But A.add(...) calls a single block-addition function in PETSc
> >> > or Trilinos. Does anyone know how these will perform if the values
> >> > contain zeros that are outside the initialized sparsity pattern?
> >>
> >> Worst case scenario is that a reallocation is triggered, but maybe it is
> >> dealt with in a less drastic way?
> >
> > It is important to preallocate for all the entries you will be inserting.  If
> > the option MAT_IGNORE_ZERO_ENTRIES is set then zero entries will not be
> > inserted.  I don't think you want to rely on this as a general mechanism for
> > eliminating non-existent coupling since it would also eliminate entries that
> > just happened to be zero for the current Jacobian.
> >
> > Note that MatSetValues() needs to be called with a ``full'' block so if we
> > eliminate the coupling zeros, we must make separate calls for the rows of each
> > vector component.  However, this should not make much performance difference
> 
> Ok, this was what I wanted to know. And if the ongoing work with assembling
> semi-local matrices over patches of the mesh and inserting one row at a time
> goes well, then we'll use single row insertion anyway. Epetra_FECrsMatrix
> (Trilinos) has a function to add a single row which is documented to ignore
> values that aren't part of the sparsity pattern (Epetra_FECrsGraph).
> 
> Regarding adding sparsity pattern information to a ufc::form, it seems to me
> it should be part of the *integral classes, one tabulate_sparsity_pattern to
> match each tabulate_tensor, plus a num_nonzeros() function.
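
For reference, Jed's point above (separate insertion calls for the rows of
each vector component, leaving out the columns that never couple) might look
roughly like the sketch below. It is plain Python rather than PETSc or DOLFIN
code. The function split_by_row_component, the coupling table and the dof
numbers are all hypothetical, and each returned tuple only stands in for one
MatSetValues() call with a full dense block.

```python
# Hypothetical sketch: none of these names exist in PETSc, DOLFIN or UFC.

def split_by_row_component(local_matrix, global_dofs, coupling, dofs_per_component):
    """Split one element matrix into one dense block per row component.

    coupling[i][j] is True if component i can couple to component j.
    Returns a list of (global_rows, global_cols, block) tuples, each of
    which could be passed to a single block-insertion call.
    """
    # offsets[i] is the first local dof index belonging to component i
    offsets = [0]
    for n in dofs_per_component:
        offsets.append(offsets[-1] + n)

    calls = []
    for i, row in enumerate(coupling):
        rows = list(range(offsets[i], offsets[i + 1]))
        # Keep only the local columns of components that couple to component i
        cols = [c
                for j, coupled in enumerate(row) if coupled
                for c in range(offsets[j], offsets[j + 1])]
        if not cols:
            continue
        block = [[local_matrix[r][c] for c in cols] for r in rows]
        calls.append(([global_dofs[r] for r in rows],
                      [global_dofs[c] for c in cols],
                      block))
    return calls


# Made-up element with components (u_x, u_y, p) and 2, 2, 1 local dofs.
coupling = [
    [True,  False, True],   # u_x rows couple to u_x and p
    [False, True,  True],   # u_y rows couple to u_y and p
    [True,  True,  False],  # p rows couple to u_x and u_y
]
dofs_per_component = [2, 2, 1]
global_dofs = [10, 11, 20, 21, 30]
# Entry (r, c) is 10*r + c so that each value identifies its position
local_matrix = [[10 * r + c for c in range(5)] for r in range(5)]

calls = split_by_row_component(local_matrix, global_dofs, coupling, dofs_per_component)
inserted = sum(len(block) * len(block[0]) for _, _, block in calls)
```

With this made-up element, three calls insert 16 values in total, compared
with 25 for the full 5x5 block.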

I don't understand what tabulate_sparsity_pattern is supposed to do.
Should it tabulate the full matrix sparsity pattern, or is it some
kind of "equation sparsity pattern" tabulating which components of a
system couple to which other components?
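
To make the two readings concrete, here is a minimal sketch in plain Python.
It is not a proposal for the actual UFC interface. The function
expand_coupling and the table stokes_coupling are hypothetical, and the
element is assumed to be Taylor-Hood (P2-P1) on triangles with no
pressure-pressure term. The small table is the "equation sparsity pattern";
expanding it over the local dofs gives the entry-by-entry pattern of one
element tensor.

```python
# Hypothetical sketch: none of these names exist in UFC or DOLFIN.

def expand_coupling(coupling, dofs_per_component):
    """Expand a component coupling table into a local (element) sparsity mask."""
    # offsets[i] is the first local dof index belonging to component i
    offsets = [0]
    for n in dofs_per_component:
        offsets.append(offsets[-1] + n)
    size = offsets[-1]

    mask = [[False] * size for _ in range(size)]
    for i, row in enumerate(coupling):
        for j, coupled in enumerate(row):
            if coupled:
                # Mark the whole (component i, component j) sub-block
                for r in range(offsets[i], offsets[i + 1]):
                    for c in range(offsets[j], offsets[j + 1]):
                        mask[r][c] = True
    return mask


# "Equation sparsity pattern" for 2D Stokes with components (u_x, u_y, p),
# assuming the viscous term does not couple u_x and u_y.
stokes_coupling = [
    [True,  False, True],
    [False, True,  True],
    [True,  True,  False],
]
# Assumed Taylor-Hood on triangles: 6 dofs per velocity component, 3 for p
dofs_per_component = [6, 6, 3]

mask = expand_coupling(stokes_coupling, dofs_per_component)
full_entries = len(mask) ** 2
nonzero_entries = sum(sum(row) for row in mask)
# Restrict to the 12x12 velocity-velocity (momentum) block
momentum_nonzero = sum(sum(row[:12]) for row in mask[:12])
```

Under these assumptions the element tensor has 225 entries, of which 144 can
be nonzero. In the 12x12 momentum block only 72 of 144 can be nonzero, which
matches the observation quoted above that the zero off-diagonal blocks almost
double the insertions.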

-- 
Anders

