← Back to team overview

dolfin team mailing list archive

Re: profiling an assembly

 

On Mon 2008-05-19 08:41, Johan Hoffman wrote:
> > 2008/5/19 Johan Hoffman <jhoffman@xxxxxxxxxx>:
> >>> On Sun 2008-05-18 22:55, Johan Hoffman wrote:
> >>>> > On Sun 2008-05-18 21:54, Johan Hoffman wrote:
> >>>> >> > On Sat, May 17, 2008 at 04:40:48PM +0200, Johan Hoffman wrote:
> >>>> >> >
> >>>> >> > 1. Solve time may dominate assemble anyway so that's where we
> >>>> should
> >>>> >> > optimize.
> >>>> >>
> >>>> >> Yes, there may be such cases, in particular for simple forms
> >>>> (Laplace
> >>>> >> equation etc.). For more complex forms with more terms and
> >>>> coefficients,
> >>>> >> assembly typically dominates, from what I have seen. This is the
> >>>> case
> >>>> >> for
> >>>> >> the flow problems of Murtazo for example.
> >>>> >
> >>>> > This probably depends if you use are using a projection method.  If
> >>>> you
> >>>> > are
> >>>> > solving the saddle point problem, you can forget about assembly
> >>>> time.
> >>>>
> >>>> Well, this is not what we see. I agree that this is what you would
> >>>> like,
> >>>> but this is not the case now. That is why we are now focusing on the
> >>>> assembly bottleneck.
> >>>
> >>> This just occurred to me.  If you have a Newtonian fluid, then the
> >>> momentum
> >>> equations are block diagonal, but this is not reflected in the matrix
> >>> structure.
> >>> Sure enough, run the stokes demo with -mat_view_draw -draw_pause -1 and
> >>> note
> >>> that the off-diagonal blocks of the momentum equations are cyan which
> >>> means they
> >>> are set, but have value zero.  This almost doubles the number of
> >>> insertions into
> >>> the global matrix.
> >>
> >> Good. You are right, this piece of information is not used.
> >>
> >> I guess the most general thing is to have ffc delete zero matrix entries
> >> in computing the sparsity pattern. I do not think this is done today?
> >
> > We could add it with an appropriate
> > ufc::dof_map::tabulate_sparsity_pattern(...),
> > since the form compiler can figure out which entries are always zero.
> > Currently, this information is hidden from dolfin, and therefore it must
> > simply use the local blocks.
> 
> Yes, this would be very useful to add.
> 
> > But A.add(...) calls a single block-addition function in PETSc
> > or Trilinos, does anyone know how these will perform if the values
> > contain zeros that are outside the initialized sparsity pattern?
> 
> Worst case scenario is that a reallocation is triggered, but maybe it is
> dealt with in a less drastic way?

It is important to preallocate for all the entries you will be inserting.  If
the option MAT_IGNORE_ZERO_ENTRIES is set then zero entries will not be
inserted.  I don't think you want to rely on this as a general mechanism for
eliminating non-existant coupling since it would also eliminate entries that
just happened to be missing for the current Jacobian.

Note that MatSetValues() needs to be called with a ``full'' block so if we
eliminate the coupling zeros, we must make separate calls for the rows of each
vector component.  However, this should not make much performance difference
since the only overhead from splitting the rows is the function call.  Of
course, if we use a hashed cache then this is definitely a non-issue.

> >>  Of course, if you really care about speed, you could
> >>> implement the momentum equations matrix-free with a custom
> >>> preconditioner
> >>> that
> >>> only uses one block (since they are the same).
> >>>
> >>> Also, it seems like a strange design decision for general coupled
> >>> systems
> >>> for
> >>> the blocks to be separate like this.  (Actually, ordering the unknowns
> >>> this way
> >>> makes some factorizations ``work'' even when the matrix is indefinite.)
> >>
> >>  Ok. We should look in to this.
> >
> > The reasoning behind this decision was that it makes access to subvectors
> > for components of general hierarchial mixed elements easy. Vector elements
> > and mixed elements are treated as basically the same thing.
> >
> > Thus, renumbering should preferably happen in dolfin, which will have to
> > be
> > done for many occasions anyway.
> 
> Yes, it seems natural to have the option to renumber freely in dolfin, in
> particular to switch between "vertex-oriented" and "vector-component
> oriented" numbering.

I think this would be good.  Note that the non-square coupling matrices between
mixed elements never need to be formed (for most saddle point preconditioners)
so it would be natural to write them in separate form files.


> Iterating once is similar to a projection method, but we iterate until
> convergence in the outer non linear residual. Of course, after some time
> the number of iterations is typically 1, so the method collapses into a
> projection type method.

Okay, cool.  Thanks for clarifying.  How fast is the convergence the first time
for the nonlinear problem?  My (limited) experience and impression from the
literature is that you only see quadratic convergence from the Newton method if
you solve the linear saddle point problem instead of replacing it with the
projection preconditioner.  Of course, if it only takes one iteration after some
time, then $\Delta t \eta$ must be small in which case we expect the projection
method to be good (this is not the case for the slow non-Newtonian fluids I work
with).

Jed

Attachment: pgpwZ0FsGJDw7.pgp
Description: PGP signature


Follow ups

References