
dolfin team mailing list archive

Re: profiling an assembly

 

> 2008/5/19 Johan Hoffman <jhoffman@xxxxxxxxxx>:
>>> On Sun 2008-05-18 22:55, Johan Hoffman wrote:
>>>> > On Sun 2008-05-18 21:54, Johan Hoffman wrote:
>>>> >> > On Sat, May 17, 2008 at 04:40:48PM +0200, Johan Hoffman wrote:
>>>> >> >
>>>> >> > 1. Solve time may dominate assembly anyway, so that's where we
>>>> >> > should optimize.
>>>> >>
>>>> >> Yes, there may be such cases, in particular for simple forms (Laplace
>>>> >> equation etc.). For more complex forms with more terms and
>>>> >> coefficients, assembly typically dominates, from what I have seen.
>>>> >> This is the case for the flow problems of Murtazo for example.
>>>> >
>>>> > This probably depends on whether you are using a projection method.
>>>> > If you are solving the saddle point problem, you can forget about
>>>> > assembly time.
>>>>
>>>> Well, this is not what we see. I agree that this is what you would like,
>>>> but this is not the case now. That is why we are now focusing on the
>>>> assembly bottleneck.
>>>
>>> This just occurred to me.  If you have a Newtonian fluid, then the
>>> momentum equations are block diagonal, but this is not reflected in the
>>> matrix structure.  Sure enough, run the stokes demo with
>>> -mat_view_draw -draw_pause -1 and note that the off-diagonal blocks of
>>> the momentum equations are cyan, which means they are set but have value
>>> zero.  This almost doubles the number of insertions into the global
>>> matrix.
>>
>> Good. You are right, this piece of information is not used.
>>
>> I guess the most general thing is to have ffc delete zero matrix entries
>> in computing the sparsity pattern. I do not think this is done today?
>
> We could add it with an appropriate
> ufc::dof_map::tabulate_sparsity_pattern(...),
> since the form compiler can figure out which entries are always zero.
> Currently, this information is hidden from dolfin, and therefore it must
> simply use the local blocks.

Yes, this would be very useful to add.
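
To put a rough number on the saving, here is a minimal sketch in plain
Python (not dolfin or ufc code). It assumes a 2D vector element with 6
scalar dofs per component on each cell, local dofs numbered component by
component, and no coupling between different velocity components. The
function name and its parameters are made up for illustration.

```python
def local_pattern(n_scalar, dim, prune_zero_blocks):
    """Return the set of (row, col) pairs inserted for one cell.

    Local dofs are numbered component by component:
    dof = component * n_scalar + i.
    """
    entries = set()
    for ci in range(dim):
        for cj in range(dim):
            if prune_zero_blocks and ci != cj:
                # Assumed: different components never couple, so this
                # sub-block is identically zero and can be skipped.
                continue
            for i in range(n_scalar):
                for j in range(n_scalar):
                    entries.add((ci * n_scalar + i, cj * n_scalar + j))
    return entries


full = local_pattern(n_scalar=6, dim=2, prune_zero_blocks=False)
pruned = local_pattern(n_scalar=6, dim=2, prune_zero_blocks=True)
print(len(full), len(pruned))  # 144 72
```

This counts only the velocity-velocity block. The pressure coupling is
unaffected, which is consistent with the insertions "almost" doubling
rather than doubling exactly.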

> But A.add(...) calls a single block-addition function in PETSc
> or Trilinos. Does anyone know how these will perform if the values
> contain zeros that are outside the initialized sparsity pattern?

Worst case scenario is that a reallocation is triggered, but maybe it is
dealt with in a less drastic way?

>
>>> Of course, if you really care about speed, you could implement the
>>> momentum equations matrix-free with a custom preconditioner that only
>>> uses one block (since they are the same).
>>>
>>> Also, it seems like a strange design decision for general coupled
>>> systems for the blocks to be separate like this.  (Actually, ordering
>>> the unknowns this way makes some factorizations "work" even when the
>>> matrix is indefinite.)
>>
>> OK. We should look into this.
>
> The reasoning behind this decision was that it makes access to subvectors
> for components of general hierarchical mixed elements easy. Vector elements
> and mixed elements are treated as basically the same thing.
>
> Thus, renumbering should preferably happen in dolfin, which will have to
> be done on many occasions anyway.

Yes, it seems natural to have the option to renumber freely in dolfin, in
particular to switch between "vertex-oriented" and "vector-component
oriented" numbering.
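
As a small illustration of what such a switch amounts to, here is a sketch
in plain Python (not the dolfin API). It assumes one dof per vertex for
each vector component, with the component-oriented index c*N + v and the
vertex-oriented index v*d + c. The function names are made up for
illustration.

```python
def component_to_vertex_permutation(num_vertices, dim):
    """Map component-oriented dof indices to vertex-oriented ones.

    Component-oriented: index = c * num_vertices + v
    Vertex-oriented:    index = v * dim + c
    """
    perm = [0] * (num_vertices * dim)
    for c in range(dim):
        for v in range(num_vertices):
            perm[c * num_vertices + v] = v * dim + c
    return perm


def renumber(values, perm):
    """Move values[old] to position perm[old]."""
    out = [None] * len(values)
    for old, new in enumerate(perm):
        out[new] = values[old]
    return out


# Three vertices, two components (u, v), stored component by component.
x_component = ["u0", "u1", "u2", "v0", "v1", "v2"]
perm = component_to_vertex_permutation(num_vertices=3, dim=2)
x_vertex = renumber(x_component, perm)
print(x_vertex)  # ['u0', 'v0', 'u1', 'v1', 'u2', 'v2']
```

The same permutation applied to both rows and columns of the matrix would
give the vertex-oriented ordering there as well.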

/Johan

>
> --
> Martin
>



