
dolfin team mailing list archive

Re: MatSetValues profiling

 

On Sat 2008-05-17 22:11, Murtazo Nazarov wrote:
> > On Sat 2008-05-17 21:19, Murtazo Nazarov wrote:
> >> This is from the 3D Navier-Stokes equations with linear elements. The
> >> first assembly takes longer because it initializes the matrix and
> >> computes the sparsity pattern, but after that the sparsity pattern and
> >> A.init() are not done again. Maybe I cannot gain that much, but as I
> >> posted earlier, I ran exactly the same test with another package, with
> >> the same mesh, the same elements and the same equations, and it was at
> >> least 3 times faster than the assembly we have in dolfin. So instead of
> >> 7 seconds the assembly would take 7/3 seconds, which would save about
> >> 10 days from my simulation, which now takes 14 days.
> >
> > I don't understand.  Don't you solve a system with these matrices?  Your
> > numbers indicate that the solve takes negative time.
> 
> What do you mean? In the test I sent you I do not solve the system; it was
> just testing the assembly process.

Presumably your 14-day run isn't just assembling matrices and throwing them
away.  What fraction of the real runtime is spent in assembly?

> > If you want to speed up the insertion of element matrices, I think the
> > correct way is to do clever caching so that, for instance, an entire
> > row contribution can be inserted at once.  Anders mentioned this earlier
> > in this thread.  My understanding is that computing the element matrices
> > is supposed to be very fast in Dolfin since it uses FFC-optimized code.
> > Can you profile and see exactly where the time is being spent?  Is it in
> > FFC-generated code or in insertion?
> >
> 
> I think that FFC is pretty fast and there is no problem with it so far.
> The element matrices are computed quickly. I did the profiling and posted
> it previously; here it is again:
> 
> Dolfin::Assembler::assembleCells:
> 
> 1. tabulate_tensor for the bilinear form of momentum in NSE:  6.04%
>    tabulate_tensor for the linear form of momentum in NSE:   11.98%
> 
> 2. Dolfin::GenericMatrix::add: 68.98%
> 
> 3. Dolfin::Function::interpolate: 9.05%
> 
> Most of the time is spent in add, which calls MatSetValues.

Right, it is in insertion.  I guess we are back to better caching.  One
enhancement would be to order the element matrices so that the column indices
are increasing.  MatSetValues_SeqAIJ just inserts entries directly into the
system matrix, and if the columns are out of order it resets the lower bound
of its binary search.  To get the fastest insertion, Dolfin could cache
several nearby elements so as to insert entire rows at once with the columns
sorted.  A first step would be to just sort the columns in the element
matrices, as sketched below.
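
Roughly what that first step could look like, sketched against the plain
PETSc C interface rather than Dolfin's GenericMatrix wrapper (the helper
name add_sorted and its signature are made up for illustration; this is not
existing Dolfin or PETSc code):

// Rough sketch only: permute each element matrix so its column indices are
// increasing before handing it to MatSetValues, so that MatSetValues_SeqAIJ
// can keep the lower bound of its search while walking along a row.
#include <petscmat.h>
#include <algorithm>
#include <numeric>
#include <vector>

// Add an m x n element matrix (row-major in vals) into A with its columns
// sorted; rows/cols hold the global dof indices of the element
// (hypothetical helper, not part of Dolfin).
PetscErrorCode add_sorted(Mat A, PetscInt m, const PetscInt rows[],
                          PetscInt n, const PetscInt cols[],
                          const PetscScalar vals[])
{
  // Permutation that puts the column indices in increasing order.
  std::vector<PetscInt> perm(n);
  std::iota(perm.begin(), perm.end(), 0);
  std::sort(perm.begin(), perm.end(),
            [cols](PetscInt a, PetscInt b) { return cols[a] < cols[b]; });

  // Apply the permutation to the column indices and to every row of values.
  std::vector<PetscInt> scols(n);
  std::vector<PetscScalar> svals(m * n);
  for (PetscInt j = 0; j < n; j++)
  {
    scols[j] = cols[perm[j]];
    for (PetscInt i = 0; i < m; i++)
      svals[i*n + j] = vals[i*n + perm[j]];
  }

  return MatSetValues(A, m, rows, n, &scols[0], &svals[0], ADD_VALUES);
}

The sort costs O(n log n) per element for n local dofs, which should be
negligible next to the searching done inside MatSetValues_SeqAIJ; the bigger
win would still come from caching neighboring elements and inserting whole
sorted rows at once.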

Jed
