dolfin team mailing list archive
Message #07926
Re: MatSetValues profiling
On Sat 2008-05-17 22:11, Murtazo Nazarov wrote:
> > On Sat 2008-05-17 21:19, Murtazo Nazarov wrote:
> >> This is from the 3D Navier-Stokes equations with linear elements. The
> >> first assembly takes longer because it initializes the matrix and
> >> computes the sparsity pattern; after that, the sparsity pattern and
> >> A.init() are not recomputed. Maybe I cannot gain so much, but as I
> >> posted earlier, I ran exactly the same test with another package, on
> >> the same mesh, with the same elements and the same equations, and its
> >> assembly was at least 3 times faster than the assembly we have in
> >> dolfin. So instead of 7 seconds the assembly would take about 7/3
> >> seconds, which would save about 10 days of my simulation, which
> >> currently takes 14 days.
> >
> > I don't understand. Don't you solve a system with these matrices? Your
> > numbers indicate that the solve takes negative time.
>
> What do you mean? In the test I sent you I do not solve the system; I was
> just testing the assembly process.
Presumably your 14-day run isn't just assembling matrices and throwing them
away. What fraction of the total runtime is spent in assembly?
> > If you want to speed up the insertion of element matrices, I think the
> > correct way is to do clever caching so that, for instance, an entire row
> > contribution can be inserted at once. Anders mentioned this earlier in
> > this thread. My understanding is that computing the element matrices is
> > supposed to be very fast in Dolfin since it uses FFC-optimized code. Can
> > you profile and see exactly where the time is being spent? Is it in
> > FFC-generated code or in insertion?
> >
>
> I think that FFC is pretty fast and there is no problem with it so far.
> The element matrices are computed quickly. I did the profiling and posted
> it previously; here it is again:
>
> Dolfin::Assembler::assembleCells:
>
> 1. tabulate_tensor for the bilinear form of Momentum in NSE: 6.04%
>    tabulate_tensor for the linear form of Momentum in NSE: 11.98%
>
> 2. Dolfin::GenericMatrix::add: 68.98%
>
> 3. Dolfin::Function::interpolate: 9.05%
>
> Most of the time is spent in add, which calls MatSetValues.
Right, it is in insertion. I guess we are back to better caching. One
enhancement would be to order the element matrices so that the column indices
are increasing. MatSetValues_SeqAIJ just inserts entries directly into the
system matrix; if the columns are out of order, it resets the lower bound of a
binary search. To get the fastest insertion, Dolfin could cache several nearby
elements so as to insert entire rows at once with the columns sorted. A first
step would be to just sort the columns in the element matrices.
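
For illustration, here is a minimal sketch of inserting a single element
matrix with its columns pre-sorted. It is not DOLFIN's actual assembly code;
the function name add_element_matrix_sorted, the PETSc Mat argument, and the
row-major layout of the dense element matrix Ae are assumptions made for the
example.

  #include <petscmat.h>
  #include <algorithm>
  #include <numeric>
  #include <vector>

  // Insert one element matrix into A with ADD_VALUES, permuting its columns
  // so the global column indices are increasing.
  PetscErrorCode add_element_matrix_sorted(Mat A,
                                           const std::vector<PetscInt>& dofs,
                                           const std::vector<PetscScalar>& Ae)
  {
    const PetscInt n = static_cast<PetscInt>(dofs.size());

    // Permutation that sorts the cell's global dof (column) indices.
    std::vector<PetscInt> perm(n);
    std::iota(perm.begin(), perm.end(), 0);
    std::sort(perm.begin(), perm.end(),
              [&](PetscInt a, PetscInt b) { return dofs[a] < dofs[b]; });

    // Reorder the columns of the dense, row-major element matrix to match.
    std::vector<PetscInt> cols(n);
    std::vector<PetscScalar> vals(n * n);
    for (PetscInt j = 0; j < n; ++j)
      cols[j] = dofs[perm[j]];
    for (PetscInt i = 0; i < n; ++i)
      for (PetscInt j = 0; j < n; ++j)
        vals[i*n + j] = Ae[i*n + perm[j]];

    // With increasing column indices the AIJ insertion kernel can advance
    // its search position monotonically within each row instead of
    // restarting it.
    PetscErrorCode ierr = MatSetValues(A, n, dofs.data(), n, cols.data(),
                                       vals.data(), ADD_VALUES);
    CHKERRQ(ierr);
    return 0;
  }

Caching a batch of neighbouring cells and merging their row contributions
before a single MatSetValues call, as suggested above, would go one step
further; this sketch only shows the "sort the columns" first step.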
Jed
References
- Re: profiling an assembly, from Garth N. Wells, 2008-05-16
- Re: profiling an assembly, from Dag Lindbo, 2008-05-17
- Re: profiling an assembly, from Murtazo Nazarov, 2008-05-17
- Re: profiling an assembly, from Johan Hoffman, 2008-05-17
- Re: profiling an assembly, from Jed Brown, 2008-05-17
- MatSetValues profiling, from Murtazo Nazarov, 2008-05-17
- Re: MatSetValues profiling, from Jed Brown, 2008-05-17
- Re: MatSetValues profiling, from Murtazo Nazarov, 2008-05-17
- Re: MatSetValues profiling, from Jed Brown, 2008-05-17
- Re: MatSetValues profiling, from Murtazo Nazarov, 2008-05-17