dolfin team mailing list archive
Message #08807
Re: Assembly benchmark
On Mon, Jul 21, 2008 at 4:48 PM, Anders Logg <logg@xxxxxxxxx> wrote:
> On Mon, Jul 21, 2008 at 04:37:28PM -0500, Matthew Knepley wrote:
>> On Mon, Jul 21, 2008 at 4:35 PM, Anders Logg <logg@xxxxxxxxx> wrote:
>> > On Mon, Jul 21, 2008 at 04:03:11PM -0500, Matthew Knepley wrote:
>> >> On Mon, Jul 21, 2008 at 3:55 PM, Matthew Knepley <knepley@xxxxxxxxx> wrote:
>> >> > On Mon, Jul 21, 2008 at 3:50 PM, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
>> >> >>
>> >> >>
>> >> >> Anders Logg wrote:
>> >> >>> On Mon, Jul 21, 2008 at 01:48:23PM +0100, Garth N. Wells wrote:
>> >> >>>>
>> >> >>>> Anders Logg wrote:
>> >> >>>>> I have updated the assembly benchmark to include also MTL4, see
>> >> >>>>>
>> >> >>>>> bench/fem/assembly/
>> >> >>>>>
>> >> >>>>> Here are the current results:
>> >> >>>>>
>> >> >>>>> Assembly benchmark | Elasticity3D  PoissonP1  PoissonP2  PoissonP3  THStokes2D  NSEMomentum3D  StabStokes2D
>> >> >>>>> -----------------------------------------------------------------------------------------------------------
>> >> >>>>> uBLAS              | 9.0789        0.45645    3.8042     8.0736     14.937      9.2507         3.8455
>> >> >>>>> PETSc              | 7.7758        0.42798    3.5483     7.3898     13.945      8.1632         3.258
>> >> >>>>> Epetra             | 8.9516        0.45448    3.7976     8.0679     15.404      9.2341         3.8332
>> >> >>>>> MTL4               | 8.9729        0.45554    3.7966     8.0759     14.94       9.2568         3.8658
>> >> >>>>> Assembly           | 7.474         0.43673    3.7341     8.3793     14.633      7.6695         3.3878
>> >> >>>>>
>> >> >>
>> >> >>
>> >> >> I specified in MTL4Matrix maximum 30 nonzeroes per row, and the results
>> >> >> change quite a bit,
>> >> >>
>> >> >> Assembly benchmark | Elasticity3D  PoissonP1  PoissonP2  PoissonP3  THStokes2D  NSEMomentum3D  StabStokes2D
>> >> >> -----------------------------------------------------------------------------------------------------------
>> >> >> uBLAS              | 7.1881        0.32748    2.7633     5.8311     10.968      7.0735         2.8184
>> >> >> PETSc              | 5.7868        0.30673    2.5489     5.2344     9.8896      6.069          2.3661
>> >> >> MTL4               | 2.8641        0.18339    1.6628     2.6811     2.8519      3.4843         0.85029
>> >> >> Assembly           | 5.5564        0.30896    2.6858     5.9675     10.622      5.7144         2.4519
>> >> >>
>> >> >>
>> >> >> MTL4 is a lot faster in all cases.
>> >>
>> >> Okay, if you run KSP ex2 (Poisson 2D) and add a logging stage that
>> >> times assembly (I checked it in to petsc-dev)
>> >> then 1M unknowns takes about 1s
>> >>
>> >> Matrix Object:
>> >> type=seqaij, rows=1000000, cols=1000000
>> >> total: nonzeros=4996000, allocated nonzeros=5000000
>> >> not using I-node routines
>> >> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>> >>                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>> >>  0:      Main Stage: 1.4997e+00  56.3%  3.8891e+08 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  2.200e+01  51.2%
>> >>  1:        Assembly: 1.1648e+00  43.7%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>> >>
>> >> I just cut the solve off. Thus all those numbers are extremely fishy.
>> >>
>> >> Matt
>> >
>> > We shouldn't trust those numbers just yet. Some of it may be Python
>> > overhead (calling the FFC JIT compiler etc).
>> >
>> > Does 1M unknowns mean a unit square divided into 2x1000x1000 right
>> > triangles?
>>
>> It's FD Poisson, which gives the same sparsity and values as P1 Poisson, so
>> it's a 1000x1000 quadrilateral grid. This was just to time insertion.
>>
>> Matt
>
> But this is a different problem. Since you know the sparsity pattern a
> priori, you may be able to (i) not compute the sparsity pattern, (ii)

No, we only allocate correctly here.

> compute the entries more efficiently, (iii) not compute the
> local-to-global mapping, and (iv) insert the entries more efficiently.

Insertion is the same and we compute the same mapping we always use.
I think you guys overcompute for the l2g.

Matt
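
The nonzero counts in the PETSc matrix summary quoted above can be checked by hand. The following is a minimal sketch in plain Python, not code from ex2 or DOLFIN. It assumes every point of the 1000x1000 grid is an unknown and that boundary points simply drop their missing neighbours. The function names are made up for illustration.

```python
def nnz_five_point(n):
    """Closed-form nonzero count of the 5-point stencil on an n x n grid.

    Every row has a diagonal entry plus up to four neighbours; each of the
    four grid sides removes one neighbour from n rows.
    """
    return 5 * n * n - 4 * n


def nnz_five_point_bruteforce(n):
    """Same count by visiting every grid point (only sensible for small n)."""
    total = 0
    for i in range(n):
        for j in range(n):
            # Diagonal entry plus one entry per neighbour that exists.
            total += 1 + (i > 0) + (i < n - 1) + (j > 0) + (j < n - 1)
    return total


n = 1000
rows = n * n                 # 1000000 unknowns
nnz = nnz_five_point(n)      # 4996000 nonzeros actually inserted
allocated = 5 * rows         # 5000000 if 5 entries per row are preallocated

print(rows, nnz, allocated)
```

Under these assumptions the sketch gives the same figures as the log: 1000000 rows, 4996000 nonzeros, and 5000000 allocated nonzeros if 5 entries are reserved per row.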
> Our timings include all these steps + Python overhead. I'm going to
> rewrite it in C++ so we can eliminate that source of uncertainty.
>
> --
> Anders
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener