dolfin team mailing list archive

Re: Assembly benchmark

 

On Mon, Jul 21, 2008 at 04:03:11PM -0500, Matthew Knepley wrote:
> On Mon, Jul 21, 2008 at 3:55 PM, Matthew Knepley <knepley@xxxxxxxxx> wrote:
> > On Mon, Jul 21, 2008 at 3:50 PM, Garth N. Wells <gnw20@xxxxxxxxx> wrote:
> >>
> >>
> >> Anders Logg wrote:
> >>> On Mon, Jul 21, 2008 at 01:48:23PM +0100, Garth N. Wells wrote:
> >>>>
> >>>> Anders Logg wrote:
> >>>>> I have updated the assembly benchmark to include also MTL4, see
> >>>>>
> >>>>>    bench/fem/assembly/
> >>>>>
> >>>>> Here are the current results:
> >>>>>
> >>>>> Assembly benchmark  |  Elasticity3D  PoissonP1  PoissonP2  PoissonP3  THStokes2D  NSEMomentum3D  StabStokes2D
> >>>>> -------------------------------------------------------------------------------------------------------------
> >>>>> uBLAS               |        9.0789    0.45645     3.8042     8.0736  14.937         9.2507        3.8455
> >>>>> PETSc               |        7.7758    0.42798     3.5483     7.3898  13.945         8.1632         3.258
> >>>>> Epetra              |        8.9516    0.45448     3.7976     8.0679  15.404         9.2341        3.8332
> >>>>> MTL4                |        8.9729    0.45554     3.7966     8.0759  14.94          9.2568        3.8658
> >>>>> Assembly            |         7.474    0.43673     3.7341     8.3793  14.633         7.6695        3.3878
> >>>>>
> >>
> >>
> >> I specified in MTL4Matrix maximum 30 nonzeroes per row, and the results
> >> change quite a bit,
> >>
> >>  Assembly benchmark  |  Elasticity3D  PoissonP1  PoissonP2  PoissonP3  THStokes2D  NSEMomentum3D  StabStokes2D
> >>  -------------------------------------------------------------------------------------------------------------
> >>  uBLAS               |        7.1881    0.32748     2.7633     5.8311      10.968         7.0735        2.8184
> >>  PETSc               |        5.7868    0.30673     2.5489     5.2344      9.8896          6.069        2.3661
> >>  MTL4                |        2.8641    0.18339     1.6628     2.6811      2.8519         3.4843       0.85029
> >>  Assembly            |        5.5564    0.30896     2.6858     5.9675      10.622         5.7144        2.4519
> >>
> >>
> >> MTL4 is a lot faster in all cases.
> 
> Okay, if you run KSP ex2 (Poisson 2D) and add a logging stage that
> times assembly (I checked it in to petsc-dev)
> then 1M unknowns takes about 1s
> 
>   Matrix Object:
>     type=seqaij, rows=1000000, cols=1000000
>     total: nonzeros=4996000, allocated nonzeros=5000000
>       not using I-node routines
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 1.4997e+00  56.3%  3.8891e+08 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  2.200e+01  51.2%
>  1:        Assembly: 1.1648e+00  43.7%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
> 
> I just cut the solve off. Thus all those numbers are extremely fishy.
> 
>   Matt

We shouldn't trust those numbers just yet. Some of it may be Python
overhead (calling the FFC JIT compiler etc).
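
One way to separate such one-time costs from the assembly itself is to time the same call several times and compare the first run with the later ones. The sketch below is only an illustration: `fake_assemble` is a hypothetical stand-in with a cached setup step, not the DOLFIN assembler, and the helper names are made up for this example.

```python
import time

def time_repeated(func, repeats=5):
    """Return the wall-clock time of each of `repeats` consecutive calls."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)
    return times

def split_overhead(times):
    """Return (estimated one-time overhead, steady-state time per call)."""
    steady = min(times[1:])           # best of the warmed-up runs
    return times[0] - steady, steady  # the first run carries the setup cost

# Hypothetical stand-in: the first call does extra setup work (think of a
# JIT compile), later calls reuse the cached result.
_cache = {}

def fake_assemble():
    if "form" not in _cache:
        _cache["form"] = sum(i * i for i in range(200000))  # one-time work
    return sum(range(1000))                                  # repeated work

times = time_repeated(fake_assemble)
overhead, steady = split_overhead(times)
print("one-time overhead: %.3g s, steady state: %.3g s" % (overhead, steady))
```

If the first run is much slower than the rest, the difference is setup cost and should not be counted as assembly time.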

Does 1M unknowns mean a unit square divided into 2x1000x1000 right
triangles?
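
A quick count of matrix entries helps to answer this. The sketch below assumes two candidate setups: a 5-point finite-difference stencil on a 1000 x 1000 grid of unknowns, and P1 elements on a unit square split into 2x1000x1000 right triangles. The function names are illustrative only.

```python
def five_point_nonzeros(m, n):
    """Nonzeros of a 5-point stencil matrix on an m x n grid of unknowns."""
    # One diagonal entry per unknown, plus two off-diagonal entries for
    # every pair of horizontally or vertically adjacent unknowns.
    return m * n + 2 * (m - 1) * n + 2 * m * (n - 1)

def p1_unit_square(nx, ny):
    """Cells, vertices (= P1 unknowns) and nonzeros for 2*nx*ny triangles."""
    cells = 2 * nx * ny
    vertices = (nx + 1) * (ny + 1)
    # Horizontal edges + vertical edges + one diagonal per grid square.
    edges = nx * (ny + 1) + ny * (nx + 1) + nx * ny
    # One diagonal entry per vertex, two entries per edge.
    nonzeros = vertices + 2 * edges
    return cells, vertices, nonzeros

fd_rows = 1000 * 1000
fd_nnz = five_point_nonzeros(1000, 1000)
cells, p1_rows, p1_nnz = p1_unit_square(1000, 1000)

print("5-point stencil: rows =", fd_rows, "nonzeros =", fd_nnz)
print("P1 triangles:    rows =", p1_rows, "nonzeros =", p1_nnz, "cells =", cells)
```

Under these assumptions the 5-point count gives 4996000 nonzeros on 1000000 rows, which matches the matrix in the log above, while the P1 mesh would give 1002001 rows and about 7M nonzeros. That suggests the PETSc run used a finite-difference grid rather than a triangulated mesh, so the two timings do not measure the same work.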

-- 
Anders
