dolfin team mailing list archive

Re: Results: Parallel speedup

To: dolfin-dev@xxxxxxxxxx
From: Anders Logg <logg@xxxxxxxxx>
Date: Tue, 22 Sep 2009 18:20:44 +0200

On Tue, Sep 22, 2009 at 08:59:19AM +0200, kent-and@xxxxxxxxx wrote:
> > On Mon, Sep 21, 2009 at 09:44:11PM +0200, Johan Hake wrote:
> >> On Monday 21 September 2009 21:37:03 Anders Logg wrote:
> >> > Johan and I have set up a benchmark for parallel speedup in
> >> >
> >> >   bench/fem/speedup
> >> >
> >> > Here are some preliminary results:
> >> >
> >> >   Procs    |  Assemble  Assemble + solve
> >> >   --------------------------------------
> >> >   1        |         1                 1
> >> >   2        |    1.4351            4.0785
> >> >   4        |    2.3763            6.9076
> >> >   8        |    3.7458            9.4648
> >> >   16       |    6.3143            19.369
> >> >   32       |    7.6207            33.699
> >> >
> >> > These numbers look a bit strange, especially the superlinear speedup
> >> > for assemble + solve. There might be a bug somewhere in the benchmark
> >> > code.
> >> >
> >> > Anyway, we have some preliminary results that at least show some kind
> >> > of speedup.
> >> >
> >> > It would be interesting to hear some comments on what kind of numbers
> >> > we should expect to get from Matt and others.
> >> >
> >> > The benchmark is for assembling and solving Poisson on a 64 x 64 x 64
> >> > mesh using PETSc/MUMPS. Partitioning time is not included in the
> >> > numbers.
> >>
> >> What solver is used when the number of processors is 1? If this is
> >> different from MUMPS, we will have the performance difference between
> >> the two solvers included in the speedup bump when going from 1 -> 2
> >> processors.
> >
> > It's the default PETSc LU solver which should be UMFPACK.
> >
> > So one explanation could be that MUMPS is twice as fast as UMFPACK
> > (looking at the speedup for two processes), which means we should
> > divide the numbers by 2, giving a speedup of 17 instead of 34 which
> > would be more reasonable.
> >
> > The total speedup of 17 includes both assemble + solve. Since assemble
> > > is obviously not scaling as it should, MUMPS may still be scaling
> > > pretty well.
> >
> > So some preliminary conclusions are:
> >
> > 1. Something is not right with assembly.
> >
> > 2. MUMPS scales well and runs relatively faster than UMFPACK.
> >
>
> That MUMPS scales well probably also suggests that something is wrong with
> the assembly. Is the solution of the problem correct?

Haven't looked, but it should be. The buildbots run a system test for
parallel assembly that compares against the serial result, and those
tests pass. This example is simpler than that test.
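
For reference, the correction discussed in the quoted text can be
sketched in a few lines of Python. This is only an illustration: the
numbers are copied from the table above, the factor of 2 between MUMPS
and UMFPACK is the guess made in this thread and not a measurement, and
the names corrected_speedup and efficiency are made up for this sketch.

```python
# Measured speedups from the table above (number of processes -> speedup).
assemble = {1: 1.0, 2: 1.4351, 4: 2.3763, 8: 3.7458, 16: 6.3143, 32: 7.6207}
assemble_solve = {1: 1.0, 2: 4.0785, 4: 6.9076, 8: 9.4648, 16: 19.369, 32: 33.699}

# Assumption (a guess from the thread, not measured): the parallel solver
# (MUMPS) is about twice as fast as the serial baseline solver (UMFPACK).
SOLVER_FACTOR = 2.0


def corrected_speedup(measured, factor=SOLVER_FACTOR):
    """Divide out the assumed solver difference for all parallel runs."""
    return {p: (s if p == 1 else s / factor) for p, s in measured.items()}


def efficiency(speedup):
    """Parallel efficiency: speedup divided by the number of processes."""
    return {p: s / p for p, s in speedup.items()}


corrected = corrected_speedup(assemble_solve)
assemble_eff = efficiency(assemble)

for p in sorted(corrected):
    print(f"{p:3d}  corrected assemble+solve: {corrected[p]:7.3f}"
          f"  assemble efficiency: {assemble_eff[p]:.3f}")
```

With the assumed factor, the 32-process figure drops from about 34 to
about 17, and the assembly efficiency at 32 processes comes out below
25%, which is consistent with the suspicion that assembly is where the
problem is.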

--
Anders
