dolfin team mailing list archive

Re: Results: Parallel speedup

To: dolfin-dev@xxxxxxxxxx
From: kent-and@xxxxxxxxx
Date: Tue, 22 Sep 2009 08:59:19 +0200
Delivered-to: dolfin-dev@xxxxxxxxxx
In-reply-to: <20090921204629.GH13010@olorin>
User-agent: SquirrelMail/1.4.19

> On Mon, Sep 21, 2009 at 09:44:11PM +0200, Johan Hake wrote:
>> On Monday 21 September 2009 21:37:03 Anders Logg wrote:
>> > Johan and I have set up a benchmark for parallel speedup in
>> >
>> >   bench/fem/speedup
>> >
>> > Here are some preliminary results:
>> >
>> >   Procs    |  Assemble  Assemble + solve  (speedup)
>> >   --------------------------------------
>> >   1        |         1                 1
>> >   2        |    1.4351            4.0785
>> >   4        |    2.3763            6.9076
>> >   8        |    3.7458            9.4648
>> >   16       |    6.3143            19.369
>> >   32       |    7.6207            33.699
>> >
>> > These numbers look a bit strange, especially the superlinear speedup
>> > for assemble + solve. There might be a bug somewhere in the benchmark
>> > code.
>> >
>> > Anyway, we have some preliminary results that at least show some kind
>> > of speedup.
>> >
>> > It would be interesting to hear some comments on what kind of numbers
>> > we should expect to get from Matt and others.
>> >
>> > The benchmark is for assembling and solving Poisson on a 64 x 64 x 64
>> > mesh using PETSc/MUMPS. Partitioning time is not included in the
>> > numbers.
>>
>> What solver is used when the number of processors is 1? If this is
>> different from MUMPS, we will have the performance difference between
>> the two solvers included in the speedup bump when going from 1 -> 2
>> processors.
>
> It's the default PETSc LU solver which should be UMFPACK.
>
> So one explanation could be that MUMPS is twice as fast as UMFPACK
> (looking at the speedup for two processes), which means we should
> divide the numbers by 2, giving a speedup of 17 instead of 34 which
> would be more reasonable.
>
> The total speedup of 17 includes both assemble + solve. Since assemble
> is obviously not scaling as it should, MUMPS may still be scaling
> pretty well.
>
> So some preliminary conclusions are:
>
> 1. Something is not right with assembly.
>
> 2. MUMPS scales well and runs relatively faster than UMFPACK.
>
> --
> Anders

That MUMPS scales well probably also suggests that something is wrong with
the assembly. Is the solution of the problem correct?
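
For reference, the "divide by 2" argument quoted above can be sketched in a
few lines of Python. This is only an illustration of the arithmetic: the
factor of 2 is Anders' guess about MUMPS vs. UMFPACK, not a measurement, and
the helper names (rebase, efficiency) are made up for this sketch. Note that
the factor is applied to the combined assemble + solve number, as in the
quoted estimate, although strictly it only concerns the solve part.

```python
# Speedup figures copied from the table quoted above.
procs = [1, 2, 4, 8, 16, 32]
assemble = [1.0, 1.4351, 2.3763, 3.7458, 6.3143, 7.6207]
assemble_solve = [1.0, 4.0785, 6.9076, 9.4648, 19.369, 33.699]

# Assumption (a guess from the thread, not measured): MUMPS is about
# twice as fast as UMFPACK on a single process.
SOLVER_FACTOR = 2.0


def rebase(speedups, factor):
    """Divide each multi-process speedup by factor; the 1-process entry stays 1."""
    return [s if p == 1 else s / factor for p, s in zip(procs, speedups)]


def efficiency(speedups):
    """Parallel efficiency: speedup divided by the number of processes."""
    return [s / p for p, s in zip(procs, speedups)]


rebased = rebase(assemble_solve, SOLVER_FACTOR)
assemble_eff = efficiency(assemble)
rebased_eff = efficiency(rebased)

print("procs  assemble_eff  rebased_speedup  rebased_eff")
for p, ae, rs, re_ in zip(procs, assemble_eff, rebased, rebased_eff):
    print(f"{p:5d}  {ae:12.3f}  {rs:15.3f}  {re_:11.3f}")
```

With the assumed factor of 2, the 32-process assemble + solve figure drops
to about 16.8, while the assembly efficiency on 32 processes is only about
0.24, which is consistent with assembly being the part that does not scale.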

Kent

Follow ups

Re: Results: Parallel speedup
From: Anders Logg, 2009-09-22

References

Results: Parallel speedup
From: Anders Logg, 2009-09-21
Re: Results: Parallel speedup
From: Johan Hake, 2009-09-21
Re: Results: Parallel speedup
From: Anders Logg, 2009-09-21