yade-dev team mailing list archive
Message #01026
3 x speedups with new containers and openMP!
To: Yade Development Group <yade-dev@xxxxxxxxxxxxxxxxxxx>
From: Václav Šmilauer <eudoxos@xxxxxxxx>
Date: Tue, 24 Feb 2009 14:23:39 +0100
Hi, I put a table on the wiki
http://yade.wikia.com/wiki/Performance_Tuning that shows how simulation
time can be cut substantially.
1. Switching from InteractionVecSet to InteractionVecMap brings the
simulation down to 75% of the reference time. InteractionVecMap has
already been the default container since r1689.
2. Switching from PhysicalActionVectorVector to BexContainer
(compile-time option) makes the simulation 9% faster.
3. Both of the previous steps are necessary to run an OpenMP-enabled
(compile-time option) simulation. Running 3 threads (environment
variable OMP_NUM_THREADS) almost halved the time, for a total of 37% of
the time of the "reference" (InteractionVecSet and
PhysicalActionVectorVector) case.
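As a rough cross-check of the three figures above, the arithmetic can be
sketched as follows. This assumes the 9% gain of step 2 is measured
relative to the result of step 1, which the message does not state
explicitly; the variable names are made up for illustration.

```python
# Fractions of the "reference" run time, as quoted in the message.
after_vecmap = 0.75          # step 1: InteractionVecMap
bex_gain = 0.09              # step 2: "faster by 9%" (ASSUMED relative to step 1)
total_with_3_threads = 0.37  # step 3: quoted total with 3 threads

# Time after both container changes, still single-threaded.
after_bex = after_vecmap * (1 - bex_gain)

# Share of that serial time left once OpenMP with 3 threads is enabled.
openmp_factor = total_with_3_threads / after_bex

# Overall speedup relative to the reference case.
overall_speedup = 1 / total_with_3_threads

print(f"after steps 1+2: {after_bex:.4f} of reference time")
print(f"OpenMP step alone: {openmp_factor:.2f} of the serial time")
print(f"overall speedup: {overall_speedup:.1f}x")
```

Under that assumption the OpenMP step alone leaves a bit more than half
of the serial time, which matches "almost halved", and the overall
factor is somewhat below the 3x of the subject line.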
Only InteractionPhysicsMetaEngine, InteractionGeometryMetaEngine and
ConstitutiveLawDispatcher were parallelized, as they take the most
simulation time
(http://yade.wikia.com/wiki/Speed_profiling_using_TimingInfo_and_TimingDeltas_classes).
In htop during the simulation, most cores were running at 50%,
meaning that memory access becomes the limiting factor. (The system I
was using has 3 physical dual-core CPUs and is probably quite heavily
non-uniform with respect to memory access from different processors.)
Using more than 3 threads doesn't speed up the simulation any further,
but this limit will probably be higher for very large simulations.
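To find where the speedup levels off on a given machine, one option is
to time the same run under several OMP_NUM_THREADS values. The script
below is only a sketch: the helper time_with_threads is hypothetical,
and the command is a placeholder that would have to be replaced with the
actual invocation of an OpenMP-enabled build.

```python
import os
import subprocess
import sys
import time


def time_with_threads(cmd, thread_counts):
    """Run cmd once per thread count; return {threads: wall-clock seconds}."""
    timings = {}
    for n in thread_counts:
        # OMP_NUM_THREADS is read by the OpenMP runtime of the child process.
        env = dict(os.environ, OMP_NUM_THREADS=str(n))
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True)
        timings[n] = time.perf_counter() - start
    return timings


# PLACEHOLDER command (does nothing); substitute the real simulation run.
cmd = [sys.executable, "-c", "pass"]

timings = time_with_threads(cmd, [1, 2, 3, 4])
for n, seconds in timings.items():
    print(f"OMP_NUM_THREADS={n}: {seconds:.2f} s")
```

A single run per thread count is noisy; repeating each measurement a few
times and keeping the best time would give a steadier picture.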
Let me know (sega?) if you get comparable speedups in your case.
Regards, Vaclav