yade-dev team mailing list archive

Re: parallel collider - testing needed

 

Hi Bruno,

I have tested this version of the collider and got a speedup of
about 5-10% on 2 to 6 cores. These were quasi-static simulations,
though, so the contact list is not updated very often.

I think we can include this code in the master branch in git.
Let's review the code more carefully and merge it.

Thank you!

Anton


2014-02-24 16:36 GMT+01:00 Bruno Chareyre <bruno.chareyre@xxxxxxxxxxx>:
> Hi there,
> I implemented a parallel version of the InsertionSortCollider. It is
> almost ready but not yet pushed to the main trunk, as I have a few
> things to check before that.
> It would be helpful if some of you could 1/ test that your scripts work
> correctly and 2/ benchmark this for N>100k and j>4.
> If you run benchmarks, please remember to always activate timing and
> report the result of timing.stats(). It gives much more interesting data
> than the wall clock time.
>
> Preliminary benchmark results are below (from my laptop...), showing a
> speedup by a factor of 2 on the total computation time for j4/200k
> particles (compared to the sequential collider).
> The speedup on the collider alone is in fact of the order of x3.68 for 4
> threads: nearly linear, at least for such a small number of threads.
>
> My expectation is that it should change almost nothing for a small number
> of particles (say, N<10k), where colliding is an inexpensive step.
> For 1 million particles, on the other hand, there could be a significant
> speedup, since the collider takes most of the time.
>
> You can get the "pc" branch at my github repo:
> git clone -b pc https://github.com/bchareyre/trunk.git
>
> Results of yade -j4 --performance are below (i7 quad-core with
> hyperthreading enabled, lightly loaded by background tasks; j>4 is not
> reported, as hyperthreading is probably doing no good).
>
> Happy benchmarking. :)
>
> Bruno
>
>
> ====================
> ./yade-trunk -j4 --performance  (the current trunk)
> .......
> number of bodies 200813
>
> Elapsed  29.4102840424  sec
> Performance  6.80034234664  iter/sec
> Extrapolation on 1e5 iters  4.08476167255  hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                       Count          Time   Rel. time
> ----------------------------------------------------------
> ForceResetter                200      700881us       2.38%
> InsertionSortCollider          7    18816625us      64.02%
> InteractionLoop              200     6581283us      22.39%
> NewtonIntegrator             200     3293119us      11.20%
> TOTAL                               29391910us     100.00%
>
> Common time  597.731503963 s
>
>
> 5037  spheres, velocity= 327.689688709 +- 5.13604387635 %
> 25103  spheres, velocity= 81.2726909754 +- 1.0105334405 %
> 50250  spheres, velocity= 45.4114521341 +- 3.02333274436 %
> 100467  spheres, velocity= 19.0287424005 +- 2.26073439157 %
> 200813  spheres, velocity= 6.51664351023 +- 4.03351515402 %
>
>
> SCORE: 13777
> Number of threads  4
>
>
> ========================
> ./yade-parallel -j4 --performance  (my "pc" branch)
> ....
>
> number of bodies 200813
>
> Elapsed  15.4320101738  sec
> Performance  12.9600744004  iter/sec
> Extrapolation on 1e5 iters  2.14333474636  hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                       Count          Time   Rel. time
> ----------------------------------------------------------
> ForceResetter                200      671157us       4.36%
> InsertionSortCollider          7     5145114us      33.42%
>   boundDispatcher              7       93186us       1.81%
>   bound                        7          12us       0.00%
>   copy                         7      160891us       3.13%
>   erase                        7       66932us       1.30%
>   sort&collide                 7     4824071us      93.76%
>   TOTAL                       35     5145095us     100.00%
> InteractionLoop              200     6545848us      42.52%
> NewtonIntegrator             200     3030989us      19.69%
> TOTAL                               15393110us     100.00%
>
> Common time  460.37680912 s
>
>
> 5037  spheres, velocity= 365.599773471 +- 8.02397068512 %
> 25103  spheres, velocity= 92.0077536966 +- 3.81069496509 %
> 50250  spheres, velocity= 54.1683980588 +- 0.528288534811 %
> 100467  spheres, velocity= 25.7134767981 +- 1.0796373464 %
> 200813  spheres, velocity= 12.6488486429 +- 4.66276699319 %
>
>
> SCORE: 18800
> Number of threads  4
>
>
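
For anyone comparing the two quoted runs, the speedup figures can be
re-derived from the timing tables. The sketch below is only an
illustration: the times are copied from the two timing.stats() tables,
while the variable names and the amdahl_speedup helper are hypothetical
and not part of YADE. It computes the collider-only and total speedups
(roughly 3.7 and 1.9 with these numbers), then uses Amdahl's law to
estimate the total speedup expected if only the collider had become faster.

```python
# Per-engine times in microseconds, copied from the two timing tables.
trunk = {
    "ForceResetter": 700881,
    "InsertionSortCollider": 18816625,
    "InteractionLoop": 6581283,
    "NewtonIntegrator": 3293119,
}
pc_branch = {
    "ForceResetter": 671157,
    "InsertionSortCollider": 5145114,
    "InteractionLoop": 6545848,
    "NewtonIntegrator": 3030989,
}


def amdahl_speedup(fraction, local_speedup):
    """Overall speedup when `fraction` of the run time gets `local_speedup` times faster.

    Hypothetical helper for illustration; not a YADE function.
    """
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Sums of the rows; they can differ from the printed TOTAL rows by a few us.
trunk_total = sum(trunk.values())
pc_total = sum(pc_branch.values())

# Speedup of the collider alone and of the whole engine loop.
collider_speedup = trunk["InsertionSortCollider"] / pc_branch["InsertionSortCollider"]
total_speedup = trunk_total / pc_total

# Share of the sequential run spent in the collider, and the total speedup
# Amdahl's law predicts if only the collider had been accelerated.
collider_fraction = trunk["InsertionSortCollider"] / trunk_total
predicted_total = amdahl_speedup(collider_fraction, collider_speedup)

print(f"collider speedup:      x{collider_speedup:.2f}")
print(f"total speedup:         x{total_speedup:.2f}")
print(f"collider share:        {collider_fraction:.1%}")
print(f"Amdahl estimate total: x{predicted_total:.2f}")
```

The measured total can come out slightly above the Amdahl estimate when
other engines also happen to run faster, as ForceResetter and
NewtonIntegrator do in the second run.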

