yade-dev team mailing list archive
Message #10499
Re: parallel collider - testing needed
I forgot to mention two things:
1- I tried the benchmark used by Christian for comparisons with PFC [1],
however this test seems to be very special: I get large differences
between two runs. Basically, the outcome seems to depend only on
truncation errors: vertical columns of spheres remain stable until a
small bit of horizontal noise makes them fall down one by one. If you
look at the simulation in the GUI it looks strange. I did not pursue
this one further; I think it could be improved by replacing the lattice
with a disordered packing.
2- The benchmark done by Alexander some time ago (on the same problem
but with -j>1) is not visible anywhere, if I'm not mistaken. I have a
copy of the PDF; is it OK to upload it to the wiki? It would be an
interesting starting point for evaluating the parallel collider.
Bruno
[1] https://www.yade-dem.org/wiki/Comparisons_with_PFC3D
On 24/02/14 16:36, Bruno Chareyre wrote:
> Hi there,
> I implemented a parallel version of the InsertionSortCollider. It is
> almost ready but not yet pushed to the main trunk, as I have a few
> things to check before that.
> It would be helpful if some of you could 1/ test that your scripts work
> correctly and 2/ benchmark this for N>100k and j>4.
> If you run benchmarks, please remember to always activate timing and
> report the result of timing.stats(). It gives much more interesting data
> than the wall clock time.
>
> Preliminary benchmark results are below (from my laptop...), showing a
> speedup by a factor of about 2 on the total computation time for
> j4/200k particles (compared to the sequential collider).
> The speedup on the collider alone is in fact of the order of x3.68 for
> 4 threads: nearly linear, at least for such a small number of threads.
>
> My expectation is that it should change almost nothing for a small
> number of particles (say, N<10k), where colliding is an inexpensive step.
> For 1 million particles, OTOH, there could be a significant speedup,
> since the collider takes most of the time.
>
> You can get the "pc" branch at my github repo:
> git clone -b pc https://github.com/bchareyre/trunk.git
>
> Results of yade -j4 --performance are below (i7 quad-core with
> hyperthreading enabled, lightly loaded by background tasks; j>4 is not
> reported, as hyperthreading is probably doing no good).
>
> Happy benchmarking. :)
>
> Bruno
>
>
> ====================
> ./yade-trunk -j4 --performance (the current trunk)
> .......
> number of bodies 200813
>
> Elapsed 29.4102840424 sec
> Performance 6.80034234664 iter/sec
> Extrapolation on 1e5 iters 4.08476167255 hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                          Count          Time   Rel. time
> -------------------------------------------------------------
> ForceResetter                   200      700881us       2.38%
> InsertionSortCollider             7    18816625us      64.02%
> InteractionLoop                 200     6581283us      22.39%
> NewtonIntegrator                200     3293119us      11.20%
> TOTAL                                  29391910us     100.00%
>
> Common time 597.731503963 s
>
>
> 5037 spheres, velocity= 327.689688709 +- 5.13604387635 %
> 25103 spheres, velocity= 81.2726909754 +- 1.0105334405 %
> 50250 spheres, velocity= 45.4114521341 +- 3.02333274436 %
> 100467 spheres, velocity= 19.0287424005 +- 2.26073439157 %
> 200813 spheres, velocity= 6.51664351023 +- 4.03351515402 %
>
>
> SCORE: 13777
> Number of threads 4
>
>
> ========================
> ./yade-parallel -j4 --performance (my "pc" branch)
> ....
>
> number of bodies 200813
>
> Elapsed 15.4320101738 sec
> Performance 12.9600744004 iter/sec
> Extrapolation on 1e5 iters 2.14333474636 hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                          Count          Time   Rel. time
> -------------------------------------------------------------
> ForceResetter                   200      671157us       4.36%
> InsertionSortCollider             7     5145114us      33.42%
>   boundDispatcher                 7       93186us       1.81%
>   bound                           7          12us       0.00%
>   copy                            7      160891us       3.13%
>   erase                           7       66932us       1.30%
>   sort&collide                    7     4824071us      93.76%
>   TOTAL                          35     5145095us     100.00%
> InteractionLoop                 200     6545848us      42.52%
> NewtonIntegrator                200     3030989us      19.69%
> TOTAL                                  15393110us     100.00%
>
> Common time 460.37680912 s
>
>
> 5037 spheres, velocity= 365.599773471 +- 8.02397068512 %
> 25103 spheres, velocity= 92.0077536966 +- 3.81069496509 %
> 50250 spheres, velocity= 54.1683980588 +- 0.528288534811 %
> 100467 spheres, velocity= 25.7134767981 +- 1.0796373464 %
> 200813 spheres, velocity= 12.6488486429 +- 4.66276699319 %
>
>
> SCORE: 18800
> Number of threads 4
>
>
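For readers who want to check the speedup figures against the two logs
quoted above, here is a minimal sketch in plain Python (no Yade needed).
The numbers are copied from the timing tables and the per-size iter/sec
lines; the variable and function names are only illustrative, and the
Amdahl-style estimate assumes that nothing but the collider changed
between the two builds, which the logs only approximately support. Each
figure comes from a single run with a few percent of scatter, so the
ratios are indicative only.

```python
# Engine times in microseconds, copied from the two timing tables above.
trunk_us = {
    "ForceResetter": 700881,
    "InsertionSortCollider": 18816625,
    "InteractionLoop": 6581283,
    "NewtonIntegrator": 3293119,
}
pc_us = {
    "ForceResetter": 671157,
    "InsertionSortCollider": 5145114,
    "InteractionLoop": 6545848,
    "NewtonIntegrator": 3030989,
}

# Iterations per second for each packing size (the "velocity" lines).
trunk_rate = {5037: 327.689688709, 25103: 81.2726909754, 50250: 45.4114521341,
              100467: 19.0287424005, 200813: 6.51664351023}
pc_rate = {5037: 365.599773471, 25103: 92.0077536966, 50250: 54.1683980588,
           100467: 25.7134767981, 200813: 12.6488486429}


def amdahl(fraction, factor):
    """Overall speedup if only `fraction` of the run time is sped up by `factor`."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Speedup of the collider alone and of the whole engine loop (200k spheres).
collider_speedup = trunk_us["InsertionSortCollider"] / pc_us["InsertionSortCollider"]
total_speedup = sum(trunk_us.values()) / sum(pc_us.values())

# What the collider speedup alone would predict for the whole loop.
collider_fraction = trunk_us["InsertionSortCollider"] / sum(trunk_us.values())
predicted_total = amdahl(collider_fraction, collider_speedup)

# Ratio of iteration rates, parallel branch over trunk, per packing size.
rate_ratio = {n: pc_rate[n] / trunk_rate[n] for n in sorted(trunk_rate)}

print("collider speedup: %.2f" % collider_speedup)
print("total speedup:    %.2f (collider-only estimate: %.2f)"
      % (total_speedup, predicted_total))
for n, r in rate_ratio.items():
    print("%7d spheres: x%.2f" % (n, r))
```

The per-size ratios grow with the number of spheres, which fits the
expectation stated in the message: little gain for small packings, where
the collider is cheap, and a larger gain once it dominates the run time.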