yade-dev team mailing list archive

Re: parallel collider - testing needed

 

Sorry for the late reply. Feel free to share the PDF; it was originally supposed to go on the wiki anyway.
I am now thinking about a good way to measure performance for highly dynamic simulations. The script that martin-niehoff posted [1] might be useful. It is basically a regular cubic packing of spheres placed in a vibrating tub. The simulation runs for a single second of simulation time, and the excitation of the tub causes the packing to disperse. I think it would be very interesting to see whether such simulations benefit from the new collider, too.

Alex
[1] https://answers.launchpad.net/yade/+question/242644 answer #10
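
One way to look at performance in such a dispersing packing is to record the iteration rate per window of iterations rather than a single average, since the cost per iteration may change as the spheres spread out. The following is a minimal sketch in plain Python, not part of Yade: `rate_per_window` and `dummy_step` are hypothetical names, and `dummy_step` is only a stand-in for whatever advances the simulation by one iteration.

```python
import time

def rate_per_window(step, n_windows, iters_per_window):
    """Call step() in consecutive windows; return one rate (iter/sec) per window."""
    rates = []
    for _ in range(n_windows):
        start = time.perf_counter()
        for _ in range(iters_per_window):
            step()
        elapsed = time.perf_counter() - start
        rates.append(iters_per_window / elapsed)
    return rates

def dummy_step():
    # Hypothetical stand-in workload; replace with one simulation iteration.
    sum(i * i for i in range(1000))

rates = rate_per_window(dummy_step, n_windows=5, iters_per_window=50)
for k, r in enumerate(rates):
    print("window %d: %.1f iter/sec" % (k, r))
```

A rate that drops from window to window would suggest that later stages of the run are more expensive per iteration; the per-engine breakdown would still have to come from timing.stats().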

-----Original Message-----
From: Yade-dev [mailto:yade-dev-bounces+alexander.eulitz=iwf.tu-berlin.de@xxxxxxxxxxxxxxxxxxx] On Behalf Of Bruno Chareyre
Sent: Monday, 24 February 2014 16:57
To: yade-dev@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Yade-dev] parallel collider - testing needed

I forgot to mention two things:
1- I tried the benchmark used by Christian for comparisons with PFC [1], but this test seems very special: I get large differences between two runs. Basically, the simulation seems to depend only on truncation errors: vertical columns of spheres remain stable until a small bit of horizontal noise makes them fall down one by one. It looks strange if you watch the simulation in the GUI. I did not pursue this one; I think it could be improved by replacing the lattice with disordered packings.
2- The benchmark done by Alexander some time ago (on the same problem but with -j>1) is not visible anywhere, if I'm not wrong. I have a copy of the PDF; is it OK to upload it to the wiki? It is an interesting starting point for evaluating the parallel collider.

Bruno

[1] https://www.yade-dem.org/wiki/Comparisons_with_PFC3D
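
When two runs of the same benchmark differ a lot, it helps to repeat the run several times and report the mean together with the relative spread. The following is a minimal sketch in plain Python; `summarize_runs` is a hypothetical helper, the sample values are invented for illustration, and the choice of population standard deviation over the mean is an assumption, not necessarily what the --performance script prints.

```python
from statistics import mean, pstdev

def summarize_runs(values):
    """Return (mean, relative spread in percent) for repeated measurements."""
    m = mean(values)
    spread_percent = 100.0 * pstdev(values) / m
    return m, spread_percent

# Hypothetical iteration rates (iter/sec) from repeated runs of one benchmark.
sample_rates = [9.0, 11.0]
m, spread = summarize_runs(sample_rates)
print("rate = %.2f +- %.1f %%" % (m, spread))
```

If the spread between repeated runs is comparable to the difference between two builds, the benchmark cannot tell them apart.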


On 24/02/14 16:36, Bruno Chareyre wrote:
> Hi there,
> I implemented a parallel version of the InsertionSortCollider. It is 
> almost ready but not yet pushed to the main trunk, as I have a few 
> things to check before that.
> It would be helpful if some of you could 1/ test that your scripts 
> work correctly and 2/ benchmark this for N>100k and j>4.
> If you run benchmarks, please remember to always activate timing and 
> report the result of timing.stats(). It gives much more interesting 
> data than the wall clock time.
>
> Preliminary benchmark results are below (from my laptop...), showing a
> speedup by a factor of 2 on the total computation time for j4/200k
> particles (compared to the sequential collider).
> The speedup on the collider alone is in fact of the order of x3.68 for 4
> threads: nearly linear, at least for such a small number of threads.
>
> My expectation is that it should change almost nothing for a small
> number of particles (say, N<10k), where colliding is an inexpensive step.
> For 1 million particles, OTOH, there could be a significant speedup,
> since the collider takes most of the time.
>
> You can get the "pc" branch at my github repo:
> git clone -b pc https://github.com/bchareyre/trunk.git
>
> Results of yade -j4 --performance are below (I7 quad-core with 
> hyperthreading enabled, lightly loaded by background tasks -  j>4 not 
> reported as hyperthreading is probably doing no good).
>
> Happy benchmarking. :)
>
> Bruno
>
>
> ====================
> ./yade-trunk -j4 --performance  (the current trunk) .......
> number of bodies 200813
>
> Elapsed  29.4102840424  sec
> Performance  6.80034234664  iter/sec
> Extrapolation on 1e5 iters  4.08476167255  hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                        Count          Time    Rel. time
> ------------------------------------------------------------
> ForceResetter                 200      700881us        2.38%
> InsertionSortCollider           7    18816625us       64.02%
> InteractionLoop               200     6581283us       22.39%
> NewtonIntegrator              200     3293119us       11.20%
> TOTAL                                29391910us      100.00%
>
> Common time  597.731503963 s
>
>
> 5037  spheres, velocity= 327.689688709 +- 5.13604387635 %
> 25103  spheres, velocity= 81.2726909754 +- 1.0105334405 %
> 50250  spheres, velocity= 45.4114521341 +- 3.02333274436 %
> 100467  spheres, velocity= 19.0287424005 +- 2.26073439157 %
> 200813  spheres, velocity= 6.51664351023 +- 4.03351515402 %
>
>
> SCORE: 13777
> Number of threads  4
>
>
> ========================
> ./yade-parallel -j4 --performance  (my "pc" branch) ....
>
> number of bodies 200813
>
> Elapsed  15.4320101738  sec
> Performance  12.9600744004  iter/sec
> Extrapolation on 1e5 iters  2.14333474636  hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                        Count          Time    Rel. time
> ------------------------------------------------------------
> ForceResetter                 200      671157us        4.36%
> InsertionSortCollider           7     5145114us       33.42%
>   boundDispatcher               7       93186us        1.81%
>   bound                         7          12us        0.00%
>   copy                          7      160891us        3.13%
>   erase                         7       66932us        1.30%
>   sort&collide                  7     4824071us       93.76%
>   TOTAL                        35     5145095us      100.00%
> InteractionLoop               200     6545848us       42.52%
> NewtonIntegrator              200     3030989us       19.69%
> TOTAL                                15393110us      100.00%
>
> Common time  460.37680912 s
>
>
> 5037  spheres, velocity= 365.599773471 +- 8.02397068512 %
> 25103  spheres, velocity= 92.0077536966 +- 3.81069496509 %
> 50250  spheres, velocity= 54.1683980588 +- 0.528288534811 %
> 100467  spheres, velocity= 25.7134767981 +- 1.0796373464 %
> 200813  spheres, velocity= 12.6488486429 +- 4.66276699319 %
>
>
> SCORE: 18800
> Number of threads  4
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~yade-dev
> Post to     : yade-dev@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~yade-dev
> More help   : https://help.launchpad.net/ListHelp
>
>
>
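
The speedups quoted in the message can be checked against the two timing tables above. The following is a minimal sketch in plain Python: the timings are copied from the tables, while the dictionary names and the `amdahl` helper are introduced here for illustration only. With the numbers as posted, the collider ratio comes out at roughly 3.66 (close to the x3.68 quoted) and the total ratio at roughly 1.91.

```python
# Timings in microseconds, copied from the two timing tables above.
trunk = {"collider": 18816625, "total": 29391910}  # current trunk, -j4
pc = {"collider": 5145114, "total": 15393110}      # "pc" branch, -j4

collider_speedup = trunk["collider"] / pc["collider"]
total_speedup = trunk["total"] / pc["total"]

def amdahl(p, s):
    """Overall speedup when a fraction p of the run time is sped up by a factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# Fraction of the trunk run spent in the collider (64.02% in the table).
p = trunk["collider"] / trunk["total"]
predicted_total = amdahl(p, collider_speedup)

print("collider speedup:        %.2f" % collider_speedup)
print("total speedup:           %.2f" % total_speedup)
print("Amdahl estimate (total): %.2f" % predicted_total)
```

The Amdahl estimate (about 1.87) is slightly below the measured total ratio because ForceResetter and NewtonIntegrator also took somewhat less time in the "pc" run. The same helper shows why a larger collider share of the run time would make the overall gain larger.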

