yade-dev team mailing list archive
Message #10748
Re: parallel collider - testing needed
Thanks Matthias,
Actually, I don't understand your benchmark results. You are the first
one to find no speedup in the collider part.
It seems the results below were not obtained with the parallel collider,
since the time spent in InsertionSortCollider is essentially the same
(about 15.7 s) for every number of threads.
What version is that (displayed at Yade startup)?
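
A quick way to check this reading is to compute the per-engine speedup from the timing tables quoted below. The following is only a minimal sketch: the numbers are copied from the quoted tables, while the names `timings_us`, `speedups`, and `serial_bound` are made up for illustration and are not part of Yade. The bound at the end is a rough Amdahl-style estimate that assumes the collider stays serial and everything else scales perfectly.

```python
# Per-engine wall times in microseconds, copied from the timing tables
# quoted in this message. Outer keys are thread counts.
timings_us = {
    1:  {"ForceResetter": 594120,  "InsertionSortCollider": 15686671,
         "InteractionLoop": 21787610, "NewtonIntegrator": 9541243},
    4:  {"ForceResetter": 2919976, "InsertionSortCollider": 15675024,
         "InteractionLoop": 5309648,  "NewtonIntegrator": 5730646},
    8:  {"ForceResetter": 4760739, "InsertionSortCollider": 15682352,
         "InteractionLoop": 3398981,  "NewtonIntegrator": 4997676},
    12: {"ForceResetter": 7943958, "InsertionSortCollider": 15713441,
         "InteractionLoop": 2522508,  "NewtonIntegrator": 3055652},
    16: {"ForceResetter": 5765158, "InsertionSortCollider": 15685115,
         "InteractionLoop": 2002979,  "NewtonIntegrator": 3665586},
}


def speedups(timings, baseline=1):
    """Return {threads: {engine: T(baseline) / T(threads)}}."""
    base = timings[baseline]
    return {
        threads: {name: base[name] / t for name, t in engines.items()}
        for threads, engines in timings.items()
    }


def serial_bound(timings, serial_engine, baseline=1):
    """Amdahl-style upper bound on total speedup if `serial_engine`
    does not scale at all and every other engine scales perfectly."""
    total = sum(timings[baseline].values())
    return total / timings[baseline][serial_engine]


if __name__ == "__main__":
    for threads, per_engine in sorted(speedups(timings_us).items()):
        row = ", ".join(f"{name}: {s:.2f}x" for name, s in per_engine.items())
        print(f"{threads:2d} threads -> {row}")
    bound = serial_bound(timings_us, "InsertionSortCollider")
    print(f"upper bound with a serial collider: {bound:.2f}x")
```

With these numbers the collider speedup stays at about 1.0 for every thread count, while InteractionLoop scales by more than a factor of 10 at 16 threads, which is what one would expect if the serial collider were in use.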
Bruno
On 16/04/14 17:14, Matthias Frank wrote:
> hi bruno,
>
> i have used your first version of the parallel collider for quite a
> while during model development and also calibration. i saw no
> differences between yade-1.07 and your version.
>
> i did some benchmarks with 4 to 16 sandy bridge cores at our bull
> cluster. getting more than 16 cores for openmp applications is quite
> difficult.
> done on an exclusively used 16 core node
>
> =============== 1 threads =============================
> number of bodies 200813
>
> Elapsed 47.6222550869 sec
> Performance 4.19971712039 iter/sec
> Extrapolation on 1e5 iters 6.6142020954 hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                      Count          Time   Rel. time
> ---------------------------------------------------------
> ForceResetter               200      594120us       1.25%
> InsertionSortCollider         7    15686671us      32.95%
> InteractionLoop             200    21787610us      45.76%
> NewtonIntegrator            200     9541243us      20.04%
> TOTAL                              47609645us     100.00%
>
> Common time 1383.60180092 s
>
>
> 5037 spheres, velocity= 103.875852973 +- 6.56561134015 %
> 25103 spheres, velocity= 31.681069095 +- 3.69992939292 %
> 50250 spheres, velocity= 15.6112167455 +- 0.651579666153 %
> 100467 spheres, velocity= 7.65955209926 +- 0.740064173207 %
> Calculation velocity is unstable, try to close all programs and start
> performance tests again
> 200813 spheres, velocity= 4.52368811131 +- 12.3907756519 %
>
>
> SCORE: 6055
> Number of threads 1
> =============== 4 threads =============================
> number of bodies 200813
>
> Elapsed 29.6409780979 sec
> Performance 6.7474156669 iter/sec
> Extrapolation on 1e5 iters 4.1168025136 hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                      Count          Time   Rel. time
> ---------------------------------------------------------
> ForceResetter               200     2919976us       9.85%
> InsertionSortCollider         7    15675024us      52.89%
> InteractionLoop             200     5309648us      17.92%
> NewtonIntegrator            200     5730646us      19.34%
> TOTAL                              29635295us     100.00%
>
> Common time 641.693111897 s
>
>
> Calculation velocity is unstable, try to close all programs and start
> performance tests again
> 5037 spheres, velocity= 232.725838879 +- 14.3014472878 %
> Calculation velocity is unstable, try to close all programs and start
> performance tests again
> 25103 spheres, velocity= 72.3475644141 +- 12.8106054968 %
> 50250 spheres, velocity= 50.2926096116 +- 3.01250915287 %
> 100467 spheres, velocity= 18.9664279425 +- 1.40241049531 %
> 200813 spheres, velocity= 6.95879166249 +- 2.72955035307 %
>
>
> SCORE: 13080
> Number of threads 4
> =============== 8 threads =============================
> number of bodies 200813
>
> Elapsed 28.8497908115 sec
> Performance 6.9324592787 iter/sec
> Extrapolation on 1e5 iters 4.00691539049 hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                      Count          Time   Rel. time
> ---------------------------------------------------------
> ForceResetter               200     4760739us      16.51%
> InsertionSortCollider         7    15682352us      54.38%
> InteractionLoop             200     3398981us      11.79%
> NewtonIntegrator            200     4997676us      17.33%
> TOTAL                              28839750us     100.00%
>
> Common time 629.34264183 s
>
>
> Calculation velocity is unstable, try to close all programs and start
> performance tests again
> 5037 spheres, velocity= 242.232297207 +- 18.7054194438 %
> 25103 spheres, velocity= 78.2112705997 +- 4.19360243937 %
> 50250 spheres, velocity= 46.6877664726 +- 2.81481812835 %
> 100467 spheres, velocity= 19.9932164704 +- 3.06039659404 %
> 200813 spheres, velocity= 6.92396036557 +- 0.361116951928 %
>
>
> SCORE: 13272
> Number of threads 8
> =============== 12 threads =============================
> number of bodies 200813
>
> Elapsed 29.2484679222 sec
> Performance 6.83796500151 iter/sec
> Extrapolation on 1e5 iters 4.06228721142 hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                      Count          Time   Rel. time
> ---------------------------------------------------------
> ForceResetter               200     7943958us      27.17%
> InsertionSortCollider         7    15713441us      53.75%
> InteractionLoop             200     2522508us       8.63%
> NewtonIntegrator            200     3055652us      10.45%
> TOTAL                              29235560us     100.00%
>
> Common time 667.634572983 s
>
>
> 5037 spheres, velocity= 189.874951285 +- 9.74398679139 %
> 25103 spheres, velocity= 79.4292831485 +- 6.59393629842 %
> 50250 spheres, velocity= 48.2684323576 +- 4.29336410346 %
> 100467 spheres, velocity= 19.2778991779 +- 6.87288661534 %
> 200813 spheres, velocity= 7.05669848487 +- 2.29609774368 %
>
>
> SCORE: 12914
> Number of threads 12
>
> =============== 16 threads =============================
> number of bodies 200813
>
> Elapsed 27.1387059689 sec
> Performance 7.36954813651 iter/sec
> Extrapolation on 1e5 iters 3.7692647179 hours
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
> Name                      Count          Time   Rel. time
> ---------------------------------------------------------
> ForceResetter               200     5765158us      21.26%
> InsertionSortCollider         7    15685115us      57.84%
> InteractionLoop             200     2002979us       7.39%
> NewtonIntegrator            200     3665586us      13.52%
> TOTAL                              27118839us     100.00%
>
> Common time 781.653450966 s
>
>
> 5037 spheres, velocity= 155.295128456 +- 5.31523351848 %
> 25103 spheres, velocity= 58.9500296071 +- 7.67003146996 %
> 50250 spheres, velocity= 38.5475112683 +- 2.84583454585 %
> 100467 spheres, velocity= 17.2375970816 +- 6.15206324777 %
> 200813 spheres, velocity= 6.87034005987 +- 7.15657372906 %
>
> SCORE: 11009
> Number of threads 16
>
>
> matthias
>
> On 10.04.2014 12:58, Bruno Chareyre wrote:
>> On 10/04/14 02:01, Klaus Thoeni wrote:
>>> just to clarify, Test 2 is done by increasing the number of iterations
>>> (1x, 3x and 12x the number of iterations specified in checkPerf.py).
>>> This means the number of interactions should increase as well and,
>>> hence, particle velocities should decrease because of more interactions.
>> That is what I was thinking. And more interactions means less (relative)
>> time spent in collider.
>>
>>> I added a table with the collider scaling factor for 1 million
>>> particles and iter x 12.
>> Thanks! So there is still an optimum near 12-14. It may be possible to
>> improve (choosing appropriate chunk sizes internally), but it needs
>> serious testing.
>>
>>> Note your T(j8)=T(j1)/5.8 is actually T(j8)=T(j1)/4.8. Where did you
>>> get the number from? You must look into the uploaded files in order
>>> to get these numbers
>> I used the x1 line since I was not expecting any influence of the number
>> of steps on the collider's performance:
>> 187/20=5.8
>> Now I see it is different with other lines. Weird.
>>
>> Bruno
>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~yade-dev
>> Post to : yade-dev@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~yade-dev
>> More help : https://help.launchpad.net/ListHelp
>
>
--
_______________
Bruno Chareyre
Associate Professor
ENSE³ - Grenoble INP
Lab. 3SR
BP 53
38041 Grenoble cedex 9
Tél : +33 4 56 52 86 21
Fax : +33 4 76 82 70 43
________________