yade-dev team mailing list archive

Re: parallel collider - testing needed

 

Hi Bruno,

I did some tests with your new collider:

My "old" machine (2 CPU sockets with 4 cores each, Intel(R) Xeon(R) CPU X5460 @ 3.16GHz) reports:


yade-trunk -j4 --performance

Welcome to Yade 2014-02-18.git-af75797
.....
number of bodies 200813

Elapsed  74.6882498264  sec
Performance  2.67779738399  iter/sec
Extrapolation on 1e5 iters  10.3733680314  hours
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
Name                                     Count            Time   Rel. time
--------------------------------------------------------------------------
ForceResetter                              200       2625848us       3.52%
InsertionSortCollider                        7      21494603us      28.79%
InteractionLoop                            200      32631323us      43.70%
NewtonIntegrator                           200      17913859us      23.99%
TOTAL                                               74665635us     100.00%

Common time  3845.09048295 s


Calculation velocity is unstable, try to close all programs and start performance tests again
5037  spheres, velocity= 44.7832284176 +- 60.1189421161 %
25103  spheres, velocity= 17.4121076601 +- 0.99355345037 %
50250  spheres, velocity= 10.0714940216 +- 1.53896666769 %
100467  spheres, velocity= 5.05891811219 +- 0.434738330959 %
200813  spheres, velocity= 2.65826879857 +- 0.933088603948 %


SCORE: 3479
Number of threads  4

....

###########################################################

yade-parallel -j4 --performance (your pc branch)

Welcome to Yade 2014-02-24.git-b60d388
.....
number of bodies 200813

Elapsed  75.6688189507  sec
Performance  2.64309662518  iter/sec
Extrapolation on 1e5 iters  10.5095581876  hours
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
Name                                     Count            Time   Rel. time
--------------------------------------------------------------------------
ForceResetter                              200       2600100us       3.44%
InsertionSortCollider                        7      20746020us      27.43%
InteractionLoop                            200      34455725us      45.55%
NewtonIntegrator                           200      17838205us      23.58%
TOTAL                                               75640051us     100.00%

Common time  4093.34840894 s


Calculation velocity is unstable, try to close all programs and start performance tests again
5037  spheres, velocity= 44.3999135517 +- 61.0812025756 %
25103  spheres, velocity= 16.8531534243 +- 1.32470154863 %
50250  spheres, velocity= 9.61504490252 +- 0.670186229301 %
100467  spheres, velocity= 4.86679881913 +- 0.487840014886 %
200813  spheres, velocity= 2.64490152313 +- 0.285084118261 %


SCORE: 3402
Number of threads  4

######################################################


On my computer there seems to be almost no speedup ...

htop shows that -j4 --performance is using 4 threads, but all of them on just 1 core ...
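
One possible cause (an assumption, not something diagnosed in this thread) is a CPU affinity mask that pins the process to a single core. Below is a minimal Python 3 sketch for checking this on Linux. The name allowed_cpus is made up for illustration. A process inherits its mask from its parent, so running this from the same shell only reveals a restriction that was set before Yade starts:

```python
import os

def allowed_cpus():
    """Return the set of CPU indices this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Available on Linux since Python 3.3; 0 means "the current process".
        return os.sched_getaffinity(0)
    # Fallback for platforms without sched_getaffinity: assume no restriction.
    return set(range(os.cpu_count() or 1))

cpus = allowed_cpus()
print("allowed CPUs:", sorted(cpus))
if len(cpus) < 4:
    print("fewer than 4 CPUs allowed: the -j4 threads would have to share cores")
```

If the printed list has a single entry, all four threads have to share that core.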

Regards,

Christian



Quoting Bruno Chareyre <bruno.chareyre@xxxxxxxxxxx>:

Hi there,
I implemented a parallel version of the InsertionSortCollider. It is
almost ready but not yet pushed to the main trunk, as I have a few
things to check before that.
It would be helpful if some of you could 1/ test that your scripts work
correctly and 2/ benchmark this for N>100k and j>4.
If you run benchmarks, please remember to always activate timing and
report the result of timing.stats(). It gives much more interesting data
than the wall clock time.

Preliminary benchmark results are below (from my laptop...), showing a
speedup by a factor of 2 on the total computation time for j4/200k
particles (compared to the sequential collider).
The speedup on the collider alone is in fact of the order of x3.68 for 4
threads. That is nearly linear, at least for such a small number of threads.
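
These ratios can be recomputed from the two timing tables at the end of this message. A small sketch in plain Python follows. The dictionary and function names are only illustrative, and the numbers are copied from the tables:

```python
# Per-engine times in microseconds, copied from the timing tables in this message.
trunk = {
    "ForceResetter": 700881,
    "InsertionSortCollider": 18816625,
    "InteractionLoop": 6581283,
    "NewtonIntegrator": 3293119,
}
pc = {
    "ForceResetter": 671157,
    "InsertionSortCollider": 5145114,
    "InteractionLoop": 6545848,
    "NewtonIntegrator": 3030989,
}

def speedups(before, after):
    """Ratio of old time to new time for each engine."""
    return {name: before[name] / after[name] for name in before}

def total_speedup(before, after):
    """Ratio of the summed engine times."""
    return sum(before.values()) / sum(after.values())

for name, ratio in speedups(trunk, pc).items():
    print(f"{name:24s} x{ratio:.2f}")
print(f"{'TOTAL':24s} x{total_speedup(trunk, pc):.2f}")
```

With these numbers the collider ratio comes out at roughly 3.66 and the total at roughly 1.91.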

My expectation is that it should change almost nothing for a small number
of particles (say, N<10k), where colliding is an inexpensive step.
For 1 million particles, on the other hand, there could be a significant
speedup, since the collider takes most of the time.
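
This expectation matches Amdahl's law: if the collider takes a fraction f of the run time and only the collider gets faster by a factor s, the overall speedup is 1/((1-f) + f/s). A rough sketch follows. The values f=0.64 and s=3.66 come from the 200k timing tables in this message, while 0.05 and 0.90 are made-up fractions standing in for small and very large N:

```python
def amdahl(fraction, part_speedup):
    """Overall speedup when only `fraction` of the run time is sped up by `part_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / part_speedup)

# 0.64 is the collider share measured on trunk at 200k bodies;
# 0.05 and 0.90 are hypothetical shares for small and very large N.
for f in (0.05, 0.64, 0.90):
    print(f"collider share {f:.2f} -> overall x{amdahl(f, 3.66):.2f}")
```

The smaller the collider's share, the closer the overall factor stays to 1, whatever the speedup of the collider itself.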

You can get the "pc" branch at my github repo:
git clone -b pc https://github.com/bchareyre/trunk.git

Results of yade -j4 --performance are below (i7 quad-core with
hyperthreading enabled, lightly loaded by background tasks; j>4 is not
reported, as hyperthreading probably does no good).

Happy benchmarking. :)

Bruno


====================
./yade-trunk -j4 --performance  (the current trunk)
.......
number of bodies 200813

Elapsed  29.4102840424  sec
Performance  6.80034234664  iter/sec
Extrapolation on 1e5 iters  4.08476167255  hours
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
Name                                     Count            Time   Rel. time
--------------------------------------------------------------------------
ForceResetter                              200        700881us       2.38%
InsertionSortCollider                        7      18816625us      64.02%
InteractionLoop                            200       6581283us      22.39%
NewtonIntegrator                           200       3293119us      11.20%
TOTAL                                               29391910us     100.00%

Common time  597.731503963 s


5037  spheres, velocity= 327.689688709 +- 5.13604387635 %
25103  spheres, velocity= 81.2726909754 +- 1.0105334405 %
50250  spheres, velocity= 45.4114521341 +- 3.02333274436 %
100467  spheres, velocity= 19.0287424005 +- 2.26073439157 %
200813  spheres, velocity= 6.51664351023 +- 4.03351515402 %


SCORE: 13777
Number of threads  4


========================
./yade-parallel -j4 --performance  (my "pc" branch)
....

number of bodies 200813

Elapsed  15.4320101738  sec
Performance  12.9600744004  iter/sec
Extrapolation on 1e5 iters  2.14333474636  hours
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
Name                                     Count            Time   Rel. time
--------------------------------------------------------------------------
ForceResetter                              200        671157us       4.36%
InsertionSortCollider                        7       5145114us      33.42%
  boundDispatcher                            7         93186us       1.81%
  bound                                      7            12us       0.00%
  copy                                       7        160891us       3.13%
  erase                                      7         66932us       1.30%
  sort&collide                               7       4824071us      93.76%
  TOTAL                                     35       5145095us     100.00%
InteractionLoop                            200       6545848us      42.52%
NewtonIntegrator                           200       3030989us      19.69%
TOTAL                                               15393110us     100.00%

Common time  460.37680912 s


5037  spheres, velocity= 365.599773471 +- 8.02397068512 %
25103  spheres, velocity= 92.0077536966 +- 3.81069496509 %
50250  spheres, velocity= 54.1683980588 +- 0.528288534811 %
100467  spheres, velocity= 25.7134767981 +- 1.0796373464 %
200813  spheres, velocity= 12.6488486429 +- 4.66276699319 %


SCORE: 18800
Number of threads  4


_______________________________________________
Mailing list: https://launchpad.net/~yade-dev
Post to     : yade-dev@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~yade-dev
More help   : https://help.launchpad.net/ListHelp