yade-dev team mailing list archive
Message #10500
Re: parallel collider - testing needed
Quoting Bruno Chareyre <bruno.chareyre@xxxxxxxxxxx>:
I forgot to mention two things:
1- I tried the benchmark used by Christian for comparisons with PFC [1],
however it seems that this test is very special. I get large differences
between two runs. Basically, it seems the simulation only depends on
truncation errors: vertical columns of spheres remain stable until a
small bit of horizontal noise makes them fall down one by one. If you
I rotated the wall below a little bit so that it slopes slightly. This
is the reason why the columns can collapse (not because of truncation error):
# rotation quaternion:
orientationWall = Quaternion(Vector3(.01, .01, 1), math.pi)
# create box:
id_box = O.bodies.append(utils.box((origin_wall, origin_wall, -.5), (200, 200, .5), orientationWall, fixed=True, material=WallMat))
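To get a feeling for how small that slope is, here is a rough standalone sketch in plain Python (no Yade needed). It assumes the axis passed to Quaternion is effectively normalized and that the wall's face normal is the z axis before rotation; the helper name rotate is only for illustration.

```python
import math

def rotate(v, axis, angle):
    """Rotate vector v about axis by angle (Rodrigues' formula).
    The axis is normalized here, which is an assumption about how the
    quaternion is built from (axis, angle)."""
    n = math.sqrt(sum(a * a for a in axis))
    k = [a / n for a in axis]
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    return [v[i] * c + k_cross_v[i] * s + k[i] * k_dot_v * (1 - c) for i in range(3)]

# Face normal of the flat box (thin in z) after the rotation used above.
normal = rotate([0.0, 0.0, 1.0], [0.01, 0.01, 1.0], math.pi)
# Angle between the rotated normal and the vertical, i.e. the slope of the wall.
tilt_deg = math.degrees(math.acos(abs(normal[2])))
print(f"wall tilt: {tilt_deg:.2f} degrees")
```

Under these assumptions the wall ends up tilted by a bit more than one and a half degrees, which is enough to make the columns topple deterministically.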
As you mentioned in a previous post, we should define two benchmarking
scripts: one for quasi-static simulations and one for dynamic ones.
The one I used for the comparison with PFC is quasi-static at the beginning
and then turns into a dynamic one.
It does not seem to be the best choice for a benchmark.
look at the simulation in the GUI it looks strange. I did not insist
with this one, I think it could be improved by replacing the lattice by
disordered packings.
2- The benchmark done by Alexander some time ago (on the same problem
but with -j>1) is not visible anywhere if I'm not wrong. I have a copy
of the pdf, is it ok to upload it on the wiki? It is an interesting
starting point for evaluating the parallel collider.
Bruno
[1] https://www.yade-dem.org/wiki/Comparisons_with_PFC3D
On 24/02/14 16:36, Bruno Chareyre wrote:
Hi there,
I implemented a parallel version of the InsertionSortCollider. It is
almost ready but not yet pushed to the main trunk, as I have a few
things to check before that.
It would be helpful if some of you could 1/ test that your scripts work
correctly and 2/ benchmark this for N>100k and j>4.
If you run benchmarks, please remember to always activate timing and
report the result of timing.stats(). It gives much more interesting data
than the wall clock time.
Preliminary benchmark results are below (from my laptop...), showing a
speedup by a factor of 2 on the total computation time for j4/200k
particles (compared to the sequential collider).
The speedup on the collider alone is in fact of the order of x3.68 for 4
threads: nearly linear, at least for such a small number of threads.
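For anyone who wants to recheck these ratios, a minimal sketch in Python follows. It uses only the microsecond timings from the two timing tables pasted at the end of this message; the dictionary and function names are made up for the example.

```python
# Timings in microseconds, copied from the timing tables in this message.
trunk = {"collider": 18816625, "total": 29391910}  # current trunk, -j4
pc = {"collider": 5145114, "total": 15393110}      # "pc" branch, -j4

def speedup(before, after):
    """Ratio of the old time to the new time."""
    return before / after

collider_speedup = speedup(trunk["collider"], pc["collider"])
total_speedup = speedup(trunk["total"], pc["total"])
print(f"collider: x{collider_speedup:.2f}, total: x{total_speedup:.2f}")
```

With the particular run pasted below, the collider ratio comes out slightly under the figure quoted above, which is within the run-to-run variation one would expect on a laptop.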
My expectation is that it should change almost nothing for a small number
of particles (say, N<10k), where colliding is an inexpensive step.
For 1 million particles, OTOH, there could be a significant speedup,
since the collider takes most of the time.
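This expectation is essentially Amdahl's law applied to the collider. A small sketch in Python, where only the 64% share (trunk run at 200k particles) and the x3.68 collider speedup come from this message; the 5% and 90% shares are hypothetical placeholders for a small and a very large simulation, not measurements.

```python
def overall_speedup(p, s):
    """Amdahl's law: p is the fraction of run time spent in the collider,
    s is the speedup of the collider alone."""
    return 1.0 / ((1.0 - p) + p / s)

s = 3.68  # collider speedup quoted above for 4 threads
cases = [
    ("200k particles, collider share 64% (measured on trunk)", 0.64),
    ("small case, collider share 5% (hypothetical)", 0.05),
    ("large case, collider share 90% (hypothetical)", 0.90),
]
for label, p in cases:
    print(f"{label}: x{overall_speedup(p, s):.2f}")
```

The estimate ignores any change in the other engines, so it is only a rough guide to what a benchmark at a given N might show.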
You can get the "pc" branch at my github repo:
git clone -b pc https://github.com/bchareyre/trunk.git
Results of yade -j4 --performance are below (i7 quad-core with
hyperthreading enabled, lightly loaded by background tasks; j>4 is not
reported, as hyperthreading is probably doing no good).
Happy benchmarking. :)
Bruno
====================
./yade-trunk -j4 --performance (the current trunk)
.......
number of bodies 200813
Elapsed 29.4102840424 sec
Performance 6.80034234664 iter/sec
Extrapolation on 1e5 iters 4.08476167255 hours
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
Name                                         Count              Time   Rel. time
-------------------------------------------------------------------------------------------------------
ForceResetter                                  200          700881us       2.38%
InsertionSortCollider                            7        18816625us      64.02%
InteractionLoop                                200         6581283us      22.39%
NewtonIntegrator                               200         3293119us      11.20%
TOTAL                                                     29391910us     100.00%
Common time 597.731503963 s
5037 spheres, velocity= 327.689688709 +- 5.13604387635 %
25103 spheres, velocity= 81.2726909754 +- 1.0105334405 %
50250 spheres, velocity= 45.4114521341 +- 3.02333274436 %
100467 spheres, velocity= 19.0287424005 +- 2.26073439157 %
200813 spheres, velocity= 6.51664351023 +- 4.03351515402 %
SCORE: 13777
Number of threads 4
========================
./yade-parallel -j4 --performance (my "pc" branch)
....
number of bodies 200813
Elapsed 15.4320101738 sec
Performance 12.9600744004 iter/sec
Extrapolation on 1e5 iters 2.14333474636 hours
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
Name                                         Count              Time   Rel. time
-------------------------------------------------------------------------------------------------------
ForceResetter                                  200          671157us       4.36%
InsertionSortCollider                            7         5145114us      33.42%
  boundDispatcher                                7           93186us       1.81%
  bound                                          7              12us       0.00%
  copy                                           7          160891us       3.13%
  erase                                          7           66932us       1.30%
  sort&collide                                   7         4824071us      93.76%
  TOTAL                                         35         5145095us     100.00%
InteractionLoop                                200         6545848us      42.52%
NewtonIntegrator                               200         3030989us      19.69%
TOTAL                                                     15393110us     100.00%
Common time 460.37680912 s
5037 spheres, velocity= 365.599773471 +- 8.02397068512 %
25103 spheres, velocity= 92.0077536966 +- 3.81069496509 %
50250 spheres, velocity= 54.1683980588 +- 0.528288534811 %
100467 spheres, velocity= 25.7134767981 +- 1.0796373464 %
200813 spheres, velocity= 12.6488486429 +- 4.66276699319 %
SCORE: 18800
Number of threads 4
_______________________________________________
Mailing list: https://launchpad.net/~yade-dev
Post to : yade-dev@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~yade-dev
More help : https://help.launchpad.net/ListHelp