yade-dev team mailing list archive
Message #07480
Re: [Bug 729079] Re: Performance optimization of InsertionSortCollider
OK, so in the end you don't need to create a branch through the Launchpad interface.
You just type "bzr push --no-strict lp:~bruno-chareyre/yade/collide",
and it creates a branch called "collide".
You can get this branch with "bzr branch lp:~bruno-chareyre/yade/collide".
There are different sorts of changes (I'll need to clean up before merging):
- 1. some are between #ifdef ORI_VERLET guards (mostly those adding some
data to bounds and engines); ORI_VERLET is defined by default, in Bound.hpp
- 2. some are hardcoded and always apply (whatever nBin, ORI_VERLET,
Collider::oriVerlet)
- 3. some are activated or not at runtime, depending on the oriVerlet
attribute of the collider. If oriVerlet=false (it is true by default), the old
bins algorithm is used, but it is still improved by the changes in 1 and 2.
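To make the three kinds of changes concrete, here is a minimal sketch of how a compile-time guard, unconditional code and a runtime attribute can coexist. Only the names ORI_VERLET and oriVerlet come from the branch; SketchBound, SketchCollider, refPos and chooseAlgorithm are hypothetical names made up for illustration and are not the actual Yade classes or members.

```cpp
#include <cassert>
#include <string>

// In the branch ORI_VERLET is defined by default in Bound.hpp;
// it is defined here only so that this sketch compiles on its own.
#define ORI_VERLET

// Hypothetical stand-in for a bound (not Yade's Bound class).
struct SketchBound {
    double min = 0.0;
    double max = 0.0;
#ifdef ORI_VERLET
    // Kind 1: extra data that exists only when ORI_VERLET is defined.
    double refPos = 0.0; // hypothetical field
#endif
};

// Hypothetical stand-in for the collider.
struct SketchCollider {
    // Kind 3: runtime attribute, true by default as in the branch.
    bool oriVerlet = true;

    std::string chooseAlgorithm() const {
        // Kind 2: anything placed here runs whatever the settings are.
#ifdef ORI_VERLET
        if (oriVerlet) return "new";
#endif
        // Fallback: the old bins algorithm.
        return "bins";
    }
};
```

With the defaults the sketch selects the new code path; setting oriVerlet to false, or compiling without ORI_VERLET, falls back to the bins path.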
Using the default behaviour of the branch should give some of the best
performance according to my tests (with the default verletDist=-0.15),
although verletDist<-0.15 can give better performance in some cases (very
high numbers of particles).
The best speedup is apparently obtained with 100k particles or more,
but I haven't done any serious benchmarking yet.
Any feedback welcome.
--
You received this bug notification because you are a member of Yade
developers, which is the registrant for Yade.
https://bugs.launchpad.net/bugs/729079
Title:
Performance optimization of InsertionSortCollider
Status in Yet Another Dynamic Engine:
New
Bug description:
Sergei Dorofeenko (https://launchpad.net/~sergei.dorofeenko) found
that InsertionSortCollider is probably a bottleneck in
simulations with more than 10^5 particles, even in multi-threaded mode.
http://www.mail-archive.com/yade-dev@xxxxxxxxxxxxxxxxxxx/msg06573.html
Citation:
"...I did a performance test for parallel mode and the results are not good.
The performance boost is only about 40% from 1 thread to 4 threads for 200k
particles... The cause is the non-parallelised InsertionSortCollider, which
needs about 80% of the time with 4 threads.
Results attached."
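As a rough plausibility check of the quoted figures, the sketch below applies Amdahl's law. It assumes an idealised model in which one part of the run time parallelises perfectly and the rest is strictly serial, which is a simplification of a real simulation. The function names are made up for illustration; only the 40% boost and the 4 threads are taken from the quote.

```cpp
#include <cassert>
#include <cmath>

// Amdahl's law: speedup on n threads when a fraction p of the
// single-thread run time parallelises perfectly.
double amdahlSpeedup(double p, int n) {
    assert(n >= 1 && p >= 0.0 && p <= 1.0);
    return 1.0 / ((1.0 - p) + p / n);
}

// Inverse relation: the parallel fraction p implied by an observed
// speedup on n threads (requires n > 1).
double parallelFraction(double speedup, int n) {
    assert(n > 1 && speedup >= 1.0);
    return (1.0 - 1.0 / speedup) * n / (n - 1.0);
}

// Share of the n-thread wall time spent in the serial part.
double serialShare(double p, int n) {
    return (1.0 - p) / ((1.0 - p) + p / n);
}
```

Under these assumptions, parallelFraction(1.4, 4) is about 0.38, and serialShare for that fraction on 4 threads is about 0.87. That is the same order as the "about 80%" attributed to the collider, so the two quoted numbers are at least roughly consistent with a single serial bottleneck.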