
yade-dev team mailing list archive

Re: [Bug 729079] Re: Performance optimization of InsertionSortCollider

 

OK, so in the end you don't need to create a branch through the Launchpad
interface. You just type "bzr push --no-strict lp:~bruno-chareyre/yade/collide",
and it creates a branch called "collide".

You can get this branch with
"bzr branch lp:~bruno-chareyre/yade/collide".

There are different sorts of changes (I'll need to clean up before merging):
- 1. some are between #ifdef ORI_VERLET guards (mostly those adding some
data to bounds and engines); ORI_VERLET is defined by default, in Bound.hpp
- 2. some are hardcoded and always apply (whatever the values of nBin,
ORI_VERLET and Collider::oriVerlet)
- 3. some are activated or not at runtime, depending on the oriVerlet
attribute of the collider. If oriVerlet=false (it is true by default), the old
bins algorithm is used, but it still benefits from the changes in 1 and 2.

Using the default behaviour of the branch will give some of the best
performance according to my tests (using the default verletDist=-0.15),
although verletDist<-0.15 can give better performance in some cases (very
high numbers of particles).
The best speedup is apparently obtained with 100k particles or more,
but I haven't done any serious benchmarking yet.
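
To make the verletDist values above concrete, here is a minimal sketch of how
a negative value turns into an absolute length. It assumes the convention
documented for InsertionSortCollider.verletDist in Yade (a negative value is
multiplied by the minimum sphere radius); whether this branch keeps exactly
that rule is an assumption, and the function name and radii are hypothetical.

```python
def effective_verlet_dist(verlet_dist, radii):
    """Absolute length by which bounds would be enlarged.

    Assumed convention: a non-negative value is already an absolute length;
    a negative value is a factor applied to the smallest sphere radius.
    """
    if verlet_dist >= 0:
        return verlet_dist
    if not radii:
        # No spherical particles: nothing to scale by, treat as disabled.
        return 0.0
    return abs(verlet_dist) * min(radii)


if __name__ == "__main__":
    radii = [0.010, 0.012, 0.020]  # hypothetical sphere radii
    for vd in (-0.15, -0.30):
        print(vd, effective_verlet_dist(vd, radii))
```

A more negative verletDist gives larger bounds, so the collider should need to
run less often, at the cost of more potential interactions to track.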

Any feedback welcome.

-- 
You received this bug notification because you are a member of Yade
developers, which is the registrant for Yade.
https://bugs.launchpad.net/bugs/729079

Title:
  Performance optimization of InsertionSortCollider

Status in Yet Another Dynamic Engine:
  New

Bug description:
  Sergei Dorofeenko (https://launchpad.net/~sergei.dorofeenko) found
  that InsertionSortCollider is probably a bottleneck in
  simulations with more than 10^5 particles, even in multi-threaded mode.

  http://www.mail-archive.com/yade-dev@xxxxxxxxxxxxxxxxxxx/msg06573.html

  Citation:
  "...I did a perfomance test for parallel mode and results in no good.

  Performance boost only about 40% from 1 thread to 4 thread for 200k
  particles... Cause is a non-parallelised InsertionSortCollider, who
  need about 80% time with 4 threads.

  Results attached."
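
  As a rough consistency check of the figures quoted above, the sketch below
  applies Amdahl's law. It assumes the "about 40%" boost means a speedup of
  exactly 1.4 on 4 threads and that the run splits cleanly into a serial part
  and a perfectly parallel part; both are simplifications, not measurements.

```python
def parallel_fraction(speedup, threads):
    """Solve Amdahl's law, 1/speedup = (1 - p) + p/threads, for p."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / threads)


def serial_share(p, threads):
    """Share of wall-clock time spent in serial code when using `threads`."""
    serial = 1.0 - p
    return serial / (serial + p / threads)


if __name__ == "__main__":
    p = parallel_fraction(1.4, 4)   # assumed speedup of 1.4 on 4 threads
    share = serial_share(p, 4)
    print(f"parallel fraction p = {p:.3f}")
    print(f"serial share with 4 threads = {share:.3f}")
```

  Under these assumptions the serial code takes roughly 87% of the 4-thread run
  time, which is of the same order as the "about 80%" attributed to the
  collider.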

