yade-users team mailing list archive

Re: [Question #244277]: yade-batch much slower than yade

 

Question #244277 on Yade changed:
https://answers.launchpad.net/yade/+question/244277

matthias  proposed the following answer:
I have noticed some possibly similar issues.

1. Large simulations take a lot of RAM. If the system starts to swap, you will notice it, because swapping is far more than 6 times slower.
But there may also be a bottleneck between CPU and memory. If you run a single simulation, the whole RAM bandwidth is available to that one simulation; if you run several at once, each simulation gets only a small piece of it. This is where you can see whether your system layout is suited to memory-intensive tasks. If each CPU socket has its own "exclusive" RAM slots, a job can work on local RAM and the CPU does not have to communicate with other CPUs; this is the best case. If there is only global RAM, or the CPU has to fetch data from other CPUs, then it gets slow. Simulations with a huge number of particles also benefit less from caches, because of their huge memory consumption.
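
To check whether a machine has per-socket local memory (a NUMA layout), something like the following may help. It is only a sketch and assumes a Linux system that exposes NUMA nodes under /sys/devices/system/node; the function name numa_nodes is made up for illustration and is not part of Yade.

```python
import glob
import os


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return {node_name: cpulist} for every NUMA node found under sysfs_root.

    Assumes the Linux sysfs layout nodeN/cpulist; returns an empty dict
    if the directory does not exist (e.g. on non-Linux systems).
    """
    nodes = {}
    for node_dir in sorted(glob.glob(os.path.join(sysfs_root, "node[0-9]*"))):
        try:
            with open(os.path.join(node_dir, "cpulist")) as f:
                nodes[os.path.basename(node_dir)] = f.read().strip()
        except OSError:
            # Skip nodes whose cpulist cannot be read.
            continue
    return nodes


if __name__ == "__main__":
    for name, cpus in numa_nodes().items():
        print(name, "cpus:", cpus)
```

More than one node listed suggests that each socket has its own local RAM, so it can matter on which cores a job runs.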

There is a tradeoff between running many simulations side by side (in
this case distributed) and running one simulation in parallel. Yade's
code is not well parallelized, so you get no linear speedup, and in my
experience more than 4 threads are not useful. On the other hand, there
is the bottleneck between RAM and CPU when all jobs have to be served
with their data.
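
The thread side of this tradeoff can be sketched with Amdahl's law. This is a rough model, not a measurement: the parallel fraction of 0.75 and the 8 cores are assumed values chosen for illustration, and the model ignores the RAM bandwidth bottleneck described above, so it is too optimistic for many jobs running at once.

```python
def amdahl_speedup(threads, parallel_fraction):
    """Speedup of one job on `threads` threads according to Amdahl's law."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads)


def batch_throughput(cores, threads_per_job, parallel_fraction):
    """Total work rate when `cores` are split into jobs of equal thread count.

    Measured in units of "one single-threaded job"; memory bandwidth
    contention between the jobs is NOT modelled.
    """
    jobs = cores // threads_per_job
    return jobs * amdahl_speedup(threads_per_job, parallel_fraction)


if __name__ == "__main__":
    PARALLEL_FRACTION = 0.75  # assumption, not a measured value for Yade
    CORES = 8                 # assumption
    for t in (1, 2, 4, 8):
        print("threads/job=%d  speedup=%.2f  batch throughput=%.2f" % (
            t,
            amdahl_speedup(t, PARALLEL_FRACTION),
            batch_throughput(CORES, t, PARALLEL_FRACTION)))
```

In this model the speedup of a single job flattens quickly as threads are added, while many single-threaded jobs look best on paper; in practice the shared RAM bandwidth pulls that second number down.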

2. Compiling Yade
I also use Yade on an HPC cluster running Bull Linux. The default compiler on this machine is the Intel compiler, which normally generates really fast code. But there is an issue with OpenBLAS when compiling Yade with the Intel compiler, so the sysop uses GCC, possibly with suboptimal optimization settings. The result is code that is 2 times slower than the Ubuntu binary package on my Core i7. So different builds can also lead to slower runs.

matthias
