yade-users team mailing list archive

Re: Triax profiling on cluster

To: yade-users@xxxxxxxxxxxxxxxxxxx
From: Janek Kozicki <janek_listy@xxxxx>
Date: Thu, 24 Sep 2009 10:54:22 +0200
In-reply-to: <4ABB3100.8050009@arcig.cz>

Václav Šmilauer said:     (by the date of Thu, 24 Sep 2009 10:42:40 +0200)

> L1 cache is certainly not useless even for DEM, it's just that all your 
> data will not fit inside. But still, if one part of your data is at one 
> memory location (not a chain of shared_ptr's jumping all over the RAM), it 
> makes the computation much faster (e.g. Dem3Dof classes have comparable 
> speed to SpheresContactGeometry even if they copy an extra 
> Vector3r+Quaternionr (=28b of data) at every step). There are some papers 
> [1] on that; speeds of the L1 cache are orders of magnitude higher than 
> the speed of the CPU-RAM bus and of the RAM modules themselves.
> 
> [1] http://people.redhat.com/drepper/cpumemory.pdf

Yes, I know that. But cluster benchmarks show that when a 16-CPU
machine has all 16 cores at 100% load, it calculates at half the
speed it reaches when only 4 CPUs are used and the remaining 12 sit
idle. This must come down to RAM speed, or call me crazy.
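
The RAM-speed argument can be illustrated with a toy model of a shared
memory bus. This is only a sketch: the function name per_core_speed and
the figures (1 GB/s of memory traffic per core, an 8 GB/s bus) are
hypothetical values picked so that the model reproduces the "half the
speed" ratio. They are not measurements from the cluster.

```python
def per_core_speed(active_cores, core_demand_gbs, bus_gbs):
    """Relative speed of each busy core (1.0 = not slowed down).

    Toy model: every active core wants core_demand_gbs of memory
    traffic, and all cores share one bus that delivers bus_gbs.
    Once the total demand exceeds the bus capacity, the bandwidth is
    split evenly and every core slows down by the same factor.
    """
    if active_cores < 1:
        raise ValueError("need at least one active core")
    total_demand = active_cores * core_demand_gbs
    if total_demand <= bus_gbs:
        return 1.0  # the bus keeps up, cores run at full speed
    return bus_gbs / total_demand  # the bus is the bottleneck


# Hypothetical figures, chosen for illustration only.
CORE_DEMAND_GBS = 1.0
BUS_GBS = 8.0

for cores in (4, 8, 16):
    speed = per_core_speed(cores, CORE_DEMAND_GBS, BUS_GBS)
    print(cores, "busy cores -> relative speed per core:", speed)
```

In this model the slowdown only appears once the summed demand of the
busy cores exceeds what the bus can deliver, which would fit a machine
that is fine with 4 busy cores and slows down with 16.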

-- 
Janek Kozicki
