yade-users team mailing list archive

Re: Triax profiling on cluster

 

Václav Šmilauer said:     (by the date of Thu, 24 Sep 2009 10:42:40 +0200)

> L1 cache is certainly not useless even for DEM; it's just that all your 
> data will not fit inside. Still, if one part of your data sits at one 
> memory location (not a chain of shared_ptr's jumping all over the RAM), it 
> makes the computation much faster (e.g. Dem3Dof classes have comparable 
> speed to SpheresContactGeometry even though they copy an extra 
> Vector3r+Quaternionr (=28b of data) at every step). There are some papers 
> [1] on that; the L1 cache is orders of magnitude faster than the 
> CPU-RAM bus and the RAM modules themselves.
> 
> [1] http://people.redhat.com/drepper/cpumemory.pdf

Yes, I know that. But cluster benchmarks show that when a 16-CPU
machine has all 16 cores at 100% load, it calculates at half the
speed it reaches when only 4 CPUs are used and the remaining 12 sit
idle. This must come down to RAM speed, or call me crazy.
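
To make the locality point from the quoted message concrete, here is a minimal C++ sketch that sums one field over the same data stored two ways: by value in one contiguous vector, and behind a vector of shared_ptr's. The Contact struct and the helper names (sum_contiguous, sum_indirect, seconds, fill) are made up for illustration and are not Yade classes. Note that objects allocated back to back in a loop often end up close together on the heap, so the gap in a toy run like this may be much smaller than in a long-running simulation where allocations are scattered.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for per-contact data; not a Yade class.
struct Contact {
    double normal[3];
    double penetration;
};

// Contacts stored by value: the loop walks one contiguous block of memory.
double sum_contiguous(const std::vector<Contact>& cs) {
    double s = 0.0;
    for (const Contact& c : cs) s += c.penetration;
    return s;
}

// Contacts reached through shared_ptr: every element is a separately
// allocated object, so each access follows a pointer.
double sum_indirect(const std::vector<std::shared_ptr<Contact>>& cs) {
    double s = 0.0;
    for (const auto& p : cs) s += p->penetration;
    return s;
}

// Wall-clock seconds taken by one call of f().
template <class F>
double seconds(F&& f) {
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

// Fill both layouts with n contacts carrying identical values.
void fill(std::size_t n,
          std::vector<Contact>& flat,
          std::vector<std::shared_ptr<Contact>>& ptrs) {
    flat.clear();
    ptrs.clear();
    flat.reserve(n);
    ptrs.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const Contact c{{0.0, 0.0, 1.0}, static_cast<double>(i % 7)};
        flat.push_back(c);
        ptrs.push_back(std::make_shared<Contact>(c));
    }
    assert(flat.size() == ptrs.size());
}
```

To compare the two layouts, call fill() with a size well beyond the cache (several million contacts, say), then time each sum with seconds(), storing the result in a variable that is printed afterwards so the compiler cannot drop the loop. Running several copies of such a test at once, one per core, is one rough way to see whether throughput stops scaling once the memory bus is saturated.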

-- 
Janek Kozicki


