
yade-users team mailing list archive

Re: parallel?

 

2009/11/4 kan <nicessgg@xxxxxxxxx>

> Thanks~
>
> I hear that OpenMP is based on shared-memory parallel computing and that it
> needs random access to the data (I am not sure about this, I have only heard
> it; if I am wrong, please correct me, thanks). In my code the contacts are
> stored in a list, which is not randomly accessible. If it is true that
> OpenMP needs random access, then how does YADE do that? Is it OK to run
> YADE on a distributed-memory architecture, like a cluster? I saw somebody
> run it on a cluster, but I do not know that cluster's architecture. In a
> distributed-memory architecture the memory is not directly accessible,
> which increases the programming difficulty.
>
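OpenMP's "parallel for" splits an integer index range (or a random-access
iteration space) among threads, so a linked list of contacts is usually
flattened into an indexable vector first. A minimal sketch of that idea,
with a made-up Contact struct (this is not YADE code):

#include <list>
#include <vector>

struct Contact { double overlap; double force; };

void processContacts(std::list<Contact>& contacts){
    // Flatten the list so the parallel loop below has random access;
    // this copy is O(n) and cheap next to the per-contact physics.
    std::vector<Contact*> flat;
    flat.reserve(contacts.size());
    for(std::list<Contact>::iterator it=contacts.begin(); it!=contacts.end(); ++it)
        flat.push_back(&*it);

    // OpenMP splits this index range among the available threads.
    #pragma omp parallel for
    for(long i=0; i<(long)flat.size(); ++i){
        flat[i]->force += flat[i]->overlap;  // placeholder for the real contact law
    }
}

Compiled with -fopenmp (gcc); without it the pragma is simply ignored and
the loop runs serially.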
> Currently my speed is 3~4 seconds per iteration for about 100,000
> particles, and 33 seconds per iteration for 988,000 particles. I use
> threads for the parallel computing, but the speedup is only 2.2x
> (15 seconds per iteration for 988,000 particles). This speed is terribly
> bad, because dt is on the order of 10^(-7) seconds for
>
Oh, this speedup is based on 5 threads on a 2x4-core computer.

> the simulation of rock particles.
>
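Taking the numbers above at face value:

  speedup ≈ 33 s / 15 s ≈ 2.2
  steps per simulated second ≈ 1 s / 10^(-7) s = 10^7
  wall time per simulated second ≈ 10^7 x 15 s ≈ 1.5x10^8 s (on the order of years)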
> Thanks
>
> Yongfeng
>
> 2009/11/4 Václav Šmilauer <eudoxos@xxxxxxxx>
>
>
>>
>> > What is the parallel structure used in YADE? I remember reading in one
>> > mail (on the mailing list) that YADE does not use the domain
>> > decomposition method; what is the parallel method, then?
>>
>> Yade parallelizes loops over bodies and interactions using OpenMP; see
>> notably
>>
>> http://bazaar.launchpad.net/%7Eyade-dev/yade/trunk/annotate/head%3A/pkg/common/Engine/MetaEngine/InteractionDispatchers.cpp#L40
>>
>> http://bazaar.launchpad.net/%7Eyade-dev/yade/trunk/annotate/head%3A/pkg/dem/Engine/StandAloneEngine/NewtonsDampedLaw.cpp#L57
>>
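A generic illustration of that kind of loop-level parallelism (this is not
the code behind the links above; the Interaction struct and names are
invented): each thread handles a chunk of the interaction container, and
updates of shared per-body data need protection, e.g. atomics or per-thread
accumulators.

#include <vector>

struct Interaction { int id1, id2; double fn; };

void applyInteractions(std::vector<Interaction>& intrs, std::vector<double>& bodyForce){
    #pragma omp parallel for
    for(long i=0; i<(long)intrs.size(); ++i){
        const Interaction& I=intrs[i];
        double f=I.fn;                 // stand-in for the real contact law
        // Two threads may update the same body concurrently, hence atomic.
        #pragma omp atomic
        bodyForce[I.id1]+=f;
        #pragma omp atomic
        bodyForce[I.id2]-=f;
    }
}

How Yade actually organizes these loops is in the two files linked above.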
>> Speedup depends very much on the computer architecture, RAM speed, cache
>> size etc. (OpenMP is shared-memory parallelization). I get a speedup of
>> over 3x on 4 cores (i7 with DDR3 RAM), and recently I had 5.78 on a
>> 2x4-core Xeon X5570 machine.
>>
>> I don't know of anyone running on anything larger than 8 cores; it
>> might scale further, especially for large simulations, where the OpenMP
>> overhead and the non-parallel portions of the computation (the collider,
>> for instance) don't play a large role.
>>
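A rough way to read such numbers, assuming Amdahl's law with a fixed
problem size: if a fraction p of the per-step work is parallelized, the
speedup on N cores is

  S = 1 / ((1 - p) + p/N)

The reported S = 5.78 on N = 8 corresponds to p ≈ 0.945, i.e. only about
5% serial work per step, which by itself would cap the speedup near
1/(1-p) ≈ 18 even on many more cores.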
>> (If you have 2 engines that are independent and don't touch the same
>> data, there is ParallelEngine for that, but I don't know of any case
>> where it really pays off; maybe coupled problems could benefit from it.)
>>
>> Cheers, Vaclav
>>
>>
>
>
