yade-dev team mailing list archive
Message #11729
Re: Parallel loops over interactions to sum
OK, I was imagining something like that, but it is much clearer to me now! Thanks a lot Vaclav,
Jerome
________________________________
From: Yade-dev [yade-dev-bounces+jerome.duriez=ucalgary.ca@xxxxxxxxxxxxxxxxxxx] on behalf of Václav Šmilauer [eu@xxxxxxxx]
Sent: December 2, 2014 2:31 PM
To: yade-dev@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Yade-dev] Parallel loops over interactions to sum
Hi Jerome,
this has nothing to do with indeterminism (that has only a minor effect, due to rounding). What you have is the same variable being written to from different threads, so they overwrite each other's values, something like this:
1. thread 1 reads the value A from memory, computes A1=A+10
2. thread 2 reads the value A from memory, computes A2=A+44
3. thread 1 writes A1 back to memory
4. thread 2 writes A2 back to memory
So you end up with the "sum" A+44 instead of A+10+44.
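The interleaving above can be replayed deterministically in a single thread, which makes the lost update easy to see. This is only a sketch: the function name `simulateLostUpdate` is hypothetical and the two "threads" are simulated by ordinary local variables, so no real concurrency is involved.

```cpp
// Deterministic, single-threaded replay of the interleaving described above.
// `simulateLostUpdate` is a hypothetical name used only for illustration.
double simulateLostUpdate(double A) {
    double shared = A;        // the variable both threads want to update

    double read1 = shared;    // thread 1 reads A from memory
    double A1 = read1 + 10;   // thread 1 computes A1 = A + 10
    double read2 = shared;    // thread 2 reads A (still the old value)
    double A2 = read2 + 44;   // thread 2 computes A2 = A + 44

    shared = A1;              // thread 1 writes A1 back
    shared = A2;              // thread 2 overwrites it with A2
    return shared;            // A + 44: the +10 contribution is lost
}
```

Calling `simulateLostUpdate(100.0)` returns 144 rather than the intended 154. In a real parallel run the interleaving varies from run to run, so the error is not reproducible in this neat form.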
What you need is the "reduction" feature of OpenMP (sum is one of the available reductions), see the second and third examples at http://en.wikipedia.org/wiki/OpenMP#Clauses_in_work-sharing_constructs_.28in_C.2FC.2B.2B.29 . You can also do something like that yourself, e.g. by protecting the variable during the whole read-write time with a mutex (or by using the "critical" directive, as the third example shows).
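A minimal sketch of the reduction approach, assuming the per-interaction contribution can be reduced to a plain function of one value. The names `f` and `sumWithReduction` and the `std::vector<double>` input are placeholders for illustration, not Yade code; in the engine the loop body would read the interaction instead.

```cpp
#include <vector>

// Hypothetical stand-in for the per-interaction contribution.
double f(double x) { return 2.0 * x; }

// Sums f over all values using an OpenMP sum reduction.
double sumWithReduction(const std::vector<double>& values) {
    double something = 0;
    const long size = static_cast<long>(values.size());
    // Each thread accumulates into its own private copy of `something`;
    // OpenMP adds the private copies together when the loop ends.
    #pragma omp parallel for reduction(+:something) schedule(guided)
    for (long i = 0; i < size; i++) {
        something += f(values[i]);
    }
    return something;
}
```

If the compiler is run without OpenMP support, the pragma is ignored and the loop runs serially with the same result. With OpenMP enabled, the order of additions may differ between runs, so results can differ in the last digits due to rounding.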
Check that the code is really faster if you run it in parallel. If the only thing the loop does is to sum the values, then running in parallel will not help at all, since all threads will be waiting for the one having the exclusive access to finish anyway.
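One way to check this is to time the same loop with and without threads. The sketch below is an assumption-laden illustration: `costlyF`, `sumCostly` and `timeMs` are hypothetical names, and the artificial work inside `costlyF` only stands in for whatever the real per-interaction function costs.

```cpp
#include <chrono>
#include <cmath>
#include <vector>

// Hypothetical per-item work, made artificially costly for the comparison.
double costlyF(double x) {
    double r = x;
    for (int k = 0; k < 100; k++) r = std::sqrt(r + 1.0);
    return r;
}

// Sums costlyF over the values; `parallel` switches the OpenMP loop on or off.
double sumCostly(const std::vector<double>& values, bool parallel) {
    double total = 0;
    const long size = static_cast<long>(values.size());
    #pragma omp parallel for reduction(+:total) if(parallel)
    for (long i = 0; i < size; i++) {
        total += costlyF(values[i]);
    }
    return total;
}

// Wall-clock time of one call in milliseconds; the sum is stored in `result`.
double timeMs(const std::vector<double>& values, bool parallel, double& result) {
    const auto start = std::chrono::steady_clock::now();
    result = sumCostly(values, parallel);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Comparing `timeMs(values, false, a)` with `timeMs(values, true, b)` on realistic input sizes shows whether threading pays off. When the per-item work is cheap, the threading overhead can outweigh any gain.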
HTH, Václav
Hi,
I have one engine in which I am looping over interactions to compute something incrementally:
    something = 0
    for each interaction
        something += f(considered interaction)
In fact, I wrote this code using parallel loops, like this:
    Real something = 0;
    #ifdef YADE_OPENMP
        const long size=scene->interactions->size();
        #pragma omp parallel for schedule(guided) num_threads(ompThreads>0 ? min(ompThreads,omp_get_max_threads()) : omp_get_max_threads())
        for(long i=0; i<size; i++){
            const shared_ptr<Interaction>& interaction=(*scene->interactions)[i];
    #else
        FOREACH(const shared_ptr<Interaction>& interaction, *scene->interactions){
    #endif
            something += f(considered interaction);
        }
And I got (significantly) different results for the final value of something, depending on whether I run this code in parallel or not... This is surely linked to my poor understanding of what is really behind these OpenMP lines: I cannot tell whether I am doing something wrong (and, if so, what). Is this kind of task achievable in parallel or not?
Thanks a lot,
Jerome