Hi Jerome,

this has nothing to do with indeterminism (that has only a minor effect, due to rounding). What you have is the same variable being written to from different threads, so they overwrite each other's values, something like this:
1. thread 1 reads the value A from memory, computes A1=A+10
2. thread 2 reads the value A from memory, computes A2=A+44
3. thread 1 writes A1 back to memory
4. thread 2 writes A2 back to memory

So you end up with the "sum" A+44 instead of A+10+44.

What you need is the "reduction" feature of OpenMP (sum is one of the available reductions), see the second and third example at http://en.wikipedia.org/wiki/OpenMP#Clauses_in_work-sharing_constructs_.28in_C.2FC.2B.2B.29 . You can also do something like that yourself, e.g. by protecting the variable during the whole read-write time with a mutex (or by using the "critical" directive, as the third example shows).
Check that the code is really faster if you run it in parallel. If the only thing the loop does is to sum the values, then running in parallel will not help at all, since all threads will be waiting for the one having the exclusive access to finish anyway.
HTH, Václav
Hi,

I have one engine in which I am looping over interactions to compute something incrementally:

    something = 0
    for each interaction
        something += f(considered interaction)

In fact, I wrote this code using parallel loops, like this:

    Real something = 0;
    #ifdef YADE_OPENMP
    const long size=scene->interactions->size();
    #pragma omp parallel for schedule(guided) num_threads(ompThreads>0 ? min(ompThreads,omp_get_max_threads()) : omp_get_max_threads())
    for(long i=0; i<size; i++){
        const shared_ptr<Interaction>& interaction=(*scene->interactions)[i];
    #else
    FOREACH(const shared_ptr<Interaction>& interaction, *scene->interactions){
    #endif
        something += f(considered interaction);
    }

And I got (significantly) different results for the final value of something, depending on whether I ran this code in parallel or not...

Surely this is linked to my poor understanding of what is really behind these OpenMP lines; I do not see whether I am doing something wrong (and, if so, what). Is this kind of task achievable in parallel, or not?

Thanks a lot,
Jerome