yade-dev team mailing list archive
Message #11945
Re: Little news about capillary law2 in parallel
The diff is attached. The line numbers do not correspond to the current GitHub version; in that version, the changes would apply around l. 178.
As for the pdf: yes, comparing (between the 2 pages of the pdf) case j4 of NewCode to case j1 of NewCode suggests a speedup of 1.15 - 1.2 (depending on whether "fusion operations" are executed or not).
But it seems speed measurements fluctuate significantly. This is the only reason I see to explain the slight speedup observed in j1 between Old and NewCode (p. 1 of the pdf). I do not have much experience with execution speed variability, but the variability I observe (between comparable simulations) makes me wonder whether the "1.15 - 1.2" speedup is beyond natural variability.
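One way to check whether a speedup stands out from run-to-run scatter is to repeat each case a few times and compare the ratio of mean timings with the spread of the samples. The following is only a minimal sketch of that idea: the names (TimingStats, summarize, speedup, exceedsVariability) are hypothetical, the two-sigma threshold is an arbitrary assumption, and the timings would have to come from actual repeated runs (none are taken from the pdf).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean and sample standard deviation of repeated wall-clock timings.
struct TimingStats {
    double mean;
    double stddev;
};

TimingStats summarize(const std::vector<double>& timings) {
    assert(timings.size() >= 2);  // a spread needs at least two runs
    double sum = 0.0;
    for (double t : timings) sum += t;
    const double mean = sum / timings.size();
    double squares = 0.0;
    for (double t : timings) squares += (t - mean) * (t - mean);
    return {mean, std::sqrt(squares / (timings.size() - 1))};
}

// Speedup as a ratio of mean timings, with a first-order propagated uncertainty.
struct Speedup {
    double ratio;
    double uncertainty;
};

Speedup speedup(const TimingStats& slow, const TimingStats& fast) {
    assert(slow.mean > 0.0 && fast.mean > 0.0);
    const double ratio = slow.mean / fast.mean;
    const double relSlow = slow.stddev / slow.mean;
    const double relFast = fast.stddev / fast.mean;
    // Relative errors of a quotient add in quadrature (assumes independent runs).
    return {ratio, ratio * std::sqrt(relSlow * relSlow + relFast * relFast)};
}

// Assumed rule of thumb: the gain counts only if it exceeds two propagated sigmas.
bool exceedsVariability(const Speedup& s) {
    return s.ratio - 1.0 > 2.0 * s.uncertainty;
}
```

With this rule, a measured ratio of 1.2 would not count as a real gain if the individual timings of each case scatter by 10 to 20 %, which is the kind of doubt expressed above. Using the run-to-run standard deviation (rather than the standard error of the mean) makes the check deliberately conservative.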
Jerome
________________________________
From: Yade-dev [yade-dev-bounces+jerome.duriez=ucalgary.ca@xxxxxxxxxxxxxxxxxxx] on behalf of Bruno Chareyre [bruno.chareyre@xxxxxxxxxxxxxxx]
Sent: March-12-15 2:40 AM
To: yade-dev@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Yade-dev] Little news about capillary law2 in parallel
Hi Jérôme,
Thanks for the update. I am not sure I understand the pdf. Why is the new code faster than the old code with -j1?
Taking the difference between -j1 and -j4 with the new code suggests that the speedup of the capillary law is about x1.15, correct?
Before committing, could you show the diff?
Thanks
Bruno
On 11/03/15 18:28, Jerome Duriez wrote:
Hi guys,
I finally reconsidered the parallelization of Law2_ScGeom_CapillaryPhys_Capillarity, after our past discussion (*).
Only after restarting this work did I notice that implementations of e.g. remove() and resize() in the in-house class OpenMPVector would be necessary to do it completely right (using OpenMPVector objects in the BodiesMenisciiList class). So, for now, I have only coded parallel loops for the "addForce" block, at the end of the Law2.
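The general pattern behind such a parallel "addForce" loop can be sketched outside of Yade as follows: the container is traversed by index (which OpenMP can split across threads), and each thread adds into its own buffer so that two threads never write to the same body at once. This is a toy illustration only. ToyInteraction and accumulateForces are hypothetical names, forces are reduced to scalars, and the per-thread buffers are an assumption of this sketch, not a description of Yade's actual force container.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical stand-in for an interaction between bodies id1 and id2.
struct ToyInteraction {
    std::size_t id1;
    std::size_t id2;
    double force;  // scalar for brevity; a real law would use vectors
    bool real;     // plays the role of isReal()
};

std::vector<double> accumulateForces(const std::vector<ToyInteraction>& interactions,
                                     std::size_t nBodies) {
#ifdef _OPENMP
    const int nThreads = omp_get_max_threads();
#else
    const int nThreads = 1;
#endif
    // One private accumulation buffer per thread avoids data races.
    std::vector<std::vector<double>> perThread(nThreads, std::vector<double>(nBodies, 0.0));
    const long size = static_cast<long>(interactions.size());
#ifdef _OPENMP
#pragma omp parallel for schedule(guided) num_threads(nThreads)
#endif
    for (long i = 0; i < size; i++) {
        const ToyInteraction& inter = interactions[i];
        if (!inter.real) continue;
        assert(inter.id1 < nBodies && inter.id2 < nBodies);
#ifdef _OPENMP
        const int tid = omp_get_thread_num();
#else
        const int tid = 0;
#endif
        perThread[tid][inter.id1] += inter.force;  // action
        perThread[tid][inter.id2] -= inter.force;  // reaction
    }
    // Serial reduction of the per-thread buffers into the final forces.
    std::vector<double> total(nBodies, 0.0);
    for (const std::vector<double>& buffer : perThread)
        for (std::size_t b = 0; b < nBodies; b++) total[b] += buffer[b];
    return total;
}
```

Built without OpenMP support, the same function runs serially. Note that the summation order depends on how iterations are distributed over threads, so floating-point results may differ in the last digits from one thread count to another.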
Obviously, the results (in terms of capillary stresses as given by the getCapillaryStress() function) are unaffected by the changes: see the four attached pictures, showing sXXcap or sYYcap for two triaxial loadings with 20000 particles, repeated before and after changing the code. One loading is with 10 kPa capillary pressure: menisci fusion occurs in the sample, affecting the simulation (with the fusionDetection = 1, binaryFusion = 0 attributes of the Law2). For the other one, the suction value of 300 kPa prevents any menisci fusion.
As for the speedup, *maybe* such changes in the code allow a significant speedup (of the capillary Law2) to be obtained with parallel computations. See the data sheet "NBl2cp" (in ods or pdf format).
Anyway, I hope it is clear this little change in the code does no harm. Since it might still be a little useful, and already exists on my computer, I plan to commit it, after leaving time for discussion.
Jerome
PS: these results might still be of interest if you are curious about the scatter of results across parallel computations, or about capillary Law2 and NewtonIntegrator (NI) timings. For example, I obtain here a speedup of about 2.3 for NI from j1 to j4.
(*) https://www.mail-archive.com/yade-dev@xxxxxxxxxxxxxxxxxxx/msg10842.html
--
_______________
Bruno Chareyre
Associate Professor
ENSE³ - Grenoble INP
Lab. 3SR
BP 53
38041 Grenoble cedex 9
Tél : +33 4 56 52 86 21
Fax : +33 4 76 82 70 43
________________
diff --git a/pkg/dem/Law2_ScGeom_CapillaryPhys_Capillarity.cpp b/pkg/dem/Law2_ScGeom_CapillaryPhys_Capillarity.cpp
index b510c0a..0545f31 100644
--- a/pkg/dem/Law2_ScGeom_CapillaryPhys_Capillarity.cpp
+++ b/pkg/dem/Law2_ScGeom_CapillaryPhys_Capillarity.cpp
@@ -164,7 +164,14 @@ void Law2_ScGeom_CapillaryPhys_Capillarity::action()
}
if (fusionDetection) checkFusion();
- FOREACH(const shared_ptr<Interaction>& interaction, *scene->interactions){ // same remark for parallel loops
+ #ifdef YADE_OPENMP
+ const long size=scene->interactions->size();
+ #pragma omp parallel for schedule(guided) num_threads(ompThreads>0 ? min(ompThreads,omp_get_max_threads()) : omp_get_max_threads())
+ for(long i=0; i<size; i++){
+ const shared_ptr<Interaction>& interaction=(*scene->interactions)[i];
+ #else
+ FOREACH(const shared_ptr<Interaction>& interaction, *scene->interactions){
+ #endif
if (interaction->isReal()) {
CapillaryPhys* cundallContactPhysics=NULL;
MindlinCapillaryPhys* mindlinContactPhysics=NULL;