dolfin team mailing list archive

Re: [FFC-dev] [UFL-dev] UFL and new release

 

On Tuesday 31 March 2009 09:45:33 Johannes Ring wrote:
> On Mon, March 30, 2009 20:19, Johan Hake wrote:
> > On Monday 30 March 2009 20:06:04 Anders Logg wrote:
> >> The buildbot is not responding at the moment so I can't check the
> >> status.
> >
> > All is green except:
> >
> > The DOLFIN macbot, which is complaining about the trilinos demo, and the
> > DOLFIN linux64-exp, which complains about:
> >
> > Traceback (most recent call last):
> >   File "./demo.py", line 8, in <module>
> >     from dolfin import *
> >   File "/work/jhbuildbot/fenics/lib/python2.5/site-packages/dolfin/__init__.py", line 16, in <module>
> >     from assemble import *
> >   File "/work/jhbuildbot/fenics/lib/python2.5/site-packages/dolfin/assemble.py", line 25, in <module>
> >     import cpp
> >   File "/work/jhbuildbot/fenics/lib/python2.5/site-packages/dolfin/cpp.py", line 25, in <module>
> >     import _cpp
> > ImportError: /work/jhbuildbot/local/lib/openmpi/mca_paffinity_linux.so: undefined symbol: mca_base_param_reg_int
> >
> > I know Johannes has tried to get into this but as far as I know with no
> > success.
>
> This is a problem with PyDOLFIN and Open MPI 1.3. We (me and Johan) had
> success with a small hack this morning. By adding
>
>   import ctypes
>   ctypes.CDLL('libmpi.so', ctypes.RTLD_GLOBAL)
>
> before loading the cpp module in site-packages/dolfin/__init__.py the
> Python poisson demo ran just fine. Alternatively one should be able to use
> this instead:
>
>   import dl
>   import sys
>   flags = sys.getdlopenflags()
>   sys.setdlopenflags(flags | dl.RTLD_GLOBAL)
>
> but because of a bug in the python2.5 package in Hardy (missing dl module)
> we couldn't test this.
>
> It might also be possible to do this in C++ directly. Any suggestion on
> the best way to fix this issue? See also:
>
> http://fenics.org/pipermail/deb-dev/2009-March/000210.html
> http://www.open-mpi.org/faq/?category=running#loading-libmpi-dynamically
>
> Open MPI 1.3 is in Debian unstable so it would be great if we could fix
> this before the release.

The point is that the shared MPI library needs to be loaded with the 
RTLD_GLOBAL flag. Otherwise its symbols end up in a private namespace, and 
plugins that Open MPI loads later, such as mca_paffinity_linux.so, cannot 
resolve them, which causes the undefined symbol error.
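
To illustrate the flag itself, here is a minimal sketch of the dlopen-flags variant. It assumes a Python version where the RTLD_* constants are available in the os module (the dl module from the quoted snippet no longer exists in Python 3), and the helper name global_dlopen_flags is made up for this example:

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def global_dlopen_flags():
    """Temporarily add RTLD_GLOBAL to the flags Python passes to dlopen().

    Hypothetical helper, not part of DOLFIN. Extension modules imported
    inside the block export their symbols globally.
    """
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Restore the previous flags so later imports are unaffected.
        sys.setdlopenflags(old_flags)
```

The intended use would be to wrap the import of the compiled module, e.g. "with global_dlopen_flags(): import _cpp". Restoring the old flags afterwards keeps the change local to that one import.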

I do not understand why this is not a problem when using only DOLFIN and not 
PyDOLFIN. Probably the MPI library is then loaded only once, into the right 
namespace? Well, that's how deep my understanding of this is :P

The hack consists of the following: 
Before we load the compiled C++ module, i.e., _cpp.so, we load the shared 
MPI library using the calls mentioned above. The library stays in memory and 
is not loaded a second time when the _cpp.so module is loaded. 
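
As a rough sketch of how the preload step could be made tolerant of systems without MPI, the ctypes call can be wrapped as below. The function name preload_global and its fallback behaviour are assumptions for illustration, not what is in dolfin/__init__.py; only the library name 'libmpi.so' is taken from this thread:

```python
import ctypes


def preload_global(candidates=("libmpi.so",)):
    """Try to load one of the candidate libraries with RTLD_GLOBAL.

    Hypothetical helper. Returns the ctypes handle of the first library
    that loads, or None if none of the names can be found.
    """
    for name in candidates:
        try:
            # RTLD_GLOBAL makes the library's symbols visible to
            # everything that is loaded afterwards.
            return ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
        except OSError:
            continue  # not found under this name, try the next one
    return None
```

Calling preload_global() before "import cpp" would then have the same effect as the two-line hack when libmpi.so is present, and do nothing otherwise.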

This is my naive view of the problem and of why the hack works. It could 
probably be solved in a more graceful manner, e.g. before PetscInitialize is 
called in DOLFIN, or maybe even better on the PETSc side?

Cheers!

Johan

