
dolfin team mailing list archive

Re: [FFC-dev] [UFL-dev] UFL and new release

 



Johan Hake wrote:
On Tuesday 31 March 2009 09:45:33 Johannes Ring wrote:
On Mon, March 30, 2009 20:19, Johan Hake wrote:
On Monday 30 March 2009 20:06:04 Anders Logg wrote:
The buildbot is not responding at the moment so I can't check the
status.
All is green except:

The DOLFIN macbot is now complaining about the trilinos demo, and the
DOLFIN linux64-exp bot complains about:

Traceback (most recent call last):
  File "./demo.py", line 8, in <module>
    from dolfin import *
  File "/work/jhbuildbot/fenics/lib/python2.5/site-packages/dolfin/__init__.py", line 16, in <module>
    from assemble import *
  File "/work/jhbuildbot/fenics/lib/python2.5/site-packages/dolfin/assemble.py", line 25, in <module>
    import cpp
  File "/work/jhbuildbot/fenics/lib/python2.5/site-packages/dolfin/cpp.py", line 25, in <module>
    import _cpp
ImportError: /work/jhbuildbot/local/lib/openmpi/mca_paffinity_linux.so: undefined symbol: mca_base_param_reg_int

I know Johannes has tried to get into this but as far as I know with no
success.
This is a problem with PyDOLFIN and Open MPI 1.3. We (me and Johan) had
success with a small hack this morning. By adding

  import ctypes
  ctypes.CDLL('libmpi.so', ctypes.RTLD_GLOBAL)

before loading the cpp module in site-packages/dolfin/__init__.py the
Python poisson demo ran just fine. Alternatively one should be able to use
this instead:

  import dl
  import sys
  flags = sys.getdlopenflags()
  sys.setdlopenflags(flags | dl.RTLD_GLOBAL)

but because of a bug in the python2.5 package in Hardy (missing dl module)
we couldn't test this.
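
For readers on a current Python, where the dl module no longer exists, the same idea can be sketched with os.RTLD_GLOBAL (available on Unix since Python 3.3). This is only an illustration under that assumption, not the fix that went into DOLFIN; the helper name global_dlopen is made up for this example.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def global_dlopen():
    """Temporarily add RTLD_GLOBAL to the flags Python uses for dlopen().

    global_dlopen is a hypothetical helper name chosen for this sketch.
    """
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Restore the previous flags so unrelated extension modules
        # imported later are not affected.
        sys.setdlopenflags(old_flags)
```

An extension module would then be imported inside a "with global_dlopen():" block, so that it and the libraries it pulls in are opened with global symbol visibility.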

It might also be possible to do this in C++ directly. Any suggestion on
the best way to fix this issue? See also:

http://fenics.org/pipermail/deb-dev/2009-March/000210.html
http://www.open-mpi.org/faq/?category=running#loading-libmpi-dynamically

Open MPI 1.3 is in Debian unstable so it would be great if we could fix
this before the release.

The point is that the shared MPI library needs to be loaded with the flag RTLD_GLOBAL. Otherwise its symbols end up in a private namespace, so the MPI libraries that are loaded afterwards cannot see them, which causes the undefined symbol error.

I do not understand why this is not a problem when only using DOLFIN and not PyDOLFIN. Probably the mpi library is only loaded one time to the right namespace? Well that's how deep my understanding is of this :P


Things went sour when DOLFIN was compiled with Trilinos and Trilinos had MPI enabled. Could this be related?

Garth

The hack consists of the following: before we load the compiled C++ module, i.e., _cpp.so, we load the shared MPI library using the calls mentioned above. The library stays in memory and is not loaded a second time when the _cpp.so module is loaded. This is my naive view of the problem and of why the hack works. It could probably be solved more gracefully, e.g. before PetscInitialize is called in DOLFIN, or maybe even better on the PETSc side?
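
The claim that an already loaded library is not loaded a second time can be checked with a small sketch. It assumes a Unix system and uses the C library as a stand-in for libmpi, since an MPI installation may not be present; the helper name load_global is made up for this example.

```python
import ctypes
import ctypes.util


def load_global(path):
    """dlopen `path` with RTLD_GLOBAL and return the ctypes library object.

    load_global is a hypothetical helper name chosen for this sketch.
    """
    return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)


# The C library stands in for libmpi here. If it cannot be located,
# find_library returns None and CDLL(None) opens the running process itself.
path = ctypes.util.find_library("c")

first = load_global(path)
second = load_global(path)

# Compare the underlying dlopen handles of the two loads. Equal handles
# mean the second call reused the resident library instead of mapping
# a new copy.
same_handle = first._handle == second._handle
```

On a typical Linux system same_handle should come out true, which matches the view above: once libmpi is resident with global visibility, the later load triggered by _cpp.so reuses it.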

Cheers!

Johan
_______________________________________________
FFC-dev mailing list
FFC-dev@xxxxxxxxxx
http://www.fenics.org/mailman/listinfo/ffc-dev



