
dolfin team mailing list archive

Re: pyDOLFIN and MPI

 

On Thu, August 14, 2008 17:34, Anders Logg wrote:
> On Thu, Aug 14, 2008 at 01:44:31PM +0200, Johan Hake wrote:
>> On Thursday 14 August 2008 09:46:00 Johannes Ring wrote:
>> > On Thu, August 14, 2008 09:42, Anders Logg wrote:
>> > > On Thu, Aug 14, 2008 at 08:39:25AM +0100, Garth N. Wells wrote:
>> > >> Johannes Ring wrote:
>> > >> > On Thu, August 14, 2008 08:52, Garth N. Wells wrote:
>> > >> >> Anders Logg wrote:
>> > >> >>> On Wed, Aug 13, 2008 at 08:03:39PM +0100, Garth N. Wells wrote:
>> > >> >>>> I'm experiencing a puzzling problem with pyDOLFIN and MPI again.
>> > >> >>>>
>> > >> >>>> When I do
>> > >> >>>>
>> > >> >>>>      python file.py
>> > >> >>>>
>> > >> >>>> where file.py is just
>> > >> >>>>
>> > >> >>>>      from dolfin import *
>> > >> >>>>
>> > >> >>>>      object = Function("/tmp/fileKFnQpl.xml")
>> > >> >>>>      plot(object)
>> > >> >>>>      interactive()
>> > >> >>>>
>> > >> >>>> I see a plot as expected, and get
>> > >> >>>>
>> > >> >>>>      Plot active, press 'q' to continue.
>> > >> >>>>
>> > >> >>>> After pressing 'q', I get
>> > >> >>>>
>> > >> >>>>      *** An error occurred in MPI_Attr_get
>> > >> >>>>      *** after MPI was finalized
>> > >> >>>>      *** MPI_ERRORS_ARE_FATAL (goodbye)
>> > >> >>>>      [gnw20pc:2277] Abort before MPI_INIT completed successfully;
>> > >> >>>>      not able to guarantee that all other processes were killed!
>> > >> >>>>      *** An error occurred in MPI_Comm_rank
>> > >> >>>>      *** after MPI was finalized
>> > >> >>>>      *** MPI_ERRORS_ARE_FATAL (goodbye)
>> > >> >>>>      *** An error occurred in MPI_Type_free
>> > >> >>>>      *** after MPI was finalized
>> > >> >>>>      *** MPI_ERRORS_ARE_FATAL (goodbye)
>> > >> >>>>      Segmentation fault
>> > >> >>>>
>> > >> >>>> Somehow, Python appears to be calling MPI_Finalize before DOLFIN
>> > >> >>>> gets a chance to finalise things correctly. Any ideas/experience
>> > >> >>>> on how Python interacts with MPI? I've commented out
>> > >> >>>> MPI_Finalize() in DOLFIN to be sure that DOLFIN is not calling it.
>> > >> >>>>
>> > >> >>>> Garth
>> > >> >>>
>> > >> >>> Would it help if we just call MPI_Finalized to check before
>> > >> >>> finalizing? We can add a wrapper for it just like for
>> > >> >>> MPI_Initialized.
>> > >>
>> > >> >> I've added some annoying debug output to SubSystemsManager. Can you
>> > >> >> tell me what you see when running
>> > >> >>
>> > >> >>      from dolfin import *
>> > >> >>      x = PETScVector(10)
>> > >> >>
>> > >> >> When I run it, the first line of output is
>> > >> >>
>> > >> >>      MPI status in initPETSc() 1
>> > >> >>
>> > >> >> which indicates that MPI_Initialized reports that MPI has been
>> > >> >> initialised before PETSc is initialised, but we haven't initialised
>> > >> >> it.
>> > >>
>> > >> >> The same code in C++ gives
>> > >> >>
>> > >> >>      MPI status in initPETSc() 0
>> > >> >>
>> > >> >> which is the expected result.
>> > >> >
>> > >> > I get
>> > >> >
>> > >> >>>> from dolfin import *
>> > >> >>>> x = PETScVector(10)
>> > >> >
>> > >> > MPI status in initPETSc() 0
>> > >> > [simula-x61:17855] mca: base: component_find: unable to open osc
>> > >> > pt2pt: file not found (ignored)
>> > >> > MPI status in initMPI() 1
>> > >> >
>> > >> > Have you compiled Trilinos with MPI support? It seems to be the same
>> > >> > problem I had with my Ubuntu packages for Trilinos on 64-bit. Turning
>> > >> > off MPI support for Trilinos fixed the problem for me.
>> > >>
>> > >> Yes, Trilinos with MPI support was the culprit, even though I disabled
>> > >> Trilinos (enableTrilinos=no). Trilinos must be placing some files in
>> > >> /usr/local/lib/ which cause a nasty problem. Just removing Trilinos did
>> > >> the trick. Thanks!
>> > >>
>> > >> Garth
>> > >
>> > > What about the annoying
>> > >
>> > >   mca: base: component_find: unable to open osc pt2pt: file not found
>> > >   (ignored)
>> > >
>> > > ?
>> >
>> > I'm not sure how to remove it.
>>
>> I have a serial version of Trilinos 8.0.7 compiled locally on my computer,
>> which does not produce this output. See:
>>
>> <http://www.fenics.org/pipermail/dolfin-dev/2008-June/008442.html>
>
> I don't think this has anything to do with Trilinos. I had it a long
> time before I installed Trilinos.

Yes, me too. However, I have now found a way to remove the warning: simply
add the following line to /etc/openmpi/openmpi-mca-params.conf:

  mca_component_show_load_errors = 0

The default is 1. I'm not sure if it's safe, but the warning is at least
gone.

BTW: for those who get a warning similar to the one below, it can be
removed by uncommenting the following line in the same file:

   btl = ^openib

Johannes

libibverbs: Fatal: couldn't read uverbs ABI version.
--------------------------------------------------------------------------
[0,0,0]: OpenIB on host simula-x61 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------



