
dolfin team mailing list archive

Re: pyDOLFIN and MPI

 

On Thu, Aug 14, 2008 at 07:08:57PM +0200, Johannes Ring wrote:
> On Thu, August 14, 2008 17:34, Anders Logg wrote:
> > On Thu, Aug 14, 2008 at 01:44:31PM +0200, Johan Hake wrote:
> >> On Thursday 14 August 2008 09:46:00 Johannes Ring wrote:
> >> > On Thu, August 14, 2008 09:42, Anders Logg wrote:
> >> > > On Thu, Aug 14, 2008 at 08:39:25AM +0100, Garth N. Wells wrote:
> >> > >> Johannes Ring wrote:
> >> > >> > On Thu, August 14, 2008 08:52, Garth N. Wells wrote:
> >> > >> >> Anders Logg wrote:
> >> > >> >>> On Wed, Aug 13, 2008 at 08:03:39PM +0100, Garth N. Wells wrote:
> >> > >> >>>> I'm experiencing a puzzling problem with pyDOLFIN and MPI
> >> > >> >>>> again.
> >> > >> >>>>
> >> > >> >>>> When I do
> >> > >> >>>>
> >> > >> >>>>      python file.py
> >> > >> >>>>
> >> > >> >>>> where file.py is just
> >> > >> >>>>
> >> > >> >>>>      from dolfin import *
> >> > >> >>>>
> >> > >> >>>>      object = Function("/tmp/fileKFnQpl.xml")
> >> > >> >>>>      plot(object)
> >> > >> >>>>      interactive()
> >> > >> >>>>
> >> > >> >>>> I see a plot as expected, and get
> >> > >> >>>>
> >> > >> >>>>      Plot active, press 'q' to continue.
> >> > >> >>>>
> >> > >> >>>> After pressing 'q', I get
> >> > >> >>>>
> >> > >> >>>>      *** An error occurred in MPI_Attr_get
> >> > >> >>>>      *** after MPI was finalized
> >> > >> >>>>      *** MPI_ERRORS_ARE_FATAL (goodbye)
> >> > >> >>>>      [gnw20pc:2277] Abort before MPI_INIT completed
> >> > >> >>>>      successfully; not able to guarantee that all other
> >> > >> >>>>      processes were killed!
> >> > >> >>>>      *** An error occurred in MPI_Comm_rank
> >> > >> >>>>      *** after MPI was finalized
> >> > >> >>>>      *** MPI_ERRORS_ARE_FATAL (goodbye)
> >> > >> >>>>      *** An error occurred in MPI_Type_free
> >> > >> >>>>      *** after MPI was finalized
> >> > >> >>>>      *** MPI_ERRORS_ARE_FATAL (goodbye)
> >> > >> >>>>      Segmentation fault
> >> > >> >>>>
> >> > >> >>>> Somehow, Python appears to be calling MPI_Finalize before
> >> > >> >>>> DOLFIN gets a chance to finalise things correctly. Any
> >> > >> >>>> ideas/experience on how Python interacts with MPI? I've
> >> > >> >>>> commented out MPI_Finalize() in DOLFIN to be sure that
> >> > >> >>>> DOLFIN is not calling it.
> >> > >> >>>>
> >> > >> >>>> Garth
> >> > >> >>>
> >> > >> >>> Would it help if we just call MPI_Finalized to check before
> >> > >> >>> finalizing? We can add a wrapper for it just like for
> >> > >> >>> MPI_Initialized.
> >> > >>
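
A minimal sketch of that guard, assuming only the standard MPI_Initialized
and MPI_Finalized queries (the function name is illustrative, not DOLFIN's
actual API):

    #include <mpi.h>

    // Illustrative guard: only finalize MPI if it was initialized and
    // has not already been finalized elsewhere (e.g. by another
    // library, or by the Python runtime during teardown).
    void finalize_mpi_if_needed()
    {
      int initialized = 0;
      int finalized = 0;
      MPI_Initialized(&initialized);
      MPI_Finalized(&finalized);
      if (initialized && !finalized)
        MPI_Finalize();
    }

After MPI_Finalize, calling anything other than such query functions is
an error, which is what the "after MPI was finalized" messages in the log
above are reporting.
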
> >> > >> >> I've added some annoying debug output to SubSystemsManager.
> >> > >> >> Can you tell me what you see when running
> >> > >> >>
> >> > >> >>      from dolfin import *
> >> > >> >>      x = PETScVector(10)
> >> > >> >>
> >> > >> >> When I run it, the first line of output is
> >> > >> >>
> >> > >> >>      MPI status in initPETSc() 1
> >> > >> >>
> >> > >> >> which indicates that MPI_Initialized is saying that MPI has
> >> > >> >> been initialised before PETSc is initialised, but we haven't
> >> > >> >> initialised it.
> >> > >>
> >> > >> >> The same code in C++ gives
> >> > >> >>
> >> > >> >>      MPI status in initPETSc() 0
> >> > >> >>
> >> > >> >> which is the expected result.
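
A rough sketch of the kind of check that debug output implies, assuming a
plain MPI_Initialized query before PETSc is brought up (hypothetical code,
not the actual SubSystemsManager source):

    #include <iostream>
    #include <mpi.h>
    #include <petsc.h>

    // Hypothetical version of the check: report whether MPI is already
    // initialized before PETSc is set up. In a clean serial run this
    // prints 0, and PetscInitialize then brings up MPI itself.
    void init_petsc(int argc, char* argv[])
    {
      int initialized = 0;
      MPI_Initialized(&initialized);
      std::cout << "MPI status in initPETSc() " << initialized << std::endl;
      PetscInitialize(&argc, &argv, NULL, NULL);
    }

A value of 1 from a plain python run therefore means some other shared
library has already initialized MPI behind DOLFIN's back.
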
> >> > >> >
> >> > >> > I get
> >> > >> >
> >> > >> >>>> from dolfin import *
> >> > >> >>>> x = PETScVector(10)
> >> > >> >
> >> > >> > MPI status in initPETSc() 0
> >> > >> > [simula-x61:17855] mca: base: component_find: unable to open
> >> > >> > osc pt2pt: file not found (ignored)
> >> > >> > MPI status in initMPI() 1
> >> > >> >
> >> > >> > Have you compiled Trilinos with MPI support? It seems to be
> >> > >> > the same problem I had with my Ubuntu packages for Trilinos
> >> > >> > on 64bit. Turning off MPI support for Trilinos fixed the
> >> > >> > problem for me.
> >> > >>
> >> > >> Yes, Trilinos with MPI support was the culprit, even though I
> >> > >> disabled Trilinos (enableTrilinos=no). Trilinos must be placing
> >> > >> some files in /usr/local/lib/ which cause a problem, which is
> >> > >> nasty. Just removing Trilinos did the trick. Thanks!
> >> > >>
> >> > >> Garth
> >> > >
> >> > > What about the annoying
> >> > >
> >> > >   mca: base: component_find: unable to open osc pt2pt: file not
> >> > >   found (ignored)
> >> > >
> >> > > ?
> >> >
> >> > I'm not sure how to remove it.
> >>
> >> I have a serial version of Trilinos 8.0.7 compiled locally on my
> >> computer, which does not produce this output. See:
> >>
> >> <http://www.fenics.org/pipermail/dolfin-dev/2008-June/008442.html>
> >
> > I don't think this has anything to do with Trilinos. I had it a long
> > time before I installed Trilinos.
> 
> Yes, me too. However, I have now found a way to remove the warning. Simply
> add the following line to /etc/openmpi/openmpi-mca-params.conf:
> 
>   mca_component_show_load_errors = 0
> 
> The default is 1. I'm not sure if it's safe, but the warning is at least
> gone.
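
Note that editing /etc/openmpi/openmpi-mca-params.conf changes the setting
system-wide. Assuming a standard Open MPI installation, the same MCA
parameter should also be settable per-run through an OMPI_MCA_-prefixed
environment variable, for example:

    export OMPI_MCA_mca_component_show_load_errors=0
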
> 
> BTW: For those who get a warning similar to the one below, it can be
> removed by uncommenting the following line in the same file:
> 
>    btl = ^openib
> 
> Johannes
> 
> libibverbs: Fatal: couldn't read uverbs ABI version.
> --------------------------------------------------------------------------
> [0,0,0]: OpenIB on host simula-x61 was unable to find any HCAs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------

Excellent.

Can you add this to the FAQ on the Wiki?

-- 
Anders


