
dolfin team mailing list archive

Re: pyDOLFIN and MPI


On Thu, August 14, 2008 19:28, Anders Logg wrote:
> On Thu, Aug 14, 2008 at 07:08:57PM +0200, Johannes Ring wrote:
>> On Thu, August 14, 2008 17:34, Anders Logg wrote:
>> > On Thu, Aug 14, 2008 at 01:44:31PM +0200, Johan Hake wrote:
>> >> On Thursday 14 August 2008 09:46:00 Johannes Ring wrote:
>> >> > On Thu, August 14, 2008 09:42, Anders Logg wrote:
>> >> > > On Thu, Aug 14, 2008 at 08:39:25AM +0100, Garth N. Wells wrote:
>> >> > >> Johannes Ring wrote:
>> >> > >> > On Thu, August 14, 2008 08:52, Garth N. Wells wrote:
>> >> > >> >> Anders Logg wrote:
>> >> > >> >>> On Wed, Aug 13, 2008 at 08:03:39PM +0100, Garth N. Wells wrote:
>> >> > >> >>>> I'm experiencing a puzzling problem with pyDOLFIN and MPI again.
>> >> > >> >>>>
>> >> > >> >>>> When I do
>> >> > >> >>>>
>> >> > >> >>>>      python file.py
>> >> > >> >>>>
>> >> > >> >>>> where file.py is just
>> >> > >> >>>>
>> >> > >> >>>>      from dolfin import *
>> >> > >> >>>>
>> >> > >> >>>>      object = Function("/tmp/fileKFnQpl.xml")
>> >> > >> >>>>      plot(object)
>> >> > >> >>>>      interactive()
>> >> > >> >>>>
>> >> > >> >>>> I see a plot as expected, and get
>> >> > >> >>>>
>> >> > >> >>>>      Plot active, press 'q' to continue.
>> >> > >> >>>>
>> >> > >> >>>> After pressing 'q', I get
>> >> > >> >>>>
>> >> > >> >>>>      *** An error occurred in MPI_Attr_get
>> >> > >> >>>>      *** after MPI was finalized
>> >> > >> >>>>      *** MPI_ERRORS_ARE_FATAL (goodbye)
>> >> > >> >>>>      [gnw20pc:2277] Abort before MPI_INIT completed successfully;
>> >> > >> >>>>      not able to guarantee that all other processes were killed!
>> >> > >> >>>>      *** An error occurred in MPI_Comm_rank
>> >> > >> >>>>      *** after MPI was finalized
>> >> > >> >>>>      *** MPI_ERRORS_ARE_FATAL (goodbye)
>> >> > >> >>>>      *** An error occurred in MPI_Type_free
>> >> > >> >>>>      *** after MPI was finalized
>> >> > >> >>>>      *** MPI_ERRORS_ARE_FATAL (goodbye)
>> >> > >> >>>>      Segmentation fault
>> >> > >> >>>>
>> >> > >> >>>> Somehow, Python appears to be calling MPI_Finalize before DOLFIN
>> >> > >> >>>> gets a chance to finalise things correctly. Any ideas/experience on
>> >> > >> >>>> how Python interacts with MPI? I've commented out MPI_Finalize() in
>> >> > >> >>>> DOLFIN to be sure that DOLFIN is not calling it.
>> >> > >> >>>>
>> >> > >> >>>> Garth
>> >> > >> >>>
>> >> > >> >>> Would it help if we just call MPI_Finalized to check before
>> >> > >> >>> finalizing? We can add a wrapper for it just like for MPI_Initialized.
>> >> > >> >>
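A minimal sketch of the guard Anders describes, assuming plain MPI calls; the
helper name finalize_mpi is hypothetical and this is not DOLFIN's actual
SubSystemsManager code:

     #include <mpi.h>

     // Only finalise MPI if it was initialised and nobody else
     // (e.g. Python or PETSc) has already finalised it.
     void finalize_mpi()
     {
       int initialized = 0, finalized = 0;
       MPI_Initialized(&initialized);
       MPI_Finalized(&finalized);
       if (initialized && !finalized)
         MPI_Finalize();
     }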
>> >> > >> >> I've added some annoying debug output to SubSystemsManager. Can you
>> >> > >> >> tell me what you see when running
>> >> > >> >>
>> >> > >> >>      from dolfin import *
>> >> > >> >>      x = PETScVector(10)
>> >> > >> >>
>> >> > >> >> When I run it, the first line of output is
>> >> > >> >>
>> >> > >> >>      MPI status in initPETSc() 1
>> >> > >> >>
>> >> > >> >> which indicates that MPI_Initialized is saying that MPI has been
>> >> > >> >> initialised before PETSc is initialised, but we haven't initialised it.
>> >> > >>
>> >> > >> >> The same code in C++ gives
>> >> > >> >>
>> >> > >> >>      MPI status in initPETSc() 0
>> >> > >> >>
>> >> > >> >> which is the expected result.
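For context, the debug line above boils down to querying MPI_Initialized before
PETSc is initialised; a minimal sketch, assuming plain MPI (the surrounding
function is illustrative only, not the actual SubSystemsManager code):

     #include <iostream>
     #include <mpi.h>

     // Print whether MPI is already initialised before PETSc init runs.
     void print_mpi_status()
     {
       int initialized = 0;
       MPI_Initialized(&initialized);
       std::cout << "MPI status in initPETSc() " << initialized << std::endl;
     }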
>> >> > >> >
>> >> > >> > I get
>> >> > >> >
>> >> > >> >>>> from dolfin import *
>> >> > >> >>>> x = PETScVector(10)
>> >> > >> >
>> >> > >> > MPI status in initPETSc() 0
>> >> > >> > [simula-x61:17855] mca: base: component_find: unable to open osc
>> >> > >> > pt2pt: file not found (ignored)
>> >> > >> > MPI status in initMPI() 1
>> >> > >> >
>> >> > >> > Have you compiled Trilinos with MPI support? It seems to be the same
>> >> > >> > problem I had with my Ubuntu packages for Trilinos on 64bit. Turning
>> >> > >> > off MPI support for Trilinos fixed the problem for me.
>> >> > >>
>> >> > >> Yes, Trilinos with MPI support was the culprit, even though I disabled
>> >> > >> Trilinos (enableTrilinos=no). Trilinos must be placing some files in
>> >> > >> /usr/local/lib/ which cause a nasty problem. Just removing Trilinos
>> >> > >> did the trick. Thanks!
>> >> > >>
>> >> > >> Garth
>> >> > >
>> >> > > What about the annoying
>> >> > >
>> >> > >   mca: base: component_find: unable to open osc pt2pt: file not found (ignored)
>> >> > >
>> >> > > ?
>> >> >
>> >> > I'm not sure how to remove it.
>> >>
>> >> I have a serial version of Trilinos 8.0.7 compiled locally on my
>> >> computer, which does not produce this output. See:
>> >>
>> >> <http://www.fenics.org/pipermail/dolfin-dev/2008-June/008442.html>
>> >
>> > I don't think this has anything to do with Trilinos. I had it a long
>> > time before I installed Trilinos.
>>
>> Yes, me too. However, I have now found a way to remove the warning. Simply
>> add the following line to /etc/openmpi/openmpi-mca-params.conf:
>>
>>   mca_component_show_load_errors = 0
>>
>> The default is 1. I'm not sure if it's safe, but the warning is at least
>> gone.
>>
>> BTW: For those who get a warning similar to the one below, it can be
>> removed by uncommenting the following line in the same file:
>>
>>    btl = ^openib
>>
>> Johannes
>>
>> libibverbs: Fatal: couldn't read uverbs ABI version.
>> --------------------------------------------------------------------------
>> [0,0,0]: OpenIB on host simula-x61 was unable to find any HCAs.
>> Another transport will be used instead, although this may result in
>> lower performance.
>> --------------------------------------------------------------------------
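Taken together, a sketch of the relevant part of /etc/openmpi/openmpi-mca-params.conf
after both changes (exact file contents vary between installations):

     # Silence the "unable to open osc pt2pt" component-load warning
     mca_component_show_load_errors = 0

     # Skip the InfiniBand (openib) transport when no HCA is present
     btl = ^openib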
>
> Excellent.
>
> Can you add this to the FAQ on the Wiki?

Done.

Johannes


