dolfin team mailing list archive
Message #20900
Re: [Bug 705401] Re: When PyTrilinos is imported after dolfin, bad things happen
On 22/01/11 14:10, Joachim Haga wrote:
> I looked a bit more, and it seems more complicated than that. You're
> right about atexit(), but PyTrilinos actually does the right thing wrt
> initialisation (it checks MPI_Initialized, and does nothing if already
> initialised). Hence, this does not explain the failures before exit().
PyTrilinos doesn't do the right thing at the end of a program. It does
check at initialisation, but it calls finalise irrespective of whether
or not it did the initialisation.
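The rule being violated ("only finalise what you initialised") can be sketched in a few lines. This is a minimal illustration, not PyTrilinos or DOLFIN code: `FakeMPI` and `Library` are hypothetical stand-ins, and `owns_mpi` plays roughly the role that `control_mpi` plays in the DOLFIN diff quoted further down.

```python
class FakeMPI:
    """Hypothetical stand-in for an MPI runtime; not a real MPI binding."""

    def __init__(self):
        self.initialized = False
        self.finalized = False

    def init(self):
        if self.initialized:
            raise RuntimeError("MPI initialised twice")
        self.initialized = True

    def finalize(self):
        # Mimics the fatal error shown in the bug report.
        if self.finalized:
            raise RuntimeError("MPI_Finalize called after MPI was finalized")
        self.finalized = True


class Library:
    """A library that finalises MPI only if it did the initialisation."""

    def __init__(self, mpi):
        self.mpi = mpi
        self.owns_mpi = False

    def startup(self):
        # Check first, and take ownership only when we initialise.
        if not self.mpi.initialized:
            self.mpi.init()
            self.owns_mpi = True

    def shutdown(self):
        # Finalise only if we own MPI and nobody else finalised it already.
        if self.owns_mpi and not self.mpi.finalized:
            self.mpi.finalize()
        self.owns_mpi = False


mpi = FakeMPI()
first, second = Library(mpi), Library(mpi)
first.startup()    # initialises MPI and takes ownership
second.startup()   # sees MPI already initialised, takes no ownership
second.shutdown()  # no-op, because second never owned MPI
finalized_after_second = mpi.finalized
first.shutdown()   # the owner finalises exactly once
```

With this rule the order in which the two libraries shut down does not matter, which is the property the import-order bug shows to be missing.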
PyTrilinos calls the MPI finalise function from atexit, but this is
called before the destructor for linear algebra objects is called.
Therefore, 'MPI' objects (e.g. PETScFoo) are still in scope when
PyTrilinos incorrectly finalises MPI. When DOLFIN tries to destroy the
MPI-based objects, an error pops up because MPI has been prematurely
finalised.
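The ordering described above can be observed without MPI at all. The following sketch, which assumes CPython, runs a small child script in which `finalise` stands in for the MPI finalise call and `Resource` stands in for an MPI-based object; both names are made up for the illustration.

```python
import subprocess
import sys
import textwrap

# Child script: a module-level object with a destructor, plus an atexit hook.
CHILD = textwrap.dedent("""
    import atexit

    class Resource:
        def __del__(self):
            print("destructor", flush=True)

    def finalise():
        print("atexit", flush=True)

    atexit.register(finalise)
    resource = Resource()  # stays alive until interpreter teardown
""")

# Run the child in a fresh interpreter so its shutdown sequence is real.
result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    text=True,
    check=True,
)
events = result.stdout.split()
print(events)
```

On CPython the atexit handler should be reported before the destructor of the module-level object, which is why an unconditional finalise in atexit pulls MPI out from under objects that are still alive.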
Garth
> I managed to tease out this error message from trilinos with the "wrong"
> import order, but it's not helpful:
>
> Error! An attempt was made to access parameter "aggregation: type" of type "string"
> in the parameter (sub)list "ML preconditioner"
> using the incorrect type "string"!
>
> The visible error, after the above is caught and re-thrown, is
>
> *********************************************************
> ML failed to compute the multigrid preconditioner. The
> most common problem is an incorrect data type in ML's
> parameter list (e.g. 'int' instead of 'bool').
>
> Note: List.set("ML print initial list",X) might help
> figure out the bad one on pid X.
> *********************************************************
>
> ML::ERROR:: -1,
> /home/jobh/src/fenics/trilinos-10.6.2-Source/packages/ml/src/Utils/ml_MultiLevelPreconditioner.cpp,
> line 1694
>
> It ran clean under valgrind, so any stack or heap smash is subtle. I
> don't think I'll dig any deeper, given that the workaround is so simple.
>
> (I added the following in dolfin, to get rid of the MPI abort, but it
> didn't help with the problem above of course:)
>
> diff --git a/dolfin/main/SubSystemsManager.cpp b/dolfin/main/SubSystemsManager.cpp
> index 52c8982..6a19e4f 100644
> --- a/dolfin/main/SubSystemsManager.cpp
> +++ b/dolfin/main/SubSystemsManager.cpp
> @@ -126,7 +126,10 @@ void SubSystemsManager::finalize_mpi()
>    //Finalise MPI if required
>    if (MPI::Is_initialized() and sub_systems_manager.control_mpi)
>    {
> -    MPI::Finalize();
> +    if (MPI::Is_finalized())
> +      warning("MPI::Finalize has been called by someone else (how rude)");
> +    else
> +      MPI::Finalize();
>      sub_systems_manager.control_mpi = false;
>    }
>
--
You received this bug notification because you are a member of DOLFIN
Team, which is subscribed to DOLFIN.
https://bugs.launchpad.net/bugs/705401
Title:
When PyTrilinos is imported after dolfin, bad things happen
Status in DOLFIN:
New
Bug description:
When using PyTrilinos (ML in particular), the order of imports is
important. If dolfin is imported first, it crashes at exit, and there
are problems also with constructing preconditioners etc.
It looks like it has to do with MPI initialisation, but I haven't
looked at it closely.
A simple workaround may be to try an import ML in dolfin/__init__.py
(just import, not expose) so that it gets initialised. I don't know if
the performance hit is worth it. It would of course be better to find
a proper fix.
Otherwise, it's nice to have it documented here. For google::
>>> import dolfin
>>> from PyTrilinos import ML
>>> exit()
*** An error occurred in MPI_Finalize
*** after MPI was finalized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[rodin:17864] Abort after MPI_FINALIZE completed successfully; not able to guarantee that all other processes were killed!
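The import-order workaround mentioned in the thread could be wrapped roughly as follows. This is only a sketch: `import_in_order` is a hypothetical helper, and the assumption that importing `PyTrilinos.ML` before `dolfin` avoids the problem comes from the report above rather than from any documented guarantee.

```python
import importlib


def import_in_order(names):
    """Import modules in the given order; map each name to its module or None."""
    loaded = {}
    for name in names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError:
            # Module not installed; record that and carry on.
            loaded[name] = None
    return loaded


# PyTrilinos first, so that it initialises MPI before dolfin does.
modules = import_in_order(["PyTrilinos.ML", "dolfin"])
```

In a script, simply writing the two import statements in this order has the same effect; the helper only makes the ordering explicit and tolerates a missing package.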