
dolfin team mailing list archive

Re: [HG DOLFIN] merge

 

On Tuesday 18 August 2009 14:54:24 Garth N. Wells wrote:
> Johan Hake wrote:
> > On Monday 17 August 2009 19:19:40 Anders Logg wrote:
> >> On Mon, Aug 17, 2009 at 07:09:11PM +0200, DOLFIN wrote:
> >>> changeset:   6762:ca407204632a1b0430099c243c915a151b2bd941
> >>> parent:      6759:efc24a341e41e9e0c83616be4613d819fe95ccb6
> >>> user:        Anders Logg <logg@xxxxxxxxx>
> >>> date:        Mon Aug 17 19:08:56 2009 +0200
> >>> files:       site-packages/dolfin/compile_function.py site-packages/dolfin/jit.py
> >>> description:
> >>> Make JIT compiler work in parallel. The process number is added to the
> >>> signature to create a unique signature for each process. This means
> >>> that each process will compile its own form. This may not be optimal
> >>> and could possibly be handled by Instant. On the other hand, it seems
> >>> to work nicely and might also be advantageous when processes don't
> >>> share a common cache.
> >>
> >> The Poisson Python demo now runs as is without the need for first
> >> running it in serial (to handle JIT compilation):
> >
> > Did it not work before this change? I know Martin added some file locks
> > to prevent simultaneous compilations of the same module.
> >
> >>   mpirun -n 4 python demo.py
> >
> > Do I have to set some environment variables to make this work? I can't
> > get it to work (probably some stupid error) :P
> >
> > Johan
> >
> > When running the above command I get:
> >
> > ssh: connect to host hake-laptop port 22: Connection refused
>
> This looks like a known bug in the Ubuntu MPI package
>
>    http://www.open-mpi.org/community/lists/users/2009/03/8571.php
>    https://bugs.launchpad.net/ubuntu/+source/openmpi/+bug/365122

I downloaded the source packages, applied the patch, and compiled. Unfortunately,
it did not solve the problem.

I think I'll stay with the local ssh workaround for now. It is mainly for
testing.

Johan

> Garth
>
> > --------------------------------------------------------------------------
> > A daemon (pid 32065) died unexpectedly with status 255 while attempting
> > to launch so we are aborting.
> >
> > There may be more information reported by the environment (see above).
> >
> > This may be because the daemon was unable to find all the needed shared
> > libraries on the remote node. You may set your LD_LIBRARY_PATH to have
> > the location of the shared libraries on the remote nodes and this will
> > automatically be forwarded to the remote nodes.
> > --------------------------------------------------------------------------
> > --------------------------------------------------------------------------
> > mpirun noticed that the job aborted, but has no info as to the process
> > that caused that situation.
> > --------------------------------------------------------------------------
> > mpirun: clean termination accomplished
> >
> >
> >
> > _______________________________________________
> > DOLFIN-dev mailing list
> > DOLFIN-dev@xxxxxxxxxx
> > http://www.fenics.org/mailman/listinfo/dolfin-dev
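
For readers of the archive, the idea in the changeset description quoted above (adding the process number to the JIT signature so that each process gets a unique one) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in jit.py: the helper name jit_signature, the use of SHA-1, and the "_p<N>" suffix are all hypothetical, and in a real MPI run the process number would come from the MPI layer rather than a loop.

```python
import hashlib


def jit_signature(form_repr, process_number):
    """Hypothetical helper: build a per-process cache signature.

    form_repr      -- any string that identifies the form to be compiled
    process_number -- the rank of the calling process (assumed to be known)
    """
    # Hash the form description so that identical forms share a common base.
    base = hashlib.sha1(form_repr.encode("utf-8")).hexdigest()
    # Append the process number so that no two processes share a signature,
    # which means each process compiles (and caches) its own module.
    return "%s_p%d" % (base, process_number)


# Simulate four processes requesting the same form.
signatures = [jit_signature("inner(grad(u), grad(v))*dx", rank) for rank in range(4)]
for sig in signatures:
    print(sig)
```

The trade-off is the one the description already names: every process repeats the compilation, but no two processes ever write to the same cache entry, and the scheme also works when the processes do not share a cache directory.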

