
dolfin team mailing list archive

Re: [HG DOLFIN] merge

 

On Tuesday 18 August 2009 08:56:06 Johan Hake wrote:
> On Monday 17 August 2009 23:51:39 Anders Logg wrote:
> > On Mon, Aug 17, 2009 at 11:20:08PM +0200, Johan Hake wrote:
> > > On Monday 17 August 2009 19:19:40 Anders Logg wrote:
> > > > On Mon, Aug 17, 2009 at 07:09:11PM +0200, DOLFIN wrote:
> > > > > changeset:   6762:ca407204632a1b0430099c243c915a151b2bd941
> > > > > parent:      6759:efc24a341e41e9e0c83616be4613d819fe95ccb6
> > > > > user:        Anders Logg <logg@xxxxxxxxx>
> > > > > date:        Mon Aug 17 19:08:56 2009 +0200
> > > > > files:       site-packages/dolfin/compile_function.py
> > > > >              site-packages/dolfin/jit.py
> > > > > description:
> > > > > Make JIT compiler work in parallel. The process number is added to
> > > > > the signature to create a unique signature for each process. This
> > > > > means that each process will compile its own form. This may not be
> > > > > optimal and could possibly be handled by Instant. On the other
> > > > > hand, it seems to work nicely and might also be advantageous when
> > > > > processes don't share a common cache.
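The idea in the changeset description, adding the process number to the signature, can be sketched as follows. This is only an illustration under stated assumptions, not the code in jit.py: the function name `per_process_signature`, the SHA-1 hash, and the `_p<N>` suffix are all hypothetical.

```python
import hashlib


def per_process_signature(form_repr, process_number):
    """Return a cache signature that differs from process to process.

    form_repr is any string that identifies the form (hypothetical input);
    process_number is the rank of the calling process.
    """
    base = hashlib.sha1(form_repr.encode("utf-8")).hexdigest()
    # Appending the process number means no two processes ever share a
    # cache entry, so each one compiles its own copy of the module.
    return "form_%s_p%d" % (base, process_number)
```

With this scheme, four processes produce four distinct cache entries for the same form, which avoids the clash at the cost of compiling the form four times.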
> > > >
> > > > The Poisson Python demo now runs as is without the need for first
> > > > running it in serial (to handle JIT compilation):
> > >
> > > Did it not work before this change? I know Martin added some file locks
> > > to prevent simultaneous compilations of the same module.
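A file lock of the kind mentioned above might look roughly like the sketch below. It is not Instant's actual implementation. The names `module_lock` and `build_once`, the `.lock` and `.built` suffixes, and the use of `fcntl.flock` (Unix only) are assumptions made for illustration.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def module_lock(cache_dir, signature):
    """Hold an exclusive lock for one module signature (hypothetical helper)."""
    os.makedirs(cache_dir, exist_ok=True)
    lock_path = os.path.join(cache_dir, signature + ".lock")
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the lock for this signature.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield lock_path
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def build_once(cache_dir, signature, build):
    """Call build(target) only if no process has produced target yet."""
    with module_lock(cache_dir, signature):
        target = os.path.join(cache_dir, signature + ".built")
        if not os.path.exists(target):
            build(target)
        return target
```

The existence check happens while the lock is held, so a process that was waiting for the lock finds the finished result instead of starting a second build.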
> >
> > No, it didn't work before. I get things like
> >
> > In instant.build_module: Path
> > '/home/logg/.instant/cache/form_f38430af401fbeddb9be4091a6fcde37cef9fa35'
> > already exists, but module wasn't found in cache previously. Not
> > overwriting, assuming this module is valid.
> > Traceback (most recent call last):
> >   File "demo.py", line 23, in <module>
> >     V = FunctionSpace(mesh, "CG", 1)
> >   File "/home/logg/scratch/src/fenics-dev/dolfin-dev/local/lib/python2.6/site-packages/dolfin/functionspace.py", line 181, in __init__
> >     FunctionSpaceBase.__init__(self, mesh, element)
> >   File "/home/logg/scratch/src/fenics-dev/dolfin-dev/local/lib/python2.6/site-packages/dolfin/functionspace.py", line 43, in __init__
> >     ufc_element, ufc_dofmap = jit(self._element)
> >   File "/home/logg/scratch/src/fenics-dev/dolfin-dev/local/lib/python2.6/site-packages/dolfin/jit.py", line 67, in jit
> >     return jit_compile(form, options)
> >   File "/home/logg/scratch/lib/fenics-dev/lib/python2.6/site-packages/ffc/jit/jit.py", line 56, in jit
> >     return jit_element(object, options)
> >   File "/home/logg/scratch/lib/fenics-dev/lib/python2.6/site-packages/ffc/jit/jit.py", line 125, in jit_element
> >     (compiled_form, module, form_data) = jit_form(form, options)
> >   File "/home/logg/scratch/lib/fenics-dev/lib/python2.6/site-packages/ffc/jit/jit.py", line 102, in jit_form
> >     os.unlink(signature + ".h")
> > OSError: [Errno 2] No such file or directory: 'form_f38430af401fbeddb9be4091a6fcde37cef9fa35.h'
>
> It looks like the error comes from unlinking a file more than one time
> (done in ffc/jit.py), and not in instant. I will look at it.
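One way to make a cleanup step tolerate a file that another process has already removed is sketched below. This is not the fix that went into FFC. The helper name `unlink_if_exists` is hypothetical, and the sketch assumes that a missing file can safely be ignored at that point.

```python
import errno
import os


def unlink_if_exists(path):
    """Remove path; return False instead of raising if it is already gone."""
    try:
        os.unlink(path)
        return True
    except OSError as exc:
        # Only "No such file or directory" is ignored; any other error
        # (for example a permission problem) is still raised.
        if exc.errno != errno.ENOENT:
            raise
        return False
```

Catching the error is safer than checking `os.path.exists` first, because another process can remove the file between the check and the unlink.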
>
> > I guess the second process tries to read the generated file but
> > it's not ready yet (still being generated by the first process).
> >
> > It would be good to handle the parallel JIT compilation as part of
> > Instant, but I don't know what the best solution is.
> >
> > > >   mpirun -n 4 python demo.py
> > >
> > > Do I have to set some environment variables to make this work? I
> > > can't get it to work (probably some stupid error) :P
> >
> > No, nothing. It should work out of the box.
> >
> > > Johan
> > >
> > > When running the above command I get:
> > >
> > > ssh: connect to host hake-laptop port 22: Connection refused
> >
> > Can you run other processes in parallel?
> >
> >   mpirun -n 4 ls
> >
> > Maybe you need to install sshd? I didn't know it was required.
>
> Yes, that did the trick! It is openssh-server in Ubuntu, btw, and I also had
> to put my public ssh key in my own authorized_keys file.

I also had to add an alias for mpirun:

  alias mpirun='mpirun -x PYTHONPATH -x LD_LIBRARY_PATH -x PKG_CONFIG_PATH -x PATH -x DISPLAY'

which forwards the listed environment variables, and I had to set

  DISPLAY=:0.0

in my .bashrc file.

I know that others do not have to forward these variables, but I couldn't find
a way to avoid it.
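
For anyone who would rather not rely on a shell alias, the same forwarding can be sketched in Python. The function name `build_mpirun_command` and the decision to skip unset variables are my own assumptions. The only mpirun options used are `-n` and `-x`, as in the commands in this thread.

```python
import os

# The variables forwarded by the alias in this message.
FORWARDED = ["PYTHONPATH", "LD_LIBRARY_PATH", "PKG_CONFIG_PATH", "PATH", "DISPLAY"]


def build_mpirun_command(num_procs, program_args, environ=None):
    """Build an mpirun argument list that forwards selected variables."""
    environ = os.environ if environ is None else environ
    cmd = ["mpirun", "-n", str(num_procs)]
    for name in FORWARDED:
        # Only forward variables that are set in the launching environment.
        if name in environ:
            cmd += ["-x", name]
    return cmd + list(program_args)
```

The resulting list can be passed to `subprocess.call`, for example `subprocess.call(build_mpirun_command(4, ["python", "demo.py"]))`.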

Johan

> Johan
>
> > --
> > Anders
> >
> > > --------------------------------------------------------------------------
> > > A daemon (pid 32065) died unexpectedly with status 255 while
> > > attempting to launch so we are aborting.
> > >
> > > There may be more information reported by the environment (see above).
> > >
> > > This may be because the daemon was unable to find all the needed shared
> > > libraries on the remote node. You may set your LD_LIBRARY_PATH to have
> > > the location of the shared libraries on the remote nodes and this will
> > > automatically be forwarded to the remote nodes.
> > > --------------------------------------------------------------------------
> > > --------------------------------------------------------------------------
> > > mpirun noticed that the job aborted, but has no info as to the
> > > process that caused that situation.
> > > --------------------------------------------------------------------------
> > > mpirun: clean termination accomplished
>
> _______________________________________________
> DOLFIN-dev mailing list
> DOLFIN-dev@xxxxxxxxxx
> http://www.fenics.org/mailman/listinfo/dolfin-dev

