Johan Hake wrote:
> On Tuesday 18 August 2009 14:54:24 Garth N. Wells wrote:
>> Johan Hake wrote:
>>> On Monday 17 August 2009 19:19:40 Anders Logg wrote:
>>>> On Mon, Aug 17, 2009 at 07:09:11PM +0200, DOLFIN wrote:
>>>>> changeset:   6762:ca407204632a1b0430099c243c915a151b2bd941
>>>>> parent:      6759:efc24a341e41e9e0c83616be4613d819fe95ccb6
>>>>> user:        Anders Logg <logg@xxxxxxxxx>
>>>>> date:        Mon Aug 17 19:08:56 2009 +0200
>>>>> files:       site-packages/dolfin/compile_function.py
>>>>>              site-packages/dolfin/jit.py
>>>>> description:
>>>>> Make JIT compiler work in parallel. The process number is added to
>>>>> the signature to create a unique signature for each process. This
>>>>> means that each process will compile its own form. This may not be
>>>>> optimal and could possibly be handled by Instant. On the other hand,
>>>>> it seems to work nicely and might also be advantageous when
>>>>> processes don't share a common cache.
>>>>
>>>> The Poisson Python demo now runs as is without the need for first
>>>> running it in serial (to handle JIT compilation):
>>>
>>> Did it not work before this change? I know Martin added some file
>>> locks to prevent simultaneous compilations of the same module.
>>>
>>>>   mpirun -n 4 python demo.py
>>>
>>> Do I have to set some environment variables to make this work? I
>>> can't get it to work (probably some stupid error) :P
>>>
>>> Johan
>>>
>>> When running the above command I get:
>>>
>>>   ssh: connect to host hake-laptop port 22: Connection refused
>>
>> This looks like a known bug in the Ubuntu MPI package
>>
>>   http://www.open-mpi.org/community/lists/users/2009/03/8571.php
>>   https://bugs.launchpad.net/ubuntu/+source/openmpi/+bug/365122
>
> I downloaded the source packages, applied the patch and compiled.
> Unfortunately it did not solve the problem.
Did you download the latest version of OpenMPI? There is no need to patch the latest version.
Do you have other Ubuntu packages that depend on MPI installed, like PETSc? Because I don't use the Ubuntu MPI package, I have built PETSc, ParMETIS, etc myself.
Garth
> I think I'll stay with the local ssh workaround for now. It is mainly
> for testing.
>
> Johan
>
>> Garth
>>
>>> --------------------------------------------------------------------------
>>> A daemon (pid 32065) died unexpectedly with status 255 while attempting
>>> to launch so we are aborting.
>>>
>>> There may be more information reported by the environment (see above).
>>>
>>> This may be because the daemon was unable to find all the needed shared
>>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have
>>> the location of the shared libraries on the remote nodes and this will
>>> automatically be forwarded to the remote nodes.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> mpirun noticed that the job aborted, but has no info as to the process
>>> that caused that situation.
>>> --------------------------------------------------------------------------
>>> mpirun: clean termination accomplished

_______________________________________________
DOLFIN-dev mailing list
DOLFIN-dev@xxxxxxxxxx
http://www.fenics.org/mailman/listinfo/dolfin-dev
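
For readers following the thread: the changeset description quoted above says the process number is added to the JIT signature so that each process gets a unique signature and compiles its own module. The snippet below is only a minimal sketch of that idea, not the code from the changeset. The function name `per_process_signature`, the `_process_` tag format and the use of SHA-1 are all assumptions made for illustration; in a real run the process number would come from MPI rather than from a loop.

```python
import hashlib


def per_process_signature(form_signature: str, process_number: int) -> str:
    """Return a cache key that differs for every (form, process) pair.

    Hypothetical helper: the tag format and the hash are illustrative
    assumptions, not what dolfin/jit.py actually does.
    """
    # Append the process number so two processes never share a key.
    tagged = f"{form_signature}_process_{process_number}"
    # Hash the tagged string to get a fixed-length, filename-safe key.
    return hashlib.sha1(tagged.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Simulate the four processes of "mpirun -n 4" with a plain loop.
    for rank in range(4):
        print(rank, per_process_signature("poisson_form", rank))
```

With keys like these no two processes write to the same cache entry, which is consistent with the description's remark that each process compiles its own form. The cost is that the same form is compiled once per process.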