yade-users team mailing list archive

[Question #700315]: Open MPI - spawn processes

 

New question #700315 on Yade:
https://answers.launchpad.net/yade/+question/700315

Hi guys, I am new to Open MPI, so I will try to be as clear as possible here.

I installed Yade 2021.01a on my cluster (singularity/yade/yade_debian_bookwarm_1.0.sif).

I can run simulations using as many cores as I want. My cluster has 160 nodes, each with 80 CPUs.
So far so good.

I am now trying to run on multiple nodes. For this, I am trying out this example [1].
When I run it, I get the following message:

+ singularity run /beegfs/common/singularity/yade/yade_debian_bookwarm_1.0.sif yade -j5 Parallel.py
/usr/lib/x86_64-linux-gnu/yade/py/yade/__init__.py:76: RuntimeWarning: to-Python converter for boost::shared_ptr<yade::PartialSatClayEngine> already registered; second conversion method ignored.
  boot.initialize(plugins,config.confDir)
TCP python prompt on localhost:9000, auth cookie `ksaeuc'
Welcome to Yade 2021.01a 
Using python version: 3.9.7 (default, Sep 24 2021, 09:43:00) 
[GCC 10.3.0]
Warning: no X rendering available (see https://bbs.archlinux.org/viewtopic.php?id=13189)
XMLRPC info provider on http://localhost:21000
Running script Parallel.py
Traceback (most recent call last):
  File "/usr/bin/yade", line 343, in runScript
    execfile(script,globals())
  File "/usr/lib/python3/dist-packages/past/builtins/misc.py", line 87, in execfile
    exec_(code, myglobals, mylocals)
  File "Parallel.py", line 28, in <module>
    mp.initialize(numMPIThreads)
  File "/usr/lib/x86_64-linux-gnu/yade/py/yade/mpy.py", line 288, in initialize
    comm_slave = MPI.COMM_WORLD.Spawn(yadeArgv[0], args=yadeArgv[1:],maxprocs=numThreads-process_count)
  File "mpi4py/MPI/Comm.pyx", line 1534, in mpi4py.MPI.Intracomm.Spawn
mpi4py.MPI.Exception: MPI_ERR_SPAWN: could not spawn processes
Master: will spawn  9  workers running: /usr/bin/yade ['-j5', 'Parallel.py'] 
[[ ^L clears screen, ^U kills line. F8 plot. ]]

In [1]: Do you really want to exit ([y]/n)? 

I am not sure where this error is coming from. Any ideas?


This is how I am running it in my batch script:

#!/bin/bash -x
#SBATCH --nodes=2
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=80
#SBATCH --partition=compute
#SBATCH --job-name=DEM_PFV_Parallel
#SBATCH --time=10:00:00

singularity run /beegfs/common/singularity/yade/yade_debian_bookwarm_1.0.sif yade -j5 Case2_rotating_drum_mpi.py



PS. I am assuming that numMPIThreads = 10 in the Python script should equal nodes * -j (2 * 5 in this case).
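
To make the numbers in the PS concrete, here is a small sketch of the arithmetic as I understand it. The variable and function names are my own (hypothetical), and process_count = 1 is an assumption: it is the value that matches the "will spawn 9 workers" line in the log, given the maxprocs expression shown in the traceback. Whether numMPIThreads is really meant to be nodes * -j is exactly what I am unsure about.

```python
# Values copied from the batch script and the yade command line above.
nodes = 2              # #SBATCH --nodes
ntasks = 2             # #SBATCH --ntasks
cpus_per_task = 80     # #SBATCH --cpus-per-task
j_threads = 5          # yade -j5
num_mpi_threads = 10   # numMPIThreads in the Python script

# Assumption: only the master process exists when mp.initialize() is called,
# which is consistent with "will spawn 9 workers" in the log.
process_count = 1


def workers_to_spawn(num_threads, already_running):
    """Mirror the maxprocs expression from the traceback (mpy.py, line 288)."""
    return num_threads - already_running


spawned = workers_to_spawn(num_mpi_threads, process_count)
allocated_cores = ntasks * cpus_per_task   # cores requested from Slurm
my_guess = nodes * j_threads               # the relation assumed in the PS

print("workers to spawn:", spawned)
print("cores allocated by Slurm:", allocated_cores)
print("nodes * -j:", my_guess, "equals numMPIThreads:", my_guess == num_mpi_threads)
```

So the master asks MPI to spawn 9 additional yade processes, each started with -j5, inside an allocation of 2 tasks with 80 CPUs each.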

[1] https://gitlab.com/yade-dev/trunk/-/blob/master/examples/DEM2020Benchmark/Case2_rotating_drum_mpi.py
