yade-mpi team mailing list archive

Re: deadlock fixed (?)

 

Hi Deepak,

I think there is one more fix to be done (from commit 7dd44a4a in mpi):
>>> The bodies are sent using the non-blocking MPI_Isend; this has to be
>>> completed with MPI_Wait. As of now the MPI_Wait calls are not made, and
>>> there is a minor memory issue to be fixed which François is working on.
>>>
>> Yes, I'll do it soon.
>>
>
>  I fixed this yesterday, ran a few tests and it worked. Anyway, it's better
> if you have a look.
>

Thanks, I had a look and changed the logic of the memory-issue fix (see my
last commit
<https://gitlab.com/yade-dev/trunk/commit/68e358b93f1b84fdb14e54fb50e07d91f54f6863>
). I added an init function in Subdomain.hpp which is called after the
first scene split. This way we can initialize the buffer vectors (like
stringBuff) to the communicator size (MPI_Comm_size), avoiding numerous
push_back() and clear() calls on the vectors of buffers.
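
If it helps to picture the idea, here is a minimal sketch (the struct name and
the sendRequests member are hypothetical, not the actual Subdomain code; only
init, stringBuff and MPI_Comm_size come from the discussion above): the buffers
are sized once, right after the first split, so the communication loop never
needs push_back() or clear().

    #include <mpi.h>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for the buffer members of Subdomain.
    struct SubdomainBuffers {
        std::vector<std::string> stringBuff;   // one serialized-bodies buffer per rank
        std::vector<MPI_Request> sendRequests; // matching request slot per rank

        // Called once after the first scene split: size everything to the
        // communicator size so later iterations only reuse the slots.
        void init(MPI_Comm comm) {
            int commSize = 0;
            MPI_Comm_size(comm, &commSize);
            stringBuff.assign(commSize, std::string());      // fixed size, contents reused
            sendRequests.assign(commSize, MPI_REQUEST_NULL); // no reallocation later
        }
    };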

Concerning the non-blocking MPI_Isend: calling MPI_Wait was not necessary
when a basic global barrier was used. I'm afraid that looping over the send
requests and waiting for each of them to complete can slow down the
communications, as it forces the send order one more time (the receive
order is already forced here
<https://gitlab.com/yade-dev/trunk/blob/mpi/py/mpy.py#L641>).
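
As a side note on that ordering concern, here is an illustrative sketch (the
function and variable names are made up, not the actual Subdomain/mpy code) of
completing a batch of MPI_Isend requests with a single MPI_Waitall rather than
one MPI_Wait per request: the whole request array is handed to MPI at once
instead of being waited on in a fixed order.

    #include <mpi.h>
    #include <string>
    #include <vector>

    // Illustrative only: send one buffer to every other rank with MPI_Isend,
    // then complete the whole batch with a single MPI_Waitall.
    void sendToAll(MPI_Comm comm, const std::vector<std::string>& stringBuff) {
        int commSize = 0, rank = 0;
        MPI_Comm_size(comm, &commSize);
        MPI_Comm_rank(comm, &rank);

        std::vector<MPI_Request> requests;
        requests.reserve(commSize);
        for (int dest = 0; dest < commSize; ++dest) {
            if (dest == rank) continue;
            MPI_Request req;
            // MPI-3 const send buffer assumed here.
            MPI_Isend(stringBuff[dest].data(), static_cast<int>(stringBuff[dest].size()),
                      MPI_CHAR, dest, 0 /*tag*/, comm, &req);
            requests.push_back(req);
        }
        // Blocks until every send in the batch has completed and frees the
        // requests, without issuing a separate wait per request.
        MPI_Waitall(static_cast<int>(requests.size()), requests.data(),
                    MPI_STATUSES_IGNORE);
    }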
