Anders Logg wrote:
On Thu, Aug 21, 2008 at 09:10:03AM +0200, Niclas Jansson wrote:

Anders Logg wrote:

On Wed, Aug 20, 2008 at 06:17:30PM +0200, Niclas Jansson wrote:

Stage 2 seems to involve a lot of communication, with small messages. I think it would be more efficient if the stage were reorganized such that all messages could be exchanged "at once", in a couple of larger messages.

That would be nice. I'm very open to suggestions.

If I understand the {T, S, F} overlap correctly, a facet could be globally identified by the value of F(facet).

No, F(facet) would be the local number of the facet in subdomain S(facet).

If so, one suggestion is to buffer N_i and F(facet) in 0...p-1 buffers (one for each processor) and exchange these during stage 2.

  -- stage 1
  for each facet f \in T
    j = S_i(f)
    if j > i
      -- calculate dof N_i
      buffer[S_i(f)].add(N_i)
      buffer[S_i(f)].add(F_i(f))
    end
  end

  -- stage 2
  -- Exchange shared dofs with fancy MPI_Allgatherv or a lookalike
  -- MPI_SendRecv loop.
  for j = 1 to (num processors - 1)
    src  = (rank - j + num processors) % num processors
    dest = (rank + j) % num processors
    MPI_SendRecv(dest, buffer[dest], src, recv_buffer)
    for i = 0 to sizeof(recv_buffer), i += 2
      -- update facet recv_buffer(i+1) with dof value in recv_buffer(i)
    end
  end

I didn't look at this in detail (yet). Is it still valid with the above interpretation of F(facet)?

Yes, I think so.

I think I understand your point, but I don't understand the details of your code.
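For concreteness, the stage 2 ring exchange in the pseudocode above could be written with MPI_Sendrecv roughly as follows. This is only a sketch, not DOLFIN code; the function name, the buffer layout (pairs of dof value and local facet number), and the extra size exchange before the data exchange are my own additions.

  #include <mpi.h>
  #include <vector>

  // Ring exchange of the per-destination buffers: in round j, process 'rank'
  // sends to (rank + j) and receives from (rank - j), so every pair of
  // processes exchanges its shared-facet data exactly once.
  void exchange_shared_facets(std::vector<std::vector<unsigned int> >& buffer,
                              int rank, int num_processes)
  {
    for (int j = 1; j < num_processes; ++j)
    {
      const int src  = (rank - j + num_processes) % num_processes;
      const int dest = (rank + j) % num_processes;
      MPI_Status status;

      // Exchange buffer sizes first so the receive buffer can be allocated
      int send_size = static_cast<int>(buffer[dest].size());
      int recv_size = 0;
      MPI_Sendrecv(&send_size, 1, MPI_INT, dest, 0,
                   &recv_size, 1, MPI_INT, src, 0,
                   MPI_COMM_WORLD, &status);

      // Exchange the packed (dof, local facet number) pairs
      std::vector<unsigned int> recv_buffer(recv_size);
      unsigned int* send_ptr = send_size > 0 ? &buffer[dest][0] : 0;
      unsigned int* recv_ptr = recv_size > 0 ? &recv_buffer[0] : 0;
      MPI_Sendrecv(send_ptr, send_size, MPI_UNSIGNED, dest, 1,
                   recv_ptr, recv_size, MPI_UNSIGNED, src, 1,
                   MPI_COMM_WORLD, &status);

      // Unpack: each pair is (dof value, local facet number on this process)
      for (int i = 0; i + 1 < recv_size; i += 2)
      {
        const unsigned int dof   = recv_buffer[i];
        const unsigned int facet = recv_buffer[i + 1];
        // ... update the dofs of local facet 'facet' with 'dof' ...
        (void) dof; (void) facet;
      }
    }
  }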
If j > i, the processor is responsible for creating M_i for the shared facet. The newly created M_i is placed in the send buffer for subdomain S_i(f), together with the facet's local number in that subdomain.
So the send buffers contain tuples {M_i, F_i(f)}. Since there is one buffer for each subdomain, one can be sure that F_i(f) is valid on the receiving processor.
Instead of iterating over all processors and facets in stage 2, each processor receives one set of tuples (covering all of its shared facets) from each other processor. These can then be used to identify the local facet (since F_i(f) is the local facet number) and to assign the dofs, which, if I understand everything correctly, are obtained from M_i.
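Stage 1 would then pack the per-subdomain send buffers along these lines (a sketch with made-up names; the facet's dofs stand in for the M_i data and are followed by F_i(f), as described above):

  #include <vector>

  // Stage 1: for each shared facet this process is responsible for (j > rank),
  // append the facet's dofs (the M_i data) followed by its local number in the
  // neighbouring subdomain, F_i(f), to the buffer for that subdomain.
  // facet_subdomain[f] is assumed to hold the neighbouring process for shared
  // facets and the process's own rank for interior facets.
  void pack_shared_facets(const std::vector<unsigned int>& facet_subdomain,         // S_i
                          const std::vector<unsigned int>& facet_number,            // F_i
                          const std::vector<std::vector<unsigned int> >& facet_dofs,
                          unsigned int rank,
                          std::vector<std::vector<unsigned int> >& buffer)
  {
    for (unsigned int f = 0; f < facet_subdomain.size(); ++f)
    {
      const unsigned int j = facet_subdomain[f];
      if (j > rank)
      {
        std::vector<unsigned int>& b = buffer[j];
        b.insert(b.end(), facet_dofs[f].begin(), facet_dofs[f].end());
        b.push_back(facet_number[f]);
      }
    }
  }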
One modification to the above algorithm: I think it is easier if the tuples are stored as {F_i(f), M_i}, since M_i could be a long list of dofs. The update loop would then be something similar to
  for i = 0 to size of recv_buffer, i += (number of dofs per facet + 1)
    local facet f = recv_buffer(i)
    for each dof on f, loop counter j
      assign recv_buffer((i + 1) + j) to facet dof j
    end
  end
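A rough C++ version of that update loop could look like this (the names are mine, and exactly how the received dofs are attached to the local facet depends on how M_i is stored in practice):

  #include <cstddef>
  #include <vector>

  // Unpack tuples {F_i(f), M_i}: the first entry of each tuple is the local
  // facet number, followed by the facet's dofs. facet_dofs is assumed to be
  // pre-sized to dofs_per_facet entries per facet.
  void update_facet_dofs(const std::vector<unsigned int>& recv_buffer,
                         unsigned int dofs_per_facet,
                         std::vector<std::vector<unsigned int> >& facet_dofs)
  {
    const unsigned int stride = dofs_per_facet + 1;
    for (std::size_t i = 0; i + stride <= recv_buffer.size(); i += stride)
    {
      const unsigned int facet = recv_buffer[i];        // F_i(f)
      for (unsigned int j = 0; j < dofs_per_facet; ++j)
        facet_dofs[facet][j] = recv_buffer[i + 1 + j];  // dofs from M_i
    }
  }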
The mapping N_i is an auxiliary global-to-global mapping, which maps the global dofs on a local mesh to global dofs on the global mesh. It has a meaning only on each local mesh. What we want to communicate is the stuff in M_i.
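For what it's worth, my reading of the two mappings, written as hypothetical C++ types (not the actual DOLFIN data structures), is roughly:

  #include <map>
  #include <vector>

  // N_i: "global" dof number on the local mesh -> dof number in the global
  // problem; only meaningful within subdomain i, so not what we communicate.
  typedef std::map<unsigned int, unsigned int> GlobalToGlobalDofMap;

  // M_i: the data that actually needs to be communicated for shared facets,
  // e.g. the (globally numbered) dofs associated with each local facet.
  typedef std::vector<std::vector<unsigned int> > FacetDofMap;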
I see, then it should be M_i in the outlined code.

Niclas
Should we try to implement this? It will essentially be Algorithm 5++ (Algorithm 5 with your improvements). So we don't store a global numbering of mesh entities but instead compute a global dof map in parallel. And we store the overlap as MeshData in some way (a set of MeshFunctions attached to each local mesh). I'm very open to which set of MeshFunctions we will need: just S and F, or additional data.

Other opinions? Garth? Ola?

-- Anders
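For illustration, attaching S and F to each local mesh as facet-valued MeshFunctions might look roughly like this (a sketch assuming DOLFIN's MeshFunction interface; the mesh and variable names are just placeholders):

  #include <dolfin.h>
  using namespace dolfin;

  int main()
  {
    UnitSquare mesh(8, 8);                  // stand-in for a local (partitioned) mesh
    const uint D = mesh.topology().dim();

    // S(facet): the subdomain (process) that shares the facet
    MeshFunction<uint> S(mesh, D - 1);
    // F(facet): the facet's local number in subdomain S(facet)
    MeshFunction<uint> F(mesh, D - 1);

    // ... fill in S and F when the mesh is partitioned and distributed ...

    return 0;
  }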