Re: Parallel assembly

 

On Fri, Dec 07, 2007 at 02:37:57PM +0000, Garth N. Wells wrote:
> 
> 
> Anders Logg wrote:
> > On Fri, Dec 07, 2007 at 02:13:22PM +0000, Garth N. Wells wrote:
> >>
> >> Anders Logg wrote:
> >>> On Fri, Dec 07, 2007 at 03:04:12PM +0100, Anders Logg wrote:
> >>>> On Wed, Dec 05, 2007 at 06:18:53PM +0100, Johan Hoffman wrote:
> >>>>>> Just some comments on the strategy.
> >>>>>>
> >>>>>> On Dec 5, 2007 7:50 AM, Anders Logg <logg@xxxxxxxxx> wrote:
> >>>>>>> On Mon, Dec 03, 2007 at 11:44:44AM +0100, Niclas Jansson wrote:
> >>>>>>>
> >>>>>>>> It's a different strategy that uses point to point instead of
> >>>>>>> collective
> >>>>>>>> communication. However the plan for parallel assembly should be more
> >>>>>>> or
> >>>>>>>> less the same.
> >>>>>>>>
> >>>>>>>> I attached the more detailed TODO list, it should explain the
> >>>>>>> necessary
> >>>>>>>> changes to the Mesh classes.
> >>>>>>>>
> >>>>>>>> Niclas
> >>>>>>>> Modify mesh representation to store both local and global indices
> >>>>>>>> for each cell/vertex. Implement mesh functions to map between local
> >>>>>>>> and global indices. The local indices correspond to the current
> >>>>>>>> cell and vertex indices; only the mapping functions must be added
> >>>>>>>> to the Mesh class.
> >>>>>>> I don't think we should store both local and global indices for mesh
> >>>>>>> entities. All we need is to store the mapping from local to global
> >>>>>>> indices. We can use MeshFunctions for this but it's not necessary.
> >>>>>>>
> >>>>>>> My suggestion would be to add a new class MeshNumbering (maybe someone
> >>>>>>> can suggest a better name) which would store the numbering scheme in a
> >>>>>>> set of (dim + 1) arrays:
> >>>>>>>
> >>>>>>>   class MeshNumbering
> >>>>>>>   {
> >>>>>>>   public:
> >>>>>>>
> >>>>>>>     ...
> >>>>>>>
> >>>>>>>   private:
> >>>>>>>
> >>>>>>>     uint** numbering;
> >>>>>>>
> >>>>>>>   };
> >>>>>>>
> >>>>>>> One array for each dimension, so numbering[0][i] would return the
> >>>>>>> global number for the vertex with index i.
> >>>>>>>
> >>>>>>> We can also add an easy-access function for global numbers to
> >>>>>>> MeshEntity so that (for example) e.number() would return the global
> >>>>>>> number of an entity e.
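
A minimal sketch of how the pieces above could fit together (the accessor
name, the forwarding in MeshEntity and the plain typedef for uint are
assumptions, not settled API):

  typedef unsigned int uint; // dolfin::uint in the real code

  // Sketch only: one numbering array per topological dimension,
  // filled in by the mesh reader or the partitioner.
  class MeshNumbering
  {
  public:

    // Global number of entity i of topological dimension dim,
    // e.g. numbering[0][i] is the global number of vertex i.
    uint global(uint dim, uint i) const
    { return numbering[dim][i]; }

  private:

    uint** numbering; // numbering[dim][i], dim = 0, ..., D

  };

  // The easy-access function could then just forward to the mesh:
  //
  //   uint MeshEntity::number() const
  //   { return mesh().numbering().global(dim(), index()); }
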
> >>>>> I think the motivation for locally storing the global index is to keep
> >>>>> the algorithms fully distributed. As far as I understand, with a
> >>>>> MeshNumbering solution as you outline, you would store this data on one
> >>>>> processor? Or do I misunderstand this? For very large problems we want
> >>>>> all algorithms to be fully distributed.
> >>>> I mean storing the indices locally on each processor (for its part of
> >>>> the mesh), but storing them in global arrays on those processors. There
> >>>> is no way to store a value locally at each vertex etc. since those
> >>>> objects (vertices, cells and, in general, MeshEntity) don't exist. Only
> >>>> the arrays exist, and the objects are views that get created by the
> >>>> iterators.
> >>>>
> >>>>>> I will just point out that I think this is very limiting. You can
> >>>>>> argue that it covers what you want to do, but it is quite inflexible
> >>>>>> compared with having names. It is an incredible pain in the ass to
> >>>>>> rebalance (or do anything else complicated, like AMR) if you rely on
> >>>>>> offsets (numberings) rather than names. I recommend (as we do) using
> >>>>>> names until you have exactly the mesh you want, and then reducing to
> >>>>>> offsets. This is implemented manually in Sieve right now (you call a
> >>>>>> method), but I am trying to automate it with code generation.
> >>>>> I am not sure what you mean by "names". But I guess you mean that you
> >>>>> want to work with different closures of a set of mesh entities (like
> >>>>> cells, for example)? If that's what you mean, then I would agree.
> >>>>>
> >>>>>>>> Adapt mesh reading for the new representation; store mesh data based
> >>>>>>>> on the number of local cells/vertices instead of the parsed numbers.
> >>>>>>>> This modification allows processors to read different parts of the
> >>>>>>>> mesh in parallel, making an initial distribution step unnecessary.
> >>>>>>>>
> >>>>>>>> Loading meshes in parallel should increase efficiency, reduce costly
> >>>>>>>> communication and save memory for large-scale problems, given that
> >>>>>>>> the parallel environment has a shared file system that can handle
> >>>>>>>> the load. However, the serial distribution should still be
> >>>>>>>> implemented to support environments without shared file systems.
> >>>>>>>>
> >>>>>>>>  Modifications for the new representation should be implemented in
> >>>>>>>>  the XMLMesh class. Functions for initial mesh distribution should
> >>>>>>>>  be implemented in a new class.
> >>>>>>> For this, we should add optional data to the mesh format, such that
> >>>>>>> the current file format still works. If additional data is present,
> >>>>>>> then that is read into MeshNumbering, otherwise it is empty.
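
As a small illustration of the fallback (plain C++ with a hypothetical
helper, not the actual XMLMesh code):

  #include <vector>

  typedef unsigned int uint; // dolfin::uint in the real code

  // If the file carried no optional numbering section, fall back to the
  // identity map so the current serial format keeps working unchanged.
  void init_numbering(std::vector<uint>& numbering, uint num_entities)
  {
    if (!numbering.empty())
      return; // optional data was present and has already been read

    numbering.resize(num_entities);
    for (uint i = 0; i < num_entities; ++i)
      numbering[i] = i; // local index == global index in the serial case
  }
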
> >>>>> Yes, that is natural (the serial/regular mesh format should still work
> >>>>> without modifications). However, I am not sure that the MeshNumbering
> >>>>> approach is a good solution.
> >>>>>
> >>>>>>> (Come to think of it, MeshNumbering may not be a good choice of name
> >>>>>>> for the new class, since it may be confused with MeshOrdering, which
> >>>>>>> does something different but related.)
> >>>>>>>
> >>>>>>>> Change the mesh partitioning library to ParMETIS. Modify the
> >>>>>>>> partitioning class to work on distributed data, add the necessary
> >>>>>>>> calls to METIS and redistribute the local vertices/cells according
> >>>>>>>> to the result. Since METIS can partition a mesh directly using an
> >>>>>>>> internal mesh-to-graph translation, it is possible to do the
> >>>>>>>> partitioning directly in the MeshPartition class. However, both
> >>>>>>>> methods could easily be implemented and compared against each other.
> >>>>>>> We don't want to change from SCOTCH to ParMETIS, but we could add
> >>>>>>> support for using METIS/ParMETIS as an option.
> >>>>> Yes, that is right. The implementation of the basic algorithms should
> >>>>> look more or less the same for SCOTCH and ParMETIS, and then we could
> >>>>> add extra optional functionality based on ParMETIS until we can find
> >>>>> corresponding functionality in SCOTCH.
> >>>>>
> >>>>>> Have you thought about generalizing the partitioning to hypergraphs?
> >>>>>> I just did this so I can partition faces (for FVM) and it was not that
> >>>>>> bad. I use Zoltan from Sandia.
> >>>>> Sounds interesting. I do not know about this, but it may be worth
> >>>>> looking at.
> >>>>>
> >>>>>>>> Finish implementation of mesh communication class MPIMeshCommunicator.
> >>>>>>>> Add functionality for single vertex and cell communication needed
> >>>>>>>> for mesh partitioning.
> >>>>>>> What do you mean by single vertex and cell communication? Also note
> >>>>>>> that it is not enough to communicate indices for vertices and
> >>>>>>> cells. Sometimes we also need to communicate edges and faces.
> >>>>>> That is why you should never explicitly refer to vertices and cells,
> >>>>>> but rather communicate the entire closure and star of each element
> >>>>>> that you send. That is the point of the mesh structure: to avoid this
> >>>>>> kind of special-purpose coding.
> >>>>>>
> >>>>>>>> Adapt the boundary calculation to work on distributed meshes. Use
> >>>>>>>> knowledge about which vertices are shared among processors to decide
> >>>>>>>> if an edge is global or local. Implement the logic directly in the
> >>>>>>>> BoundaryComputation class using information from the mesh
> >>>>>>>> partitioning.
> >>>>>>> I'm not sure I understand this point.
> >>>>> If you only have a part of the mesh, as is the case for a fully
> >>>>> distributed mesh, then you need to know if your Boundary (of the local
> >>>>> Mesh) is an internal boundary, which should communicate with its
> >>>>> neighboring partitions, or an external boundary, for which you apply
> >>>>> boundary conditions.
> >>>> ok, good point.
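
A rough sketch of that check, using plain data structures rather than the
actual Mesh/BoundaryComputation classes (the set of shared vertices would
come from the partitioning step; the simple all-vertices-shared test can
misclassify corner cases, but it matches the vertex-sharing idea above):

  #include <set>
  #include <vector>

  typedef unsigned int uint; // dolfin::uint in the real code

  // A boundary facet all of whose vertices are shared with another
  // process is treated as an internal (process) boundary; otherwise it
  // is an external boundary where boundary conditions apply.
  bool is_process_boundary(const std::vector<uint>& facet_vertices,
                           const std::set<uint>& shared_vertices)
  {
    for (uint i = 0; i < facet_vertices.size(); ++i)
      if (shared_vertices.count(facet_vertices[i]) == 0)
        return false;
    return true;
  }
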
> >>> We also need to be able to assemble over interior facets (in DG
> >>> methods). There we iterate over all interior facets and for each facet
> >>> pick its two neighboring cells, so those two cells need to be
> >>> available. So if we want to partition along a given facet, do we
> >>> include the two cells on either side in both sub domains so that both
> >>> cells are available from both sub domains?
> >>>
> >> I would say yes. This is how FD methods typically work in parallel.
> >>
> >> Garth
> > 
> > ok, good.
> > 
> > Magnus, can you add a flag to Mesh::partition() for doing this?
> > 
> > To start with, we can just have a boolean:
> > 
> >   void partition(uint num_partitions, MeshFunction<uint>& partitions, bool overlap=true);
> > 
> > The question is how to number those cells which are on two processors.
> > One option would be
> > 
> >   (i + 1)*num_partitions + j
> > 
> > if a cell should be on processors i and j.
> > 
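
For what it is worth, the suggested encoding is easy to invert, and values
below num_partitions remain free for cells living on a single processor
(a sketch, assuming 0 <= j < num_partitions):

  typedef unsigned int uint; // dolfin::uint in the real code

  // Mark a cell as living on processors i and j.
  uint encode(uint i, uint j, uint num_partitions)
  { return (i + 1)*num_partitions + j; }

  // Recover the two processor numbers from the marker.
  void decode(uint marker, uint num_partitions, uint& i, uint& j)
  {
    j = marker % num_partitions;
    i = marker/num_partitions - 1;
  }
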
> 
> In terms of the mesh, we won't need this now because each process has 
> the entire mesh. It will be needed when we go fully distributed, but I 
> think that this is the lowest priority in terms of parallelisation at 
> this stage.
> 
> We may need another MeshFunction which is tied to facets so each process
> knows whether or not it should assemble a facet integral at a process
> boundary.
> 
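
A minimal sketch of the kind of rule such a facet-wise MeshFunction could
encode (the tie-breaking choice is only an example):

  typedef unsigned int uint; // dolfin::uint in the real code

  // At a process boundary, let the lower-ranked neighbour own the facet,
  // so that its integral is assembled exactly once.
  bool assemble_shared_facet(uint my_rank, uint neighbour_rank)
  { return my_rank < neighbour_rank; }
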
> What we do need in the short term are the values of functions defined on 
> the 'ghost' cells so that DG terms can be computed at processor 
> boundaries. This will require some careful re-designing of DiscreteFunction.

Yes. I had been hoping PETSc would handle this for us, but I see now
that VecGetValues can only fetch values on the current processor.

-- 
Anders

