dolfin team mailing list archive

Re: Parallel assembly

On Wed, Dec 05, 2007 at 01:23:15PM -0600, Matthew Knepley wrote:
> On Dec 5, 2007 11:18 AM, Johan Hoffman <jhoffman@xxxxxxxxxx> wrote:
> >
> > > Just some comments on the strategy.
> > >
> > > On Dec 5, 2007 7:50 AM, Anders Logg <logg@xxxxxxxxx> wrote:
> > >> On Mon, Dec 03, 2007 at 11:44:44AM +0100, Niclas Jansson wrote:
> > >>
> > >> > It's a different strategy that uses point-to-point instead of
> > >> > collective communication. However, the plan for parallel assembly
> > >> > should be more or less the same.
> > >> >
> > >> > I attached the more detailed TODO list; it should explain the
> > >> > necessary changes to the Mesh classes.
> > >> >
> > >> > Niclas
> > >>
> > >> > Modify mesh representation to store both local and global indices
> > >> > for each cell/vertex. Implement mesh functions to map between local
> > >> > and global indices. The local indices correspond to the current
> > >> > cell and vertex indices; only the mapping functions need to be added
> > >> > to the Mesh class.
> > >>
> > >> I don't think we should store both local and global indices for mesh
> > >> entities. All we need is to store the mapping from local to global
> > >> indices. We can use MeshFunctions for this but it's not necessary.
> > >>
> > >> My suggestion would be to add a new class MeshNumbering (maybe someone
> > >> can suggest a better name) which would store the numbering scheme in a
> > >> set of (dim + 1) arrays:
> > >>
> > >>   class MeshNumbering
> > >>   {
> > >>   public:
> > >>
> > >>     ...
> > >>
> > >>   private:
> > >>
> > >>     uint** numbering;
> > >>
> > >>   };
> > >>
> > >> One array for each dimension, so numbering[0][i] would return the
> > >> global number for the vertex with index i.
> > >>
> > >> We can also add an easy-access function for global numbers to
> > >> MeshEntity so that (for example) e.number() would return the global
> > >> number of an entity e.
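
To make the suggestion concrete, here is a minimal sketch (the accessor name
and signature are just a guess at this point, not settled API):

  // Sketch only; uint stands in for dolfin::uint (unsigned int)
  typedef unsigned int uint;

  class MeshNumbering
  {
  public:

    /// Return global number of the entity with topological dimension dim
    /// and local index i, so number(0, i) is the vertex case
    uint number(uint dim, uint i) const
    { return numbering[dim][i]; }

  private:

    // One array per topological dimension, indexed by local entity index
    uint** numbering;

  };

e.number() on MeshEntity would then just forward to this accessor using the
entity's topological dimension and local index.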
> >
> > I think the motivation for locally storing the global index is to keep the
> > algorithms fully distributed. As far as I understand, with a
> > MeshNumbering solution as you outline it, you would store this data on one
> > processor? Or do I misunderstand this? For very large problems we want all
> > algorithms to be fully distributed.
> >
> > > I will just point out that I think this is very limiting. You can argue
> > > that it covers what you want to do, but it is quite inflexible compared
> > > with having names. It is an incredible pain in the ass to rebalance (or
> > > do anything else complicated, like AMR) if you rely on offsets
> > > (numberings) rather than names. I recommend (as we do) using names until
> > > you have exactly the mesh you want, and then reducing to offsets. This is
> > > implemented manually in Sieve right now (you call a method), but I am
> > > trying to automate it with code generation.
> >
> > I am not sure what you mean by "names"? But I guess that you mean that
> > you want to work with different closures of a set of mesh entities (like
> > cells for example)? If that's what you mean, then I would agree.
> 
> A name refers to a piece of the mesh apart from its place in an order. Thus,
> when a piece is communicated to another process, its name does not change.
> Furthermore, if I add a piece on some process, its new name does not disrupt
> some global order automatically. I hate having global orders and avoid them
> whenever I can.

This still sounds abstract to me. How do you implement it, and what
is a name? Is it an integer? So you assemble into entries in the
matrix that are triples, (k, i, j), where k is the name of the
subdomain, and then you communicate/move things around to get the
global matrix?

(I'm sure our approach is limiting, but we need to start somewhere.)
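
Just to make my own question above less abstract, here is one way I could
imagine it working (a guess at the scheme, not how Sieve actually does it):
contributions are staged as named triples and only a later step translates
the names and ships the entries to whoever owns the corresponding rows of
the global matrix.

  #include <vector>

  struct StagedEntry
  {
    unsigned int piece;  // name of the piece/subdomain (the "k")
    unsigned int i, j;   // row/column indices local to that piece
    double value;        // contribution to add
  };

  // Each process collects contributions without touching any global order;
  // a later step translates the names and moves the entries to the process
  // that owns the corresponding rows of the global matrix.
  void stage(std::vector<StagedEntry>& staged, unsigned int piece,
             unsigned int i, unsigned int j, double value)
  {
    StagedEntry e = {piece, i, j, value};
    staged.push_back(e);
  }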

-- 
Anders


> > >> > Adapt mesh reading for the new representation, store mesh data based
> > >> > on the number of local cells/vertices instead of parsed numbers. This
> > >> > modification allows processors to read different parts of the mesh
> > >> > in parallel, making an initial distribution step unnecessary.
> > >> >
> > >> > Loading meshes in parallel should increase efficiency, reduce costly
> > >> > communication and save memory for large-scale problems, given that
> > >> > the parallel environment has a shared file system that can handle
> > >> > the load. However, the serial distribution should still be
> > >> > implemented to support environments without shared file systems.
> > >> >
> > >> >  Modifications for the new representation should be implemented in
> > >> >  class XMLMesh. Functions for initial mesh distribution should be
> > >> >  implemented in a new class.
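
As a rough illustration of the kind of bookkeeping this needs (the even
splitting and the function name below are made up, not the planned XMLMesh
code):

  #include <mpi.h>

  // Illustration only: split num_global_cells evenly over the ranks and let
  // each rank keep just its own slice while parsing the mesh file.
  void local_cell_range(unsigned int num_global_cells,
                        unsigned int& first_cell, unsigned int& last_cell)
  {
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const unsigned int chunk = num_global_cells / size;
    const unsigned int remainder = num_global_cells % size;

    // Spread the remainder over the first ranks so sizes differ by at most one
    first_cell = rank*chunk + (rank < (int) remainder ? rank : remainder);
    last_cell  = first_cell + chunk + (rank < (int) remainder ? 1 : 0);

    // While parsing, a cell with global number g is kept only if
    // first_cell <= g < last_cell, and stored under local index g - first_cell.
  }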
> > >>
> > >> For this, we should add optional data to the mesh format, such that
> > >> the current file format still works. If additional data is present,
> > >> then that is read into MeshNumbering, otherwise it is empty.
> >
> > Yes, that is natural (that the serial/regular mesh format should still
> > work without modifications). That said, I am not sure the MeshNumbering
> > approach is a good solution.
> >
> > >> (When I think of it, MeshNumbering may not be a good choice of name
> > >> for the new class, since it may be confused with MeshOrdering which
> > >> does something different but related.)
> > >>
> > >> > Change mesh partitioning library to ParMETIS. Modify the
> > >> > partitioning class to work on distributed data, add the necessary
> > >> > calls to METIS and redistribute the local vertices/cells according
> > >> > to the result. Since METIS can partition a mesh directly, using an
> > >> > internal mesh-to-graph translation, it is possible to do the
> > >> > partitioning directly in the MeshPartition class. However, both
> > >> > methods could easily be implemented and compared against each other.
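
To make the redistribution step concrete, a small sketch of the usual
pattern, treating the partitioner as a black box that returns a destination
rank per local cell (the names here are made up):

  #include <mpi.h>
  #include <vector>

  // Sketch: "destination" holds, for each local cell, the rank it should
  // move to (as returned by ParMETIS/SCOTCH). First exchange the counts;
  // the cell data itself would then follow with MPI_Alltoallv.
  void count_incoming_cells(const std::vector<int>& destination,
                            std::vector<int>& num_from_rank, MPI_Comm comm)
  {
    int size = 1;
    MPI_Comm_size(comm, &size);

    // How many of my cells go to each rank
    std::vector<int> num_to_rank(size, 0);
    for (std::size_t c = 0; c < destination.size(); ++c)
      num_to_rank[destination[c]] += 1;

    // Everyone tells everyone how much to expect
    MPI_Alltoall(&num_to_rank[0], 1, MPI_INT,
                 &num_from_rank[0], 1, MPI_INT, comm);
  }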
> > >>
> > >> We don't want to change from SCOTCH to ParMETIS, but we could add
> > >> support for using METIS/ParMETIS as an option.
> >
> > Yes, that is right. The implementation for the basic algorithms should
> > look more or less the same for SCOTCH and ParMETIS, and then we could add
> > extra optional functionality based on ParMETIS until we can find
> > corresponding functionality in SCOTCH.
> >
> > > Have you thought about generalizing the partitioning to hypergraphs? I
> > > just did this so I can partition faces (for FVM) and it was not that
> > > bad. I use Zoltan from Sandia.
> >
> > Sounds interesting. I do not know about this, but it may be worth looking at.
> >
> > >> > Finish implementation of mesh communication class MPIMeshCommunicator.
> > >> > Add functionality for single vertex and cell communication needed
> > >> > for mesh partitioning.
> > >>
> > >> What do you mean by single vertex and cell communication? Also note
> > >> that it is not enough to communicate indices for vertices and
> > >> cells. Sometimes we also need to communicate edges and faces.
> > >
> > > That is why you should never explicitly refer to vertices and cells, but
> > > rather communicate the entire closure and star of each element that you
> > > send. That is the point of the mesh structure: to avoid this kind of
> > > special-purpose coding.
> > >
> > >> > Adapt boundary calculation to work on distributed meshes. Use
> > >> > knowledge about which vertices are shared among processors to decide
> > >> > if an edge is global or local. Implement the logic directly in the
> > >> > BoundaryComputation class using information from the mesh
> > >> > partitioning.
> > >>
> > >> I'm not sure I understand this point.
> >
> > If you only have a part of the mesh, as is the case for a fully
> > distributed mesh, then you need to know if your Boundary (of the local
> > Mesh) is an internal boundary, across which you communicate with the
> > neighboring partitions, or an external boundary, on which you apply
> > boundary conditions.
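
A sketch of that check (the shared-vertex set and the facet interface below
are placeholders for whatever the partitioning step ends up providing):

  #include <set>
  #include <vector>

  // Sketch: a facet on the boundary of the local mesh is taken to be
  // internal (an artifact of the partitioning) if every one of its vertices
  // is shared with another process; otherwise it belongs to the physical
  // boundary where boundary conditions are applied.
  bool is_internal_boundary_facet(const std::vector<unsigned int>& facet_vertices,
                                  const std::set<unsigned int>& shared_vertices)
  {
    for (std::size_t i = 0; i < facet_vertices.size(); ++i)
    {
      if (shared_vertices.count(facet_vertices[i]) == 0)
        return false;  // a vertex only we know about: physical boundary
    }
    return true;       // every vertex shared with a neighbour: internal
  }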
> >
> > >> > Modify the assembly process with a mapping function that maps dof_map
> > >> > indices from local to global prior to updating the global tensor.
> > >> > Implement the call in the Assembler class using functions from the
> > >> > Mesh class.
> > >>
> > >> It might be enough to modify UFCCell::update().
> >
> > Ok. Sounds good.
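
Either way, the hook itself is tiny; something along these lines, whether it
ends up in UFCCell::update() or next to the tensor insert (names below are
placeholders):

  // Sketch: remap the cell's local dof indices to global ones just before
  // adding the element tensor to the global tensor; local_to_global would
  // come from the mesh/dof map numbering discussed above.
  void map_dofs_to_global(unsigned int* dofs, unsigned int num_dofs,
                          const unsigned int* local_to_global)
  {
    for (unsigned int i = 0; i < num_dofs; ++i)
      dofs[i] = local_to_global[dofs[i]];
  }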
> >
> > >> > Change PETSc data types to MPI (PETScMatrix, PETScVector).
> > >> > Change the PETSc solver environment to use the correct MPI
> > >> > communicator (all PETSc solver classes).
> > >>
> > >> We need to determine whether to use MPI or Seq PETSc types depending
> > >> on whether we are running in parallel.
> > >
> > > We have types for this like AIJ and the default Vec.
> >
> > Sounds good.
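
Along the lines Matt suggests, the generic types let PETSc pick the Seq or
MPI implementation from the communicator, so roughly (the function name is
just for illustration):

  #include <petscmat.h>

  // Sketch: with the generic AIJ type, PETSc picks the implementation from
  // the communicator (SeqAIJ on one process, MPIAIJ on several), so the same
  // code covers the serial and the parallel case. The same pattern with
  // VecCreate()/VECSTANDARD covers the vectors.
  PetscErrorCode create_matrix(MPI_Comm comm, PetscInt M, PetscInt N, Mat* A)
  {
    MatCreate(comm, A);
    MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, M, N);
    MatSetType(*A, MATAIJ);
    MatSetFromOptions(*A);
    return 0;
  }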
> >
> > /Johan
> >
> > >    Matt

