
Re: Parallel assembly

 

On Dec 5, 2007 11:18 AM, Johan Hoffman <jhoffman@xxxxxxxxxx> wrote:
>
> > Just some comments on the strategy.
> >
> > On Dec 5, 2007 7:50 AM, Anders Logg <logg@xxxxxxxxx> wrote:
> >> On Mon, Dec 03, 2007 at 11:44:44AM +0100, Niclas Jansson wrote:
> >>
> >> > It's a different strategy that uses point to point instead of
> >> collective
> >> > communication. However the plan for parallel assembly should be more
> >> or
> >> > less the same.
> >> >
> >> > I attached the more detailed TODO list, it should explain the
> >> necessary
> >> > changes to the Mesh classes.
> >> >
> >> > Niclas
> >>
> >> > Modify mesh representation to store both local and global indices
> >> > for each cell/vertex. Implement mesh functions to map between local
> >> > and global indices. The local indices correspond to the current
> >> > cell and vertex indices; only the mapping functions must be added to
> >> > the Mesh class.
> >>
> >> I don't think we should store both local and global indices for mesh
> >> entities. All we need is to store the mapping from local to global
> >> indices. We can use MeshFunctions for this but it's not necessary.
> >>
> >> My suggestion would be to add a new class MeshNumbering (maybe someone
> >> can suggest a better name) which would store the numbering scheme in a
> >> set of (dim + 1) arrays:
> >>
> >>   class MeshNumbering
> >>   {
> >>   public:
> >>
> >>     ...
> >>
> >>   private:
> >>
> >>     // numbering[d][i]: global number of entity i of dimension d
> >>     uint** numbering;
> >>
> >>   };
> >>
> >> One array for each dimension, so numbering[0][i] would return the
> >> global number for the vertex with index i.
> >>
> >> We can also add an easy-access function for global numbers to
> >> MeshEntity so that (for example) e.number() would return the global
> >> number of an entity e.
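A minimal sketch of how such a numbering class could look (the accessor name
global_index and the typedef are illustrative assumptions, not existing DOLFIN
API):

  // Illustrative sketch only; names are assumptions, not existing DOLFIN API
  typedef unsigned int uint;

  class MeshNumbering
  {
  public:

    // Global number of local entity i of topological dimension dim,
    // e.g. global_index(0, i) is the global number of vertex i
    uint global_index(uint dim, uint i) const
    { return numbering[dim][i]; }

  private:

    // (dim + 1) arrays: numbering[d][i] maps the local index i of an
    // entity of dimension d to its global number
    uint** numbering;
  };

An easy-access function like e.number() on MeshEntity would then simply
forward to this lookup using the entity's dimension and local index.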
>
> I think the motivation for locally storing the global index is to keep the
> algorithms fully distributed. As far as I understand, with a
> MeshNumbering solution as you outline it, you would store this data on one
> processor? Or do I misunderstand this? For very large problems we want all
> algorithms to be fully distributed.
>
> > I will just point out that I think this is very limiting. You can argue
> > that it
> > covers what you want to do, but it is quite inflexible compared with
> > having names. It is an incredible pain in the ass to rebalance (or do
> > anything else complicated, like AMR) if you
> > rely on offsets (numberings) rather than names. I recommend (as we
> > do) using names until you have exactly the mesh you want, and then
> > reducing to offsets. This is implemented manually in Sieve right now
> > (you call a method), but I am trying to automate it with code generation.
>
> I am not sure of what you mean by "names"? But I guess that you mean that
> you want to work with different closures of a set of mesh entities (like
> cells for example)? If that's what you mean, then I would agree.

A name refers to a piece of the mesh apart from its place in an order. Thus, when
a piece is communicated to another process, its name does not change.
Furthermore, if I add a piece on some process, its new name does not disrupt
some global order automatically. I hate having global orders and avoid them
whenever I can.
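To make the distinction concrete, a hedged illustration (the struct below is
not from Sieve or DOLFIN): a name can be any stable identifier, for example the
pair (creating process, local id), which survives migration and refinement,
while an offset is a position in an agreed global order that must be recomputed
whenever entities are added, removed or moved.

  // Hypothetical "name": stable across migration and local insertions
  struct EntityName
  {
    int creator_rank;       // process that created the entity
    unsigned int local_id;  // id handed out by that process at creation time

    bool operator==(const EntityName& other) const
    { return creator_rank == other.creator_rank && local_id == other.local_id; }
  };

  // An "offset" is just a slot in a global order; it has to be recomputed
  // after rebalancing or adaptive refinement
  typedef unsigned int GlobalOffset;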

> >> > Adapt mesh reading for the new representation, store mesh data based
> >> > on the number of local cells/vertices instead of parsed numbers. This
> >> > modification allows processors to read different parts of the mesh
> >> > in parallel making an initial distribution step unnecessary.
> >> >
> >> > Loading meshes in parallel should increase efficiency, reduce costly
> >> > communication and save memory for large-scale problems, given
> >> > that the parallel environment has a shared file system that can
> >> > handle the load. However, the serial distribution should still be
> >> > implemented to support environments without shared file systems.
> >> >
> >> >  Modifications for the new representation should be implemented in
> >> >  class XMLMesh. Functions for initial mesh distribution should be
> >> >  implemented in a new class.
> >>
> >> For this, we should add optional data to the mesh format, such that
> >> the current file format still works. If additional data is present,
> >> then that is read into MeshNumbering, otherwise it is empty.
>
> Yes, that is natural (that the serial/regular mesh format should still
> work without modifications). However, I am not sure the
> MeshNumbering approach is a good solution.
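As a sketch of the parallel reading step discussed above (the range arithmetic
below is only an assumption about how cells might be divided between
processes, not the planned XMLMesh code), each process can compute its own
contiguous range of cells from the total count stored in the file and skip
everything outside that range while parsing:

  #include <utility>

  // Hypothetical helper: return the half-open range [begin, end) of global
  // cell indices that this process keeps while parsing; the remainder cells
  // go to the first ranks
  std::pair<unsigned int, unsigned int>
  local_cell_range(unsigned int num_global_cells,
                   unsigned int num_processes,
                   unsigned int rank)
  {
    const unsigned int n = num_global_cells / num_processes;
    const unsigned int r = num_global_cells % num_processes;

    const unsigned int begin = rank*n + (rank < r ? rank : r);
    const unsigned int end   = begin + n + (rank < r ? 1 : 0);

    return std::make_pair(begin, end);
  }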
>
> >> (When I think of it, MeshNumbering may not be a good choice of name
> >> for the new class, since it may be confused with MeshOrdering which
> >> does something different but related.)
> >>
> >> > Change mesh partitioning library to ParMETIS. Modify the
> >> > partitioning class to work on distributed data, add the necessary
> >> > calls to METIS and redistribute the local vertices/cells according
> >> > to the result. Since METIS can partition a mesh directly using an
> >> > internal mesh-to-graph translation, it is possible to do the
> >> > partitioning directly in the MeshPartition class. However, both
> >> > methods could easily be implemented and compared against each other.
> >>
> >> We don't want to change from SCOTCH to ParMETIS, but we could add
> >> support for using METIS/ParMETIS as an option.
>
> Yes, that is right. The implementation for the basic algorithms should
> look more or less the same for Scotch and parMetis, and then we could add
> extra optional functionality based on parMetis until we can find
> corresponding functionality in Scotch.
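For reference, a rough sketch of what the distributed call could look like
with ParMETIS 3.x (the CSR-style element arrays follow the ParMETIS manual;
the helper name is made up, and the exact types and argument list should be
checked against the installed version):

  #include <mpi.h>
  #include <parmetis.h>  // ParMETIS 3.x assumed: idxtype is int, weights are float
  #include <cstddef>
  #include <vector>

  // Hypothetical sketch: ask ParMETIS for a partition of the locally stored
  // cells. elmdist (size nprocs + 1) gives the global cell range on each rank;
  // eptr/eind store the vertex numbers of each local cell in CSR form.
  void partition_cells(MPI_Comm comm,
                       std::vector<idxtype>& elmdist,
                       std::vector<idxtype>& eptr,
                       std::vector<idxtype>& eind,
                       std::vector<idxtype>& part)
  {
    int nprocs = 0;
    MPI_Comm_size(comm, &nprocs);

    int wgtflag = 0;             // no cell or edge weights
    int numflag = 0;             // C-style (0-based) numbering
    int ncon = 1;                // one balance constraint
    int ncommonnodes = 3;        // tetrahedra are neighbours if they share 3 vertices
    int nparts = nprocs;
    int options[3] = {0, 0, 0};  // default options
    int edgecut = 0;

    std::vector<float> tpwgts(ncon*nparts, 1.0f/nparts);  // equal target sizes
    std::vector<float> ubvec(ncon, 1.05f);                // 5% allowed imbalance

    part.resize(eptr.size() - 1);

    ParMETIS_V3_PartMeshKway(&elmdist[0], &eptr[0], &eind[0], NULL,
                             &wgtflag, &numflag, &ncon, &ncommonnodes,
                             &nparts, &tpwgts[0], &ubvec[0], options,
                             &edgecut, &part[0], &comm);
  }

After the call, part[i] holds the destination rank of local cell i, which is
what the redistribution step needs.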
>
> > Have you thought about generalizing the partitioning to hypergraphs? I
> > just
> > did this so I can partition faces (for FVM) and it was not that bad. I
> > use Zoltan
> > from Sandia.
>
> Sounds interesting. I do not know about this, but it may be worth looking at.
>
> >> > Finish implementation of mesh communication class MPIMeshCommunicator.
> >> > Add functionality for single vertex and cell communication needed
> >> > for mesh partitioning.
> >>
> >> What do you mean by single vertex and cell communication? Also note
> >> that it is not enough to communicate indices for vertices and
> >> cells. Sometimes we also need to communicate edges and faces.
> >
> > That is why you should never explicitly refer to vertices and cells, but
> > rather
> > communicate the entire closure and star of each element you send.
> > That is the point of the mesh structure, to avoid this kind of special
> > purpose
> > coding.
> >
> >> > Adapt boundary calculation to work on distributed meshes. Use
> >> > knowledge about which vertices are shared among processors to decide
> >> > if an edge is global or local. Implement the logic directly in
> >> > BoundaryComputation class using information from the mesh
> >> > partitioning.
> >>
> >> I'm not sure I understand this point.
>
> If you only have a part of the mesh, as is the case for a fully
> distributed mesh, then you need to know if your Boundary (of the local
> Mesh) is an internal boundary, which should communicate with its
> neighboring partitions, or an external boundary on which you apply
> boundary conditions.
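A sketch of that logic, assuming the partitioning step leaves behind a flag
per local vertex saying whether it is shared with another process (the names
facet_vertices and is_shared are placeholders):

  #include <vector>

  // Hypothetical classification of a facet on the boundary of the local mesh:
  // if every vertex of the facet is shared with another process, treat it as
  // an internal (inter-process) boundary; otherwise it belongs to the real
  // (external) boundary where boundary conditions apply.
  bool is_interprocess_facet(const std::vector<unsigned int>& facet_vertices,
                             const std::vector<bool>& is_shared)
  {
    for (unsigned int i = 0; i < facet_vertices.size(); ++i)
    {
      if (!is_shared[facet_vertices[i]])
        return false;  // a purely local vertex: external boundary
    }
    return true;       // all vertices shared: partition interface
  }

Note that in rare corner cases a facet can have all of its vertices shared and
still lie on the external boundary, so a robust version would also exchange
facet information with the neighbouring partitions.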
>
> >> > Modify the assembly process with a mapping function which maps dof_map
> >> > indices from local to global prior to updating the global tensor.
> >> > Implement the call in class Assembler using functions from the Mesh
> >> > class.
> >>
> >> It might be enough to modify UFCCell::update().
>
> Ok. Sounds good.
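A sketch of that remapping (the function and the local_to_global array are
assumptions about where the mapping could live, not existing DOLFIN code):

  // Hypothetical sketch: translate the locally numbered dofs of one cell to
  // global numbers just before the element tensor is added to the global tensor
  void remap_cell_dofs(unsigned int* dofs,                   // tabulated with local numbering
                       unsigned int num_dofs,                // dofs per cell
                       const unsigned int* local_to_global)  // owned by the mesh/dof map
  {
    for (unsigned int i = 0; i < num_dofs; ++i)
      dofs[i] = local_to_global[dofs[i]];
  }

The assembler would then hand the remapped indices to the global tensor's
add() call, so that insertion happens in global rows and columns.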
>
> >> > Change PETSc data types to MPI (PETScMatrix, PETScVector).
> >> > Change PETSc solver environment to use the correct MPI communicator
> >> >  (All PETSc solver classes).
> >>
> >> We need to determine whether to use MPI or Seq PETSc types depending
> >> on whether we are running in parallel.
> >
> > We have types for this like AIJ and the default Vec.
>
> Sounds good.
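Since the generic AIJ and default Vec types resolve to their Seq or MPI
variants based on the communicator, a minimal standalone sketch (plain PETSc,
not DOLFIN's wrappers, and the exact creation calls depend on the PETSc
version):

  #include <petscmat.h>
  #include <petscvec.h>

  // MATAIJ and the default Vec type resolve to their Seq or MPI variants
  // automatically, depending on the size of the communicator
  PetscErrorCode create_system(MPI_Comm comm, PetscInt N, Mat* A, Vec* x)
  {
    PetscErrorCode ierr;

    ierr = MatCreate(comm, A); CHKERRQ(ierr);
    ierr = MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, N, N); CHKERRQ(ierr);
    ierr = MatSetType(*A, MATAIJ); CHKERRQ(ierr);      // SeqAIJ on 1 process, MPIAIJ otherwise
    ierr = MatSetFromOptions(*A); CHKERRQ(ierr);

    ierr = VecCreate(comm, x); CHKERRQ(ierr);
    ierr = VecSetSizes(*x, PETSC_DECIDE, N); CHKERRQ(ierr);
    ierr = VecSetFromOptions(*x); CHKERRQ(ierr);       // standard Vec: Seq or MPI by comm size

    return 0;
  }

So the DOLFIN wrappers mainly need to pass the right communicator
(PETSC_COMM_WORLD or PETSC_COMM_SELF) rather than hard-coding Seq or MPI types.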
>
> /Johan
>
>
>
> >
> >    Matt
> >
> >> --
> >> Anders



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

