dolfin team mailing list archive

Re: Parallel assembly

On Thu, Dec 06, 2007 at 03:37:26PM +0100, Niclas Jansson wrote:
> On Wed, 05 Dec 2007 18:18:31 +0000
> "Garth N. Wells" <gnw20@xxxxxxxxx> wrote:
> 
> > 
> > 
> > Niclas Jansson wrote:
> > > On Wed, 5 Dec 2007 08:09:07 -0600
> > > "Matthew Knepley" <knepley@xxxxxxxxx> wrote:
> > > 
> > >> Just some comments on the strategy.
> > >>
> > >> On Dec 5, 2007 7:50 AM, Anders Logg <logg@xxxxxxxxx> wrote:
> > >>> On Mon, Dec 03, 2007 at 11:44:44AM +0100, Niclas Jansson wrote:
> > >>>
> > >>>> It's a different strategy that uses point to point instead of
> > >>>> collective communication. However the plan for parallel assembly
> > >>>> should be more or less the same.
> > >>>>
> > >>>> I attached the more detailed TODO list, it should explain the
> > >>>> necessary changes to the Mesh classes.
> > >>>>
> > >>>> Niclas
> > >>>> Modify mesh representation to store both local and global indices
> > >>>> for each cell/vertex. Implement mesh functions to map between
> > >>>> local and global indices. The local indices correspond to the
> > >>>> current cell and vertex indices; only the mapping functions need
> > >>>> to be added to the Mesh class.
> > >>> I don't think we should store both local and global indices for
> > >>> mesh entities. All we need is to store the mapping from local to
> > >>> global indices. We can use MeshFunctions for this but it's not
> > >>> necessary.
> > >>> My suggestion would be to add a new class MeshNumbering (maybe
> > >>> someone can suggest a better name) which would store the numbering
> > >>> scheme in a set of (dim + 1) arrays:
> > >>>
> > >>>   class MeshNumbering
> > >>>   {
> > >>>   public:
> > >>>
> > >>>     ...
> > >>>
> > >>>   private:
> > >>>
> > >>>     uint** numbering;
> > >>>
> > >>>   };
> > >>>
> > >>> One array for each dimension, so numbering[0][i] would return the
> > >>> global number for the vertex with index i.
> > >>>
> > >>> We can also add an easy-access function for global numbers to
> > >>> MeshEntity so that (for example) e.number() would return the
> > >>> global number of an entity e.
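
A minimal sketch of how such a class and the easy-access function could fit
together. This is illustrative only; everything beyond the names already
mentioned above (MeshNumbering, numbering, number()) is an assumption, not
existing DOLFIN code.

  // Illustrative sketch. uint is shorthand for unsigned int, as in DOLFIN.
  typedef unsigned int uint;

  class MeshNumbering
  {
  public:

    // numbering[d][i] is the global number of entity i of dimension d
    uint global(uint d, uint i) const
    { return numbering[d][i]; }

  private:

    uint** numbering;
  };

  // The easy-access function could then be something like
  //
  //   uint MeshEntity::number() const
  //   { return _mesh.numbering().global(_dim, _index); }
  //
  // where _mesh, _dim and _index stand for whatever members MeshEntity
  // actually has (these names are assumptions).
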
> > >> I will just point out that I think this is very limiting. You can
> > >> argue that it covers what you want to do, but it is quite inflexible
> > >> compared with having names. It is an incredible pain in the ass to
> > >> rebalance (or anything else complicated, like AMR) if you rely on
> > >> offsets (numberings) rather than names. I recommend (as we do) using
> > >> names until you have exactly the mesh you want, and then reducing to
> > >> offsets. This is implemented manually in Sieve right now (you call a
> > >> method), but I am trying to automate it with code generation.
> > >>
> > > 
> > > Ok, since the second part of the project covers AMR, maybe a
> > > different approach is needed.
> > > 
> > > 
> > >>>> Adapt mesh reading for the new representation, store mesh data
> > >>>> based on the number of local cells/vertices instead of parsed
> > >>>> numbers. This modification allows processors to read different
> > >>>> parts of the mesh in parallel, making an initial distribution step
> > >>>> unnecessary.
> > >>>>
> > >>>> Loading meshes in parallel should increase efficiency, reduce
> > >>>> costly communication and save memory for large scale problems,
> > >>>> given that the parallel environment has a shared file system that
> > >>>> can handle the load. However, the serial distribution should still
> > >>>> be implemented to support environments without shared file
> > >>>> systems.
> > >>>>  Modification for the new representation should be implemented in
> > >>>>  class XMLMesh. Functions for initial mesh distribution should be
> > >>>>  implemented in a new class.
> > >>> For this, we should add optional data to the mesh format, such
> > >>> that the current file format still works. If additional data is
> > >>> present, then that is read into MeshNumbering, otherwise it is
> > >>> empty.
> > >>>
> > >>> (When I think of it, MeshNumbering may not be a good choice of
> > >>> name for the new class, since it may be confused with MeshOrdering
> > >>> which does something different but related.)
> > >>>
> > > 
> > > 
> > > It would clean up the implementation a lot; my idea was to use a
> > > simple linear distribution just to get everything off disk. But maybe
> > > this approach (whole idea) won't scale beyond 4-8 processors without
> > > any fancy filesystem (GPFS) or MPI-IO implementation.
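
For what it's worth, the simple linear distribution could be as little as
computing a contiguous block of cells per process. A sketch, where only the
MPI calls are real API and the surrounding function is made up:

  #include <mpi.h>

  // Compute which contiguous range of the num_cells cells in the file
  // this process should parse and store locally (simple linear/block
  // distribution, no partitioner involved yet). Upper bound is exclusive.
  void local_cell_range(unsigned int num_cells,
                        unsigned int& first_cell,
                        unsigned int& last_cell)
  {
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const unsigned int n = num_cells / size;
    const unsigned int r = num_cells % size;

    // Spread the remainder r over the first r processes
    if (static_cast<unsigned int>(rank) < r)
    {
      first_cell = rank*(n + 1);
      last_cell  = first_cell + n + 1;
    }
    else
    {
      first_cell = rank*n + r;
      last_cell  = first_cell + n;
    }
  }
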
> > > 
> > > 
> > >>>> Change mesh partitioning library to ParMETIS. Modify the
> > >>>> partitioning class to work on distributed data, add the necessary
> > >>>> calls to METIS and redistribute the local vertices/cells according
> > >>>> to the result. Since METIS could partition a mesh directly using
> > >>>> an internal mesh-to-graph translation, it is possible to have
> > >>>> partitioning directly in the MeshPartition class. However, both
> > >>>> methods could easily be implemented and compared against each
> > >>>> other.
> > >>> We don't want to change from SCOTCH to ParMETIS, but we could add
> > >>> support for using METIS/ParMETIS as an option.
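
To make the mesh-to-graph idea a bit more concrete: both METIS and ParMETIS
take the element connectivity in CSR form. A rough sketch of building those
arrays from the local cells; the helper name, the layout of cell_vertices
and the placement in MeshPartition are assumptions, and the actual
partitioner call is deliberately left as a comment:

  #include <vector>

  // Build the CSR arrays (eptr, eind) describing cell-vertex connectivity
  // in the form the METIS/ParMETIS mesh partitioning routines expect.
  void build_csr(unsigned int num_cells,
                 unsigned int vertices_per_cell,
                 const std::vector<unsigned int>& cell_vertices,
                 std::vector<int>& eptr,
                 std::vector<int>& eind)
  {
    eptr.resize(num_cells + 1);
    eind.resize(num_cells*vertices_per_cell);

    for (unsigned int c = 0; c < num_cells; c++)
    {
      eptr[c] = c*vertices_per_cell;
      for (unsigned int v = 0; v < vertices_per_cell; v++)
        eind[c*vertices_per_cell + v] =
          cell_vertices[c*vertices_per_cell + v];
    }
    eptr[num_cells] = num_cells*vertices_per_cell;

    // These arrays would then be handed to METIS (serial) or ParMETIS
    // (distributed, with an additional element distribution array), and
    // the resulting part array used to redistribute the local cells.
  }
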
> > >> Have you thought about generalizing the partitioning to
> > >> hypergraphs? I just did this so I can partition faces (for FVM) and
> > >> it was not that bad. I use Zoltan from Sandia.
> > >>
> > > 
> > > No, but Zoltan looked really interesting for the AMR/load balancing
> > > parts.
> > > 
> > > 
> > >>>> Finish implementation of mesh communication class
> > >>>> MPIMeshCommunicator. Add functionality for single vertex and cell
> > >>>> communication needed for mesh partitioning.
> > >>> What do you mean by single vertex and cell communication? Also
> > >>> note that it is not enough to communicate indices for vertices and
> > >>> cells. Sometimes we also need to communicate edges and faces.
> > >> That is why you should never explicitly refer to vertices and
> > >> cells, but rather communicate the entire closure and star of each
> > >> element which you send. That is the point of the mesh structure, to
> > >> avoid this kind of special purpose coding.
> > >>
> > > 
> > > Of course, what I meant was that functionality for a point-to-point
> > > pattern had to be implemented.
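
A bare-bones version of such a point-to-point exchange, assuming each
process already knows which neighbour it exchanges a block of global
indices with. Only the MPI calls are real API; the function itself is an
illustration, not MPIMeshCommunicator code:

  #include <mpi.h>
  #include <vector>

  // Send our index block to 'dest' and receive the neighbour's block from
  // 'source', without any collective communication.
  std::vector<unsigned int> exchange_indices(std::vector<unsigned int> send,
                                             int dest, int source, int tag)
  {
    MPI_Status status;

    // First exchange the block sizes so the receive buffer can be sized
    unsigned int send_size = send.size();
    unsigned int recv_size = 0;
    MPI_Sendrecv(&send_size, 1, MPI_UNSIGNED, dest, tag,
                 &recv_size, 1, MPI_UNSIGNED, source, tag,
                 MPI_COMM_WORLD, &status);

    // Then exchange the index blocks themselves
    std::vector<unsigned int> recv(recv_size);
    MPI_Sendrecv(send.empty() ? 0 : &send[0], send_size, MPI_UNSIGNED,
                 dest, tag,
                 recv.empty() ? 0 : &recv[0], recv_size, MPI_UNSIGNED,
                 source, tag,
                 MPI_COMM_WORLD, &status);

    return recv;
  }
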
> > > 
> > >>>> Adapt boundary calculation to work on distributed meshes. Use
> > >>>> knowledge about which vertices are shared among processors to
> > >>>> decide if an edge is global or local. Implement the logic
> > >>>> directly in BoundaryComputation class using information from the
> > >>>> mesh partitioning.
> > >>> I'm not sure I understand this point.
> > >>>
> > > 
> > > Since the mesh is distributed, a boundary could be local (shared
> > > among processors) or global, where the BC should be applied. The list
> > > of shared vertices could be used to sort out the local boundaries.
> > > 
> > 
> > But why do you need this? All you need is the dof map, and PETSc will
> > take care of assembling entries on the boundaries of partitions.
> > 
>  
> Ok, probably I have misunderstood some parts of the assembler. I thought
> that when the assembler iterates over the wrong boundary mesh, incorrect
> values would be added to the tensor.

Yes, this is correct. We need to skip those facets which are exterior on
the local meshes but interior in the global mesh.
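
The shared-vertex information discussed above could be used for exactly
this test. A rough sketch, with made-up input data structures, of deciding
whether a facet on the local boundary is also on the global boundary (in
general a facet whose vertices are all shared with some common process is
a partition facet, not a global boundary facet):

  #include <set>
  #include <vector>

  // 'facet_vertices' holds the local vertex indices of one local boundary
  // facet; 'shared' maps each local vertex index to the set of other
  // processes that also own that vertex (from the partitioning stage).
  bool on_global_boundary(const std::vector<unsigned int>& facet_vertices,
                          const std::vector<std::set<int> >& shared)
  {
    // Intersect the sharing-process sets of all vertices of the facet
    std::set<int> common = shared[facet_vertices[0]];
    for (unsigned int i = 1; i < facet_vertices.size(); i++)
    {
      std::set<int> next;
      const std::set<int>& s = shared[facet_vertices[i]];
      for (std::set<int>::const_iterator p = common.begin();
           p != common.end(); ++p)
        if (s.find(*p) != s.end())
          next.insert(*p);
      common = next;
    }

    // No process shares all vertices of the facet: global boundary facet
    return common.empty();
  }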

-- 
Anders


> 
> Niclas
> 
> 
> > > 
> > >>>> Modify the assembly process with a mapping function which maps
> > >>>> dof_maps indices from local to global prior to updating the global
> > >>>> tensor. Implement the call in class Assembler using functions from
> > >>>> the Mesh class.
> > >>> It might be enough to modify UFCCell::update().
> > >>>
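
In sketch form, the mapping step amounts to rewriting the tabulated dof
indices through a local-to-global array before the tensor is updated. The
function below only illustrates that step; the array it uses stands in for
whatever mapping the mesh (e.g. the MeshNumbering idea above) will provide:

  // Map locally tabulated dof indices to global indices before A.add().
  // 'dofs' is the array filled in by the local dof map; 'local_to_global'
  // is the assumed local-to-global index array.
  void map_dofs_to_global(unsigned int* dofs, unsigned int n,
                          const unsigned int* local_to_global)
  {
    for (unsigned int i = 0; i < n; i++)
      dofs[i] = local_to_global[dofs[i]];
  }
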
> > > 
> > > Ok, I was thinking about something similar to the previously
> > > discussed pdofmap approach (src/sandbox/passembly).
> > > 
> > 
> > This is the point which is most pressing. Magnus has mesh partitioning
> > and distribution working (which can be refined later to be fully
> > distributed), so to really get moving with parallel assembly we need
> > to sort out the dof mapping. The second priority is then making sure
> > that the Function class works properly in parallel.
> > 
> > Garth
> > 
> > > Niclas
> > > 
> > >>>> Change PETSc data types to MPI (PETScMatrix, PETScVector).
> > >>>> Change the PETSc solver environment to use the correct MPI
> > >>>> communicator (all PETSc solver classes).
> > >>> We need to determine whether to use MPI or Seq PETSc types
> > >>> depending on whether we are running in parallel.
> > >> We have types for this like AIJ and the default Vec.
> > >>
> > >>    Matt
> > >>
> > >>
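
Matt's point, concretely: PETSc selects the Seq or MPI implementation from
the communicator, so a single code path can cover both cases. A minimal
sketch using standard PETSc calls (error checking omitted; not DOLFIN code):

  #include <petscmat.h>
  #include <petscvec.h>

  // Create a matrix and a vector whose storage type (Seq vs MPI) follows
  // from the communicator: with one process PETSc uses the Seq variants,
  // with several it uses the MPI variants.
  void create_tensors(MPI_Comm comm, PetscInt M, PetscInt N, Mat* A, Vec* b)
  {
    MatCreate(comm, A);
    MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, M, N);
    MatSetType(*A, MATAIJ);        // becomes MATSEQAIJ or MATMPIAIJ
    MatSetFromOptions(*A);

    VecCreate(comm, b);
    VecSetSizes(*b, PETSC_DECIDE, M);
    VecSetFromOptions(*b);         // default Vec: seq or mpi as appropriate
  }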