
dolfin team mailing list archive

Re: Parallel assembly

 

On Wed, 05 Dec 2007 18:18:31 +0000
"Garth N. Wells" <gnw20@xxxxxxxxx> wrote:

> 
> 
> Niclas Jansson wrote:
> > On Wed, 5 Dec 2007 08:09:07 -0600
> > "Matthew Knepley" <knepley@xxxxxxxxx> wrote:
> > 
> >> Just some comments on the strategy.
> >>
> >> On Dec 5, 2007 7:50 AM, Anders Logg <logg@xxxxxxxxx> wrote:
> >>> On Mon, Dec 03, 2007 at 11:44:44AM +0100, Niclas Jansson wrote:
> >>>
> >>>> It's a different strategy that uses point to point instead of
> >>>> collective communication. However the plan for parallel assembly
> >>>> should be more or less the same.
> >>>>
> >>>> I attached the more detailed TODO list, it should explain the
> >>>> necessary changes to the Mesh classes.
> >>>>
> >>>> Niclas
> >>>> Modify mesh representation to store both local and global indices
> >>>> for each cell/vertex. Implement mesh functions to map between
> >>>> local and global indices. The local indices correspond to the
> >>>> current cell and vertex indices; only the mapping functions must
> >>>> be added to the Mesh class.
> >>> I don't think we should store both local and global indices for
> >>> mesh entities. All we need is to store the mapping from local to
> >>> global indices. We can use MeshFunctions for this but it's not
> >>> necessary.
> >>> My suggestion would be to add a new class MeshNumbering (maybe
> >>> someone can suggest a better name) which would store the numbering
> >>> scheme in a set of (dim + 1) arrays:
> >>>
> >>>   class MeshNumbering
> >>>   {
> >>>   public:
> >>>
> >>>     ...
> >>>
> >>>   private:
> >>>
> >>>     uint** numbering;
> >>>
> >>>   };
> >>>
> >>> One array for each dimension, so numbering[0][i] would return the
> >>> global number for the vertex with index i.
> >>>
> >>> We can also add an easy-access function for global numbers to
> >>> MeshEntity so that (for example) e.number() would return the
> >>> global number of an entity e.
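
A minimal sketch of what such a numbering class could look like, filling in
a constructor and an accessor (the member functions, the dim argument and
the typedef are illustrative assumptions, not existing DOLFIN API):

  // Sketch of a local-to-global numbering container: one array per
  // topological dimension, where numbering[d][i] is the global number
  // of the dimension-d entity with local index i.
  typedef unsigned int uint;   // stand-in for dolfin::uint

  class MeshNumbering
  {
  public:

    // Allocate one (initially empty) array per dimension 0..dim
    MeshNumbering(uint dim) : dim(dim), numbering(new uint*[dim + 1])
    {
      for (uint d = 0; d <= dim; ++d)
        numbering[d] = 0;
    }

    ~MeshNumbering()
    {
      for (uint d = 0; d <= dim; ++d)
        delete [] numbering[d];
      delete [] numbering;
    }

    // Global number of the entity with topological dimension d and
    // local index i
    uint global_number(uint d, uint i) const
    { return numbering[d][i]; }

  private:

    uint dim;
    uint** numbering;
  };

An e.number() convenience call on MeshEntity could then simply forward to
something like mesh.numbering().global_number(e.dim(), e.index()).
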
> >> I will just point out that I think this is very limiting. You can
> >> argue that it covers what you want to do, but it is quite inflexible
> >> compared with having names. It is an incredible pain in the ass to
> >> rebalance (or anything else complicated, like AMR) if you rely on
> >> offsets (numberings) rather than names. I recommend (as we do) using
> >> names until you have exactly the mesh you want, and then reducing to
> >> offsets. This is implemented manually in Sieve right now (you call a
> >> method), but I am trying to automate it with code generation.
> >>
> > 
> > Ok, since the second part of the project covers AMR, maybe a
> > different approach is needed.
> > 
> > 
> >>>> Adapt mesh reading for the new representation, store mesh data
> >>>> based on the number of local cells/vertices instead of parsed
> >>>> numbers. This modification allows processors to read different
> >>>> parts of the mesh in parallel, making an initial distribution step
> >>>> unnecessary.
> >>>>
> >>>> Loading meshes in parallel should increase efficiency, reduce
> >>>> costly communication and save memory for large-scale problems,
> >>>> given that the parallel environment has a shared file system that
> >>>> can handle the load. However, the serial distribution should still
> >>>> be implemented to support environments without shared file systems.
> >>>>
> >>>>  Modification for the new representation should be implemented in
> >>>>  class XMLMesh. Functions for initial mesh distribution should be
> >>>>  implemented in a new class.
> >>> For this, we should add optional data to the mesh format, such that
> >>> the current file format still works. If additional data is present,
> >>> then that is read into MeshNumbering, otherwise it is empty.
> >>>
> >>> (When I think of it, MeshNumbering may not be a good choice of name
> >>> for the new class, since it may be confused with MeshOrdering which
> >>> does something different but related.)
> >>>
> > 
> > 
> > It would clean up the implementation a lot; my idea was to use a
> > simple linear distribution just to get everything off disk. But maybe
> > this approach (the whole idea) won't scale beyond 4-8 processors
> > without a fancy file system (GPFS) or MPI-IO implementation.
> > 
> > 
> >>>> Change mesh partitioning library to ParMETIS. Modify the
> >>>> partitioning class to work on distributed data, add the necessary
> >>>> calls to METIS and redistribute the local vertices/cells according
> >>>> to the result. Since METIS can partition a mesh directly using an
> >>>> internal mesh-to-graph translation, it is possible to have the
> >>>> partitioning directly in the MeshPartition class. However, both
> >>>> methods could easily be implemented and compared against each
> >>>> other.
> >>> We don't want to change from SCOTCH to ParMETIS, but we could add
> >>> support for using METIS/ParMETIS as an option.
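
For reference, a rough sketch of what a direct ParMETIS mesh-partitioning
call could look like, based on the ParMETIS 3.x interface (the wrapper
function and its default parameter choices are illustrative, not existing
DOLFIN code):

  #include <vector>
  #include <mpi.h>
  #include <parmetis.h>

  // Sketch: partition distributed cells with ParMETIS directly from the
  // cell-vertex connectivity (the mesh-to-dual-graph translation is done
  // internally). elmdist describes how cells are split across processes,
  // eptr/eind hold the local connectivity in CSR form, and part receives
  // one destination partition per local cell.
  void partition_cells(idxtype* elmdist, idxtype* eptr, idxtype* eind,
                       int nparts, int ncommonnodes, idxtype* part,
                       MPI_Comm comm)
  {
    int wgtflag = 0;             // no cell weights
    int numflag = 0;             // C-style (0-based) numbering
    int ncon = 1;                // one balance constraint
    int edgecut = 0;
    int options[3] = {0, 0, 0};  // ParMETIS default options

    std::vector<float> tpwgts(ncon * nparts, 1.0f / nparts);
    std::vector<float> ubvec(ncon, 1.05f);

    ParMETIS_V3_PartMeshKway(elmdist, eptr, eind, 0,
                             &wgtflag, &numflag, &ncon, &ncommonnodes,
                             &nparts, &tpwgts[0], &ubvec[0], options,
                             &edgecut, part, &comm);
  }
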
> >> Have you thought about generalizing the partitioning to hypergraphs?
> >> I just did this so I can partition faces (for FVM) and it was not
> >> that bad. I use Zoltan from Sandia.
> >>
> > 
> > No, but Zoltan looked really interesting for the AMR/load balancing
> > parts.
> > 
> > 
> >>>> Finish implementation of mesh communication class
> >>>> MPIMeshCommunicator. Add functionality for single vertex and cell
> >>>> communication needed for mesh partitioning.
> >>> What do you mean by single vertex and cell communication? Also
> >>> note that it is not enough to communicate indices for vertices and
> >>> cells. Sometimes we also need to communicate edges and faces.
> >> That is why you should never explicitly refer to vertices and
> >> cells, but rather communicate the entire closure and star of each
> >> element which you send. That is the point of the mesh structure, to
> >> avoid this kind of special-purpose coding.
> >>
> > 
> > Of course, what I meant was that functionality for a point-to-point
> > pattern had to be implemented.
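
A point-to-point exchange of, say, global cell indices between two
processes could look roughly like this (a sketch only; buffer types, tags
and the function name are illustrative):

  #include <mpi.h>
  #include <vector>

  // Sketch: exchange a block of global cell indices with a neighbour
  // process using only point-to-point (non-collective) communication.
  void exchange(const std::vector<unsigned int>& send, int dest,
                std::vector<unsigned int>& recv, int source,
                MPI_Comm comm)
  {
    MPI_Request requests[2];
    MPI_Status statuses[2];

    // Exchange sizes first so the receive buffer can be allocated
    unsigned int send_size = send.size();
    unsigned int recv_size = 0;
    MPI_Isend(&send_size, 1, MPI_UNSIGNED, dest, 0, comm, &requests[0]);
    MPI_Irecv(&recv_size, 1, MPI_UNSIGNED, source, 0, comm, &requests[1]);
    MPI_Waitall(2, requests, statuses);

    // Exchange the data itself
    recv.resize(recv_size);
    unsigned int dummy = 0;
    MPI_Isend(send_size > 0 ? const_cast<unsigned int*>(&send[0]) : &dummy,
              send_size, MPI_UNSIGNED, dest, 1, comm, &requests[0]);
    MPI_Irecv(recv_size > 0 ? &recv[0] : &dummy,
              recv_size, MPI_UNSIGNED, source, 1, comm, &requests[1]);
    MPI_Waitall(2, requests, statuses);
  }
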
> > 
> >>>> Adapt boundary calculation to work on distributed meshes. Use
> >>>> knowledge about which vertices are shared among processors to
> >>>> decide if an edge is global or local. Implement the logic directly
> >>>> in the BoundaryComputation class using information from the mesh
> >>>> partitioning.
> >>> I'm not sure I understand this point.
> >>>
> > 
> > Since the mesh is distributed, a boundary could be local (shared
> > among processors) or global, where the BC should be applied. The list
> > of shared vertices could be used to sort out the local boundaries.
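
As a sketch, the check for whether a locally exterior facet belongs to the
global boundary could use the shared-vertex list along these lines (the
container, the function name and the criterion itself are hypothetical):

  #include <set>
  #include <vector>

  // Sketch: classify a locally exterior facet (a facet with only one
  // neighbouring cell on this process) as inter-process or globally
  // exterior. "shared_vertices" holds the global indices of vertices
  // known to be shared with other processes.
  bool on_global_boundary(const std::vector<unsigned int>& facet_vertices,
                          const std::set<unsigned int>& shared_vertices)
  {
    // If every vertex of the facet is shared with another process, the
    // facet is (most likely) internal to the global mesh and no boundary
    // condition should be applied there.
    for (unsigned int i = 0; i < facet_vertices.size(); ++i)
    {
      if (shared_vertices.find(facet_vertices[i]) == shared_vertices.end())
        return true;  // a purely local vertex => global boundary facet
    }
    return false;
  }
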
> > 
> 
> But why do you need this? All you need is the dof map, and PETSc will
> take care of assembling entries on the boundaries of partitions.
> 
 
Ok, I have probably misunderstood some parts of the assembler. I thought
that when the assembler iterates over a wrong boundary mesh, incorrect
values would be added to the tensor.


Niclas


> > 
> >>>> Modify the assembly process with a mapping function which maps
> >>>> dof_map indices from local to global prior to updating the global
> >>>> tensor. Implement the call in class Assembler using functions from
> >>>> the Mesh class.
> >>> It might be enough to modify UFCCell::update().
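
Schematically, the extra step is just to translate the element's dof
indices through the local-to-global map before inserting the block; in
the sketch below the Matrix type and its add() signature are simplified
stand-ins for the real tensor interface:

  #include <vector>

  // Schematic: map local dof indices to global indices before adding
  // the element tensor to the global matrix. "local_to_global" plays
  // the role of the numbering discussed above.
  template <class Matrix>
  void add_element_tensor(const double* block,
                          const std::vector<unsigned int>& local_dofs,
                          const std::vector<unsigned int>& local_to_global,
                          Matrix& A)
  {
    std::vector<unsigned int> global_dofs(local_dofs.size());
    for (unsigned int i = 0; i < local_dofs.size(); ++i)
      global_dofs[i] = local_to_global[local_dofs[i]];

    // Insert the element block using the mapped (global) indices
    A.add(block, global_dofs.size(), &global_dofs[0],
          global_dofs.size(), &global_dofs[0]);
  }
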
> >>>
> > 
> > Ok, I was thinking about something similar to the previously
> > discussed pdofmap approach (src/sandbox/passembly).
> > 
> 
> This is the point which is most pressing. Magnus has mesh partitioning
> and distribution working (which can be refined later to be fully
> distributed), so to really get moving with parallel assembly we need
> to sort out the dof mapping. The second priority is then making sure
> that the Function class works properly in parallel.
> 
> Garth
> 
> > Niclas
> > 
> >>>> Change PETSc data types to MPI (PETScMatrix, PETScVector).
> >>>> Change PETSc solver environment to use the correct MPI
> >>>> communicator (all PETSc solver classes).
> >>> We need to determine whether to use MPI or Seq PETSc types
> >>> depending on whether we are running in parallel.
> >> We have types for this like AIJ and the default Vec.
> >>
> >>    Matt
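
A sketch of how the PETSc type could be picked at runtime from the size
of the communicator, so serial runs keep the Seq types (the function name
and sizes are illustrative):

  #include <petscmat.h>
  #include <petscvec.h>

  // Sketch: create a matrix and vector whose PETSc type follows the
  // size of the communicator, so the same code path works in serial
  // and in parallel.
  void create_linear_system(MPI_Comm comm, PetscInt n, Mat* A, Vec* b)
  {
    int size = 1;
    MPI_Comm_size(comm, &size);

    MatCreate(comm, A);
    MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, n, n);
    MatSetType(*A, size > 1 ? MATMPIAIJ : MATSEQAIJ);
    MatSetFromOptions(*A);

    VecCreate(comm, b);
    VecSetSizes(*b, PETSC_DECIDE, n);
    VecSetType(*b, size > 1 ? VECMPI : VECSEQ);
  }
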
> >>
> >>> --
> >>> Anders
> >>
> > 
> > 
> > 
> > _______________________________________________
> > DOLFIN-dev mailing list
> > DOLFIN-dev@xxxxxxxxxx
> > http://www.fenics.org/mailman/listinfo/dolfin-dev
> 
> 

