
dolfin team mailing list archive

Re: Parallelization and PXMLMesh

 

On Fri, Dec 19, 2008 at 12:09:50PM +0900, Evan Lezar wrote:
> 
> 
> On Thu, Dec 18, 2008 at 4:25 AM, Johan Hake <hake@xxxxxxxxx> wrote:
> 
>     On Wednesday 17 December 2008 20:19:52 Anders Logg wrote:
>     > On Wed, Dec 17, 2008 at 08:13:03PM +0100, Johan Hake wrote:
>     > > On Wednesday 17 December 2008 19:20:11 Anders Logg wrote:
>     > > > Ola and I have now finished up the first round of getting DOLFIN to
>     > > > run in parallel. In short, we can now parse meshes from file in
>     > > > parallel and partition meshes in parallel (using ParMETIS).
>     > > >
>     > > > We reused some good ideas that Niclas Jansson had implemented in
>     > > > PXMLMesh before, but have also made some significant changes as
>     > > > follows:
>     > > >
>     > > > 1. The XML reader does not handle any partitioning.
>     > > >
>     > > > 2. The XML reader just reads in a chunk of the mesh data on each
>     > > > processor (in parallel) and stores that into a LocalMeshData object
>     > > > (one for each processor). The data is just partitioned in blocks so
>     > > > the vertices and cells may be completely unrelated.
>     > > >
>     > > > 3. The partitioning takes place in MeshPartitioning::partition,
>     > > > which gets a LocalMeshData object on each processor. It then calls
>     > > > ParMETIS to compute a partition (in parallel) and then redistributes
>     > > > the data accordingly. Finally, a mesh is built on each processor
>     > > > using the local data.
>     > > >
>     > > > 4. All direct MPI calls (except one which should be removed) have been
>     > > > removed from the code. Instead, we mostly rely on
>     > > > dolfin::MPI::distribute which handles most cases of parallel
>     > > > communication and works with STL data structures.
>     > > >
>     > > > 5. There is just one ParMETIS call (no initial geometric
>     > > > partitioning). It seemed like an unnecessary step, or are there good
>     > > > reasons to perform the partitioning in two steps?
>     > > >
>     > > > For testing, go to sandbox/passembly, build and then run
>     > > >
>     > > >   mpirun -n 4 ./demo
>     > > >   ./plot_partitions 4
>     > >
>     > > Looks beautiful!
>     > >
>     > > I threw a 3D mesh of 160K vertices onto it, and it was partitioned
>     > > nicely in some 10 s, on my 2 core laptop.
>     > >
>     > > Johan
>     >
>     > Nice, in particular since we haven't run any 3D test cases ourselves,
>     > just a tiny mesh of the unit square... :-)
> 
>     Yes I thought so too ;)
> 
>     Johan
>     _______________________________________________
>     DOLFIN-dev mailing list
>     DOLFIN-dev@xxxxxxxxxx
>     http://www.fenics.org/mailman/listinfo/dolfin-dev
> 
> 
> While we are on the topic of parallelisation, I have some comments.
> 
> A couple of months ago I was trying to parallelise some of my own code and ran
> into some problems with the communicators used by the PETSc and SLEPc solvers.
> I sort of got them to work, but never committed my changes because they were a
> little ungainly and I felt I needed to spend some more time on them.
> 
> One thing that I did notice is that the user does not have much control over
> which parts of the computation run in parallel: the number of MPI processes
> alone decides whether or not a parallel implementation is used.  In my case I
> wanted to perform a frequency sweep, solving the same eigenvalue problem at
> each frequency point.  My intention was to distribute the frequency sweep over
> a number of processes and handle the assembly and solution of each system
> separately.  This is not possible as the code stands, since the assembly and
> eigenvalue solvers all try to run in parallel as soon as the application is
> run as an MPI program.
> 
> I know that this is not very descriptive, but does anyone else have thoughts on
> the matter?  I will put together something a little more concrete as soon as I
> have a chance (I am travelling around quite a bit at the moment so it is
> difficult for me to focus).
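
What Evan describes, distributing independent frequency points over groups
of processes while each group assembles and solves on its own, maps
naturally onto MPI sub-communicators. Here is a minimal sketch in plain MPI;
none of this is existing DOLFIN functionality, and solve_frequency_point()
is a hypothetical stand-in for the per-frequency assembly and eigensolve:

  #include <mpi.h>
  #include <cstdio>
  #include <vector>

  // Hypothetical stand-in for assembling and solving one eigenvalue
  // problem at a single frequency on the given communicator.
  void solve_frequency_point(double frequency, MPI_Comm comm)
  {
    int rank;
    MPI_Comm_rank(comm, &rank);
    std::printf("group rank %d: solving f = %g\n", rank, frequency);
  }

  int main(int argc, char* argv[])
  {
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Split the world communicator into groups of one process each, so
    // each group handles its own frequency points serially. A larger
    // group_size would let each point itself be assembled and solved in
    // parallel within its group.
    const int group_size = 1;
    const int num_groups = world_size / group_size;
    const int color = world_rank / group_size;
    MPI_Comm group_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &group_comm);

    // Distribute the frequency sweep over the groups in round-robin order
    std::vector<double> frequencies;
    for (int i = 0; i < 16; i++)
      frequencies.push_back(1.0e9 + 0.1e9*i);

    for (unsigned int i = color; i < frequencies.size(); i += num_groups)
      solve_frequency_point(frequencies[i], group_comm);

    MPI_Comm_free(&group_comm);
    MPI_Finalize();
    return 0;
  }

Getting group_comm down to the PETSc and SLEPc objects created inside each
group is exactly where the communicator issues Evan mentions come in.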

We're just getting started. Everything needs to be configurable. At the
moment, we assume that all parallel computation is split across
MPI::num_processes() processes.

We can either add optional parameters to all parallel calls or add
global options to control how many processes are used. But I suggest
we wait until we have all the pieces in place. We currently have
parallel parsing and partitioning working. The next step will be to
get parallel assembly working, then the parallel solve (presumably
simple since it's handled by PETSc or Epetra).
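
For reference, this is what the current behaviour looks like from user code:
constructing a mesh from file in a program launched with mpirun triggers the
parallel parsing and ParMETIS partitioning automatically, spread over all
MPI::num_processes() processes. A minimal sketch, where the filename is
arbitrary and MPI::process_number() and the mesh size queries are assumed to
be the obvious accessors:

  #include <iostream>
  #include <dolfin.h>
  using namespace dolfin;

  int main()
  {
    // Run with e.g. "mpirun -n 4 ./demo". Each process parses a chunk of
    // the XML file into a LocalMeshData object, MeshPartitioning calls
    // ParMETIS and redistributes the data, and each process ends up with
    // its own local piece of the mesh.
    Mesh mesh("mesh.xml");

    // Report the local share of the global mesh on each process
    std::cout << "Process " << MPI::process_number()
              << " of " << MPI::num_processes() << ": "
              << mesh.num_vertices() << " vertices, "
              << mesh.num_cells() << " cells" << std::endl;

    return 0;
  }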

-- 
Anders


