dolfin team mailing list archive

Re: Parallelization and PXMLMesh

 

On Thu, Dec 18, 2008 at 4:25 AM, Johan Hake <hake@xxxxxxxxx> wrote:

> On Wednesday 17 December 2008 20:19:52 Anders Logg wrote:
> > On Wed, Dec 17, 2008 at 08:13:03PM +0100, Johan Hake wrote:
> > > On Wednesday 17 December 2008 19:20:11 Anders Logg wrote:
> > > > Ola and I have now finished up the first round of getting DOLFIN to
> > > > run in parallel. In short, we can now parse meshes from file in
> > > > parallel and partition meshes in parallel (using ParMETIS).
> > > >
> > > > We reused some good ideas that Niclas Jansson had implemented in
> > > > PXMLMesh before, but have also made some significant changes as
> > > > follows:
> > > >
> > > > 1. The XML reader does not handle any partitioning.
> > > >
> > > > 2. The XML reader just reads in a chunk of the mesh data on each
> > > > processor (in parallel) and stores that into a LocalMeshData object
> > > > (one for each processor). The data is just partitioned in blocks so
> > > > the vertices and cells may be completely unrelated.
> > > >
> > > > 3. The partitioning takes place in MeshPartitioning::partition,
> > > > which gets a LocalMeshData object on each processor. It then calls
> > > > ParMETIS to compute a partition (in parallel) and then redistributes
> > > > the data accordingly. Finally, a mesh is built on each processor
> > > > using the local data.
> > > >
> > > > 4. All direct MPI calls (except one which should be removed) have
> > > > been removed from the code. Instead, we mostly rely on
> > > > dolfin::MPI::distribute which handles most cases of parallel
> > > > communication and works with STL data structures.
> > > >
> > > > 5. There is just one ParMETIS call (no initial geometric
> > > > partitioning). It seemed like an unnecessary step, or are there good
> > > > reasons to perform the partitioning in two steps?
> > > >
> > > > For testing, go to sandbox/passembly, build and then run
> > > >
> > > >   mpirun -n 4 ./demo
> > > >   ./plot_partitions 4
> > >
> > > Looks beautiful!
> > >
> > > I threw a 3D mesh of 160K vertices onto it, and it was partitioned
> > > nicely in some 10 s, on my 2-core laptop.
> > >
> > > Johan
> >
> > Nice, in particular since we haven't run any 3D test cases ourselves,
> > just a tiny mesh of the unit square... :-)
>
> Yes I thought so too ;)
>
> Johan
> _______________________________________________
> DOLFIN-dev mailing list
> DOLFIN-dev@xxxxxxxxxx
> http://www.fenics.org/mailman/listinfo/dolfin-dev
>
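
To make sure I am reading the description above correctly: on each process the
new flow is roughly the sketch below. The class names (LocalMeshData,
MeshPartitioning) are taken from the message; the constructors and call
signatures in the sketch are only my guesses, not the actual DOLFIN interface.

// Rough sketch of the parallel read-and-partition flow described above.
// Class names come from the message; the exact signatures are guesses and
// will differ from the real DOLFIN code.
#include <dolfin.h>
using namespace dolfin;

int main(int argc, char* argv[])
{
  // 1-2. Each process parses its own block of the XML data into a
  //      LocalMeshData object (the vertices and cells in a block may be
  //      completely unrelated at this point).
  LocalMeshData local_data;
  File file("mesh.xml");
  file >> local_data;                             // assumed interface

  // 3. ParMETIS computes the partition in parallel, the data is
  //    redistributed, and a Mesh is built from the local data.
  Mesh mesh;
  MeshPartitioning::partition(mesh, local_data);  // assumed signature

  // 4. Internally the redistribution goes through dolfin::MPI::distribute,
  //    which works with STL containers instead of raw MPI calls.
  return 0;
}

Run under mpirun as in the sandbox/passembly demo, e.g. mpirun -n 4 ./demo.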

While we are on the topic of parallelisation - I have some comments.

A couple of months ago I was trying to parallelise some of my own code and
ran into some problems with the communicators used for PETSc and SLEPc
problems - I sort of got them to work, but never committed my changes because
they were a little ungainly and I felt I needed to spend some more time on
them.

One thing that I did notice is that the user does not have much control over
which parts of the process run in parallel - the number of MPI processes alone
decides whether or not a parallel implementation is used. In my case what I
wanted to do was perform a frequency sweep, solving the same eigenvalue
problem at each frequency point. My intention was to distribute the frequency
sweep over a number of processors and then handle the assembly and solution of
each system separately. This is not possible with the code as it is now, since
the assembly and eigenvalue solvers all try to run in parallel as soon as the
application is run as an MPI program.
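
Very roughly, the usage pattern I am after is something like the sketch below:
let each MPI process work through its own share of the frequency points, with
the assembly and eigenvalue solve for each point kept serial on that process
(e.g. on a per-process communicator such as PETSC_COMM_SELF rather than
MPI_COMM_WORLD). The assemble_and_solve() function is just a placeholder for
the DOLFIN assembly and SLEPc solve, not an existing call.

// Sketch of the usage pattern I have in mind: distribute the frequency
// sweep over the MPI processes, but keep each assembly/eigenvalue solve
// serial on the process that owns that frequency point.
#include <mpi.h>
#include <vector>

// Placeholder for the per-frequency DOLFIN assembly and SLEPc solve; in
// the real code this would set up and solve the eigenvalue problem
// serially (e.g. on PETSC_COMM_SELF).
static void assemble_and_solve(double frequency)
{
  (void) frequency;
}

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);

  int rank = 0, size = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // The frequency points to sweep over (dummy values).
  std::vector<double> frequencies;
  for (int i = 0; i < 100; ++i)
    frequencies.push_back(1.0e9 + i*1.0e7);

  // Each process takes every size-th frequency point; the work for each
  // point should stay serial instead of the library deciding to go
  // parallel just because the application was started with mpirun.
  for (int i = rank; i < (int) frequencies.size(); i += size)
    assemble_and_solve(frequencies[i]);

  MPI_Finalize();
  return 0;
}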

I know that this is not very descriptive, but does anyone else have thoughts
on the matter?  I will put together something a little more concrete as soon
as I have a chance (I am travelling around quite a bit at the moment so it
is difficult for me to focus).

Thanks
Evan
