Re: Parallelization and PXMLMesh
On Fri, Dec 19, 2008 at 09:06:43AM +0100, Martin Sandve Alnæs wrote:
> On Fri, Dec 19, 2008 at 8:54 AM, Anders Logg <logg@xxxxxxxxx> wrote:
> > On Fri, Dec 19, 2008 at 12:09:50PM +0900, Evan Lezar wrote:
> >>
> >>
> >> On Thu, Dec 18, 2008 at 4:25 AM, Johan Hake <hake@xxxxxxxxx> wrote:
> >>
> >> On Wednesday 17 December 2008 20:19:52 Anders Logg wrote:
> >> > On Wed, Dec 17, 2008 at 08:13:03PM +0100, Johan Hake wrote:
> >> > > On Wednesday 17 December 2008 19:20:11 Anders Logg wrote:
> >> > > > Ola and I have now finished up the first round of getting DOLFIN to
> >> > > > run in parallel. In short, we can now parse meshes from file in
> >> > > > parallel and partition meshes in parallel (using ParMETIS).
> >> > > >
> >> > > > We reused some good ideas that Niclas Jansson had implemented in
> >> > > > PXMLMesh before, but have also made some significant changes as
> >> > > > follows:
> >> > > >
> >> > > > 1. The XML reader does not handle any partitioning.
> >> > > >
> >> > > > 2. The XML reader just reads in a chunk of the mesh data on each
> >> > > > processor (in parallel) and stores that into a LocalMeshData object
> >> > > > (one for each processor). The data is just partitioned in blocks so
> >> > > > the vertices and cells may be completely unrelated.
> >> > > >
> >> > > > 3. The partitioning takes place in MeshPartitioning::partition,
> >> > > > which gets a LocalMeshData object on each processor. It then calls
> >> > > > ParMETIS to compute a partition (in parallel) and then redistributes
> >> > > > the data accordingly. Finally, a mesh is built on each
> >> > > > processor using the local data.
> >> > > >
> >> > > > 4. All direct MPI calls (except one which should be removed)
> >> > > > have been removed from the code. Instead, we mostly rely on
> >> > > > dolfin::MPI::distribute which handles most cases of parallel
> >> > > > communication and works with STL data structures.
> >> > > >
> >> > > > 5. There is just one ParMETIS call (no initial geometric
> >> > > > partitioning). It seemed like an unnecessary step, or are there good
> >> > > > reasons to perform the partitioning in two steps?
> >> > > >
> >> > > > For testing, go to sandbox/passembly, build and then run
> >> > > >
> >> > > > mpirun -n 4 ./demo
> >> > > > ./plot_partitions 4
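> >> > > >
> >> > > > Schematically, the flow on each processor is something like this
> >> > > > (a rough sketch only; the exact signatures may still change):
> >> > > >
> >> > > >   #include <dolfin.h>
> >> > > >   using namespace dolfin;
> >> > > >
> >> > > >   int main(int argc, char* argv[])
> >> > > >   {
> >> > > >     // 1-2. Each processor reads its own block of vertex and cell
> >> > > >     // data from the XML file into a LocalMeshData object.
> >> > > >     LocalMeshData local_data;
> >> > > >     File file("mesh.xml");
> >> > > >     file >> local_data;
> >> > > >
> >> > > >     // 3. ParMETIS computes the partition in parallel, the data is
> >> > > >     // redistributed (mostly via dolfin::MPI::distribute) and a
> >> > > >     // mesh is built from the local data on each processor.
> >> > > >     Mesh mesh;
> >> > > >     MeshPartitioning::partition(mesh, local_data);
> >> > > >
> >> > > >     return 0;
> >> > > >   }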
> >> > >
> >> > > Looks beautiful!
> >> > >
> >> > > I threw a 3D mesh of 160K vertices onto it, and it was partitioned
> >> > > nicely in some 10 s on my 2-core laptop.
> >> > >
> >> > > Johan
> >> >
> >> > Nice, in particular since we haven't run any 3D test cases ourselves,
> >> > just a tiny mesh of the unit square... :-)
> >>
> >> Yes I thought so too ;)
> >>
> >> Johan
> >>
> >> While we are on the topic of parallelisation - I have some comments.
> >>
> >> A couple of months ago I was trying to parallelise some of my own code and ran
> >> into some problems with the communicators used for PETSc and SLEPc problems - I
> >> sort of got them to work, but never committed my changes because they were a
> >> little ungainly and I felt I needed to spend some more time on them.
> >>
> >> One thing that I did notice is that the user does not have much control over
> >> which parts of the process run in parallel - with the number of MPI processes
> >> deciding whether or not a parallel implementation should be used. In my case
> >> what I wanted to do was perform a frequency sweep and for each frequency point
> >> perform the same calculation (an eigenvalue problem) for that frequency. My
> >> intention was to distribute the frequency sweep over a number of processors and
> >> then handle the assembly and solution of each system separately. This is not
> >> possible with the code as it is now, since the assembly and eigenvalue
> >> solvers all try to run in parallel as soon as the application is run as
> >> an MPI program.
> >>
> >> I know that this is not very descriptive, but does anyone else have thoughts on
> >> the matter? I will put together something a little more concrete as soon as I
> >> have a chance (I am travelling around quite a bit at the moment so it is
> >> difficult for me to focus).
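> >>
> >> In the meantime, the kind of control I am after looks roughly like this
> >> in plain MPI (solve_eigenproblem is just a placeholder for the assembly
> >> and SLEPc solve at a single frequency):
> >>
> >>   #include <mpi.h>
> >>   #include <vector>
> >>
> >>   // Placeholder: assemble and solve the eigenvalue problem for one
> >>   // frequency, using only the given (serial) communicator.
> >>   void solve_eigenproblem(double frequency, MPI_Comm comm) {}
> >>
> >>   int main(int argc, char* argv[])
> >>   {
> >>     MPI_Init(&argc, &argv);
> >>
> >>     int rank = 0, size = 1;
> >>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >>     MPI_Comm_size(MPI_COMM_WORLD, &size);
> >>
> >>     // The frequency points are distributed over the MPI processes...
> >>     std::vector<double> frequencies(100);
> >>     for (std::size_t i = 0; i < frequencies.size(); i++)
> >>       frequencies[i] = i;
> >>
> >>     for (std::size_t i = rank; i < frequencies.size(); i += size)
> >>     {
> >>       // ...but the assembly and eigenvalue solve for each frequency
> >>       // should run serially on this process only, i.e. on
> >>       // MPI_COMM_SELF rather than MPI_COMM_WORLD.
> >>       solve_eigenproblem(frequencies[i], MPI_COMM_SELF);
> >>     }
> >>
> >>     MPI_Finalize();
> >>     return 0;
> >>   }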
> >
> > We're just getting started. Everything needs to be configurable. At
> > the moment, we assume that all parallel computation should be split
> > across MPI::num_processes() processes.
> >
> > We can either add optional parameters to all parallel calls or add
> > global options to control how many processes are used. But I suggest
>
> Global options wouldn't solve anything (the point here is a variable
> number of processes during the application lifetime), and a single
> parameter is probably enough: the communicator, perhaps permanently
> attached to each FunctionSpace (the dofmap must be distributed over
> the processes in a communicator to be usable). Calls to MPI::num_processes()
> would then be replaced by V.communicator().num_processes() or something.
> I get the waiting argument though. When stuff works, a search for MPI::*
> should reveal the places where "V.communicator()." should replace "MPI::",
> likely covering most places that need to be updated.
>
> Martin
Sounds good.
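Something like this is what I imagine (a sketch only; none of it exists
yet and the names are made up):

  #include <mpi.h>

  // Sketch: a thin wrapper that a FunctionSpace could hold on to, so
  // that code which currently calls MPI::num_processes() can instead
  // ask the function space for its communicator.
  class Communicator
  {
  public:
    explicit Communicator(MPI_Comm comm = MPI_COMM_WORLD) : _comm(comm) {}

    unsigned int num_processes() const
    {
      int size = 0;
      MPI_Comm_size(_comm, &size);
      return static_cast<unsigned int>(size);
    }

  private:
    MPI_Comm _comm;
  };

so that V.communicator().num_processes() replaces MPI::num_processes()
as you say.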
Speaking of the communicator, I'd like to add a default communicator
to the MPI class, accessible as MPI::communicator(), but I couldn't
figure out how to do this (I suspect it's trivial).
Then we could avoid including mpi.h in MeshPartitioning.cpp (look for
the first two FIXMEs in that file). If anyone knows how to do this,
let me know.
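What I'm after is roughly the following (a sketch only, and it doesn't
quite get there: the MPI_Comm type itself comes from mpi.h, so the
remaining raw calls in MeshPartitioning.cpp would presumably have to
move behind the wrapper as well):

  #include <mpi.h>

  namespace dolfin
  {
    // Sketch: only the new function is shown. The idea is that the
    // MPI wrapper class owns the default communicator and hands it
    // out to anything that needs it.
    class MPI
    {
    public:
      // Return the default communicator (MPI_COMM_WORLD for now)
      static MPI_Comm communicator()
      { return MPI_COMM_WORLD; }
    };
  }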
--
Anders