dolfin team mailing list archive

Re: multi-thread assembly

 

On Wed, Nov 10, 2010 at 08:06:00PM +0000, Garth N. Wells wrote:
>
>
> On 10/11/10 15:47, Anders Logg wrote:
> >On Wed, Nov 10, 2010 at 02:47:30PM +0000, Garth N. Wells wrote:
> >>Nice to see multi-thread assembly being added. We should look at
> >>adding support for the multi-threaded version of SuperLU. What other
> >>multi-thread solvers are out there?
> >
> >Yes, that would be good, but I don't know which solvers are available.
> >
> >>I haven't looked at the code in great detail, but are element
> >>tensors being added to the global tensor in a thread-safe fashion?
> >>Both PETSc and Trilinos are not thread-safe.
> >
> >Yes, they should be. That's the main point. It's a very simple algorithm
> >which just partitions the matrix row by row and makes each process
> >responsible for a chunk of rows. During assembly, all processes
> >iterate over the entire mesh and on each cell do one of three things:
> >
> >   1. all_in_range:  tabulate_tensor as usual and add
> >   2. none_in_range: skip tabulate_tensor (continue)
> >   3. some_in_range: tabulate_tensor and insert only rows in range
> >
> >Didem Unat (PhD student at UCLA/Simula) tried this in a simple
> >prototype code and got very good speedups (up to a factor 7 on an
> >eight-core machine) so it's just a matter of doing the same thing as
> >part of DOLFIN (which is a bit trickier since some of the data access
> >is hidden). The current implementation in DOLFIN seems to work and
> >gives a small speedup, but I need to do some more testing.
> >
> >>Rather than having two assembly classes, would it be worth using
> >>OpenMP instead? I experimented with OpenMP some time ago, but never
> >>added it since at the time it required a very recent version of gcc.
> >>This shouldn't be a problem now.
> >
> >I don't think this would work with OpenMP since we need to control how
> >the rows are inserted.
> >
>
> The thread number (id) can be accessed in OpenMP. Is there another
> barrier to using OpenMP?
>
> Garth

I haven't used OpenMP much so I can't say for sure this can't be done
with OpenMP, but I don't see the point in using it. Using boost
threads is very simple and I have complete control over what each
thread does, which data is local and which data is shared. And the
code is just standard C++, no ugly pragmas that obfuscate the code.
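
To make the row-partitioning scheme quoted above concrete, here is a
minimal sketch. It is not the DOLFIN implementation: it uses std::thread
instead of boost threads to stay self-contained, assembles a vector
rather than a matrix, and the names Cell, tabulate_tensor, assemble_range
and assemble are hypothetical stand-ins chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in, not a DOLFIN type: a cell is just the list of
// global rows (dofs) it contributes to.
using Cell = std::vector<std::size_t>;

// Stand-in for tabulate_tensor: every local entry contributes 1.0.
std::vector<double> tabulate_tensor(const Cell& cell)
{
  return std::vector<double>(cell.size(), 1.0);
}

// Work done by one thread: visit every cell, but only write rows in
// [begin, end). Row ranges are disjoint, so no two threads write to the
// same entry of b and no locking is needed.
void assemble_range(const std::vector<Cell>& mesh, std::vector<double>& b,
                    std::size_t begin, std::size_t end)
{
  const auto in_range = [=](std::size_t row)
  { return begin <= row && row < end; };

  for (const Cell& cell : mesh)
  {
    // none_in_range: skip tabulate_tensor entirely
    if (std::none_of(cell.begin(), cell.end(), in_range))
      continue;

    // all_in_range and some_in_range: tabulate, then insert owned rows only
    const std::vector<double> element = tabulate_tensor(cell);
    for (std::size_t i = 0; i < cell.size(); ++i)
      if (in_range(cell[i]))
        b[cell[i]] += element[i];
  }
}

// Split the rows into num_threads contiguous chunks, one per thread.
std::vector<double> assemble(const std::vector<Cell>& mesh,
                             std::size_t num_rows, std::size_t num_threads)
{
  assert(num_threads > 0);
  std::vector<double> b(num_rows, 0.0);
  std::vector<std::thread> threads;
  for (std::size_t t = 0; t < num_threads; ++t)
  {
    const std::size_t begin = t * num_rows / num_threads;
    const std::size_t end = (t + 1) * num_rows / num_threads;
    threads.emplace_back(assemble_range, std::cref(mesh), std::ref(b),
                         begin, end);
  }
  for (std::thread& thread : threads)
    thread.join();
  return b;
}
```

Calling assemble(mesh, num_rows, n) should give the same result for any
n >= 1, since each row is written by exactly one thread. The price is
that every thread walks the whole mesh, and cells straddling a chunk
boundary are tabulated more than once.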

--
Anders


