On 11/11/10 09:53, Anders Logg wrote:
> On Wed, Nov 10, 2010 at 08:43:22PM +0000, Garth N. Wells wrote:
>> On 10/11/10 20:29, Anders Logg wrote:
>>> On Wed, Nov 10, 2010 at 04:40:47PM +0000, Garth N. Wells wrote:
>>>> On 10/11/10 16:10, Anders Logg wrote:
>>>>> On Wed, Nov 10, 2010 at 03:58:13PM +0000, Garth N. Wells wrote:
>>>>>> On 10/11/10 15:47, Anders Logg wrote:
>>>>>>> On Wed, Nov 10, 2010 at 02:47:30PM +0000, Garth N. Wells wrote:
>>>>>>>> Nice to see multi-thread assembly being added. We should look
>>>>>>>> at adding support for the multi-threaded version of SuperLU.
>>>>>>>> What other multi-thread solvers are out there?
>>>>>>>
>>>>>>> Yes, that would be good, but I don't know which solvers are
>>>>>>> available.
>>>>>>>
>>>>>>>> I haven't looked at the code in great detail, but are element
>>>>>>>> tensors being added to the global tensor in a thread-safe
>>>>>>>> fashion? Both PETSc and Trilinos are not thread-safe.
>>>>>>>
>>>>>>> Yes, they should. That's the main point. It's a very simple
>>>>>>> algorithm which just partitions the matrix row by row and makes
>>>>>>> each process responsible for a chunk of rows.
>>>>>>
>>>>>> Would it be better to partition the mesh (using Metis) and then
>>>>>> renumber dofs? That way the 'all_in_range' case would be
>>>>>> maximised and the 'some_in_range' would be minimised. If the rows
>>>>>> are distributed, mixed elements will be a mess because the global
>>>>>> rows are far apart (using the FFC-generated dof map).
>>>>>
>>>>> Renumbering is definitely important for getting good speedup. This
>>>>> hasn't been added yet but was implemented in the prototype version
>>>>> (which was stand-alone from DOLFIN).
>>>>>
>>>>>> What about partitioning of the mesh?
>>>>>
>>>>> I'm not sure that would help. If each process has access to the
>>>>> whole mesh, which is either one big connected mesh or a connected
>>>>> piece of the mesh when running in parallel with MPI, then
>>>>> renumbering on that piece should be enough. Or am I missing
>>>>> something?
>>>>
>>>> I think that you're missing something - partitioning (not in
>>>> memory, but just assigning a partition number) would minimise the
>>>> number of cells on partition boundaries, thereby maximising the
>>>> number of cells for which 'all_in_range = true'.
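The row-ownership test discussed above can be sketched roughly as follows. This is only an illustration and not the DOLFIN implementation: the names `RowOwnership` and `classify_cell` are hypothetical, and the sketch assumes that each thread owns a half-open range of global rows [begin, end) and that the rows a cell contributes to are its global dof indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical labels mirroring the 'all_in_range' / 'some_in_range' cases
// mentioned in the thread.
enum class RowOwnership { none_in_range, some_in_range, all_in_range };

// Classify a cell against the row range [begin, end) owned by one thread.
// 'dofs' holds the global dof (row) indices of the cell.
RowOwnership classify_cell(const std::vector<std::size_t>& dofs,
                           std::size_t begin, std::size_t end)
{
  assert(begin <= end);

  // Count how many of the cell's rows fall inside the owned range
  std::size_t inside = 0;
  for (std::size_t d : dofs)
    if (d >= begin && d < end)
      ++inside;

  if (inside == 0)
    return RowOwnership::none_in_range;
  return inside == dofs.size() ? RowOwnership::all_in_range
                               : RowOwnership::some_in_range;
}
```

Under these assumptions, a thread would skip cells classified as `none_in_range`, insert the full element tensor for `all_in_range` cells, and insert only its own rows for `some_in_range` cells.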
>>>> We could mark cells that are 'internal' to a partition (hence
>>>> 'all_in_range = true') and cells on the boundary (hence
>>>> 'some_in_range = true'), e.g.
>>>>
>>>>   // Partition cells with cell id (negated for cells on partition
>>>>   // boundary)
>>>>   MeshFunction<int> partition;
>>>>
>>>>   if (partition(cell) == thread_id)
>>>>     compute tensor and assemble all
>>>>   else if (std::abs(partition(cell)) == thread_id)
>>>>     compute tensor and assemble some terms
>>>>   else
>>>>     do nothing
>>>>
>>>> What you've described could go wrong for certain cell/dof
>>>> numberings. What I describe above wouldn't depend on the numbering.
>>>
>>> Yes, that would work, but I imagine a good renumbering algorithm
>>> would accomplish the same thing.
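One way the signed cell markers from the pseudocode might be computed is sketched below. This is a guess at the intent, not code from DOLFIN: `mark_cells` is a hypothetical helper, plain `std::vector` containers stand in for `MeshFunction<int>`, and a cell is taken to be "on the partition boundary" when it shares at least one dof with a cell of another partition. Partition numbers are assumed to be 1-based, because negating partition 0 would be indistinguishable from the interior marker.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not DOLFIN API).
// cell_dofs[c]: global dofs of cell c; part[c]: 1-based partition of cell c.
// Returns +p for cells interior to partition p and -p for cells that share
// at least one dof with a cell belonging to a different partition.
std::vector<int> mark_cells(
    const std::vector<std::vector<std::size_t>>& cell_dofs,
    const std::vector<int>& part, std::size_t num_dofs)
{
  assert(part.size() == cell_dofs.size());

  // dof_owner[d]: 0 = not seen yet, p > 0 = touched only by partition p,
  // -1 = touched by more than one partition
  std::vector<int> dof_owner(num_dofs, 0);
  for (std::size_t c = 0; c < cell_dofs.size(); ++c)
  {
    assert(part[c] >= 1);
    for (std::size_t d : cell_dofs[c])
    {
      assert(d < num_dofs);
      if (dof_owner[d] == 0)
        dof_owner[d] = part[c];
      else if (dof_owner[d] != part[c])
        dof_owner[d] = -1;
    }
  }

  // Negate the marker of every cell that touches a shared dof
  std::vector<int> marker(part);
  for (std::size_t c = 0; c < cell_dofs.size(); ++c)
    for (std::size_t d : cell_dofs[c])
      if (dof_owner[d] == -1)
      {
        marker[c] = -part[c];
        break;
      }
  return marker;
}
```

With markers of this form, a thread with 1-based id `t` could treat cells with marker `t` as the "assemble all" case and cells with marker `-t` as the "assemble some terms" case, as in the pseudocode.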
>> Q. How to determine the optimal re-numbering?
>>
>> A. Partition the mesh to minimise 'interface' length between
>> partitions, and number sequentially on each partition.
>>
>> :)
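A minimal sketch of "number sequentially on each partition", under stated assumptions: `renumber_by_partition` is a hypothetical name, the cell partition is given as one integer per cell (for instance from a graph partitioner), and dofs shared between partitions are numbered by whichever partition is visited first. It is not the renumbering used in DOLFIN or in the prototype mentioned earlier.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not DOLFIN API).
// Returns old_to_new, where old_to_new[d] is the new number of dof d.
// Dofs not referenced by any cell keep the sentinel value num_dofs.
std::vector<std::size_t> renumber_by_partition(
    const std::vector<std::vector<std::size_t>>& cell_dofs,
    const std::vector<int>& part, std::size_t num_dofs)
{
  assert(part.size() == cell_dofs.size());

  // Visit cells grouped by partition; the stable sort keeps the original
  // cell order inside each partition
  std::vector<std::size_t> order(cell_dofs.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&part](std::size_t a, std::size_t b)
                   { return part[a] < part[b]; });

  // Number each dof the first time it is encountered
  const std::size_t unset = num_dofs;
  std::vector<std::size_t> old_to_new(num_dofs, unset);
  std::size_t next = 0;
  for (std::size_t c : order)
    for (std::size_t d : cell_dofs[c])
    {
      assert(d < num_dofs);
      if (old_to_new[d] == unset)
        old_to_new[d] = next++;
    }
  return old_to_new;
}
```

Because dofs are numbered partition by partition, each partition ends up with a contiguous block of new numbers, apart from interface dofs that an earlier partition has already claimed.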
> Anyway, it's worth trying and comparing to just doing the renumbering.
>
> How do you propose to renumber?

Renumbering is a *lot* simpler now that the UFC dofmap is copied into data
structures in DOLFIN, and the UFC tabulate dofs function is only called at
initialisation.

Garth

> --
> Anders