Martin Sandve Alnæs wrote:
> On 12/5/06, Garth N. Wells <g.n.wells@xxxxxxxxxx> wrote:
>>
>>
>> Martin Sandve Alnæs wrote:
>> >>
>> >> The partition is related to this. Entries in a matrix associated
>> >> with a particular partition are formed on the process to which the
>> >> mesh partition belongs - no problems there. But you need to make
>> >> sure (through the dof numbering) that (nearly) all of the terms
>> >> computed by a process are also stored by that process.
>> >
>> > In the index_set built in the beforementioned algorithm, all nodes
>> > related to a cell partition are included. Nodes shared between two
>> > processes (partitions) are stored on both processes. This is
>> > independent of the global dof numbering.
>> >
>> > During the assembly, I'm not sure matrix->SumIntoGlobalValues(...)
>> > does any communication at all. At the end of assembly there is a call
>> > I didn't mention, matrix->GlobalAssemble(), which does the
>> > communication of shared values which are then added together.
>> >
>> >> The FFC mapping for vector-valued equations is particularly
>> >> unsuited to this as two unknowns corresponding to a single node
>> >> (say x and y components) are located far from each other (the
>> >> distance is 1/2 of the matrix size).
>> >
>> > I don't see this as being the same issue.
>>
>>
>> > Which nodes reside on a
>> > particular process is defined by the cell partition, independent of
>> > the global dof numbering.
>>
>> The cell partition determines by which process the contribution of a
>> cell is computed. The degree of freedom numbering determines to which
>> process the matrix entry belongs. Consider a 2n x 2n matrix stored on
>> two processes,
>>
>> Process 0 | A B |
>> Process 1 | C D |
>>
>> where the sub-matrices A and B reside on process 0, and C and D reside on
>> process 1. All sub-matrices have size n x n. Consider a cell in the
>> middle of partition 0 whose local degrees of freedom (0, 1, 2) map to
>> (n+1, n+2, n+3) in the global matrix. The contribution of this cell is
>> computed on process 0, but for matrix assembly it is communicated to
>> process 1. I don't know Epetra, but I suspect that GlobalAssemble()
>> performs this communication.
>
> First about vectors only:
> Epetra_Map can be constructed at least two different ways.
>
> One way gives a linear map like you describe, where the first n/2
> vector entries are mapped to process 0 and the next n/2 entries to
> process 1.
>
> The other way, which I use with my index_set, is something like this
> (from memory):
>
> int n = index_set.size();
> int* my_indices = new int[n];
> // fill my_indices from index_set (dof partition)
> std::copy(index_set.begin(), index_set.end(), my_indices);
> // -1: Epetra computes the global size; comm is the Epetra_Comm object
> epetra_map = new Epetra_Map(-1, n, my_indices, 0, comm);
>
> This way, Epetra allows a distribution of vector entries that is not
> contiguous in the global vector. In other words, while
> my_local_vector[i] and my_local_vector[i+1] are contiguous in memory in
> the local process, the corresponding entries in my_global_vector may
> not be.
>
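For concreteness, here is roughly how I read the two constructions you
describe (only a sketch - the communicator, the global size and the
placeholder fill of index_set below are mine, not taken from your code):

#include <mpi.h>
#include <set>
#include <vector>
#include <Epetra_MpiComm.h>
#include <Epetra_Map.h>

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);
  Epetra_MpiComm comm(MPI_COMM_WORLD);

  const int N = 2000; // total number of dofs (2n in the example above)

  // (1) Linear map: the first N/p entries go to process 0, the next
  //     N/p to process 1, etc. - contiguous in the global vector.
  Epetra_Map linear_map(N, 0, comm);

  // (2) Map built from the index set of dofs touched by the local cell
  //     partition - in general not contiguous in the global vector.
  std::set<int> index_set;
  // placeholder fill: a strided set of indices, just so the sketch runs
  for (int i = comm.MyPID(); i < N; i += comm.NumProc())
    index_set.insert(i);
  std::vector<int> my_indices(index_set.begin(), index_set.end());
  Epetra_Map dof_map(-1, (int) my_indices.size(), &my_indices[0], 0, comm);

  MPI_Finalize();
  return 0;
}
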
Isn't having the global vector (and the corresponding matrices) as
contiguous as possible the key to efficiency? If the distribution is not
contiguous, I think the performance will be very poor: a lot of
communication will be required for basic operations such as
matrix-vector multiplication. Constructing good mesh partitions won't
help if you don't renumber the degrees of freedom to reflect the
partitioning.
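
As a rough illustration of what I mean by renumbering: if each process
counts the dofs it owns (after shared dofs have been assigned a unique
owner), an exclusive scan gives the offset of its contiguous block, and
the new global numbers can then be assigned locally. A sketch only - the
function name and the owned_dofs input are made up:

#include <mpi.h>
#include <map>
#include <vector>

// Sketch: give each process a contiguous block of the new global
// numbering. owned_dofs holds the old global numbers of the dofs this
// process owns (shared dofs already resolved to a single owner).
std::map<int, int> renumber_dofs(const std::vector<int>& owned_dofs,
                                 MPI_Comm comm)
{
  int rank;
  MPI_Comm_rank(comm, &rank);

  int n_local = (int) owned_dofs.size();

  // Offset of this process's block = sum of the local counts on all
  // lower-ranked processes (exclusive scan).
  int offset = 0;
  MPI_Exscan(&n_local, &offset, 1, MPI_INT, MPI_SUM, comm);
  if (rank == 0)
    offset = 0; // MPI_Exscan leaves the result undefined on rank 0

  // Old global dof number -> new, process-contiguous global number.
  std::map<int, int> new_number;
  for (int i = 0; i < n_local; ++i)
    new_number[owned_dofs[i]] = offset + i;

  // The new numbers of dofs owned by neighbouring processes would still
  // have to be exchanged (not shown here).
  return new_number;
}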