← Back to team overview

dolfin team mailing list archive

Re: LA slicing in Python interface

 



On 02/01/11 22:45, Johan Hake wrote:
On Sunday January 2 2011 13:08:05 Garth N. Wells wrote:
On 02/01/11 06:47, Johan Hake wrote:
Hello!

As pointed out by Garth in a couple of commit messages lately we need to
consider the slicing interface for vectors and matrices in PyDOLFIN.

Now that the parallel interface are starting to get more and more robust

(great work Garth!) we need to consider what:
    v[6], v[4:], v[index_array] # returning a scalar and a vector
    A[5,6], A[5:,6], A[5:,6:] # returning a scalar, a vector and a matrix

should return for serial and parallel structures.

I guess Garth comments that the Matrix slicing "is just a toy" has some
merits as it uses a dense representation of the Matrix (even though it
is erasing zero entries). Regardless if we decide to discard the Matrix
slicing or not we need to decide how off process indices should be
treated.

As it is now, a user need to check whether an index is owned by the local
processor or not, by using GenericVector::owns_index, before he access
the data.

An other scenario would be to always allow a user to write:
    v[5], v[4:] and v[index_array]

and then return for example None (for single value access) and a reduced
vector (only including local values) for the two other alternatives. Then
the user need to deal with the result knowing how we treat this in
parallel.

Another option is to not return a numpy vector, but to add a function to
the C++ interface that creates a new reduced vector (which, in the
general case, will be distributed across processes) based on a list of
indices.

This sounds good. If the slice came as an Array<uint>  it would be easy to map
this to a NumPy like interface.

Matrix slicing is more tricky especially now when GenericMatrix::resize
is removed. There are probably ways of getting around this by building a
SparsityPattern based on the slice. But I am not sure it makes any
senses to do a matrix slice for a parallel structure, as it is only the
rows that are

local. Then we need to decide if the Matrix slicing should be:
    1) removed

I think that for the fanciest usage cases this would be best.

    2) reimplemented but disabled in parallel

Definitely no! A real selling point is that users can write simple
Python programs that can run in parallel without knowing it. The code in
the Python adaptivity module is broken now because it uses code that
wouldn't work in parallel. We should make it hard for users to write
code that won't work in parallel.

    3) reimplemented for both serial and parallel cases using for example

       a similar approach as sketched above for vectors, that is always
       returning something.

We should be able to extract rows from matrices as distributed vectors
for most backends.

We then need a more elaborated getrow method in GenericMatrix.


The present getrow is a very troublesome function in parallel.

Btw, is it possible to create (or initialize) distributed vectors with
arbitrary distribution patterns?

Almost. There is a new constructor that takes the local range for the vector. It must be linear, but it is possible to specify the number of entries on each process.

If so it would be cool to support this
through the GenericVector interface, so we could implement a sub vector
extension in the GenericVector interface.

It looks like PETSc can also extract sub-matrices
(and return them as distributed matrices). I guess all we need to do in
the SWIG interface layer is convert the '[:,2]' part of A[:,2] into a
list of indices it to pass to the C++ interface. Extracting sub-vectors
from C++ should be easy.

Ok, do you know if Epetra supports this too?


I don't think so. Epetra has very limited matrix functionality compared to PETSc.

I summary, I think that the best option is to add some more
functionality to the C++ interface, and allow Python users to call these
functions using NumPy-like syntax.

Ok, what would this syntax be?

   GenericMatrix::getsubmat(const Array<uint>  rows,
                            const Array<uint>  columns,
                            GenericMatrix&  values);

   GenericMatrix::setsubmat(const Array<uint>  rows,
                            const Array<uint>  columns,
                            const GenericMatrix&  values);

   GenericMatrix::getrow(uint row, const Array<uint>  columns,
                         GenericVector&  values);

   GenericMatrix::setrow(uint row, const Array<uint>  columns,
                         const GenericVector&  values);

   GenericVector::getsubvec(const Array<uint>  indices,
                            GenericVector&  values);

   GenericVector::getsubvec(const Array<uint>  indices,
                            GenericVector&  values);


Something like that.

For scalars I guess we just return the global value?

This would provide most of the
functionality that we had up to a few days ago. We should avoid using
NumPy data structures since a DOLFIN vector/matrix should be presumed to
be distributed, and NumPy objects can only be local.

The indice vectors need to be global, and can then be mapped to NumPy arrays,
which can then be mapped to slice objects. FYI, previously we returned
GenericVectors and GenericMatrices not NumPy arrays.


True. I was thinking about the 'array' function.

For advanced users, would could make it easy to construct SciPy sparse
matrices form the sparse data arrays. A user would then be on their own
when it comes to parallel programs.

Yes that would be easy to do. For now this is only possible for the uBLAS and
MTL4 backends, using the GenericMatrix::data method. Is this data available
for the distributed backends too?

Yes, but for simplicity it would be advisable not to mess with it.

Garth

If so we could add a method to GenericMatrix
(or a free function) which creates a SciPy sparse matrix.

Johan



Follow ups

References