Re: LA slicing in Python interface

 

On Sunday January 2 2011 13:08:05 Garth N. Wells wrote:
> On 02/01/11 06:47, Johan Hake wrote:
> > Hello!
> > 
> > As pointed out by Garth in a couple of commit messages lately we need to
> > consider the slicing interface for vectors and matrices in PyDOLFIN.
> > 
> > Now that the parallel interface is starting to get more and more robust
> > (great work Garth!), we need to consider what:
> >    v[6], v[4:], v[index_array] # returning a scalar and a vector
> >    A[5,6], A[5:,6], A[5:,6:] # returning a scalar, a vector and a matrix
> > 
> > should return for serial and parallel structures.
> > 
> > I guess Garth's comment that the Matrix slicing "is just a toy" has
> > some merit, as it uses a dense representation of the Matrix (even
> > though it erases zero entries). Regardless of whether we decide to
> > discard the Matrix slicing, we need to decide how off-process indices
> > should be treated.
> > 
> > As it is now, a user needs to check whether an index is owned by the
> > local process, using GenericVector::owns_index, before accessing the
> > data.
> > 
> > Another scenario would be to always allow a user to write:
> >    v[5], v[4:] and v[index_array]
> > 
> > and then, for example, return None (for single value access) and a
> > reduced vector (only including local values) for the two other
> > alternatives. The user then needs to deal with the result, knowing how
> > we treat this in parallel.
> 
> Another option is to not return a numpy vector, but to add a function to
> the C++ interface that creates a new reduced vector (which, in the
> general case, will be distributed across processes) based on a list of
> indices.

This sounds good. If the slice came as an Array<uint> it would be easy to map
this to a NumPy-like interface.
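For concreteness, here is a rough sketch (not DOLFIN code) of how the SWIG
layer could turn a Python slice or integer array into the flat array of
global indices an Array<uint>-based method would receive; the helper name
slice_to_indices is made up for illustration:

  import numpy as np

  def slice_to_indices(index, global_size):
      # Map the index in v[index] to an array of global indices
      if isinstance(index, slice):
          start, stop, step = index.indices(global_size)
          return np.arange(start, stop, step, dtype=np.uintc)
      # Otherwise assume an integer array; wrap negative indices as NumPy does
      indices = np.asarray(index)
      return np.where(indices < 0, indices + global_size,
                      indices).astype(np.uintc)

  # v[4:] on a vector of global size 10 -> global indices 4..9
  print(slice_to_indices(slice(4, None), 10))  # [4 5 6 7 8 9]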
 
> > Matrix slicing is trickier, especially now that GenericMatrix::resize
> > has been removed. There are probably ways of getting around this by
> > building a SparsityPattern based on the slice. But I am not sure it
> > makes any sense to do a matrix slice for a parallel structure, as it
> > is only the rows that are local. Then we need to decide if the Matrix
> > slicing should be:
> >    1) removed
> 
> I think that for the fancier use cases this would be best.
> 
> >    2) reimplemented but disabled in parallel
> 
> Definitely no! A real selling point is that users can write simple
> Python programs that can run in parallel without knowing it. The code in
> the Python adaptivity module is broken now because it uses code that
> wouldn't work in parallel. We should make it hard for users to write
> code that won't work in parallel.
> 
> >    3) reimplemented for both serial and parallel cases using, for
> >       example, a similar approach as sketched above for vectors, that
> >       is, always returning something.
> 
> We should be able to extract rows from matrices as distributed vectors
> for most backends. 

We would then need a more elaborate getrow method in GenericMatrix.

Btw, is it possible to create (or initialize) distributed vectors with
arbitrary distribution patterns? If so, it would be cool to support this
through the GenericVector interface, so we could implement subvector
extraction there.

> It looks like PETSc can also extract sub-matrices
> (and return them as distributed matrices). I guess all we need to do in
> the SWIG interface layer is convert the '[:,2]' part of A[:,2] into a
> list of indices to pass to the C++ interface. Extracting sub-vectors
> from C++ should be easy.

Ok, do you know if Epetra supports this too?

> In summary, I think that the best option is to add some more
> functionality to the C++ interface, and allow Python users to call these
> functions using NumPy-like syntax. 

Ok, what would this interface look like? Something like:

  void GenericMatrix::getsubmat(const Array<uint>& rows,
                                const Array<uint>& columns,
                                GenericMatrix& values);

  void GenericMatrix::setsubmat(const Array<uint>& rows,
                                const Array<uint>& columns,
                                const GenericMatrix& values);

  void GenericMatrix::getrow(uint row, const Array<uint>& columns,
                             GenericVector& values);

  void GenericMatrix::setrow(uint row, const Array<uint>& columns,
                             const GenericVector& values);

  void GenericVector::getsubvec(const Array<uint>& indices,
                                GenericVector& values);

  void GenericVector::setsubvec(const Array<uint>& indices,
                                const GenericVector& values);
  
For scalars I guess we just return the global value? 
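To make the mapping concrete, here is a rough Python sketch of how a
PyDOLFIN __getitem__ could dispatch to the methods above. All of it is the
proposal, not existing code: slice_to_indices is the helper sketched
earlier, and the scalar branch assumes some single-value accessor (called
getitem here for illustration) that returns the global value:

  from dolfin import Vector, Matrix

  def matrix_getitem(A, key):
      rows, cols = key
      nrows, ncols = A.size(0), A.size(1)
      if isinstance(rows, int) and isinstance(cols, int):
          # A[5, 6] -> the global scalar value
          return A.getitem((rows, cols))
      if isinstance(rows, int):
          # A[5, 6:] -> a (possibly distributed) vector
          values = Vector()
          A.getrow(rows, slice_to_indices(cols, ncols), values)
          return values
      # A[5:, 6:] -> a (possibly distributed) matrix
      values = Matrix()
      A.getsubmat(slice_to_indices(rows, nrows),
                  slice_to_indices(cols, ncols), values)
      return values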

> This would provide most of the
> functionality that we had up to a few days ago. We should avoid using
> NumPy data structures since a DOLFIN vector/matrix should be presumed to
> be distributed, and NumPy objects can only be local.

The index vectors need to be global; they can then be mapped to NumPy
arrays, which in turn can be mapped to slice objects. FYI, previously we
returned GenericVectors and GenericMatrices, not NumPy arrays.

> For advanced users, we could make it easy to construct SciPy sparse
> matrices from the sparse data arrays. A user would then be on their own
> when it comes to parallel programs.

Yes, that would be easy to do. For now this is only possible for the uBLAS
and MTL4 backends, using the GenericMatrix::data method. Is this data
available for the distributed backends too? If so, we could add a method to
GenericMatrix (or a free function) that creates a SciPy sparse matrix.
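Something along these lines, assuming data() exposes the CSR arrays (row
pointer, column indices, values) as it does for uBLAS; in parallel each
process would of course only see its local rows:

  from scipy.sparse import csr_matrix

  def to_scipy(A):
      # A.data() -> (row pointer, column indices, values) in CSR storage
      rowptr, colind, values = A.data()
      return csr_matrix((values, colind, rowptr),
                        shape=(A.size(0), A.size(1)))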

Johan


