← Back to team overview

dolfin team mailing list archive

Re: [noreply@xxxxxxxxxxxxx: [Branch ~dolfin-core/dolfin/main] Rev 4333: Add outline of a new Array class.]

 

On Monday 07 December 2009 04:11:28 Garth N. Wells wrote:
> Johan Hake wrote:
> > On Sunday 06 December 2009 07:23:27 Garth N. Wells wrote:
> >> Anders Logg wrote:
> >>> On Sun, Dec 06, 2009 at 12:46:53PM +0000, Garth N. Wells wrote:
> >>>> Anders Logg wrote:
> >>>>> Shouldn't this be templated? This could be used also for passing uint
> >>>>> arrays to and from Python.
> >>>>
> >>>> It probably should be, but it may make things more complicated on the
> >>>> SWIG side. Perhaps Johan H. could comment - not on whether it's
> >>>> possible (I'm sure for Johan that it is), but on whether or not it's
> >>>> easy.
> >>>>
> >>>> Templating it would be better on the C++ side, not just for double,
> >>>> int, etc, but to handle const pointers correctly.
> >>>
> >>> ok.
> >>>
> >>>>> I imagine we can typedef
> >>>>>
> >>>>>   DoubleArray
> >>>>>   UintArray
> >>>>
> >>>> This is a bit ugly. Might as well just use Array<double>, Array<uint>,
> >>>> etc.
> >>>
> >>> I agree it's ugly. I didn't suggest it for use in C++ but that it
> >>> might simplify for the wrapper generators but when I think of it
> >>> again, it probably makes no difference for SWIG.
> >>
> >> As a start, we can declare the templates with these names in SWIG, and
> >> SWIG will wrap the member functions and it should work out of the box
> >> (with a few tweaks for [] and =).
> >
> > We can make the Array class templated. We can also specialize SWIG
> > interfaces for the a particular Array<Foo> class dependent on what Foo
> > is. If it is a primitive (double, uint) we can expand the class with a
> > set of constructors, utility methods, and typemaps which should make it
> > easy to interact with NumPy. However we need to cover all possible cases
> > of such extensions wrt to memory handling and difference in intent use.
> 
> Can you give us a short tutorial on NumPy and memory management, namely:
> 
> 1. If a Numpy object is created from an point to a double, will NumPy
> ever destroy the data?

The long story:

Here are lines 487-505 in ndarrayobject.h from NumPy in karmic. The lines 
describes the quite straightforward Capi of a PyArrayObject:

typedef struct PyArrayObject {
        PyObject_HEAD
        char *data;             /* pointer to raw data buffer */
        int nd;                 /* number of dimensions, also called ndim */
        npy_intp *dimensions;       /* size in each dimension */
        npy_intp *strides;          /* bytes to jump to get to the
                                   next element in each dimension */
        PyObject *base;         /* This object should be decref'd
                                   upon deletion of array */
                                /* For views it points to the original array 
*/
                                /* For creation from buffer object it points
                                   to an object that shold be decref'd on
                                   deletion */
                                /* For UPDATEIFCOPY flag this is an array
                                   to-be-updated upon deletion of this one */
        PyArray_Descr *descr;   /* Pointer to type structure */
        int flags;              /* Flags describing array -- see below*/
        PyObject *weakreflist;  /* For weakreferences */
} PyArrayObject;

We are mostly interested in data, nd, dimensions, descr and flags. The first 
three I think are self explanatory, descr describes the data type (strides 
tell us the size in bytes between the different data segments for each 
dimension. Interesting enough but I have never used it because macros handles 
low level iterations). flags holds a number of important flags for the state 
of the data. These are explained in the same file: lines 530-584. The one we 
are most interested in is:

/* If set, the array owns the data: it will be free'd when the array
   is deleted. */
#define NPY_OWNDATA       0x0004

You can also print the most important flags in Python:

  z = Zeros(10)
  z.flags

<http://docs.scipy.org/doc/numpy/reference/c-api.array.html#creating-arrays>

describes a bunch of macros for creating a NumPy array. If one use:

  PyArray_SimpleNewFromData

which we use in for example mesh_pre.i for coordinates, the flag will be set 
to not owe the data. We can can use the more elaborated macro:

  PyArray_NewFromDescr

to both pass a data structure and set the flags, to for example to owe the 
data. I suppose one can operate directly on an NumPy object to set the flags, 
but I would consider this dangerous.

> 2. If the pointer to the NumyPy data is used to create say an Array
> object, can the Numpy object be stopped from destroying it, or do we
> need to make a copy?

See above.

> > In particular will we need to decide what we want the argument:
> > Array<Foo> & to do. If we know a method will not resize the argument, we
> > can just map the content to the NumPy pointer.
> 
> I think that we should allow resizing.

Me too, but I suppose we will not "allow" it for all methods. For example we 
do not want to let values change in the eval method? If we have such 
convention for some methods we could create a typemap that just pass the 
pointer from the NumPy array to the Array<Foo>. If we are not sure whether the 
data will be resized we cannot pass the pointer from the NumPy array.

> > If we allow resize we can either turn the
> > argument into a pure out argument, i.e., we construct the Array<Foo> in
> > the interface, pass it to the C++ method, and when it is returned, create
> > a NumPy array which takes ownership of the data and pass this to Python.
> 
> This sounds easy.

Easy enough.

> Or we can let
> 
> > the user pass an instance of ArrayFoo to the method, and use
> > ArrayFoo.array() to get a NumPy view of the data.
> 
> This would involve extending Array with the member function 'array' in
> the SWIG code?

yes.

> Would the function array create a NumPy view of the data, or would the
> NumPy object already  exist (i.e., be created in the interface when
> moving the object from C++ to Python)?

Depending on how we do it in the array method it can be a view.

> This option sounds more complicated, but could be nice if it works such
> that if a NumPy object is not required, it is never constructed.

Exactly. I think we should add it just as an improvement of the Python 
interface of Array. See comment on copy versus view below.

> > Note that we can have both of these typemaps (for the same Foo) at the
> > same time in PyDOLFIN, by either using specific argument names for the
> > two different cases:
> >
> >   Array<Foo> & values
> >   Array<Foo> & values_resize
> >
> > or we need to construct module specific typemaps, which might get out of
> > hands, when things get complicated.
> >
> > Garth do you suggest we also use Array<const Foo*> to pass what we now
> > use std::vector for (DirichleBC, GenericFunction, aso)? Would this work
> > from the C++ side? We loose the push_back functionality from std::vector.
> > If you think we go for Array<const Foo*> it is no big deal creating
> > specific typemaps for this.
> 
> I'm not sure.

Ok.

> > As soon as I am able I can add NumPy typemaps for the most
> > straightforward cases.
> 
> Great. We can tweak the Array class as needed. I expect that we'll need
> some functions to deal with memory management in the wrapper code.

Maybe, or we could just control the deeper memory handling in the typemaps, 
preventing a user to segfault his program just because he missused the Python 
interface of ArrayFoo.

Using a view of data is of course dangerous as it is. Try:

  mesh = UnitSquare(1,1)
  c = mesh.coordinates()
  del mesh

Now will c's data variable will point to a deleted memory. If we want to avoid 
this we need to copy the data.

Johan

Follow ups

References