larry-discuss team mailing list archive
Message #00125
Re: Index by label support added to experimental branch
On Mon, Feb 8, 2010 at 4:34 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Mon, Feb 8, 2010 at 1:27 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Mon, Feb 8, 2010 at 1:26 PM, <josef.pktd@xxxxxxxxx> wrote:
>>> On Mon, Feb 8, 2010 at 4:19 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Mon, Feb 8, 2010 at 1:02 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> On Mon, Feb 8, 2010 at 12:59 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>>>> On Mon, Feb 8, 2010 at 3:51 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>>>>> On Mon, Feb 8, 2010 at 3:37 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>> On Mon, Feb 8, 2010 at 12:27 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>> I made a branch called index-by-label:
>>>>>>>>>
>>>>>>>>> https://code.launchpad.net/~kwgoodman/larry/index-by-label
>>>>>>>>>
>>>>>>>>> and added support for indexing by label. It is all self contained in
>>>>>>>>> one method: deflarry.py(lix). So easy to try.
>>>>>>>>>
>>>>>>>>> Here are a few one-off tests:
>>>>>>>>>
>>>>>>>>>>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>>>>>>>>>
>>>>>>>>>>> y.lix[['a']]
>>>>>>>>> label_0
>>>>>>>>> c
>>>>>>>>> d
>>>>>>>>> x
>>>>>>>>> array([1, 2])
>>>>>>>>>
>>>>>>>>>>> y.lix[['b', 'a']]
>>>>>>>>> label_0
>>>>>>>>> b
>>>>>>>>> a
>>>>>>>>> label_1
>>>>>>>>> c
>>>>>>>>> d
>>>>>>>>> x
>>>>>>>>> array([[3, 4],
>>>>>>>>>        [1, 2]])
>>>>>>>>>
>>>>>>>>>>> y.lix[['a']:]
>>>>>>>>> label_0
>>>>>>>>> a
>>>>>>>>> b
>>>>>>>>> label_1
>>>>>>>>> c
>>>>>>>>> d
>>>>>>>>> x
>>>>>>>>> array([[1, 2],
>>>>>>>>>        [3, 4]])
>>>>>>>>>
>>>>>>>>>>> y.lix[:['b']]
>>>>>>>>> label_0
>>>>>>>>> a
>>>>>>>>> label_1
>>>>>>>>> c
>>>>>>>>> d
>>>>>>>>> x
>>>>>>>>> array([[1, 2]])
>>>>>>>>>
>>>>>>>>>>> y.lix[['a'], ['c']:]
>>>>>>>>> label_0
>>>>>>>>> c
>>>>>>>>> d
>>>>>>>>> x
>>>>>>>>> array([1, 2])
>>>>>>>>>
>>>>>>>>>>> y.lix[['a'], ['d', 'c']]
>>>>>>>>> label_0
>>>>>>>>> d
>>>>>>>>> c
>>>>>>>>> x
>>>>>>>>> array([2, 1])
>>>>>>>>>
>>>>>>>>>>> y.lix['a':] # <--- expected to crash
>>>>>>>>> ValueError: The start element of a slice must be a list.
>>>>>>>>
>>>>>>>> I think the only thing left to add is the conversion of fancy indexing
>>>>>>>> to rectangular:
>>>>>>>>
>>>>>>>>>> y.lix[['a', 'b'], ['d', 'c']]
>>>>>>>> IndexError: tuple index out of range
>>>>>>>>
>>>>>>>> How do I identify rectangular indexing? If more than one list with
>>>>>>>> more than one element are present anywhere in the index, then it is
>>>>>>>> rectangular and I need to do your trick np.array(first list)[:,None]?
>>>>>>>
>>>>>>> that's the idea, with matching axes added
>>>>>>>
>>>>>>> can you use np.ix_ directly, once you have the numeric index list?
>>>>>>>
>>>>>>> y.x[np.ix_(lists of indices)]
>>>>>>>
>>>>>>> might not work with slices. To avoid problems with mixing slices and
>>>>>>> array indexing, I converted slices in the past to an array of indices:
>>>>>>>
>>>>>>> slice(None) -> np.arange(x.shape[ax])
>>>>>>
>>>>>> actually, I looked this up in the numpy thread with David Huard on
>>>>>> indexing/slicing
>>>>>>
>>>>>>>>> range(*slice(None).indices(5))
>>>>>> [0, 1, 2, 3, 4]
>>>>>>>>> range(*slice(2,None,None).indices(5))
>>>>>> [2, 3, 4]
>>>>>>>>> range(*slice(None,2,None).indices(5))
>>>>>> [0, 1]
>>>>>
>>>>> Very nice!
>>>>
>>>> The np.ix_ conversion is working:
>>>>
>>>>>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>>>>>> y.lix[['a', 'b'], ['d', 'c']]
>>>>
>>>> The index (['a', 'b'], ['d', 'c']) is converted to
>>>>
>>>> (array([[0],
>>>>         [1]]), array([[1, 0]]))
>>>>
>>>> So the lix method is done. The only problem left is that
>>>> larry.__getitem__ does not support that type of indexing.
>>>
>>> I thought my earlier example with broadcasting worked.
>>>
>>> you could also return a new sliced larry directly in lix instead of
>>> going through getitem.
>>> you already have the index array for the array lar1.x[...]
>>> all you need to add is the loop on selecting labels, for which you
>>> also already have the list of lists of indices, so it should be a
>>> short loop.
>>
>> Yep, that's the way I ended up going. I do all the indexing inside
>> lix whenever the input is a tuple.
>
> :)
>
>>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>>> y.lix[['a', 'b'], ['d', 'c']]
>
> label_0
> a
> b
> label_1
> d
> c
> x
> array([[2, 1],
>        [4, 3]])
>
Nice, I'll look at the branch after dinner; now it's time to get the kids
(one stayed home with the flu).
Josef
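
For reference, the conversion discussed in this thread (labels to integer
positions, slices expanded with slice.indices, then np.ix_ for the
rectangular selection, plus the short loop that picks out the labels) can be
sketched roughly as below. This is only an illustration under assumptions,
not the code in the index-by-label branch: the helper names
labels_to_positions and index_by_label are made up here, it works on a plain
array plus a list of label lists rather than on a larry, and unlike lix it
keeps every axis instead of dropping the axis when a single-label list is
given.

```python
import numpy as np


def labels_to_positions(key, axis_labels):
    """Turn one per-axis key into a list of integer positions.

    key is either a list of labels or a slice whose start/stop are
    one-element lists holding a label (the convention used in the thread).
    Hypothetical helper, not part of larry.
    """
    if isinstance(key, slice):
        start = None if key.start is None else axis_labels.index(key.start[0])
        stop = None if key.stop is None else axis_labels.index(key.stop[0])
        # slice.indices() resolves None endpoints against the axis length,
        # so the slice becomes an explicit list of positions.
        n = len(axis_labels)
        return list(range(*slice(start, stop, key.step).indices(n)))
    return [axis_labels.index(lab) for lab in key]


def index_by_label(x, labels, key):
    """Rectangular selection of x by label; returns (new_labels, new_x).

    Hypothetical helper, not part of larry.
    """
    if not isinstance(key, tuple):
        key = (key,)
    # Axes not mentioned in the key are taken whole.
    key = key + (slice(None),) * (x.ndim - len(key))
    positions = [labels_to_positions(k, lab) for k, lab in zip(key, labels)]
    # np.ix_ reshapes the position lists so they broadcast to an outer
    # product, which gives rectangular rather than element-wise indexing.
    new_x = x[np.ix_(*positions)]
    # The short loop over axes that selects the matching labels.
    new_labels = [[lab[i] for i in pos] for pos, lab in zip(positions, labels)]
    return new_labels, new_x


x = np.array([[1, 2], [3, 4]])
labels = [['a', 'b'], ['c', 'd']]
new_labels, new_x = index_by_label(x, labels, (['a', 'b'], ['d', 'c']))
print(new_labels)  # [['a', 'b'], ['d', 'c']]
print(new_x)       # [[2 1]
                   #  [4 3]]
```

Expanding every slice into explicit positions sidesteps the problem of
mixing slices with array indexing, at the cost of always returning a copy
rather than a view.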