larry-discuss team mailing list archive

Re: Index by label support added to experimental branch

 

On Mon, Feb 8, 2010 at 1:27 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Mon, Feb 8, 2010 at 1:26 PM,  <josef.pktd@xxxxxxxxx> wrote:
>> On Mon, Feb 8, 2010 at 4:19 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Mon, Feb 8, 2010 at 1:02 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Mon, Feb 8, 2010 at 12:59 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>> On Mon, Feb 8, 2010 at 3:51 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>> On Mon, Feb 8, 2010 at 3:37 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>> On Mon, Feb 8, 2010 at 12:27 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>> I made a branch called index-by-label:
>>>>>>>>
>>>>>>>> https://code.launchpad.net/~kwgoodman/larry/index-by-label
>>>>>>>>
>>>>>>>> and added support for indexing by label. It is all self contained in
>>>>>>>> one method: deflarry.py(lix). So easy to try.
>>>>>>>>
>>>>>>>> Here are a few one-off tests:
>>>>>>>>
>>>>>>>>>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>>>>>>>>
>>>>>>>>>> y.lix[['a']]
>>>>>>>> label_0
>>>>>>>>    c
>>>>>>>>    d
>>>>>>>> x
>>>>>>>> array([1, 2])
>>>>>>>>
>>>>>>>>>> y.lix[['b', 'a']]
>>>>>>>> label_0
>>>>>>>>    b
>>>>>>>>    a
>>>>>>>> label_1
>>>>>>>>    c
>>>>>>>>    d
>>>>>>>> x
>>>>>>>> array([[3, 4],
>>>>>>>>       [1, 2]])
>>>>>>>>
>>>>>>>>>> y.lix[['a']:]
>>>>>>>> label_0
>>>>>>>>    a
>>>>>>>>    b
>>>>>>>> label_1
>>>>>>>>    c
>>>>>>>>    d
>>>>>>>> x
>>>>>>>> array([[1, 2],
>>>>>>>>       [3, 4]])
>>>>>>>>
>>>>>>>>>> y.lix[:['b']]
>>>>>>>> label_0
>>>>>>>>    a
>>>>>>>> label_1
>>>>>>>>    c
>>>>>>>>    d
>>>>>>>> x
>>>>>>>> array([[1, 2]])
>>>>>>>>
>>>>>>>>>> y.lix[['a'], ['c']:]
>>>>>>>> label_0
>>>>>>>>    c
>>>>>>>>    d
>>>>>>>> x
>>>>>>>> array([1, 2])
>>>>>>>>
>>>>>>>>>> y.lix[['a'], ['d', 'c']]
>>>>>>>> label_0
>>>>>>>>    d
>>>>>>>>    c
>>>>>>>> x
>>>>>>>> array([2, 1])
>>>>>>>>
>>>>>>>>>> y.lix['a':]  # <--- expected to crash
>>>>>>>> ValueError: The start element of a slice must be a list.
>>>>>>>
>>>>>>> I think the only thing left to add is the conversion of fancy indexing
>>>>>>> to rectangular:
>>>>>>>
>>>>>>>>> y.lix[['a', 'b'], ['d', 'c']]
>>>>>>> IndexError: tuple index out of range
>>>>>>>
>>>>>>> How do I identify rectangular indexing? If more than one list with
>>>>>>> more than one element are present anywhere in the index, then it is
>>>>>>> rectangular and I need to do your trick np.array(first list)[:,None]?
>>>>>>
>>>>>> that's the idea, with matching axes added
>>>>>>
>>>>>> can you use np.ix_ directly, once you have the numeric index list?
>>>>>>
>>>>>> y.x[np.ix_(lists of indices)]
>>>>>>
>>>>>> might not work with slices. To avoid problems with mixing slices and
>>>>>> array indexing, I converted slices in the past to an array of indices
>>>>>>
>>>>>> slice(None)   -> np.arange(x.shape[ax])
>>>>>
>>>>> actually, I looked this up in the numpy thread with David Huard on
>>>>> indexing/slicing
>>>>>
>>>>>>>> range(*slice(None).indices(5))
>>>>> [0, 1, 2, 3, 4]
>>>>>>>> range(*slice(2,None,None).indices(5))
>>>>> [2, 3, 4]
>>>>>>>> range(*slice(None,2,None).indices(5))
>>>>> [0, 1]
>>>>
>>>> Very nice!
>>>
>>> The np.ix_ conversion is working:
>>>
>>>>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>>>>> y.lix[['a', 'b'], ['d', 'c']]
>>>
>>> The index (['a', 'b'], ['d', 'c']) is converted to
>>>
>>> (array([[0],
>>>       [1]]), array([[1, 0]]))
>>>
>>> So the lix method is done. The only problem left is that
>>> larry.__getitem__ does not support that type of indexing.
>>
>> I thought my earlier example with broadcasting worked.
>>
>> you could also return a new sliced larry directly in lix instead of
>> going through getitem.
>> you already have the index array for the array lar1.x[...]
>> all you need to add is the loop on selecting labels, for which you
>> also have already the list of lists of indices, so it should be a
>> short loop.
>
> Yep, that's the way I ended up going. I do all the indexing inside
> lix whenever the input is a tuple.

:)

>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>> y.lix[['a', 'b'], ['d', 'c']]

label_0
    a
    b
label_1
    d
    c
x
array([[2, 1],
       [4, 3]])
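
For anyone following along, the steps discussed above (slices converted to index lists with slice.indices, rectangular indexing with np.ix_, then a short loop to select the labels) can be sketched in plain NumPy. This is only an illustration and not the code in the index-by-label branch: the names to_positions and lix_sketch are made up here, the index is assumed to be a tuple with one entry per axis, and unlike the real lix it never drops a dimension for a one-element list.

```python
import numpy as np

def to_positions(key, labels):
    """Convert one axis of a label index to a list of integer positions.

    key is a list of labels, or a slice whose start/stop are one-element
    lists of labels (the convention used in this thread).
    Hypothetical helper, not part of larry.
    """
    if isinstance(key, slice):
        start = None if key.start is None else labels.index(key.start[0])
        stop = None if key.stop is None else labels.index(key.stop[0])
        # slice.indices fills in the None endpoints for an axis of this length
        return list(range(*slice(start, stop, key.step).indices(len(labels))))
    return [labels.index(k) for k in key]

def lix_sketch(x, label, index):
    """Rectangular label indexing; assumes one index entry per axis."""
    positions = [to_positions(k, lab) for k, lab in zip(index, label)]
    # np.ix_ turns the per-axis position lists into broadcastable index arrays
    new_x = x[np.ix_(*positions)]
    # The short loop over axes that picks out the matching labels
    new_label = [[lab[i] for i in pos] for pos, lab in zip(positions, label)]
    return new_x, new_label

x = np.array([[1, 2], [3, 4]])
label = [['a', 'b'], ['c', 'd']]

new_x, new_label = lix_sketch(x, label, (['a', 'b'], ['d', 'c']))
print(new_label)
print(new_x)

# A slice on the first axis mixed with a list on the second
sub_x, sub_label = lix_sketch(x, label, (slice(['b'], None), ['c']))
print(sub_label)
print(sub_x)
```

Converting every slice to an explicit list of positions first means np.ix_ only ever sees integer sequences, which sidesteps the problem of mixing slices and array indexing mentioned earlier in the thread.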


