
larry-discuss team mailing list archive

Re: Index by label support added to experimental branch

 

On Mon, Feb 8, 2010 at 4:34 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Mon, Feb 8, 2010 at 1:27 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Mon, Feb 8, 2010 at 1:26 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>> On Mon, Feb 8, 2010 at 4:19 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Mon, Feb 8, 2010 at 1:02 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> On Mon, Feb 8, 2010 at 12:59 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>> On Mon, Feb 8, 2010 at 3:51 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>> On Mon, Feb 8, 2010 at 3:37 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>> On Mon, Feb 8, 2010 at 12:27 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>> I made a branch called index-by-label:
>>>>>>>>>
>>>>>>>>> https://code.launchpad.net/~kwgoodman/larry/index-by-label
>>>>>>>>>
>>>>>>>>> and added support for indexing by label. It is all self contained in
>>>>>>>>> one method: deflarry.py(lix). So easy to try.
>>>>>>>>>
>>>>>>>>> Here are a few one-off tests:
>>>>>>>>>
>>>>>>>>>>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>>>>>>>>>
>>>>>>>>>>> y.lix[['a']]
>>>>>>>>> label_0
>>>>>>>>>    c
>>>>>>>>>    d
>>>>>>>>> x
>>>>>>>>> array([1, 2])
>>>>>>>>>
>>>>>>>>>>> y.lix[['b', 'a']]
>>>>>>>>> label_0
>>>>>>>>>    b
>>>>>>>>>    a
>>>>>>>>> label_1
>>>>>>>>>    c
>>>>>>>>>    d
>>>>>>>>> x
>>>>>>>>> array([[3, 4],
>>>>>>>>>       [1, 2]])
>>>>>>>>>
>>>>>>>>>>> y.lix[['a']:]
>>>>>>>>> label_0
>>>>>>>>>    a
>>>>>>>>>    b
>>>>>>>>> label_1
>>>>>>>>>    c
>>>>>>>>>    d
>>>>>>>>> x
>>>>>>>>> array([[1, 2],
>>>>>>>>>       [3, 4]])
>>>>>>>>>
>>>>>>>>>>> y.lix[:['b']]
>>>>>>>>> label_0
>>>>>>>>>    a
>>>>>>>>> label_1
>>>>>>>>>    c
>>>>>>>>>    d
>>>>>>>>> x
>>>>>>>>> array([[1, 2]])
>>>>>>>>>
>>>>>>>>>>> y.lix[['a'], ['c']:]
>>>>>>>>> label_0
>>>>>>>>>    c
>>>>>>>>>    d
>>>>>>>>> x
>>>>>>>>> array([1, 2])
>>>>>>>>>
>>>>>>>>>>> y.lix[['a'], ['d', 'c']]
>>>>>>>>> label_0
>>>>>>>>>    d
>>>>>>>>>    c
>>>>>>>>> x
>>>>>>>>> array([2, 1])
>>>>>>>>>
>>>>>>>>>>> y.lix['a':]  # <--- expected to crash
>>>>>>>>> ValueError: The start element of a slice must be a list.
>>>>>>>>
>>>>>>>> I think the only thing left to add is the conversion of fancy indexing
>>>>>>>> to rectangular:
>>>>>>>>
>>>>>>>>>> y.lix[['a', 'b'], ['d', 'c']]
>>>>>>>> IndexError: tuple index out of range
>>>>>>>>
>>>>>>>> How do I identify rectangular indexing? If more than one list with
>>>>>>>> more than one element are present anywhere in the index, then it is
>>>>>>>> rectangular and I need to do your trick np.array(first list)[:,None]?
>>>>>>>
>>>>>>> that's the idea, with matching axes added
>>>>>>>
>>>>>>> can you use np.ix_ directly, once you have the numeric index list?
>>>>>>>
>>>>>>> y.x[np.ix_(lists of indices)]
>>>>>>>
>>>>>>> might not work with slices. To avoid problems with mixing slices and
>>>>>>> array indexing, I converted slices in the past to an array of indices
>>>>>>>
>>>>>>> slice(None)   -> np.arange(x.shape[ax])
>>>>>>
>>>>>> actually, I looked this up in the numpy thread with David Huard on
>>>>>> indexing/slicing
>>>>>>
>>>>>>>>> range(*slice(None).indices(5))
>>>>>> [0, 1, 2, 3, 4]
>>>>>>>>> range(*slice(2,None,None).indices(5))
>>>>>> [2, 3, 4]
>>>>>>>>> range(*slice(None,2,None).indices(5))
>>>>>> [0, 1]
>>>>>
>>>>> Very nice!
>>>>
>>>> The np.ix_ conversion is working:
>>>>
>>>>>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>>>>>> y.lix[['a', 'b'], ['d', 'c']]
>>>>
>>>> The index (['a', 'b'], ['d', 'c']) is converted to
>>>>
>>>> (array([[0],
>>>>       [1]]), array([[1, 0]]))
>>>>
>>>> So the lix method is done. The only problem left is that
>>>> larry.__getitem__ does not support that type of indexing.
>>>
>>> I thought my earlier example with broadcasting worked.
>>>
>>> you could also return a new sliced larry directly in lix instead of
>>> going through getitem.
>>> you already have the index array for the array lar1.x[...]
>>> all you need to add, is the loop on selecting labels, for which you
>>> also have already the list of lists of indices, so it should be a
>>> short loop.
>>
>> Yep, that's the way I ended up going. I do all the indexing inside
>> lix whenever the input is a tuple.
>
> :)
>
>>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>>> y.lix[['a', 'b'], ['d', 'c']]
>
> label_0
>    a
>    b
> label_1
>    d
>    c
> x
> array([[2, 1],
>       [4, 3]])
>

nice, I'll look at the branch after dinner, now it's time to get the kids
(one stayed home with flu)

Josef
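
For readers following along without the branch, here is a minimal sketch of the conversion discussed above, in plain NumPy and without larry. It is not the code in lix. The helper names labels_to_positions and rectangular_take are hypothetical, and the sketch assumes one key per axis and unique labels on each axis. Unlike the outputs quoted in the thread, it keeps every axis, so a one-element list does not drop a dimension.

```python
import numpy as np


def labels_to_positions(key, axis_labels):
    """Turn one per-axis key into a list of integer positions.

    `key` is either a list of labels or a slice whose bounds are
    one-element lists of labels, e.g. slice(['a'], None).
    (Hypothetical helper, not part of larry.)
    """
    if isinstance(key, slice):
        def bound(b):
            # Map a slice bound such as ['a'] to its integer position.
            if b is None:
                return None
            if not isinstance(b, list) or len(b) != 1:
                raise ValueError("Slice bounds must be one-element lists.")
            return axis_labels.index(b[0])

        numeric = slice(bound(key.start), bound(key.stop), key.step)
        # slice.indices() resolves None bounds against the axis length.
        return list(range(*numeric.indices(len(axis_labels))))
    return [axis_labels.index(label) for label in key]


def rectangular_take(x, labels, key):
    """Select a rectangular block of `x` by label, one key per axis."""
    positions = [labels_to_positions(k, lab) for k, lab in zip(key, labels)]
    # np.ix_ builds the open mesh that turns fancy indexing into
    # rectangular (outer-product) indexing.
    new_x = x[np.ix_(*positions)]
    new_labels = [[lab[i] for i in pos] for pos, lab in zip(positions, labels)]
    return new_x, new_labels


x = np.array([[1, 2], [3, 4]])
labels = [['a', 'b'], ['c', 'd']]

new_x, new_labels = rectangular_take(x, labels, (['a', 'b'], ['d', 'c']))
print(new_labels)
print(new_x)
```

Converting every slice to an explicit list of positions first means np.ix_ only ever sees integer sequences, which sidesteps the problem of mixing slices and array indexing mentioned earlier in the thread.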


