larry-discuss team mailing list archive
Message #00123
Re: Index by label support added to experimental branch
On Mon, Feb 8, 2010 at 1:26 PM, <josef.pktd@xxxxxxxxx> wrote:
> On Mon, Feb 8, 2010 at 4:19 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Mon, Feb 8, 2010 at 1:02 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Mon, Feb 8, 2010 at 12:59 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>> On Mon, Feb 8, 2010 at 3:51 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>>> On Mon, Feb 8, 2010 at 3:37 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>> On Mon, Feb 8, 2010 at 12:27 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>> I made a branch called index-by-label:
>>>>>>>
>>>>>>> https://code.launchpad.net/~kwgoodman/larry/index-by-label
>>>>>>>
>>>>>>> and added support for indexing by label. It is all self contained in
>>>>>>> one method: deflarry.py(lix). So easy to try.
>>>>>>>
>>>>>>> Here are a few one-off tests:
>>>>>>>
>>>>>>>>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>>>>>>>
>>>>>>>>> y.lix[['a']]
>>>>>>> label_0
>>>>>>> c
>>>>>>> d
>>>>>>> x
>>>>>>> array([1, 2])
>>>>>>>
>>>>>>>>> y.lix[['b', 'a']]
>>>>>>> label_0
>>>>>>> b
>>>>>>> a
>>>>>>> label_1
>>>>>>> c
>>>>>>> d
>>>>>>> x
>>>>>>> array([[3, 4],
>>>>>>>        [1, 2]])
>>>>>>>
>>>>>>>>> y.lix[['a']:]
>>>>>>> label_0
>>>>>>> a
>>>>>>> b
>>>>>>> label_1
>>>>>>> c
>>>>>>> d
>>>>>>> x
>>>>>>> array([[1, 2],
>>>>>>>        [3, 4]])
>>>>>>>
>>>>>>>>> y.lix[:['b']]
>>>>>>> label_0
>>>>>>> a
>>>>>>> label_1
>>>>>>> c
>>>>>>> d
>>>>>>> x
>>>>>>> array([[1, 2]])
>>>>>>>
>>>>>>>>> y.lix[['a'], ['c']:]
>>>>>>> label_0
>>>>>>> c
>>>>>>> d
>>>>>>> x
>>>>>>> array([1, 2])
>>>>>>>
>>>>>>>>> y.lix[['a'], ['d', 'c']]
>>>>>>> label_0
>>>>>>> d
>>>>>>> c
>>>>>>> x
>>>>>>> array([2, 1])
>>>>>>>
>>>>>>>>> y.lix['a':] # <--- expected to crash
>>>>>>> ValueError: The start element of a slice must be a list.
>>>>>>
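The slice examples above suggest that a label slice takes one-element lists as endpoints, with an inclusive start and an exclusive stop. A minimal sketch of how such a slice might be mapped to an integer slice follows. The helper name `label_slice_to_int_slice` and the error text for the stop endpoint are hypothetical; only the message for the start endpoint appears in the thread, and this is not the code in the branch.

```python
def label_slice_to_int_slice(labels, s):
    """Map a slice with one-element label lists as endpoints to an integer slice.

    Hypothetical helper for illustration; `labels` is the label list of one axis.
    """
    def endpoint(value, name):
        if value is None:
            return None
        if not isinstance(value, list):
            raise ValueError("The %s element of a slice must be a list." % name)
        if len(value) != 1:
            # Assumed restriction: one label per endpoint.
            raise ValueError("The %s element of a slice must hold one label." % name)
        return labels.index(value[0])

    return slice(endpoint(s.start, "start"), endpoint(s.stop, "stop"), s.step)


row_labels = ['a', 'b']
print(label_slice_to_int_slice(row_labels, slice(['a'], None)))  # slice(0, None, None)
print(label_slice_to_int_slice(row_labels, slice(None, ['b'])))  # slice(None, 1, None)
```

Passing a bare string as the start, as in `slice('a', None)`, raises the `ValueError` shown in the last test above.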
>>>>>> I think the only thing left to add is the conversion of fancy indexing
>>>>>> to rectangular:
>>>>>>
>>>>>>>> y.lix[['a', 'b'], ['d', 'c']]
>>>>>> IndexError: tuple index out of range
>>>>>>
>>>>>> How do I identify rectangular indexing? If more than one list with
>>>>>> more than one element are present anywhere in the index, then it is
>>>>>> rectangular and I need to do your trick np.array(first list)[:,None]?
>>>>>
>>>>> that's the idea, with matching axes added
>>>>>
>>>>> can you use np.ix_ directly, once you have the numeric index list?
>>>>>
>>>>> y.x[np.ix_(lists of indices)]
>>>>>
>>>>> might not work with slices; to avoid problems with mixing slices and
>>>>> array indexing, I converted slices in the past to an array of indices
>>>>>
>>>>> slice(None) -> np.arange(x.shape[ax])
>>>>
>>>> actually, I looked this up in the numpy thread with David Huard on
>>>> indexing/slicing
>>>>
>>>>>>> range(*slice(None).indices(5))
>>>> [0, 1, 2, 3, 4]
>>>>>>> range(*slice(2,None,None).indices(5))
>>>> [2, 3, 4]
>>>>>>> range(*slice(None,2,None).indices(5))
>>>> [0, 1]
>>>
>>> Very nice!
>>
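The two ideas above, expanding slices with `slice.indices` and then indexing through `np.ix_`, can be combined roughly as follows. This is a sketch under the assumption that every axis gets either a slice or a list of integers; the helper name `to_indices` is made up for the example.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])


def to_indices(idx, n):
    """Expand a slice into explicit integers for an axis of length n."""
    if isinstance(idx, slice):
        return list(range(*idx.indices(n)))
    return list(idx)


# A full slice on axis 0 mixed with an integer list on axis 1.
index = (slice(None), [1, 0])
lists = [to_indices(idx, n) for idx, n in zip(index, x.shape)]

# np.ix_ builds an open mesh, so the result is the rectangular selection.
result = x[np.ix_(*lists)]
print(result)
# [[2 1]
#  [4 3]]
```

Converting every slice to a list first sidesteps the rules NumPy applies when slices and index arrays are mixed in one index.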
>> The np.ix_ conversion is working:
>>
>>>> y = la.larry([[1,2],[3,4]], [['a', 'b'], ['c', 'd']])
>>>> y.lix[['a', 'b'], ['d', 'c']]
>>
>> The index (['a', 'b'], ['d', 'c']) is converted to
>>
>> (array([[0],
>>        [1]]), array([[1, 0]]))
>>
>> So the lix method is done. The only problem left is that
>> larry.__getitem__ does not support that type of indexing.
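The conversion described above can be reproduced with plain lists and NumPy. The sketch below assumes the labels are stored as one list per axis and looks each label up with `list.index`; the variable names are illustrative and not taken from the branch.

```python
import numpy as np

labels = [['a', 'b'], ['c', 'd']]        # one label list per axis
label_index = (['a', 'b'], ['d', 'c'])   # the index passed to lix

# Translate each requested label to its integer position on that axis.
int_lists = [[axis_labels.index(lab) for lab in wanted]
             for axis_labels, wanted in zip(labels, label_index)]

# np.ix_ reshapes the lists so that they broadcast against each other.
open_mesh = np.ix_(*int_lists)
print(int_lists)           # [[0, 1], [1, 0]]
print(open_mesh[0].shape)  # (2, 1)
print(open_mesh[1].shape)  # (1, 2)
```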
>
> I thought my earlier example with broadcasting worked.
>
> you could also return a new sliced larry directly in lix instead of
> going through getitem.
> you already have the index array for the array lar1.x[...]
> all you need to add, is the loop on selecting labels, for which you
> also have already the list of lists of indices, so it should be a
> short loop.
Yep, that's the way I ended up going. I do all the indexing inside
lix whenever the input is a tuple.
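A rough sketch of that approach, working on a bare array and label lists rather than on a larry, might look like the following. The function `lix_tuple_sketch` is hypothetical and simplified: it assumes the tuple holds one list of labels per axis, and it keeps every axis, whereas the one-element lists in the earlier examples reduce the dimension.

```python
import numpy as np


def lix_tuple_sketch(x, labels, index):
    """Index data `x` and its per-axis `labels` by a tuple of label lists.

    Hypothetical illustration only; not the code in the index-by-label branch.
    """
    # Label lists -> integer lists, one per axis.
    int_lists = [[axis_labels.index(lab) for lab in wanted]
                 for axis_labels, wanted in zip(labels, index)]
    # Rectangular selection of the data.
    new_x = x[np.ix_(*int_lists)]
    # The short loop that selects the matching labels.
    new_labels = [[axis_labels[i] for i in idxs]
                  for axis_labels, idxs in zip(labels, int_lists)]
    return new_x, new_labels


x = np.array([[1, 2], [3, 4]])
labels = [['a', 'b'], ['c', 'd']]
new_x, new_labels = lix_tuple_sketch(x, labels, (['a', 'b'], ['d', 'c']))
print(new_labels)  # [['a', 'b'], ['d', 'c']]
print(new_x)
# [[2 1]
#  [4 3]]
```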