larry-discuss team mailing list archive

Re: Index by label support added to experimental branch

To: josef.pktd@xxxxxxxxx
From: Keith Goodman <kwgoodman@xxxxxxxxx>
Date: Tue, 9 Feb 2010 08:39:28 -0800
Cc: larry-discuss@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1cd32cbb1002090826q3eb637d9tbc365e391cf28a8e@mail.gmail.com>

On Tue, Feb 9, 2010 at 8:26 AM,  <josef.pktd@xxxxxxxxx> wrote:
> On Tue, Feb 9, 2010 at 10:23 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Mon, Feb 8, 2010 at 8:42 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>> On Mon, Feb 8, 2010 at 11:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Mon, Feb 8, 2010 at 7:54 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>> (back on mailinglist)
>>>>>
>>>>> On Mon, Feb 8, 2010 at 10:26 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>> On Mon, Feb 8, 2010 at 4:02 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>> I don't have a strong opinion either about the dimension reduction,
>>>>>>> for consistency with the numpy philosophy the dimension should be
>>>>>>> reduced, for working with larry it is easier for a user to remove than
>>>>>>> to add an axis.
>>>>>>
>>>>>> How about insertaxis for the name? Or do you like addaxis better?
>>>>>>
>>>>>> insertaxis(axis=0, label=None)
>>>>>>
>>>>>
>>>>> definitely not expand_dims, I had to look at the changes in scipy svn to find it
>>>>>
>>>>> I usually think of it as addaxis, but I can get used to insertaxis
>>>>> (from list analogy)
>>>>>
>>>>> you can use np.expand_dims to insert axis into x
>>>>
>>>> Very nice. I didn't know that function existed. (It even handles
>>>> negative axes. I should make a unit test to make sure larry methods
>>>> can handle negative axes.)
>>>>
>>>
>>> I'm just reading the new lix
>>>
>>> For this case  lar.lix[['a']:['b']:2]
>>>
>>> why do you need the list check? you raise a ValueError if it's not a
>>> list. this and the fact that slicing start and stop need to be valid
>>> labels, seems to indicate that the list is not necessary, i.e.
>>> lar.lix['a':'b':2] , and lar.lix['a':'b':2,...]  should be unambiguous
>>> for any label type.
>>> Do you have an example where this would break? I don't see one right now.
>>
>> Yes, good point, removing the list requirement for slices should work.
>> If, however, we keep the list requirement, then by adding one if
>> statement to lix we can also index with integers:
>>
>> lar.lix[0, ['a', 'b']]
>
> but this  lar.lix[[0,3], ['a', 'b']]  won't work and we get an
> inconsistency in what's allowed

Yes, anything wrapped in a list would be interpreted as a label.
Anything not wrapped in a list is interpreted as an index value. So
the above would work if the labels 0 and 3 are along axis 0.
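
To make the rule concrete, here is a minimal sketch for a single axis. It is not the lix code in the branch; resolve_key and the plain list of labels are hypothetical names used only for illustration. List-wrapped values are looked up as labels, and anything bare is passed through as an integer index.

```python
def resolve_key(key, labels):
    """Translate one lix-style key into positional form for a single axis.

    Hypothetical helper: `labels` is a plain list of the axis labels.
    """
    def to_index(label):
        # list.index raises ValueError if the label is not on this axis
        return labels.index(label)

    def bound(b):
        # A slice bound is either a one-element list (a label) or a bare
        # value (None or an integer position).
        if isinstance(b, list):
            if len(b) != 1:
                raise ValueError("a slice bound must wrap exactly one label")
            return to_index(b[0])
        return b

    if isinstance(key, list):
        # Wrapped in a list: every element is a label.
        return [to_index(lab) for lab in key]
    if isinstance(key, slice):
        return slice(bound(key.start), bound(key.stop), key.step)
    # Not wrapped: treated as an integer index.
    return key


labels = [3, 0, 5]
by_label = resolve_key([0, 3], labels)       # labels 0 and 3
by_index = resolve_key(0, labels)            # bare 0 is a position
mixed = resolve_key(slice(1, [5]), labels)   # position 1 up to label 5
```

With labels [3, 0, 5], the list [0, 3] resolves to positions [1, 0], while a bare 0 stays position 0.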

>> lar.lix[1:[date1]]
>
> this would be useful but not really pretty
>
>> lar.lix[5, [date1]:-1]
>>
>> That might make it easier to loop:
>>
>> for i in range(lar.shape[0]):
>>    date = datetime.date(2010,1,1) + datetime.timedelta(i)
>>    y = lar.lix[i, [date]]
>
> I would prefer this
>
>  for i, lab in enumerate(lar.label[0]):
>    date = datetime.date(2010,1,1) + datetime.timedelta(i)
>    y = lar.lix[lab, date]
>
>
>>
>> And I could even eventually add some tuple support in the form:
>>
>> lar.lix[([date] - 10):[date]]
>
> I don't think this will work, because python first tries to do  [date]
> - 10 and raises an exception
> date - 10  or [date-10] would work if date has add and subtract
> defined as methods

Yep, you're right.
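
A quick check of the point above, assuming the labels are datetime.date objects as in the loop example: the subtraction is evaluated on the list before lix ever sees it, so the offset has to be applied to the date first and the result wrapped afterwards.

```python
import datetime

date = datetime.date(2010, 1, 11)

list_error = None
try:
    [date] - 10          # Python evaluates this before any indexing happens
except TypeError as err:
    list_error = err     # lists do not support subtraction

# Shift the date itself, then wrap the result in a list to mark it as a label.
shifted = date - datetime.timedelta(days=10)
start = [shifted]
```

Note that a plain date - 10 also fails for datetime.date, since the type only defines subtraction with timedelta (or another date).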

>>
>> Yes, I think that would be powerful. And the integer support comes for
>> free if we keep the list requirement.
>>
>> What do you think?
>
> I prefer the cleaner/prettier version, although allowing slices that
> are integers would be useful

So, a few ways to go:

1. All labels must be wrapped in []; this is the currently committed version
2. All labels must be wrapped in [], plus integer support
3. All labels must be wrapped in [], plus integer support in slicing only
4. When slicing, labels should not be wrapped in [] and no integer
support (and no special type support will be allowed in slices in the
future since the label, which can be any type, is not wrapped.)

In regards to #1 and #4, it seems easier to explain #1 since you
always wrap labels with a list. But #4 is easier to type.

Which of the 4 do you prefer?

> y = lar.lix[5:-2, [date]]
> instead of
>
> y = lar.lix[:, [date]][5:-2]
> y = lar.lix[:, date1:date2][5:-2]     # does this return view or copy?
> if view (which I don't think it does) it would be clean also
>
> or instead of
> y = lar.lix[labelatindex(5):labelatindex(-2), date]
> where labelatindex(5) is e.g lar.label[0][5]  or whatever is the best
> way to get the label from an index
>
> I don't remember
> y = lar.lix[:, [date]]
>
> Josef
>
>>
>>> I think you can merge if your unit tests pass. Some cosmetic cleaning
>>> we could also do in the main branch, e.g. maybe the code duplication
>>> is not necessary (unless it's not really duplicate, I'm only reading)
>>>
>>> Are you ok with merging? There might be a merge conflict with the
>>> changelog. I get one each time I merge your branch.
>>>
>>> Josef
>>>
>>
>
