larry-discuss team mailing list archive
Message #00137
Re: Index by label support added to experimental branch
On Tue, Feb 9, 2010 at 9:58 AM, <josef.pktd@xxxxxxxxx> wrote:
> On Tue, Feb 9, 2010 at 12:18 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Tue, Feb 9, 2010 at 8:39 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Tue, Feb 9, 2010 at 8:26 AM, <josef.pktd@xxxxxxxxx> wrote:
>>>> On Tue, Feb 9, 2010 at 10:23 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> On Mon, Feb 8, 2010 at 8:42 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>>>> On Mon, Feb 8, 2010 at 11:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>> On Mon, Feb 8, 2010 at 7:54 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>> (back on mailinglist)
>>>>>>>>
>>>>>>>> On Mon, Feb 8, 2010 at 10:26 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>> On Mon, Feb 8, 2010 at 4:02 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>> I don't have a strong opinion either about the dimension reduction,
>>>>>>>>>> for consistency with the numpy philosophy the dimension should be
>>>>>>>>>> reduced, for working with larry it is easier for a user to remove than
>>>>>>>>>> to add an axis.
>>>>>>>>>
>>>>>>>>> How about insertaxis for the name? Or do you like addaxis better?
>>>>>>>>>
>>>>>>>>> insertaxis(axis=0, label=None)
>>>>>>>>>
>>>>>>>>
>>>>>>>> definitely not expand_dims, I had to look at the changes in scipy svn to find it
>>>>>>>>
>>>>>>>> I usually think of it as addaxis, but I can get used to insertaxis
>>>>>>>> (from list analogy)
>>>>>>>>
>>>>>>>> you can use np.expand_dims to insert axis into x
>>>>>>>
>>>>>>> Very nice. I didn't know that function existed. (It even handles
>>>>>>> negative axes. I should make a unit test to make sure larry methods
>>>>>>> can handle negative axes.)
>>>>>>>
>>>>>>
>>>>>> I'm just reading the new lix
>>>>>>
>>>>>> For this case lar.lix[['a']:['b']:2]
>>>>>>
>>>>>> why do you need the list check? you raise a ValueError if it's not a
>>>>>> list. this and the fact that slicing start and stop need to be valid
>>>>>> labels, seems to indicate that the list is not necessary, i.e.
>>>>>> lar.lix['a':'b':2] , and lar.lix['a':'b':2,...] should be unambiguous
>>>>>> for any label type.
>>>>>> Do you have an example where this would break? I don't see one right now.
>>>>>
>>>>> Yes, good point, removing the list requirement for slices should work.
>>>>> If, however, we keep the list requirement, then by adding one if
>>>>> statement to lix we can also index with integers:
>>>>>
>>>>> lar.lix[0, ['a', 'b']]
>>>>
>>>> but this lar.lix[[0,3], ['a', 'b']] won't work and we get an
>>>> inconsistency in what's allowed
>>>
>>> Yes, anything wrapped in a list would be interpreted as a label.
>>> Anything not wrapped in a list is interpreted as an index value. So
>>> the above would work if the labels 0 and 3 are along axis 0.
>>>
>>>>> lar.lix[1:[date1]]
>>>>
>>>> this would be useful but not really pretty
>>>>
>>>>> lar.lix[5, [date1]:-1]
>>>>>
>>>>> That might make it easier to loop:
>>>>>
>>>>> for i in range(lar.shape[0]):
>>>>>     date = datetime.date(2010,1,1) + datetime.timedelta(i)
>>>>>     y = lar.lix[i, [date]]
>>>>
>>>> I would prefer this
>>>>
>>>> for i, lab in enumerate(lar.label[0]):
>>>>     date = datetime.date(2010,1,1) + datetime.timedelta(i)
>>>>     y = lar.lix[lab, date]
>>>>
>>>>
>>>>>
>>>>> And I could even eventually add some tuple support in the form:
>>>>>
>>>>> lar.lix[([date] - 10):[date]]
>>>>
>>>> I don't think this will work, because python first tries to do [date]
>>>> - 10 and raises an exception
>>>> date - 10 or [date-10] would work if date has add and subtract
>>>> defined as methods
>>>
>>> Yep, you're right.
>>>
>>>>>
>>>>> Yes, I think that would be powerful. And the integer support comes for
>>>>> free if we keep the list requirement.
>>>>>
>>>>> What do you think?
>>>>
>>>> I prefer the cleaner/prettier version, although allowing slices that
>>>> are integers would be useful
>>>
>>> So, a few ways to go:
>>>
>>> 1. All labels must be wrapped in [], this is the current version committed
>>> 2. All labels must be wrapped in [] and integer support
>>> 3. All labels must be wrapped in [] and integer support in slicing only
>>> 4. When slicing, labels should not be wrapped in [] and no integer
>>> support (and no special type support will be allowed in slices in the
>>> future since the label, which can be any type, is not wrapped.)
>>>
>>> In regards to #1 and #4, it seems easier to explain #1 since you
>>> always wrap labels with a list. But #4 is easier to type.
>>>
>>> Which of the 4 do you prefer?
>>
>> I programmed up #4. It does simplify the code. And when pulling out
>> one label element from a larry it will not be in a list to begin with
>> (unless you slice). So #4 is handy.
>>
>> One thing I worry about is this
>>
>> y.lix[4:5]
>>
>> Reading code like that (or writing it) it might be easy to think that
>> 4 and 5 are indices, not labels. Wrapping them in a list would remind
>> us that they are labels.
>
> After thinking about it for a while, I also start to find your version
> 2 or 3 more attractive. Keeping labels always in list brackets is only
> redundant and looks a bit ugly in the slicing case
>
> lar.lix[['a']:['b']:2]
>
> in all other cases list brackets are required anyway. I think now that
> this is a small enough price to pay and it has the clear message that
> it is different.
>
> between 2 and 3 I'm not sure, 2 has the danger of leading users to use
> a list of indices by accident.
> What would happen if users use an array?
>
> lar.lix[np.array([2,3]), ['msft']] since you are doing a list type
> check, this would not trigger the label interpretation
>
> Josef
I committed method #2. Yes, arrays could be used to index with multiple
integer indices. I'll save that for later.
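
For readers following the thread, here is a rough sketch of the rule behind option #2: anything wrapped in a list is a label, anything else is an integer index. It is not larry's actual code. The helper name resolve_key, the single-axis restriction, and the use of list.index to look up labels are all assumptions made for illustration. Whether a label used as a slice stop should be inclusive is not settled in this thread; the sketch just maps it to its position.

```python
import numpy as np


def resolve_key(key, labels):
    """Translate one lix-style key into a plain index, list of indices, or slice.

    Hypothetical helper, not part of larry. `labels` is the label list of a
    single axis. List-wrapped values are labels; bare values are indices.
    """
    def endpoint(value):
        # Slice endpoint: [label] -> position of that label,
        # bare int or None -> passed through unchanged.
        if isinstance(value, list):
            if len(value) != 1:
                raise ValueError("a slice endpoint takes exactly one label")
            return labels.index(value[0])
        return value

    if isinstance(key, list):
        # A list of labels becomes a list of integer positions.
        return [labels.index(lab) for lab in key]
    if isinstance(key, slice):
        return slice(endpoint(key.start), endpoint(key.stop), key.step)
    # Any bare value, including a numpy array, fails the list check
    # and is therefore treated as an index.
    return key


labels = ['a', 'b', 'c', 'd']
print(resolve_key(['a', 'c'], labels))            # labels
print(resolve_key(slice(['a'], ['c'], 2), labels))  # lix[['a']:['c']:2]
print(resolve_key(slice(1, ['d']), labels))       # mixed: lix[1:['d']]
print(resolve_key(np.array([2, 3]), labels))      # array stays an index
```

With integer labels the distinction matters: for labels [3, 0, 7], the key [0, 3] resolves to positions [1, 0], while the bare key 0 stays index 0.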
>
>>
>>>> y = lar.lix[5:-2, [date]]
>>>> instead of
>>>>
>>>> y = lar.lix[:, [date]][5:-2]
>>>> y = lar.lix[:, date1:date2][5:-2] # does this return view or copy?
>>>> if view (which I don't think it does) it would be clean also
>>>>
>>>> or instead of
>>>> y = lar.lix[labelatindex(5):labelatindex(-2), date]
>>>> where labelatindex(5) is e.g lar.label[0][5] or whatever is the best
>>>> way to get the label from an index
>>>>
>>>> I don't remember
>>>> y = lar.lix[:, [date]]
>>>>
>>>> Josef
>>>>
>>>>>
>>>>>> I think you can merge if your unit tests pass. Some cosmetic cleaning
>>>>>> we could also do in the main branch, e.g. maybe the code duplication
>>>>>> is not necessary (unless it's not really duplicate, I'm only reading)
>>>>>>
>>>>>> Are you ok with merging? There might be a merge conflict with the
>>>>>> changelog. I got one each time I merged your branch.
>>>>>>
>>>>>> Josef
>>>>>>
>>>>>
>>>>
>>>
>>
>
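
Two plain Python/NumPy facts raised in the quoted discussion can be checked directly. This is a small sketch, not larry code: np.expand_dims accepts negative axes, and [date] - 10 raises a TypeError because a list does not support subtraction. The example date and the ten-day offset are arbitrary choices for illustration. With datetime.date, the offset has to be a timedelta rather than a bare integer.

```python
import datetime

import numpy as np

x = np.zeros((3, 4))

# np.expand_dims handles negative axes: -1 appends a trailing axis.
front = np.expand_dims(x, 0)   # shape (1, 3, 4)
back = np.expand_dims(x, -1)   # shape (3, 4, 1)

date = datetime.date(2010, 1, 11)

# Python evaluates "[date] - 10" before any indexing code sees it,
# and list minus int is not defined.
try:
    [date] - 10
    list_minus_int_failed = False
except TypeError:
    list_minus_int_failed = True

# Do the arithmetic on the date first, then wrap the result in a list.
start = [date - datetime.timedelta(days=10)]
```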