larry-discuss team mailing list archive
Message #00135
Re: Index by label support added to experimental branch
On Tue, Feb 9, 2010 at 8:39 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Tue, Feb 9, 2010 at 8:26 AM, <josef.pktd@xxxxxxxxx> wrote:
>> On Tue, Feb 9, 2010 at 10:23 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Mon, Feb 8, 2010 at 8:42 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>> On Mon, Feb 8, 2010 at 11:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> On Mon, Feb 8, 2010 at 7:54 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>>>> (back on mailinglist)
>>>>>>
>>>>>> On Mon, Feb 8, 2010 at 10:26 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>> On Mon, Feb 8, 2010 at 4:02 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>> I don't have a strong opinion either about the dimension reduction,
>>>>>>>> for consistency with the numpy philosophy the dimension should be
>>>>>>>> reduced, for working with larry it is easier for a user to remove than
>>>>>>>> to add an axis.
>>>>>>>
>>>>>>> How about insertaxis for the name? Or do you like addaxis better?
>>>>>>>
>>>>>>> insertaxis(axis=0, label=None)
>>>>>>>
>>>>>>
>>>>>> definitely not expand_dims, I had to look at the changes in scipy svn to find it
>>>>>>
>>>>>> I usually think of it as addaxis, but I can get used to insertaxis
>>>>>> (from list analogy)
>>>>>>
>>>>>> you can use np.expand_dims to insert axis into x
>>>>>
>>>>> Very nice. I didn't know that function existed. (It even handles
>>>>> negative axes. I should make a unit test to make sure larry methods
>>>>> can handle negative axes.)
>>>>>
>>>>
>>>> I'm just reading the new lix
>>>>
>>>> For this case lar.lix[['a']:['b']:2]
>>>>
>>>> why do you need the list check? you raise a ValueError if it's not a
>>>> list. this and the fact that slicing start and stop need to be valid
>>>> labels, seems to indicate that the list is not necessary, i.e.
>>>> lar.lix['a':'b':2] , and lar.lix['a':'b':2,...] should be unambiguous
>>>> for any label type.
>>>> Do you have an example where this would break? I don't see one right now.
>>>
>>> Yes, good point, removing the list requirement for slices should work.
>>> If, however, we keep the list requirement, then by adding one if
>>> statement to lix we can also index with integers:
>>>
>>> lar.lix[0, ['a', 'b']]
>>
>> but this lar.lix[[0,3], ['a', 'b']] won't work and we get an
>> inconsistency in what's allowed
>
> Yes, anything wrapped in a list would be interpreted as a label.
> Anything not wrapped in a list is interpreted as an index value. So
> the above would work if the labels 0 and 3 are along axis 0.
>
>>> lar.lix[1:[date1]]
>>
>> this would be useful but not really pretty
>>
>>> lar.lix[5, [date1]:-1]
>>>
>>> That might make it easier to loop:
>>>
>>> for i in range(lar.shape[0]):
>>>     date = datetime.date(2010,1,1) + datetime.timedelta(i)
>>>     y = lar.lix[i, [date]]
>>
>> I would prefer this
>>
>> for i, lab in enumerate(lar.label[0]):
>>     date = datetime.date(2010,1,1) + datetime.timedelta(i)
>>     y = lar.lix[lab, date]
>>
>>
>>>
>>> And I could even eventually add some tuple support in the form:
>>>
>>> lar.lix[([date] - 10):[date]]
>>
>> I don't think this will work, because python first tries to do [date]
>> - 10 and raises an exception
>> date - 10 or [date-10] would work if date has add and subtract
>> defined as methods
>
> Yep, you're right.
>
>>>
>>> Yes, I think that would be powerful. And the integer support comes for
>>> free if we keep the list requirement.
>>>
>>> What do you think?
>>
>> I prefer the cleaner/prettier version, although allowing slices that
>> are integers would be useful
>
> So, a few ways to go:
>
> 1. All labels must be wrapped in [], this is the current version committed
> 2. All labels must be wrapped in [] and integer support
> 3. All labels must be wrapped in [] and integer support in slicing only
> 4. When slicing, labels should not be wrapped in [] and no integer
> support (and no special type support will be allowed in slices in the
> future since the label, which can be any type, is not wrapped.)
>
> In regards to #1 and #4, it seems easier to explain #1 since you
> always wrap labels with a list. But #4 is easier to type.
>
> Which of the 4 do you prefer?
I programmed up #4. It does simplify the code. And when pulling out
one label element from a larry it will not be in a list to begin with
(unless you slice). So #4 is handy.
One thing I worry about is this
y.lix[4:5]
Reading code like that (or writing it) it might be easy to think that
4 and 5 are indices, not labels. Wrapping them in a list would remind
us that they are labels.
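To make that worry concrete, here is a minimal sketch in plain Python. It
does not use larry. label_slice is a hypothetical helper, and treating the
stop label as exclusive, as in ordinary Python slices, is an assumption.

```python
def label_slice(labels, start=None, stop=None, step=None):
    """Translate a slice over labels into a slice over positions.

    Hypothetical helper for illustration only; start and stop must be
    existing labels, and stop is treated as exclusive (an assumption).
    """
    i = None if start is None else labels.index(start)
    j = None if stop is None else labels.index(stop)
    return slice(i, j, step)


# Integer labels that do not coincide with the positions 0..5
labels = [3, 4, 5, 6, 7, 8]
values = [30, 40, 50, 60, 70, 80]

# Reading 4:5 as positions selects the element at index 4
by_position = values[4:5]

# Reading 4:5 as labels selects the element whose label is 4
by_label = values[label_slice(labels, 4, 5)]

print(by_position, by_label)
```

With integer labels the two readings pick different elements, and nothing
in the expression 4:5 says which one was meant.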
>> y = lar.lix[5:-2, [date]]
>> instead of
>>
>> y = lar.lix[:, [date]][5:-2]
>> y = lar.lix[:, date1:date2][5:-2] # does this return view or copy?
>> if view (which I don't think it does) it would be clean also
>>
>> or instead of
>> y = lar.lix[labelatindex(5):labelatindex(-2), date]
>> where labelatindex(5) is e.g lar.label[0][5] or whatever is the best
>> way to get the label from an index
>>
>> I don't remember
>> y = lar.lix[:, [date]]
>>
>> Josef
>>
>>>
>>>> I think you can merge if your unit tests pass. Some cosmetic cleaning
>>>> we could also do in the main branch, e.g. maybe the code duplication
>>>> is not necessary (unless it's not really duplicate, I'm only reading)
>>>>
>>>> Are you ok with merging? There might be a merge conflict with the
>>>> changelog. I got one each time I merged your branch.
>>>>
>>>> Josef
>>>>
>>>
>>
>
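A quick check of the point quoted above about ([date] - 10), using only the
standard library. The variable names are illustrative, and this says
nothing about whether lix would accept the resulting slice. It only shows
what Python itself does before any indexing code gets to see the
expression.

```python
import datetime

date = datetime.date(2010, 2, 9)

# Python evaluates [date] - 10 before the indexing code ever runs,
# and a list does not support subtraction of an int.
try:
    [date] - 10
except TypeError as err:
    list_error = str(err)

# Doing the arithmetic on the label itself works, provided the label type
# defines subtraction. For datetime.date that means a timedelta.
earlier = date - datetime.timedelta(days=10)

# A slice object can carry arbitrary start and stop values, so a pair of
# unwrapped date labels is a valid slice as far as Python is concerned.
window = slice(earlier, date)

print(list_error)
print(window)
```

Note that date - 10 with a bare integer would also fail for datetime.date,
since only timedelta subtraction is defined for that type.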