larry-discuss team mailing list archive

Re: Index by label support added to experimental branch

 

On Tue, Feb 9, 2010 at 12:18 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Tue, Feb 9, 2010 at 8:39 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Tue, Feb 9, 2010 at 8:26 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>> On Tue, Feb 9, 2010 at 10:23 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Mon, Feb 8, 2010 at 8:42 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>> On Mon, Feb 8, 2010 at 11:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>> On Mon, Feb 8, 2010 at 7:54 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>> (back on mailinglist)
>>>>>>>
>>>>>>> On Mon, Feb 8, 2010 at 10:26 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>> On Mon, Feb 8, 2010 at 4:02 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>> I don't have a strong opinion either about the dimension reduction,
>>>>>>>>> for consistency with the numpy philosophy the dimension should be
>>>>>>>>> reduced, for working with larry it is easier for a user to remove than
>>>>>>>>> to add an axis.
>>>>>>>>
>>>>>>>> How about insertaxis for the name? Or do you like addaxis better?
>>>>>>>>
>>>>>>>> insertaxis(axis=0, label=None)
>>>>>>>>
>>>>>>>
>>>>>>> definitely not expand_dims, I had to look at the changes in scipy svn to find it
>>>>>>>
>>>>>>> I usually think of it as addaxis, but I can get used to insertaxis
>>>>>>> (from list analogy)
>>>>>>>
>>>>>>> you can use np.expand_dims to insert axis into x
>>>>>>
>>>>>> Very nice. I didn't know that function existed. (It even handles
>>>>>> negative axes. I should make a unit test to make sure larry methods
>>>>>> can handle negative axes.)
>>>>>>
>>>>>
>>>>> I'm just reading the new lix
>>>>>
>>>>> For this case  lar.lix[['a']:['b']:2]
>>>>>
>>>>> why do you need the list check? you raise a ValueError if it's not a
>>>>> list. this and the fact that slicing start and stop need to be valid
>>>>> labels, seems to indicate that the list is not necessary, i.e.
>>>>> lar.lix['a':'b':2] , and lar.lix['a':'b':2,...]  should be unambiguous
>>>>> for any label type.
>>>>> Do you have an example where this would break? I don't see one right now.
>>>>
>>>> Yes, good point, removing the list requirement for slices should work.
>>>> If, however, we keep the list requirement, then by adding one if
>>>> statement to lix we can also index with integers:
>>>>
>>>> lar.lix[0, ['a', 'b']]
>>>
>>> but this  lar.lix[[0,3], ['a', 'b']]  won't work and we get an
>>> inconsistency in what's allowed
>>
>> Yes, anything wrapped in a list would be interpreted as a label.
>> Anything not wrapped in a list is interpreted as an index value. So
>> the above would work if the labels 0 and 3 are along axis 0.
>>
>>>> lar.lix[1:[date1]]
>>>
>>> this would be useful but not really pretty
>>>
>>>> lar.lix[5, [date1]:-1]
>>>>
>>>> That might make it easier to loop:
>>>>
>>>> for i in range(lar.shape[0]):
>>>>    date = datetime.date(2010,1,1) + datetime.timedelta(i)
>>>>    y = lar.lix[i, [date]]
>>>
>>> I would prefer this
>>>
>>>  for i, lab in enumerate(lar.label[0]):
>>>    date = datetime.date(2010,1,1) + datetime.timedelta(i)
>>>    y = lar.lix[lab, date]
>>>
>>>
>>>>
>>>> And I could even eventually add some tuple support in the form:
>>>>
>>>> lar.lix[([date] - 10):[date]]
>>>
>>> I don't think this will work, because python first tries to do  [date]
>>> - 10 and raises an exception
>>> date - 10  or [date-10] would work if date has add and subtract
>>> defined as methods
>>
>> Yep, you're right.
>>
>>>>
>>>> Yes, I think that would be powerful. And the integer support comes for
>>>> free if we keep the list requirement.
>>>>
>>>> What do you think?
>>>
>>> I prefer the cleaner/prettier version, although allowing slices that
>>> are integers would be useful
>>
>> So, a few ways to go:
>>
>> 1. All labels must be wrapped in [], this is the current version committed
>> 2. All labels must be wrapped in [] and integer support
>> 3. All labels must be wrapped in [] and integer support in slicing only
>> 4. When slicing, labels should not be wrapped in [] and no integer
>> support (and no special type support will be allowed in slices in the
>> future since the label, which can be any type, is not wrapped.)
>>
>> In regards to #1 and #4, it seems easier to explain #1 since you
>> always wrap labels with a list. But #4 is easier to type.
>>
>> Which of the 4 do you prefer?
>
> I programmed up #4. It does simplify the code. And when pulling out
> one label element from a larry it will not be in a list to begin with
> (unless you slice). So #4 is handy.
>
> One thing I worry about is this
>
> y.lix[4:5]
>
> Reading code like that (or writing it) it might be easy to think that
> 4 and 5 are indices, not labels. Wrapping them in a list would remind
> us that they are labels.

After thinking about it for a while, I am also starting to find your
options 2 or 3 more attractive. Keeping labels always in list brackets is
redundant, and looks a bit ugly, only in the slicing case

lar.lix[['a']:['b']:2]

in all other cases list brackets are required anyway. I think now that
this is a small enough price to pay and it has the clear message that
it is different.
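
A minimal sketch of how option 2/3 slicing could behave, assuming a toy 1-d container: a bound wrapped in a list is looked up as a label, anything else is passed through as an ordinary integer position. The names LabelIndexer and _resolve are hypothetical, and the exclusive stop is an assumption; this is not larry's actual lix code.

```python
class LabelIndexer:
    """Toy 1-d stand-in for lix; hypothetical, not larry's implementation."""

    def __init__(self, labels, values):
        self.labels = list(labels)
        self.values = list(values)

    def _resolve(self, bound):
        # A one-element list is treated as a label and mapped to its position.
        # Anything else (an int or None) is used as a plain positional bound.
        if isinstance(bound, list):
            if len(bound) != 1:
                raise ValueError("a slice bound must wrap exactly one label")
            return self.labels.index(bound[0])
        return bound

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("this sketch only handles slices")
        start = self._resolve(key.start)
        stop = self._resolve(key.stop)
        # Assumption: the stop bound is exclusive, as in ordinary slicing.
        return self.values[start:stop:key.step]


lix = LabelIndexer(['a', 'b', 'c', 'd'], [10, 20, 30, 40])
label_slice = lix[['a']:['d']:2]   # both bounds are labels
mixed_slice = lix[1:['d']]         # integer start, label stop
```

Python accepts arbitrary objects as slice bounds, so the list wrapper reaches __getitem__ unchanged and the type of each bound is enough to tell a label from an index.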

between 2 and 3 I'm not sure, 2 has the danger of leading users to use
a list of indices by accident.
What would happen if users use an array?

lar.lix[np.array([2,3]), ['msft']]

Since you are doing a list type check, the array would not trigger the
label interpretation.
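
To illustrate the concern, here is a small sketch assuming the check is a plain isinstance test against list. The helper is_label_key is hypothetical and only stands in for whatever check lix performs.

```python
import numpy as np


def is_label_key(key):
    # Hypothetical stand-in for the list type check; not larry's actual code.
    return isinstance(key, list)


from_list = is_label_key([2, 3])              # a list: read as labels
from_array = is_label_key(np.array([2, 3]))   # an ndarray is not a list
```

Under that assumption the same two numbers would be read as labels when passed as a list and as something else when passed as an array, which is the kind of accident option 2 invites.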

Josef

>
>>> y = lar.lix[5:-2, [date]]
>>> instead of
>>>
>>> y = lar.lix[:, [date]][5:-2]
>>> y = lar.lix[:, date1:date2][5:-2]     # does this return view or copy?
>>> if view (which I don't think it does) it would be clean also
>>>
>>> or instead of
>>> y = lar.lix[labelatindex(5):labelatindex(-2), date]
>>> where labelatindex(5) is e.g lar.label[0][5]  or whatever is the best
>>> way to get the label from an index
>>>
>>> I don't remember
>>> y = lar.lix[:, [date]]
>>>
>>> Josef
>>>
>>>>
>>>>> I think you can merge if your unit tests pass. Some cosmetic cleaning
>>>>> we could also do in the main branch, e.g. maybe the code duplication
>>>>> is not necessary (unless it's not really duplicate, I'm only reading)
>>>>>
>>>>> Are you ok with merging? There might be a merge conflict with the
>>>>> changelog. I got one each time I merged your branch.
>>>>>
>>>>> Josef
>>>>>
>>>>
>>>
>>
>


