
larry-discuss team mailing list archive

Re: Label indexing

 

On Mon, Feb 8, 2010 at 12:39 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Mon, Feb 8, 2010 at 9:09 AM,  <josef.pktd@xxxxxxxxx> wrote:
>> On Mon, Feb 8, 2010 at 11:54 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Mon, Feb 8, 2010 at 7:31 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>> On Mon, Feb 8, 2010 at 10:18 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> On Mon, Feb 8, 2010 at 7:15 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>> On Mon, Feb 8, 2010 at 10:05 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>> After a long thread on how to support label indexing in larry, here's
>>>>>>> where we ended up:
>>>>>>>
>>>>>>> 1. Remove the indexing of labels by strings from the trunk.
>>>>>>>
>>>>>>> 2. Create a method larry.lix that can be used for label indexing like this:
>>>>>>>
>>>>>>> lar.lix['a']
>>>>>>> lar.lix['a':]
>>>>>>> lar.lix[:, 'a']
>>>>>>> lar.lix['a', 'b', 'c':]
>>>>>>
>>>>>> I'm not sure what the last index means. If there are several labels
>>>>>> for the same axis, they would need to be in a list_like, as in
>>>>>> lar.lix[['a', 'b', 'c'], :], to distinguish different elements for
>>>>>> one axis from elements for several axes.
>>>>>
>>>>> Yep. Good catch. Typo.
>>>>
>>>> Just for clarification, do you also want to support label slices?
>>>>
>>>> lar.lix[['a', 'b', 'c'], '2009-01-01':]  or
>>>> lar.lix[['a', 'b', 'c'], date1 :date2]
>>>>
>>>> I don't think there is any ambiguity in the interpretation.
>>>
>>> Yes, slices will be supported. So both of your examples will work if
>>> the corresponding labels exist. Looking at your date slice made me
>>> realize that larry needs a sortlabel method
>>>
>>> def sortlabel(axis=None):
>>>    etc.
>>>
>>> where axis=None will sort all axes.
>>
>> When I initially started to work with some examples, I found the label
>> ordering a bit confusing. I think any call to _align returns a larry
>> with sorted labels. Initially I had a descending sorted larry and
>> after some binary operations, I had an ascending sorted larry.
>>
>> There is currently no guarantee of preserving label ordering unless
>> it's ascending sort order, is there?
>> I'm not sure what the policy is, and I don't think that there are any
>> unit tests that would check label ordering.
>
> That's a good point. Something to add to the Sphinx doc in the
> alignment section.
>
> There is no way to preserve label ordering in arbitrary binary
> operations since the labels of the two inputs could be different. If
> any alignment is needed, larry sorts the labels. But if no alignment is
> needed, larry doesn't sort. So if my labels are in reverse order, they
> will remain reversed if added to a larry with the same labels in the
> same order. But if you add a reversed-label larry to a non-reversed one,
> a non-reversed larry is returned.
>
> Label ordering should not be important for larrys since alignment is
> automatic. But that breaks when we do stuff like
>
> idx = lar.labelindex(date)
> lar[:,idx+1-100:idx+1]
>
> Is there a better design? Would it be better to always reorder on
> binary operations even if the labels are aligned?

I didn't have any good ideas when I briefly looked at the problem. I
think my example was when I tried to do a diff() (when I didn't know
much about larry yet), but I don't remember any details.

For the time axis, ordering is also important for the moving functions
and e.g. fill_forward.
For labels that are names, it would be nice to be able to work with
arbitrary ordering.
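
To make the ordering behaviour described above concrete, here is a minimal pure-Python sketch. It is not larry's _align; add_labeled is a hypothetical stand-in that only mimics the rule from the thread (no sort when the labels already match, sorted labels otherwise), and aligning on the intersection of the labels is an assumption made for the example.

```python
# Toy stand-in for a 1d labeled array: labels and values as parallel lists.
# NOT larry's _align; it only mimics the rule described in the thread.

def add_labeled(labels1, values1, labels2, values2):
    """Add two labeled 1d series, sorting labels only when alignment is needed."""
    if labels1 == labels2:
        # Same labels in the same order: no alignment, ordering is preserved.
        return list(labels1), [a + b for a, b in zip(values1, values2)]
    # Labels differ or are ordered differently: align on the sorted
    # intersection of the labels (the intersection is an assumption here).
    lookup1 = dict(zip(labels1, values1))
    lookup2 = dict(zip(labels2, values2))
    common = sorted(set(labels1) & set(labels2))
    return common, [lookup1[k] + lookup2[k] for k in common]


dates_desc = ["2010-02-03", "2010-02-02", "2010-02-01"]
dates_asc = sorted(dates_desc)

# Reversed + reversed (same order): the descending order survives.
labels_same, values_same = add_labeled(dates_desc, [3, 2, 1], dates_desc, [30, 20, 10])

# Reversed + non-reversed: alignment kicks in and the result is ascending.
labels_mixed, values_mixed = add_labeled(dates_desc, [3, 2, 1], dates_asc, [10, 20, 30])

# A positional window "up to and including date" now covers different dates.
date = "2010-02-02"
window_same = labels_same[: labels_same.index(date) + 1]
window_mixed = labels_mixed[: labels_mixed.index(date) + 1]

print(window_same)   # ['2010-02-03', '2010-02-02']
print(window_mixed)  # ['2010-02-01', '2010-02-02']
```

The same positional slice picks the later dates in one case and the earlier dates in the other, which is why the ordering matters for anything window-based on the time axis.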

Does _align sort every axis, even if only one axis has a disagreement?

Since I don't currently have any better ideas, I would just document
it as something to be aware of.
I think it would be good to check individual methods for how
consistent the policy is, but I haven't worked with enough examples to
see whether there might be any problems.
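
A check of that kind could look roughly like the sketch below. Everything in it is hypothetical: label_order and check_order_preserved are illustrative helpers, and keep_order and sort_labels are toy "methods" operating on (labels, values) lists rather than real larry methods.

```python
def label_order(labels):
    """Classify a list of labels as 'ascending', 'descending' or 'unsorted'."""
    if labels == sorted(labels):
        return "ascending"
    if labels == sorted(labels, reverse=True):
        return "descending"
    return "unsorted"


def check_order_preserved(method, labels, values):
    """Return True if method hands back the labels in the order it received them."""
    out_labels, _ = method(list(labels), list(values))
    return out_labels == list(labels)


# Two toy methods standing in for the operations one would want to audit.
def keep_order(labels, values):
    # Leaves the label order untouched.
    return labels, values


def sort_labels(labels, values):
    # Sorts labels ascending and carries the values along.
    pairs = sorted(zip(labels, values))
    return [k for k, _ in pairs], [v for _, v in pairs]


names = ["c", "a", "b"]
print(label_order(names))                                    # unsorted
print(check_order_preserved(keep_order, names, [1, 2, 3]))   # True
print(check_order_preserved(sort_labels, names, [1, 2, 3]))  # False
```

Running such a check over each method with ascending, descending and unsorted labels would show where the ordering policy is consistent and where it is not.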

I would sort by default only if it is really necessary to avoid
confusing users, or if not sorting would complicate the implementation.

Josef



>
>>
>> Josef
>>
>>>
>>>>
>>>>>>
>>>>>> Josef
>>>>>>
>>>>>>>
>>>>>>> Only labels and slices are allowed. Inside the function the labels
>>>>>>> will be converted to indices and then a call will be made to
>>>>>>> lar[converted_index].
>>>>>>>
>>>>>>> 3. Indexing with more than one list will do rectangular indexing, not
>>>>>>> fancy indexing.
>>>>>>>
>>>>>>> OK, that's it. I'll start. I'll need help with #3
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Mailing list: https://launchpad.net/~larry-discuss
>>>>>>> Post to     : larry-discuss@xxxxxxxxxxxxxxxxxxx
>>>>>>> Unsubscribe : https://launchpad.net/~larry-discuss
>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
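
Items 2 and 3 of the quoted proposal (convert labels to indices, then index rectangularly when more than one list is given) can be sketched in plain Python as follows. This is not larry's implementation: label_to_index and lix_2d are hypothetical names, the data is a nested list rather than a larry, the stop label of a slice is assumed to be excluded as in positional slicing, and the result is always kept 2d for simplicity.

```python
def label_to_index(labels, key):
    """Convert a label-based key for one axis into a positional key."""
    if isinstance(key, slice):
        # Slice endpoints are labels; None means an open end.
        # Assumption: the stop label is excluded, as in positional slices.
        start = None if key.start is None else labels.index(key.start)
        stop = None if key.stop is None else labels.index(key.stop)
        return slice(start, stop, key.step)
    if isinstance(key, list):
        return [labels.index(k) for k in key]
    return labels.index(key)


def positions(index, n):
    """Expand a positional key into an explicit list of positions."""
    if isinstance(index, slice):
        return list(range(*index.indices(n)))
    if isinstance(index, list):
        return index
    return [index]


def lix_2d(data, row_labels, col_labels, row_key, col_key):
    """Rectangular label indexing on a 2d nested list (always returns 2d)."""
    rows = positions(label_to_index(row_labels, row_key), len(row_labels))
    cols = positions(label_to_index(col_labels, col_key), len(col_labels))
    # Rectangular: every selected row is combined with every selected column.
    return [[data[i][j] for j in cols] for i in rows]


data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
row_labels = ["a", "b", "c"]
col_labels = ["2009-01-01", "2009-01-02", "2009-01-03"]

# A list on one axis and a label slice on the other.
print(lix_2d(data, row_labels, col_labels, ["a", "c"], slice("2009-01-02", None)))
# [[2, 3], [8, 9]]

# Two lists: rectangular indexing returns the full 2x2 block.
print(lix_2d(data, row_labels, col_labels, ["a", "c"], ["2009-01-01", "2009-01-03"]))
# [[1, 3], [7, 9]]
```

With fancy indexing the two lists in the last call would instead be paired element by element and select only the values 1 and 9.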


