larry-discuss team mailing list archive
Message #00070
Re: A new proposal for indexing with labels
On Sun, Feb 7, 2010 at 8:46 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Sun, Feb 7, 2010 at 5:26 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Sat, Feb 6, 2010 at 6:53 PM, <josef.pktd@xxxxxxxxx> wrote:
>>> On Sat, Feb 6, 2010 at 9:48 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Sat, Feb 6, 2010 at 5:38 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> In a blueprint titled "index-by-label" I proposed a way to index
>>>>> larrys by lists of label elements. Here's a simpler, but less
>>>>> versatile, proposal. On the whole, due to its simplicity, I think it
>>>>> is more powerful.
>>>>
>>>> I committed this proposal in r187. Please give it a try.
>>>
>>> I will try it tomorrow and look at the implementation.
>>> My first reaction: very convenient but potentially fragile for arbitrary labels.
>>
>> The rule is simple for indexing with a string S:
>>
>> 1. Look for string S in the label. If found you are done. If not found...
>> 2. Map the labels to strings and look again
>>
>> Although the rule is simple, the result can be unexpected in corner
>> cases. For example, you may try to index with str(1) to access the
>> label integer 1 but the label could also contain string '1'. So in
>> that case you'd get an unexpected result even though the rule is
>> simple.
>>
>> I could add a check: len(set(strlabel)) == len(set(label)). And raise
>> an IndexError (or is that ValueError?) if they are not equal. That
>> will slow things down but only for indexing by strings.
>>
>> Would that address your fragile comment? Or do you have something else in mind?
>
> Wait, that's being too restrictive. We don't care if there are
> duplicates in strlabel. We only care if S appears more than once in
> strlabel. For example, if we are indexing with str(1) and the label is
> [2, str(2), 1], then we don't care that strlabel = [str(2), str(2),
> str(1)] has duplicates; we only care that str(1) only appears once. If
> we were indexing with str(2), on the other hand, then there would be a
> problem and we'd raise a ValueError.
>
> I can add that check and then you can take a look.
>
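The rule and the narrower uniqueness check quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code committed in r187: the function name str_to_label_index is hypothetical, and raising ValueError when S is not found at all is an assumption (the thread only settles on ValueError for the ambiguous case).

```python
def str_to_label_index(label, s):
    """Hypothetical helper: return the position in `label` selected by string `s`."""
    # Map every label element to its string form.
    strlabel = [str(x) for x in label]
    count = strlabel.count(s)
    # Only duplicates of s itself matter; other duplicates in strlabel are fine.
    if count > 1:
        raise ValueError("%r matches more than one label element" % s)
    # Step 1: the string is itself a label element.
    if s in label:
        return label.index(s)
    # Step 2: fall back to the string form of the labels.
    if count == 1:
        return strlabel.index(s)
    # Assumption: treat a missing label as a ValueError as well.
    raise ValueError("%r not found in label" % s)


label = [2, str(2), 1]
print(str_to_label_index(label, str(1)))  # str(1) appears once in strlabel
try:
    str_to_label_index(label, str(2))     # str(2) appears twice in strlabel
except ValueError as err:
    print("ambiguous:", err)
```

With label [2, str(2), 1], indexing with str(1) succeeds, while indexing with str(2) raises, which matches the example in the quoted message.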
I just started to look at it. I saw that in str2labelindex you use
str(labelobject) to identify the label.

I don't think __str__ is very safe to use in general; I don't think
its output is guaranteed to remain unchanged. For example, in numpy you can
change the str result for numbers in arrays with the print options, e.g.
np.set_printoptions(precision=2).

Another example: objects that don't define a unique string or that use a
default string:
>>> class MyA(object):pass
>>> aaa = MyA()
>>> str(aaa)
'<__main__.MyA object at 0x01A57DD0>'
I'm not very familiar with datetime. Is the string representation
locale or timezone dependent?
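One way to probe the datetime question with only the standard library is sketched below. It is written for Python 3 (datetime.timezone is not available in Python 2), and the dates and offset are made up for illustration. str() on a datetime gives an ISO-style string that does not consult the locale, but for timezone-aware values the UTC offset is part of the string, so two objects that compare equal can still map to different strings.

```python
from datetime import datetime, timedelta, timezone

# The same instant expressed in two time zones (values chosen for illustration).
utc_noon = datetime(2010, 2, 7, 12, 0, tzinfo=timezone.utc)
same_instant = utc_noon.astimezone(timezone(timedelta(hours=1)))

# Aware datetimes compare by instant, so these are equal as label objects...
assert utc_noon == same_instant

# ...but their string forms differ because the offset is included.
print(str(utc_noon))      # 2010-02-07 12:00:00+00:00
print(str(same_instant))  # 2010-02-07 13:00:00+01:00
labels_differ = str(utc_noon) != str(same_instant)
```

So a lookup keyed on str(label) would treat these two equal labels as different, at least for timezone-aware datetimes.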
The decimal point is locale dependent. From some messages on the mailing
lists, I assume that in some cases the default in German is 5,4
instead of 5.4.

So, relying on the string representation imposes quite a lot of
restrictions on which types of labels this would work for.

I'll look some more.
Josef