← Back to team overview

larry-discuss team mailing list archive

Re: stream lining signature of assert_larry_equal

 

On Fri, Feb 5, 2010 at 9:48 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Fri, Feb 5, 2010 at 9:37 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Fri, Feb 5, 2010 at 9:19 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>> On Fri, Feb 5, 2010 at 12:01 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Fri, Feb 5, 2010 at 8:52 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> On Fri, Feb 5, 2010 at 7:28 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>> On Fri, Feb 5, 2010 at 7:24 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>> The current signature
>>>>>>>
>>>>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>                       noreference=True, nocopy=False)
>>>>>>>
>>>>>>> has 3 keywords, original, noreference, nocopy for the option to check
>>>>>>> whether a larry is a view or a copy.
>>>>>>>
>>>>>>> This makes 8 binary options, only 3 are relevant cases, the other 5
>>>>>>> combinations raise errors or are redundant. This was quite confusing
>>>>>>> when I used the tests, since I often forgot to use, reset a keyword,
>>>>>>> e.g. nocopy=True also requires setting noreference = False
>>>>>>>
>>>>>>> I think a more compact signature would be
>>>>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>> noref=True)
>>>>>>>
>>>>>>> if not original is None:
>>>>>>>    if noref:
>>>>>>>       check noreference
>>>>>>>    else:
>>>>>>>       check nocopy
>>>>>>>
>>>>>>> otherwise don't do any reference/copy check
>>>>>>>
>>>>>>> the presence of original would indicate that we want the view check,
>>>>>>> `noref` would tell which one.
>>>>>>
>>>>>> This is how larry gets improved---with cleaning like that.
>>>>>>
>>>>>> I'll make the change in the trunk.
>>>>>
>>>>> I changed the function signature as you suggested. I also changed
>>>>> noref to iscopy. The change in the kw name will mess up your unit
>>>>> tests :(
>>>
>>> I will change them
>>>
>>>>>
>>>>> And I change the name of assert_noreference to assert_iscopy and
>>>>> assert_nocopy to assert_isview.
>>>>
>>>> Here's a potential problem if the label contains objects:
>>>>
>>>>>> import datetime
>>>>>> x = larry([1, 2] , [[9, datetime.date(2009,1,1)]])
>>>>>> y = larry([1, 2] , [[7, x.label[0][1]]])
>>>>>> x.label[0][1] is y.label[0][1]
>>>>   True
>>>>>> la.util.testing.assert_iscopy(x, y)
>>>>>>
>>>>
>>>> assert_iscopy and assert_isview only compare the entire axis at once:
>>>>
>>>>    for i in xrange(larry1.ndim):
>>>>        if larry1.label[i] is larry2.label[i]:
>>>>            msg.append('The labels along axis %d share a reference.' % i)
>>>>
>>>> So I should change that to loop over each label element, right?
>>>
>>> Yes. Because this is for the tests, we want to catch all cases, even
>>> if they don't show up in common usage.
>>>
>>> Do we have the same problem with nested labels, e.g. tuples after
>>> flatten?  Maybe then a recursive helper function is useful.
>>
>> Now I know why I didn't check each label element:
>>
>>>>> 9 is 9
>> True
>>
>> So I need to check if the object is immutable. Do you know how to do
>> that? A tuple is immutable, but not the elements inside. Seems
>> complicated.
>
> I'm only comparing, or trying to compare, label elements that are in
> the same position in both larrys. I guess label element i along axis j
> in larry1 should instead be compared to all label elements in larrys2
> to search for references. That might actually be easier to code. We'd
> need a function that collapses tuples and lists into a single list and
> then do all cross comparisons. The unit test larrys are small so it
> should not take long. And after comparing i with j we wouldn't need to
> compare j to i.

Ack! This is too complicated. And I don't see much gain from it. Plus
the copylabel method in larry looks pretty solid:

    def copylabel(self):
        """
        Return a copy of a larry's label.

        Examples
        --------
        >>> y = larry([1, 2], [['a', 'b']])
        >>> label = y.copylabel()
        >>> label
        [['a', 'b']]

        """
        return deepcopy(self.label)

So I'm thinking of leaving assert_iscopy and assert_isview as is. So
only check if an entire axis is a label copy.

If we were to keep going with this we'd realize that the exact same
issue applies to the data array in larry:

>> x = np.array([1, datetime.date(2009,1,1)])
>> y = np.array([2, x[1]])
>> x[1] is y[1]
   True
>> x is y
   False

So a full comparison is needed there too.



Follow ups

References