← Back to team overview

larry-discuss team mailing list archive

Re: stream lining signature of assert_larry_equal

 

On Fri, Feb 5, 2010 at 10:03 AM,  <josef.pktd@xxxxxxxxx> wrote:
> On Fri, Feb 5, 2010 at 12:56 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Fri, Feb 5, 2010 at 9:48 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Fri, Feb 5, 2010 at 9:37 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Fri, Feb 5, 2010 at 9:19 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>> On Fri, Feb 5, 2010 at 12:01 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>> On Fri, Feb 5, 2010 at 8:52 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>> On Fri, Feb 5, 2010 at 7:28 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>> On Fri, Feb 5, 2010 at 7:24 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>> The current signature
>>>>>>>>>
>>>>>>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>>>                       noreference=True, nocopy=False)
>>>>>>>>>
>>>>>>>>> has 3 keywords, original, noreference, nocopy for the option to check
>>>>>>>>> whether a larry is a view or a copy.
>>>>>>>>>
>>>>>>>>> This makes 8 binary options, only 3 are relevant cases, the other 5
>>>>>>>>> combinations raise errors or are redundant. This was quite confusing
>>>>>>>>> when I used the tests, since I often forgot to use, reset a keyword,
>>>>>>>>> e.g. nocopy=True also requires setting noreference = False
>>>>>>>>>
>>>>>>>>> I think a more compact signature would be
>>>>>>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>>> noref=True)
>>>>>>>>>
>>>>>>>>> if not original is None:
>>>>>>>>>    if noref:
>>>>>>>>>       check noreference
>>>>>>>>>    else:
>>>>>>>>>       check nocopy
>>>>>>>>>
>>>>>>>>> otherwise don't do any reference/copy check
>>>>>>>>>
>>>>>>>>> the presence of original would indicate that we want the view check,
>>>>>>>>> `noref` would tell which one.
>>>>>>>>
>>>>>>>> This is how larry gets improved---with cleaning like that.
>>>>>>>>
>>>>>>>> I'll make the change in the trunk.
>>>>>>>
>>>>>>> I changed the function signature as you suggested. I also changed
>>>>>>> noref to iscopy. The change in the kw name will mess up your unit
>>>>>>> tests :(
>>>>>
>>>>> I will change them
>>>>>
>>>>>>>
>>>>>>> And I change the name of assert_noreference to assert_iscopy and
>>>>>>> assert_nocopy to assert_isview.
>>>>>>
>>>>>> Here's a potential problem if the label contains objects:
>>>>>>
>>>>>>>> import datetime
>>>>>>>> x = larry([1, 2] , [[9, datetime.date(2009,1,1)]])
>>>>>>>> y = larry([1, 2] , [[7, x.label[0][1]]])
>>>>>>>> x.label[0][1] is y.label[0][1]
>>>>>>   True
>>>>>>>> la.util.testing.assert_iscopy(x, y)
>>>>>>>>
>>>>>>
>>>>>> assert_iscopy and assert_isview only compare the entire axis at once:
>>>>>>
>>>>>>    for i in xrange(larry1.ndim):
>>>>>>        if larry1.label[i] is larry2.label[i]:
>>>>>>            msg.append('The labels along axis %d share a reference.' % i)
>>>>>>
>>>>>> So I should change that to loop over each label element, right?
>>>>>
>>>>> Yes. Because this is for the tests, we want to catch all cases, even
>>>>> if they don't show up in common usage.
>>>>>
>>>>> Do we have the same problem with nested labels, e.g. tuples after
>>>>> flatten?  Maybe then a recursive helper function is useful.
>>>>
>>>> Now I know why I didn't check each label element:
>>>>
>>>>>>> 9 is 9
>>>> True
>>>>
>>>> So I need to check if the object is immutable. Do you know how to do
>>>> that? A tuple is immutable, but not the elements inside. Seems
>>>> complicated.
>>>
>>> I'm only comparing, or trying to compare, label elements that are in
>>> the same position in both larrys. I guess label element i along axis j
>>> in larry1 should instead be compared to all label elements in larrys2
>>> to search for references. That might actually be easier to code. We'd
>>> need a function that collapses tuples and lists into a single list and
>>> then do all cross comparisons. The unit test larrys are small so it
>>> should not take long. And after comparing i with j we wouldn't need to
>>> compare j to i.
>>
>> Ack! This is too complicated. And I don't see much gain from it. Plus
>> the copylabel method in larry looks pretty solid:
>>
>>    def copylabel(self):
>>        """
>>        Return a copy of a larry's label.
>>
>>        Examples
>>        --------
>>        >>> y = larry([1, 2], [['a', 'b']])
>>        >>> label = y.copylabel()
>>        >>> label
>>        [['a', 'b']]
>>
>>        """
>>        return deepcopy(self.label)
>>
>> So I'm thinking of leaving assert_iscopy and assert_isview as is. So
>> only check if an entire axis is a label copy.
>>
>> If we were to keep going with this we'd realize that the exact same
>> issue applies to the data array in larry:
>>
>>>> x = np.array([1, datetime.date(2009,1,1)])
>>>> y = np.array([2, x[1]])
>>>> x[1] is y[1]
>>   True
>>>> x is y
>>   False
>>
>> So a full comparison is needed there too.
>
> No, for numpy arrays we have the array methods, and there are stronger
> memory restrictions. When we compare larrys that are produced by
> slicing then we would need to check the base attribute of the array,
> we wouldn't check individual array elements.  Although, I'm not
> completely sure this holds for object arrays.

But doesn't the example below fail?

>>>> x = np.array([1, datetime.date(2009,1,1)])
>>>> y = np.array([2, x[1]])
>>>> x[1] is y[1]
>>   True
>>>> x is y
>>   False



Follow ups

References