larry-discuss team mailing list archive

Thread
Date

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing

To: Keith Goodman <kwgoodman@xxxxxxxxx>
From: josef.pktd@xxxxxxxxx
Date: Thu, 4 Feb 2010 13:54:23 -0500
Cc: larry-discuss@xxxxxxxxxxxxxxxxxxx
In-reply-to: <f4f93d421002041038k649806ecu4ca10b3824470b3f@mail.gmail.com>

On Thu, Feb 4, 2010 at 1:38 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Thu, Feb 4, 2010 at 10:35 AM,  <josef.pktd@xxxxxxxxx> wrote:
>> On Thu, Feb 4, 2010 at 1:25 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Thu, Feb 4, 2010 at 10:08 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Thu, Feb 4, 2010 at 9:11 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>> On Thu, Feb 4, 2010 at 11:17 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>> On Thu, Feb 4, 2010 at 8:04 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>> On Thu, Feb 4, 2010 at 10:33 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>> On Wed, Feb 3, 2010 at 7:04 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>> On Wed, Feb 3, 2010 at 9:54 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>> On Wed, Feb 3, 2010 at 9:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>>> On Wed, Feb 3, 2010 at 6:06 PM, joep <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>> Blueprint changed by joep:
>>>>>>>>>>>>
>>>>>>>>>>>> Whiteboard set to:
>>>>>>>>>>>>
>>>>>>>>>>>> assert_larry
>>>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L49
>>>>>>>>>>>>
>>>>>>>>>>>> and the method check_function in test class
>>>>>>>>>>>>
>>>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L128
>>>>>>>>>>>>
>>>>>>>>>>>> already contain the extracted boiler plate assert function to compare a
>>>>>>>>>>>> larry with desired data matrix x and labels
>>>>>>>>>>>>
>>>>>>>>>>>> the later has a view keyword to choose whether to verify nocopy or
>>>>>>>>>>>> noreference
>>>>>>>>>>>>
>>>>>>>>>>>> used in nosetests for larry methods
>>>>>>>>>>>
>>>>>>>>>>> I moved this to the larry-discuss list since it is hard to discuss it
>>>>>>>>>>> through the whiteboard on the blueprint.
>>>>>>>>>>>
>>>>>>>>>>> Yes, let's start with those functions and then prettify them.
>>>>>>>>>>>
>>>>>>>>>>> What should the signature be?
>>>>>>>>>>>
>>>>>>>>>>> The current signature is:
>>>>>>>>>>>
>>>>>>>>>>> assert_larry(opname, la, t, lab, msgstr)
>>>>>>>>>>>
>>>>>>>>>>> How about changing that to the same signature as np.testing.assert_equal? So:
>>>>>>>>>>>
>>>>>>>>>>> assert_equal(actual, desired, err_msg='', verbose=True)
>>>>>>>>>>>
>>>>>>>>>>> Then we don't have to separate the data and the label. And instead of
>>>>>>>>>>> nancode we can use the numpy nan aware assert in numpy 1.4.
>>>>>>>>>>>
>>>>>>>>>>> Oops...dinner time!
>>>>>>>>>>
>>>>>>>>>> I'm just browsing the code and adding some notes.
>>>>>>>>>>
>>>>>>>>>> Yes, matching the numpy signature for assert_equal and
>>>>>>>>>> assert_almost_equal is a good idea.
>>>>>>>>>> If you require numpy 1.4 for the test suite, then most of the boiler
>>>>>>>>>> plate is gone (nan handling)
>>>>>>>>>>
>>>>>>>>>> I think that assert_larry_equal is equivalent to
>>>>>>>>>> assert_equal(la1.x, la2.x)
>>>>>>>>>> assert_equal(la1.label, la2.label)
>>>>>>>>>>
>>>>>>>>>> slicing_test.py/test_slicing and test_morph use directly an assert_
>>>>>>>>>> which could be replaced by np.testing.assert_equal
>>>>>>>>>
>>>>>>>>> just another comment:
>>>>>>>>>
>>>>>>>>> there are 3 patterns in the test suite corresponding to the previous comments
>>>>>>>>>
>>>>>>>>> * python unittest with boilerplate
>>>>>>>>> * numpy 1.3 nosetests with explicit nan handling
>>>>>>>>> * numpy 1.4 nosetests where numpy.testing assert do the nan handling
>>>>>>>>>
>>>>>>>>> In the 4th option an explicit function assert_larry_xxx is not really
>>>>>>>>> necessary, and the test for x and labels and nocopy/noreference could
>>>>>>>>> also be "yielded" directly from the test function/method.
>>>>>>>>
>>>>>>>> Instead of making many unit tests out of one call to larry's
>>>>>>>> assert_equal, which would occur if we used yield, I think it is better
>>>>>>>> for the whole thing be one unit test. That would mean that we'd have
>>>>>>>> to wrap calls to np.testing.assert_equal in try...except blocks,
>>>>>>>> collect any error messages, and raise an AssertionError at the end of
>>>>>>>> the function if needed.
>>>>>>>
>>>>>>> several asserts don't have to be yielded, if you want them to be only
>>>>>>> one unittest that fails at the first assertion error, e.g.
>>>>>>> def test_movingsum32(self)  in deflarry_nose_test.py
>>>>>>
>>>>>> I think it is better for debugging if all the info is printed out when
>>>>>> the test fails. For example, if a test fails on the label, Id like to
>>>>>> know if it passed on the array.
>>>>>>
>>>>>>>>
>>>>>>>> As for signature, how about
>>>>>>>>
>>>>>>>> assert_equal(actual, desired, msg='', dtype=True, noreference=True,
>>>>>>>> nocopy=False, verbose=True)
>>>>>>>
>>>>>>> this doesn't work, since noreference and nocopy also need the original larry,
>>>>>>> the signature of check function is
>>>>>>
>>>>>> Good point.
>>>>>>
>>>>>> An alternative to passing in the original and the actual is to pass in
>>>>>> the original and the function that modifies the original to produce
>>>>>> the actual. Then two larrys are always passed in. But that sounds
>>>>>> messy.
>>>>>>
>>>>>>> check_function(self, t, label, p, orig, view=False)
>>>>>>>
>>>>>>> the signature could be
>>>>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>> noreference=True,
>>>>>>> nocopy=False, verbose=True)
>>>>>>
>>>>>> Should we go with the signature above?
>>>>>>
>>>>>>> but for noreference=True check an original has to be included
>>>>>>>
>>>>>>> with
>>>>>>> assert_larry_equal(actual, desired, original, msg='', dtype=True,
>>>>>>> noreference=True,
>>>>>>> nocopy=False, verbose=True)
>>>>>>>
>>>>>>> the original would have to be passed in even if both noreference and
>>>>>>> nocopy are False
>>>>>>>
>>>>>>>
>>>>>>>> So by default the dtype would be compared. Sometimes you expect the
>>>>>>>> dtype to change so maybe an option would be to pass in the dtype for
>>>>>>>> the "desired" larry.
>>>>>>>>
>>>>>>>> I think that would cover the most common use cases.
>>>>>>>
>>>>>>> Yes, I think so for the comparison of two larrys
>>>>>>>
>>>>>>> slicing tests e.g. test_slicing, would need a new function for
>>>>>>> noreference, nocopy that verifies e.g. that a slice is really a view.
>>>>>>> (I don't know what the numpy tests for view versus copy are for fancy
>>>>>>> slicing/indexing)
>>>>>>>
>>>>>>> Josef
>>>>>
>>>>> I attached a draft of the assert_larry_equal function. it imports some
>>>>> helper functions from test.py in the la/tests folder, which
>>>>> could/should also be rewritten into assert form.
>>>>> It's a draft, I haven't checked if it is working correctly yet for all cases.
>>>>>
>>>>> Also, I think it would be better to add the testing helper functions
>>>>> to la.utils so that they can be imported and don't need to have a copy
>>>>> in the test folder, as in the case of test.py.
>>>>
>>>> I made a few tweaks. To get all tests to run even if the first one
>>>> (check labels) fails, I wrapped the asserts in try...except (just the
>>>> first two for now).
>>>>
>>>> Also added a check that original is not None when doing a noreference
>>>> check ro nocopy check. And changed an assert to assert_equal.
>>>>
>>>> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>                       noreference=True, nocopy=False, verbose=True):
>>>>    #assert equality of attributes of two larries
>>>>
>>>>    fail = []
>>>>
>>>>    try:
>>>>        assert_equal(actual.x, desired.x, 'x')
>>>>    except AssertionError, err:
>>>>        fail.append(str(err))
>>>>    try:
>>>>        assert_equal(actual.label, desired.label, 'label')
>>>>    except AssertionError, errmsg:
>>>>        fail.append(str(err))
>>>>
>>>>    if dtype:
>>>>        msg = printfail(actual.x.dtype, desired.x.dtype, 'x.dtype')
>>>>        assert_equal(actual.x.dtype, desired.x.dtype, msg)
>>>>
>>>>    if noreference:
>>>>        if original is None:
>>>>            raise ValueError, 'original must be a larry to run
>>>> noreference check.'
>>>>        assert_(assert_noreference(actual, original), 'Reference found')
>>>>    elif nocopy:
>>>>        if original is None:
>>>>            raise ValueError, 'original must be a larry to run nocopy
>>>> check.'
>>>>        assert_(assert_nocopy(actual, original), 'copy instead of
>>>> reference found')
>>>>    else:   #FIXME check view for different dimensional larries
>>>>        pass
>>>>
>>>>    if len(fail) > 0:
>>>>        msg = ''.join(fail)
>>>>        raise AssertionError, msg
>>>
>>> I get this:
>>>
>>>>> x = larry([1,2,3])
>>>>> y = larry([2,2,3], [['a', 'b', 'c']])
>>>>>
>>>>> assert_larry_equal(x, y, 'cumsum test', noreference=False)
>>> ---------------------------------------------------------------------------
>>> AssertionError:
>>> Items are not equal:
>>> item=0
>>> item=0
>>> label
>>>  ACTUAL: 0
>>>  DESIRED: 'a'
>>> Arrays are not equal
>>> x
>>> (mismatch 33.3333333333%)
>>>  x: array([1, 2, 3])
>>>  y: array([2, 2, 3])
>>>
>>> with the code below. Hmm, the AssertionError message needs to be cleaned up.
>>>
>>> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>                       noreference=True, nocopy=False, verbose=True):
>>>    #assert equality of attributes of two larries
>>>
>>>    fail = []
>>>
>>>    # label
>>>    try:
>>>        assert_equal(actual.label, desired.label, 'label')
>>>    except AssertionError, err:
>>>        fail.append(str(err))
>>>
>>>    # Data array
>>>    try:
>>>        assert_equal(actual.x, desired.x, 'x')
>>>    except AssertionError, err:
>>>        fail.append(str(err))
>>>
>>>    # dtype
>>>    if dtype:
>>>        try:
>>>            assert_equal(actual.x.dtype, desired.x.dtype, 'dtype')
>>>        except AssertionError, err:
>>>            fail.append(str(err))
>>>
>>>    # Check for references or copies
>>>    if noreference:
>>>        if original is None:
>>>            raise ValueError, 'original must be a larry to run
>>> noreference check.'
>>>        try:
>>>            assert_(assert_noreference(actual, original), 'Reference found')
>>>        except AssertionError, err:
>>>            fail.append(str(err))
>>>    elif nocopy:
>>>        if original is None:
>>>            raise ValueError, 'original must be a larry to run nocopy check.'
>>>        try:
>>>            assert_(assert_nocopy(actual, original), 'copy instead of
>>> reference found')
>>>        except AssertionError, err:
>>>            fail.append(str(err))
>>>    else:   #FIXME check view for different dimensional larries
>>>        pass
>>>
>>>    # Did the test pass?
>>>    if len(fail) > 0:
>>>        # No
>>>        msg = ''.join(fail)
>>>        raise AssertionError, msg
>>>
>>
>>
>> Another possibility to look at, might be to use a yield inside the
>> assert_larry instead of the try .. except, which might result in
>> pretty much the same outcome, except for raising the test count. But
>> if it's just in one function in doesn't increase boiler plate so it's
>> fine.
>>
>> mabye '\n'.join(fail) would make it easier to read.
>>
>> Can you send it as an attachment or add it to the blueprint? gmail
>> destroys the correct intend when I copy from the mail.
>
> https://blueprints.launchpad.net/larry/+spec/simplify-unit-testing
>

looks good, we can make minor adjustments after we see actual test output
(One thought I tried is to remove the extra newline after
------------------, but it doesn't make much difference.)

the second function would be almost a copy of this with only one
assert_equal replace by assert_almost_equal for x

def assert_larry_almost_equal(actual, desired, decimal=7, msg='',
dtype=True, original=None,
                       noreference=True, nocopy=False, verbose=True):

Josef

Follow ups

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04

References

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04