larry-discuss team mailing list archive

Thread
Date

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing

To: josef.pktd@xxxxxxxxx
From: Keith Goodman <kwgoodman@xxxxxxxxx>
Date: Thu, 4 Feb 2010 14:50:44 -0800
Cc: larry-discuss@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1cd32cbb1002041445w962fd5ak460b933b2f75501b@mail.gmail.com>

On Thu, Feb 4, 2010 at 2:45 PM,  <josef.pktd@xxxxxxxxx> wrote:
> On Thu, Feb 4, 2010 at 2:12 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Thu, Feb 4, 2010 at 11:07 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>> On Thu, Feb 4, 2010 at 1:59 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Thu, Feb 4, 2010 at 10:54 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>> On Thu, Feb 4, 2010 at 1:38 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>> On Thu, Feb 4, 2010 at 10:35 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>> On Thu, Feb 4, 2010 at 1:25 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>> On Thu, Feb 4, 2010 at 10:08 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>> On Thu, Feb 4, 2010 at 9:11 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>> On Thu, Feb 4, 2010 at 11:17 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>>> On Thu, Feb 4, 2010 at 8:04 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>> On Thu, Feb 4, 2010 at 10:33 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 7:04 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 9:54 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 9:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 6:06 PM, joep <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>>>> Blueprint changed by joep:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Whiteboard set to:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> assert_larry
>>>>>>>>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L49
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> and the method check_function in test class
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L128
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> already contain the extracted boiler plate assert function to compare a
>>>>>>>>>>>>>>>>> larry with desired data matrix x and labels
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> the later has a view keyword to choose whether to verify nocopy or
>>>>>>>>>>>>>>>>> noreference
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> used in nosetests for larry methods
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I moved this to the larry-discuss list since it is hard to discuss it
>>>>>>>>>>>>>>>> through the whiteboard on the blueprint.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, let's start with those functions and then prettify them.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What should the signature be?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The current signature is:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> assert_larry(opname, la, t, lab, msgstr)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> How about changing that to the same signature as np.testing.assert_equal? So:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> assert_equal(actual, desired, err_msg='', verbose=True)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Then we don't have to separate the data and the label. And instead of
>>>>>>>>>>>>>>>> nancode we can use the numpy nan aware assert in numpy 1.4.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Oops...dinner time!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm just browsing the code and adding some notes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, matching the numpy signature for assert_equal and
>>>>>>>>>>>>>>> assert_almost_equal is a good idea.
>>>>>>>>>>>>>>> If you require numpy 1.4 for the test suite, then most of the boiler
>>>>>>>>>>>>>>> plate is gone (nan handling)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think that assert_larry_equal is equivalent to
>>>>>>>>>>>>>>> assert_equal(la1.x, la2.x)
>>>>>>>>>>>>>>> assert_equal(la1.label, la2.label)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> slicing_test.py/test_slicing and test_morph use directly an assert_
>>>>>>>>>>>>>>> which could be replaced by np.testing.assert_equal
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> just another comment:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> there are 3 patterns in the test suite corresponding to the previous comments
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * python unittest with boilerplate
>>>>>>>>>>>>>> * numpy 1.3 nosetests with explicit nan handling
>>>>>>>>>>>>>> * numpy 1.4 nosetests where numpy.testing assert do the nan handling
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the 4th option an explicit function assert_larry_xxx is not really
>>>>>>>>>>>>>> necessary, and the test for x and labels and nocopy/noreference could
>>>>>>>>>>>>>> also be "yielded" directly from the test function/method.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Instead of making many unit tests out of one call to larry's
>>>>>>>>>>>>> assert_equal, which would occur if we used yield, I think it is better
>>>>>>>>>>>>> for the whole thing be one unit test. That would mean that we'd have
>>>>>>>>>>>>> to wrap calls to np.testing.assert_equal in try...except blocks,
>>>>>>>>>>>>> collect any error messages, and raise an AssertionError at the end of
>>>>>>>>>>>>> the function if needed.
>>>>>>>>>>>>
>>>>>>>>>>>> several asserts don't have to be yielded, if you want them to be only
>>>>>>>>>>>> one unittest that fails at the first assertion error, e.g.
>>>>>>>>>>>> def test_movingsum32(self)  in deflarry_nose_test.py
>>>>>>>>>>>
>>>>>>>>>>> I think it is better for debugging if all the info is printed out when
>>>>>>>>>>> the test fails. For example, if a test fails on the label, Id like to
>>>>>>>>>>> know if it passed on the array.
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> As for signature, how about
>>>>>>>>>>>>>
>>>>>>>>>>>>> assert_equal(actual, desired, msg='', dtype=True, noreference=True,
>>>>>>>>>>>>> nocopy=False, verbose=True)
>>>>>>>>>>>>
>>>>>>>>>>>> this doesn't work, since noreference and nocopy also need the original larry,
>>>>>>>>>>>> the signature of check function is
>>>>>>>>>>>
>>>>>>>>>>> Good point.
>>>>>>>>>>>
>>>>>>>>>>> An alternative to passing in the original and the actual is to pass in
>>>>>>>>>>> the original and the function that modifies the original to produce
>>>>>>>>>>> the actual. Then two larrys are always passed in. But that sounds
>>>>>>>>>>> messy.
>>>>>>>>>>>
>>>>>>>>>>>> check_function(self, t, label, p, orig, view=False)
>>>>>>>>>>>>
>>>>>>>>>>>> the signature could be
>>>>>>>>>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>>>>>> noreference=True,
>>>>>>>>>>>> nocopy=False, verbose=True)
>>>>>>>>>>>
>>>>>>>>>>> Should we go with the signature above?
>>>>>>>>>>>
>>>>>>>>>>>> but for noreference=True check an original has to be included
>>>>>>>>>>>>
>>>>>>>>>>>> with
>>>>>>>>>>>> assert_larry_equal(actual, desired, original, msg='', dtype=True,
>>>>>>>>>>>> noreference=True,
>>>>>>>>>>>> nocopy=False, verbose=True)
>>>>>>>>>>>>
>>>>>>>>>>>> the original would have to be passed in even if both noreference and
>>>>>>>>>>>> nocopy are False
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> So by default the dtype would be compared. Sometimes you expect the
>>>>>>>>>>>>> dtype to change so maybe an option would be to pass in the dtype for
>>>>>>>>>>>>> the "desired" larry.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think that would cover the most common use cases.
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, I think so for the comparison of two larrys
>>>>>>>>>>>>
>>>>>>>>>>>> slicing tests e.g. test_slicing, would need a new function for
>>>>>>>>>>>> noreference, nocopy that verifies e.g. that a slice is really a view.
>>>>>>>>>>>> (I don't know what the numpy tests for view versus copy are for fancy
>>>>>>>>>>>> slicing/indexing)
>>>>>>>>>>>>
>>>>>>>>>>>> Josef
>>>>>>>>>>
>>>>>>>>>> I attached a draft of the assert_larry_equal function. it imports some
>>>>>>>>>> helper functions from test.py in the la/tests folder, which
>>>>>>>>>> could/should also be rewritten into assert form.
>>>>>>>>>> It's a draft, I haven't checked if it is working correctly yet for all cases.
>>>>>>>>>>
>>>>>>>>>> Also, I think it would be better to add the testing helper functions
>>>>>>>>>> to la.utils so that they can be imported and don't need to have a copy
>>>>>>>>>> in the test folder, as in the case of test.py.
>>>>>>>>>
>>>>>>>>> I made a few tweaks. To get all tests to run even if the first one
>>>>>>>>> (check labels) fails, I wrapped the asserts in try...except (just the
>>>>>>>>> first two for now).
>>>>>>>>>
>>>>>>>>> Also added a check that original is not None when doing a noreference
>>>>>>>>> check ro nocopy check. And changed an assert to assert_equal.
>>>>>>>>>
>>>>>>>>> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>>>                       noreference=True, nocopy=False, verbose=True):
>>>>>>>>>    #assert equality of attributes of two larries
>>>>>>>>>
>>>>>>>>>    fail = []
>>>>>>>>>
>>>>>>>>>    try:
>>>>>>>>>        assert_equal(actual.x, desired.x, 'x')
>>>>>>>>>    except AssertionError, err:
>>>>>>>>>        fail.append(str(err))
>>>>>>>>>    try:
>>>>>>>>>        assert_equal(actual.label, desired.label, 'label')
>>>>>>>>>    except AssertionError, errmsg:
>>>>>>>>>        fail.append(str(err))
>>>>>>>>>
>>>>>>>>>    if dtype:
>>>>>>>>>        msg = printfail(actual.x.dtype, desired.x.dtype, 'x.dtype')
>>>>>>>>>        assert_equal(actual.x.dtype, desired.x.dtype, msg)
>>>>>>>>>
>>>>>>>>>    if noreference:
>>>>>>>>>        if original is None:
>>>>>>>>>            raise ValueError, 'original must be a larry to run
>>>>>>>>> noreference check.'
>>>>>>>>>        assert_(assert_noreference(actual, original), 'Reference found')
>>>>>>>>>    elif nocopy:
>>>>>>>>>        if original is None:
>>>>>>>>>            raise ValueError, 'original must be a larry to run nocopy
>>>>>>>>> check.'
>>>>>>>>>        assert_(assert_nocopy(actual, original), 'copy instead of
>>>>>>>>> reference found')
>>>>>>>>>    else:   #FIXME check view for different dimensional larries
>>>>>>>>>        pass
>>>>>>>>>
>>>>>>>>>    if len(fail) > 0:
>>>>>>>>>        msg = ''.join(fail)
>>>>>>>>>        raise AssertionError, msg
>>>>>>>>
>>>>>>>> I get this:
>>>>>>>>
>>>>>>>>>> x = larry([1,2,3])
>>>>>>>>>> y = larry([2,2,3], [['a', 'b', 'c']])
>>>>>>>>>>
>>>>>>>>>> assert_larry_equal(x, y, 'cumsum test', noreference=False)
>>>>>>>> ---------------------------------------------------------------------------
>>>>>>>> AssertionError:
>>>>>>>> Items are not equal:
>>>>>>>> item=0
>>>>>>>> item=0
>>>>>>>> label
>>>>>>>>  ACTUAL: 0
>>>>>>>>  DESIRED: 'a'
>>>>>>>> Arrays are not equal
>>>>>>>> x
>>>>>>>> (mismatch 33.3333333333%)
>>>>>>>>  x: array([1, 2, 3])
>>>>>>>>  y: array([2, 2, 3])
>>>>>>>>
>>>>>>>> with the code below. Hmm, the AssertionError message needs to be cleaned up.
>>>>>>>>
>>>>>>>> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>>                       noreference=True, nocopy=False, verbose=True):
>>>>>>>>    #assert equality of attributes of two larries
>>>>>>>>
>>>>>>>>    fail = []
>>>>>>>>
>>>>>>>>    # label
>>>>>>>>    try:
>>>>>>>>        assert_equal(actual.label, desired.label, 'label')
>>>>>>>>    except AssertionError, err:
>>>>>>>>        fail.append(str(err))
>>>>>>>>
>>>>>>>>    # Data array
>>>>>>>>    try:
>>>>>>>>        assert_equal(actual.x, desired.x, 'x')
>>>>>>>>    except AssertionError, err:
>>>>>>>>        fail.append(str(err))
>>>>>>>>
>>>>>>>>    # dtype
>>>>>>>>    if dtype:
>>>>>>>>        try:
>>>>>>>>            assert_equal(actual.x.dtype, desired.x.dtype, 'dtype')
>>>>>>>>        except AssertionError, err:
>>>>>>>>            fail.append(str(err))
>>>>>>>>
>>>>>>>>    # Check for references or copies
>>>>>>>>    if noreference:
>>>>>>>>        if original is None:
>>>>>>>>            raise ValueError, 'original must be a larry to run
>>>>>>>> noreference check.'
>>>>>>>>        try:
>>>>>>>>            assert_(assert_noreference(actual, original), 'Reference found')
>>>>>>>>        except AssertionError, err:
>>>>>>>>            fail.append(str(err))
>>>>>>>>    elif nocopy:
>>>>>>>>        if original is None:
>>>>>>>>            raise ValueError, 'original must be a larry to run nocopy check.'
>>>>>>>>        try:
>>>>>>>>            assert_(assert_nocopy(actual, original), 'copy instead of
>>>>>>>> reference found')
>>>>>>>>        except AssertionError, err:
>>>>>>>>            fail.append(str(err))
>>>>>>>>    else:   #FIXME check view for different dimensional larries
>>>>>>>>        pass
>>>>>>>>
>>>>>>>>    # Did the test pass?
>>>>>>>>    if len(fail) > 0:
>>>>>>>>        # No
>>>>>>>>        msg = ''.join(fail)
>>>>>>>>        raise AssertionError, msg
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Another possibility to look at, might be to use a yield inside the
>>>>>>> assert_larry instead of the try .. except, which might result in
>>>>>>> pretty much the same outcome, except for raising the test count. But
>>>>>>> if it's just in one function in doesn't increase boiler plate so it's
>>>>>>> fine.
>>>>>>>
>>>>>>> mabye '\n'.join(fail) would make it easier to read.
>>>>>>>
>>>>>>> Can you send it as an attachment or add it to the blueprint? gmail
>>>>>>> destroys the correct intend when I copy from the mail.
>>>>>>
>>>>>> https://blueprints.launchpad.net/larry/+spec/simplify-unit-testing
>>>>>>
>>>>>
>>>>> looks good, we can make minor adjustments after we see actual test output
>>>>> (One thought I tried is to remove the extra newline after
>>>>> ------------------, but it doesn't make much difference.)
>>>>>
>>>>> the second function would be almost a copy of this with only one
>>>>> assert_equal replace by assert_almost_equal for x
>>>>>
>>>>> def assert_larry_almost_equal(actual, desired, decimal=7, msg='',
>>>>> dtype=True, original=None,
>>>>>                       noreference=True, nocopy=False, verbose=True):
>>>>
>>>> Hmm, could we do almost equal whenever one of the dtypes is float;
>>>> equal otherwise?
>>>
>>> Yes,
>>> In general I never use assert_equal with floats. In this case, I would
>>> add the decimal keyword to assert_equal.
>>> Also, to be less confusing, I would set decimal=14 (or similar) so the
>>> function corresponds better to its name
>>
>> That sounds good. Then we just need a check to see if both x's are
>> scalar (almost_equal) or not (equal).
>>
>
> something strange till later
>
>>>> y = larry([2.0,2.0,3.0], [['a', 'b', 'c']])
>>>> yc = y.copy()
>>>> assert_larry_equal(y, y, 'identity', original=yc, noreference=True)
> Traceback (most recent call last):
>  File "<pyshell#32>", line 1, in <module>
>    assert_larry_equal(y, y, 'identity', original=yc, noreference=True)
>  File "C:\Josef\eclipsegworkspace\larry-josef\larry-josef\la\util\testing.py",
> line 54, in assert_larry_equal
>    raise AssertionError, msg
> AssertionError:
>
> ---------------
> REFERENCE FOUND
> ---------------

I cleaned up assert_larry_equal, added an automatic check for inexact
dtype (almost equal) or exact (equal), and added a doc string. It's in
r167.

Now we can begin to unit test it. From the result you got above, maybe
we should start by reviewing the nocopy and noreference unit tests. If
there are any.

Follow ups

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04

References

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04