← Back to team overview

larry-discuss team mailing list archive

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing

 

On Thu, Feb 4, 2010 at 5:50 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Thu, Feb 4, 2010 at 2:45 PM,  <josef.pktd@xxxxxxxxx> wrote:
>> On Thu, Feb 4, 2010 at 2:12 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Thu, Feb 4, 2010 at 11:07 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>> On Thu, Feb 4, 2010 at 1:59 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> On Thu, Feb 4, 2010 at 10:54 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>> On Thu, Feb 4, 2010 at 1:38 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>> On Thu, Feb 4, 2010 at 10:35 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>> On Thu, Feb 4, 2010 at 1:25 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>> On Thu, Feb 4, 2010 at 10:08 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>> On Thu, Feb 4, 2010 at 9:11 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>> On Thu, Feb 4, 2010 at 11:17 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>>>> On Thu, Feb 4, 2010 at 8:04 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>>> On Thu, Feb 4, 2010 at 10:33 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 7:04 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 9:54 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 9:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 6:06 PM, joep <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>>>>> Blueprint changed by joep:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Whiteboard set to:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> assert_larry
>>>>>>>>>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L49
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> and the method check_function in test class
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L128
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> already contain the extracted boiler plate assert function to compare a
>>>>>>>>>>>>>>>>>> larry with desired data matrix x and labels
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> the later has a view keyword to choose whether to verify nocopy or
>>>>>>>>>>>>>>>>>> noreference
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> used in nosetests for larry methods
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I moved this to the larry-discuss list since it is hard to discuss it
>>>>>>>>>>>>>>>>> through the whiteboard on the blueprint.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yes, let's start with those functions and then prettify them.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What should the signature be?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The current signature is:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> assert_larry(opname, la, t, lab, msgstr)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> How about changing that to the same signature as np.testing.assert_equal? So:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> assert_equal(actual, desired, err_msg='', verbose=True)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Then we don't have to separate the data and the label. And instead of
>>>>>>>>>>>>>>>>> nancode we can use the numpy nan aware assert in numpy 1.4.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Oops...dinner time!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm just browsing the code and adding some notes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, matching the numpy signature for assert_equal and
>>>>>>>>>>>>>>>> assert_almost_equal is a good idea.
>>>>>>>>>>>>>>>> If you require numpy 1.4 for the test suite, then most of the boiler
>>>>>>>>>>>>>>>> plate is gone (nan handling)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think that assert_larry_equal is equivalent to
>>>>>>>>>>>>>>>> assert_equal(la1.x, la2.x)
>>>>>>>>>>>>>>>> assert_equal(la1.label, la2.label)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> slicing_test.py/test_slicing and test_morph use directly an assert_
>>>>>>>>>>>>>>>> which could be replaced by np.testing.assert_equal
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> just another comment:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> there are 3 patterns in the test suite corresponding to the previous comments
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * python unittest with boilerplate
>>>>>>>>>>>>>>> * numpy 1.3 nosetests with explicit nan handling
>>>>>>>>>>>>>>> * numpy 1.4 nosetests where numpy.testing assert do the nan handling
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In the 4th option an explicit function assert_larry_xxx is not really
>>>>>>>>>>>>>>> necessary, and the test for x and labels and nocopy/noreference could
>>>>>>>>>>>>>>> also be "yielded" directly from the test function/method.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Instead of making many unit tests out of one call to larry's
>>>>>>>>>>>>>> assert_equal, which would occur if we used yield, I think it is better
>>>>>>>>>>>>>> for the whole thing be one unit test. That would mean that we'd have
>>>>>>>>>>>>>> to wrap calls to np.testing.assert_equal in try...except blocks,
>>>>>>>>>>>>>> collect any error messages, and raise an AssertionError at the end of
>>>>>>>>>>>>>> the function if needed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> several asserts don't have to be yielded, if you want them to be only
>>>>>>>>>>>>> one unittest that fails at the first assertion error, e.g.
>>>>>>>>>>>>> def test_movingsum32(self)  in deflarry_nose_test.py
>>>>>>>>>>>>
>>>>>>>>>>>> I think it is better for debugging if all the info is printed out when
>>>>>>>>>>>> the test fails. For example, if a test fails on the label, Id like to
>>>>>>>>>>>> know if it passed on the array.
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As for signature, how about
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> assert_equal(actual, desired, msg='', dtype=True, noreference=True,
>>>>>>>>>>>>>> nocopy=False, verbose=True)
>>>>>>>>>>>>>
>>>>>>>>>>>>> this doesn't work, since noreference and nocopy also need the original larry,
>>>>>>>>>>>>> the signature of check function is
>>>>>>>>>>>>
>>>>>>>>>>>> Good point.
>>>>>>>>>>>>
>>>>>>>>>>>> An alternative to passing in the original and the actual is to pass in
>>>>>>>>>>>> the original and the function that modifies the original to produce
>>>>>>>>>>>> the actual. Then two larrys are always passed in. But that sounds
>>>>>>>>>>>> messy.
>>>>>>>>>>>>
>>>>>>>>>>>>> check_function(self, t, label, p, orig, view=False)
>>>>>>>>>>>>>
>>>>>>>>>>>>> the signature could be
>>>>>>>>>>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>>>>>>> noreference=True,
>>>>>>>>>>>>> nocopy=False, verbose=True)
>>>>>>>>>>>>
>>>>>>>>>>>> Should we go with the signature above?
>>>>>>>>>>>>
>>>>>>>>>>>>> but for noreference=True check an original has to be included
>>>>>>>>>>>>>
>>>>>>>>>>>>> with
>>>>>>>>>>>>> assert_larry_equal(actual, desired, original, msg='', dtype=True,
>>>>>>>>>>>>> noreference=True,
>>>>>>>>>>>>> nocopy=False, verbose=True)
>>>>>>>>>>>>>
>>>>>>>>>>>>> the original would have to be passed in even if both noreference and
>>>>>>>>>>>>> nocopy are False
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> So by default the dtype would be compared. Sometimes you expect the
>>>>>>>>>>>>>> dtype to change so maybe an option would be to pass in the dtype for
>>>>>>>>>>>>>> the "desired" larry.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think that would cover the most common use cases.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, I think so for the comparison of two larrys
>>>>>>>>>>>>>
>>>>>>>>>>>>> slicing tests e.g. test_slicing, would need a new function for
>>>>>>>>>>>>> noreference, nocopy that verifies e.g. that a slice is really a view.
>>>>>>>>>>>>> (I don't know what the numpy tests for view versus copy are for fancy
>>>>>>>>>>>>> slicing/indexing)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Josef
>>>>>>>>>>>
>>>>>>>>>>> I attached a draft of the assert_larry_equal function. it imports some
>>>>>>>>>>> helper functions from test.py in the la/tests folder, which
>>>>>>>>>>> could/should also be rewritten into assert form.
>>>>>>>>>>> It's a draft, I haven't checked if it is working correctly yet for all cases.
>>>>>>>>>>>
>>>>>>>>>>> Also, I think it would be better to add the testing helper functions
>>>>>>>>>>> to la.utils so that they can be imported and don't need to have a copy
>>>>>>>>>>> in the test folder, as in the case of test.py.
>>>>>>>>>>
>>>>>>>>>> I made a few tweaks. To get all tests to run even if the first one
>>>>>>>>>> (check labels) fails, I wrapped the asserts in try...except (just the
>>>>>>>>>> first two for now).
>>>>>>>>>>
>>>>>>>>>> Also added a check that original is not None when doing a noreference
>>>>>>>>>> check ro nocopy check. And changed an assert to assert_equal.
>>>>>>>>>>
>>>>>>>>>> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>>>>                       noreference=True, nocopy=False, verbose=True):
>>>>>>>>>>    #assert equality of attributes of two larries
>>>>>>>>>>
>>>>>>>>>>    fail = []
>>>>>>>>>>
>>>>>>>>>>    try:
>>>>>>>>>>        assert_equal(actual.x, desired.x, 'x')
>>>>>>>>>>    except AssertionError, err:
>>>>>>>>>>        fail.append(str(err))
>>>>>>>>>>    try:
>>>>>>>>>>        assert_equal(actual.label, desired.label, 'label')
>>>>>>>>>>    except AssertionError, errmsg:
>>>>>>>>>>        fail.append(str(err))
>>>>>>>>>>
>>>>>>>>>>    if dtype:
>>>>>>>>>>        msg = printfail(actual.x.dtype, desired.x.dtype, 'x.dtype')
>>>>>>>>>>        assert_equal(actual.x.dtype, desired.x.dtype, msg)
>>>>>>>>>>
>>>>>>>>>>    if noreference:
>>>>>>>>>>        if original is None:
>>>>>>>>>>            raise ValueError, 'original must be a larry to run
>>>>>>>>>> noreference check.'
>>>>>>>>>>        assert_(assert_noreference(actual, original), 'Reference found')
>>>>>>>>>>    elif nocopy:
>>>>>>>>>>        if original is None:
>>>>>>>>>>            raise ValueError, 'original must be a larry to run nocopy
>>>>>>>>>> check.'
>>>>>>>>>>        assert_(assert_nocopy(actual, original), 'copy instead of
>>>>>>>>>> reference found')
>>>>>>>>>>    else:   #FIXME check view for different dimensional larries
>>>>>>>>>>        pass
>>>>>>>>>>
>>>>>>>>>>    if len(fail) > 0:
>>>>>>>>>>        msg = ''.join(fail)
>>>>>>>>>>        raise AssertionError, msg
>>>>>>>>>
>>>>>>>>> I get this:
>>>>>>>>>
>>>>>>>>>>> x = larry([1,2,3])
>>>>>>>>>>> y = larry([2,2,3], [['a', 'b', 'c']])
>>>>>>>>>>>
>>>>>>>>>>> assert_larry_equal(x, y, 'cumsum test', noreference=False)
>>>>>>>>> ---------------------------------------------------------------------------
>>>>>>>>> AssertionError:
>>>>>>>>> Items are not equal:
>>>>>>>>> item=0
>>>>>>>>> item=0
>>>>>>>>> label
>>>>>>>>>  ACTUAL: 0
>>>>>>>>>  DESIRED: 'a'
>>>>>>>>> Arrays are not equal
>>>>>>>>> x
>>>>>>>>> (mismatch 33.3333333333%)
>>>>>>>>>  x: array([1, 2, 3])
>>>>>>>>>  y: array([2, 2, 3])
>>>>>>>>>
>>>>>>>>> with the code below. Hmm, the AssertionError message needs to be cleaned up.
>>>>>>>>>
>>>>>>>>> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>>>                       noreference=True, nocopy=False, verbose=True):
>>>>>>>>>    #assert equality of attributes of two larries
>>>>>>>>>
>>>>>>>>>    fail = []
>>>>>>>>>
>>>>>>>>>    # label
>>>>>>>>>    try:
>>>>>>>>>        assert_equal(actual.label, desired.label, 'label')
>>>>>>>>>    except AssertionError, err:
>>>>>>>>>        fail.append(str(err))
>>>>>>>>>
>>>>>>>>>    # Data array
>>>>>>>>>    try:
>>>>>>>>>        assert_equal(actual.x, desired.x, 'x')
>>>>>>>>>    except AssertionError, err:
>>>>>>>>>        fail.append(str(err))
>>>>>>>>>
>>>>>>>>>    # dtype
>>>>>>>>>    if dtype:
>>>>>>>>>        try:
>>>>>>>>>            assert_equal(actual.x.dtype, desired.x.dtype, 'dtype')
>>>>>>>>>        except AssertionError, err:
>>>>>>>>>            fail.append(str(err))
>>>>>>>>>
>>>>>>>>>    # Check for references or copies
>>>>>>>>>    if noreference:
>>>>>>>>>        if original is None:
>>>>>>>>>            raise ValueError, 'original must be a larry to run
>>>>>>>>> noreference check.'
>>>>>>>>>        try:
>>>>>>>>>            assert_(assert_noreference(actual, original), 'Reference found')
>>>>>>>>>        except AssertionError, err:
>>>>>>>>>            fail.append(str(err))
>>>>>>>>>    elif nocopy:
>>>>>>>>>        if original is None:
>>>>>>>>>            raise ValueError, 'original must be a larry to run nocopy check.'
>>>>>>>>>        try:
>>>>>>>>>            assert_(assert_nocopy(actual, original), 'copy instead of
>>>>>>>>> reference found')
>>>>>>>>>        except AssertionError, err:
>>>>>>>>>            fail.append(str(err))
>>>>>>>>>    else:   #FIXME check view for different dimensional larries
>>>>>>>>>        pass
>>>>>>>>>
>>>>>>>>>    # Did the test pass?
>>>>>>>>>    if len(fail) > 0:
>>>>>>>>>        # No
>>>>>>>>>        msg = ''.join(fail)
>>>>>>>>>        raise AssertionError, msg
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Another possibility to look at, might be to use a yield inside the
>>>>>>>> assert_larry instead of the try .. except, which might result in
>>>>>>>> pretty much the same outcome, except for raising the test count. But
>>>>>>>> if it's just in one function in doesn't increase boiler plate so it's
>>>>>>>> fine.
>>>>>>>>
>>>>>>>> mabye '\n'.join(fail) would make it easier to read.
>>>>>>>>
>>>>>>>> Can you send it as an attachment or add it to the blueprint? gmail
>>>>>>>> destroys the correct intend when I copy from the mail.
>>>>>>>
>>>>>>> https://blueprints.launchpad.net/larry/+spec/simplify-unit-testing
>>>>>>>
>>>>>>
>>>>>> looks good, we can make minor adjustments after we see actual test output
>>>>>> (One thought I tried is to remove the extra newline after
>>>>>> ------------------, but it doesn't make much difference.)
>>>>>>
>>>>>> the second function would be almost a copy of this with only one
>>>>>> assert_equal replace by assert_almost_equal for x
>>>>>>
>>>>>> def assert_larry_almost_equal(actual, desired, decimal=7, msg='',
>>>>>> dtype=True, original=None,
>>>>>>                       noreference=True, nocopy=False, verbose=True):
>>>>>
>>>>> Hmm, could we do almost equal whenever one of the dtypes is float;
>>>>> equal otherwise?
>>>>
>>>> Yes,
>>>> In general I never use assert_equal with floats. In this case, I would
>>>> add the decimal keyword to assert_equal.
>>>> Also, to be less confusing, I would set decimal=14 (or similar) so the
>>>> function corresponds better to its name
>>>
>>> That sounds good. Then we just need a check to see if both x's are
>>> scalar (almost_equal) or not (equal).
>>>
>>
>> something strange till later
>>
>>>>> y = larry([2.0,2.0,3.0], [['a', 'b', 'c']])
>>>>> yc = y.copy()
>>>>> assert_larry_equal(y, y, 'identity', original=yc, noreference=True)
>> Traceback (most recent call last):
>>  File "<pyshell#32>", line 1, in <module>
>>    assert_larry_equal(y, y, 'identity', original=yc, noreference=True)
>>  File "C:\Josef\eclipsegworkspace\larry-josef\larry-josef\la\util\testing.py",
>> line 54, in assert_larry_equal
>>    raise AssertionError, msg
>> AssertionError:
>>
>> ---------------
>> REFERENCE FOUND
>> ---------------
>
> I cleaned up assert_larry_equal, added an automatic check for inexact
> dtype (almost equal) or exact (equal), and added a doc string. It's in
> r167.
>
> Now we can begin to unit test it. From the result you got above, maybe
> we should start by reviewing the nocopy and noreference unit tests. If
> there are any.
>

here is the first group I tried

from numpy.testing import assert_, assert_equal, assert_raises

x = larry([1,2,3])
y = larry([2.0,2.0,3.0], [['a', 'b', 'c']])
assert_larry_equal(y, y, 'identity', noreference=False)
assert_raises(AssertionError, assert_larry_equal, x, y, 'failall',
noreference=False)
assert_raises(AssertionError, assert_larry_equal, y, y, 'failall',
original=y, noreference=False, nocopy=True)
assert_larry_equal(y, y, 'identity', original=y, noreference=False)
#assert_raises( ... assert_larry_equal(y, y+1, 'identity', original=y,
noreference=False)
assert_larry_equal(y, y, 'identity', original=y+1, noreference=False)
#assert_raises(AssertionError, assert_larry_equal(y, y, 'identity',
original=y.copy(), noreference=True)
assert_larry_equal(y, y, 'identity', original=y.copy(),
noreference=True) #error raises ??
assert_larry_equal(y, y, 'identity', original=y.copy(),
noreference=False, nocopy=True) #raises ??
assert_larry_equal(y, y, 'identity', original=y.copy(),
noreference=False, nocopy=False)

is larry copy method tested?

nocopy, noreference need to be checked



Follow ups

References