← Back to team overview

larry-discuss team mailing list archive

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing

 

On Thu, Feb 4, 2010 at 10:08 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Thu, Feb 4, 2010 at 9:11 AM,  <josef.pktd@xxxxxxxxx> wrote:
>> On Thu, Feb 4, 2010 at 11:17 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Thu, Feb 4, 2010 at 8:04 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>> On Thu, Feb 4, 2010 at 10:33 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> On Wed, Feb 3, 2010 at 7:04 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>> On Wed, Feb 3, 2010 at 9:54 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>> On Wed, Feb 3, 2010 at 9:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>> On Wed, Feb 3, 2010 at 6:06 PM, joep <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>> Blueprint changed by joep:
>>>>>>>>>
>>>>>>>>> Whiteboard set to:
>>>>>>>>>
>>>>>>>>> assert_larry
>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L49
>>>>>>>>>
>>>>>>>>> and the method check_function in test class
>>>>>>>>>
>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L128
>>>>>>>>>
>>>>>>>>> already contain the extracted boiler plate assert function to compare a
>>>>>>>>> larry with desired data matrix x and labels
>>>>>>>>>
>>>>>>>>> the later has a view keyword to choose whether to verify nocopy or
>>>>>>>>> noreference
>>>>>>>>>
>>>>>>>>> used in nosetests for larry methods
>>>>>>>>
>>>>>>>> I moved this to the larry-discuss list since it is hard to discuss it
>>>>>>>> through the whiteboard on the blueprint.
>>>>>>>>
>>>>>>>> Yes, let's start with those functions and then prettify them.
>>>>>>>>
>>>>>>>> What should the signature be?
>>>>>>>>
>>>>>>>> The current signature is:
>>>>>>>>
>>>>>>>> assert_larry(opname, la, t, lab, msgstr)
>>>>>>>>
>>>>>>>> How about changing that to the same signature as np.testing.assert_equal? So:
>>>>>>>>
>>>>>>>> assert_equal(actual, desired, err_msg='', verbose=True)
>>>>>>>>
>>>>>>>> Then we don't have to separate the data and the label. And instead of
>>>>>>>> nancode we can use the numpy nan aware assert in numpy 1.4.
>>>>>>>>
>>>>>>>> Oops...dinner time!
>>>>>>>
>>>>>>> I'm just browsing the code and adding some notes.
>>>>>>>
>>>>>>> Yes, matching the numpy signature for assert_equal and
>>>>>>> assert_almost_equal is a good idea.
>>>>>>> If you require numpy 1.4 for the test suite, then most of the boiler
>>>>>>> plate is gone (nan handling)
>>>>>>>
>>>>>>> I think that assert_larry_equal is equivalent to
>>>>>>> assert_equal(la1.x, la2.x)
>>>>>>> assert_equal(la1.label, la2.label)
>>>>>>>
>>>>>>> slicing_test.py/test_slicing and test_morph use directly an assert_
>>>>>>> which could be replaced by np.testing.assert_equal
>>>>>>
>>>>>> just another comment:
>>>>>>
>>>>>> there are 3 patterns in the test suite corresponding to the previous comments
>>>>>>
>>>>>> * python unittest with boilerplate
>>>>>> * numpy 1.3 nosetests with explicit nan handling
>>>>>> * numpy 1.4 nosetests where numpy.testing assert do the nan handling
>>>>>>
>>>>>> In the 4th option an explicit function assert_larry_xxx is not really
>>>>>> necessary, and the test for x and labels and nocopy/noreference could
>>>>>> also be "yielded" directly from the test function/method.
>>>>>
>>>>> Instead of making many unit tests out of one call to larry's
>>>>> assert_equal, which would occur if we used yield, I think it is better
>>>>> for the whole thing be one unit test. That would mean that we'd have
>>>>> to wrap calls to np.testing.assert_equal in try...except blocks,
>>>>> collect any error messages, and raise an AssertionError at the end of
>>>>> the function if needed.
>>>>
>>>> several asserts don't have to be yielded, if you want them to be only
>>>> one unittest that fails at the first assertion error, e.g.
>>>> def test_movingsum32(self)  in deflarry_nose_test.py
>>>
>>> I think it is better for debugging if all the info is printed out when
>>> the test fails. For example, if a test fails on the label, Id like to
>>> know if it passed on the array.
>>>
>>>>>
>>>>> As for signature, how about
>>>>>
>>>>> assert_equal(actual, desired, msg='', dtype=True, noreference=True,
>>>>> nocopy=False, verbose=True)
>>>>
>>>> this doesn't work, since noreference and nocopy also need the original larry,
>>>> the signature of check function is
>>>
>>> Good point.
>>>
>>> An alternative to passing in the original and the actual is to pass in
>>> the original and the function that modifies the original to produce
>>> the actual. Then two larrys are always passed in. But that sounds
>>> messy.
>>>
>>>> check_function(self, t, label, p, orig, view=False)
>>>>
>>>> the signature could be
>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>> noreference=True,
>>>> nocopy=False, verbose=True)
>>>
>>> Should we go with the signature above?
>>>
>>>> but for noreference=True check an original has to be included
>>>>
>>>> with
>>>> assert_larry_equal(actual, desired, original, msg='', dtype=True,
>>>> noreference=True,
>>>> nocopy=False, verbose=True)
>>>>
>>>> the original would have to be passed in even if both noreference and
>>>> nocopy are False
>>>>
>>>>
>>>>> So by default the dtype would be compared. Sometimes you expect the
>>>>> dtype to change so maybe an option would be to pass in the dtype for
>>>>> the "desired" larry.
>>>>>
>>>>> I think that would cover the most common use cases.
>>>>
>>>> Yes, I think so for the comparison of two larrys
>>>>
>>>> slicing tests e.g. test_slicing, would need a new function for
>>>> noreference, nocopy that verifies e.g. that a slice is really a view.
>>>> (I don't know what the numpy tests for view versus copy are for fancy
>>>> slicing/indexing)
>>>>
>>>> Josef
>>
>> I attached a draft of the assert_larry_equal function. it imports some
>> helper functions from test.py in the la/tests folder, which
>> could/should also be rewritten into assert form.
>> It's a draft, I haven't checked if it is working correctly yet for all cases.
>>
>> Also, I think it would be better to add the testing helper functions
>> to la.utils so that they can be imported and don't need to have a copy
>> in the test folder, as in the case of test.py.
>
> I made a few tweaks. To get all tests to run even if the first one
> (check labels) fails, I wrapped the asserts in try...except (just the
> first two for now).
>
> Also added a check that original is not None when doing a noreference
> check ro nocopy check. And changed an assert to assert_equal.
>
> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>                       noreference=True, nocopy=False, verbose=True):
>    #assert equality of attributes of two larries
>
>    fail = []
>
>    try:
>        assert_equal(actual.x, desired.x, 'x')
>    except AssertionError, err:
>        fail.append(str(err))
>    try:
>        assert_equal(actual.label, desired.label, 'label')
>    except AssertionError, errmsg:
>        fail.append(str(err))
>
>    if dtype:
>        msg = printfail(actual.x.dtype, desired.x.dtype, 'x.dtype')
>        assert_equal(actual.x.dtype, desired.x.dtype, msg)
>
>    if noreference:
>        if original is None:
>            raise ValueError, 'original must be a larry to run
> noreference check.'
>        assert_(assert_noreference(actual, original), 'Reference found')
>    elif nocopy:
>        if original is None:
>            raise ValueError, 'original must be a larry to run nocopy
> check.'
>        assert_(assert_nocopy(actual, original), 'copy instead of
> reference found')
>    else:   #FIXME check view for different dimensional larries
>        pass
>
>    if len(fail) > 0:
>        msg = ''.join(fail)
>        raise AssertionError, msg

I get this:

>> x = larry([1,2,3])
>> y = larry([2,2,3], [['a', 'b', 'c']])
>>
>> assert_larry_equal(x, y, 'cumsum test', noreference=False)
---------------------------------------------------------------------------
AssertionError:
Items are not equal:
item=0
item=0
label
 ACTUAL: 0
 DESIRED: 'a'
Arrays are not equal
x
(mismatch 33.3333333333%)
 x: array([1, 2, 3])
 y: array([2, 2, 3])

with the code below. Hmm, the AssertionError message needs to be cleaned up.

def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
                       noreference=True, nocopy=False, verbose=True):
    #assert equality of attributes of two larries

    fail = []

    # label
    try:
        assert_equal(actual.label, desired.label, 'label')
    except AssertionError, err:
        fail.append(str(err))

    # Data array
    try:
        assert_equal(actual.x, desired.x, 'x')
    except AssertionError, err:
        fail.append(str(err))

    # dtype
    if dtype:
        try:
            assert_equal(actual.x.dtype, desired.x.dtype, 'dtype')
        except AssertionError, err:
            fail.append(str(err))

    # Check for references or copies
    if noreference:
        if original is None:
            raise ValueError, 'original must be a larry to run
noreference check.'
        try:
            assert_(assert_noreference(actual, original), 'Reference found')
        except AssertionError, err:
            fail.append(str(err))
    elif nocopy:
        if original is None:
            raise ValueError, 'original must be a larry to run nocopy check.'
        try:
            assert_(assert_nocopy(actual, original), 'copy instead of
reference found')
        except AssertionError, err:
            fail.append(str(err))
    else:   #FIXME check view for different dimensional larries
        pass

    # Did the test pass?
    if len(fail) > 0:
        # No
        msg = ''.join(fail)
        raise AssertionError, msg



Follow ups

References