
larry-discuss team mailing list archive

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing

 

On Thu, Feb 4, 2010 at 10:25 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Thu, Feb 4, 2010 at 10:08 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Thu, Feb 4, 2010 at 9:11 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>> On Thu, Feb 4, 2010 at 11:17 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Thu, Feb 4, 2010 at 8:04 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>> On Thu, Feb 4, 2010 at 10:33 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>> On Wed, Feb 3, 2010 at 7:04 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>> On Wed, Feb 3, 2010 at 9:54 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>> On Wed, Feb 3, 2010 at 9:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>> On Wed, Feb 3, 2010 at 6:06 PM, joep <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>> Blueprint changed by joep:
>>>>>>>>>>
>>>>>>>>>> Whiteboard set to:
>>>>>>>>>>
>>>>>>>>>> assert_larry
>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L49
>>>>>>>>>>
>>>>>>>>>> and the method check_function in test class
>>>>>>>>>>
>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L128
>>>>>>>>>>
>>>>>>>>>> already contain the extracted boilerplate assert function to compare a
>>>>>>>>>> larry with the desired data matrix x and labels
>>>>>>>>>>
>>>>>>>>>> the latter has a view keyword to choose whether to verify nocopy or
>>>>>>>>>> noreference
>>>>>>>>>>
>>>>>>>>>> used in nosetests for larry methods
>>>>>>>>>
>>>>>>>>> I moved this to the larry-discuss list since it is hard to discuss it
>>>>>>>>> through the whiteboard on the blueprint.
>>>>>>>>>
>>>>>>>>> Yes, let's start with those functions and then prettify them.
>>>>>>>>>
>>>>>>>>> What should the signature be?
>>>>>>>>>
>>>>>>>>> The current signature is:
>>>>>>>>>
>>>>>>>>> assert_larry(opname, la, t, lab, msgstr)
>>>>>>>>>
>>>>>>>>> How about changing that to the same signature as np.testing.assert_equal? So:
>>>>>>>>>
>>>>>>>>> assert_equal(actual, desired, err_msg='', verbose=True)
>>>>>>>>>
>>>>>>>>> Then we don't have to separate the data and the label. And instead of
>>>>>>>>> nancode we can use the numpy nan aware assert in numpy 1.4.
>>>>>>>>>
>>>>>>>>> Oops...dinner time!
>>>>>>>>
>>>>>>>> I'm just browsing the code and adding some notes.
>>>>>>>>
>>>>>>>> Yes, matching the numpy signature for assert_equal and
>>>>>>>> assert_almost_equal is a good idea.
>>>>>>>> If you require numpy 1.4 for the test suite, then most of the
>>>>>>>> boilerplate is gone (nan handling)
>>>>>>>>
>>>>>>>> I think that assert_larry_equal is equivalent to
>>>>>>>> assert_equal(la1.x, la2.x)
>>>>>>>> assert_equal(la1.label, la2.label)
>>>>>>>>
>>>>>>>> slicing_test.py/test_slicing and test_morph use directly an assert_
>>>>>>>> which could be replaced by np.testing.assert_equal
>>>>>>>
>>>>>>> just another comment:
>>>>>>>
>>>>>>> there are 3 patterns in the test suite corresponding to the previous comments
>>>>>>>
>>>>>>> * python unittest with boilerplate
>>>>>>> * numpy 1.3 nosetests with explicit nan handling
>>>>>>> * numpy 1.4 nosetests where the numpy.testing asserts do the nan handling
>>>>>>>
>>>>>>> In the 3rd option an explicit function assert_larry_xxx is not really
>>>>>>> necessary, and the test for x and labels and nocopy/noreference could
>>>>>>> also be "yielded" directly from the test function/method.
>>>>>>
>>>>>> Instead of making many unit tests out of one call to larry's
>>>>>> assert_equal, which would occur if we used yield, I think it is better
>>>>>> for the whole thing to be one unit test. That would mean that we'd have
>>>>>> to wrap calls to np.testing.assert_equal in try...except blocks,
>>>>>> collect any error messages, and raise an AssertionError at the end of
>>>>>> the function if needed.
>>>>>
>>>>> several asserts don't have to be yielded, if you want them to be only
>>>>> one unittest that fails at the first assertion error, e.g.
>>>>> def test_movingsum32(self)  in deflarry_nose_test.py
>>>>
>>>> I think it is better for debugging if all the info is printed out when
>>>> the test fails. For example, if a test fails on the label, I'd like to
>>>> know if it passed on the array.
>>>>
>>>>>>
>>>>>> As for signature, how about
>>>>>>
>>>>>> assert_equal(actual, desired, msg='', dtype=True, noreference=True,
>>>>>> nocopy=False, verbose=True)
>>>>>
>>>>> this doesn't work, since noreference and nocopy also need the original larry,
>>>>> the signature of check function is
>>>>
>>>> Good point.
>>>>
>>>> An alternative to passing in the original and the actual is to pass in
>>>> the original and the function that modifies the original to produce
>>>> the actual. Then two larrys are always passed in. But that sounds
>>>> messy.
>>>>
>>>>> check_function(self, t, label, p, orig, view=False)
>>>>>
>>>>> the signature could be
>>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>> noreference=True,
>>>>> nocopy=False, verbose=True)
>>>>
>>>> Should we go with the signature above?
>>>>
>>>>> but for noreference=True check an original has to be included
>>>>>
>>>>> with
>>>>> assert_larry_equal(actual, desired, original, msg='', dtype=True,
>>>>> noreference=True,
>>>>> nocopy=False, verbose=True)
>>>>>
>>>>> the original would have to be passed in even if both noreference and
>>>>> nocopy are False
>>>>>
>>>>>
>>>>>> So by default the dtype would be compared. Sometimes you expect the
>>>>>> dtype to change so maybe an option would be to pass in the dtype for
>>>>>> the "desired" larry.
>>>>>>
>>>>>> I think that would cover the most common use cases.
>>>>>
>>>>> Yes, I think so for the comparison of two larrys
>>>>>
>>>>> slicing tests e.g. test_slicing, would need a new function for
>>>>> noreference, nocopy that verifies e.g. that a slice is really a view.
>>>>> (I don't know what the numpy tests for view versus copy are for fancy
>>>>> slicing/indexing)
>>>>>
>>>>> Josef
>>>
>>> I attached a draft of the assert_larry_equal function. it imports some
>>> helper functions from test.py in the la/tests folder, which
>>> could/should also be rewritten into assert form.
>>> It's a draft, I haven't checked if it is working correctly yet for all cases.
>>>
>>> Also, I think it would be better to add the testing helper functions
>>> to la.utils so that they can be imported and don't need to have a copy
>>> in the test folder, as in the case of test.py.
>>
>> I made a few tweaks. To get all tests to run even if the first one
>> (check labels) fails, I wrapped the asserts in try...except (just the
>> first two for now).
>>
>> Also added a check that original is not None when doing a noreference
>> check or nocopy check. And changed an assert to assert_equal.
>>
>> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>                       noreference=True, nocopy=False, verbose=True):
>>    #assert equality of attributes of two larries
>>
>>    fail = []
>>
>>    try:
>>        assert_equal(actual.x, desired.x, 'x')
>>    except AssertionError, err:
>>        fail.append(str(err))
>>    try:
>>        assert_equal(actual.label, desired.label, 'label')
>>    except AssertionError, err:
>>        fail.append(str(err))
>>
>>    if dtype:
>>        msg = printfail(actual.x.dtype, desired.x.dtype, 'x.dtype')
>>        assert_equal(actual.x.dtype, desired.x.dtype, msg)
>>
>>    if noreference:
>>        if original is None:
>>            raise ValueError, 'original must be a larry to run noreference check.'
>>        assert_(assert_noreference(actual, original), 'Reference found')
>>    elif nocopy:
>>        if original is None:
>>            raise ValueError, 'original must be a larry to run nocopy check.'
>>        assert_(assert_nocopy(actual, original),
>>                'copy instead of reference found')
>>    else:   #FIXME check view for different dimensional larries
>>        pass
>>
>>    if len(fail) > 0:
>>        msg = ''.join(fail)
>>        raise AssertionError, msg
>
> I get this:
>
> >> x = larry([1,2,3])
> >> y = larry([2,2,3], [['a', 'b', 'c']])
> >>
> >> assert_larry_equal(x, y, 'cumsum test', noreference=False)
> ---------------------------------------------------------------------------
> AssertionError:
> Items are not equal:
> item=0
> item=0
> label
>  ACTUAL: 0
>  DESIRED: 'a'
> Arrays are not equal
> x
> (mismatch 33.3333333333%)
>  x: array([1, 2, 3])
>  y: array([2, 2, 3])
>
> with the code below. Hmm, the AssertionError message needs to be cleaned up.
>
> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>                       noreference=True, nocopy=False, verbose=True):
>    #assert equality of attributes of two larries
>
>    fail = []
>
>    # label
>    try:
>        assert_equal(actual.label, desired.label, 'label')
>    except AssertionError, err:
>        fail.append(str(err))
>
>    # Data array
>    try:
>        assert_equal(actual.x, desired.x, 'x')
>    except AssertionError, err:
>        fail.append(str(err))
>
>    # dtype
>    if dtype:
>        try:
>            assert_equal(actual.x.dtype, desired.x.dtype, 'dtype')
>        except AssertionError, err:
>            fail.append(str(err))
>
>    # Check for references or copies
>    if noreference:
>        if original is None:
>            raise ValueError, 'original must be a larry to run noreference check.'
>        try:
>            assert_(assert_noreference(actual, original), 'Reference found')
>        except AssertionError, err:
>            fail.append(str(err))
>    elif nocopy:
>        if original is None:
>            raise ValueError, 'original must be a larry to run nocopy check.'
>        try:
>            assert_(assert_nocopy(actual, original),
>                    'copy instead of reference found')
>        except AssertionError, err:
>            fail.append(str(err))
>    else:   #FIXME check view for different dimensional larries
>        pass
>
>    # Did the test pass?
>    if len(fail) > 0:
>        # No
>        msg = ''.join(fail)
>        raise AssertionError, msg

This cleans up the output (but not the code). For example:

>> x = larry([1,2,3])
>> y = larry([2.0,2.0,3.0], [['a', 'b', 'c']])
>>
>> assert_larry_equal(x, y, 'cumsum test', noreference=False)
---------------------------------------------------------------------------
AssertionError:

LABEL
-----

Items are not equal:
item=0
item=0

 ACTUAL: 0
 DESIRED: 'a'

X DATA ARRAY
------------

Arrays are not equal

(mismatch 33.3333333333%)
 x: array([1, 2, 3])
 y: array([ 2.,  2.,  3.])

DTYPE
-----

Items are not equal:
 ACTUAL: dtype('int64')
 DESIRED: dtype('float64')



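Code:

from numpy.testing import assert_, assert_equal
# assert_noreference and assert_nocopy are the helper functions from
# test.py in the la/tests folder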

def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
                       noreference=True, nocopy=False, verbose=True):
    #assert equality of attributes of two larries

    fail = []

    # label
    try:
        assert_equal(actual.label, desired.label)
    except AssertionError, err:
        fail.append('\n\nLABEL\n-----\n' + str(err))

    # Data array
    try:
        assert_equal(actual.x, desired.x)
    except AssertionError, err:
        fail.append('\n\nX DATA ARRAY\n------------\n' + str(err))

    # dtype
    if dtype:
        try:
            assert_equal(actual.x.dtype, desired.x.dtype)
        except AssertionError, err:
            fail.append('\n\nDTYPE\n-----\n' + str(err))

    # Check for references or copies
    if noreference:
        if original is None:
            raise ValueError, 'original must be a larry to run noreference check.'
        try:
            assert_(assert_noreference(actual, original))
        except AssertionError, err:
            fail.append('\n\nREFERENCE FOUND\n---------------\n' +
                        str(err))
    elif nocopy:
        if original is None:
            raise ValueError, 'original must be a larry to run nocopy check.'
        try:
            assert_(assert_nocopy(actual, original))
        except AssertionError, err:
            fail.append('\n\nCOPY INSTEAD OF REFERENCE FOUND'
                        '\n-------------------------------\n' + str(err))
    else:   #FIXME check view for different dimensional larries
        pass

    # Did the test pass?
    if len(fail) > 0:
        # No
        msg = ''.join(fail)
        raise AssertionError, msg
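
For anyone who wants to try the collect-then-raise idea outside the la test suite, here is a rough, self-contained sketch in Python 3 syntax. It is not the code that will go into larry: FakeLarry is a hypothetical stand-in with only the x and label attributes, the reference/copy checks are left out, and prepending msg to the report is my own assumption about what msg should do.

```python
import numpy as np
from numpy.testing import assert_equal


class FakeLarry:
    """Hypothetical stand-in for a larry: a data array x plus a list of labels."""

    def __init__(self, x, label=None):
        self.x = np.asarray(x)
        if label is None:
            # Default integer labels, one list per axis
            label = [list(range(n)) for n in self.x.shape]
        self.label = label


def assert_larry_equal(actual, desired, msg='', dtype=True):
    """Run every check, collect the failures, and raise once at the end."""
    checks = [
        ('LABEL', lambda: assert_equal(actual.label, desired.label)),
        ('X DATA ARRAY', lambda: assert_equal(actual.x, desired.x)),
    ]
    if dtype:
        checks.append(
            ('DTYPE', lambda: assert_equal(actual.x.dtype, desired.x.dtype)))

    fail = []
    for title, check in checks:
        try:
            check()
        except AssertionError as err:
            # Section header underlined with dashes, then numpy's own message
            fail.append('\n\n%s\n%s\n%s' % (title, '-' * len(title), err))

    # Did the test pass?
    if fail:
        raise AssertionError(msg + ''.join(fail))


if __name__ == '__main__':
    x = FakeLarry([1, 2, 3])
    y = FakeLarry([2.0, 2.0, 3.0], [['a', 'b', 'c']])
    try:
        assert_larry_equal(x, y, 'cumsum test')
    except AssertionError as err:
        print(err)
```

Putting the checks in a list keeps the try...except in one place, so adding another check (say, for references) is one more entry rather than one more block.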

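On the view-versus-copy question from earlier in the thread (verifying that a slice is really a view), one possible starting point is np.shares_memory, which is available in recent numpy releases. This is only a sketch for plain arrays. It is not how assert_noreference and assert_nocopy in la/tests work (they are not shown in this thread), and the name data_is_view is made up for the example.

```python
import numpy as np


def data_is_view(actual_x, original_x):
    """Return True if the two arrays overlap in memory.

    Hypothetical helper for illustration; it only looks at the data
    arrays and says nothing about whether labels are shared.
    """
    return np.shares_memory(actual_x, original_x)


original = np.arange(6.0)
sliced = original[::2]        # basic slicing returns a view
fancy = original[[0, 2, 4]]   # fancy indexing returns a copy

assert data_is_view(sliced, original)
assert not data_is_view(fancy, original)
```

A nocopy check would assert that data_is_view is True and a noreference check that it is False. Labels are Python lists, so they would need a separate identity test.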

