larry-discuss team mailing list archive

Thread
Date

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing

To: Keith Goodman <kwgoodman@xxxxxxxxx>
From: josef.pktd@xxxxxxxxx
Date: Thu, 4 Feb 2010 17:45:54 -0500
Cc: larry-discuss@xxxxxxxxxxxxxxxxxxx
In-reply-to: <f4f93d421002041112u12e8a9u6ba7d03de0edd857@mail.gmail.com>

On Thu, Feb 4, 2010 at 2:12 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Thu, Feb 4, 2010 at 11:07 AM,  <josef.pktd@xxxxxxxxx> wrote:
>> On Thu, Feb 4, 2010 at 1:59 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Thu, Feb 4, 2010 at 10:54 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>> On Thu, Feb 4, 2010 at 1:38 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> On Thu, Feb 4, 2010 at 10:35 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>> On Thu, Feb 4, 2010 at 1:25 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>> On Thu, Feb 4, 2010 at 10:08 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>> On Thu, Feb 4, 2010 at 9:11 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>> On Thu, Feb 4, 2010 at 11:17 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>> On Thu, Feb 4, 2010 at 8:04 AM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>> On Thu, Feb 4, 2010 at 10:33 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>>>> On Wed, Feb 3, 2010 at 7:04 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 9:54 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 9:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>> On Wed, Feb 3, 2010 at 6:06 PM, joep <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>>>>>>>> Blueprint changed by joep:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Whiteboard set to:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> assert_larry
>>>>>>>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L49
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> and the method check_function in test class
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L128
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> already contain the extracted boiler plate assert function to compare a
>>>>>>>>>>>>>>>> larry with desired data matrix x and labels
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the later has a view keyword to choose whether to verify nocopy or
>>>>>>>>>>>>>>>> noreference
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> used in nosetests for larry methods
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I moved this to the larry-discuss list since it is hard to discuss it
>>>>>>>>>>>>>>> through the whiteboard on the blueprint.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, let's start with those functions and then prettify them.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What should the signature be?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The current signature is:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> assert_larry(opname, la, t, lab, msgstr)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> How about changing that to the same signature as np.testing.assert_equal? So:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> assert_equal(actual, desired, err_msg='', verbose=True)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Then we don't have to separate the data and the label. And instead of
>>>>>>>>>>>>>>> nancode we can use the numpy nan aware assert in numpy 1.4.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Oops...dinner time!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm just browsing the code and adding some notes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, matching the numpy signature for assert_equal and
>>>>>>>>>>>>>> assert_almost_equal is a good idea.
>>>>>>>>>>>>>> If you require numpy 1.4 for the test suite, then most of the boiler
>>>>>>>>>>>>>> plate is gone (nan handling)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think that assert_larry_equal is equivalent to
>>>>>>>>>>>>>> assert_equal(la1.x, la2.x)
>>>>>>>>>>>>>> assert_equal(la1.label, la2.label)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> slicing_test.py/test_slicing and test_morph use directly an assert_
>>>>>>>>>>>>>> which could be replaced by np.testing.assert_equal
>>>>>>>>>>>>>
>>>>>>>>>>>>> just another comment:
>>>>>>>>>>>>>
>>>>>>>>>>>>> there are 3 patterns in the test suite corresponding to the previous comments
>>>>>>>>>>>>>
>>>>>>>>>>>>> * python unittest with boilerplate
>>>>>>>>>>>>> * numpy 1.3 nosetests with explicit nan handling
>>>>>>>>>>>>> * numpy 1.4 nosetests where numpy.testing assert do the nan handling
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the 4th option an explicit function assert_larry_xxx is not really
>>>>>>>>>>>>> necessary, and the test for x and labels and nocopy/noreference could
>>>>>>>>>>>>> also be "yielded" directly from the test function/method.
>>>>>>>>>>>>
>>>>>>>>>>>> Instead of making many unit tests out of one call to larry's
>>>>>>>>>>>> assert_equal, which would occur if we used yield, I think it is better
>>>>>>>>>>>> for the whole thing be one unit test. That would mean that we'd have
>>>>>>>>>>>> to wrap calls to np.testing.assert_equal in try...except blocks,
>>>>>>>>>>>> collect any error messages, and raise an AssertionError at the end of
>>>>>>>>>>>> the function if needed.
>>>>>>>>>>>
>>>>>>>>>>> several asserts don't have to be yielded, if you want them to be only
>>>>>>>>>>> one unittest that fails at the first assertion error, e.g.
>>>>>>>>>>> def test_movingsum32(self)  in deflarry_nose_test.py
>>>>>>>>>>
>>>>>>>>>> I think it is better for debugging if all the info is printed out when
>>>>>>>>>> the test fails. For example, if a test fails on the label, Id like to
>>>>>>>>>> know if it passed on the array.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> As for signature, how about
>>>>>>>>>>>>
>>>>>>>>>>>> assert_equal(actual, desired, msg='', dtype=True, noreference=True,
>>>>>>>>>>>> nocopy=False, verbose=True)
>>>>>>>>>>>
>>>>>>>>>>> this doesn't work, since noreference and nocopy also need the original larry,
>>>>>>>>>>> the signature of check function is
>>>>>>>>>>
>>>>>>>>>> Good point.
>>>>>>>>>>
>>>>>>>>>> An alternative to passing in the original and the actual is to pass in
>>>>>>>>>> the original and the function that modifies the original to produce
>>>>>>>>>> the actual. Then two larrys are always passed in. But that sounds
>>>>>>>>>> messy.
>>>>>>>>>>
>>>>>>>>>>> check_function(self, t, label, p, orig, view=False)
>>>>>>>>>>>
>>>>>>>>>>> the signature could be
>>>>>>>>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>>>>> noreference=True,
>>>>>>>>>>> nocopy=False, verbose=True)
>>>>>>>>>>
>>>>>>>>>> Should we go with the signature above?
>>>>>>>>>>
>>>>>>>>>>> but for noreference=True check an original has to be included
>>>>>>>>>>>
>>>>>>>>>>> with
>>>>>>>>>>> assert_larry_equal(actual, desired, original, msg='', dtype=True,
>>>>>>>>>>> noreference=True,
>>>>>>>>>>> nocopy=False, verbose=True)
>>>>>>>>>>>
>>>>>>>>>>> the original would have to be passed in even if both noreference and
>>>>>>>>>>> nocopy are False
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> So by default the dtype would be compared. Sometimes you expect the
>>>>>>>>>>>> dtype to change so maybe an option would be to pass in the dtype for
>>>>>>>>>>>> the "desired" larry.
>>>>>>>>>>>>
>>>>>>>>>>>> I think that would cover the most common use cases.
>>>>>>>>>>>
>>>>>>>>>>> Yes, I think so for the comparison of two larrys
>>>>>>>>>>>
>>>>>>>>>>> slicing tests e.g. test_slicing, would need a new function for
>>>>>>>>>>> noreference, nocopy that verifies e.g. that a slice is really a view.
>>>>>>>>>>> (I don't know what the numpy tests for view versus copy are for fancy
>>>>>>>>>>> slicing/indexing)
>>>>>>>>>>>
>>>>>>>>>>> Josef
>>>>>>>>>
>>>>>>>>> I attached a draft of the assert_larry_equal function. it imports some
>>>>>>>>> helper functions from test.py in the la/tests folder, which
>>>>>>>>> could/should also be rewritten into assert form.
>>>>>>>>> It's a draft, I haven't checked if it is working correctly yet for all cases.
>>>>>>>>>
>>>>>>>>> Also, I think it would be better to add the testing helper functions
>>>>>>>>> to la.utils so that they can be imported and don't need to have a copy
>>>>>>>>> in the test folder, as in the case of test.py.
>>>>>>>>
>>>>>>>> I made a few tweaks. To get all tests to run even if the first one
>>>>>>>> (check labels) fails, I wrapped the asserts in try...except (just the
>>>>>>>> first two for now).
>>>>>>>>
>>>>>>>> Also added a check that original is not None when doing a noreference
>>>>>>>> check ro nocopy check. And changed an assert to assert_equal.
>>>>>>>>
>>>>>>>> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>>                       noreference=True, nocopy=False, verbose=True):
>>>>>>>>    #assert equality of attributes of two larries
>>>>>>>>
>>>>>>>>    fail = []
>>>>>>>>
>>>>>>>>    try:
>>>>>>>>        assert_equal(actual.x, desired.x, 'x')
>>>>>>>>    except AssertionError, err:
>>>>>>>>        fail.append(str(err))
>>>>>>>>    try:
>>>>>>>>        assert_equal(actual.label, desired.label, 'label')
>>>>>>>>    except AssertionError, errmsg:
>>>>>>>>        fail.append(str(err))
>>>>>>>>
>>>>>>>>    if dtype:
>>>>>>>>        msg = printfail(actual.x.dtype, desired.x.dtype, 'x.dtype')
>>>>>>>>        assert_equal(actual.x.dtype, desired.x.dtype, msg)
>>>>>>>>
>>>>>>>>    if noreference:
>>>>>>>>        if original is None:
>>>>>>>>            raise ValueError, 'original must be a larry to run
>>>>>>>> noreference check.'
>>>>>>>>        assert_(assert_noreference(actual, original), 'Reference found')
>>>>>>>>    elif nocopy:
>>>>>>>>        if original is None:
>>>>>>>>            raise ValueError, 'original must be a larry to run nocopy
>>>>>>>> check.'
>>>>>>>>        assert_(assert_nocopy(actual, original), 'copy instead of
>>>>>>>> reference found')
>>>>>>>>    else:   #FIXME check view for different dimensional larries
>>>>>>>>        pass
>>>>>>>>
>>>>>>>>    if len(fail) > 0:
>>>>>>>>        msg = ''.join(fail)
>>>>>>>>        raise AssertionError, msg
>>>>>>>
>>>>>>> I get this:
>>>>>>>
>>>>>>>>> x = larry([1,2,3])
>>>>>>>>> y = larry([2,2,3], [['a', 'b', 'c']])
>>>>>>>>>
>>>>>>>>> assert_larry_equal(x, y, 'cumsum test', noreference=False)
>>>>>>> ---------------------------------------------------------------------------
>>>>>>> AssertionError:
>>>>>>> Items are not equal:
>>>>>>> item=0
>>>>>>> item=0
>>>>>>> label
>>>>>>>  ACTUAL: 0
>>>>>>>  DESIRED: 'a'
>>>>>>> Arrays are not equal
>>>>>>> x
>>>>>>> (mismatch 33.3333333333%)
>>>>>>>  x: array([1, 2, 3])
>>>>>>>  y: array([2, 2, 3])
>>>>>>>
>>>>>>> with the code below. Hmm, the AssertionError message needs to be cleaned up.
>>>>>>>
>>>>>>> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>>>>                       noreference=True, nocopy=False, verbose=True):
>>>>>>>    #assert equality of attributes of two larries
>>>>>>>
>>>>>>>    fail = []
>>>>>>>
>>>>>>>    # label
>>>>>>>    try:
>>>>>>>        assert_equal(actual.label, desired.label, 'label')
>>>>>>>    except AssertionError, err:
>>>>>>>        fail.append(str(err))
>>>>>>>
>>>>>>>    # Data array
>>>>>>>    try:
>>>>>>>        assert_equal(actual.x, desired.x, 'x')
>>>>>>>    except AssertionError, err:
>>>>>>>        fail.append(str(err))
>>>>>>>
>>>>>>>    # dtype
>>>>>>>    if dtype:
>>>>>>>        try:
>>>>>>>            assert_equal(actual.x.dtype, desired.x.dtype, 'dtype')
>>>>>>>        except AssertionError, err:
>>>>>>>            fail.append(str(err))
>>>>>>>
>>>>>>>    # Check for references or copies
>>>>>>>    if noreference:
>>>>>>>        if original is None:
>>>>>>>            raise ValueError, 'original must be a larry to run
>>>>>>> noreference check.'
>>>>>>>        try:
>>>>>>>            assert_(assert_noreference(actual, original), 'Reference found')
>>>>>>>        except AssertionError, err:
>>>>>>>            fail.append(str(err))
>>>>>>>    elif nocopy:
>>>>>>>        if original is None:
>>>>>>>            raise ValueError, 'original must be a larry to run nocopy check.'
>>>>>>>        try:
>>>>>>>            assert_(assert_nocopy(actual, original), 'copy instead of
>>>>>>> reference found')
>>>>>>>        except AssertionError, err:
>>>>>>>            fail.append(str(err))
>>>>>>>    else:   #FIXME check view for different dimensional larries
>>>>>>>        pass
>>>>>>>
>>>>>>>    # Did the test pass?
>>>>>>>    if len(fail) > 0:
>>>>>>>        # No
>>>>>>>        msg = ''.join(fail)
>>>>>>>        raise AssertionError, msg
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Another possibility to look at, might be to use a yield inside the
>>>>>> assert_larry instead of the try .. except, which might result in
>>>>>> pretty much the same outcome, except for raising the test count. But
>>>>>> if it's just in one function in doesn't increase boiler plate so it's
>>>>>> fine.
>>>>>>
>>>>>> mabye '\n'.join(fail) would make it easier to read.
>>>>>>
>>>>>> Can you send it as an attachment or add it to the blueprint? gmail
>>>>>> destroys the correct intend when I copy from the mail.
>>>>>
>>>>> https://blueprints.launchpad.net/larry/+spec/simplify-unit-testing
>>>>>
>>>>
>>>> looks good, we can make minor adjustments after we see actual test output
>>>> (One thought I tried is to remove the extra newline after
>>>> ------------------, but it doesn't make much difference.)
>>>>
>>>> the second function would be almost a copy of this with only one
>>>> assert_equal replace by assert_almost_equal for x
>>>>
>>>> def assert_larry_almost_equal(actual, desired, decimal=7, msg='',
>>>> dtype=True, original=None,
>>>>                       noreference=True, nocopy=False, verbose=True):
>>>
>>> Hmm, could we do almost equal whenever one of the dtypes is float;
>>> equal otherwise?
>>
>> Yes,
>> In general I never use assert_equal with floats. In this case, I would
>> add the decimal keyword to assert_equal.
>> Also, to be less confusing, I would set decimal=14 (or similar) so the
>> function corresponds better to its name
>
> That sounds good. Then we just need a check to see if both x's are
> scalar (almost_equal) or not (equal).
>

something strange till later

>>> y = larry([2.0,2.0,3.0], [['a', 'b', 'c']])
>>> yc = y.copy()
>>> assert_larry_equal(y, y, 'identity', original=yc, noreference=True)
Traceback (most recent call last):
  File "<pyshell#32>", line 1, in <module>
    assert_larry_equal(y, y, 'identity', original=yc, noreference=True)
  File "C:\Josef\eclipsegworkspace\larry-josef\larry-josef\la\util\testing.py",
line 54, in assert_larry_equal
    raise AssertionError, msg
AssertionError:

---------------
REFERENCE FOUND
---------------

Follow ups

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04

References

Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: josef . pktd, 2010-02-04
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
From: Keith Goodman, 2010-02-04