larry-discuss team mailing list archive
Message #00026
Re: [Blueprint simplify-unit-testing] Create a larry specific assert function to simplify unit testing
On Thu, Feb 4, 2010 at 10:25 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Thu, Feb 4, 2010 at 10:08 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Thu, Feb 4, 2010 at 9:11 AM, <josef.pktd@xxxxxxxxx> wrote:
>>> On Thu, Feb 4, 2010 at 11:17 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>> On Thu, Feb 4, 2010 at 8:04 AM, <josef.pktd@xxxxxxxxx> wrote:
>>>>> On Thu, Feb 4, 2010 at 10:33 AM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>> On Wed, Feb 3, 2010 at 7:04 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>>>>> On Wed, Feb 3, 2010 at 9:54 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>> On Wed, Feb 3, 2010 at 9:16 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>>>>>> On Wed, Feb 3, 2010 at 6:06 PM, joep <josef.pktd@xxxxxxxxx> wrote:
>>>>>>>>>> Blueprint changed by joep:
>>>>>>>>>>
>>>>>>>>>> Whiteboard set to:
>>>>>>>>>>
>>>>>>>>>> assert_larry
>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L49
>>>>>>>>>>
>>>>>>>>>> and the method check_function in test class
>>>>>>>>>>
>>>>>>>>>> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head%3A/la/tests/deflarry_nose_test.py#L128
>>>>>>>>>>
>>>>>>>>>> already contain the extracted boilerplate assert function to compare a
>>>>>>>>>> larry with desired data matrix x and labels
>>>>>>>>>>
>>>>>>>>>> the latter has a view keyword to choose whether to verify nocopy or
>>>>>>>>>> noreference
>>>>>>>>>>
>>>>>>>>>> used in nosetests for larry methods
>>>>>>>>>
>>>>>>>>> I moved this to the larry-discuss list since it is hard to discuss it
>>>>>>>>> through the whiteboard on the blueprint.
>>>>>>>>>
>>>>>>>>> Yes, let's start with those functions and then prettify them.
>>>>>>>>>
>>>>>>>>> What should the signature be?
>>>>>>>>>
>>>>>>>>> The current signature is:
>>>>>>>>>
>>>>>>>>> assert_larry(opname, la, t, lab, msgstr)
>>>>>>>>>
>>>>>>>>> How about changing that to the same signature as np.testing.assert_equal? So:
>>>>>>>>>
>>>>>>>>> assert_equal(actual, desired, err_msg='', verbose=True)
>>>>>>>>>
>>>>>>>>> Then we don't have to separate the data and the label. And instead of
>>>>>>>>> nancode we can use the numpy nan aware assert in numpy 1.4.
>>>>>>>>>
>>>>>>>>> Oops...dinner time!
>>>>>>>>
>>>>>>>> I'm just browsing the code and adding some notes.
>>>>>>>>
>>>>>>>> Yes, matching the numpy signature for assert_equal and
>>>>>>>> assert_almost_equal is a good idea.
>>>>>>>> If you require numpy 1.4 for the test suite, then most of the
>>>>>>>> boilerplate is gone (nan handling)
>>>>>>>>
>>>>>>>> I think that assert_larry_equal is equivalent to
>>>>>>>> assert_equal(la1.x, la2.x)
>>>>>>>> assert_equal(la1.label, la2.label)
>>>>>>>>
>>>>>>>> slicing_test.py/test_slicing and test_morph use directly an assert_
>>>>>>>> which could be replaced by np.testing.assert_equal
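
A minimal sketch of the two-assert equivalence described above, assuming numpy 1.4 or later so that np.testing.assert_equal treats NaNs in matching positions as equal. FakeLarry and assert_larry_equal_simple are hypothetical stand-ins for illustration and are not part of la:

```python
import numpy as np
from numpy.testing import assert_equal


class FakeLarry:
    """Hypothetical stand-in for a larry: data array `x` plus a `label` list."""

    def __init__(self, x, label):
        self.x = np.asarray(x)
        self.label = label


def assert_larry_equal_simple(actual, desired):
    # assert_equal treats NaNs at the same positions as equal, so no
    # explicit nancode substitution is needed before comparing.
    assert_equal(actual.x, desired.x)
    assert_equal(actual.label, desired.label)


la1 = FakeLarry([1.0, np.nan, 3.0], [['a', 'b', 'c']])
la2 = FakeLarry([1.0, np.nan, 3.0], [['a', 'b', 'c']])
assert_larry_equal_simple(la1, la2)  # passes despite the NaN
```

Note that this version stops at the first failing assert, so a mismatch in x hides any mismatch in the labels.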
>>>>>>>
>>>>>>> just another comment:
>>>>>>>
>>>>>>> there are 3 patterns in the test suite corresponding to the previous comments
>>>>>>>
>>>>>>> * python unittest with boilerplate
>>>>>>> * numpy 1.3 nosetests with explicit nan handling
>>>>>>> * numpy 1.4 nosetests where the numpy.testing asserts do the nan handling
>>>>>>>
>>>>>>> In the 3rd option an explicit function assert_larry_xxx is not really
>>>>>>> necessary, and the test for x and labels and nocopy/noreference could
>>>>>>> also be "yielded" directly from the test function/method.
>>>>>>
>>>>>> Instead of making many unit tests out of one call to larry's
>>>>>> assert_equal, which would occur if we used yield, I think it is better
>>>>>> for the whole thing to be one unit test. That would mean that we'd have
>>>>>> to wrap calls to np.testing.assert_equal in try...except blocks,
>>>>>> collect any error messages, and raise an AssertionError at the end of
>>>>>> the function if needed.
>>>>>
>>>>> several asserts don't have to be yielded, if you want them to be only
>>>>> one unittest that fails at the first assertion error, e.g.
>>>>> def test_movingsum32(self) in deflarry_nose_test.py
>>>>
>>>> I think it is better for debugging if all the info is printed out when
>>>> the test fails. For example, if a test fails on the label, I'd like to
>>>> know if it passed on the array.
>>>>
>>>>>>
>>>>>> As for signature, how about
>>>>>>
>>>>>> assert_equal(actual, desired, msg='', dtype=True, noreference=True,
>>>>>> nocopy=False, verbose=True)
>>>>>
>>>>> this doesn't work, since noreference and nocopy also need the original larry,
>>>>> the signature of check function is
>>>>
>>>> Good point.
>>>>
>>>> An alternative to passing in the original and the actual is to pass in
>>>> the original and the function that modifies the original to produce
>>>> the actual. Then two larrys are always passed in. But that sounds
>>>> messy.
>>>>
>>>>> check_function(self, t, label, p, orig, view=False)
>>>>>
>>>>> the signature could be
>>>>> assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>>>> noreference=True,
>>>>> nocopy=False, verbose=True)
>>>>
>>>> Should we go with the signature above?
>>>>
>>>>> but for noreference=True check an original has to be included
>>>>>
>>>>> with
>>>>> assert_larry_equal(actual, desired, original, msg='', dtype=True,
>>>>> noreference=True,
>>>>> nocopy=False, verbose=True)
>>>>>
>>>>> the original would have to be passed in even if both noreference and
>>>>> nocopy are False
>>>>>
>>>>>
>>>>>> So by default the dtype would be compared. Sometimes you expect the
>>>>>> dtype to change so maybe an option would be to pass in the dtype for
>>>>>> the "desired" larry.
>>>>>>
>>>>>> I think that would cover the most common use cases.
>>>>>
>>>>> Yes, I think so for the comparison of two larrys
>>>>>
>>>>> slicing tests e.g. test_slicing, would need a new function for
>>>>> noreference, nocopy that verifies e.g. that a slice is really a view.
>>>>> (I don't know what the numpy tests for view versus copy are for fancy
>>>>> slicing/indexing)
>>>>>
>>>>> Josef
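
On the view-versus-copy question for slicing, here is a rough sketch of one way to check it. It assumes a numpy recent enough to provide np.shares_memory (older versions only have the more conservative np.may_share_memory). The helper name is_view is made up for this example:

```python
import numpy as np


def is_view(candidate, original):
    # Hypothetical helper: True if the two arrays share any memory.
    return np.shares_memory(candidate, original)


x = np.arange(10)

basic = x[2:5]        # basic slicing returns a view of x
fancy = x[[2, 3, 4]]  # integer-array (fancy) indexing returns a copy

print(is_view(basic, x))  # True
print(is_view(fancy, x))  # False
```

A nocopy test would assert is_view(...) and a noreference test would assert the opposite. For a larry the labels would need a similar check, which this sketch does not cover.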
>>>
>>> I attached a draft of the assert_larry_equal function. It imports some
>>> helper functions from test.py in the la/tests folder, which
>>> could/should also be rewritten into assert form.
>>> It's a draft, I haven't checked if it is working correctly yet for all cases.
>>>
>>> Also, I think it would be better to add the testing helper functions
>>> to la.utils so that they can be imported and don't need to have a copy
>>> in the test folder, as in the case of test.py.
>>
>> I made a few tweaks. To get all tests to run even if the first one
>> (check labels) fails, I wrapped the asserts in try...except (just the
>> first two for now).
>>
>> Also added a check that original is not None when doing a noreference
>> check or nocopy check. And changed an assert to assert_equal.
>>
>> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>>                        noreference=True, nocopy=False, verbose=True):
>>     # assert equality of attributes of two larries
>>
>>     fail = []
>>
>>     try:
>>         assert_equal(actual.x, desired.x, 'x')
>>     except AssertionError as err:
>>         fail.append(str(err))
>>     try:
>>         assert_equal(actual.label, desired.label, 'label')
>>     except AssertionError as err:
>>         fail.append(str(err))
>>
>>     if dtype:
>>         msg = printfail(actual.x.dtype, desired.x.dtype, 'x.dtype')
>>         assert_equal(actual.x.dtype, desired.x.dtype, msg)
>>
>>     if noreference:
>>         if original is None:
>>             raise ValueError('original must be a larry to run '
>>                              'noreference check.')
>>         assert_(assert_noreference(actual, original), 'Reference found')
>>     elif nocopy:
>>         if original is None:
>>             raise ValueError('original must be a larry to run nocopy check.')
>>         assert_(assert_nocopy(actual, original),
>>                 'copy instead of reference found')
>>     else:  # FIXME check view for different dimensional larries
>>         pass
>>
>>     if len(fail) > 0:
>>         msg = ''.join(fail)
>>         raise AssertionError(msg)
>
> I get this:
>
> >> x = larry([1,2,3])
> >> y = larry([2,2,3], [['a', 'b', 'c']])
> >>
> >> assert_larry_equal(x, y, 'cumsum test', noreference=False)
> ---------------------------------------------------------------------------
> AssertionError:
> Items are not equal:
> item=0
> item=0
> label
> ACTUAL: 0
> DESIRED: 'a'
> Arrays are not equal
> x
> (mismatch 33.3333333333%)
> x: array([1, 2, 3])
> y: array([2, 2, 3])
>
> with the code below. Hmm, the AssertionError message needs to be cleaned up.
>
> def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
>                        noreference=True, nocopy=False, verbose=True):
>     # assert equality of attributes of two larries
>
>     fail = []
>
>     # label
>     try:
>         assert_equal(actual.label, desired.label, 'label')
>     except AssertionError as err:
>         fail.append(str(err))
>
>     # Data array
>     try:
>         assert_equal(actual.x, desired.x, 'x')
>     except AssertionError as err:
>         fail.append(str(err))
>
>     # dtype
>     if dtype:
>         try:
>             assert_equal(actual.x.dtype, desired.x.dtype, 'dtype')
>         except AssertionError as err:
>             fail.append(str(err))
>
>     # Check for references or copies
>     if noreference:
>         if original is None:
>             raise ValueError('original must be a larry to run '
>                              'noreference check.')
>         try:
>             assert_(assert_noreference(actual, original), 'Reference found')
>         except AssertionError as err:
>             fail.append(str(err))
>     elif nocopy:
>         if original is None:
>             raise ValueError('original must be a larry to run nocopy check.')
>         try:
>             assert_(assert_nocopy(actual, original),
>                     'copy instead of reference found')
>         except AssertionError as err:
>             fail.append(str(err))
>     else:  # FIXME check view for different dimensional larries
>         pass
>
>     # Did the test pass?
>     if len(fail) > 0:
>         # No
>         msg = ''.join(fail)
>         raise AssertionError(msg)
This cleans up the output (but not the code). For example:
>> x = larry([1,2,3])
>> y = larry([2.0,2.0,3.0], [['a', 'b', 'c']])
>>
>> assert_larry_equal(x, y, 'cumsum test', noreference=False)
---------------------------------------------------------------------------
AssertionError:
LABEL
-----
Items are not equal:
item=0
item=0
ACTUAL: 0
DESIRED: 'a'
X DATA ARRAY
------------
Arrays are not equal
(mismatch 33.3333333333%)
x: array([1, 2, 3])
y: array([ 2., 2., 3.])
DTYPE
-----
Items are not equal:
ACTUAL: dtype('int64')
DESIRED: dtype('float64')
Code:
def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
                       noreference=True, nocopy=False, verbose=True):
    # assert equality of attributes of two larries
    fail = []
    # label
    try:
        assert_equal(actual.label, desired.label)
    except AssertionError as err:
        fail.append('\n\nLABEL\n-----\n' + str(err))
    # Data array
    try:
        assert_equal(actual.x, desired.x)
    except AssertionError as err:
        fail.append('\n\nX DATA ARRAY\n------------\n' + str(err))
    # dtype
    if dtype:
        try:
            assert_equal(actual.x.dtype, desired.x.dtype)
        except AssertionError as err:
            fail.append('\n\nDTYPE\n-----\n' + str(err))
    # Check for references or copies
    if noreference:
        if original is None:
            raise ValueError('original must be a larry to run '
                             'noreference check.')
        try:
            assert_(assert_noreference(actual, original))
        except AssertionError as err:
            fail.append('\n\nREFERENCE FOUND\n---------------\n' +
                        str(err))
    elif nocopy:
        if original is None:
            raise ValueError('original must be a larry to run nocopy check.')
        try:
            assert_(assert_nocopy(actual, original))
        except AssertionError as err:
            fail.append('\n\nCOPY INSTEAD OF REFERENCE FOUND'
                        '\n-------------------------------\n' + str(err))
    else:  # FIXME check view for different dimensional larries
        pass
    # Did the test pass?
    if len(fail) > 0:
        # No
        msg = ''.join(fail)
        raise AssertionError(msg)
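
For anyone who wants to try the collect-all-failures idea without la installed, here is a self-contained sketch in Python 3. It is only an illustration under several assumptions: FakeLarry is a hypothetical stand-in for larry, the small check helper is made up for this example, the reference test uses np.shares_memory on the data array only (labels are not checked for shared references), and the verbose argument is dropped.

```python
import numpy as np
from numpy.testing import assert_equal


class FakeLarry:
    """Hypothetical stand-in for la.larry: data array `x`, list-of-lists `label`."""

    def __init__(self, x, label=None):
        self.x = np.asarray(x)
        if label is None:
            # Default labels are integer ranges, one list per axis.
            label = [list(range(n)) for n in self.x.shape]
        self.label = label


def assert_larry_equal(actual, desired, msg='', dtype=True, original=None,
                       noreference=True, nocopy=False):
    """Run every check, collect the failures, raise one AssertionError."""
    if (noreference or nocopy) and original is None:
        raise ValueError('original must be a larry to run a '
                         'noreference or nocopy check.')
    fail = []

    def check(title, *args):
        # Run one assert_equal and store its message under a heading.
        try:
            assert_equal(*args)
        except AssertionError as err:
            heading = '\n\n%s\n%s\n' % (title, '-' * len(title))
            fail.append(heading + str(err).strip())

    check('LABEL', actual.label, desired.label)
    check('X DATA ARRAY', actual.x, desired.x)
    if dtype:
        check('DTYPE', actual.x.dtype, desired.x.dtype)
    if noreference:
        shared = np.shares_memory(actual.x, original.x)
        check('REFERENCE FOUND', shared, False)
    elif nocopy:
        shared = np.shares_memory(actual.x, original.x)
        check('COPY INSTEAD OF REFERENCE FOUND', shared, True)

    if fail:
        raise AssertionError(msg + ''.join(fail))


x = FakeLarry([1, 2, 3])
y = FakeLarry([2.0, 2.0, 3.0], [['a', 'b', 'c']])
try:
    assert_larry_equal(x, y, 'cumsum test', noreference=False)
except AssertionError as err:
    print(err)  # one message with LABEL, X DATA ARRAY and DTYPE sections
```

Routing every check through one helper keeps the headings consistent and removes the repeated try...except blocks, which is the part of the code above that still needs cleaning up.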