launchpad-dev team mailing list archive

Page and Windmill Test Experiment

Hi, all.

I have some thoughts about our page and Windmill tests, and I'd like
to propose an experiment related to them.  This is inspired by
my own thinking about how to best test web UI and by Rob's recent
cost/benefit analysis arguments for optional reviews.

Here's my assessment of the problem:

The burden we carry for our current page tests and Windmill tests does
not match the benefit we get from them.

The burden:

* Much longer test run times (locally, the bugs module run drops from
45 minutes to 20 without them)
* Fragile tests that block landings
* Fragile infrastructure (see issues with Windmill tests under load)
* Confusion over how to best test UI (page tests vs. integration vs.
browser unit tests)

All of these slow down development substantially.

The benefit:

* catches web UI regressions

Perhaps there's more to say in terms of benefit (acceptance tests,
happy path testing, etc.), but in practice I think the only benefit is
that we catch regressions in our UI.  In my experience, Windmill has
delivered that benefit more than page tests have.  I can't think of a
single page test failure I've hit that wasn't a bad test (i.e. one
relying on HTML formatting).  I'm sure page tests have caught real
regressions somewhere (surely they have! :-)), but that hasn't been my
experience.

Here's my argument:

Since we're moving to continuous rollouts and having to focus more on
daily QA, any regression these tests would catch should be caught in
QA.  If QA finds problems, we roll back the rev and try again.  If a
regression slips past QA, we roll back as soon as we do catch it.  We
can rely on ourselves and beta users to catch these regressions, and
by not carrying the burden of the tests we get quicker cycles and
landings.  I'm not suggesting manual testing is better.  In a perfect
world, we would have a fast test suite and get the benefit of both,
but for now the test bloat is doing more harm than the benefit it
brings.  I think we can trust our QA process and users until we have a
test suite fast enough to make web UI tests worth worrying about
again.  All IMHO, of course. :-)

I'd like to propose an experiment to see if this holds true.  For 3
months (or until the Thunderdome), we would:

1.) disable page tests and Windmill tests
2.) leave Windmill around for running YUI tests and API integration tests
3.) any time we touch UI during the experiment, we must add unit test
coverage for browser classes (if the page test was acting as a unit
test for the view)
4.) we closely track any regression that slips through and the impact
of that regression

Some notes about #2, #3, and #4:

For #2, I think we should still run the JS unit tests automatically,
and in my experience Windmill is not fragile for YUI tests.  Migrating
to jstestdriver, if we decide post-experiment to abandon Windmill
completely, should be a separate issue, IMHO.  Francis noted yesterday
that the API tests for the JS LP.client have no other coverage, so we
should leave those around as well.  Again, I don't think these change
much or are subject to the same fragility.

My assumption about #3 is that we have tested some bits in page tests
that have no other test coverage.  For example, there are view methods
that should be tested in browser unit tests but are only covered by
the page test.  I'm proposing we disable these tests but leave them
around, and that we add unit test coverage where appropriate when
touching UI.  I am not suggesting browser unit tests attempt to
replicate story tests.  It would also be good to run the page tests
locally when working on UI to see if something needs unit test
coverage, but that can't really be enforced by the experiment.
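
To make the kind of coverage I mean for #3 concrete, here's a minimal
sketch using plain unittest.  The BugListingView class, its
visible_bugs and page_title properties, and the FakeBug stand-in are
all hypothetical, not real Launchpad code; in the tree we'd use our
usual view test helpers rather than this standalone setup.  The point
is just that the view's Python logic gets exercised directly, with no
HTML rendering, so the test survives template changes that would break
a page test.

    import unittest


    class FakeBug:
        """Stand-in bug object with just enough state for the view logic."""
        def __init__(self, title, is_private=False):
            self.title = title
            self.is_private = is_private


    class BugListingView:
        """Hypothetical browser view whose logic might today be covered
        only by a page test."""
        def __init__(self, context_bugs):
            self.context_bugs = context_bugs

        @property
        def visible_bugs(self):
            # Private bugs are filtered out of the default listing.
            return [bug for bug in self.context_bugs if not bug.is_private]

        @property
        def page_title(self):
            return "Bugs (%d)" % len(self.visible_bugs)


    class TestBugListingView(unittest.TestCase):
        """Exercise the view's Python logic directly; no HTML involved."""

        def test_private_bugs_hidden(self):
            view = BugListingView(
                [FakeBug("public"), FakeBug("secret", is_private=True)])
            self.assertEqual(
                ["public"], [bug.title for bug in view.visible_bugs])

        def test_page_title_counts_visible_bugs_only(self):
            view = BugListingView(
                [FakeBug("public"), FakeBug("secret", is_private=True)])
            self.assertEqual("Bugs (1)", view.page_title)


    if __name__ == "__main__":
        unittest.main()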

As for #4, we don't currently track regressions, so I propose that
going forward we make use of a regression tag on *any* regression bug.
This will also help with Robert's experiment.  And as we fix these
bugs, we tag them with ui-test-experiment if a page or Windmill test
would have caught the regression.
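
At assessment time, pulling the numbers out should be easy enough with
launchpadlib.  Here's a rough sketch of what I have in mind; treat the
tag names and the exact searchTasks parameters as assumptions on my
part, not a settled convention:

    # Rough sketch only: count regressions filed during the experiment
    # and the subset a page/Windmill test would have caught.  Tag names
    # and searchTasks parameters are assumptions, not settled usage.
    from launchpadlib.launchpad import Launchpad

    lp = Launchpad.login_anonymously('ui-test-experiment', 'production')
    project = lp.projects['launchpad']

    all_regressions = project.searchTasks(tags=['regression'])
    ui_caught = project.searchTasks(
        tags=['regression', 'ui-test-experiment'], tags_combinator='All')

    print('Regressions filed: %d' % len(list(all_regressions)))
    print('Would have been caught by page/Windmill tests: %d'
          % len(list(ui_caught)))
    for task in ui_caught:
        print(' - %s [%s]' % (task.title, task.importance))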

After the experiment completes, we should make an assessment.  Did the
decreased test run times affect cycle time?  Did the simpler UI
testing strategy affect cycle time?  Did we introduce regressions that
tests would have caught?  Was the impact of such regressions serious
or minor?  And then, should we continue with this, do something new,
or re-enable the tests and return to what we had before?

What do you all think?

Cheers,
deryck


-- 
Deryck Hodge
https://launchpad.net/~deryck
http://www.devurandom.org/


