Re: Use of tearDown in tempest

Thanks, Daryl. This is a complicated issue and I will try to spell out my concern more clearly.

On 6/1/2012 6:19 PM, Daryl Walleck wrote:
Hi David,

The per-test fixtures are there for two reasons. One is stability: if we shared one server among the tests and any previous test tainted it or left it in a bad state, the rest of the suite would fail for unclear reasons. The second is parallel execution: we cannot perform all of those actions on a single server at once.
These statements are obviously true, but I am having trouble reconciling them with the desire you express below for Tempest to serve as an acceptance test for a production deploy. If a test checked its results in a way that verified its server was not tainted or left in a bad state, there would be no concern about running other tests on that server. If a test does not do that, then it could "pass" while leaving its server in a bad state. If we never reuse a server, such failures will never be detected by Tempest.
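
To make the idea concrete, here is a rough sketch of the kind of check I mean. The names (get_server, the 'ACTIVE' status key) are stand-ins for whatever the real client exposes, not actual tempest helpers:

    def assert_server_untainted(client, server_id):
        # Run after a test finishes: a test that passes this check has
        # left its server safe for the next test to reuse; a test that
        # fails it has "passed" while leaving the server in a bad state.
        server = client.get_server(server_id)
        assert server['status'] == 'ACTIVE', (
            'server %s left in state %s' % (server_id, server['status']))

Hooked into tearDown, a check like this would let tests share a server without the stability risk you describe, and would also catch the "silent taint" failures that per-test servers hide.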

I was really thinking about two things I noticed in the tests:

1. They spend a huge amount of time in server create, delete, resize, rebuild, etc., even though those operations represent a minuscule fraction of the code being tested.

2. There are a lot of tests that take the same path through the code but vary parameters in a way that is not thorough enough to say an API is fully tested, and that would be better covered by unit tests that bypass the expensive code shared by all the cases (a sketch of what I mean follows below). The number of these cases varies widely between the different test classes in Tempest, and I am not sure what principles have been used to decide how many variants Tempest should have.
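
For example, something along these lines; the validation function is made up purely for illustration, but the point is that the variants exercise the validation logic directly, with no server ever built:

    import unittest

    def validate_flavor_ref(flavor_ref):
        # Hypothetical stand-in for the API-side argument validation
        # that the parameter-variant tests are really exercising.
        if flavor_ref is None or not str(flavor_ref).strip():
            raise ValueError('flavor_ref must be non-empty')

    class FlavorRefValidationTest(unittest.TestCase):
        def test_rejects_bad_flavor_refs(self):
            # Each variant costs microseconds here, versus a full
            # create/delete cycle in an integration test.
            for bad in (None, '', '   '):
                self.assertRaises(ValueError, validate_flavor_ref, bad)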

I certainly agree that we should optimize Tempest as best we can. Tempest is a black-box, integration test suite. In my view, its purpose is to thoroughly test the integration of OpenStack components from end to end. For quick testing, we have unit and component-level integration tests. If those suites are lacking, then more effort should be thrown behind them as well. I'm open to making optimizations wherever we can, and I think there are still quite a few things we can do. However, what I look for from Tempest is confidence in making a production deploy of OpenStack, and there are corners I would rather not cut. I personally would not be comfortable delegating tests for basic tasks such as resize and rebuild to a daily run. What do you think acceptable run times for a smoke grouping and a full regression run should be?

Daryl
For a full regression run, it should take as long as it takes after being optimized appropriately; I don't see a shortcut there. I was imagining that a full regression run would happen every day. I don't know enough about the inner workings of Jenkins to say how long a smoke grouping should take, but it is not just about time: even if we got the time down through massive parallelization, the run might still use an unacceptable amount of (real) server resources.

I think it would be helpful to agree on:

a) How thorough Tempest tests should be about variants of arguments for both positive and negative tests in general, for full regression and smoke.

b) How (a) might be different for cases where the test cases take a long time.

c) Which cases, if any, would make sense to serve from a pool of pre-allocated servers rather than spinning up new ones for each test (a rough sketch of such a pool follows below).
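
On (c), I am imagining something along these lines. Everything here is hypothetical and sketched against an in-memory stub rather than the real tempest client:

    class StubComputeClient(object):
        # Stand-in for a real compute client; tracks servers in memory.
        def __init__(self):
            self.servers, self.next_id = {}, 0

        def create_server(self, name):
            self.next_id += 1
            self.servers[self.next_id] = {'name': name, 'status': 'ACTIVE'}
            return self.next_id

        def get_server(self, server_id):
            return self.servers[server_id]

        def delete_server(self, server_id):
            del self.servers[server_id]

    class ServerPool(object):
        # Hand out pre-built servers to tests instead of building one
        # per test; a returned server is only reused if still healthy.
        def __init__(self, client, size):
            self.client = client
            self.free = [client.create_server('pool-%d' % i)
                         for i in range(size)]

        def acquire(self):
            return self.free.pop()

        def release(self, server_id):
            if self.client.get_server(server_id)['status'] == 'ACTIVE':
                self.free.append(server_id)
            else:
                # Tainted server: replace it rather than reuse it.
                self.client.delete_server(server_id)
                self.free.append(self.client.create_server('replacement'))

The acquire/release check would also give us the state verification discussed above for free, since a tainted server is detected at the moment it is handed back.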

 -David

On Jun 1, 2012, at 4:03 PM, David Kranz wrote:

I am a little confused about this. Most test classes define tearDownClass that frees resources allocated in setUpClass. But two of the classes deviate from this.

ServerActionsTest uses setUp and tearDown and creates a new server in setUp, which I believe means a new server is created before each test method runs. This test is very slow, taking 9 minutes on a cluster with three hefty compute nodes. Many of the methods could reuse the same server, and the negative tests don't need to create one at all. Unfortunately, I think a lot of that time is spent just doing resize. I think we should consider making this test nightly-build only.
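
For reference, the difference between the two fixture styles, as a sketch; the FakeClient is a minimal stand-in so the example is self-contained, not the real tempest client:

    import unittest

    class FakeClient(object):
        # Minimal stand-in for a compute client.
        def create_server(self, name):
            return 42  # would be a real server id

        def delete_server(self, server_id):
            pass

    class PerTestServer(unittest.TestCase):
        # The ServerActionsTest style: every test method pays for a
        # full server build and teardown.
        client = FakeClient()

        def setUp(self):
            self.server_id = self.client.create_server('fresh')

        def tearDown(self):
            self.client.delete_server(self.server_id)

    class PerClassServer(unittest.TestCase):
        # The style most tempest classes use: one server, built once,
        # shared by every test method in the class.
        @classmethod
        def setUpClass(cls):
            cls.client = FakeClient()
            cls.server_id = cls.client.create_server('shared')

        @classmethod
        def tearDownClass(cls):
            cls.client.delete_server(cls.server_id)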

ServerDetailsNegativeTest has methods that create lots of servers and has a tearDown method that deletes them after each test method. That seems unnecessary.
This test is very slow, taking 3 minutes on a cluster with three hefty compute nodes. And that is with 15 tests being skipped pending bug fixes.
It also has a tearDownClass method that deletes all running servers whether this test created them or not. That seems pretty bad. Why is it doing this?

Does anyone have any comments about this?

-David

--
Mailing list: https://launchpad.net/~openstack-qa-team
Post to     : openstack-qa-team@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~openstack-qa-team
More help   : https://help.launchpad.net/ListHelp