openstack-qa-team team mailing list archive

Incorporating Stress tests into Tempest

To: openstack-qa-team@xxxxxxxxxxxxxxxxxxx
From: David Kranz <david.kranz@xxxxxxxxxx>
Date: Thu, 26 Jan 2012 12:02:33 -0500
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0) Gecko/20111222 Thunderbird/9.0.1

(After long cursing at pep8) I am preparing to check in a set of stress tests that we have developed, and I wanted to get some feedback before I attempt to do so.

The problem to be solved is that nova is a distributed, asynchronous system that is prone to race condition bugs. These bugs will not be easily found during functional testing but will be encountered by users in large deployments in a way that is hard to debug. The stress test tries to cause these bugs to happen in a more controlled environment.

The basic idea of the test is that there are a number of actions, roughly corresponding to the Compute API, that are fired pseudo-randomly at a nova cluster as fast as possible. These actions consist of what to do, how to verify success, and a state filter to make sure that the operation makes sense. For example, if the action is to reboot a server and none are active, nothing should be done. A test case is a set of actions to be performed and the probability that each action should be selected. There are also parameters controlling rate of fire and stuff like that. Currently there are only a few actions defined, but this test has discovered three bugs just with that, so I want to check it in as quickly as possible.
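A minimal sketch of the action model described above, not the code being checked in. The names Action, step, create and reboot are hypothetical, and the "cluster" is just an in-memory dict so the example runs without nova. Each action carries the three parts mentioned (what to do, how to verify success, a state filter), and a test case is a list of (action, weight) pairs.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for cluster state: server id -> status.
State = Dict[str, str]


@dataclass
class Action:
    name: str
    applicable: Callable[[State], bool]  # state filter
    run: Callable[[State], None]         # what to do
    verify: Callable[[State], bool]      # how to verify success


def _has_active(state: State) -> bool:
    return any(status == "ACTIVE" for status in state.values())


def _create(state: State) -> None:
    # Pretend the new server becomes active immediately.
    state[f"server-{len(state)}"] = "ACTIVE"


def _reboot(state: State) -> None:
    # Pretend the first active server reboots and comes back active.
    for server_id, status in state.items():
        if status == "ACTIVE":
            state[server_id] = "ACTIVE"
            return


create = Action("create", lambda s: True, _create, _has_active)
reboot = Action("reboot", _has_active, _reboot, _has_active)

TestCase = List[Tuple[Action, float]]


def step(case: TestCase, state: State, rng: random.Random) -> Optional[str]:
    """Fire one pseudo-randomly chosen action; return its name, or None if skipped."""
    actions = [action for action, _ in case]
    weights = [weight for _, weight in case]
    action = rng.choices(actions, weights=weights, k=1)[0]
    if not action.applicable(state):
        return None  # e.g. reboot selected but no server is active
    action.run(state)
    if not action.verify(state):
        raise AssertionError(f"{action.name} failed verification")
    return action.name


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a failing run can be replayed
    state: State = {}
    case = [(create, 0.3), (reboot, 0.7)]
    for _ in range(10):
        print(step(case, state, rng), state)
```

Seeding the generator is a design choice for this sketch: it keeps the firing order pseudo-random but repeatable, which helps when trying to reproduce a failure.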

I was going to check in a 'stress' directory parallel to the 'tempest' directory.

This test is not like functional tests in that it can never succeed, only fail. Ideally it will run for a long time and so cannot really be run after every checkin. It would be good to run a short-duration case as part of the functional tests though.

This test requires some new parameters for the environment. For example, one thing it does is periodically check the nova logs on all cluster nodes to make sure there are no errors, and it will fail the test if there are. So it needs the path to the ssh private key for the cluster nodes. It seems that currently we have getters defined for all of the config parameters. Should there be a new getter for every kind of config option that someone adds to Tempest, or should we just provide a method to get the parameter by string name?
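To make the two options in that question concrete, here is a rough sketch using Python 3's standard configparser. It is not Tempest's actual config code: the StressConfig class, the [stress] section, and the option names are all hypothetical. It shows a generic lookup by string name next to a dedicated getter for one option.

```python
import configparser
from typing import Optional

# Hypothetical config text; the section, option names and paths are made up.
SAMPLE = """
[stress]
ssh_key_path = /home/stack/.ssh/id_rsa
nova_logdir = /var/log/nova
"""


class StressConfig:
    def __init__(self, text: str) -> None:
        self._parser = configparser.ConfigParser()
        self._parser.read_string(text)

    def get(self, section: str, name: str,
            default: Optional[str] = None) -> Optional[str]:
        """Option 2: fetch any parameter by string name."""
        return self._parser.get(section, name, fallback=default)

    @property
    def ssh_key_path(self) -> Optional[str]:
        """Option 1: one dedicated getter per parameter."""
        return self.get("stress", "ssh_key_path")


if __name__ == "__main__":
    cfg = StressConfig(SAMPLE)
    print(cfg.ssh_key_path)
    print(cfg.get("stress", "nova_logdir"))
    print(cfg.get("stress", "not_defined", "some-default"))
```

The by-name lookup means a new option needs no code change, while the dedicated getter turns a misspelled option name into an attribute error rather than a silent default.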

This test was developed before the tempest code was available. The "what to do" part of each action is pretty similar to the tempest methods that call the API. Would it make sense at some point to extend the tempest actions to include methods that enable them to participate in a stress test?


Any other comments or issues?

Thanks.

 -David

Follow ups

Re: Incorporating Stress tests into Tempest
From: Daryl Walleck, 2012-01-31