
openstack-qa-team team mailing list archive

Re: Incorporating Stress tests into Tempest

 

Hi David, 

I wanted to reply to this last week but I was swamped. This sounds like great work! I can't say exactly how the Tempest actions would need to be extended without seeing the code and understanding the proposed changes, but I'd definitely like to see what you have so we can figure out how it could integrate into our work.

Daryl


On Jan 26, 2012, at 11:02 AM, David Kranz wrote:

> (after long cursing at pep8) I am preparing to check in a set of stress tests that we have developed and wanted to get some feedback before I attempt to do so.
> 
> The problem to be solved is that nova is a distributed, asynchronous system that is prone to race condition bugs. These bugs will not be easily found during
> functional testing but will be encountered by users in large deployments in a way that is hard to debug. The stress test tries to cause these bugs to happen in a more
> controlled environment.
> 
> The basic idea of the test is that there are a number of actions, roughly corresponding to the Compute API, that are fired pseudo-randomly at a nova cluster as fast as possible. Each action consists of what to do, how to verify success, and a state filter to make sure that the operation makes sense. For example, if the action is to reboot a server and none are active, nothing should be done. A test case is a set of actions to be performed and the probability that each action should be selected. There are also parameters controlling the rate of fire and similar settings. Currently there are only a few actions defined, but the test has already discovered three bugs with just those, so I want to check it in as quickly as possible.
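
A minimal sketch of the action model described above, not the actual stress test code. The names `Action`, `pick_action`, `run_case` and the `state` dictionary are hypothetical, and the loop is synchronous for simplicity, whereas the real test fires actions at a live cluster as fast as possible.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical cluster state: a dict holding the names of active servers.
State = Dict[str, List[str]]


@dataclass
class Action:
    """One stress action (all field names are illustrative assumptions)."""
    name: str
    applies: Callable[[State], bool]  # state filter: does the action make sense now?
    run: Callable[[State], None]      # what to do
    verify: Callable[[State], bool]   # how to verify success


# A test case pairs each action with its selection weight (probability).
TestCase = List[Tuple[Action, float]]


def pick_action(case: TestCase, state: State, rng: random.Random) -> Optional[Action]:
    """Choose an action by weight; return None if its state filter rejects it."""
    actions = [action for action, _ in case]
    weights = [weight for _, weight in case]
    chosen = rng.choices(actions, weights=weights, k=1)[0]
    return chosen if chosen.applies(state) else None


def run_case(case: TestCase, state: State, steps: int, seed: int = 0) -> int:
    """Fire `steps` pseudo-random actions; raise on the first failed verification."""
    rng = random.Random(seed)
    executed = 0
    for _ in range(steps):
        action = pick_action(case, state, rng)
        if action is None:
            continue  # e.g. reboot was selected but no server is active
        action.run(state)
        if not action.verify(state):
            raise AssertionError(f"{action.name} failed verification")
        executed += 1
    return executed
```

Seeding the random generator is a design choice in this sketch: it lets a failing sequence of actions be replayed, which helps when a race condition is hard to reproduce.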
> 
> I was going to check in a 'stress' directory parallel to the 'tempest' directory.
> 
> This test is not like functional tests in that it can never succeed, only fail. Ideally it will run for a long time and so cannot really be run
> after every checkin. It would be good to run a short-duration case as part of the functional tests though.
> 
> This test requires some new parameters for the environment. For example, one thing it does is periodically check the nova logs
> on all cluster nodes to make sure there are no errors, and it will fail the test if there are. So it needs the path to the ssh private key for
> the cluster nodes. It seems that currently we have getters defined for all of the config parameters. Should there be a new getter for
> every kind of config option that someone adds to Tempest, or should we just provide a method to get the parameter by string name?
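
To make the two options concrete, here is a rough sketch that puts a per-option getter next to a lookup by string name. It uses Python's standard `configparser` and is not Tempest's actual config code. The `StressConfig` class, the `[stress]` section, and the `ssh_key_path` and `max_rate` options are all assumed names for illustration.

```python
import configparser

# Hypothetical config contents; section, option names, and values are made up.
SAMPLE = """
[stress]
ssh_key_path = /path/to/key
max_rate = 10
"""


class StressConfig:
    """Illustrative wrapper only; not the Tempest configuration class."""

    def __init__(self, text):
        self._parser = configparser.ConfigParser()
        self._parser.read_string(text)

    def get(self, section, name, default=None):
        """Generic lookup by string name; returns `default` if missing."""
        return self._parser.get(section, name, fallback=default)

    @property
    def ssh_key_path(self):
        """Dedicated getter: one of these would be needed per option."""
        return self.get("stress", "ssh_key_path")


config = StressConfig(SAMPLE)
key_by_name = config.get("stress", "ssh_key_path")
key_by_getter = config.ssh_key_path
```

The generic `get` means a new option needs no code change, while a dedicated getter gives each option a single documented place that can later validate or convert its value.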
> 
> This test was developed before the tempest code was available. The "what to do" part of each action is pretty similar to the tempest methods that
> call the API. Would it make sense at some point to extend the tempest actions to include methods that enable them to participate in a stress test?
> 
> Any other comments or issues?
> 
> Thanks.
> 
> -David
> 
> -- 
> Mailing list: https://launchpad.net/~openstack-qa-team
> Post to     : openstack-qa-team@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack-qa-team
> More help   : https://help.launchpad.net/ListHelp


