
openstack-qa-team team mailing list archive

Re: Running tempest in parallel

 

On 05/29/2012 09:59 PM, Daryl Walleck wrote:
On May 29, 2012, at 6:46 PM, Jay Pipes wrote:

On 05/29/2012 07:28 PM, Daryl Walleck wrote:
Well, if by complete failure you mean not configured, you're correct.
:-) Several cleanup solutions are in progress in parallel, and cleanup
should be a coding standard going forward.

Perhaps, but I haven't seen any of those...

Actually, there are a few:

  * My proposal, which was never accepted - https://review.openstack.org/#/c/5544/
  * Your base class proposal, which uses a similar methodology -
    https://review.openstack.org/#/c/7069/
  * As well as several proposals by Rohit -
    https://review.openstack.org/#/c/7543/2


There are many solutions to this problem, and I'm open to whatever folks
feel comfortable with. I know this is something everyone wants, and it's
a relatively easy problem to solve. I'd rather not force a solution in
as I know there's been some discussion about how this should be handled,
so the best path would be to push this into the agenda for this week's
meeting and try to come to a solution everyone can agree on.

Ah, sorry, I misread your response as being proposed solutions for *parallel cleanup*. Sorry, yes, I completely understand you now :)

The quotas issue is a matter of configuring the Devstack environment.
The question is whether we should expect Tempest to pass out of the box
with any Devstack configuration, which may not be realistic. I know
there was an issue not
too far back where some tests related to volumes would not pass with the
default Devstack config because there wasn't enough space allocated for
volumes. However, I think we should be able to make some sensible
default suggestions that should run in most "basic" environments.

The quotas != rate-limiting. Default quotas for Compute API resources
came from Rackspace Cloud Servers, IIRC, and so they represent a
"basic environment" pretty well, I'd say. That said, I do think that
setting quotas higher for users used in tempest makes sense.

Quotas may not equal rate limiting, but the fact that they are not
currently easily configurable (you can drop an entry in the quotas
table, but an API implementation would be much preferable) is a
limitation of the API we have, though I believe I've heard of movement
toward adding that capability. A cap of 10 servers is fine; it just adds
an extra limitation if one is trying to optimize their tests for speed.

Sure, agreed. I think what may be a good solution for Tempest, specifically, is to have the setup for testing actually create a brand new user for tempest, so that we know precisely what the quota would be. Hell, we could even have the test cases (or base test case classes) themselves create their own users for the test runs to get around quota limits...
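
To make the per-test-class user idea concrete, here is a minimal sketch, not existing Tempest code. `FakeIdentityClient`, `IDENTITY`, `IsolatedUserTestCase` and the two example test classes are all hypothetical names; the in-memory client only stands in for whatever identity API the real setup would call.

```python
import unittest
import uuid


class FakeIdentityClient(object):
    """Hypothetical in-memory stand-in for an identity service client."""

    def __init__(self):
        self.users = {}

    def create_user(self, name, password):
        # Store the user under a generated ID and hand the ID back.
        user_id = uuid.uuid4().hex
        self.users[user_id] = {"name": name, "password": password}
        return user_id

    def delete_user(self, user_id):
        del self.users[user_id]


# Assumption: one shared client for the whole run.
IDENTITY = FakeIdentityClient()


class IsolatedUserTestCase(unittest.TestCase):
    """Base class: each subclass runs as its own freshly created user,
    so its quota usage starts from a known, empty state."""

    @classmethod
    def setUpClass(cls):
        # A random suffix keeps names unique across parallel runs.
        cls.username = "tempest-%s-%s" % (cls.__name__.lower(),
                                          uuid.uuid4().hex[:8])
        cls.user_id = IDENTITY.create_user(cls.username, uuid.uuid4().hex)

    @classmethod
    def tearDownClass(cls):
        # Remove the user so nothing leaks into later runs.
        IDENTITY.delete_user(cls.user_id)


class ServersTest(IsolatedUserTestCase):
    def test_runs_as_own_user(self):
        self.assertIn(self.user_id, IDENTITY.users)


class VolumesTest(IsolatedUserTestCase):
    def test_runs_as_own_user(self):
        self.assertIn(self.user_id, IDENTITY.users)
```

Because every class gets its own user, two classes running at the same time would not compete for the same quota, which is the property that matters for parallel runs.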

At the development conference we had a discussion about how best to
proceed with parallel execution. The general consensus was:

* Switching to py.test isn't a popular idea
* Folks would rather improve nose's parallel capabilities than develop
another solution

I haven't tinkered with nose in the last few weeks, but I think it may
be possible to simply run Tempest with py.test without any
modifications. This still wouldn't be a popular solution regardless, so
let's go back to the problem we are trying to solve. I think we can all
agree that we would like a series of decisive smoke tests that run in
what is perceived to be a reasonable amount of time. So what is the bar
we're trying to reach? My personal feeling is that the 10-15 minute
range is more than reasonable using a full Linux distro, which means
with Cirros we should be able to squeak by under the 10 minute mark.

I'd agree with that, yep. Unfortunately, Tempest is taking more than
3200 seconds to complete now on some of the node providers. :(

By node providers, I'm assuming you're referring to the OpenStack CI
environment? I ask again, what do you think acceptable execution times
are?

For smoke tests, <10 minutes seems good. For the full run, keeping it <60 minutes would be good. But, we can actually split out the long run into several independent Jenkins jobs -- one for negative tests, one for keystone tests, one for admin API tests, etc...

With Ubuntu and several other full Linux OSes, the best I've seen
is 15 minute smoke tests and 40-ish minute full regression runs, which
in my case is within my pain threshold. In terms of actions, we've paid
the majority of the time price for testing scenarios. The additional
tests I'm adding (such as the SSH code) don't add significant time to
tests, but keep adding value. The same can be said of checking for
events that should be generated on server actions, and other side
effects that take minimal time to run. As we start to move in that
direction, test classes will become more streamlined/optimized. The test
class I added with the initial SSH tests is a good example of this, and
I'd be more than glad to share some other examples of how we can do more
with less. Eventually these types of optimizations can grow into our
coding standards.

What are everyone else's feelings on smoke test pain point times? I think
getting a sense of that will make any further decisions a bit more clear.

Regardless of what length of time we say, I still think we need to get
parallel execution working properly (with the existing --processes
option in nose's multiprocess plugin).
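
As a rough illustration of what "working properly" with --processes asks of a test class, here is a sketch. `FlavorsSmokeTest` and its data are hypothetical. The `_multiprocess_can_split_` attribute is the one nose's multiprocess plugin reads to decide whether the tests in a class may be handed to different worker processes. A run might then look like `nosetests --processes=4 --process-timeout=600`, where both values are assumptions to tune per environment.

```python
import unittest


class FlavorsSmokeTest(unittest.TestCase):
    """Hypothetical test class written so its tests can run in any order
    and in any worker process."""

    # Tells nose's multiprocess plugin that the fixtures are re-entrant,
    # so individual tests may be dispatched to separate workers.
    _multiprocess_can_split_ = True

    def setUp(self):
        # Per-test state only: nothing is shared between tests, which is
        # what makes splitting them across processes safe.
        self.flavors = [{"id": 1, "ram": 512}, {"id": 2, "ram": 2048}]

    def test_flavor_ids_are_unique(self):
        ids = [flavor["id"] for flavor in self.flavors]
        self.assertEqual(len(ids), len(set(ids)))

    def test_flavors_have_ram(self):
        for flavor in self.flavors:
            self.assertGreater(flavor["ram"], 0)
```

Classes that build expensive shared resources in class-level fixtures are the harder case, since those fixtures would be repeated in each worker unless the class is kept together in one process.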

Agreed. As we talked about at the conference, I think this discussion
and the overall discussion on optimization would probably be better
handled in a Skype meeting where we can all discuss this in real time.
I'm free every day this week save Friday.

Cool, I will be at the meeting tomorrow, but otherwise not really available much this week (have my folks in town). Next week, though, I'm all yours. :) Let's set up a time to Skype when we're at the IRC meeting tomorrow.

Best,
-jay

