openstack-qa-team team mailing list archive

Re: Tempest Gating

To: corvus@xxxxxxxxxxxx (James E. Blair)
From: Dan Smith <danms@xxxxxxxxxx>
Date: Wed, 22 Aug 2012 13:13:35 -0700
Cc: openstack-qa-team@xxxxxxxxxxxxxxxxxxx
In-reply-to: <87pq6i4u1d.fsf@meyer.lemoncheese.net> (James E. Blair's message of "Wed, 22 Aug 2012 12:15:58 -0700")
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.3 (gnu/linux)

JB> 1) We designed the devstack-gate with a facility to install the SSH
JB> key of the developer whose change failed onto the VM and give it to
JB> them for debugging purposes -- however, we've yet to have a
JB> devstack-gate node provider give us permission to hand out VMs like
JB> that.  I'm sorry that hasn't happened, though in the meantime, if
JB> there's any useful information (logs, output of ps or ip commands,
JB> etc) you'd like to be copied off of the machines that we aren't
JB> already doing, please let me know (or submit a patch to
JB> openstack-ci/devstack-gate).

Oh, that'd be sweet. I think I'd take Jenkins' name in vain a lot less
if I could get in and check things out like that. Thanks for aiming this
high, despite the obvious problem it generates.

JB> But as we get to the point where the non-smoke tests are failing due
JB> to real problems with the core projects rather than tempest itself,
JB> we should look at making those tests part of the wider gate (either
JB> by making them smoke tests, or expanding the gate to run more than
JB> just the smoke tests for all projects).

Perhaps changing the cut point for what is considered a smoke test to
include anything that isn't likely to be fragile, or adding a third
grouping that includes a wider selection of tests would be
appropriate. The imbalance between tempest and the other projects is
(IMHO) the actual problem, as you identified, not just that tempest
is gated on the full set, of course.

JB> Of course, run time is a consideration, but we wrote Zuul largely to
JB> deal with this problem -- Zuul performs gate tests in parallel (but
JB> still tests each change individually as it will be merged).  So
JB> while we definitely would like to keep run-time as short as
JB> possible, running the Jenkins jobs in parallel means we don't have
JB> to wait for each change to be tested in series.  So in short, do
JB> please make tempest run as fast as possible, but we want to run
JB> useful tests, which takes time, and that's something we're prepared
JB> to deal with.
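
To make the "parallel, but as it will be merged" idea concrete, here is
a rough sketch. It assumes each queued change is tested on top of the
changes ahead of it in the queue, so all the jobs can start at once.
The function name and the change labels are made up for illustration;
this is not Zuul's actual code.

```python
def speculative_states(queue):
    """Map each queued change to the list of changes its test job runs against.

    Assumption: a change is tested together with every change ahead of it
    in the queue, which is the state the tree would be in when it merges.
    """
    return {change: queue[: i + 1] for i, change in enumerate(queue)}


# Hypothetical gate queue of three changes, oldest first.
states = speculative_states(["A", "B", "C"])
for change, tested_with in states.items():
    print(change, "is tested against", tested_with)
```

Since every entry is known up front, the three jobs do not need to wait
on each other, which is the point about not testing in series.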

One thing I noticed while doing some of this was that Jenkins doesn't
seem to know about dependent patches, and will often run them in some
sub-optimal order. This means that if the third patch in a series of ten
is broken, it still runs everything ten times, instead of saying "well,
patch 3/10 is broken, so 4/10 and beyond will be skipped." Not sure how
hard that logic would be to integrate, but it might help to cut down
some load. I was generating ten test runs every hour or so the other day
trying to diagnose what was going on... :D
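
Roughly the logic I have in mind, as a minimal sketch. It assumes the
series is already known in dependency order and that there is some
callable that runs the tests for one patch and returns True or False.
Both plan_series_runs and run_tests are hypothetical names, not
anything Jenkins or the gate provides today.

```python
def plan_series_runs(series, run_tests):
    """Test a dependent patch series in order, skipping everything
    after the first failure instead of running it anyway."""
    results = {}
    failed_at = None
    for patch in series:
        if failed_at is not None:
            # A patch later in the series depends on the broken one.
            results[patch] = "skipped (depends on %s)" % failed_at
            continue
        if run_tests(patch):
            results[patch] = "passed"
        else:
            results[patch] = "failed"
            failed_at = patch
    return results


# Hypothetical series of ten patches where the third one is broken.
series = ["%d/10" % i for i in range(1, 11)]
runs = []  # records which patches actually consumed a test run


def fake_run_tests(patch):
    runs.append(patch)
    return patch != "3/10"


results = plan_series_runs(series, fake_run_tests)
```

With the third patch broken, only three runs happen instead of ten, and
patches 4/10 onward are reported as skipped.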

Thanks!

-- 
Dan Smith
IBM Linux Technology Center
