Re: Intermittent devstack-gate failures

Hey Jim,

Any updates or new ideas on the cause of the intermittent hangs?

I mentioned these on IRC with regard to one of the Essex failures. One thing I've seen with Glance recently is that both glance-api and glance-registry now use (and try to auto-create) the database by default. I've had issues when I start glance-api and glance-registry in quick succession, because both of them try to initialize the DB.
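
Roughly, the failing sequence is just this (illustrative; the exact
config paths vary, and devstack starts the services its own way):

  # Both daemons race to create the schema in the same database; the
  # loser can fail with errors like a table already existing.
  glance-registry --config-file=/etc/glance/glance-registry.conf &
  glance-api --config-file=/etc/glance/glance-api.conf &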

So in devstack we could run 'glance-manage db_sync' manually:

  https://review.openstack.org/#/c/8495/

And in Glance we could then default auto_db_create to False:

  https://review.openstack.org/#/c/8496/
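
Taken together, the effect would be roughly this (option name as
written above; check the reviews for the exact spelling and config
section):

  # Create/migrate the schema exactly once, before either daemon
  # starts:
  glance-manage db_sync

  # Then, in both glance-api.conf and glance-registry.conf:
  #   auto_db_create = False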

Any chance this is the cause of the intermittent failures?

Dan

----- Original Message -----
> From: "James E. Blair" <corvus@xxxxxxxxxxxx>
> To: "OpenStack Mailing List" <openstack@xxxxxxxxxxxxxxxxxxx>
> Sent: Tuesday, June 12, 2012 7:25:01 PM
> Subject: [Openstack] Intermittent devstack-gate failures
> 
> Hi,
> 
> It looks like there are intermittent, but frequent, failures in the
> devstack-gate.  This suggests a non-deterministic bug has crept into
> some piece of OpenStack software.
> 
> In this kind of situation, we certainly could keep re-approving
> changes in the hope that they will pass the test and merge, but it
> would be better to fix the underlying problem.  Simply re-approving
> is mostly just going to make the queue longer.
> 
> Note that the output from Jenkins has changed recently.  I've seen
> some people misconstrue some normal parts of the test process as
> errors.  In particular, this message from Jenkins is not an error:
> 
>   Looks like the node went offline during the build. Check the
>   slave log for the details.
> 
> That's a normal part of the way the devstack-gate tests run, where
> we add a machine to Jenkins as a slave, run the tests, and remove it
> from the list of slaves before it's done.  This is to accommodate
> the one-shot nature of devstack-based testing.  It doesn't interfere
> with the results.
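> 
> In rough pseudocode, that lifecycle is (the helper names here are
> illustrative, not the actual devstack-gate scripts):
> 
>   # Boot a fresh VM from one of our cloud providers.
>   node=$(boot_node)
>   # Attach it to Jenkins as a temporary slave and run the job on it.
>   add_jenkins_slave "$node"
>   run_devstack_exercises "$node"
>   # Detach the slave before Jenkins finishes tearing down the build;
>   # this is what produces the "node went offline" message.
>   remove_jenkins_slave "$node"
>   # Throw the VM away; every run gets a pristine machine.
>   delete_node "$node"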
> 
> To find out why a test failed, you should scroll up a bit to the
> devstack exercise output, which normally looks like this:
> 
> *********************************************************************
> SUCCESS: End DevStack Exercise: ./exercises/volumes.sh
> *********************************************************************
> =====================================================================
> SKIP boot_from_volume
> SKIP client-env
> SKIP quantum
> SKIP swift
> PASS aggregates
> PASS bundle
> PASS client-args
> PASS euca
> PASS floating_ips
> PASS sec_groups
> PASS volumes
> =====================================================================
> 
> Everything after that point is test-runner boilerplate.  I'll add
> some echo statements to that effect in the future.
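> 
> Something like this at the end of the exercise run (illustrative):
> 
>   echo "*********************************************************"
>   echo "Everything below this line is devstack-gate teardown"
>   echo "boilerplate, not test output."
>   echo "*********************************************************"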
> 
> Finally, it may be a little difficult to pinpoint when this started.
> A number of devstack-gate tests have passed recently without
> actually running any tests, due to an issue with one of our
> OpenStack-based node providers.  We are eating our own dogfood.
> 
> -Jim
> 
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
> 

