openstack team mailing list archive

Re: Jenkins vs SmokeStack tests & Gerrit merge blockers

 


On 06/28/2012 01:49 PM, Dan Prince wrote:
> 
> 
> ----- Original Message -----
>> From: "Monty Taylor" <mordred@xxxxxxxxxxxx>
>> To: openstack@xxxxxxxxxxxxxxxxxxx
>> Sent: Thursday, June 28, 2012 11:13:28 AM
>> Subject: Re: [Openstack] Jenkins vs SmokeStack tests & Gerrit merge blockers
>> 
>> 
>> 
>> On 06/28/2012 07:32 AM, Daniel P. Berrange wrote:
>>> Today we face a situation where Nova GIT master fails to pass
>>> all the libvirt test cases. This regression was accidentally
>>> introduced by the following changeset
>>> 
>>> https://review.openstack.org/#/c/8778/
>>> 
>>> If you look at the history of that, the first SmokeStack test
>>> run failed with some (presumably) transient errors, and added
>>> negative karma to the change against patchset 2. If it were not
>>> for this transient failure, it should have shown the regression
>>> in the libvirt test case. The libvirt test case in question was
>>> one that is skipped, unless libvirt is actually present on the
>>> host running the tests. SmokeStack had made sure the tests would
>>> run on such a host.
>>> 
>>> There were then further patchsets uploaded, and patchset 4 was 
>>> approved for merge. Jenkins ran its gate jobs and these all
>>> passed successfully. I am told that Jenkins will actually run
>>> the unittests that are included in Nova, so I would have expected
>>> it to see the flawed libvirt test case, but it didn't. I presume
>>> therefore, that Jenkins is not running on a libvirt enabled
>>> host.
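
To illustrate the point about skipped tests, here is a minimal sketch of how a test that depends on libvirt can be skipped silently when the bindings are missing. The test class and helper names are hypothetical and are not taken from the Nova tree; only the standard unittest and importlib calls are real.

```python
import importlib.util
import unittest

# True only when the libvirt Python bindings can be imported on this host.
HAVE_LIBVIRT = importlib.util.find_spec("libvirt") is not None


class LibvirtBindingsTest(unittest.TestCase):  # hypothetical test case
    @unittest.skipUnless(HAVE_LIBVIRT, "libvirt bindings not installed")
    def test_bindings_expose_open(self):
        import libvirt
        self.assertTrue(callable(libvirt.open))


def run_and_summarize():
    """Run the test case and return (tests run, tests skipped, success)."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(LibvirtBindingsTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, len(result.skipped), result.wasSuccessful()
```

On a host without libvirt, the run still counts as successful because a skip is not a failure, which is how a gate can stay green while a regression goes unnoticed.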
>> 
>> Kind of - it's sadly more complex than that ...
>> 
>>> The end result was that the broken changeset was merged to
>>> master, which in turn means any other developers submitting
>>> changes touching the libvirt area will get broken tests reported
>>> that were not actually their own fault.
>>> 
>>> This leaves me with the following questions...
>>> 
>>> 1. Why was the recorded failure from SmokeStack not considered to
>>> be a blocker for the merge of the commit by Gerrit or Jenkins or
>>> any of the reviewers ?
>>> 
>>> 2. Why did SmokeStack not get re-triggered for the later patch 
>>> set revisions, before it was merged ?
>> 
>> The answer to 1 and 2 is largely the same - SmokeStack is a
>> community contributed resource and is not managed by the CI team.
>> Dan Prince does a great job with it, but it's not a resource that
>> we have the ability to fix should it start messing up, so we have
>> not granted it the permissions to file blocking votes.
> 
> I would add that if anyone else is interested in collaborating on
> making SmokeStack better I'm more than happy to give access. It's all
> open source and has been since Cactus.
> 
> As it is now, SmokeStack can cast a -1 vote and hopefully this is
> proving to be useful. I'm open to suggestions.

I think it's stellar!
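
As a rough sketch of the distinction between a blocking vote and an advisory -1, assuming a simplified model that is not Gerrit's actual permission logic (the Vote class and both function names are made up for illustration):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    system: str      # e.g. "Jenkins" or "SmokeStack"
    value: int       # +1 or -1
    blocking: bool   # assumption: only gating systems may block a merge


def can_merge(votes):
    """A change may merge when no blocking system voted negatively."""
    return all(v.value > 0 for v in votes if v.blocking)


def needs_investigation(votes):
    """Advisory systems that voted -1 and deserve a look from reviewers."""
    return [v.system for v in votes if v.value < 0 and not v.blocking]


votes = [
    Vote("Jenkins", +1, blocking=True),
    Vote("SmokeStack", -1, blocking=False),
]
```

With these example votes the change can still merge, and SmokeStack shows up only in the list of systems to investigate, which mirrors the situation described in this thread.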

>> 
>> The tests that smokestack runs could all be written such that they 
>> are run by jenkins.
> 
> I actually put in quite a bit of work to maintain an openstack_vpc
> job on Jenkins post-Cactus. When we talked about gating on this job
> at the Diablo conference the idea didn't seem to get very far... I
> kind of saw that as the end of the line for maintaining an
> openstack_vpc job and eventually it went away. Not sure who deleted
> it, but anyway.
>
> The way I see it there is value in both testing systems. Rather than
> complaining about why one system exists and/or doesn't port its tests
> to the other... why don't we build on each other's strengths? Seeing
> a green "verified +1" from both Jenkins and SmokeStack on a review
> should be very encouraging... and if one of the two systems fails it
> might require further investigation.

I completely agree with that. I'm still hoping we'll see more systems
from more people so that the set of combinations gets larger.

I think also there's clearly value in running tests, like how SmokeStack
is doing right now, that aren't necessarily part of the gate, but which
pro-actively provide useful information to the reviewers.

>> The repos that run the jenkins tests are all in git and managed by
>> openstack's gerrit. If there are testing profiles that it runs that
>> we as a community value and want to see part of the gate, anyone is
>> welcome to port them.
>> 
>>> 3. Why did Jenkins not ensure that the tests were run on a
>>> libvirt enabled host ?
>> 
>> This is a different, and slightly more complex, question. We run tests in
>> virtualenvs so that the process used to test the code can be 
>> consistently duplicated by all of the developers in the project.
>> This is the reason that we no longer do ubuntu package creation as
>> part of the gate - turns out that's really hard for a developer
>> running on OSX to do locally on their laptop - and if Jenkins
>> reports a blocking error in a patch, we want a developer to be
>> able to reproduce the problem locally so that they can have a
>> chance at fixing it.
> 
> The ability for developers to test things locally is very important.
> For that matter SmokeStack all started with a project called
> openstack_vpc, a project to spin up groups of cloud servers installed
> with the latest OpenStack code. A developer can use a project like
> openstack_vpc to spin up a set of servers in the cloud which builds
> and installs custom built packages for a set of Git URLs. So
> essentially the underpinnings of SmokeStack *can* all be done from a
> local machine just like they run from the UI.
> 
> There is also value in testing things differently in ways which may
> not be easy for all developers to reproduce. Take XenServer for
> example... not every developer has access to a machine which can spin
> up a mini XenServer cloud. Is there value in running upstream tests
> on XenServer? I think so...
> 
> What about running OpenStack with PostgreSQL and MySQL, Rabbit and
> Qpid? What I'm trying to do with SmokeStack is add value to our
> testing matrix so that some of the things we aren't automatically
> testing elsewhere get some coverage.

++
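
A testing matrix like the one Dan describes can be enumerated in a few lines. This is only a sketch: the backend names come from the message above, while the function name and dictionary layout are assumptions and not how SmokeStack actually configures its jobs.

```python
from itertools import product

DATABASES = ("postgresql", "mysql")
MESSAGE_QUEUES = ("rabbit", "qpid")


def build_matrix(databases=DATABASES, queues=MESSAGE_QUEUES):
    """Return one job description per database/queue combination."""
    return [
        {"database": db, "queue": mq}
        for db, mq in product(databases, queues)
    ]


for job in build_matrix():
    print(job["database"], "+", job["queue"])
```

Two databases and two message queues already give four combinations, so each extra axis (hypervisor, distro) multiplies the number of jobs.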

> Reproducibility is important... but the way I see it if everyone
> always ran tests with exactly the same flags, or in the same
> environments we might not find some things.
> 
> And where a developer can't reproduce something locally what I've
> done is give them direct access to a box running a SmokeStack job so
> they can troubleshoot it directly.

Yup. We've done this with devstack tests in jenkins too. Super helpful
for some of those weird times...


>> 
>> Problems arise in paradise though, libvirt being one of them. It's
>> not possible to install libvirt into a virtualenv, because it's a 
>> swig-based module built as part of the libvirt source itself. One
>> of the solutions to this is to allow the testing virtual
>> environments to use packages installed at the system level. We
>> suggested this a little while ago, but this was rejected by the
>> nova team who valued the benefit of having a restricted test run so
>> that we know we've got all of the depends properly specified.
>> 
>> To that end, after chatting with Brian Waldon, I put this up as a 
>> possible next try:
>> 
>> https://review.openstack.org/#/c/8949/
>> 
>> Which adds an additional testing environment that has system
>> software enabled and also installs additional "optional" things.
>> With that environment, we should be able to run a jenkins gate that
>> tests things with full libvirt, and also tests the mysql upgrade
>> paths, without screwing our fine friends who run OSX.
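
The difference between an isolated environment and one that can see system packages can be sketched as follows. This uses Python's standard venv module as a stand-in; it is not the contents of the review linked above, and the helper names are hypothetical.

```python
import venv
from pathlib import Path


def create_env(path, use_system_packages):
    """Create a virtual environment and return the path to its config."""
    builder = venv.EnvBuilder(
        system_site_packages=use_system_packages,
        with_pip=False,  # keep the sketch fast; no pip bootstrap
    )
    builder.create(path)
    return Path(path) / "pyvenv.cfg"


def sees_system_packages(cfg_path):
    """Read pyvenv.cfg and report whether system packages are visible."""
    for line in Path(cfg_path).read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip() == "true"
    return False
```

An environment created with use_system_packages=True could import a system-installed libvirt module, while the isolated one could not, which is the trade-off being discussed here.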
>> 
>> Fundamentally though - we're at a point of trying to have our cake 
>> and eat it too. Either we want comprehensive testing of all of the
>> unit tests, or we want to be careful about not making the test
>> environment too hard for a developer to exactly mimic.
>> 
>> I'm obviously on the side of having us have gating tests that some 
>> devs might not be able to do on their laptops - such as running
>> the libvirt tests properly. We're working on cloud software - worst
>> case scenario if there's an intractable problem, a dev can always
>> spin up an ubuntu image somewhere.
>> 
>>> Obviously this was all made worse by the transient problems
>>> we've had with the test suite infrastructure these past 2 days,
>>> but regardless it seems like we have a gap in our merge approval
>>> procedures here.
>>> 
>>> IMHO, either SmokeStack needs to be made compulsory, or Jenkins 
>>> needs to ensure tests are run on suitable hosts like SmokeStack
>>> does, or both.
>> 
>> The second is much more feasible and, as I've pointed out, is in the works
>> - but I do think we should develop a clear sense that it's
>> important to us that we run these things properly even if it means
>> direct local developer reproducibility is impacted.
>> 
>> Thanks! Monty
>> 
>> _______________________________________________
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>> 
> 


