openstack team mailing list archive

Re: Steps that can help stabilize Nova's trunk


Hi Dan! Sorry you were the last of my reply emails on this thread.
Email overload today :) Comments inline.

On Thu, Feb 17, 2011 at 9:43 AM, Dan Prince <dan.prince@xxxxxxxxxxxxx> wrote:
> I see the 'smoketests' directory in the nova code. Is anyone running these tests on a regular basis (every commit)? Is this the best place to further build out integration tests?

The Anso team has told me that they run the smoketests regularly, but
I would assume they only run these smoketests against a test cluster
that mimics the Nebula environment. Devin and termie can correct me
if I'm wrong on that.

As far as I know, nobody outside of Anso has run the smoketests, and
they are not integrated into Hudson.

Yet. :)

termie and I had a brief email conversation about using the existing
Nova smoketests as the basis for some continuous integration testing.
I was kind of waiting for him to email the mailing list about that
topic, but I guess I'll pre-empt his email here. :)

As far as I know, the Anso plan (which I am in favour of) is to set up
a multi-node test cluster (Jordan Rinke may have already gotten this
done) that can have a number of these "target environments" (Nebula at
NASA, Rackspace Cloud Servers setup, etc) ready to run tests on. Then,
we'd install the Hudson agent on some test machine that would have a
nova.conf that represented the test environments and fire the
smoketests against these test environments.

The smoketests are perfectly fine, but I think most agree that they
represent a limited set of actions that a user and an admin would
execute against the Nova API.  In fact, the current smoketests only
test the EC2 API, since that is the API that the Nebula cluster uses.
So, AFAIK, there are no smoketests at all for the OpenStack API. This
is a major problem for Racker contributors, as I think you'd agree,
and one that we need to address ASAP. :)

I've created a bug for the lack of an OpenStack API integration
test (smoketest): https://bugs.launchpad.net/nova/+bug/720941

Trey, by virtue of responding to my prior email suggesting the need
for an OpenStack API smoketest, volunteered to create the smoketest.
And by "volunteered", I mean of course that I simply assigned Trey to
the bug ;)

> Regarding environment/setup tools: I've been working on an OpenStack VPC toolkit project that we are using in Blacksburg to stage test some things in the cloud. I'm using Chef along with the Anso/Opscode OpenStack cookbooks to set up Rackspace Cloud Servers with the latest trunk PPA packages.

This is great news. How can we integrate this work into these things?

a) Nova's smoketest files
b) Hudson

> This setup works well; however, I can only test with QEMU (no XenServer) and with network managers that use DHCP (I use FlatDHCPManager, since Cloud Servers kernels don't currently have the 'nbd' kernel module, which would support the network injection stuff).

That is a hardware problem that Paul Voccio and others have been
working on. There is a cluster of machines that are being set up that
will have XenServer (5.6p2 Callie?) installed on them that will be
available to run tests against. I'll let Paul give details on this
test cluster that will shortly (?) be integrated into our Hudson

> Using this setup I'm able to create multi node installations where instances on different machines can ping each other. While this isn't what I would call a true production setup it is fully functional and can easily be run in parallel. The only limitations are the limits on your Cloud Servers Account.

OK. Are you executing a specific set of pre-determined Cloud Servers
API requests against Nova? If so, how are those requests created? Are
they in a test case that we could integrate into the existing Nova
smoketests files?

> If you have bare metal then you can simply swap out the Cloud Servers API layer with something that interfaces with your PXE imaging system. I'm a big fan of slicking the machines between each test run to avoid the buildup of cruft in the system.

Yep, AFAIK, Rackspace's internal Autotools team has been working with
Paul to enable this kind of thing, so that the aforementioned test
cluster can be "wiped" back to an original XenServer setting
between test runs.
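
A minimal sketch of the "same tools, wipe between runs" idea, under loud
assumptions: `TargetEnvironment`, `RecordingEnvironment` and `run_suite`
are hypothetical names invented for illustration, not part of Nova, the
toolkit, or any Rackspace tooling. A real cloud or bare-metal backend
would call the Cloud Servers API or a PXE imaging system inside `reset`.

```python
from abc import ABC, abstractmethod
from typing import Callable, List


class TargetEnvironment(ABC):
    """Hypothetical interface shared by cloud and bare-metal targets."""

    @abstractmethod
    def reset(self) -> None:
        """Return every node to a clean, known state."""


class RecordingEnvironment(TargetEnvironment):
    """Stand-in backend that only counts resets (no real API calls)."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.resets = 0

    def reset(self) -> None:
        self.resets += 1


def run_suite(env: TargetEnvironment,
              tests: List[Callable[[], bool]]) -> List[bool]:
    """Wipe the environment once, then run every test against it."""
    env.reset()
    return [test() for test in tests]
```

Because `run_suite` only depends on the interface, the same test list
could in principle be pointed at either kind of environment.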

> This would integrate as a Hudson job nicely as well. We've done some similar setups in Rackspace Email and Apps using a single Hudson server. The Hudson server runs a simple Bash script that invokes the toolkit to create the cloud servers, chef them up with the latest PPA packages (or your branch code), and then uses Torque (an HPC'ish resource manager) to schedule test jobs on some of the machines. I use Chef recipes to install Torque along with a REST interface to schedule and monitor the jobs on a "head" node. The Hudson job then waits for the Torque jobs to finish. The last thing the Hudson script does is 'scp' the unit test results XML file back to the Hudson server, where you can use something like the xUnit plugin to display and graph the results over time.

Yes, we're in agreement that Hudson can and should be the "driver" for
all these jobs. Where can I see this toolkit code? Is it available
online somewhere?
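
For readers following the thread, the job flow Dan describes can be
sketched roughly as below. This is an illustration only: the stage
labels and the `run_pipeline` helper are hypothetical and are not taken
from Dan's toolkit. Each stage is a plain callable standing in for the
real Cloud Servers, Chef, Torque and scp steps.

```python
from typing import Callable, Dict, List

# Stage order as described in the quoted email. The names are
# hypothetical labels, not identifiers from any real toolkit.
STAGES = ["provision", "configure", "schedule", "wait", "collect"]


def run_pipeline(steps: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each stage in order and stop at the first one returning False.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name in STAGES:
        if not steps[name]():
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Stand-in steps. A real job would create servers, run Chef, submit
    # Torque jobs, poll them, and copy the results XML back to Hudson.
    demo = {name: (lambda: True) for name in STAGES}
    print(run_pipeline(demo))
```

Stopping at the first failed stage keeps a broken provisioning step from
being reported as a test failure.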

> To summarize:
> -testing in the cloud provides a low barrier of entry that anyone can use
> -testing on bare metal is more expensive but gets us extra coverage (XenServer, etc.)
> -we should do both as often as possible
> -the same set of tools and tests should work in both environments

Cheers, and thanks for all the info. Let's make this happen.
