launchpad-dev team mailing list archive
Message #09647
Re: Parallel testing is live
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 12-09-21 03:22 PM, Francis J. Lacoste wrote:
> I agree that going back to pre-commit merge is one thing we could
> try. There is a caveat in your data though. You are counting the
> number of test runs processed by buildbot in a day. The number that
> is truly important is the number of daily ec2test submissions,
> because a branch only gets sent to buildbot once a successful ec2
> test run has happened.
Right. The number of landings was much easier to get, since it didn't
require hacking all the launchpad team and grovelling through their
shell histories :-).
However, we're only in trouble if the number of attempts exceeds 41,
which would be roughly a 4:1 ratio of attempts to landings. Also, the
number of attempts is probably correlated with the speed of landings,
i.e. the faster results come back, the more incentive people have to
re-attempt landing before checking to ensure they've fixed every bug.
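
A rough sketch of the arithmetic behind that threshold. It assumes the 41
is simply the number of 35-minute runs a single serial runner can finish in
24 hours, and that landings average about 10 a day (the figure implied by
the 4:1 ratio). Neither number is measured here, and the function names are
made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def max_daily_runs(run_minutes):
    """Whole test runs one serial runner can complete in 24 hours."""
    return int(MINUTES_PER_DAY // run_minutes)


def max_attempt_ratio(run_minutes, landings_per_day):
    """Attempts-per-landing the runner can absorb before a queue builds."""
    return max_daily_runs(run_minutes) / landings_per_day


# Assumed inputs: a 35-minute parallel run and ~10 landings per day.
capacity = max_daily_runs(35)
ratio = max_attempt_ratio(35, 10)
print(f"capacity: {capacity} runs/day, sustainable ratio: {ratio:.1f}:1")
```

The same function applied to a 4-hour run gives a much smaller daily
capacity, which is the contrast the rest of this message leans on.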
> What you are proposing (and what was happening before we switched
> to buildbot) is that developers simply use the landing architecture
> as a convenient test runner. One problem in the old days was that a
> lot of queued landings would fail simply because the tests hadn't
> been run, not because of an integration error or intermittent
> failure. (Although we also had some of those.)
I'm not sure that's really a problem. People still have an incentive
to be conscientious about running the obvious tests, because 35
minutes is still a long time. But if non-obvious tests fail, it's
better for them to fail in 35 minutes via our parallel tester than in
4 hours via ec2. I think reckless landings would be self-limiting,
because they would tend to generate queues, reducing the advantage of
reckless landing.
> I don't know how much effort is required to set up a tarmac
> instance that can run parallelized tests. (Unfortunately, it
> probably also requires scarce webops resources.) But I'd be willing
> to try an experiment around that if it's cheap.
Cool.
> To achieve a flow similar to the one you want, we can also make
> ec2test run tests in parallel in EC2.
Or Canonistack, since this probably involves a lot of re-work anyhow.
> On the big instances with 32 cores, Yellow was seeing a ~50 min
> test run in EC2. That would put us well within range of writing and
> deploying code in the same day. (And deploying this requires 0
> webops involvement.)
It also lets people run the full suite quickly for those cases where
you know you've probably broken something, but you don't know where.
Maybe this is also a good time to plug my "fault-line" plugin
<https://launchpad.net/fault-line>, which uses revision history to
find correlations between changed files and test files. It's good for
doing a broader test run without running the full suite:

    bin/test -vm $(bzr fault-line --module-regex -r :submit)

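
For anyone wondering what "correlations between changed files and test
files" could look like, here is a minimal sketch of the general co-change
idea. It is not the plugin's actual code. The `test_*.py` naming rule, the
in-memory `history` list and all function names are assumptions made for
illustration.

```python
from collections import Counter, defaultdict


def is_test_file(path):
    """Assumed convention: test modules are named test_*.py."""
    name = path.rsplit("/", 1)[-1]
    return name.startswith("test_") and name.endswith(".py")


def build_cochange_index(history):
    """Map each non-test file to a count of test files changed alongside it.

    `history` is an iterable of sets of paths, one set per revision.
    """
    index = defaultdict(Counter)
    for changed in history:
        tests = [p for p in changed if is_test_file(p)]
        sources = [p for p in changed if not is_test_file(p)]
        for src in sources:
            index[src].update(tests)
    return index


def suggest_tests(index, changed_files, min_count=1):
    """Return test files that co-changed with the given files often enough."""
    scores = Counter()
    for path in changed_files:
        if is_test_file(path):
            # A test file that was itself changed should always be run.
            scores[path] += min_count
        scores.update(index.get(path, Counter()))
    return sorted(t for t, n in scores.items() if n >= min_count)


# Hypothetical revision history, for illustration only.
history = [
    {"lib/foo.py", "lib/tests/test_foo.py"},
    {"lib/foo.py", "lib/bar.py", "lib/tests/test_bar.py"},
    {"lib/bar.py", "lib/tests/test_bar.py"},
]
index = build_cochange_index(history)
print(suggest_tests(index, ["lib/foo.py"]))
```

Raising `min_count` trades coverage for a shorter test run, since it drops
tests that only co-changed with the file once or twice.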
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/
iEYEARECAAYFAlBc2vkACgkQ0F+nu1YWqI2mbACghSoFQ9W53SRLTpDHS6zoZXn9
k+sAn2EE6Bd2LfVhcGEyl2vLnzTyBn1f
=cvq1
-----END PGP SIGNATURE-----