launchpad-reviewers team mailing list archive
-
launchpad-reviewers team
-
Mailing list archive
-
Message #16675
[Merge] lp:~cjwatson/launchpad-buildd/abort-race into lp:launchpad-buildd
Colin Watson has proposed merging lp:~cjwatson/launchpad-buildd/abort-race into lp:launchpad-buildd.
Commit message:
If a build is aborted between subprocesses, pretend that it was terminated by a signal.
Requested reviews:
Launchpad code reviewers (launchpad-reviewers)
For more details, see:
https://code.launchpad.net/~cjwatson/launchpad-buildd/abort-race/+merge/218222
We've noticed a few cases where builds get stuck after trying to cancel them, particularly when they're aborted early in a build. While I don't know if I can necessarily trigger this on demand, I believe that it happens because the build slave doesn't have correct handling of abort requests received between subprocess execution. This branch should fix that by treating those as equivalent to a subprocess having been terminated by SIGKILL, and includes a test that provokes the failure.
--
https://code.launchpad.net/~cjwatson/launchpad-buildd/abort-race/+merge/218222
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~cjwatson/launchpad-buildd/abort-race into lp:launchpad-buildd.
=== modified file 'debian/changelog'
--- debian/changelog 2014-05-03 14:48:19 +0000
+++ debian/changelog 2014-05-04 21:39:08 +0000
@@ -3,6 +3,8 @@
* Drop the status_dict XML-RPC method, now that the master uses the
new-style dict-flavoured status method.
* Don't add symlinks to the results of livefs builds (LP: #1247461).
+ * If a build is aborted between subprocesses, pretend that it was
+ terminated by a signal.
-- Colin Watson <cjwatson@xxxxxxxxxx> Fri, 25 Apr 2014 13:59:16 +0100
=== modified file 'lpbuildd/debian.py'
--- lpbuildd/debian.py 2013-09-26 09:26:03 +0000
+++ lpbuildd/debian.py 2014-05-04 21:39:08 +0000
@@ -10,6 +10,7 @@
import os
import re
+import signal
from lpbuildd.slave import (
BuildManager,
@@ -115,6 +116,10 @@
# When a Twisted ProcessControl class is killed by SIGTERM,
# which we call 'build process aborted', 'None' is returned as
# exit_code.
+ if self.alreadyfailed and success == 0:
+ # We may have been aborted in between subprocesses; pretend that
+ # we were terminated by a signal, which is close enough.
+ success = 128 + signal.SIGKILL
print ("Iterating with success flag %s against stage %s"
% (success, self._state))
func = getattr(self, "iterate_" + self._state, None)
=== modified file 'lpbuildd/tests/test_binarypackage.py'
--- lpbuildd/tests/test_binarypackage.py 2013-10-04 01:05:00 +0000
+++ lpbuildd/tests/test_binarypackage.py 2014-05-04 21:39:08 +0000
@@ -207,6 +207,30 @@
self.assertEqual(
self.buildmanager.iterate, self.buildmanager.iterators[-1])
+ def test_abort_between_subprocesses(self):
+ # If a build is aborted between subprocesses, the build manager
+ # pretends that it was terminated by a signal.
+ self.buildmanager.initiate(
+ {'foo_1.dsc': ''}, 'chroot.tar.gz',
+ {'suite': 'warty', 'ogrecomponent': 'main'})
+
+ self.buildmanager.abort()
+ expected_command = [
+ 'processscanpath', 'scan-for-processes', self.buildid
+ ]
+ self.assertEqual(BinaryPackageBuildState.INIT, self.getState())
+ self.assertEqual(expected_command, self.buildmanager.commands[-1])
+ self.assertNotEqual(
+ self.buildmanager.iterate, self.buildmanager.iterators[-1])
+
+ self.buildmanager.iterate(0)
+ expected_command = ['cleanpath', 'remove-build', self.buildid]
+ self.assertEqual(BinaryPackageBuildState.CLEANUP, self.getState())
+ self.assertEqual(expected_command, self.buildmanager.commands[-1])
+ self.assertEqual(
+ self.buildmanager.iterate, self.buildmanager.iterators[-1])
+ self.assertFalse(self.slave.wasCalled('builderFail'))
+
def test_missing_changes(self):
# The build manager recovers if the expected .changes file does not
# exist, and considers it a package build failure.
Follow ups