← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~cjwatson/launchpad-buildd/fix-abort into lp:launchpad-buildd

 

Colin Watson has proposed merging lp:~cjwatson/launchpad-buildd/fix-abort into lp:launchpad-buildd.

Commit message:
Make abort work properly, calling scan-for-processes to kill all processes in the chroot.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #54730 in launchpad-buildd: "launchpad buildd abort does not work as expected"
  https://bugs.launchpad.net/launchpad-buildd/+bug/54730

For more details, see:
https://code.launchpad.net/~cjwatson/launchpad-buildd/fix-abort/+merge/177003

Make the abort operation work, so that we have a hope of being able to cancel builds.

This turned out to be gratuitously complicated, involving moving scan-for-processes configuration up to the slave, rearranging how reaping is handled in the state machine so that it can reasonably easily happen at multiple stages, and introducing a new BuildStatus.ABORTED state (which should eventually supersede BuilderStatus.ABORTED) so that cancelled builds can be WAITING/ABORTED rather than just ABORTED.  But I believe it all works now.
-- 
https://code.launchpad.net/~cjwatson/launchpad-buildd/fix-abort/+merge/177003
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~cjwatson/launchpad-buildd/fix-abort into lp:launchpad-buildd.
=== modified file 'Makefile'
--- Makefile	2011-11-21 01:47:22 +0000
+++ Makefile	2013-07-25 17:35:33 +0000
@@ -27,6 +27,7 @@
 # a Launchpad checkout.
 check:
 	PYTHONPATH=$(PWD):$(PYTHONPATH) $(PYTHON) -m testtools.run -v \
+		   lpbuildd.tests.test_binarypackage \
 		   lpbuildd.tests.test_buildd_slave \
 		   lpbuildd.tests.test_check_implicit_pointer_functions \
 		   lpbuildd.tests.test_harness \

=== modified file 'buildd-slave-example.conf'
--- buildd-slave-example.conf	2011-11-09 07:50:56 +0000
+++ buildd-slave-example.conf	2013-07-25 17:35:33 +0000
@@ -16,10 +16,10 @@
 cleanpath = /home/buildd/slavebin/remove-build
 mountpath = /home/buildd/slavebin/mount-chroot
 umountpath = /home/buildd/slavebin/umount-chroot
+processscanpath = /home/buildd/slavebin/scan-for-processes
 
 [debianmanager]
 sbuildpath = /home/buildd/slavebin/sbuild-package
 sbuildargs = -dautobuild --nolog --batch -A
 updatepath = /home/buildd/slavebin/update-debian-chroot
-processscanpath = /home/buildd/slavebin/scan-for-processes
 ogrepath = /home/buildd/slavebin/apply-ogre-model

=== modified file 'debian/changelog'
--- debian/changelog	2013-07-25 16:29:08 +0000
+++ debian/changelog	2013-07-25 17:35:33 +0000
@@ -1,10 +1,17 @@
 launchpad-buildd (115) UNRELEASED; urgency=low
 
+  [ Adam Conrad ]
   * Short the readlink call in scan-for-process with a true to avoid
     prematurely exiting the process scan when tripping over zombies.
   * Write to temporary cache files and then rename after validation
     to avoid the cache containing broken aborted files (LP: #471076)
 
+  [ Colin Watson ]
+  * Move scan-for-processes up to the top-level slave code so that it is
+    available for more general use.
+  * Make abort work properly, calling scan-for-processes to kill all
+    processes in the chroot.
+
  -- Adam Conrad <adconrad@xxxxxxxxxx>  Mon, 29 Apr 2013 15:47:20 -0600
 
 launchpad-buildd (114) hardy; urgency=low

=== modified file 'debian/upgrade-config'
--- debian/upgrade-config	2011-12-05 02:39:05 +0000
+++ debian/upgrade-config	2013-07-25 17:35:33 +0000
@@ -7,7 +7,6 @@
 
 import re
 import sys
-import os
 import subprocess
 
 (old_version, conf_file) = sys.argv[1:]
@@ -114,6 +113,19 @@
     in_file.close()
     out_file.close()
 
+def upgrade_to_115():
+    print "Upgrading %s to version 115" % conf_file
+    subprocess.call(["mv", conf_file, conf_file+"-prev115~"])
+    in_file = open(conf_file+"-prev115~", "r")
+    out_file = open(conf_file, "w")
+    for line in in_file:
+        if line.startswith("[allmanagers]"):
+            line += "processscanpath = /usr/share/launchpad-buildd/slavebin/scan-for-processes\n"
+        if not line.startswith("processscanpath = "):
+            out_file.write(line)
+    in_file.close()
+    out_file.close()
+
 if __name__ == "__main__":
     old_version = re.sub(r'[~-].*', '', old_version)
     if int(old_version) < 12:
@@ -132,4 +144,6 @@
         upgrade_to_63()
     if int(old_version) < 110:
         upgrade_to_110()
+    if int(old_version) < 115:
+        upgrade_to_115()
 

=== modified file 'lpbuildd/binarypackage.py'
--- lpbuildd/binarypackage.py	2011-11-10 04:55:02 +0000
+++ lpbuildd/binarypackage.py	2013-07-25 17:35:33 +0000
@@ -39,8 +39,8 @@
 
     initial_build_state = BinaryPackageBuildState.SBUILD
 
-    def __init__(self, slave, buildid):
-        DebianBuildManager.__init__(self, slave, buildid)
+    def __init__(self, slave, buildid, **kwargs):
+        DebianBuildManager.__init__(self, slave, buildid, **kwargs)
         self._sbuildpath = slave._config.get("binarypackagemanager", "sbuildpath")
         self._sbuildargs = slave._config.get("binarypackagemanager",
                                              "sbuildargs").split(" ")
@@ -124,10 +124,13 @@
                     print("Returning build status: BUILDERFAIL")
                     self._slave.builderFail()
             self.alreadyfailed = True
-            self._state = DebianBuildState.REAP
-            self.doReapProcesses()
+            self.doReapProcesses(self._state)
         else:
             print("Returning build status: OK")
             self.gatherResults()
-            self._state = DebianBuildState.REAP
-            self.doReapProcesses()
+            self.doReapProcesses(self._state)
+
+    def iterateReap_SBUILD(self, success):
+        """Finished reaping after sbuild run."""
+        self._state = DebianBuildState.UMOUNT
+        self.doUnmounting()

=== modified file 'lpbuildd/debian.py'
--- lpbuildd/debian.py	2011-11-18 04:17:27 +0000
+++ lpbuildd/debian.py	2013-07-25 17:35:33 +0000
@@ -22,7 +22,6 @@
     MOUNT = "MOUNT"
     SOURCES = "SOURCES"
     UPDATE = "UPDATE"
-    REAP = "REAP"
     UMOUNT = "UMOUNT"
     CLEANUP = "CLEANUP"
 
@@ -30,10 +29,9 @@
 class DebianBuildManager(BuildManager):
     """Base behaviour for Debian chrooted builds."""
 
-    def __init__(self, slave, buildid):
-        BuildManager.__init__(self, slave, buildid)
+    def __init__(self, slave, buildid, **kwargs):
+        BuildManager.__init__(self, slave, buildid, **kwargs)
         self._updatepath = slave._config.get("debianmanager", "updatepath")
-        self._scanpath = slave._config.get("debianmanager", "processscanpath")
         self._sourcespath = slave._config.get("debianmanager", "sourcespath")
         self._cachepath = slave._config.get("slave","filecache")
         self._state = DebianBuildState.INIT
@@ -74,10 +72,6 @@
         """
         raise NotImplementedError()
 
-    def doReapProcesses(self):
-        """Reap any processes left lying around in the chroot."""
-        self.runSubProcess( self._scanpath, [self._scanpath, self._buildid] )
-
     @staticmethod
     def _parseChangesFile(linesIter):
         """A generator that iterates over files listed in a changes file.
@@ -98,7 +92,7 @@
 
     def getChangesFilename(self):
         changes = self._dscfile[:-4] + "_" + self.arch_tag + ".changes"
-        return get_build_path(self._buildid, changes)
+        return get_build_path(self.home, self._buildid, changes)
 
     def gatherResults(self):
         """Gather the results of the build and add them to the file cache.
@@ -113,7 +107,8 @@
         seenfiles = False
 
         for fn in self._parseChangesFile(chfile):
-            self._slave.addWaitingFile(get_build_path(self._buildid, fn))
+            self._slave.addWaitingFile(
+                get_build_path(self.home, self._buildid, fn))
 
         chfile.close()
 
@@ -128,6 +123,15 @@
             raise ValueError, "Unknown internal state " + self._state
         func(success)
 
+    def iterateReap(self, state, success):
+        print ("Iterating with success flag %s against stage %s after reaping "
+               "processes"
+               % (success, state))
+        func = getattr(self, "iterateReap_" + state, None)
+        if func is None:
+            raise ValueError, "Unknown internal post-reap state " + state
+        func(success)
+
     def iterate_INIT(self, success):
         """Just finished initializing the build."""
         if success != 0:
@@ -183,26 +187,29 @@
             if not self.alreadyfailed:
                 self._slave.chrootFail()
                 self.alreadyfailed = True
-            self._state = DebianBuildState.REAP
-            self.doReapProcesses()
+            self.doReapProcesses(self._state)
         else:
             self._state = DebianBuildState.UPDATE
             self.doUpdateChroot()
 
+    def iterateReap_SOURCES(self, success):
+        """Just finished reaping after failure to overwrite sources.list."""
+        self._state = DebianBuildState.UMOUNT
+        self.doUnmounting()
+
     def iterate_UPDATE(self, success):
         """Just finished updating the chroot."""
         if success != 0:
             if not self.alreadyfailed:
                 self._slave.chrootFail()
                 self.alreadyfailed = True
-            self._state = DebianBuildState.REAP
-            self.doReapProcesses()
+            self.doReapProcesses(self._state)
         else:
             self._state = self.initial_build_state
             self.doRunBuild()
 
-    def iterate_REAP(self, success):
-        """Finished reaping processes; ignore error returns."""
+    def iterateReap_UPDATE(self, success):
+        """Just finished reaping after failure to update the chroot."""
         self._state = DebianBuildState.UMOUNT
         self.doUnmounting()
 
@@ -227,13 +234,20 @@
                 self._slave.buildOK()
         self._slave.buildComplete()
 
-
-def get_build_path(build_id, *extra):
+    def abortReap(self):
+        """Abort by killing all processes in the chroot, as hard as we can.
+
+        Overridden here to handle state management.
+        """
+        self.doReapProcesses(self._state, notify=False)
+
+
+def get_build_path(home, build_id, *extra):
     """Generate a path within the build directory.
 
+    :param home: the user's home directory.
     :param build_id: the build id to use.
     :param extra: the extra path segments within the build directory.
     :return: the generated path.
     """
-    return os.path.join(
-        os.environ["HOME"], "build-" + build_id, *extra)
+    return os.path.join(home, "build-" + build_id, *extra)

=== modified file 'lpbuildd/slave.py'
--- lpbuildd/slave.py	2013-07-25 16:29:08 +0000
+++ lpbuildd/slave.py	2013-07-25 17:35:33 +0000
@@ -8,6 +8,7 @@
 
 __metaclass__ = type
 
+from functools import partial
 import hashlib
 import os
 import re
@@ -15,7 +16,7 @@
 import xmlrpclib
 
 from twisted.internet import protocol
-from twisted.internet import reactor
+from twisted.internet import reactor as default_reactor
 from twisted.internet import process
 from twisted.web import xmlrpc
 
@@ -52,7 +53,8 @@
     def __init__(self, slave, callback):
         self.slave = slave
         self.notify = callback
-        self.killCall = None
+        self.builderFailCall = None
+        self.ignore = False
 
     def outReceived(self, data):
         """Pass on stdout data to the log."""
@@ -67,30 +69,27 @@
     def processEnded(self, statusobject):
         """This method is called when a child process got terminated.
 
-        Three actions are required at this point: identify if we are within an
-        "aborting" process, eliminate pending calls to "kill" and invoke the
-        programmed notification callback. We only really care about invoking
-        the notification callback last thing in this method. The order
-        of the rest of the method is not critical.
+        Two actions are required at this point: eliminate pending calls to
+        "builderFail", and invoke the programmed notification callback.  The
+        notification callback must be invoked last.
         """
-        # finishing the ABORTING workflow
-        if self.slave.builderstatus == BuilderStatus.ABORTING:
-            self.slave.builderstatus = BuilderStatus.ABORTED
+        if self.ignore:
+            # The build manager no longer cares about this process.
+            return
 
-        # check if there is a pending request for kill the process,
-        # in afirmative case simply cancel this request since it
-        # already died.
-        if self.killCall and self.killCall.active():
-            self.killCall.cancel()
+        # Since the process terminated, we don't need to fail the builder.
+        if self.builderFailCall and self.builderFailCall.active():
+            self.builderFailCall.cancel()
 
         # notify the slave, it'll perform the required actions
-        self.notify(statusobject.value.exitCode)
+        if self.notify is not None:
+            self.notify(statusobject.value.exitCode)
 
 
 class BuildManager(object):
     """Build Daemon slave build manager abstract parent"""
 
-    def __init__(self, slave, buildid):
+    def __init__(self, slave, buildid, reactor=None):
         """Create a BuildManager.
 
         :param slave: A `BuildDSlave`.
@@ -99,20 +98,27 @@
         object.__init__(self)
         self._buildid = buildid
         self._slave = slave
+        self._reactor = default_reactor if reactor is None else reactor
         self._preppath = slave._config.get("allmanagers", "preppath")
         self._unpackpath = slave._config.get("allmanagers", "unpackpath")
         self._cleanpath = slave._config.get("allmanagers", "cleanpath")
         self._mountpath = slave._config.get("allmanagers", "mountpath")
         self._umountpath = slave._config.get("allmanagers", "umountpath")
+        self._scanpath = slave._config.get("allmanagers", "processscanpath")
+        self._subprocess = None
+        self._reaped_states = set()
         self.is_archive_private = False
         self.home = os.environ['HOME']
+        self.abort_timeout = 120
 
-    def runSubProcess(self, command, args):
+    def runSubProcess(self, command, args, iterate=None):
         """Run a sub process capturing the results in the log."""
-        self._subprocess = RunCapture(self._slave, self.iterate)
+        if iterate is None:
+            iterate = self.iterate
+        self._subprocess = RunCapture(self._slave, iterate)
         self._slave.log("RUN: %s %r\n" % (command, args))
         childfds = {0: devnull.fileno(), 1: "r", 2: "r"}
-        reactor.spawnProcess(
+        self._reactor.spawnProcess(
             self._subprocess, command, args, env=os.environ,
             path=self.home, childFDs=childfds)
 
@@ -122,6 +128,25 @@
             self._unpackpath,
             ["unpack-chroot", self._buildid, self._chroottarfile])
 
+    def doReapProcesses(self, state, notify=True):
+        """Reap any processes left lying around in the chroot."""
+        if state is not None and state in self._reaped_states:
+            # We've already reaped this state.  To avoid a loop, proceed
+            # immediately to the next iterator.
+            self._slave.log("Already reaped from state %s" % state)
+            if notify:
+                self.iterateReap(state, 0)
+        else:
+            if state is not None:
+                self._reaped_states.add(state)
+            if notify:
+                iterate = partial(self.iterateReap, state)
+            else:
+                iterate = lambda success: None
+            self.runSubProcess(
+                self._scanpath, [self._scanpath, self._buildid],
+                iterate=iterate)
+
     def doCleanup(self):
         """Remove the build tree etc."""
         self.runSubProcess(self._cleanpath, ["remove-build", self._buildid])
@@ -175,37 +200,66 @@
         raise NotImplementedError("BuildManager should be subclassed to be "
                                   "used")
 
+    def iterateReap(self, state, success):
+        """Perform an iteration of the slave following subprocess reaping.
+
+        Subprocess reaping is special, typically occurring at several
+        positions in a build manager's state machine.  We therefore keep
+        track of the state being reaped so that we can select the
+        appropriate next state.
+        """
+        raise NotImplementedError("BuildManager should be subclassed to be "
+                                  "used")
+
+    def abortReap(self):
+        """Abort by killing all processes in the chroot, as hard as we can.
+
+        We expect this to result in the main build process exiting non-zero
+        and giving us some useful logs.
+
+        This may be overridden in subclasses so that they can perform their
+        own state machine management.
+        """
+        self.doReapProcesses(None, notify=False)
+
     def abort(self):
         """Abort the build by killing the subprocess."""
         if self.alreadyfailed or self._subprocess is None:
             return
         else:
             self.alreadyfailed = True
-        # Either SIGKILL and SIGTERM presents the same behavior, the
-        # process is just killed some time after the signal was sent 
-        # 10 s ~ 40 s, and returns None as exit_code, instead of the normal
-        # integer. See further info on DebianBuildermanager.iterate in
-        # debian.py
-        # XXX cprov 2005-09-02:
-        # we may want to follow the canonical.tachandler kill process
-        # style, which sends SIGTERM to the process wait a given timeout
-        # and if was not killed sends a SIGKILL. IMO it only would be worth
-        # if we found different behaviour than the previous described.
-        self._subprocess.transport.signalProcess('TERM')
-        # alternatively to simply send SIGTERM, we can pend a request to
-        # send SIGKILL to the process if nothing happened in 10 seconds
-        # see base class process
-        self._subprocess.killCall = reactor.callLater(10, self.kill)
-
-    def kill(self):
-        """Send SIGKILL to child process
-
-        Mask exception generated when the child process has already exited.
-        """
+        primary_subprocess = self._subprocess
+        self.abortReap()
+        # In extreme cases the build may be hung too badly for
+        # scan-for-processes to manage to kill it (blocked on I/O,
+        # forkbombing test suite, etc.).  In this case, fail the builder and
+        # let an admin sort it out.
+        self._subprocess.builderFailCall = self._reactor.callLater(
+            self.abort_timeout, self.builderFail,
+            "Failed to kill all processes.", primary_subprocess)
+
+    def builderFail(self, reason, primary_subprocess):
+        """Mark the builder as failed."""
+        self._slave.log("ABORTING: %s\n" % reason)
+        self._subprocess.builderFailCall = None
+        self._slave.builderFail()
+        self.alreadyfailed = True
+        # If we failed to kill all processes in the chroot, then the primary
+        # subprocess (i.e. the one running immediately before
+        # doReapProcesses was called) may not have exited.  Kill it so that
+        # we can proceed.
         try:
-            self._subprocess.transport.signalProcess('KILL')
+            primary_subprocess.transport.signalProcess('KILL')
         except process.ProcessExitedAlready:
             self._slave.log("ABORTING: Process Exited Already\n")
+        primary_subprocess.transport.loseConnection()
+        # Leave the reaper running, but disconnect it from our state
+        # machine.  Perhaps an admin can make something of it, and in any
+        # case scan-for-processes elevates itself to root so it's awkward to
+        # kill it.
+        self._subprocess.ignore = True
+        self._subprocess.transport.loseConnection()
+
 
 class BuilderStatus:
     """Status values for the builder."""
@@ -229,6 +283,7 @@
     PACKAGEFAIL = "BuildStatus.PACKAGEFAIL"
     CHROOTFAIL = "BuildStatus.CHROOTFAIL"
     BUILDERFAIL = "BuildStatus.BUILDERFAIL"
+    ABORTED = "BuildStatus.ABORTED"
 
 
 class BuildDSlave(object):
@@ -458,8 +513,10 @@
 
     def builderFail(self):
         """Cease building because the builder has a problem."""
-        if self.builderstatus != BuilderStatus.BUILDING:
-            raise ValueError("Slave is not BUILDING when set to BUILDERFAIL")
+        if self.builderstatus not in (BuilderStatus.BUILDING,
+                                      BuilderStatus.ABORTING):
+            raise ValueError("Slave is not BUILDING|ABORTING when set to "
+                             "BUILDERFAIL")
         self.buildstatus = BuildStatus.BUILDERFAIL
 
     def chrootFail(self):
@@ -497,14 +554,25 @@
             raise ValueError("Slave is not BUILDING when set to GIVENBACK")
         self.buildstatus = BuildStatus.GIVENBACK
 
+    def buildAborted(self):
+        """Mark a build as aborted."""
+        if self.builderstatus != BuilderStatus.ABORTING:
+            raise ValueError("Slave is not ABORTING when set to ABORTED")
+        if self.buildstatus != BuildStatus.BUILDERFAIL:
+            self.buildstatus = BuildStatus.ABORTED
+
     def buildComplete(self):
         """Mark the build as complete and waiting interaction from the build
         daemon master.
         """
-        if self.builderstatus != BuilderStatus.BUILDING:
-            raise ValueError("Slave is not BUILDING when told build is "
-                             "complete")
-        self.builderstatus = BuilderStatus.WAITING
+        if self.builderstatus == BuilderStatus.BUILDING:
+            self.builderstatus = BuilderStatus.WAITING
+        elif self.builderstatus == BuilderStatus.ABORTING:
+            self.buildAborted()
+            self.builderstatus = BuilderStatus.WAITING
+        else:
+            raise ValueError("Slave is not BUILDING|ABORTING when told build "
+                             "is complete")
 
     def sanitizeBuildlog(self, log_path):
         """Removes passwords from buildlog URLs.

=== modified file 'lpbuildd/sourcepackagerecipe.py'
--- lpbuildd/sourcepackagerecipe.py	2011-11-10 04:55:02 +0000
+++ lpbuildd/sourcepackagerecipe.py	2013-07-25 17:35:33 +0000
@@ -124,8 +124,12 @@
                 self._slave.builderFail()
                 print("Returning build status: Builder failed.")
             self.alreadyfailed = True
-        self._state = DebianBuildState.REAP
-        self.doReapProcesses()
+        self.doReapProcesses(self._state)
+
+    def iterateReap_BUILD_RECIPE(self, retcode):
+        """Finished reaping after recipe building."""
+        self._state = DebianBuildState.UMOUNT
+        self.doUnmounting()
 
     def getChangesFilename(self):
         """Return the path to the changes file."""

=== modified file 'lpbuildd/tests/buildd-slave-test.conf'
--- lpbuildd/tests/buildd-slave-test.conf	2011-12-08 22:13:03 +0000
+++ lpbuildd/tests/buildd-slave-test.conf	2013-07-25 17:35:33 +0000
@@ -12,17 +12,16 @@
 mountpath = /var/tmp/buildd/slavebin/mount-chroot
 umountpath = /var/tmp/buildd/slavebin/umount-chroot
 preppath = /var/tmp/buildd/slavebin/slave-prep
+processscanpath = /var/tmp/buildd/slavebin/scan-for-processes
 
 [debianmanager]
 sbuildpath = /var/tmp/buildd/slavebin/sbuild-package
 sbuildargs = -dautobuild --nolog --batch
 updatepath = /var/tmp/buildd/slavebin/update-debian-chroot
-processscanpath = /var/tmp/buildd/slavebin/scan-for-processes
 sourcespath = /usr/share/launchpad-buildd/slavebin/override-sources-list
 
 [binarypackagemanager]
 sbuildpath = /var/tmp/buildd/slavebin/sbuild-package
 sbuildargs = -dautobuild --nolog --batch
 updatepath = /var/tmp/buildd/slavebin/update-debian-chroot
-processscanpath = /var/tmp/buildd/slavebin/scan-for-processes
 sourcespath = /usr/share/launchpad-buildd/slavebin/override-sources-list

=== added file 'lpbuildd/tests/fakeslave.py'
--- lpbuildd/tests/fakeslave.py	1970-01-01 00:00:00 +0000
+++ lpbuildd/tests/fakeslave.py	2013-07-25 17:35:33 +0000
@@ -0,0 +1,104 @@
+# Copyright 2013 Canonical Ltd.  This software is licensed under the
+# GNU Affero General Public License version 3 (see the file LICENSE).
+
+__metaclass__ = type
+__all__ = [
+    'FakeMethod',
+    'FakeSlave',
+    ]
+
+import os
+
+
+class FakeMethod:
+    """Catch any function or method call, and record the fact.
+
+    Use this for easy stubbing.  The call operator can return a fixed
+    value, or raise a fixed exception object.
+
+    This is useful when unit-testing code that does things you don't
+    want to integration-test, e.g. because it wants to talk to remote
+    systems.
+    """
+
+    def __init__(self, result=None, failure=None):
+        """Set up a fake function or method.
+
+        :param result: Value to return.
+        :param failure: Exception to raise.
+        """
+        self.result = result
+        self.failure = failure
+
+        # A log of arguments for each call to this method.
+        self.calls = []
+
+    def __call__(self, *args, **kwargs):
+        """Catch an invocation to the method.
+
+        Increment `call_count`, and adds the arguments to `calls`.
+
+        Accepts any and all parameters.  Raises the failure passed to
+        the constructor, if any; otherwise, returns the result value
+        passed to the constructor.
+        """
+        self.calls.append((args, kwargs))
+
+        if self.failure is None:
+            return self.result
+        else:
+            # pylint thinks this raises None, which is clearly not
+            # possible.  That's why this test disables pylint message
+            # E0702.
+            raise self.failure
+
+    @property
+    def call_count(self):
+        return len(self.calls)
+
+    def extract_args(self):
+        """Return just the calls' positional-arguments tuples."""
+        return [args for args, kwargs in self.calls]
+
+    def extract_kwargs(self):
+        """Return just the calls' keyword-arguments dicts."""
+        return [kwargs for args, kwargs in self.calls]
+
+
+class FakeConfig:
+    def get(self, section, key):
+        return key
+
+
+class FakeSlave:
+    def __init__(self, tempdir):
+        self._cachepath = tempdir
+        self._config = FakeConfig()
+        self._was_called = set()
+        self.waitingfiles = {}
+
+    def cachePath(self, file):
+        return os.path.join(self._cachepath, file)
+
+    def anyMethod(self, *args, **kwargs):
+        pass
+
+    fake_methods = [
+        'emptyLog', 'log', 'chrootFail', 'buildFail', 'builderFail',
+        ]
+    def __getattr__(self, name):
+        """Remember which fake methods were called."""
+        if name not in self.fake_methods:
+            raise AttributeError(
+                "'%s' object has no attribute '%s'" % (self.__class__, name))
+        self._was_called.add(name)
+        return self.anyMethod
+
+    def wasCalled(self, name):
+        return name in self._was_called
+
+    def getArch(self):
+        return 'i386'
+
+    storeFile = FakeMethod()
+    addWaitingFile = FakeMethod()

=== modified file 'lpbuildd/tests/harness.py'
--- lpbuildd/tests/harness.py	2012-06-05 17:11:03 +0000
+++ lpbuildd/tests/harness.py	2013-07-25 17:35:33 +0000
@@ -4,6 +4,7 @@
 __metaclass__ = type
 __all__ = [
     'BuilddTestCase',
+    'FakeSlave',
     ]
 
 import os

=== added file 'lpbuildd/tests/test_binarypackage.py'
--- lpbuildd/tests/test_binarypackage.py	1970-01-01 00:00:00 +0000
+++ lpbuildd/tests/test_binarypackage.py	2013-07-25 17:35:33 +0000
@@ -0,0 +1,204 @@
+# Copyright 2013 Canonical Ltd.  This software is licensed under the
+# GNU Affero General Public License version 3 (see the file LICENSE).
+
+__metaclass__ = type
+
+import tempfile
+
+import os
+import shutil
+from testtools import TestCase
+
+from twisted.internet.task import Clock
+
+from lpbuildd.binarypackage import (
+    BinaryPackageBuildManager,
+    BinaryPackageBuildState,
+    )
+from lpbuildd.tests.fakeslave import (
+    FakeMethod,
+    FakeSlave,
+    )
+
+
+class MockTransport:
+    loseConnection = FakeMethod()
+    signalProcess = FakeMethod()
+
+
+class MockSubprocess:
+    def __init__(self, path):
+        self.path = path
+        self.transport = MockTransport()
+
+
+class MockBuildManager(BinaryPackageBuildManager):
+    def __init__(self, *args, **kwargs):
+        super(MockBuildManager, self).__init__(*args, **kwargs)
+        self.commands = []
+        self.iterators = []
+
+    def runSubProcess(self, path, command, iterate=None):
+        self.commands.append([path]+command)
+        self.iterators.append(self.iterate if iterate is None else iterate)
+        self._subprocess = MockSubprocess(path)
+        return 0
+
+
+class TestBinaryPackageBuildManagerIteration(TestCase):
+    """Run BinaryPackageBuildManager through its iteration steps."""
+    def setUp(self):
+        super(TestBinaryPackageBuildManagerIteration, self).setUp()
+        self.working_dir = tempfile.mkdtemp()
+        self.addCleanup(lambda: shutil.rmtree(self.working_dir))
+        slave_dir = os.path.join(self.working_dir, 'slave')
+        home_dir = os.path.join(self.working_dir, 'home')
+        for dir in (slave_dir, home_dir):
+            os.mkdir(dir)
+        self.slave = FakeSlave(slave_dir)
+        self.buildid = '123'
+        self.clock = Clock()
+        self.buildmanager = MockBuildManager(
+            self.slave, self.buildid, reactor=self.clock)
+        self.buildmanager.home = home_dir
+        self.buildmanager._cachepath = self.slave._cachepath
+        self.chrootdir = os.path.join(
+            home_dir, 'build-%s' % self.buildid, 'chroot-autobuild')
+
+    def getState(self):
+        """Retrieve build manager's state."""
+        return self.buildmanager._state
+
+    def startBuild(self):
+        # The build manager's iterate() kicks off the consecutive states
+        # after INIT.
+        self.buildmanager.initiate(
+            {'foo_1.dsc': ''}, 'chroot.tar.gz',
+            {'suite': 'warty', 'ogrecomponent': 'main'})
+
+        # Skip DebianBuildManager states to the state directly before
+        # SBUILD.
+        self.buildmanager._state = BinaryPackageBuildState.UPDATE
+
+        # SBUILD: Build the package.
+        self.buildmanager.iterate(0)
+        self.assertEqual(BinaryPackageBuildState.SBUILD, self.getState())
+        expected_command = [
+            'sbuildpath', 'sbuild-package', self.buildid, 'i386', 'warty',
+            'sbuildargs', '--dist=warty', '--architecture=i386', '--comp=main',
+            'foo_1.dsc',
+            ]
+        self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
+        self.assertFalse(self.slave.wasCalled('chrootFail'))
+
+    def test_iterate(self):
+        # The build manager iterates a normal build from start to finish.
+        self.startBuild()
+
+        log_path = os.path.join(self.buildmanager._cachepath, 'buildlog')
+        log = open(log_path, 'w')
+        log.write("I am a build log.")
+        log.close()
+
+        changes_path = os.path.join(
+            self.buildmanager.home, 'build-%s' % self.buildid,
+            'foo_1_i386.changes')
+        changes = open(changes_path, 'w')
+        changes.write("I am a changes file.")
+        changes.close()
+
+        # After building the package, reap processes.
+        self.buildmanager.iterate(0)
+        expected_command = [
+            'processscanpath', 'processscanpath', self.buildid,
+            ]
+        self.assertEqual(BinaryPackageBuildState.SBUILD, self.getState())
+        self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertNotEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
+        self.assertFalse(self.slave.wasCalled('buildFail'))
+        self.assertEqual([], self.slave.addWaitingFile.calls)
+
+        # Control returns to the DebianBuildManager in the UMOUNT state.
+        self.buildmanager.iterateReap(self.getState(), 0)
+        expected_command = [
+            'umountpath', 'umount-chroot', self.buildid
+            ]
+        self.assertEqual(BinaryPackageBuildState.UMOUNT, self.getState())
+        self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
+        self.assertFalse(self.slave.wasCalled('buildFail'))
+
+    def test_abort_sbuild(self):
+        # Aborting sbuild kills processes in the chroot.
+        self.startBuild()
+
+        # Send an abort command.  The build manager reaps processes.
+        self.buildmanager.abort()
+        expected_command = [
+            'processscanpath', 'processscanpath', self.buildid
+            ]
+        self.assertEqual(BinaryPackageBuildState.SBUILD, self.getState())
+        self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertNotEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
+        self.assertFalse(self.slave.wasCalled('buildFail'))
+
+        # If reaping completes successfully, the build manager returns
+        # control to the DebianBuildManager in the UMOUNT state.
+        self.buildmanager.iterateReap(self.getState(), 0)
+        expected_command = [
+            'umountpath', 'umount-chroot', self.buildid
+            ]
+        self.assertEqual(BinaryPackageBuildState.UMOUNT, self.getState())
+        self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
+        self.assertFalse(self.slave.wasCalled('buildFail'))
+
+    def test_abort_sbuild_fail(self):
+        # If killing processes in the chroot hangs, the build manager does
+        # its best to clean up and fails the builder.
+        self.startBuild()
+        sbuild_subprocess = self.buildmanager._subprocess
+
+        # Send an abort command.  The build manager reaps processes.
+        self.buildmanager.abort()
+        expected_command = [
+            'processscanpath', 'processscanpath', self.buildid
+            ]
+        self.assertEqual(BinaryPackageBuildState.SBUILD, self.getState())
+        self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertNotEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
+        self.assertFalse(self.slave.wasCalled('builderFail'))
+        reap_subprocess = self.buildmanager._subprocess
+
+        # If reaping fails, the builder is failed, sbuild is killed, and the
+        # reaper is disconnected.
+        self.clock.advance(120)
+        self.assertTrue(self.slave.wasCalled('builderFail'))
+        self.assertEqual(
+            [(('KILL',), {})], sbuild_subprocess.transport.signalProcess.calls)
+        self.assertNotEqual(
+            [], sbuild_subprocess.transport.loseConnection.calls)
+        self.assertNotEqual([], reap_subprocess.transport.loseConnection.calls)
+
+        log_path = os.path.join(self.buildmanager._cachepath, 'buildlog')
+        log = open(log_path, 'w')
+        log.write("I am a build log.")
+        log.close()
+
+        # When sbuild exits, it does not reap processes again, but proceeds
+        # directly to UMOUNT.
+        self.buildmanager.iterate(128 + 9)  # SIGKILL
+        expected_command = [
+            'umountpath', 'umount-chroot', self.buildid
+            ]
+        self.assertEqual(BinaryPackageBuildState.UMOUNT, self.getState())
+        self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])

=== modified file 'lpbuildd/tests/test_translationtemplatesbuildmanager.py'
--- lpbuildd/tests/test_translationtemplatesbuildmanager.py	2011-11-21 05:41:27 +0000
+++ lpbuildd/tests/test_translationtemplatesbuildmanager.py	2013-07-25 17:35:33 +0000
@@ -9,109 +9,22 @@
 
 from testtools import TestCase
 
+from lpbuildd.tests.fakeslave import FakeSlave
 from lpbuildd.translationtemplates import (
     TranslationTemplatesBuildManager,
     TranslationTemplatesBuildState,
     )
 
 
-class FakeMethod:
-    """Catch any function or method call, and record the fact.
-
-    Use this for easy stubbing.  The call operator can return a fixed
-    value, or raise a fixed exception object.
-
-    This is useful when unit-testing code that does things you don't
-    want to integration-test, e.g. because it wants to talk to remote
-    systems.
-    """
-
-    def __init__(self, result=None, failure=None):
-        """Set up a fake function or method.
-
-        :param result: Value to return.
-        :param failure: Exception to raise.
-        """
-        self.result = result
-        self.failure = failure
-
-        # A log of arguments for each call to this method.
-        self.calls = []
-
-    def __call__(self, *args, **kwargs):
-        """Catch an invocation to the method.
-
-        Increment `call_count`, and adds the arguments to `calls`.
-
-        Accepts any and all parameters.  Raises the failure passed to
-        the constructor, if any; otherwise, returns the result value
-        passed to the constructor.
-        """
-        self.calls.append((args, kwargs))
-
-        if self.failure is None:
-            return self.result
-        else:
-            # pylint thinks this raises None, which is clearly not
-            # possible.  That's why this test disables pylint message
-            # E0702.
-            raise self.failure
-
-    @property
-    def call_count(self):
-        return len(self.calls)
-
-    def extract_args(self):
-        """Return just the calls' positional-arguments tuples."""
-        return [args for args, kwargs in self.calls]
-
-    def extract_kwargs(self):
-        """Return just the calls' keyword-arguments dicts."""
-        return [kwargs for args, kwargs in self.calls]
-
-
-class FakeConfig:
-    def get(self, section, key):
-        return key
-
-
-class FakeSlave:
-    def __init__(self, tempdir):
-        self._cachepath = tempdir
-        self._config = FakeConfig()
-        self._was_called = set()
-
-    def cachePath(self, file):
-        return os.path.join(self._cachepath, file)
-
-    def anyMethod(self, *args, **kwargs):
-        pass
-
-    fake_methods = ['emptyLog', 'chrootFail', 'buildFail', 'builderFail',]
-    def __getattr__(self, name):
-        """Remember which fake methods were called."""
-        if name not in self.fake_methods:
-            raise AttributeError(
-                "'%s' object has no attribute '%s'" % (self.__class__, name))
-        self._was_called.add(name)
-        return self.anyMethod
-
-    def wasCalled(self, name):
-        return name in self._was_called
-
-    def getArch(self):
-        return 'i386'
-
-    addWaitingFile = FakeMethod()
-
-
 class MockBuildManager(TranslationTemplatesBuildManager):
     def __init__(self, *args, **kwargs):
         super(MockBuildManager, self).__init__(*args, **kwargs)
         self.commands = []
+        self.iterators = []
 
-    def runSubProcess(self, path, command):
+    def runSubProcess(self, path, command, iterate=None):
         self.commands.append([path]+command)
+        self.iterators.append(self.iterate if iterate is None else iterate)
         return 0
 
 
@@ -143,7 +56,7 @@
         # after INIT.
         self.buildmanager.initiate({}, 'chroot.tar.gz', {'branch_url': url})
 
-        # Skip states that are done in DebianBuldManager to the state
+        # Skip states that are done in DebianBuildManager to the state
         # directly before INSTALL.
         self.buildmanager._state = TranslationTemplatesBuildState.UPDATE
 
@@ -158,6 +71,8 @@
             'apt-get',
             ]
         self.assertEqual(expected_command, self.buildmanager.commands[-1][:5])
+        self.assertEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
 
         # GENERATE: Run the slave's payload, the script that generates
         # templates.
@@ -168,6 +83,8 @@
             'generatepath', 'generatepath', self.buildid, url, 'resultarchive'
             ]
         self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
         self.assertFalse(self.slave.wasCalled('chrootFail'))
 
         outfile_path = os.path.join(
@@ -179,18 +96,32 @@
         outfile.write("I am a template tarball. Seriously.")
         outfile.close()
 
-        # The control returns to the DebianBuildManager in the REAP state.
+        # After generating templates, reap processes.
         self.buildmanager.iterate(0)
         expected_command = [
             'processscanpath', 'processscanpath', self.buildid
             ]
         self.assertEqual(
-            TranslationTemplatesBuildState.REAP, self.getState())
+            TranslationTemplatesBuildState.GENERATE, self.getState())
         self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertNotEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
         self.assertFalse(self.slave.wasCalled('buildFail'))
         self.assertEqual(
             [((outfile_path,), {})], self.slave.addWaitingFile.calls)
 
+        # The control returns to the DebianBuildManager in the UMOUNT state.
+        self.buildmanager.iterateReap(self.getState(), 0)
+        expected_command = [
+            'umountpath', 'umount-chroot', self.buildid
+            ]
+        self.assertEqual(
+            TranslationTemplatesBuildState.UMOUNT, self.getState())
+        self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
+        self.assertFalse(self.slave.wasCalled('buildFail'))
+
     def test_iterate_fail_INSTALL(self):
         # See that a failing INSTALL is handled properly.
         url = 'lp:~my/branch'
@@ -209,6 +140,8 @@
             'umountpath', 'umount-chroot', self.buildid
             ]
         self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
         self.assertTrue(self.slave.wasCalled('chrootFail'))
 
     def test_iterate_fail_GENERATE(self):
@@ -221,12 +154,25 @@
         # Skip states to the INSTALL state.
         self.buildmanager._state = TranslationTemplatesBuildState.GENERATE
 
-        # The buildmanager fails and iterates to the REAP state.
+        # The buildmanager fails and reaps processes.
         self.buildmanager.iterate(-1)
         expected_command = [
             'processscanpath', 'processscanpath', self.buildid
             ]
         self.assertEqual(
-            TranslationTemplatesBuildState.REAP, self.getState())
+            TranslationTemplatesBuildState.GENERATE, self.getState())
         self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertNotEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])
         self.assertTrue(self.slave.wasCalled('buildFail'))
+
+        # The buildmanager iterates to the UMOUNT state.
+        self.buildmanager.iterateReap(self.getState(), 0)
+        self.assertEqual(
+            TranslationTemplatesBuildState.UMOUNT, self.getState())
+        expected_command = [
+            'umountpath', 'umount-chroot', self.buildid
+            ]
+        self.assertEqual(expected_command, self.buildmanager.commands[-1])
+        self.assertEqual(
+            self.buildmanager.iterate, self.buildmanager.iterators[-1])

=== modified file 'lpbuildd/translationtemplates.py'
--- lpbuildd/translationtemplates.py	2011-11-10 04:55:02 +0000
+++ lpbuildd/translationtemplates.py	2013-07-25 17:35:33 +0000
@@ -88,12 +88,14 @@
         if success == 0:
             # It worked! Now let's bring in the harvest.
             self.gatherResults()
-            self._state = TranslationTemplatesBuildState.REAP
-            self.doReapProcesses()
+            self.doReapProcesses(self._state)
         else:
             if not self.alreadyfailed:
                 self._slave.buildFail()
                 self.alreadyfailed = True
-            self._state = TranslationTemplatesBuildState.REAP
-            self.doReapProcesses()
+            self.doReapProcesses(self._state)
 
+    def iterateReap_GENERATE(self, success):
+        """Finished reaping after template generation."""
+        self._state = TranslationTemplatesBuildState.UMOUNT
+        self.doUnmounting()

=== modified file 'template-buildd-slave.conf'
--- template-buildd-slave.conf	2011-12-05 02:39:05 +0000
+++ template-buildd-slave.conf	2013-07-25 17:35:33 +0000
@@ -15,10 +15,10 @@
 cleanpath = /usr/share/launchpad-buildd/slavebin/remove-build
 mountpath = /usr/share/launchpad-buildd/slavebin/mount-chroot
 umountpath = /usr/share/launchpad-buildd/slavebin/umount-chroot
+processscanpath = /usr/share/launchpad-buildd/slavebin/scan-for-processes
 
 [debianmanager]
 updatepath = /usr/share/launchpad-buildd/slavebin/update-debian-chroot
-processscanpath = /usr/share/launchpad-buildd/slavebin/scan-for-processes
 sourcespath = /usr/share/launchpad-buildd/slavebin/override-sources-list
 
 [binarypackagemanager]


Follow ups