launchpad-dev team mailing list archive


Build farm and the slave build id menagerie

To: Launchpad Community Development Team <launchpad-dev@xxxxxxxxxxxxxxxxxxx>
From: Jeroen Vermeulen <jtv@xxxxxxxxxxxxx>
Date: Sat, 13 Mar 2010 17:24:57 +0700
User-agent: Thunderbird 2.0.0.23 (X11/20090817)

I've just been discussing something with wgrant that has been bothering both of us.

The build farm puts serious complexity into having unique "slave build ids" (which are basically the same as "buildfarm job names"). Theoretically arbitrary, they are produced in different ways for different job types, and then for each job type there's a method to check that the build ids cited by the slaves match what the master thought the slaves were working on.

The only constant between these slave build ids is that they all contain a BuildQueue id. And that's enough to guarantee uniqueness (though AFAIK even that isn't really needed). They also all contain a "something else" that can be cross-checked against it: a Build.id, a BuildBase.id (not the same!), a Branch.name. That's where all the complexity goes.

We can't be sure, but we think the cross-check may have started out as an extra protection against compromised slaves trying to confuse the buildd master. If it is, the ids are too predictable to offer much protection (and I'm told the worst the attacker could achieve is hold up the recovery of a hung slave). Or maybe it's just a belt-and-suspenders check against accidental matches, but then making it simpler would be much better protection.


So we propose simplifying the whole thing as follows:

1. The slave build id is concocted in a single place, and completely generic between build farm job types.

2. Likewise, we verify the slave build id in a single place and with no variations for different job types.

3. We pass the ready-made slave build id to dispatchBuildToSlave. There's no need for each implementation to repeat the code to generate it.

4. The slave build id uses the BuildQueue id for uniqueness, plus optionally a hard-to-predict cookie to thwart compromised slaves. We may even want to combine the two into a single hash; see below.

5. If we do want a cookie for security, we use generically available values that are tightly associated with the slave build but not all predictable in the same way: Job.date_created, BuildQueue.builder, Job.requester. If we hash the lot together, a compromised slave won't receive any of the component values for its own job as starting points for a guess.


6. We come up with a better or at least consistent name for these.

7. We forget about the whole thing & live merrily ever after.
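
A rough sketch of what points 1, 2, 4 and 5 could look like, not existing Launchpad code. The function names, the "-" separator, the "|" joiner and the choice of SHA-256 are all assumptions made for illustration; the inputs stand in for BuildQueue.id, Job.date_created, BuildQueue.builder and Job.requester.

```python
import hashlib
import hmac
from datetime import datetime


def make_cookie(date_created, builder_id, requester_id):
    """Hash job-related values into a cookie (hypothetical helper).

    The slave only ever sees the digest, never the component values.
    """
    material = "|".join(
        [date_created.isoformat(), str(builder_id), str(requester_id)])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def make_slave_build_id(build_queue_id, cookie=None):
    """Single, job-type-agnostic place where the id is generated."""
    if cookie is None:
        # The "no cookie at all" variant: BuildQueue id only.
        return str(build_queue_id)
    return "%d-%s" % (build_queue_id, cookie)


def verify_slave_build_id(reported_id, build_queue_id, cookie=None):
    """Single place where the master checks what a slave reports."""
    expected = make_slave_build_id(build_queue_id, cookie)
    # Constant-time comparison; both strings are plain ASCII.
    return hmac.compare_digest(reported_id, expected)


if __name__ == "__main__":
    cookie = make_cookie(datetime(2010, 3, 13, 10, 24, 57), 7, 42)
    slave_build_id = make_slave_build_id(1234, cookie)
    print(slave_build_id)
    print(verify_slave_build_id(slave_build_id, 1234, cookie))
```

The ready-made string would then be what gets handed to dispatchBuildToSlave, so no job type needs its own generation or verification code.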

If we ever decide that we need seriously unpredictable ids, the hash I suggested is an improvement but still not exactly safe. If desired we can throw in a new column BuildQueue.salt later, optional at first, to get a better hash trapdoor without breaking compatibility with pending jobs. Who wouldn't want cookies with salt in them?
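
As an illustration of the optional salt, here is a minimal sketch. It assumes a nullable BuildQueue.salt column as proposed; the helper names and the use of Python's secrets module are assumptions, not anything Launchpad has. Jobs created before the column existed have no salt and keep producing the same cookie as before.

```python
import hashlib
import secrets


def new_salt():
    """Random value to store in the (hypothetical) BuildQueue.salt column."""
    return secrets.token_hex(16)


def cookie_with_optional_salt(components, salt=None):
    """Hash the job's component values, mixing in the salt if there is one.

    With salt=None the result is the plain unsalted hash, so ids for
    pending jobs created without a salt still verify.
    """
    parts = [str(component) for component in components]
    if salt is not None:
        parts.append(salt)
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    components = ["2010-03-13T10:24:57", 7, 42]
    print(cookie_with_optional_salt(components))
    print(cookie_with_optional_salt(components, salt=new_salt()))
```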

Then again, maybe we don't need a cookie at all and that would be even easier.



Any comments?  Jeers?  Cheers?  Beers..?


Jeroen

Follow ups

Re: Build farm and the slave build id menagerie
From: Jonathan Lange, 2010-03-16