launchpad-dev team mailing list archive

Re: Immediate plan for Build Farm generic jobs

 

Michael Hudson wrote:
> So, we finally get to think concretely about some things :-)

I hope it was worth the wait.

> Or, and there are actually reasons for doing this, we could do something
> in between: store it mostly as text but replace the references to
> branches in the text with references to database objects (probably the
> id of entries in some RecipeBranch linking table).  This would let us
> (a) check that the branches exist at parsing time (b) keep the
> references up to date if the branch is moved or renamed (c) prevent
> branches that are referenced in Recipes from being deleted.

This seems very reasonable to me.
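
To make the idea concrete, here is a minimal sketch of the substitution.
It assumes that a branch reference in the recipe text is "lp:" followed
by non-whitespace characters, and that we have some lookup from branch
path to database id.  The names mangle_branch_refs and BRANCH_IDS are
made up for illustration; nothing like this exists in bzr-builder or
Launchpad as far as I know.

```python
import re

# Hypothetical lookup from branch path to database id.  In Launchpad
# this would be a database query rather than a dict.
BRANCH_IDS = {"foo": 21435, "~user/project/packaging": 99}

# Assumption: a branch reference is "lp:" followed by non-whitespace.
BRANCH_REF = re.compile(r"lp:(\S+)")


def mangle_branch_refs(recipe_text, branch_ids):
    """Replace each lp:<path> with lp:<id>.

    Returns the mangled text and the list of referenced branch ids
    (the rows that would go into the RecipeBranch linking table).
    Raises LookupError if a referenced branch is unknown, which is the
    "check that the branches exist at parsing time" part.
    """
    referenced = []

    def replace(match):
        path = match.group(1)
        if path not in branch_ids:
            raise LookupError("no such branch: lp:%s" % path)
        branch_id = branch_ids[path]
        referenced.append(branch_id)
        return "lp:%d" % branch_id

    return BRANCH_REF.sub(replace, recipe_text), referenced
```

Going the other way (id back to the current branch name when showing
the recipe) is what would keep the text correct after a rename.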

> Separately, we need to decide where a recipe lives.  The current
> thinking is
> "https://launchpad.net/ubuntu/karmic/+source/some-package/+recipe/recipe-name";
> which seems OK to me (we'd have to trust a bit that this recipe would
> build a recipe for some-package in karmic, but that doesn't seem any
> different to say branches today).

I don't think this is good for a few reasons:
 * The URL is too long
 * The traversal would fail for a recipe where a source package name
doesn't exist yet
 * The recipe is tied to a series.  Recipes should be independent of a
series.
 * Why have more than one recipe for one package?

My suggestion is:
/<distro>/+recipe/packagename

> Finally, we could stick an archive on the recipe, but maybe we don't
> want to.  I'll talk about this a bit more later in the mail.

We absolutely don't want to do this, because source packages can exist
in many archives.  The act of publication of a package is separate from
its existence.

> This leads to a schema a bit like:
> 
> Recipe:

I think this should be called SourcePackageRecipe, mostly because we
might have other recipes and somewhat for consistency reasons.

>  - id, registrant, date_created, owner, date_last_modified
>    - all standard launchpad fields.  the owner would be able to edit
>      the recipe.
>  - name
>    - the last bit of the url

I don't think you need this.

>  - distroseries, sourcepackagename
>    - provides the rest of the url

I would say you need "distribution" instead of distroseries.

>  - recipe
>    - a text field containing the text of the recipe (probably with
> mangled branch references so lp:foo would be replaced with lp:21435)
> 
> RecipeBranch:
>  - id, recipe, branch
>    - all obvious i hope
> 
> What follows hopefully doesn't depend too much on how the above gets
> decided in the end.
> 
> For the job of building a recipe into a source package we'll have a
> BuildSourcePackageFromRecipeJob table.  I foresee this table looking like:
> 
> BuildSourcePackageFromRecipeJob
>  - job
>  - recipe
>  - archive?

I don't know if archive is going to be necessary here because it ties
the creation of a source package to one archive.  We might want to take
an existing recipe build job and re-upload it to a different archive.

> BuildQueue will get a row with a job column that will reference the
> same job and have a particular job_type.

Yep.

> One of the things bzr-builder does when it creates the debianised source
> tree is create a manifest, which is a sort of frozen version of a recipe
> -- it references particular revisions of the branches so as to allow a
> repeat of exactly this build.  We could use a manifest like this to
> actually run the recipe: at the point where the build is requested, we
> make the manifest and stuff it into the database.  This seems like a
> neat idea, but isn't how bzr-builder works now as far as I can tell.

I think this manifest should be stored somewhere with the build job.  As
I discussed with Jono on Monday, we know we're going to need a new table
(SourcePackageRecipeBuild) that records a build event to get a source
package from a recipe.  This table should have the manifest on it.

It's also possible we don't need this table and we can just use the
BuildSourcePackageFromRecipeJob.
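
Roughly, the shape I have in mind looks like the sketch below.  It uses
plain Python dataclasses rather than real model code, and the column
names and types (recipe_text, requester, duration and so on) are my
guesses for illustration, not a settled schema.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class SourcePackageRecipe:
    # Field names follow the discussion above; the types are guesses.
    id: int
    owner: str
    distribution: str
    sourcepackagename: str
    recipe_text: str


@dataclass
class SourcePackageRecipeBuild:
    # One row per attempt to turn a recipe into a source package.
    id: int
    recipe_id: int
    requester: str
    # Frozen copy of the recipe with every branch pinned to a revision.
    manifest: str
    # Unknown until the build has finished.
    duration: Optional[timedelta] = None
```

Note there is deliberately no archive column on either class in this
sketch.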


> This doesn't include anything that will actually create
> BuildSourcePackageFromRecipeJob rows (say every day for a daily build
> PPA).  I guess we can worry about this later.
> 
> I think the current plan is to use bzr-builder to make the debianized
> source tree and bzr-builddeb to then make the source package.  I
> presume the process for getting the source package off the builder and
> into the process of being built will follow that of the existing
> builders: the builder will tell the buildd-manager where to get the
> .dsc, the manager will parse this to find the other parts of the package
> and then grab them, shove all of the files into the librarian and
> trigger the existing parts of soyuz to look at them somehow[1].

What happens for binary builds is that the builders return a bunch of
files that the buildd-manager throws into a directory on disk.  It then
calls process-upload.py (using Popen :( ) to deal with it.  For a source
package resulting from a recipe build we can do exactly the same thing.

One thing I need to change though is to stop this use of Popen since it
blocks everything else on the buildd-manager.  There's a spec for this
at
https://blueprints.edge.launchpad.net/soyuz/+spec/buildd-manager-upload-decoupling
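
Just to illustrate the blocking problem, and not what the spec actually
proposes, here is one way to keep a slow subprocess from stalling the
caller: hand it to a worker thread and collect the exit code later.
The function name process_upload_async and the pool size are made up,
and the real buildd-manager would need something that fits its own
event loop.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Small pool so that a burst of uploads cannot swamp the machine.
# The size is an arbitrary choice for this sketch.
_upload_pool = ThreadPoolExecutor(max_workers=2)


def process_upload_async(command):
    """Run `command` in a worker thread.

    Returns a Future whose result is the process exit code, so the
    caller can carry on with other work and check back later.
    """
    def run():
        completed = subprocess.run(command, capture_output=True)
        return completed.returncode

    return _upload_pool.submit(run)
```

The caller would pass the full process-upload.py command line as a list
and poll the returned Future (or attach a callback with
add_done_callback) instead of waiting on the process directly.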

> Something that's missing from all the above is how the archive is
> selected.  It's more or less essential that the
> BuildSourcePackageFromRecipeJob knows the archive, so that the generated
> source package can be built for the right one.

Or we can divorce the archive from the Job entirely and have another
mechanism that records who requested the build/upload.  This is
important because we need to observe upload ACLs.

> It could be tied to the
> recipe or it could be supplied when the job row is created.  In some
> sense the archive is totally orthogonal to the recipe, but OTOH, I can't
> really see the use case for targeting more than one archive with a
> recipe.  Advice welcome.

For the recipe, no; for the actual source package, yes.

> 
> In case the above wasn't enough, here's some things I haven't thought
> hard about:
> 
>  - do people want to subscribe to a recipe?
>    - does this mean getting notified when the recipe builds or fails to
>      build?
>    - does this mean getting notified when the recipe is changed?

If a recipe fails to build, we need to notify the recipe owner and, if
a different person requested the build, that person as well.

Soyuz already has a lot of code for dealing with notifications, so it
should be easy enough to hook some more bits in.

> 
>  - the whole privacy thing.
>    - do we only allow recipes to be created that reference branches the
>      owner can see?

Makes sense.

>    - is having the people who can view the recipe being the intersection
>      of those that can see the branches reasonable?

Yes.
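
In other words, something like the following sketch, where each branch
maps to the set of people allowed to see it.  The function name
recipe_viewers and the data shape are assumptions for illustration; a
public branch would need special handling that is not shown here.

```python
def recipe_viewers(branch_viewers):
    """Return the people who can see every branch a recipe references.

    `branch_viewers` maps a branch name to an iterable of the people
    allowed to see that branch.
    """
    viewer_sets = [set(viewers) for viewers in branch_viewers.values()]
    if not viewer_sets:
        # A recipe with no branches has nobody derived from branches.
        return set()
    return set.intersection(*viewer_sets)
```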

>    - the issues of accessing private branches from the buildslaves
>      scares me a bit, I hope we can avoid worrying about that until some
>      time in 2010.

Yeah, I had only considered the firewall rules from the slaves.
Presumably we'll need a buildd-slave SSH key that can access everything?

>> The model code should implement the interface ISoyuzJob (although this is a 
>> terrible name, it will be changed) which is declared in 
>> lib/lp/soyuz/interfaces/soyuzjob.py.
> 
> This file doesn't seem to exist?

See Muharem's response.

> In coarse outline, building a source package from a recipe isn't very
> different from building a binary from a source package, so this sounds
> like it will be a mass of (presumably devilish) details rather than deep
> design work.

Hopefully yes.  One thing we need to make sure of is that *all* build
jobs have a determinate build time.

We currently have a system in place that estimates how long it will be
before your package build starts.  It does this by adding up the
previous build times for all the packages in the queue in front of you.

To be able to continue to do this, we need to be able to look up a
SourcePackageRecipeBuild for a package name (or the equivalent Job table
row) and see how long it took to build last time.
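
As a minimal sketch of that estimate (not the actual Soyuz code): sum
the last known build duration of everything ahead of you in the queue.
The function name, the single-builder simplification and the fallback
duration for jobs that have never been built are all my assumptions.

```python
from datetime import timedelta


def estimate_start_delay(queue_ahead, last_durations,
                         default=timedelta(minutes=5)):
    """Estimate how long until a newly queued job starts.

    `queue_ahead` lists the job names in front of us, and
    `last_durations` maps a job name to how long it took last time.
    Jobs with no history fall back to `default` (an arbitrary guess).
    """
    total = timedelta()
    for name in queue_ahead:
        total += last_durations.get(name, default)
    return total
```

This is why a recipe build with no way to look up its previous duration
would break the estimate for everything queued behind it.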

Muharem is refactoring our code right now and making an interface that
your job classes must implement for us to be able to do this.

> [1] I guess the fact that these packages aren't signed will bite us in
>    the ass somehow or other at some point, but I don't know how much it
>    affects how this bit would work.  We don't *have* to get the source
>    package files into Soyuz via the Librarian I guess.

Mark S has suggested that we have a single Launchpad key to sign them,
so that if the packages are used outside of Launchpad then people know
where they came from.

Cheers
J


