The State of the Soyuz

 

Build Farm Improvements
=======================

This is going well, and we have a few initiatives under way:

Making build uploads asynchronous.
----------------------------------
Currently, as each build is fetched from the slave, its upload blocks the 
buildd-manager until it has finished.  For a large upload this can take half a 
minute.  As more builders are added to the build farm, this quickly becomes a 
scaling nightmare!  Jelmer is making changes so that the uploads are placed in 
a queue which is processed outside of the manager process.
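
To illustrate the idea only (this is not Jelmer's branch), here is a minimal 
Python sketch of handing uploads to a queue.  The names process_uploads and 
collect_build are made up for this example, and a worker thread stands in for 
the separate process that would drain the queue in the real system.

```python
import queue
import threading
import time


def process_uploads(upload_queue, handle_upload):
    """Drain the queue away from the manager's scan loop (hypothetical worker)."""
    while True:
        build = upload_queue.get()
        if build is None:  # sentinel value: stop the worker
            break
        handle_upload(build)


def collect_build(build, upload_queue):
    """Called by the manager: enqueue the build and return straight away."""
    upload_queue.put(build)


if __name__ == "__main__":
    uploads = queue.Queue()
    finished = []

    def slow_upload(build):
        time.sleep(0.1)  # stand-in for a slow upload
        finished.append(build)

    worker = threading.Thread(target=process_uploads, args=(uploads, slow_upload))
    worker.start()
    for name in ["build-1", "build-2", "build-3"]:
        collect_build(name, uploads)  # the manager never waits on the upload
    uploads.put(None)
    worker.join()
    print(finished)
```

The point of the design is that the manager only pays the cost of a queue 
insert, however long the upload itself takes.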

Parallel scans of slaves.
-------------------------
Currently, the manager polls each builder in turn on each scan cycle.  This is 
problematic for two reasons: it limits how many network operations can run at 
the same time, and it reduces robustness, since a failure while scanning one 
builder aborts the whole scan.  I've been working on a branch that separates 
these scans so each builder is isolated, and it's currently in testing on the 
dogfood server.
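
As a rough sketch of what isolated, concurrent scans could look like (not the 
code in my branch), the Python below uses a standard-library thread pool.  The 
function scan_all and the scan_one callback are hypothetical names chosen for 
the example.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_all(builders, scan_one):
    """Scan every builder concurrently; one failure does not abort the rest.

    Returns a dict mapping each builder to ("ok", result) or ("failed", exc).
    """
    def isolated(builder):
        # Catch errors per builder so they cannot escape into the other scans.
        try:
            return ("ok", scan_one(builder))
        except Exception as exc:
            return ("failed", exc)

    with ThreadPoolExecutor(max_workers=max(1, len(builders))) as pool:
        outcomes = list(pool.map(isolated, builders))
    return dict(zip(builders, outcomes))


if __name__ == "__main__":
    def fake_scan(builder):
        if builder == "builder-2":
            raise IOError("connection refused")
        return "idle"

    for name, outcome in scan_all(["builder-1", "builder-2", "builder-3"], fake_scan).items():
        print(name, outcome)
```

Here the failure on builder-2 is recorded against that builder only, and the 
other two scans still complete.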

Better failure detection.
-------------------------
Sometimes we get a bad builder or a bad job.  A bad job can take out the whole 
build farm as it hops from builder to builder, getting automatically retried.  
I've got an experimental branch that works out whether the job or the builder 
is at fault by counting failures of both and seeing which count increases the 
most over time.
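
A simplified sketch of the counting idea, in Python, might look like the 
following.  The FailureTracker class and its method names are invented for 
illustration, and comparing raw counts is an assumption on my part here; the 
experimental branch may weigh things differently.

```python
from collections import Counter


class FailureTracker:
    """Count failures per builder and per job (hypothetical helper)."""

    def __init__(self):
        self.builder_failures = Counter()
        self.job_failures = Counter()

    def record_failure(self, builder, job):
        # Every failure is charged to both parties involved.
        self.builder_failures[builder] += 1
        self.job_failures[job] += 1

    def suspect(self, builder, job):
        """Blame whichever side has accumulated more failures."""
        b = self.builder_failures[builder]
        j = self.job_failures[job]
        if j > b:
            return "job"
        if b > j:
            return "builder"
        return "unknown"


if __name__ == "__main__":
    tracker = FailureTracker()
    # A bad job hops across three builders and fails on each of them.
    for builder in ["builder-1", "builder-2", "builder-3"]:
        tracker.record_failure(builder, "bad-job")
    print(tracker.suspect("builder-3", "bad-job"))
```

A bad job that fails on several builders piles up failures faster than any one 
of those builders does, and the reverse holds for a bad builder that fails 
several different jobs.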


Software Center
===============

Michael has been working hard on this with the ISD guys and has helped to 
bootstrap their Software Center Agent project.  The Agent is a middle-man that 
talks to the billing system, Launchpad and the end user and sets up access on 
private PPAs when the end user pays for something.  Michael is pretty much 
done with his work and is handing over to ISD so they can finish it off.


Derived Distributions
=====================

We've been working on the LEP 
(https://dev.launchpad.net/LEP/DerivativeDistributions) for this for a while 
now and recently completed the requirements specification at the sprint in 
Prague.  A couple of things are ongoing:

UI mockups and user testing.
----------------------------
Some initial mockups to capture the basic user interaction are in the LEP, and 
these were recently presented to three potential users in some user testing 
sessions done by Matthew Revell.  This valuable feedback is currently being 
analysed and we'll produce updated mockups for a second round of user testing 
in the next week.

Moving server scripts to the job system.
----------------------------------------
Steve Kowalik is currently working on moving the initialise-from-parent.py 
script into the job system so that we can initiate it from a webapp request 
instead of requiring shell access to the servers.  This is a prerequisite to 
expanding that script so that it's more flexible and can handle the kind of 
derivations that are described in the LEP.


Booby-Traps!
============

Soyuz has a lot of very old bugs that we've traditionally only looked at once 
they've caused havoc in production.  Francis likes to call these booby-trap 
bugs and while we've not had a leg blown off yet, I've pulled shrapnel from 
places I don't want to talk about.  This approach is not healthy for 
production, so we're starting to slot in fixes for these alongside our normal 
feature development.  Hopefully we'll see more long-term stability from Soyuz 
by the end of the year.


Thanks for reading this far.  If you know what "Soyuz" means in Russian I hope 
you found the subject of this email entertaining.

J.


