
launchpad-dev team mailing list archive

Introduction to and Minutes of yellow squad's weekly postmortem call: 2012-03-30

 

= Introduction for Launchpad devs =

Yellow squad has had a weekly postmortem call for many months now, perhaps more than a year.

The purpose is to identify successes, problems, and useful tricks; to share them; and to see if we can problem solve and identify action items. The call started because we were not taking the time to do these things, which was wasteful and had us repeating mistakes and not learning from them.

The call has been successful enough that we have kept at it all this time. We've done some problem analysis, and shared a number of nice tricks among ourselves. Some of the nicer or more notable tricks have led to emails to canonical tech, sharing what we've discovered with a larger audience.

In the course of writing up notes for the current performance review period, I realized that we still were not doing as good a job on this as I'd like. In particular, last period we had some issues that we did not try to learn from.

I decided to try two small changes to make the call more rigorous--more likely to accomplish its goals.

 * I've expanded our checklist for the call with more pointed questions. You can see them near the top of https://dev.launchpad.net/yellow/ if you are curious.
 * I'm sending the minutes out to a wider audience. Hello!

We tried the new, expanded checklist for the first time today. It worked well so far. :-) Similarly, today is the first day for our more public minutes. The rest of this email contains the minutes and action items for the call. Please let us know if you'd prefer to have these elsewhere, or if you like having them sent to the launchpad-dev list.

= Minutes of postmortem call 2012-03-30 =

gmb: had to rework a blog post, because it duplicated in part what benji wrote recently (http://blog.launchpad.net/general/parallelising-the-unparallelisable). Lesson to be learned: if multiple blog posts are written simultaneously, coordinate! This is clearly a case of a larger rule: if you work on similar projects simultaneously, coordinate. We try to apply the larger rule to our coding projects, so keep it in mind for everything else we do too.

frankban: the forked zope.testing egg that we have been using in Launchpad for many months had many tests that failed. After work by yellow squad, the upcoming version (p5) only has three tests that fail, but > 0 is a problem because it makes further changes to the egg (like the ones we did for bug 609986) difficult. This led to a large discussion. Points raised include the following.
 * Forking code means that you take responsibility for it, within the context of your project. This includes passing tests, and tests for your changes. Generally, forks hurt more than you think they will.
 * Patching (and patched eggs) is equivalent in that regard; and worse because subsequent developers do not have a branch to work with.
 * Doing a constant upgrade to your dependencies is the right thing to do in general, as advocated recently by someone on canonical-tech. It's like brushing your teeth: if you don't do it, things rot, and it's a lot more expensive to fix later. However, we acknowledged that it can be a very expensive regular process too.
 * We are far behind on zope eggs. Catching up will be expensive, and it is difficult to be motivated given the low activity/participation of the project.
 * Suggested process change: for future projects (e.g. SOA stand-alone projects) explicitly adopt a constant-upgrade policy, setting up expectations and schedules initially.
 * Suggested process change: never patch eggs; fork branches if you must, and make eggs from them.
 * Suggested process change: if you fork, make sure tests pass on branch before you leave, and in general take ownership of the code. Acknowledge that you are making a new package.
 * Action item: Gary will send an email about the discussion to the list. [I'm not sure if this counts or not ;-) but I'll send one under separate cover with this info]

benji: he found tty recorder "ttyrec" (http://0xcc.net/ttyrec/index.html.en), which might be a tool to improve his simple terminal sharing work in slack time (https://dev.launchpad.net/yellow/RemoteTerminalBroadcasting). If you want to play with it, he suggests installing it from apt, not from source, because source has some BSD-isms.

benji: using LXC for dev is nice. It puts up enough isolation between the host and the container to be sufficient for our use, but allows nice sharing. A current downside for him is that LP's make doesn't work until you flail a bit. [It worked for me, but it's been a while since I set it up, and we have seen multiple regressions/fixes over the past couple of months.]

benji: we should test our buildbot/lxc setup daily to watch out for regressions. gary_poster said that, similarly, it was his intent to run parallel tests constantly on ec2 once we got near the end of the project, because one of our goals given by Robert at project inception was to prevent spurious failures. However, gary_poster's AWS bill this month will be close to $400 because of his work and tests on EC2. Benji used http://calculator.s3.amazonaws.com/calc5.html to calculate that having juju buildbot master & slave with eight-core machines for all would be almost $1500/month; tricking juju into giving us two small machines and an 8-core would be about $600.
 * Action item: gary_poster will manually run a master and slave today to see how we are doing (LXC bug 968371 was a problem yesterday; we also want to see how many test failures we have now, particularly after adding --shuffle to the test command)
 * Action item: gary_poster will add a card to the kanban board for making an automated juju setup and test run, at bac's suggestion. Maybe we'll run one of these daily. [DONE]
 * Action item: gary_poster will talk to flacoste about EC2 expenses.
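
The cost comparison above is simple arithmetic: hourly rate times hours per month, summed over the machines. Here is a minimal Python sketch of that calculation. It assumes three always-on machines in each setup (matching the "two small machines and an 8-core" count), and the hourly rates are made-up placeholders, not actual AWS prices; real figures would come from the calculator linked above.

```python
# Rough monthly cost estimate for always-on cloud machines.
# The hourly rates used here are hypothetical placeholders, NOT real AWS prices.

HOURS_PER_MONTH = 24 * 30  # assumption: machines run around the clock, 30-day month


def monthly_cost(hourly_rates_cents):
    """Return the monthly cost in dollars of running one machine per rate given.

    Rates are integer cents per hour, so the arithmetic stays exact.
    """
    total_cents = sum(rate * HOURS_PER_MONTH for rate in hourly_rates_cents)
    return total_cents / 100


# Hypothetical rates, in cents per hour; substitute real ones before relying on this.
EIGHT_CORE_CENTS = 70
SMALL_CENTS = 10

# Assumed layout: three machines in both setups.
all_eight_core = monthly_cost([EIGHT_CORE_CENTS] * 3)
mixed = monthly_cost([SMALL_CENTS, SMALL_CENTS, EIGHT_CORE_CENTS])

print("all eight-core: $%.2f/month" % all_eight_core)
print("two small + one eight-core: $%.2f/month" % mixed)
```

The point is only that most of the bill comes from how many large machines stay up all month, which is why swapping two of them for small ones cuts the estimate by more than half.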

