launchpad-dev team mailing list archive
Message #09179
Introduction to and Minutes of yellow squad's weekly postmortem call: 2012-03-30
= Introduction for Launchpad devs =
Yellow squad has had a weekly postmortem call for many months,
perhaps more than a year now.
The purpose is to identify successes, problems, and useful tricks; to
share them; and to see if we can problem solve and identify action
items. The call started because we were not taking the time to do these
things, which was wasteful and had us repeating mistakes and not
learning from them.
The call has been successful enough that we have kept at it all this
time. We've done some problem analysis, and shared a number of nice
tricks among ourselves. Some of the nicer or more notable tricks have
led to emails to canonical-tech, sharing what we've discovered with a
larger audience.
In the course of writing up notes for the current performance review
period, I realized that we still were not doing as good a job on this
as I'd like. In particular, last period we had some issues that we did
not try to learn from.
I decided to try two small changes to make the call more rigorous--more
likely to accomplish its goals.
* I've expanded our checklist for the call with more pointed
questions. You can see them near the top of
https://dev.launchpad.net/yellow/ if you are curious.
* I'm sending the minutes out to a wider audience. Hello!
We tried the new, expanded checklist for the first time today. It worked
well. :-) Similarly, today is the first day for our more public
minutes. The rest of this email contains the minutes and action items
for the call. Please let us know if you'd prefer to have these
elsewhere, or if you like having them sent to the launchpad-dev list.
= Minutes of postmortem call 2012-03-30 =
gmb: had to rework a blog post, because it duplicated in part what benji
wrote recently
(http://blog.launchpad.net/general/parallelising-the-unparallelisable).
Lesson to be learned: if multiple blog posts are written
simultaneously, coordinate! This is clearly a case of a larger rule: if
you work on similar projects simultaneously, coordinate. We try to
apply the larger rule to our coding projects, so keep it in mind for
everything else we do too.
frankban: the forked zope.testing egg that we have been using in
Launchpad for many months had many tests that failed. After work by
yellow squad, the upcoming version (p5) has only three tests that fail,
but any number above zero is a problem because it makes further changes to the egg (like
the ones we did for bug 609986) difficult. This led to a large
discussion. Points raised include the following.
* Forking code means that you take responsibility for it, within the
context of your project. This includes passing tests, and tests for
your changes. Generally, forks hurt more than you think they will.
* Patching (and using patched eggs) is equivalent in that regard, and
worse because subsequent developers do not have a branch to work with.
* Doing a constant upgrade to your dependencies is the right thing to
do in general, as advocated recently by someone on canonical-tech. It's
like brushing your teeth: if you don't do it, things rot, and it's a lot
more expensive to fix later. However, we acknowledged that it can be a
very expensive regular process too.
* We are far behind on zope eggs. Catching up will be expensive, and
it is difficult to be motivated given the low activity/participation of
the project.
* Suggested process change: for future projects (e.g. SOA stand-alone
projects) explicitly adopt a constant-upgrade policy, setting up
expectations and schedules initially.
* Suggested process change: never patch eggs; fork branches if you
must, and make eggs from them.
* Suggested process change: if you fork, make sure tests pass on the
branch before you leave, and in general take ownership of the code.
Acknowledge that you are making a new package.
* Action item: Gary will send an email about the discussion to the
list. [I'm not sure if this counts or not ;-) but I'll send one under
separate cover with this info.]
benji: he found a tty recorder, "ttyrec"
(http://0xcc.net/ttyrec/index.html.en), which might be a tool to
improve his simple terminal sharing work in slack time
(https://dev.launchpad.net/yellow/RemoteTerminalBroadcasting). If you
want to play with it, he suggests installing it from apt, not from
source, because the source has some BSD-isms.
benji: using LXC for dev is nice. It puts up enough isolation between
the host and the container to be sufficient for our use, but allows nice
sharing. A current downside for him is that LP's make doesn't work
until you flail a bit. [It worked for me, but it's been a while since I
set it up, and we have seen multiple regressions/fixes over the past
couple of months.]
benji: we should test our buildbot/lxc setup daily to watch out for
regressions. gary_poster said that, similarly, it was his intent to run
parallel tests constantly on ec2 once we got near the end of the
project, because one of our goals given by Robert at project inception
was to prevent spurious failures. However, gary_poster's AWS bill this
month will be close to $400 because of his work and tests on EC2. Benji
used http://calculator.s3.amazonaws.com/calc5.html to calculate that
having a juju buildbot master & slave with eight-core machines for all
would be almost $1500/month; tricking juju into giving us two small
machines and an 8-core would be about $600.
* Action item: gary_poster will manually run a master and slave today
to see how we are doing (LXC bug 968371 was a problem yesterday; we also
want to see how many test failures we have now, particularly after
adding --shuffle to the test command).
* Action item: gary_poster will add a card to the kanban board for
making an automated juju setup and test run, at bac's suggestion. Maybe
we'll run one of these daily. [DONE]
* Action item: gary_poster will talk to flacoste about EC2 expenses.
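
For anyone who wants to compare the two EC2 estimates above, here is a rough back-of-the-envelope sketch. It uses only the rounded monthly figures quoted in these minutes (about $1500 and about $600), not actual AWS prices, and the names in it are made up for illustration.

```python
# Rounded monthly estimates quoted in the minutes (USD); not exact AWS prices.
ALL_EIGHT_CORE = 1500  # master and slave, eight-core machines for all
MIXED_SETUP = 600      # two small machines plus one eight-core machine


def monthly_savings(expensive, cheap):
    """Return (absolute saving, fractional saving) of the cheaper setup."""
    saved = expensive - cheap
    return saved, saved / expensive


saved, fraction = monthly_savings(ALL_EIGHT_CORE, MIXED_SETUP)
print(f"Mixed setup saves about ${saved}/month ({fraction:.0%})")
print(f"Yearly cost: ${ALL_EIGHT_CORE * 12} vs ${MIXED_SETUP * 12}")
```

Since the inputs are already rounded estimates, the result is only good for judging the order of magnitude of the difference.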