opencog-dev team mailing list archive

essay on release management, etc.

To: opencog-dev@xxxxxxxxxxxxxxxxxxx
From: "David Hart" <hart@xxxxxxxxxxxx>
Date: Fri, 16 May 2008 08:07:42 +1000
Sender: dhart@xxxxxxxxxxxx

Hi All,

Mark Shuttleworth (the Ubuntu guy) wrote a good article (below) on release
management and related stuff like distributed VCS and test-driven
development (TDD). The details of Linux distributions, and their component
projects, are really only incidental to the topics he's discussing.

-dave

Discussing free software syncronicity
<http://www.markshuttleworth.com/archives/150>
from Mark Shuttleworth
<https://www.google.com/reader/view/feed/http%3A%2F%2Fwww.markshuttleworth.com%2Ffeed%2F>
by mark

There's been a flurry of discussion around the idea of synchronicity in free
software projects. I'd like to write up a more comprehensive view, but I'm
in Prague prepping for FOSSCamp and the Ubuntu Developer Summit (can't wait
to see everyone again!) so I'll just contribute a few thoughts and responses
to some of the commentary I've seen so far.

Robert Knight summarized the arguments
<http://kdemonkey.blogspot.com/2008/05/singing-in-tune.html> I made during
a keynote at aKademy
<http://home.kde.org/%7Eakademy07/videos/1-06-Keynote-Shuttleworth.ogg>
last year. I'm really delighted by the recent announcement that the main
GNOME and KDE annual developer conferences (GUADEC and aKademy) will be held
at the same time, and in the same place, in 2009. This is an important step
towards even better collaboration. Initiatives like FreeDesktop.org have
helped tremendously in recent years, and a shared conference venue will
accelerate that process of bringing the best ideas to the front across both
projects. Getting all of the passionate and committed developers from both
of these into the same real-space will pay dividends for both projects.

Aaron Seigo of KDE Plasma has taken a strong position against synchronized
release cycles, and his three
<http://aseigo.blogspot.com/2008/05/ramblings-on-6-month-cycles-and-plasma.html>
recent
<http://aseigo.blogspot.com/2008/05/re-re-ramblings-on-6-month-cycles-and.html>
posts <http://aseigo.blogspot.com/2008/05/re-singing-in-tune.html> on the
subject make interesting reading.

Aaron raises concerns about features being "punted" out of a release in
order to stick to the release cycle. It's absolutely true that discipline
about "what gets in" is essential in order to maintain a commitment on the
release front. It's unfortunate that features don't always happen on the
schedule we hope they might. But it's worth thinking a little bit about the
importance of a specific feature versus the whole release. When a release
happens on time, it builds confidence in the project, and injects a round of
fresh testing, publicity, enthusiasm and of course bug reports. Code that is
new gets a real kicking, and improves as a result. Free software projects
are not like proprietary projects - they don't have to ship new releases in
order to get the money from new licenses and upgrades. We can choose to
slip a particular feature in order to get a new round of testing and
feedback on all the code which did make it.

Some developers are passionate about specific features, others are
passionate about the project as a whole. There are two specific
technologies, or rather methodologies, that have hugely helped to separate
those two and empower them both. They are very-good-branching VCS, and
test-driven development (TDD).

We have found that the developers who are really focused on a specific
feature tend to work on that feature in a branch (or collaborative set of
branches), improving it "until it is done" regardless of the project release
cycle. They then land the feature as a whole, usually after some review.
This of course depends on having a VCS that supports branching and merging
very well. You need to be able to merge from trunk continuously, so that
your feature branch is always mergeable *back* to trunk. And you need to be
able to merge between a number of developers all working on the same
features. Of course, my oft-stated preference in VCS is Bazaar, because the
developers have thought very carefully about how to support collaborative
teams across platforms and projects and different workflows, but any VCS,
even a centralised one, that supports good branches will do.

A comprehensive test suite, on the other hand, lets you be more open to big
landings on trunk, because you know that the tests protect the functionality
that people had *before* the landing. A test suite is like a force-field,
protecting the integrity of code that was known to behave in a particular
way yesterday, in the face of constant change. Most of the projects I'm
funding now have adopted a tests-before-landings approach, where landings
on trunk are handled by a robot that refuses to commit the landing unless
all tests pass. You can't argue with the robot! The beauty of this is that
your trunk is "always releasable". That's not *entirely* true, you always
want to do a little more QA before you push bits out the door, but you have
the wonderful assurance that the test suite is always passing. Always.
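
A minimal sketch of the tests-before-landings gate described above, in
Python. Everything in it is hypothetical: Trunk, try_land and the test
callables are illustrative names, not the robot used by any of the projects
mentioned. It only shows the shape of the idea: the merge is tried on a
copy, and trunk changes only when every test passes.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical model: a trunk is just an ordered list of revision names,
# and a test is any callable that inspects a candidate history.
Test = Callable[[List[str]], bool]


@dataclass
class Trunk:
    revisions: List[str] = field(default_factory=list)


def try_land(trunk: Trunk, branch_revisions: List[str], tests: List[Test]) -> bool:
    """Land the branch only if every test passes on the merged result."""
    # Build the merged history on the side; trunk itself is not touched yet.
    candidate = trunk.revisions + branch_revisions
    if all(test(candidate) for test in tests):
        trunk.revisions = candidate  # all tests passed, so the landing commits
        return True
    return False  # the robot refuses; trunk stays exactly as it was


if __name__ == "__main__":
    trunk = Trunk(["r1"])
    no_broken = lambda revs: "broken" not in revs  # stand-in for a test suite
    print(try_land(trunk, ["feature-a"], [no_broken]), trunk.revisions)
    print(try_land(trunk, ["broken"], [no_broken]), trunk.revisions)
```

Because a refused landing leaves trunk unchanged, the history on trunk is
always one on which the whole suite last passed, which is the "always
releasable" property the text describes.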

So, branch-friendly VCSs and test-driven development make all the
difference. Work on your feature till it's done, then land it on the trunk
during the open window. For folks who care about the release, the freeze
window can be much narrower if you have great tests.

There's a lot of discussion about the exact length of cycle that is
"optimal", with some commentary about the windows of development, freeze, QA
and so on. I think that's a bit of a red herring, when you factor in good
branching, because feature development absolutely does not stop when the
trunk is frozen in preparation for a release. Those who prefer to keep
committing to their branches do so; they scratch the itch that matters most
to them.

I do think that cycle lengths matter, though. Aaron speculates that a
4-month cycle might be good for a web site. I agree, and we've converged on
a 4-month planning cycle for Launchpad after a few variations on the theme.
The key difference for me with a web site is that one has only one
deployment point of the code in question, so you don't have to worry as much
about update and cross-version compatibility. The Launchpad team has a very
cool system, where they roll out fresh code from trunk every day to a set of
app servers (called "edge.launchpad.net"), and the beta testers of LP use
those servers by default. Once a month, they roll out a fresh drop from tip
to all the app servers, which is also when they rev the database and can
introduce substantial new features. It's tight, but it does give the project
a lot of rhythm. And we plan in "sets of 4 months"; at least, we are doing
so for the next cycle. The last planning cycle was 9 months, which was just
way too long.

I think the cycles-within-cycles idea is neat. Aaron talks about how 6
months is too long for quick releases, and too short to avoid having to bump
features from one cycle to the next. I've already said that a willingness to
bump a feature that is not ready is a strength and not a weakness. It would
be interesting to see whether Aaron's concerns would be addressed if the
Plasma team adopted a shorter "internal" cycle, like 2 months or 3 months,
and fit that into a 6-month "external" cycle.

For large projects, the fact that a year comes around every, well, year,
turns out to be quite significant. You really want a cycle that divides
neatly into a year, because a lot of external events are going to happen on
that basis. And you want some cohesion between the parts. We used to run the
Canonical sprints on a 4-month cycle (3 times a year) and the Ubuntu
releases on a six month cycle (twice a year) and it was excessively complex.
As soon as we all knew each other well enough not to need to meet up every 4
months, we aligned the two and it's been much smoother ever since.
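
The arithmetic behind that can be sketched in a few lines of Python. The
function names are hypothetical and the model is deliberately crude (months
counted from 0, one event at the start of each cycle). It shows that 4-month
and 6-month cycles each divide the year, yet only coincide once every 12
months and spread events over four different months.

```python
from math import gcd
from typing import List

MONTHS_PER_YEAR = 12


def divides_year(cycle_months: int) -> bool:
    """True if a whole number of cycles fits into one year."""
    return MONTHS_PER_YEAR % cycle_months == 0


def months_until_realigned(a: int, b: int) -> int:
    """Least common multiple: how long until both cycles start together again."""
    return a * b // gcd(a, b)


def event_months(a: int, b: int, horizon: int = MONTHS_PER_YEAR) -> List[int]:
    """Months (counted from 0) in which at least one of the two cycles starts."""
    return [m for m in range(horizon) if m % a == 0 or m % b == 0]


if __name__ == "__main__":
    print(divides_year(4), divides_year(6), divides_year(5))  # True True False
    print(months_until_realigned(4, 6))  # 12
    print(event_months(4, 6))            # [0, 4, 6, 8]
    print(event_months(6, 6))            # [0, 6]
```

With sprints and releases both on a 6-month cycle, every event falls in one
of two months of the year instead of four.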

Some folks feel that distributions aren't an important factor in choosing an
upstream release cycle. And to a certain extent that's true. There will
always be a "next" release of whatever distribution you care about, and
hopefully, an upstream release that misses "this" release will make it into
the next one. But I think that misses the benefit of getting your work to a
wider audience as fast as possible. There's a great project management
methodology called "lean", which we've been working with. And it says that
any time the product of your work sits waiting for someone else to do
something is "waste". You could have done that work later, and done
something else before that generated results sooner. This is based on the
amazing results seen in real-world production lines, like cars and
electronics.

So while it's certainly true that you could put out a release that misses
the "wave" of distribution releases, but catches the next wave in six months'
time, you're missing out on all the bug reports and patches and other
opportunities for learning and improvement that would have come if you'd
been on the first wave. Nothing morally wrong with that, and there may be
other things that are more important for sure, but it's worth considering,
nonetheless.

Some folks have said that my interest in this is "for Canonical", or "just
for Ubuntu". And that's really not true. I think it's a much more productive
approach for the whole free software ecosystem, and will help us compete
with the proprietary world. That's good for everyone. And it's not just
Ubuntu that does regular 6-month releases; Fedora has adopted the same
cycle, which is great because it improves the opportunities to collaborate
across both distributions - we're more likely to have the same versions of
key components at any given time.

Aaron says:

Let's assume project A depends on B, and B releases at the same time as A.
That means that A is either going to be one cycle behind B in using what B
provides, or will have to track B's bleeding edge for the latter part of
their cycle allowing some usage. What you really want is a *staggered*
approach where B releases right about when A starts to work on things.

This goes completely counter to the "everyone on the same month, every 6
months" doctrine Mark preaches, of course.

I have never suggested that *everyone* should release at the same time. In
fact, at Ubuntu we have converged around the idea of releasing about one
month *after* our biggest predictable upstream, which happens to be GNOME.
And similarly, the fact that the kernel has their own relatively predictable
cycle is very useful. We don't release Ubuntu on the same day as a kernel
release that we will ship, of course, but we are able to plan and
communicate meaningfully with the folks at kernel.org as to which version
makes sense for us to collaborate around.

Rather than try and release the entire stack all at the same time, it makes
sense to me to offset the releases based on a rough sense of dependencies.

Just to be clear, I'm not asking the projects I'll mention below to change
anything, I'm painting a picture or a scenario for the purposes of the
discussion. Each project should find their own pace and scratch their itch
in whatever way makes them happiest. I think there are strong
itch-scratching benefits to synchronicity, however, so I'll sketch out a
scenario.

Imagine we aimed to have three waves of releases, about a month apart.

In the first wave, we'd have the kernel, toolchain, languages and system
libraries, and possibly components which are performance- and
security-critical. Linux, GCC, Python, Java, Apache, Tomcat… these are items
which likely need the most stabilisation and testing before they ship to the
innocent, and they are also pieces which need to be relatively static so
that other pieces can settle down themselves. I might also include things
like Gtk in there.

In the second wave, we'd have applications, the desktop environments and
other utilities. AbiWord and KOffice, Gnumeric and possibly even Firefox
(though some would say Firefox is a kernel and window manager so… ;-)).

And in the third wave, we'd have the distributions - Ubuntu, Fedora, Gentoo,
possibly Debian, OpenSolaris. The aim would be to encourage as much
collaboration and discussion around component versions in the distributions,
so that they can effectively exchange information and patches and bug
reports.
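
As a rough illustration of the three-wave scenario, the sketch below lays
the waves out one month apart in Python. The wave contents are a subset of
the examples named in the text; the wave labels, the helper names and the
starting month are assumptions made purely for the example.

```python
from typing import Dict, List, Tuple

# Wave membership follows the scenario in the text; the labels are invented.
WAVES: List[Tuple[str, List[str]]] = [
    ("platform", ["Linux", "GCC", "Python", "Java", "Apache", "Tomcat"]),
    ("applications and desktops", ["AbiWord", "KOffice", "Gnumeric"]),
    ("distributions", ["Ubuntu", "Fedora", "Gentoo"]),
]


def wave_months(first_year: int, first_month: int,
                n_waves: int = 3, gap_months: int = 1) -> List[Tuple[int, int]]:
    """Return (year, month) for each wave, gap_months apart, rolling over years."""
    months = []
    for i in range(n_waves):
        index = (first_month - 1) + i * gap_months  # zero-based month count
        months.append((first_year + index // 12, index % 12 + 1))
    return months


def schedule(first_year: int, first_month: int) -> Dict[str, Tuple[int, int]]:
    """Map each wave label to the (year, month) in which it would release."""
    months = wave_months(first_year, first_month, n_waves=len(WAVES))
    return {name: month for (name, _), month in zip(WAVES, months)}


if __name__ == "__main__":
    # Arbitrary starting point, chosen only to show the year rollover.
    for name, (year, month) in schedule(2009, 11).items():
        print(f"{year}-{month:02d}  {name}")
```

Starting the first wave in November, for instance, puts the second wave in
December and the distributions in January of the following year.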

I'll continue to feel strongly that there is value to projects in getting
their code to a wider audience than those who will check it out of
VCS-du-jour, keep it up to date and build it. And the distributions are the
best way to get your code… distributed! So the fact that both Fedora and
Ubuntu have converged on a rhythm bodes very well for upstreams who can take
advantage of that to get wider testing, more often, earlier after their
releases. I know every project will do what suits it, and I hope that
projects will feel it suits them to get their code onto servers and desktops
faster so that the bug fixes can come faster, too.

Stepping back from the six month view, it's clear that there's a slower
rhythm of "enterprise", "LTS" or "major" releases. These are the ones that
people end up supporting for years and years. They are also the ones that
hardware vendors want to write drivers for, more often than not. And a big
problem for them is still "which version of X, kernel, libC, GCC" etc should
we support? If the distributions can articulate, both to upstreams and to
the rest of the ecosystem, some clear guidance in that regard then I have
every reason to believe people would respond to it appropriately. I've talked
with kernel developers who have said they would LOVE to know which kernel
version is going to turn into RHEL or an Ubuntu LTS release, and ideally,
they would LOVE it if those were the same versions, because it would enable
them to plan their own work accordingly. So let's do it!

Finally, in the comments on Russell Coker's thoughtful commentary
<http://etbe.coker.com.au/2008/05/13/release-dates-for-debian/> there's
a suggestion that I really like - that it's coordinated freeze dates
more than coordinated release dates that would make all the difference.
Different distributions do take different views on how they integrate, test
and deploy new code, and fixing the release dates suggests a reduction in
the flexibility that they would have to position themselves differently. I
think this is a great point. I'm primarily focused on creating a pulse in
the free software community, and encouraging more collaboration. If an
Ubuntu LTS release, and a Debian release, and a RHEL release, used the same
major kernel version, GCC version and X version, we would be able to improve
greatly ALL of their support for today's hardware. They still wouldn't ship
on the same date, but they would all be better off than they would be going
it alone. And the broader ecosystem would feel that an investment in code
targeting those key versions would be justified much more easily.