launchpad-dev team mailing list archive

Re: fallout and regressions are the new black


On Thu, Dec 15, 2011 at 6:30 AM, Curtis Hovey
<curtis.hovey@xxxxxxxxxxxxx> wrote:
> I propose we reorganise our priorities based on Francis' statement in
> http://blog.launchpad.net/general/legacy-performance-testing-6-months-of-new-critical-bugs-analyzed
> that fallout and regressions are the true critical issues.

It's a pithy statement, but I don't think that something being either a
regression or fallout tells us that the mine is being extended: if it's
technical debt (that is, if it's something we could have done better),
it may have unexpected consequences later, and that will be adding to
the mine; merely missing a feature users used to make use of, or
oopsing on an unexpected combination of parameters, doesn't have that
deeper consequence.

> At the start of the year we raised the importance of all oopses and
> timeouts to critical. We told maintenance teams that they could work on
> any critical bug they choose. It was assumed that all tags were equal
> and that the queue could be driven to zero in 3-6 months. Maintenance
> teams continue to select bugs to fix based on these rules, though we
> know the predicated assumption is false.

I think they are still equal, and in particular I think the bias to
fixing the *oldest* of the criticals is still extremely important. The
oldest ones are the ones that have been affecting users for longest,
and that [probably IMO] have the greatest technical leverage over the
system. We could firefight the newest things at every stage but not
make any deep impact; things that will get us out in front, like
making uploads go through the librarian directly [a step towards
splitting it out, avoiding double-copying within the datacentre], or
making our schema constraints support the logic our users want (bug
62976 - the oldest critical we have) will have far-reaching
consequences - making it harder for things to go wrong in the system.

I may do an analysis in the new year. I think we should review our
priorities / process once everyone is back from their various
scattered leaves - perhaps at the thunderpic.

The specific things I want from any system are:
 - very easy queues for engineers picking things - views which LP
   directly supports.
 - no confusion for our users about which things may get a look in.
 - old-and-deep-and-significant technical work is not lost in the
   flurry of day to day operations.

These are a bit vague :(. I think we achieve them tolerably ok today.
Doing better would be excellent.