
launchpad-dev team mailing list archive

Re: The future of downtime for rollouts?

 

On Tue, 2010-09-14 at 22:09 +1200, Robert Collins wrote:
> On Tue, Sep 14, 2010 at 9:55 PM, Jonathan Lange <jml@xxxxxxxxxxxxx> wrote:
> > Hello,
> >
> > I've noticed that negotiating the downtime for Launchpad rollouts is
> > becoming increasingly tricky.
> >
> > So I can be clear when asked,
> >  * what's the downtime for rollout now?
> >  * are we doing anything to reduce it?
> >  * when are we expecting to have zero downtime for rollout?
> >
> > I'll put the answer on a wiki somewhere once the thread winds up.
> 
> I have a few thoughts here.
> 
> The current process, AIUI, goes like this:
>  - the RM asks the LOSAs and stub for the needed downtime.
>  - they estimate it via various arcane methods(*)
>  - that is then used for the announcement.
> 
> Short term:
> Perhaps it would be better to say:
> 'we have a 90 minute downtime window each release. Always 90 minutes,
> and never more than.'

Might be more reliable but less accurate :) We estimate the downtime
by taking how long the last update took on staging and multiplying it
by a factor that seems to have accurately reflected the difference in
time between staging and production (with a little padding). We could
only commit to 90 minutes if we refused to roll out any DB updates that
took longer than a certain period of time on staging.
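
To make the arithmetic concrete, here is a rough sketch of that estimate
and of the staging cutoff a fixed window would imply. The function names
and the numbers in the usage note are made up for illustration; the real
factor and padding are not stated in this thread.

```python
def estimate_downtime(staging_minutes, scale_factor, padding_minutes):
    """Estimate production downtime from the last staging update.

    scale_factor and padding_minutes are hypothetical parameters standing
    in for the staging-to-production factor and the "little padding".
    """
    if staging_minutes < 0 or scale_factor <= 0 or padding_minutes < 0:
        raise ValueError("durations must be non-negative and factor positive")
    return staging_minutes * scale_factor + padding_minutes


def max_staging_minutes(window_minutes, scale_factor, padding_minutes):
    """Longest staging run that still fits inside a fixed downtime window.

    This inverts estimate_downtime: any DB update slower than this on
    staging would have to be refused to keep the window promise.
    """
    if scale_factor <= 0:
        raise ValueError("scale_factor must be positive")
    return (window_minutes - padding_minutes) / scale_factor
```

With an assumed factor of 2.0 and 10 minutes of padding, a 30 minute
staging run gives estimate_downtime(30, 2.0, 10) == 70.0, and a fixed 90
minute window would cap staging runs at max_staging_minutes(90, 2.0, 10)
== 40.0 minutes.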



