
launchpad-dev team mailing list archive

Re: The future of downtime for rollouts?

 

On Wed, Sep 15, 2010 at 1:09 AM, Tom Haddon <tom.haddon@xxxxxxxxxxxxx> wrote:
> On Tue, 2010-09-14 at 08:46 -0400, Curtis Hovey wrote:
>> On Tue, 2010-09-14 at 11:41 +0100, Tom Haddon wrote:
>> > Might be more reliable but less accurate :) We estimate the downtime
>> > based on how long the last update took on staging, and then multiplying
>> > by a factor that seems to have accurately reflected the difference in
>> > time between staging and production (with a little padding). We could
>> > only commit to 90 mins if we refused to roll out any DB updates that
>> > took longer than a certain period of time on staging.
>>
>> Staging restore times trend up, so we are always talking about
>> increasing time for a rollout. We will continue to do schema development
>> after the featureflag is complete. What we cannot see is the staging
>> restore time versus the real time--maybe that is pointless because there
>> are other rollout incidents that increased the rollout time.
>
> We do keep logs of each step of the rollout which we could give you
> access to:
>
> https://pastebin.canonical.com/37152/
>
> If you're interested in these, please file an RT asking for these to be
> synced to somewhere on devpad.

I thought I'd already asked for visibility on this; perhaps not as an
RT. Will do so now.
-Rob


