
launchpad-dev team mailing list archive

Re: Staging & qastaging running with PostgreSQL 9.1

 

On Sat, Mar 31, 2012 at 12:24 PM, William Grant
<william.grant@xxxxxxxxxxxxx> wrote:
> On 31/03/12 00:59, Stuart Bishop wrote:
>> Apart from smoke testing on staging, including cross-version
>> replication, we are ready for production upgrades to start on 23rd
>> April. The Precise release schedule will of course be the biggest
>> factor in deciding when the upgrades actually start. Upgrades will be
>> staggered, slave databases first. I expect the upgrade of the master
>> will be done within a scheduled 30 minute outage window. The
>> alternative is two short outages while we promote slaves to master,
>> plus a period where we run with a single slave and half our usual CPU
>> power on the master.
>
> A 30 minute outage is extremely disruptive and unprecedented since
> fastdowntime, so I'd really prefer that we avoid it. With recent
> optimisations we can sensibly run the master on a slave's hardware,
> even for an extended period, as long as we first tweak the replication
> code to use a slave if it's up to date, rather than only using slaves
> when the entire cluster is up to date.
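
A minimal sketch of the selection change being described, in Python
(the names and the lag-based check here are illustrative assumptions,
not Launchpad's actual replication code):

    from datetime import timedelta

    MAX_LAG = timedelta(seconds=30)  # assumed staleness tolerance

    def choose_store(slaves, master):
        """Pick a database store for read-only work.

        `slaves` is assumed to be a list of (store, lag) pairs, where
        `lag` is each slave's current replication lag.
        """
        # Old behaviour (sketch): only use slaves when *every* slave is
        # current, i.e. the whole cluster is up to date:
        #   if all(lag <= MAX_LAG for _, lag in slaves):
        #       return slaves[0][0]

        # Proposed behaviour: use any individual slave that is current,
        # even while another slave lags (e.g. mid-upgrade).
        for store, lag in slaves:
            if lag <= MAX_LAG:
                return store
        return master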

Perhaps we can use Slony to create a PostgreSQL 9.1 instance on the
master server - there is space AFAICT - and then do a single pivot to
move the master to that instance. That should be roughly the same
downtime as fastdowntime, once the migration steps are tightened up.
(We have to take down pgbouncer, sync the cluster, reconfigure it so
the new instance is the master, repoint pgbouncer at the new master,
bring pgbouncer back up, done.)
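
For concreteness, that sequence could be driven by something like the
following Python sketch. The helper commands ("pgbouncer-admin", the
slonik script names, the config updater) are hypothetical placeholders,
not the real Launchpad tooling:

    import subprocess

    def run(cmd):
        # Run a shell command and fail loudly if it exits non-zero.
        subprocess.run(cmd, shell=True, check=True)

    def pivot_to_new_master():
        # 1. Stop new queries reaching the old master.
        run("pgbouncer-admin PAUSE")          # hypothetical admin wrapper
        # 2. Wait for Slony to drain so the 9.1 instance is fully caught up.
        run("slonik wait_for_sync.slonik")    # hypothetical slonik script
        # 3. Reconfigure the cluster so the 9.1 instance becomes the origin.
        run("slonik move_set_to_91.slonik")   # hypothetical slonik script
        # 4. Repoint pgbouncer at the new master and bring it back up.
        run("update-pgbouncer-config --master new-91-instance")  # hypothetical
        run("pgbouncer-admin RESUME")         # hypothetical admin wrapper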

That would avoid running off a slave (which I'm not entirely sure is
doable even with the excellent recent improvements), doesn't need
replication code changes (which would be wasted effort as we're
dropping Slony for regular use anyhow), and AFAICT has the smallest
downtime window.

-Rob

