launchpad-reviewers team mailing list archive

[Merge] lp:~stub/launchpad/replication into lp:launchpad

To: mp+121410@xxxxxxxxxxxxxxxxxx
From: Stuart Bishop <stuart.bishop@xxxxxxxxxxxxx>
Date: Tue, 28 Aug 2012 12:34:41 -0000
Reply-to: mp+121410@xxxxxxxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx

Stuart Bishop has proposed merging lp:~stub/launchpad/replication into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #307407 in Launchpad itself: "slave database should never be used when lag is too great"
  https://bugs.launchpad.net/launchpad/+bug/307407
  Bug #345835 in Launchpad itself: "Database load balancing should use slave lag, not cluster lag"
  https://bugs.launchpad.net/launchpad/+bug/345835
  Bug #447453 in Launchpad itself: "Changes made through the API (via javascript) aren't blacklisting the Slave DBs"
  https://bugs.launchpad.net/launchpad/+bug/447453
  Bug #461800 in Launchpad itself: "new-slave.py no longer works"
  https://bugs.launchpad.net/launchpad/+bug/461800
  Bug #504696 in Launchpad itself: "Replication lag checks can block"
  https://bugs.launchpad.net/launchpad/+bug/504696
  Bug #504751 in Launchpad itself: "Standalone slave not subscribed to the authdb replication set"
  https://bugs.launchpad.net/launchpad/+bug/504751
  Bug #504807 in Launchpad itself: "authdb replication set sequence values not being restored on staging"
  https://bugs.launchpad.net/launchpad/+bug/504807
  Bug #514267 in Launchpad itself: "InternalError on clusters under busy load"
  https://bugs.launchpad.net/launchpad/+bug/514267
  Bug #1014661 in Launchpad itself: "Replication lag checks do not understand PG 9.1 streaming replication"
  https://bugs.launchpad.net/launchpad/+bug/1014661

For more details, see:
https://code.launchpad.net/~stub/launchpad/replication/+merge/121410

= Summary =

We want systems that only need a hot standby database to remain
available at all times, including during database updates. To support
this, we plan to have the fast downtime (FDT) deployment scripts
stagger how database changes are applied: the master is updated first
while the hot standby stays available, then the hot standby is updated
once the master is available again.

== Proposed fix ==

If a client requests a hot standby Store, and the hot standby is
down, return the master Store instead.

I'm doing this in BaseDatabasePolicy, so this logic affects
everything. As well as giving us the behavior we are after for
hot-standby-only clients (which don't exist yet), this lets other
systems become available sooner during an FDT update: they will only
be down until the database updates have been applied on the master,
rather than until those changes have propagated to the hot standbys.
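
The fallback described above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not the code in
lib/lp/services/webapp/dbpolicy.py: FallbackPolicy, StoreUnavailable,
demo_connect and the "master"/"standby" flavor strings are all
hypothetical names invented for this example, and the real policy
works with Storm stores rather than strings.

```python
class StoreUnavailable(Exception):
    """Hypothetical error raised when a database cannot be reached."""


class FallbackPolicy:
    """Illustrative stand-in for a database policy (not Launchpad's class)."""

    def __init__(self, connect):
        # connect(flavor) returns a store or raises StoreUnavailable.
        self._connect = connect

    def get_store(self, flavor):
        try:
            return self._connect(flavor)
        except StoreUnavailable:
            if flavor != "standby":
                # A master outage has no fallback; let the error propagate.
                raise
            # The hot standby is down, so return the master store instead.
            return self._connect("master")


def demo_connect(flavor):
    # Simulate the update window in which only the master is reachable.
    if flavor == "standby":
        raise StoreUnavailable("hot standby is down for updates")
    return f"<store:{flavor}>"


policy = FallbackPolicy(demo_connect)
print(policy.get_store("standby"))  # falls back to the master store
print(policy.get_store("master"))
```

In this sketch the fallback only applies to standby requests; a
request for the master while the master is down still fails, which
matches the one-directional substitution proposed here.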

== Pre-implementation notes ==

== LOC Rationale ==

== Implementation details ==

== Tests ==

== Demo and Q/A ==


= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/services/webapp/dbpolicy.py
-- 
https://code.launchpad.net/~stub/launchpad/replication/+merge/121410
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~stub/launchpad/replication into lp:launchpad.

Follow ups

[Merge] lp:~stub/launchpad/replication into lp:launchpad
From: Stuart Bishop, 2012-09-07
Re: [Merge] lp:~stub/launchpad/replication into lp:launchpad
From: Stuart Bishop, 2012-09-06
[Merge] lp:~stub/launchpad/replication into lp:launchpad
From: Stuart Bishop, 2012-08-30
Re: [Merge] lp:~stub/launchpad/replication into lp:launchpad
From: William Grant, 2012-08-30
Re: [Merge] lp:~stub/launchpad/replication into lp:launchpad
From: Stuart Bishop, 2012-08-28