launchpad-dev team mailing list archive

Re: architecture review progress

To: Robert Collins <robert.collins@xxxxxxxxxxxxx>
From: Stuart Bishop <stuart.bishop@xxxxxxxxxxxxx>
Date: Tue, 27 Jul 2010 15:32:04 +0700
Cc: Launchpad Community Development Team <launchpad-dev@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <AANLkTikM-tX2-pHWKfrHR8wNyyO4vd8rBYFdU0E_jBuf@mail.gmail.com>
Sender: stuart@xxxxxxxxxxxxxxxx

On Sun, Jul 25, 2010 at 2:47 AM, Robert Collins
<robert.collins@xxxxxxxxxxxxx> wrote:

>  - We have interlinked performance problems; the DB is a choke point
> for writes, and we write a lot - enough that when a backup goes wrong,
> we have a timeout spike on lpnet and edge, because we have little

I don't follow this. What backups go wrong?

Writes are generally not a problem, except when they are done in bulk.
Importing a translation can do a lot of writes. Making a new branch
can do a lot of writes. Spreading these across multiple transactions
could ease the pain.
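
As a rough illustration of spreading a bulk write over several
transactions, here is a minimal sketch. It uses Python's stdlib sqlite3
module only so that it is self-contained; the `message` table, the
`batched` and `bulk_insert` helpers and the batch size are hypothetical
and are not taken from the Launchpad code base.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_insert(conn, rows, batch_size=1000):
    """Insert rows, committing after each batch rather than once at the end.

    Returns the number of transactions committed.
    """
    commits = 0
    for chunk in batched(rows, batch_size):
        conn.executemany(
            "INSERT INTO message (id, body) VALUES (?, ?)", chunk)
        conn.commit()  # end the short transaction before starting the next
        commits += 1
    return commits


# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, body TEXT)")
rows = ((i, "text %d" % i) for i in range(2500))
commits = bulk_insert(conn, rows, batch_size=1000)
row_count = conn.execute("SELECT count(*) FROM message").fetchone()[0]
```

The trade-off is that the import is no longer atomic: if a later batch
fails, the earlier batches stay committed, so the job needs to be safe
to resume or re-run.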

> headroom. Queries that take 6000ms on staging (when in cache) take
> 14000ms on prod slaves, and 24000ms or more on prod main : we're

I've seen situations where this is false btw - people trying to
diagnose slow queries on production using staging, but staging has
data that is a week out of date and the performance problem does not
exist there. But yes, an idle staging will perform better than a
loaded production (production DBs have more RAM and cores, but these
are shared across orders of magnitude more connections).

One of my focuses is to get people using the slave databases much more
aggressively. The more the slave databases are used, the faster and
more scalable we get. The master database is not running on the server
with 16 cores because of the write load, but because of its read load.
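
To make the idea of pushing reads to the slaves concrete, here is a
minimal sketch of a router that sends read-only work to a slave and
everything else to the master. The `ConnectionRouter` class, its
`readonly` flag and the string placeholders standing in for connections
are all hypothetical; this is not Launchpad's actual store selection
API.

```python
import itertools


class ConnectionRouter:
    """Pick a database for a unit of work (hypothetical helper)."""

    def __init__(self, master, slaves):
        self.master = master
        # Rotate through the slaves so read load is spread evenly.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def get(self, readonly=False):
        """Return a slave for read-only work, otherwise the master."""
        if readonly and self._slaves is not None:
            return next(self._slaves)
        # Writes, and reads when no slave is configured, use the master.
        return self.master


# Strings stand in for real connection objects in this sketch.
router = ConnectionRouter("master", ["slave1", "slave2"])
picked = [router.get(readonly=True) for _ in range(3)]
write_target = router.get()
```

One caveat, assuming replication is asynchronous: a slave can lag
behind the master, so a read that must see a write that was just
committed should still go to the master.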

-- 
Stuart Bishop <stuart@xxxxxxxxxxxxxxxx>
http://www.stuartbishop.net/

References

architecture review progress
From: Robert Collins, 2010-07-24