← Back to team overview

maria-discuss team mailing list archive

Re: Replication and auto-increment


Hi Brian, 

On 1/27/10 12:51 PM PST, "Brian Evans" <grknight@xxxxxxxxxxxxxx> wrote:

> I was reading over the replication discussion on maria-developers
> I thought I'd share some thoughts that are not so technical as I do not
> understand the back end that well.
> Our system is currently a master-master WAN replication setup, each
> writing to a different DB with an offline process to do copies.
> Our databases extensively uses auto-increments on tables (more than
> necessary, but that's another story).
> Would future developments allow synchronous replication to occur on a
> single database when the WAN link has a single point of failure(SPoF)?
> How might an auto-increment system handle such a failure?
> This SPoF is our network link that can be a transient problem out of our
> control.
> Both sides may need to keep inserting data during this transient issue.
> This is usually not an issue 99% of the time, but it's better to prepare
> for downtime than trying to fix data errors in that 1%.
Given the requirement you just stated synchronous replication is not an
option because your database will become unavailable if the WAN connection
goes down.  This is not a limitation of current implementations.  It is a
consequence of the problem of distributed consensus, which is like the
halting problem for distributed systems and means that distributed databases
cannot be guaranteed to have the same contents if one of them is allowed to

Instead there are workarounds.

1.) Have masters in both locations as you seem to describe and replicate to
backup copies on the other site.

2.) Use auto-increment-offset as described by Arjen up-thread.  However,
this does not protect from other types of conflicts such as adding a child
row with a particular FK on one site while deleting its parent row and hence
PK on the other site.  It just handles inserts.

3.) Rewrite your data model to eliminate conflicts.  In that case you are
better off dispensing with auto-increment entirely.

4.) Route updates to a single master for processing and then replicate them
back to local slaves.  In this case you must enqueue updates on remote sites
(i.e., accept them provisionally) during an outage.  They don't commit fully
until the link is back up.  This usually involves not just a data model
rewrite but re-architecting the application.

What's wrong with the current approach you are using?  Unless I'm mistaken
you don't have a problem with outages now because you can continue to update
the masters and replicate once the link resumes.  If you are looking for
full data consistency across sites that problem is not solvable but others
may be.  

Cheers, Robert
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-discuss
> Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maria-discuss
> More help   : https://help.launchpad.net/ListHelp