maria-developers team mailing list archive

Re: [Commits] Rev 4376: MDEV-6676: Speculative parallel replication in http://bazaar.launchpad.net/~maria-captains/maria/10.0

 

Robert Hodges <robert.hodges@xxxxxxxxxxxxxx> writes:

> Right.  I thought about that problem a lot in the Tungsten parallel apply
> design and ended up with an approach that allows workers to diverge by
> several minutes or longer.  This enables Tungsten to maintain good
> throughput even in the face of lumpy workloads that contain transactions

> So are replication domains "shards"?  My definition of a shard in this
> context is a causally independent stream of transactions, which is
> effectively a partial order within the fully serialized log.  That's an
> excellent feature.  Assuming that's what you have done, how do you handle
> operations like CREATE USER that are global in effect?

Yes, it sounds like replication domains are basically the same as shards.

So in MariaDB, my approach is that we will try to do some amount of
parallelisation automatically and completely transparently to all applications
(this is the in-order parallel replication). If that is not sufficient, the
user can additionally help by splitting their load into replication domains,
e.g. by putting the "lumps" in a separate domain, which allows other
transactions to execute ahead of them.
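
As a rough sketch of what putting a "lump" in its own domain could look like
from a client: the session variable gtid_domain_id selects the domain that
the session's subsequent binlogged transactions belong to. The helper name
replication_domain, the domain number 1 and the table name are made up for
illustration, and the connection is assumed to be a Python DB-API style
object (connection.cursor(), cursor.execute(), cursor.fetchone()). Setting
the session value needs sufficient privileges on the master.

```python
from contextlib import contextmanager


@contextmanager
def replication_domain(conn, domain_id):
    """Run the enclosed statements in another GTID replication domain.

    Hypothetical helper: `conn` is assumed to be a DB-API style connection
    to the master. Commit inside the block, so that the domain is switched
    back outside of any open transaction.
    """
    cur = conn.cursor()
    # Remember the domain this session was using before.
    cur.execute("SELECT @@SESSION.gtid_domain_id")
    previous = int(cur.fetchone()[0])
    # Domain ids are plain integers, so formatting them in is safe here.
    cur.execute("SET SESSION gtid_domain_id = %d" % int(domain_id))
    try:
        yield cur
    finally:
        # Switch back so later transactions stay in the original domain.
        cur.execute("SET SESSION gtid_domain_id = %d" % previous)
```

Usage would be something like
`with replication_domain(conn, 1) as cur: cur.execute("UPDATE big_table SET ..."); conn.commit()`.
It is still up to the application to make sure that nothing in the other
domains depends on the order in which that statement is applied.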

And when splitting into separate domains, the burden falls on the
user/application to ensure that different domains can replicate
independently. So for something like CREATE USER or CREATE TABLE and the like,
it will be necessary to ensure manually that all slaves have replicated the
statement with global effect, before doing dependent transactions in a
separate domain on the master. One way to ensure this is to run a
MASTER_GTID_WAIT() on all slaves with the @@LAST_GTID of the statement from
the master.
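
A minimal sketch of that wait, under some assumptions: the connections are
Python DB-API style objects using the "%s" parameter style, master_conn is
the same session that just ran the global statement (@@last_gtid is
per-session), and the function name and the 60 second default timeout are
made up for illustration. MASTER_GTID_WAIT() returns 0 once the slave has
reached the given GTID and -1 if the timeout expires first.

```python
def wait_for_global_statement(master_conn, slave_conns, timeout_s=60):
    """Wait until every slave has applied the master session's last GTID.

    Hypothetical helper. Returns (gtid, lagging), where `lagging` lists the
    indexes of the slaves that did not reach the GTID within the timeout.
    """
    cur = master_conn.cursor()
    # GTID of the last transaction binlogged by this master session,
    # e.g. the CREATE USER or CREATE TABLE that was just executed.
    cur.execute("SELECT @@last_gtid")
    gtid = cur.fetchone()[0]

    lagging = []
    for index, conn in enumerate(slave_conns):
        slave_cur = conn.cursor()
        # Blocks on the slave until the GTID is applied or the timeout hits.
        slave_cur.execute("SELECT MASTER_GTID_WAIT(%s, %s)", (gtid, timeout_s))
        if slave_cur.fetchone()[0] != 0:
            lagging.append(index)
    return gtid, lagging
```

Only when the returned list is empty would the application go on to run the
dependent transactions in the other domain on the master.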

> (Just point me to
> docs or your blog if you wrote it up.  I would love to learn more.)

Docs are here:

    https://mariadb.com/kb/en/mariadb/documentation/replication-cluster-multi-master/replication/parallel-replication/
    https://mariadb.com/kb/en/mariadb/documentation/replication-cluster-multi-master/replication/global-transaction-id/

I wrote some stuff on my blog:

    http://kristiannielsen.livejournal.com/18435.html
    http://kristiannielsen.livejournal.com/16826.html
    http://kristiannielsen.livejournal.com/17008.html
    http://kristiannielsen.livejournal.com/17238.html
    http://kristiannielsen.livejournal.com/18308.html

I notice that I wrote mostly about global transaction ID, and less about
parallel replication. Well, they are strongly interdependent, and e.g. the
replication domains are well explained in my writings on GTID, I hope. Though
some features of parallel replication can be used even without GTID.

 - Kristian.

