
maria-developers team mailing list archive

Re: Semisync plugin incompatibility

 

On 2013-11-15 19:34, Pavel Ivanov wrote:
On Fri, Nov 15, 2013 at 1:28 AM, Alex Yurchenko
<alexey.yurchenko@xxxxxxxxxxxxx> wrote:
Please pardon this arrogant interruption of your discussion and shameless self-promotion, but I just could not help noticing that Galera replication was designed specifically with these goals in mind. And it does seem to achieve them better than semi-sync plugin. Have you considered Galera? What makes you prefer semi-sync over Galera, if I may ask?

To be honest I never looked at how Galera works before. I've looked at
it now and I don't see how it can fit with us. The major disadvantages
I immediately see:
1. Synchronous replication. That means the client must wait while the
transaction is applied on all nodes, which adds unacceptably high
latency to each transaction. And what if there's a network blip and some
node becomes inaccessible? Will all writes just freeze? I see the statement
that "failed nodes automatically excluded from the cluster", but to do
that the cluster must wait for some timeout in case it's indeed a network
blip and the node will "quickly" reconnect. And every client must wait for
the cluster to decide what happened with that one node.
2. Let's say node fell out of the cluster for 5 minutes and then
reconnected. I guess it will be treated as "new node", it will
generate state transfer and the node will start downloading the whole
database? And while it's trying to download say 500GB of data files
all other nodes (or maybe just donor?) won't be able to change those
files locally and thus will blow up its memory consumption. That means
they could quickly run out of memory and "new node" won't be able to
finish its "initialization"...
3. It looks like there's strong asymmetry in starting cluster nodes --
the first one should be started with empty wsrep_cluster_address and
all others should be started with the address of the first node. So I
can't start all nodes uniformly and then issue some commands to
connect them to each other. That's bad.
4. What's the transition path? How do I upgrade a MySQL/MariaDB setup
that uses regular replication to Galera? It looks like there's
no such path, and the solution is to stop the world running regular
replication and restart it using Galera. Sorry, I can't do that with
our production systems.

I believe these problems are severe enough for us, so that we can't
work with Galera.

Pavel, you seem to be terribly mistaken on almost all accounts:

1. *Replication* (i.e. data buffer copying) is indeed synchronous. But nobody said that commit is. What Galera does is very similar to semi-sync, except that it does it technically better. I would not dare to suggest Galera replication if I didn't believe it to be superior to semi-sync in every respect. As an example, here's an independent comparison of Galera vs. semi-sync performance: http://linsenraum.de/erkules/2011/06/momentum-galera.html. In fact, the majority of Galera users migrated from regular *asynchronous* MySQL replication, which I think is a testament to Galera's performance.

2. A node reconnecting to the cluster will normally receive only the events that it missed while it was disconnected.

3. You are partially right about this, but is it really that different from regular MySQL replication, where you first need to set up the master and then connect the slaves (even if you physically launched the servers at the same time)? Still, Galera nodes can be started simultaneously and then joined together by setting wsrep_cluster_address from a mysql client connection. This is not the advertised method, because in that case the state snapshot transfer can be done only by mysqldump. If you set the address in advance, rsync or xtrabackup can be used to provision the fresh node.

4. Every Galera node works perfectly well as either a master or a slave in native MySQL replication, so the migration path is quite clear.
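
Points 3 and 4 can be illustrated with a minimal sketch that only builds the SQL statements involved and does not connect to any server. The helper names (quote, join_cluster_sql, attach_to_master_sql), host names, credentials and binlog coordinates are all hypothetical and chosen for illustration; the statements themselves are the standard SET GLOBAL wsrep_cluster_address, CHANGE MASTER TO and START SLAVE commands.

```python
def quote(value):
    """Quote a string as a SQL literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def join_cluster_sql(peers):
    """Point 3: statement that joins an already running node to its peers.

    `peers` is a list of host names or addresses (hypothetical values).
    """
    if not peers:
        raise ValueError("at least one peer address is required")
    address = "gcomm://" + ",".join(peers)
    return "SET GLOBAL wsrep_cluster_address = " + quote(address) + ";"


def attach_to_master_sql(host, user, password, log_file, log_pos, port=3306):
    """Point 4: statements that make one Galera node a slave of an existing master.

    The binlog file and position are assumed to come from the existing master.
    """
    change_master = (
        "CHANGE MASTER TO "
        "MASTER_HOST = {}, MASTER_PORT = {:d}, "
        "MASTER_USER = {}, MASTER_PASSWORD = {}, "
        "MASTER_LOG_FILE = {}, MASTER_LOG_POS = {:d};"
    ).format(quote(host), port, quote(user), quote(password),
             quote(log_file), log_pos)
    return [change_master, "START SLAVE;"]


if __name__ == "__main__":
    # Join a running node to two (hypothetical) peers.
    print(join_cluster_sql(["node1.example", "node2.example"]))
    # Attach a Galera node to a (hypothetical) existing master.
    for statement in attach_to_master_sql(
            "old-master.example", "repl", "secret", "mysql-bin.000042", 4):
        print(statement)
```

The generated statements would be run from a mysql client session on the node in question. For changes received from the old master to reach the rest of the cluster, the node acting as slave most likely also needs log_slave_updates enabled; that is an assumption to verify against the documentation for the version in use.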

It is very sad that you happen to have such gross misconceptions about Galera. If those were true, how would MariaDB Galera Cluster get paying customers? Maybe my reply will convince you to have a second look at it. (In addition to the above, Galera is fully multi-master, does parallel applying and works great over WAN.)

Kind regards,
Alex

Pavel

--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011

