maria-discuss team mailing list archive

Re: Galera replication - optimistic locking problems

To: maria-discuss@xxxxxxxxxxxxxxxxxxx
From: Markus Mäkelä <markus.makela@xxxxxxxxxxx>
Date: Wed, 16 Dec 2020 03:35:33 +0200
In-reply-to: <202012152253.02955.Antony.Stone@mariadb.open.source.it>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0

Hi,

With Galera, you really want to write to a single node to avoid conflicts. Since Galera doesn't do any partitioning of the data, the IO will still happen on all nodes, which leaves convenience as the only reason to write to multiple nodes. A common method of avoiding deadlocks with Galera is to put a proxy in front of it that understands the cluster. One of these is MariaDB MaxScale (I'm part of the team behind it), which has advanced support for Galera clusters.

The Galera monitor in MaxScale uses the wsrep_local_index variable to pick a single node in the cluster that all MaxScales write to. This eliminates the possibility of conflicts due to the distributed nature of Galera but still allows you to write to any of the MaxScale instances.
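
To illustrate why every proxy instance ends up on the same writer, here is a minimal Python sketch of a deterministic selection rule. It assumes the rule is "lowest wsrep_local_index among nodes reporting Synced"; the function name, the node names and the input layout are hypothetical, and this is not MaxScale's actual code.

```python
from typing import Dict, Optional


def pick_writer(nodes: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Return the name of the node to send writes to, or None if none qualifies.

    `nodes` maps a node name to the status values read from that node.
    Assumption: the synced node with the lowest wsrep_local_index wins.
    """
    candidates = [
        (int(status["wsrep_local_index"]), name)
        for name, status in nodes.items()
        # Skip nodes that are joining, donating or otherwise not in sync.
        if status.get("wsrep_local_state_comment") == "Synced"
    ]
    # min() on (index, name) tuples is deterministic, so independent
    # proxies looking at the same cluster state make the same choice.
    return min(candidates)[1] if candidates else None
```

Because the choice depends only on cluster state that every node reports consistently, two proxies that never talk to each other still agree on one writer.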

Another feature of MaxScale that could help with applications that don't know to retry transactions is the transaction_replay <https://mariadb.com/kb/en/mariadb-maxscale-25-readwritesplit/#transaction_replay> feature of the readwritesplit router. This feature has a mode <https://mariadb.com/kb/en/mariadb-maxscale-25-readwritesplit/#transaction_replay_retry_on_deadlock> where it can retry an active transaction if it ends up in a deadlock. This allows transparent retrying of the transaction while still keeping all the consistency guarantees. There are of course some limitations to what can be successfully retried, which means some edge cases might not be solved by this.
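
The general idea of replaying a transaction on deadlock can be sketched in Python as follows. This is only an illustration of the technique, not how MaxScale implements it: DeadlockError, run_with_replay and the begin/execute/commit/rollback interface are hypothetical names, and the sketch does not cover the limitations mentioned above.

```python
ER_LOCK_DEADLOCK = 1213  # MariaDB error code reported for a deadlock


class DeadlockError(Exception):
    """Hypothetical exception a driver wrapper raises for error 1213."""
    errno = ER_LOCK_DEADLOCK


def run_with_replay(begin, statements, max_attempts=5):
    """Run `statements` in one transaction, replaying it on deadlock.

    `begin` is a hypothetical callable returning a transaction object
    with execute(sql), commit() and rollback() methods.
    """
    for attempt in range(1, max_attempts + 1):
        txn = begin()
        try:
            # Remembering the statements is what makes a replay possible.
            results = [txn.execute(sql) for sql in statements]
            # With Galera the conflict often only surfaces at commit time,
            # so the commit has to be inside the retried section.
            txn.commit()
            return results
        except DeadlockError:
            txn.rollback()
            if attempt == max_attempts:
                raise  # give up and let the caller see the error
```

The application only ever sees the final outcome, which is what makes this useful for a client that cannot retry on its own.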

Using a proxy does have its downsides, with increased latency and the additional maintenance burden of the added servers being the most obvious ones. One way you can avoid this is to place the proxy on the application server and have it behave as a sort of connector.


Markus

On 12/15/20 11:53 PM, Antony Stone wrote:

Therefore I'd like to find some way to:

1. tell Galera to use pessimistic locking for replication if possible (I can
accept the performance penalty)

2. tell MariaDB to automatically retry the write when the error occurs
(although I can't think of any way that could be done, since I can't create a
transaction of any sort - the operation is entirely determined for me by
Asterisk)

3. find a High-Availability (which basically means no single point of failure)
front-end for the whole cluster of 4 MariaDB servers so that the problem does
not occur.


Questions:

1. How do other people deal with this problem?

2. Are any of my potential solutions above actually feasible?  (If so, how?)

3. Does anyone have any alternative ideas about how to connect an application
which doesn't understand retrying database writes (Asterisk) with a database
which doesn't guarantee to write the data you give it (MariaDB + Galera)?


--
Markus Mäkelä, Senior Software Engineer
MariaDB Corporation
t: +358 40 7740484

Follow ups

Re: Galera replication - optimistic locking problems
From: Antony Stone, 2020-12-16

References

Galera replication - optimistic locking problems
From: Antony Stone, 2020-12-15