maria-developers team mailing list archive

Re: Option to _not_ disable semisync on slave timeout?

Actually, Kristian's description is pretty accurate. When the last
semi-sync slave disconnects from the master, rpl_semi_sync_master_clients
becomes 0. When a subsequent wait for a semi-sync ack times out,
rpl_semi_sync_master_status becomes OFF and the master continues to commit
transactions without waiting for semi-sync acks.
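
To make that sequence concrete, here is a minimal toy model in Python. It is
only a sketch of the behaviour described above, not MariaDB code: the class
ToyMaster and its methods are hypothetical, and the two fields merely stand in
for rpl_semi_sync_master_clients and rpl_semi_sync_master_status.

```python
from dataclasses import dataclass


@dataclass
class ToyMaster:
    """Toy model of the master-side semi-sync state (hypothetical, not MariaDB code)."""
    clients: int = 1          # stands in for rpl_semi_sync_master_clients
    status_on: bool = True    # stands in for rpl_semi_sync_master_status

    def slave_disconnects(self) -> None:
        # Losing a slave only lowers the client count; the status is untouched.
        self.clients = max(0, self.clients - 1)

    def commit(self, ack_arrives_in_time: bool) -> str:
        if not self.status_on:
            # Semi-sync is already OFF: commit without waiting for any ack.
            return "committed without waiting"
        if self.clients > 0 and ack_arrives_in_time:
            return "committed after ack"
        # The wait timed out: status flips to OFF and the commit still proceeds.
        self.status_on = False
        return "committed after timeout"


master = ToyMaster()
master.slave_disconnects()                          # clients: 1 -> 0
first = master.commit(ack_arrives_in_time=False)    # wait times out
second = master.commit(ack_arrives_in_time=False)   # no wait any more
```

In this model the disconnect alone does not turn semi-sync off; it is the
first timed-out wait that does, and every later commit skips the wait.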

Note that it's pretty much impossible to fail and roll back in this
case, because the transaction is already written to the binlog (no
matter whether the wait for the semi-sync ack happens at after_sync or
after_commit), and other transactions may have been written to the
binlog after it too.

So the only cure for this at Google was to set
rpl_semi_sync_master_timeout to a very large value, although that
doesn't help much in case of crashes, shutdowns and forced failovers.
To help with those, Jonas recently implemented additional code that:
a) Never stops waiting for the semi-sync ack, even if the client is
disconnected or the connection is killed.
b) Forcefully crashes the server when rpl_semi_sync_master_enabled is
switched to OFF while there are transactions in progress waiting for a
semi-sync ack.
c) Rolls back all prepared and uncommitted transactions in InnoDB at
startup, and truncates those transactions from the binlog.
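
Points (b) and (c) can be illustrated with a small Python sketch. This is not
the actual patch: ToyServerCrash, disable_semi_sync and recover_at_startup are
hypothetical names, the binlog is modelled as a plain list of transaction ids,
and the sketch assumes that truncation cuts the binlog at the first prepared
transaction.

```python
class ToyServerCrash(Exception):
    """Stands in for the forced crash in point (b); hypothetical."""


def disable_semi_sync(waiting_for_ack: list[str]) -> str:
    # Point (b): switching semi-sync OFF while transactions still wait for
    # an ack crashes the server instead of silently releasing them.
    if waiting_for_ack:
        raise ToyServerCrash(f"{len(waiting_for_ack)} transaction(s) unacked")
    return "disabled"


def recover_at_startup(
    binlog: list[str], prepared: set[str]
) -> tuple[list[str], list[str]]:
    # Point (c): roll back prepared transactions and cut them from the binlog.
    # Assumption: truncation starts at the first prepared transaction, so
    # everything after it in the binlog is dropped as well.
    for position, trx in enumerate(binlog):
        if trx in prepared:
            return binlog[:position], binlog[position:]
    return list(binlog), []


binlog = ["t1", "t2", "t3"]
try:
    outcome = disable_semi_sync(["t3"])   # t3 is still waiting for its ack
except ToyServerCrash:
    outcome = "crashed"
kept, rolled_back = recover_at_startup(binlog, prepared={"t3"})
```

Here the attempt to disable semi-sync ends in a crash, and the restart keeps
t1 and t2 in the binlog while t3 is rolled back and truncated.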

We believe that this new code together with
rpl_semi_sync_master_wait_point=after_sync will guarantee no unacked
transactions on any server.

On Mon, Dec 28, 2015 at 4:38 AM,  <g.maxia@xxxxxxxxx> wrote:
> Hi Kristian,
> Your description is not accurate. If there are no slaves ready to
> acknowledge a transaction, after a timeout the status variable
> Rpl_semi_sync_master_clients in the master becomes 0. The master itself,
> though, remains able to perform semi-synch operations. As soon as one of the
> slaves is enabled, the variable Rpl_semi_sync_master_clients is incremented
> and semi-synch replication resumes.
> HTH
>
> Giuseppe
>
>
> On 28 Dec 2015 at 13:17:10 , Kristian Nielsen (knielsen@xxxxxxxxxxxxxxx)
> wrote:
>
> Hi Jonas (or anyone else who may know),
>
> If I understand correctly, semisync replication has a timeout for how long
> it will wait for a slave to acknowledge transactions. If the timeout is
> exceeded, semisync is turned off on the master.
>
> Do you know if there is a facility to avoid semisync being turned off
> automatically in this case, maybe instead fail and roll back the master
> transaction?
>
> I was thinking at Google you might already have needed and implemented this
> (merged or not merged into MariaDB yet), or you may be aware of a patch by
> someone else for this?
>
> Thanks, and merry Christmas!
>
> - Kristian.
>
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-developers
> Post to : maria-developers@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maria-developers
> More help : https://help.launchpad.net/ListHelp
>